# Adding Suggestions
This notebook walks you through adding suggestions to LightTag through the API.

Conceptually, Suggestions come from a Model and a Model belongs to a Schema. Thus in order to add Suggestions you must register a Model under the relevant Schema and then populate the model with actual suggestions. 

To avoid confusion: registering a Model does not make LightTag run it. It is up to you to generate the suggestions and upload them. Here we'll see how to do that.

## The Steps
1. Retrieving the Examples we wish to generate suggestions on
2. Retrieving the Schema we wish to define a Model for
3. Creating a model and suggestions
4. Uploading the suggestions and Model

## Step 0 - Basic setup

In [1]:
import requests
import json
import pandas as pd
from pprint import pprint
from requests.auth import HTTPBasicAuth
SERVER ="http://localhost:8000"
#SERVER = "https://api-demo.lighttag.io" #The server is https://api-{your_subdomain_name.lighttag.io}
API_BASE = SERVER +'/api/v1/'
LT_USERNAME = "demo" # Username of manager user
LT_PASSWORD = "demo" #password of manager user

response = requests.post(SERVER +"/auth/token/create/",
              json={"username":LT_USERNAME,"password":LT_PASSWORD}
             )
response

<Response [200]>

Once we log in, LightTag returns a token we can use for further calls. We'll put it in a dict that we pass as the request headers.

In [2]:
token = response.json()['key']
headers={'Authorization': 'Token {token}'.format(token=token)}
session = requests.Session()
session.headers.update(headers)  # add our header while keeping the session's defaults
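
If the login fails (wrong password, wrong server), `response.json()['key']` raises a bare `KeyError` that hides the real problem. Below is a minimal sketch of a guard, assuming the token is returned under `key` as in the cell above. `auth_headers_from_payload` is a hypothetical helper name, not part of LightTag or requests, and the token value is made up.

```python
def auth_headers_from_payload(payload):
    """Build the Authorization header from the decoded login response.

    `payload` is the dict returned by response.json(). The 'key' field
    matches the cell above; the helper itself is illustrative only.
    """
    token = payload.get("key")
    if not token:
        raise ValueError("Login response has no token: {!r}".format(payload))
    return {"Authorization": "Token {}".format(token)}


# Example with a made-up token value
print(auth_headers_from_payload({"key": "abc123"}))
```

In the notebook you could call `response.raise_for_status()` first and then pass `auth_headers_from_payload(response.json())` to `session.headers.update(...)`.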

## Step 1 Getting the data examples we wish to annotate
To generate suggestions we need data: all of the examples in a dataset, and the tags from a particular schema.
While you could gather each of these separately, LightTag has a data download view which already returns both of these combined.



### Retrieve the slug or url of the dataset with the examples you wish to add suggestions to
You can query the datasets endpoint. Remember, you are operating under the default project, so you should query 


projects/**default**/datasets/

In [3]:
datasets = session.get(API_BASE+'projects/default/datasets/').json()
pd.DataFrame(datasets) 
#It's convenient to display the results in a dataframe

Unnamed: 0,aggregation_field,content_field,id,id_field,name,order_field,project_id,slug,url
0,,content,e7dee97f-b133-4b2d-b84a-27c41c342e0b,,fufuf,,2a1daae8-280b-420c-a25f-95535e7b9582,fufuf,http://localhost:8000/api/v1/projects/default/...
1,verse,content,46430219-d856-4241-b22f-38fbbaa60060,,fvgbsfg,book,2a1daae8-280b-420c-a25f-95535e7b9582,fvgbsfg,http://localhost:8000/api/v1/projects/default/...
2,,content,cd330fbe-f191-4824-8aba-42dc661bf0e0,,mufu,,2a1daae8-280b-420c-a25f-95535e7b9582,mufu,http://localhost:8000/api/v1/projects/default/...


### Retrieve the examples from that dataset
In this example we want a Dataset named Bible with the slug bible. We'll pull all of its examples via the examples endpoint
projects/default/datasets/**bible**/examples/

The cell below uses the slug fufuf from the listing above; substitute the slug of your own dataset.

In [9]:

examples =session.get(API_BASE+'projects/default/datasets/fufuf/examples/').json()

In [10]:
examples

[{'aggregation_value': None,
  'content': ' And her adversary also provoked her sore, for to make her fret, because the LORD had shut up her womb.  ',
  'dataset': 'e7dee97f-b133-4b2d-b84a-27c41c342e0b',
  'id': '44030fb9-d259-4034-8d2a-e2af1bc3e506',
  'metadata': {'book': 'The First Book of the Kings',
   'chapter': 1,
   'verse': 6}},
 {'aggregation_value': None,
  'content': ' And Hannah prayed, and said, My heart rejoiceth in the LORD, mine horn is exalted in the LORD: my mouth is enlarged over mine enemies; because I rejoice in thy salvation.  ',
  'dataset': 'e7dee97f-b133-4b2d-b84a-27c41c342e0b',
  'id': 'b9a5489f-463c-4692-b9b3-c936b6c78e0d',
  'metadata': {'book': 'The First Book of the Kings',
   'chapter': 2,
   'verse': 1}},
 {'aggregation_value': None,
  'content': ' There is none holy as the LORD: for there is none beside thee: neither is there any rock like our God.  ',
  'dataset': 'e7dee97f-b133-4b2d-b84a-27c41c342e0b',
  'id': 'e594904a-a04a-47a2-88f3-0b28e4f06ff3',


We could equivalently do this by using the url field of the JSON returned from the datasets endpoint

In [11]:
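# Pick the dataset by slug. The slug must exist in the listing above
# ("fufuf" here); otherwise next() raises StopIteration.
bible_dataset = next(d for d in datasets if d['slug'] == "fufuf")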

In [60]:
examples =session.get(bible_dataset['url']+'examples/').json()
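
`next(...)` raises a bare `StopIteration` when no dataset has the requested slug, which is easy to misread. Below is a small sketch of a lookup that fails with a clearer message. `find_by_slug` is a hypothetical helper, and the sample records are made up in the shape of the datasets listing.

```python
def find_by_slug(items, slug):
    """Return the first item whose 'slug' equals `slug`.

    Raises a LookupError listing the available slugs when nothing matches,
    instead of the bare StopIteration raised by next().
    """
    match = next((item for item in items if item.get("slug") == slug), None)
    if match is None:
        available = sorted(item.get("slug", "") for item in items)
        raise LookupError(
            "No item with slug {!r}; available: {}".format(slug, available)
        )
    return match


# Made-up records shaped like the datasets response
sample = [{"slug": "fufuf", "name": "fufuf"}, {"slug": "mufu", "name": "mufu"}]
print(find_by_slug(sample, "mufu")["name"])
```

The same helper would work for the schemas listing in Step 2, since those records also carry a `slug` field.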

In [12]:
examples[0]

{'aggregation_value': None,
 'content': ' And her adversary also provoked her sore, for to make her fret, because the LORD had shut up her womb.  ',
 'dataset': 'e7dee97f-b133-4b2d-b84a-27c41c342e0b',
 'id': '44030fb9-d259-4034-8d2a-e2af1bc3e506',
 'metadata': {'book': 'The First Book of the Kings', 'chapter': 1, 'verse': 6}}

## Step 2 Retrieve the Schema and Tags we want to create suggestions for


In [13]:
schemas =session.get(API_BASE+'projects/default/schemas/').json()
pd.DataFrame(schemas)

Unnamed: 0,id,name,slug,url
0,6fba8291-9f5b-401e-a1cc-4176543d6584,mufen,mufen,http://localhost:8000/api/v1/projects/default/...
1,7d074347-2cfc-4ae9-bd93-95ce6b95e9b2,chicken,chicken,http://localhost:8000/api/v1/projects/default/...


In [14]:
schema = schemas[0]
tags = session.get(schema['url']+'tags/').json()
pd.DataFrame(tags)

Unnamed: 0,description,id,name
0,dfg,f833b15e-054d-4b7f-8ede-e1f88e7e5c67,dfvg


### Pro Tip 
To upload suggestions we'll need to use the tag ids, which can be cumbersome. It is often convenient to make a map
from the tag name to the id like so:

In [15]:
tagMap = {tag["name"]:tag["id"] for tag in tags}

In [18]:
tagMap

{'dfvg': 'f833b15e-054d-4b7f-8ede-e1f88e7e5c67'}

## Step 3: Create your suggestions
How you create a suggestion is up to you. You are free to use a dictionary, regex, neural network or whatever else
is available.
In this example we'll use the [flashtext](https://github.com/vi3k6i5/flashtext) library to suggest the following

* Abraham, Isaac, Jacob and Moses will be labeled person
* Israel, Egypt and Moav will be labeled Nation
* Dagon will be labeled Pagan god
* Jerusalem will be labeled place

The schema retrieved in Step 2 has only one tag (dfvg), so the code below maps every keyword to that tag. With a fuller schema you would pass the tag id that matches each label.



In [19]:
from flashtext import KeywordProcessor
keyword_processor = KeywordProcessor()
keyword_processor.add_keyword("Abraham",tagMap["dfvg"])
keyword_processor.add_keyword("Isaac",tagMap["dfvg"])
keyword_processor.add_keyword("Jacob",tagMap["dfvg"])
keyword_processor.add_keyword("Peninnah",tagMap["dfvg"])

keyword_processor.add_keyword("Dagon",tagMap["dfvg"])
keyword_processor.add_keyword("Jerusalem",tagMap["dfvg"])
keyword_processor.add_keyword("Egypt",tagMap["dfvg"])
keyword_processor.add_keyword("Israel",tagMap["dfvg"])
keyword_processor.add_keyword("Moav",tagMap["dfvg"])

True

In [20]:
keyword_processor.extract_keywords(" In the Bible Abraham met the LORD one day|",
                                   span_info=True)

[('f833b15e-054d-4b7f-8ede-e1f88e7e5c67', 14, 21)]
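
The `start` and `end` values in this output behave like Python slice indices into the text, with `end` exclusive. You can confirm this by slicing with the offsets shown above. The sketch below uses only the standard library. The `re` search is a stand-in used to recompute the offsets and is not a description of how flashtext works internally.

```python
import re

text = " In the Bible Abraham met the LORD one day|"

# Offsets reported in the output above
start, end = 14, 21
print(text[start:end])  # slicing with the offsets recovers the keyword

# Recompute the same offsets with a plain regex search
match = re.search(r"\bAbraham\b", text)
print(match.span())
```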

### Step 3.1: Iterate over your examples and make a list of suggestions


In [29]:
suggestions = []
for example in examples: #(Notice the text is located in the content field)
    #For every suggestion that comes out of our model (in this case, flashtext)
    for tag_id,start,end in keyword_processor.extract_keywords(example['content'],span_info=True):
        suggestion= { #Create a suggestion
            "example_id":example['id'], #That refers to a particular example
            "tag_id":tag_id, #and applies a particular tag
            "start":start, #Which starts somewhere in the example
            "end":end # And ends somewhere in the example
        }
        suggestions.append(suggestion)
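
Before uploading, it can be worth checking that every suggestion's offsets fall inside the text of its example. Below is a minimal sketch, assuming end-exclusive offsets as flashtext reports them. `invalid_suggestions` is a hypothetical helper, and the sample data is made up in the shape used above.

```python
def invalid_suggestions(suggestions, examples):
    """Return suggestions whose span is empty, reversed, or outside the example text."""
    content_by_id = {ex["id"]: ex["content"] for ex in examples}
    bad = []
    for s in suggestions:
        content = content_by_id.get(s["example_id"])
        in_bounds = (
            content is not None
            and 0 <= s["start"] < s["end"] <= len(content)
        )
        if not in_bounds:
            bad.append(s)
    return bad


sample_examples = [{"id": "ex-1", "content": "Abraham went to Egypt"}]
sample_suggestions = [
    {"example_id": "ex-1", "tag_id": "tag-1", "start": 0, "end": 7},    # "Abraham"
    {"example_id": "ex-1", "tag_id": "tag-1", "start": 16, "end": 40},  # runs past the text
]
print(invalid_suggestions(sample_suggestions, sample_examples))
```

An empty result means every span points at real text in a known example.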

In [30]:
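# Appending a copy of the first suggestion creates two suggestions on the same span.
# The upload in Step 4 is rejected because of this overlap; omit this line for a real upload.
suggestions.append(suggestions[0])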

### Step 3.2: Define a model

In [31]:
model_metadata= { # Define any metadata you'd like to store about the model
    "defined_by": "LightTag",
    "comments": "An example model made with Flashtext"
}
data = {
    "model":{
        "name":"demo_suggestions",  #Give the model a name
        "metadata": model_metadata # Provide metadata (optional)
    },
    "suggestions":suggestions #Attach the suggestions you made before
}

## Step 4 Upload your model and Suggestions 

In [32]:
resp = session.post(schema['url'] + 'models/bulk/', json=data)
resp.status_code

400

In [33]:
resp.json()

{'detail': "You've tried to upload two suggestions that overlap on the same model. That is not supported"}
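
This error is consistent with the duplicate added in Step 3.1 (`suggestions.append(suggestions[0])`), which puts two suggestions on the same span. One way to guard against it is to drop any suggestion that overlaps an earlier one on the same example before uploading. The sketch below assumes end-exclusive offsets, as flashtext reports them, and treats any overlap within one example as a conflict regardless of tag. Whether LightTag applies exactly that rule is an assumption. `drop_overlaps` is a hypothetical helper and the sample data is made up.

```python
def drop_overlaps(suggestions):
    """Keep the first suggestion of every overlapping group, per example."""
    kept = []
    last_end = {}  # example_id -> end offset of the last kept suggestion
    ordered = sorted(
        suggestions, key=lambda s: (s["example_id"], s["start"], s["end"])
    )
    for s in ordered:
        # Spans are end-exclusive, so start == previous end does not overlap
        if s["start"] >= last_end.get(s["example_id"], 0):
            kept.append(s)
            last_end[s["example_id"]] = s["end"]
    return kept


a = {"example_id": "ex-1", "tag_id": "t", "start": 0, "end": 7}
dup = dict(a)  # exact duplicate, like suggestions.append(suggestions[0])
b = {"example_id": "ex-1", "tag_id": "t", "start": 5, "end": 10}   # overlaps a
c = {"example_id": "ex-1", "tag_id": "t", "start": 16, "end": 21}
d = {"example_id": "ex-2", "tag_id": "t", "start": 0, "end": 7}

cleaned = drop_overlaps([a, dup, b, c, d])
print(len(cleaned))
```

In the notebook, setting `data["suggestions"] = drop_overlaps(suggestions)` before the POST would be the natural place to apply it.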