# Adding Suggestions
This notebook walks you through adding suggestions to LightTag through the API.

Conceptually, Suggestions come from a Model and a Model belongs to a Schema. Thus in order to add Suggestions you must register a Model under the relevant Schema and then populate the model with actual suggestions. 

To be clear: registering a Model does not mean LightTag runs it. It is up to you to generate the suggestions and upload them. Here we'll see how to do that.

## The Steps
1. Retrieving the Examples we wish to generate suggestions on
2. Retrieving the Schema we wish to define a Model for
3. Creating a model and suggestions
4. Uploading the suggestions and Model
5. Assigning a model to a task definition

## Step 0 - Basic setup

In [1]:
import requests
import json
import pandas as pd
from pprint import pprint
from requests.auth import HTTPBasicAuth
SERVER ="http://localhost:8000"
#SERVER = "https://api-demo.lighttag.io" #The server is https://api-{your_subdomain_name}.lighttag.io
API_BASE = SERVER +'/api/v1/'
LT_USERNAME = "demo" # Username of manager user
LT_PASSWORD = "demo" #password of manager user

response = requests.post(SERVER +"/auth/token/create/",
              json={"username":LT_USERNAME,"password":LT_PASSWORD}
             )
response

<Response [200]>

Once we log in, LightTag returns a token we can use for further calls. We'll put it in a dict that we pass as a header.

In [2]:
token = response.json()['key']
headers={'Authorization': 'Token {token}'.format(token=token)}
session = requests.Session()
session.headers.update(headers) # Add our token while keeping requests' default headers
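
The login cell above assumes the response always carries a `key` field. As a minimal sketch, the two helpers below make that assumption explicit and build the header dict in one place. The names `token_from_login` and `auth_headers` are hypothetical (they are not part of any LightTag client); the `key` field and the `Token` scheme are taken from the cells above.

```python
def token_from_login(payload):
    """Pull the token out of the decoded login response, failing loudly if it is absent."""
    try:
        return payload["key"]
    except KeyError:
        raise ValueError(
            "login response has no 'key' field, got: {!r}".format(sorted(payload))
        )


def auth_headers(token):
    """Build the Authorization header in the 'Token <value>' form used above."""
    return {"Authorization": "Token {}".format(token)}


# "abc123" is a placeholder token, not a real credential.
example_headers = auth_headers(token_from_login({"key": "abc123"}))
print(example_headers)
```

In the notebook this could be used as `session.headers.update(auth_headers(token_from_login(response.json())))`.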

## Step 1 Getting the data examples we wish to annotate
To generate suggestions we need data: all of the examples in a dataset, as well as the tags from a particular schema.
While you could gather each of these separately, LightTag has a data download view which already returns both of these combined.



### Retrieve the slug or URL of the dataset with the examples you wish to add suggestions to
You can query the datasets endpoint. Remember, you are operating under the default project, so you should query:


projects/**default**/datasets/

In [3]:
datasets = session.get(API_BASE+'projects/default/datasets/').json()
pd.DataFrame(datasets) 
#It's convenient to display the results in a dataframe

| | aggregation_field | content_field | id | id_field | name | order_field | project_id | slug | url |
|---|---|---|---|---|---|---|---|---|---|
| 0 | date | text | 92ea39c8-1d7d-47eb-b6d3-25ceebc6ba33 | | Test Set | time | e488e45a-564a-4b93-8beb-f7aa4c73ea97 | test-set | http://localhost:8000/api/v1/projects/default/... |
| 1 | date | text | de7ed65d-e685-4ea9-bf3b-0c3c10f20b46 | | Training Set | time | e488e45a-564a-4b93-8beb-f7aa4c73ea97 | training-set | http://localhost:8000/api/v1/projects/default/... |
| 2 | | text | ee688ba1-b0d3-4fdb-9355-d7924e3875e4 | | Exploratory Dataset | | e488e45a-564a-4b93-8beb-f7aa4c73ea97 | exploratory-dataset | http://localhost:8000/api/v1/projects/default/... |


### Retrieve the examples from that dataset
In this example we want the Dataset named Test Set, whose slug is test-set. We'll pull all of its examples via the examples endpoint:
projects/default/datasets/**test-set**/examples/
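
The endpoint path is just the project slug and the dataset slug placed into a fixed pattern. A small sketch of a builder for it follows; `examples_url` is a hypothetical helper name, and the path layout is the one shown above rather than anything confirmed beyond this notebook.

```python
def examples_url(api_base, dataset_slug, project_slug="default"):
    """Build the examples endpoint URL for one dataset in one project."""
    # Tolerate an API base given with or without a trailing slash.
    base = api_base if api_base.endswith("/") else api_base + "/"
    return "{}projects/{}/datasets/{}/examples/".format(base, project_slug, dataset_slug)


demo_url = examples_url("http://localhost:8000/api/v1/", "test-set")
print(demo_url)
```

Keeping the pattern in one function avoids repeating string concatenation for every dataset you query.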

In [6]:

examples =session.get(API_BASE+'projects/default/datasets/test-set/examples/').json()

In [8]:
examples[0]

{'aggregation_value': '2017-03-19',
 'content': '#ICYMI: Weekly Address \n➡️https://t.co/ckVx2zgA1x https://t.co/dTGZLvlsGv',
 'dataset': '92ea39c8-1d7d-47eb-b6d3-25ceebc6ba33',
 'id': '734c1e07-3648-44c2-903c-df7ba777c7e5',
 'metadata': {'created_at': 'Sun Mar 19 20:20:22 +0000 2017',
  'date': '2017-03-19',
  'favorite_count': 36626,
  'id_str': '843557782666317826',
  'in_reply_to_user_id_str': None,
  'is_retweet': False,
  'retweet_count': 7472,
  'source': 'Twitter for iPhone',
  'time': 1489954822000000000}}

We could equivalently do this by using the URL field of the JSON returned from the datasets endpoint.

In [9]:
test_dataset = next(filter(lambda x:x['slug']=="test-set",datasets))

In [10]:
examples =session.get(test_dataset['url']+'examples/').json()

In [11]:
examples[0]

{'aggregation_value': '2017-03-19',
 'content': '#ICYMI: Weekly Address \n➡️https://t.co/ckVx2zgA1x https://t.co/dTGZLvlsGv',
 'dataset': '92ea39c8-1d7d-47eb-b6d3-25ceebc6ba33',
 'id': '734c1e07-3648-44c2-903c-df7ba777c7e5',
 'metadata': {'created_at': 'Sun Mar 19 20:20:22 +0000 2017',
  'date': '2017-03-19',
  'favorite_count': 36626,
  'id_str': '843557782666317826',
  'in_reply_to_user_id_str': None,
  'is_retweet': False,
  'retweet_count': 7472,
  'source': 'Twitter for iPhone',
  'time': 1489954822000000000}}
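
Note that `next(filter(...))` raises a bare `StopIteration` when no dataset has the requested slug, which can be hard to diagnose. A minimal sketch of a more descriptive lookup is shown below; `find_by_slug` is a hypothetical helper, and the sample list only mirrors the `name` and `slug` fields from the datasets output above.

```python
def find_by_slug(items, slug):
    """Return the first item whose 'slug' matches, or raise a descriptive LookupError."""
    for item in items:
        if item.get("slug") == slug:
            return item
    available = sorted(item.get("slug", "?") for item in items)
    raise LookupError("no item with slug {!r}; available: {}".format(slug, available))


# Trimmed-down copy of the datasets listing shown earlier.
sample_datasets = [
    {"name": "Test Set", "slug": "test-set"},
    {"name": "Training Set", "slug": "training-set"},
    {"name": "Exploratory Dataset", "slug": "exploratory-dataset"},
]
found = find_by_slug(sample_datasets, "test-set")
print(found["name"])
```

The same function works on any list of dicts that carry a `slug` key.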

## Step 2 Retrieve the Schema and Tags we want to create suggestions for


In [12]:
schemas =session.get(API_BASE+'projects/default/schemas/').json()
pd.DataFrame(schemas)

| | id | name | slug | url |
|---|---|---|---|---|
| 0 | 3b73c2b2-73aa-4c42-a265-a0c459abd295 | Classifications and tags Schema | classifications-and-tags-schema | http://localhost:8000/api/v1/projects/default/... |
| 1 | fbe6fd1d-bb59-4909-8a88-81772cc0d996 | Entity Tags only | entity-tags-only | http://localhost:8000/api/v1/projects/default/... |
| 2 | 9dcf0091-d82c-4292-a09d-eaff062fc1c4 | Trump Insult Classification | trump-insult-classification | http://localhost:8000/api/v1/projects/default/... |


In [14]:
schema = schemas[1]
tags = session.get(schema['url']+'tags/').json()
pd.DataFrame(tags)

| | description | id | name |
|---|---|---|---|
| 0 | Republicans/Democrates etc | 0b4e9d4d-96b2-4288-bd52-e269a505cce3 | Politcal Group |
| 1 | CNN/Fox etc. | 14fd0d03-006b-4372-823e-41e32642e3b4 | Media organization |
| 2 | A word or phrase that is unsulting. | 2b162d01-637d-44b6-acf6-77cb8081608f | Insult |
| 3 | A political issue the President is discussing. | 08728ae4-61c9-43ea-86f4-ffbc5289c5ae | Issue |
| 4 | A physical place. For example the White House. | 6d331788-d937-4a11-ac38-4d653e56ddbd | Place |
| 5 | The proper name of a person (John is a person.... | 038de1c3-143a-40fc-a14c-627920311055 | Person |


### Pro Tip
To upload suggestions we'll need the tag ids, which can be cumbersome to work with. It is often convenient to build a map
from the tag name to the id, like so:

In [15]:
tagMap = {tag["name"]:tag["id"] for tag in tags}

In [16]:
tagMap

{'Insult': '2b162d01-637d-44b6-acf6-77cb8081608f',
 'Issue': '08728ae4-61c9-43ea-86f4-ffbc5289c5ae',
 'Media organization': '14fd0d03-006b-4372-823e-41e32642e3b4',
 'Person': '038de1c3-143a-40fc-a14c-627920311055',
 'Place': '6d331788-d937-4a11-ac38-4d653e56ddbd',
 'Politcal Group': '0b4e9d4d-96b2-4288-bd52-e269a505cce3'}
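
A dict comprehension silently keeps only the last id if two tags share a name, and a misspelled lookup fails only when it is first used. The sketch below guards against both, under the assumption that tag names are meant to be unique within a schema. `build_tag_map` and `require_tags` are hypothetical helpers, and the sample tags are two of the rows shown above.

```python
def build_tag_map(tags):
    """Map tag name -> tag id, refusing duplicate names instead of overwriting."""
    tag_map = {}
    for tag in tags:
        name = tag["name"]
        if name in tag_map:
            raise ValueError("duplicate tag name: {!r}".format(name))
        tag_map[name] = tag["id"]
    return tag_map


def require_tags(tag_map, names):
    """Raise KeyError listing every requested name that the map lacks."""
    missing = [name for name in names if name not in tag_map]
    if missing:
        raise KeyError("tags not in schema: {}".format(missing))


sample_tags = [
    {"name": "Person", "id": "038de1c3-143a-40fc-a14c-627920311055"},
    {"name": "Place", "id": "6d331788-d937-4a11-ac38-4d653e56ddbd"},
]
sample_map = build_tag_map(sample_tags)
require_tags(sample_map, ["Person", "Place"])  # passes silently
```

Lookups must match the name exactly as stored on the server, so in this schema the key is `"Politcal Group"`, spelled as it appears in the tags output.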

## Step 3: Create your suggestions
How you create a suggestion is up to you. You are free to use a dictionary, regex, neural network or whatever else
is available.
In this example we'll use the [flashtext](https://github.com/vi3k6i5/flashtext) library to suggest tags for a handful of keywords:




In [18]:
from flashtext import KeywordProcessor
keyword_processor = KeywordProcessor()
keyword_processor.add_keyword("Trump",tagMap["Person"])
keyword_processor.add_keyword("Wall",tagMap["Issue"])
keyword_processor.add_keyword("CNN",tagMap["Media organization"])
keyword_processor.add_keyword("Dumb",tagMap["Insult"])

keyword_processor.add_keyword("Russia",tagMap["Place"])
keyword_processor.add_keyword("Democrats",tagMap["Politcal Group"])


True

In [20]:
keyword_processor.extract_keywords(" Democrats don't want Trump to go to Russia. Dumb! ",
                                   span_info=True)

[('0b4e9d4d-96b2-4288-bd52-e269a505cce3', 1, 10),
 ('038de1c3-143a-40fc-a14c-627920311055', 22, 27),
 ('6d331788-d937-4a11-ac38-4d653e56ddbd', 37, 43),
 ('2b162d01-637d-44b6-acf6-77cb8081608f', 45, 49)]
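
The output above suggests that `start` and `end` are character offsets into the input string, with `end` exclusive. A quick way to check this without running flashtext again is to slice the input with the spans copied from that output:

```python
text = " Democrats don't want Trump to go to Russia. Dumb! "

# (tag_id, start, end) tuples copied from the extract_keywords output above.
spans = [
    ("0b4e9d4d-96b2-4288-bd52-e269a505cce3", 1, 10),
    ("038de1c3-143a-40fc-a14c-627920311055", 22, 27),
    ("6d331788-d937-4a11-ac38-4d653e56ddbd", 37, 43),
    ("2b162d01-637d-44b6-acf6-77cb8081608f", 45, 49),
]

# Ordinary Python slicing recovers the matched keyword for each span.
matched = [text[start:end] for _tag_id, start, end in spans]
print(matched)  # ['Democrats', 'Trump', 'Russia', 'Dumb']
```

Slicing like this is a cheap sanity check to run on any model's output before turning it into suggestions.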

### Step 3.1: Iterate over your examples and make a list of suggestions


In [21]:
suggestions = []
for example in examples: #(Notice the text is located in the content field)
    #For every suggestion that comes out of our model (in this case, flashtext)
    for tag_id,start,end in keyword_processor.extract_keywords(example['content'],span_info=True):
        suggestion= { #Create a suggestion
            "example_id":example['id'], #That refers to a particular example
            "tag_id":tag_id, #and applies a particular tag
            "start":start, #Which starts somewhere in the example
            "end":end # And ends somewhere in the example
        }
        suggestions.append(suggestion)
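
Before uploading, it can help to check that every suggestion has the four keys used above and a span that fits inside its example. This notebook does not say what the server validates, so the rules below are assumptions about what a well-formed suggestion looks like. `find_invalid_suggestions` is a hypothetical helper, and the ids in the sample data are placeholders.

```python
def find_invalid_suggestions(suggestions, examples):
    """Return (index, reason) pairs for suggestions that look malformed."""
    content_by_id = {ex["id"]: ex["content"] for ex in examples}
    required = ("example_id", "tag_id", "start", "end")
    problems = []
    for i, s in enumerate(suggestions):
        missing = [key for key in required if key not in s]
        if missing:
            problems.append((i, "missing keys: {}".format(missing)))
            continue
        content = content_by_id.get(s["example_id"])
        if content is None:
            problems.append((i, "unknown example_id"))
        elif not (0 <= s["start"] < s["end"] <= len(content)):
            problems.append((i, "span out of range"))
    return problems


sample_examples = [{"id": "ex-1", "content": "Trump visits Russia"}]
sample_suggestions = [
    {"example_id": "ex-1", "tag_id": "person", "start": 0, "end": 5},   # valid
    {"example_id": "ex-1", "tag_id": "place", "start": 13, "end": 99},  # end past the text
    {"example_id": "ex-2", "tag_id": "place", "start": 0, "end": 3},    # no such example
]
problems = find_invalid_suggestions(sample_suggestions, sample_examples)
print(problems)  # [(1, 'span out of range'), (2, 'unknown example_id')]
```

An empty result means every suggestion passed these local checks. It says nothing about whether the suggested tags are correct.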

### Step 3.2: Define a model

In [24]:
model_metadata= { # Define any metadata you'd like to store about the model
    "defined_by": "LightTag",
    "comments": "An example model made with Flashtext"
}
data = {
    "model":{
        "name":"demo_suggestions",  #Give the model a name
        "metadata": model_metadata # Provide metadata (optional)
    },
    "suggestions":suggestions #Attach the suggestions you made before
}
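
The payload is a plain dict with a `model` entry and a `suggestions` list, so it can be assembled by a small function and checked for JSON-serializability before it is sent. This is only a sketch: `build_payload` is a hypothetical helper, the ids are placeholders, and the shape is copied from the cell above.

```python
import json


def build_payload(name, suggestions, metadata=None):
    """Assemble the model-plus-suggestions body in the shape used above."""
    model = {"name": name}
    if metadata is not None:  # metadata is optional, so omit the key when unused
        model["metadata"] = metadata
    return {"model": model, "suggestions": list(suggestions)}


payload = build_payload(
    "demo_suggestions",
    [{"example_id": "ex-1", "tag_id": "tag-1", "start": 0, "end": 5}],  # placeholder ids
    metadata={"defined_by": "LightTag"},
)

# json.dumps raises TypeError if any value cannot be encoded as JSON.
encoded = json.dumps(payload)
print(len(payload["suggestions"]), "suggestion(s),", len(encoded), "bytes")
```

The serialization check is most useful when offsets come from a numerical library: NumPy integers, for example, are not JSON-serializable and need converting with `int()` first.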

## Step 4 Upload your model and Suggestions 

In [25]:
resp = session.post(schema['url']+'models/bulk/', json=data)
resp.status_code

201

In [26]:
resp.json()

{'id': '94fc9df1-4dcf-4542-bb0f-e2b74a3d2ead',
 'metadata': {'comments': 'An example model made with Flashtext',
  'defined_by': 'LightTag'},
 'name': 'demo_suggestions',
 'slug': 'demo_suggestions',
 'url': 'http://localhost:8000/api/v1/projects/default/schemas/entity-tags-only/models/demo_suggestions/'}
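
In a script, rather than a notebook, you would probably want to stop when the upload is rejected instead of eyeballing the status code. A minimal sketch follows, assuming that `201` is the only success status (it is the one shown above) and that a successful body contains the `id`, `slug` and `url` fields from that output. `created_model` is a hypothetical helper, and the shape of an error body is not documented here.

```python
def created_model(status_code, body):
    """Return the new model's id, slug and url if the upload was accepted, else raise."""
    if status_code != 201:
        raise RuntimeError(
            "upload failed with status {}: {!r}".format(status_code, body)
        )
    return {"id": body["id"], "slug": body["slug"], "url": body["url"]}


# Response body copied from the output above (metadata omitted for brevity).
sample_body = {
    "id": "94fc9df1-4dcf-4542-bb0f-e2b74a3d2ead",
    "name": "demo_suggestions",
    "slug": "demo_suggestions",
    "url": "http://localhost:8000/api/v1/projects/default/schemas/entity-tags-only/models/demo_suggestions/",
}
model_info = created_model(201, sample_body)
print(model_info["slug"])
```

With the real response this would be called as `created_model(resp.status_code, resp.json())`.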

## Step 5 Assign your suggestion model to a task
Once we have a model, we need to tell LightTag to show its results to our annotators. Suggestions are shown with the rest of the work, and so we assign one or more models to an individual task.
In this example we'll assign two models to our task: the one we defined and the LightTag default model.

### Get the model ids for the schema we defined this model on

In [28]:
schema

{'id': 'fbe6fd1d-bb59-4909-8a88-81772cc0d996',
 'name': 'Entity Tags only',
 'slug': 'entity-tags-only',
 'url': 'http://localhost:8000/api/v1/projects/default/schemas/entity-tags-only/'}