## Watson NLP Example for Text Extensions for Pandas

This demo shows how to use the `watson` module from Text Extensions for Pandas to
process a Watson NLP response from IBM Cloud into Pandas DataFrames for analysis.
The response should be decoded JSON, that is, a Python dictionary. The following
features are processed into DataFrames:

* entities
* keywords
* relations
* semantic_roles
* syntax with sentences and tokens

To authenticate with IBM Cloud, set the environment variable `IBM_AUTH_KEY` to
your API key before making requests with
`ibm_watson.NaturalLanguageUnderstandingV1`.

In [1]:
# INITIALIZATION BOILERPLATE

# The Jupyter kernel for this notebook usually starts up inside the notebooks
# directory, but the text_extensions_for_pandas package code is in the parent
# directory. Add that parent directory to the front of the Python include path.
import sys
if sys.path[0] != "..":
    # Insert rather than overwrite, so the existing first entry is kept.
    sys.path.insert(0, "..")

import json
import os
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson.natural_language_understanding_v1 import (
    Features, CategoriesOptions, ConceptsOptions, EmotionOptions,
    EntitiesOptions, KeywordsOptions, MetadataOptions, RelationsOptions,
    SemanticRolesOptions, SentimentOptions, SyntaxOptions, SyntaxOptionsTokens,
)
from text_extensions_for_pandas.io.watson import watson_nlp_parse_response

In [23]:
# Retrieve the APIKEY for authentication
apikey = os.environ.get("IBM_AUTH_KEY")
if apikey is None:
    raise ValueError("Expected apikey in the environment variable 'IBM_AUTH_KEY'")

In [24]:
# Initialize the authenticator for making requests
authenticator = IAMAuthenticator(apikey)
natural_language_understanding = NaturalLanguageUnderstandingV1(
    version='2019-07-12',
    authenticator=authenticator
)

natural_language_understanding.set_service_url('https://api.us-south.natural-language-understanding.watson.cloud.ibm.com/instances/21b9b875-4ddb-46ad-bb22-d78747622ca7')

In [25]:
# Make the request
response = natural_language_understanding.analyze(
    url='https://raw.githubusercontent.com/CODAIT/text-extensions-for-pandas/master/resources/holy_grail.txt',
    features=Features(
        #categories=CategoriesOptions(limit=3), 
        #concepts=ConceptsOptions(limit=3), 
        #emotion=EmotionOptions(targets=['grail']),
        entities=EntitiesOptions(sentiment=True, limit=3),
        keywords=KeywordsOptions(sentiment=True, emotion=True, limit=3),
        #metadata=MetadataOptions(),
        relations=RelationsOptions(),
        semantic_roles=SemanticRolesOptions(limit=3),
        #sentiment=SentimentOptions(targets=['Arthur']),
        syntax=SyntaxOptions(sentences=True, tokens=SyntaxOptionsTokens(lemma=True, part_of_speech=True))  # Experimental
    )).get_result()

In [12]:
# View response as JSON
#print(json.dumps(response, indent=2))

In [13]:
# Get the response as processed Pandas DataFrames
dfs = watson_nlp_parse_response(response)

In [26]:
dfs.keys()

dict_keys(['entities', 'keywords', 'relations', 'semantic_roles', 'syntax.sentence', 'syntax.tokens'])

In [27]:
dfs['keywords']

| | count | emotion.anger | emotion.disgust | emotion.fear | emotion.joy | emotion.sadness | relevance | sentiment.label | sentiment.score | text |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.071927 | 0.031335 | 0.058051 | 0.691404 | 0.175057 | 0.746411 | neutral | 0.0 | legend of King Arthur |
| 1 | 1 | 0.021033 | 0.095661 | 0.01634 | 0.810654 | 0.046902 | 0.642571 | positive | 0.835873 | Sir Lancelot |
| 2 | 1 | 0.112061 | 0.033299 | 0.043658 | 0.747356 | 0.09149 | 0.642235 | neutral | 0.0 | King Arthur |
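
The dotted column names in the keywords output, such as `emotion.anger` and `sentiment.label`, suggest that nested objects in the JSON response are flattened into one column per leaf field. The sketch below illustrates that idea with `pandas.json_normalize` on a small hand-written list. The nesting of the input and the use of `json_normalize` are assumptions made for illustration; this is not necessarily how `watson_nlp_parse_response` is implemented.

```python
import pandas as pd

# Hand-written stand-in for the "keywords" part of a decoded response.
# The values are copied from the output above; the nested layout is assumed.
keywords = [
    {
        "text": "legend of King Arthur",
        "relevance": 0.746411,
        "count": 1,
        "sentiment": {"score": 0.0, "label": "neutral"},
        "emotion": {"anger": 0.071927, "joy": 0.691404},
    },
    {
        "text": "Sir Lancelot",
        "relevance": 0.642571,
        "count": 1,
        "sentiment": {"score": 0.835873, "label": "positive"},
        "emotion": {"anger": 0.021033, "joy": 0.810654},
    },
]

# json_normalize joins nested keys with "." by default,
# e.g. {"emotion": {"joy": ...}} becomes the column "emotion.joy".
flat = pd.json_normalize(keywords)
print(sorted(flat.columns))
```

Dotted names are not valid Python attribute names, so such columns have to be selected with bracket indexing, for example `flat["emotion.joy"]`.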


In [28]:
dfs['entities']

| | confidence | count | relevance | sentiment.label | sentiment.mixed | sentiment.score | text | type |
|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 12 | 0.956097 | negative | 1.0 | -0.312834 | Arthur | Person |
| 1 | 1.0 | 5 | 0.678523 | positive | | 0.835873 | Lancelot | Person |
| 2 | 0.977538 | 2 | 0.644313 | neutral | | 0.0 | Monty Python | Person |


In [29]:
dfs['syntax.sentence']

| | location | text | char_span |
|---|---|---|---|
| 0 | [0, 273] | Monty Python and the Holy Grail is a 1975 Brit... | [0, 273): 'Monty Python and the Holy Grail is ... |
| 1 | [274, 405] | It was conceived during the hiatus between the... | [274, 405): 'It was conceived during the hiatu... |
| 2 | [407, 642] | While the group's first film, And Now for Some... | [407, 642): 'While the group's first film, And... |
| 3 | [643, 720] | Thirty years later, Idle used the film as the ... | [643, 720): 'Thirty years later, Idle used the... |
| 4 | [722, 823] | Monty Python and the Holy Grail grossed more t... | [722, 823): 'Monty Python and the Holy Grail g... |
| 5 | [824, 954] | In the US, it was selected as the second-best ... | [824, 954): 'In the US, it was selected as the... |
| 6 | [955, 1122] | In the UK, readers of Total Film magazine in 2... | [955, 1122): 'In the UK, readers of Total Film... |
| 7 | [1122, 1256] | [5] In AD 932, King Arthur and his squire, Pa... | [1122, 1256): '[5] In AD 932, King Arthur and ... |
| 8 | [1257, 1488] | Along the way, he recruits Sir Bedevere the Wi... | [1257, 1488): 'Along the way, he recruits Sir ... |
| 9 | [1489, 1639] | Arthur leads the men to Camelot, but upon furt... | [1489, 1639): 'Arthur leads the men to Camelot... |
