# Watson Natural Language Understanding Tutorial

This is a brief tutorial that ensures you are able to access the Watson NLU service and that you have the tools to learn more.  We highly encourage you to visit the links provided throughout to further your understanding of the additional capabilities that Watson NLU has to offer.

* [Watson NLU getting started](https://cloud.ibm.com/docs/services/natural-language-understanding?topic=natural-language-understanding-getting-started)
* [Watson NLU API docs](https://cloud.ibm.com/apidocs/natural-language-understanding)

If you have not already carried out the instructions that preceded this tutorial, please do so now.  You should have:

1. Installed the [Watson Developer Cloud Python software SDK](https://github.com/watson-developer-cloud)
2. Created a `natural language understanding resource` in the [IBM Cloud](https://cloud.ibm.com)
3. Saved the URL, API key, and version associated with your resource in a file located under your home directory.

In [2]:
import sys
import os
import json
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_watson.natural_language_understanding_v1 import Features, EntitiesOptions, KeywordsOptions
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

If you have saved your key, version, and URL in `~/.ibm/ibmauth.py`, then the following code block should not raise an exception.
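As a rough illustration, the file might look like the sketch below. The variable names match the import used later in this tutorial. The values are placeholders, and the version date is only an example of the `YYYY-MM-DD` format, so substitute the values that belong to your own resource.

```python
# Hypothetical contents of ~/.ibm/ibmauth.py (all values are placeholders)
NLU_KEY = "<your-api-key>"
NLU_URL = "<your-service-url>"
NLU_VERSION = "2019-07-12"  # example date only; use the version you intend to pin
```

Keeping credentials in a file outside your project directory makes it less likely that they end up in version control.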

**It is important to be explicit about the version** with every request, because software changes and, with APIs, the source code is not always readily available.  If you want to be able to reproduce your analysis at some point in the future, the version is essential.

In [3]:
### import API key
apikey_dir = os.path.join(os.path.expanduser("~"),".ibm")
sys.path.append(apikey_dir)

if not os.path.exists(apikey_dir):
    raise Exception(f"please store your API key in a file within '{apikey_dir}' before proceeding")

from ibmauth import NLU_KEY, NLU_URL, NLU_VERSION
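Once the three values are imported, a quick sanity check can catch an empty or mistyped setting before any request is made. The sketch below is not part of the Watson SDK. `check_nlu_settings` is a hypothetical helper, and the rules it applies (an `https://` URL and a `YYYY-MM-DD` version) are assumptions about what well-formed settings look like.

```python
import re
from datetime import datetime


def check_nlu_settings(key, url, version):
    """Raise ValueError if a setting is missing or obviously malformed."""
    settings = (("NLU_KEY", key), ("NLU_URL", url), ("NLU_VERSION", version))
    for name, value in settings:
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{name} is missing or empty")
    # Assumption: the service URL is served over HTTPS.
    if not url.startswith("https://"):
        raise ValueError("NLU_URL should start with https://")
    # Assumption: the version is a release date written as YYYY-MM-DD.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
        raise ValueError("NLU_VERSION should look like YYYY-MM-DD")
    # strptime rejects impossible dates such as 2019-13-40.
    datetime.strptime(version, "%Y-%m-%d")
    return True


# Placeholder values; in the notebook you would pass NLU_KEY, NLU_URL, NLU_VERSION.
print(check_nlu_settings("placeholder-key", "https://example.com", "2019-07-12"))
```

A check like this only validates the shape of the settings. Whether the key is actually accepted is not known until the first request is sent.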

It is common practice to wrap connection code in a function to help ensure clean handling of errors.

In [4]:
def connect_watson_nlu():
    """
    establish a connection to watson nlu service
    """
    
    authenticator = IAMAuthenticator(NLU_KEY)
    service = NaturalLanguageUnderstandingV1(version=NLU_VERSION,
                                             authenticator=authenticator)

    service.set_service_url(NLU_URL)

    print("\nConnection established.\n")
    return service

In [5]:
## establish a connection to the service

service = connect_watson_nlu()


Connection established.



The Watson NLU service has the following features

* **Categories** - Categorize your content using a five-level classification hierarchy.
* **Concepts** - Identify high-level concepts that aren't necessarily directly referenced in the text.
* **Emotions** - Analyze emotion conveyed by specific target phrases or by the document as a whole.
* **Entities** - Find people, places, events, and other types of entities mentioned in your content.
* **Keywords** - Search your content for relevant keywords.
* **Metadata** - For HTML and URL input, get the author of the webpage, the page title, and the publication date.
* **Relations** - Recognize when two entities are related, and identify the type of relation.
* **Semantic Role** - Parse into subject-action-object form, and identify subjects or objects of an action.
* **Sentiment** - Analyze the sentiment toward specific target phrases and the sentiment of the document as a whole.
* **Custom Models** - Identify custom entities and relations unique to your domain with Watson Knowledge Studio.

The Watson NLU service accepts both URLs and plain-text strings as input.  Change the `target_url` below to see how the output changes.

In [10]:
## url example
target_url = "https://www.ibm.com/blogs/insights-on-business/healthcare/the-enormous-potential-of-ai-for-pharmaceutical"
response = service.analyze(url=target_url,
                           features=Features(entities=EntitiesOptions(),
                                             keywords=KeywordsOptions())).get_result()
print(json.dumps(response, indent=2))

{
  "usage": {
    "text_units": 1,
    "text_characters": 5942,
    "features": 2
  },
  "retrieved_url": "https://www.ibm.com/blogs/insights-on-business/healthcare/the-enormous-potential-of-ai-for-pharmaceutical/",
  "language": "en",
  "keywords": [
    {
      "text": "Dr. Lester Russell",
      "relevance": 0.676314,
      "count": 1
    },
    {
      "text": "genuine practical applications of machine intelligence",
      "relevance": 0.661631,
      "count": 1
    },
    {
      "text": "adoption of Artificial Intelligence",
      "relevance": 0.601401,
      "count": 1
    },
    {
      "text": "Medical professionals",
      "relevance": 0.591791,
      "count": 2
    },
    {
      "text": "skills gap",
      "relevance": 0.572012,
      "count": 1
    },
    {
      "text": "main challenge",
      "relevance": 0.56778,
      "count": 1
    },
    {
      "text": "important move",
      "relevance": 0.564713,
      "count": 1
    },
    {
      "text": "research-intensive org

You can see from the JSON-formatted output that the entity type, text, counts, and more are readily available.  In applications of natural language processing, the text often takes the form of a collection of documents, or corpus.
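Both points can be sketched in a few lines: pulling fields out of the response dictionary, and looping over a corpus. `top_keywords` and `analyze_corpus` are hypothetical helpers, not SDK functions. `analyze_corpus` assumes a `service` and a `features` object built as elsewhere in this tutorial, sends one request per document, and does no error handling. The sample dictionary reuses a few keywords from the output above so the first helper can be tried offline.

```python
def top_keywords(response, n=3):
    """Return the n most relevant (text, relevance) pairs from a response dict."""
    keywords = response.get("keywords", [])
    ranked = sorted(keywords, key=lambda kw: kw["relevance"], reverse=True)
    return [(kw["text"], kw["relevance"]) for kw in ranked[:n]]


def analyze_corpus(service, documents, features):
    """Analyze each document in turn and return the list of response dicts."""
    return [service.analyze(text=doc, features=features).get_result()
            for doc in documents]


# A fragment of the output shown above, so this part runs without the service
sample = {
    "keywords": [
        {"text": "Dr. Lester Russell", "relevance": 0.676314, "count": 1},
        {"text": "skills gap", "relevance": 0.572012, "count": 1},
        {"text": "Medical professionals", "relevance": 0.591791, "count": 2},
    ]
}
print(top_keywords(sample, n=2))

# With a live connection you might then call, for a list of strings `corpus`:
# responses = analyze_corpus(service, corpus, features)
```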

In [13]:
## text example
text = 'NYC is a great city, but the winters are cold.  JFK and Newark are close by so it is easy to get away'
response = service.analyze(text=text,
                           features=Features(entities=EntitiesOptions(sentiment=True),
                                             keywords=KeywordsOptions())).get_result()

print(json.dumps(response, indent=2))

{
  "usage": {
    "text_units": 1,
    "text_characters": 101,
    "features": 2
  },
  "language": "en",
  "keywords": [
    {
      "text": "great city",
      "relevance": 0.982499,
      "emotion": {
        "sadness": 0.194384,
        "joy": 0.552311,
        "fear": 0.041998,
        "disgust": 0.015716,
        "anger": 0.03428
      },
      "count": 1
    },
    {
      "text": "JFK",
      "relevance": 0.772045,
      "emotion": {
        "sadness": 0.292683,
        "joy": 0.160798,
        "fear": 0.201947,
        "disgust": 0.185217,
        "anger": 0.212865
      },
      "count": 1
    },
    {
      "text": "Newark",
      "relevance": 0.659126,
      "emotion": {
        "sadness": 0.292683,
        "joy": 0.160798,
        "fear": 0.201947,
        "disgust": 0.185217,
        "anger": 0.212865
      },
      "count": 1
    },
    {
      "text": "winters",
      "relevance": 0.58918,
      "emotion": {
        "sadness": 0.194384,
        "joy": 0.552311,
       

Working with text is incredibly nuanced.  NYC and Newark are without a doubt locations, but in the context of this sentence JFK refers to the airport rather than the person.  Despite this ambiguity, there is a lot of information to be gained.  We have also included the `sentiment` flag for entities.  Check out the [Watson NLU API docs](https://cloud.ibm.com/apidocs/natural-language-understanding) and find at least one way to modify the types of output.  For example, you could enable the `emotion` flag under keywords.
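When the `emotion` flag is enabled for keywords (for instance with `KeywordsOptions(emotion=True)`), each keyword carries a dictionary of emotion scores, as in the output shown earlier. A minimal sketch of summarizing those scores follows. `dominant_emotions` is a hypothetical helper, and the sample values are copied from that output so the example runs without the service.

```python
def dominant_emotions(response):
    """Map each keyword's text to its highest-scoring emotion label."""
    result = {}
    for kw in response.get("keywords", []):
        emotion = kw.get("emotion")
        if emotion:  # present only when the emotion flag was enabled
            result[kw["text"]] = max(emotion, key=emotion.get)
    return result


# Two keywords copied from the output shown earlier
sample = {
    "keywords": [
        {"text": "great city", "relevance": 0.982499, "count": 1,
         "emotion": {"sadness": 0.194384, "joy": 0.552311, "fear": 0.041998,
                     "disgust": 0.015716, "anger": 0.03428}},
        {"text": "JFK", "relevance": 0.772045, "count": 1,
         "emotion": {"sadness": 0.292683, "joy": 0.160798, "fear": 0.201947,
                     "disgust": 0.185217, "anger": 0.212865}},
    ]
}
print(dominant_emotions(sample))
```

Taking only the top label discards a lot of detail. For "JFK" the scores are close together, so the full dictionary is often more informative than a single winner.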

See the other [Watson SDK examples](https://github.com/watson-developer-cloud/python-sdk/tree/master/examples) to work with other Watson services like [visual recognition](https://www.ibm.com/watson/services/visual-recognition) and [speech to text](https://www.ibm.com/watson/services/speech-to-text).  To access those services, you can create and then save an API key, URL, and version in the same file under appropriate variable names.  It is also good to keep in mind that Python is just one of several [SDKs that can be used for NLU](https://cloud.ibm.com/docs/services/natural-language-understanding?topic=watson-using-sdks#using-sdks) and other services.