# About

This Python notebook demonstrates extracting themes from each sentence in the favorite animals sample meeting using several methods:
- Themes based on keywords or nouns extracted using the Watson Natural Language Understanding default model
- Themes based on entities extracted using a custom language model, trained to recognize "Animal" entities and deployed in Watson Natural Language Understanding

This notebook is a sample to support a paper presentation at CASCONxEVOKE 2021.

See:
- [CASCONxEVOKE 2021](https://pheedloop.com/casconevoke2021/site/home)
- [Presentation](https://pheedloop.com/casconevoke2021/site/sessions/?id=SESPZ87C5K5VZKT28)
- [Samples GitHub repo](https://github.com/spackows/CASCON-2021_Processing_video)

# Step 1: Download sentences of meeting transcript to working directory

A file containing an array of corrected sentences for a sample meeting recording is available here: [favorite-animals-short-meeting_sentences_arr_corrected.json](https://raw.githubusercontent.com/spackows/CASCON-2021_Processing_video/main/sample-meeting/favorite-animals-short-meeting_sentences_arr_corrected.json)

In this step, download that sentences array file to the notebook working directory.

In [1]:
# Download the file
import urllib.request
sentences_url = "https://raw.githubusercontent.com/spackows/CASCON-2021_Processing_video/main/sample-meeting/favorite-animals-short-meeting_sentences_arr_corrected.json"
sentences_filename = "favorite-animals-short-meeting_sentences_arr_corrected.json"
urllib.request.urlretrieve( sentences_url, sentences_filename )

('favorite-animals-short-meeting_sentences_arr_corrected.json',
 <http.client.HTTPMessage at 0x7faeac27f6a0>)

In [8]:
# View the contents of the working directory
!ls

favorite-animals-short-meeting_sentences_arr_corrected.json


In [24]:
# Read the sentences from the file
import json

with open( sentences_filename ) as json_file:
    sentences_arr = json.load( json_file )
    
print( json.dumps( sentences_arr[0:6], indent=2 ), "\n..." )

[
  "Thanks, everybody, for joining me for this short meeting.",
  "What I wanted to do, today, was to go around the room and asked people to share what is their favorite animal and why.",
  "Okay. My name is Heather and my favorite animal is a dog.",
  "And the reason that it's my favorite animal, probably a lot of people's favorite animal, because dogs are such loving companions who are so very loyal.",
  "And I feel like they seem to know when you need them to come snuggle by you.",
  "They're very perceptive of your feelings and they want to please you."
] 
...
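Later cells assume the downloaded file holds a JSON array of non-empty strings. The following is a minimal sketch of a check for that assumption; `load_sentences` is a hypothetical helper name, not part of the original notebook.

```python
import json

def load_sentences(path):
    """Load a JSON file and confirm it is a list of non-empty strings."""
    with open(path, encoding="utf-8") as json_file:
        data = json.load(json_file)
    if not isinstance(data, list):
        raise ValueError("Expected a JSON array of sentences")
    for i, item in enumerate(data):
        if not isinstance(item, str) or not item.strip():
            raise ValueError(f"Item {i} is not a non-empty string")
    return data
```

In this notebook, it could be called as `sentences_arr = load_sentences( sentences_filename )`.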


# Step 2: Get Credentials for Watson Natural Language Understanding

1. Create a free (Lite plan) instance of the Watson Natural Language Understanding service in the IBM Cloud catalog: [Watson Natural Language Understanding](https://cloud.ibm.com/catalog/services/natural-language-understanding)
2. On the **Service credentials** tab of your Watson Natural Language Understanding service instance, generate new credentials and then copy the **apikey** and the **url**[1] into the code cell below

[1] Note: the URL should have the form `https://api.<region>.natural-language-understanding.watson.cloud.ibm.com/instances/<unique-instance-ID>`

In [10]:
nlu_apikey = ""
nlu_url = ""
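As an alternative to pasting credentials into the notebook, they could be read from environment variables. This is only a sketch: the variable names `NLU_APIKEY` and `NLU_URL`, and both helper functions, are assumptions made for illustration. The URL check is a loose pattern based on the form given in the note above and may not cover every endpoint type.

```python
import os
import re

# Loose pattern for the URL form described in the note above (an assumption,
# not an official validation rule)
NLU_URL_PATTERN = (
    r"^https://api\.[a-z0-9-]+\.natural-language-understanding"
    r"\.watson\.cloud\.ibm\.com/instances/[^/\s]+$"
)

def looks_like_nlu_url(url):
    """Return True if the URL matches the expected general form."""
    return re.match(NLU_URL_PATTERN, url) is not None

def get_nlu_credentials(env=None):
    """Read the API key and URL from (hypothetical) environment variables."""
    env = os.environ if env is None else env
    apikey = env.get("NLU_APIKEY", "")
    url = env.get("NLU_URL", "")
    if not apikey or not url:
        raise ValueError("Set NLU_APIKEY and NLU_URL before running")
    return apikey, url
```

For example, `nlu_apikey, nlu_url = get_nlu_credentials()` would fill in the two variables used by the later cells.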

# Step 3: Extract themes based on keywords and nouns

In this step, use Watson Natural Language Understanding, with the default language model, to extract keywords and nouns.

See:
- [Watson Natural Language Understanding API documentation](https://cloud.ibm.com/apidocs/natural-language-understanding?code=python)
- [Authentication](https://cloud.ibm.com/apidocs/natural-language-understanding?code=python#authentication)
- [`analyze`](https://cloud.ibm.com/apidocs/natural-language-understanding?code=python#analyze)

In [None]:
# Install the library
!pip install --upgrade "ibm-watson>=5.2.3"

In [11]:
# Authenticate with the Watson Natural Language Understanding service
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson.natural_language_understanding_v1 import Features, EntitiesOptions, KeywordsOptions, SyntaxOptions, SyntaxOptionsTokens
authenticator = IAMAuthenticator( nlu_apikey )
natural_language_understanding = NaturalLanguageUnderstandingV1( version="2021-08-01", authenticator=authenticator )
natural_language_understanding.set_service_url( nlu_url )

## Test with one sentence

In [12]:
# Call the API with one sentence
nlu_result = natural_language_understanding.analyze( 
    text=sentences_arr[0], 
    features=Features( 
        keywords=KeywordsOptions(),
        syntax=SyntaxOptions( tokens=SyntaxOptionsTokens( part_of_speech=True ) )
    )).get_result()

In [13]:
# Print raw NLU results
import json
print( json.dumps( nlu_result, indent=2 ) )

{
  "usage": {
    "text_units": 1,
    "text_characters": 57,
    "features": 2
  },
  "syntax": {
    "tokens": [
      {
        "text": "Thanks",
        "part_of_speech": "NOUN",
        "location": [
          0,
          6
        ]
      },
      {
        "text": ",",
        "part_of_speech": "PUNCT",
        "location": [
          6,
          7
        ]
      },
      {
        "text": "everybody",
        "part_of_speech": "PRON",
        "location": [
          8,
          17
        ]
      },
      {
        "text": ",",
        "part_of_speech": "PUNCT",
        "location": [
          17,
          18
        ]
      },
      {
        "text": "for",
        "part_of_speech": "SCONJ",
        "location": [
          19,
          22
        ]
      },
      {
        "text": "joining",
        "part_of_speech": "VERB",
        "location": [
          23,
          30
        ]
      },
      {
        "text": "me",
        "part_of_speech": "PRON",
        "locati

In [14]:
# Print the "themes" of the sentence, based on keywords
keywords_arr = []
for keyword_result in nlu_result["keywords"]:
    keywords_arr.append( keyword_result["text"] )
print( "Themes by keywords:\n" )
print( ", ".join( keywords_arr ) + " | " + sentences_arr[0] )

Themes by keywords:

Thanks, short meeting | Thanks, everybody, for joining me for this short meeting.


In [15]:
# Print the "themes" of the sentence, based on nouns
import re
nouns_arr = []
for syntax_result in nlu_result["syntax"]["tokens"]:
    if re.match( r"NOUN", syntax_result["part_of_speech"] ):
        nouns_arr.append( syntax_result["text"] )
print( "Themes by nouns:\n" )
print( ", ".join( nouns_arr ) + " | " + sentences_arr[0] )

Themes by nouns:

Thanks, meeting | Thanks, everybody, for joining me for this short meeting.


## Extract keywords and nouns for all sentences

In [16]:
# Loop through sentences_arr, calling the API for each sentence
nlu_results_arr = []
for sentence in sentences_arr:
    nlu_result = natural_language_understanding.analyze(
        text=sentence, 
        features=Features( 
            keywords=KeywordsOptions(),
            syntax=SyntaxOptions( tokens=SyntaxOptionsTokens( part_of_speech=True ) )
        )).get_result()
    nlu_results_arr.append( { "sentence" : sentence, "nlu_result" : nlu_result } )    
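If a single request fails (for example, because of a network error), the loop above stops and the results collected so far are harder to reuse. The sketch below shows one way to record failures per sentence and keep going. `analyze_all` and its `analyze_fn` parameter are hypothetical names; the broad `except` is a simplification for illustration.

```python
def analyze_all(sentences, analyze_fn):
    """Call analyze_fn on each sentence, collecting results and failures."""
    results = []
    failures = []
    for sentence in sentences:
        try:
            nlu_result = analyze_fn(sentence)
            results.append({"sentence": sentence, "nlu_result": nlu_result})
        except Exception as err:  # broad on purpose in this sketch
            failures.append({"sentence": sentence, "error": str(err)})
    return results, failures
```

In this notebook, `analyze_fn` could be a small function that wraps the `natural_language_understanding.analyze( ... ).get_result()` call shown above.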

In [None]:
# Print raw results
print( json.dumps( nlu_results_arr, indent=2 ) )

### Themes based on keywords for all sentences

In [18]:
# Build an HTML table to make it easier to see the results
html_keywords = "<table><tr><th>Theme (by keywords)</th><th style='text-align: left;'>Sentence</th></tr>"
for item in nlu_results_arr:
    keywords_arr = []
    sentence = item["sentence"]
    for keyword_result in item["nlu_result"]["keywords"]:
        keywords_arr.append( keyword_result["text"] )
    theme = ", ".join( keywords_arr )
    html_keywords += "<tr><td>" + theme + "</td><td style='text-align: left;'>" + sentence + "</td></tr>"
html_keywords += "</table>"
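The cell above concatenates themes and sentences directly into HTML. If a transcript ever contained characters such as `<` or `&`, they would be interpreted as markup. A minimal sketch of the same table built with escaping from the standard library follows; `themes_table_html` is a hypothetical helper name.

```python
import html

def themes_table_html(rows, theme_header):
    """Build an HTML table from (theme, sentence) pairs, escaping all text."""
    parts = [
        "<table><tr><th>" + html.escape(theme_header) + "</th>"
        "<th style='text-align: left;'>Sentence</th></tr>"
    ]
    for theme, sentence in rows:
        parts.append(
            "<tr><td>" + html.escape(theme) + "</td>"
            "<td style='text-align: left;'>" + html.escape(sentence) + "</td></tr>"
        )
    parts.append("</table>")
    return "".join(parts)
```

The returned string can be passed to `HTML( ... )` for display in the same way as `html_keywords`.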

In [19]:
# Import from the public IPython.display module (IPython.core.display is deprecated for this)
from IPython.display import display, HTML
display( HTML( html_keywords ) )

Theme (by keywords),Sentence
"Thanks, short meeting","Thanks, everybody, for joining me for this short meeting."
"favorite animal, today, room, people","What I wanted to do, today, was to go around the room and asked people to share what is their favorite animal and why."
"favorite animal, name, Heather, dog",Okay. My name is Heather and my favorite animal is a dog.
"favorite animal, lot of people, reason, dogs, such loving companions","And the reason that it's my favorite animal, probably a lot of people's favorite animal, because dogs are such loving companions who are so very loyal."
,And I feel like they seem to know when you need them to come snuggle by you.
feelings,They're very perceptive of your feelings and they want to please you.
"mean dogs, dogs, humans","And I don't think they're, well there are mean dogs, but overall most dogs are very lovable and only just want to please humans."
"favorite animal, dogs",And so I would say dogs are my favorite animal.
,I can go next.
"name Sara, different animal, once, while","So, my name Sara, and every once in a while, I actually reflect on a different animal and I feel like I connect with a different one."


### Themes based on nouns for all sentences

In [20]:
# Build an HTML table to make it easier to see the results
html_nouns = "<table><tr><th>Theme (by nouns)</th><th style='text-align: left;'>Sentence</th></tr>"
for item in nlu_results_arr:
    sentence = item["sentence"]
    nouns_arr = []
    for token_result in item["nlu_result"]["syntax"]["tokens"]:
        if re.match( r"NOUN", token_result["part_of_speech"] ):
            nouns_arr.append( token_result["text"] )
    theme = ", ".join( nouns_arr )
    html_nouns += "<tr><td>" + theme + "</td><td style='text-align: left;'>" + sentence + "</td></tr>"
html_nouns += "</table>"
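Because each noun token is appended as it is found, a sentence that uses the same noun twice produces a theme with a repeated word. One possible refinement, sketched here with a hypothetical helper named `unique_in_order`, is to drop repeats while keeping the original word order.

```python
def unique_in_order(words):
    """Remove case-insensitive duplicates, keeping first occurrences in order."""
    seen = set()
    unique = []
    for word in words:
        key = word.lower()
        if key not in seen:
            seen.add(key)
            unique.append(word)
    return unique
```

With this helper, the theme line could become `theme = ", ".join( unique_in_order( nouns_arr ) )`.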

In [21]:
display( HTML( html_nouns ) )

Theme (by nouns),Sentence
"Thanks, meeting","Thanks, everybody, for joining me for this short meeting."
"today, room, people, animal","What I wanted to do, today, was to go around the room and asked people to share what is their favorite animal and why."
"name, animal, dog",Okay. My name is Heather and my favorite animal is a dog.
"reason, animal, lot, people, animal, dogs, companions","And the reason that it's my favorite animal, probably a lot of people's favorite animal, because dogs are such loving companions who are so very loyal."
,And I feel like they seem to know when you need them to come snuggle by you.
feelings,They're very perceptive of your feelings and they want to please you.
"dogs, dogs, humans","And I don't think they're, well there are mean dogs, but overall most dogs are very lovable and only just want to please humans."
"dogs, animal",And so I would say dogs are my favorite animal.
,I can go next.
"name, once, while, animal, one","So, my name Sara, and every once in a while, I actually reflect on a different animal and I feel like I connect with a different one."


# Step 4: Extract themes based on entities, using a custom language model

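In this step, compare the default-model themes from Step 3 with themes from a custom language model trained to recognize "Animal" entities. The custom model returns only the animals mentioned in each sentence, so the resulting themes are more focused.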

A file containing sample results from a model customized to recognize "Animal" entities is available here: [favorite-animals-short-meeting_nlu-custom-model-results.json](https://raw.githubusercontent.com/spackows/CASCON-2021_Processing_video/main/sample-meeting/favorite-animals-short-meeting_nlu-custom-model-results.json)

Creating a custom language model to use with Watson Natural Language Understanding is demonstrated in full detail in these workshops: [CASCON-2019_NLP-workshops](https://github.com/spackows/CASCON-2019_NLP-workshops)

In [22]:
# Download the file
import urllib.request
custom_nlu_url = "https://raw.githubusercontent.com/spackows/CASCON-2021_Processing_video/main/sample-meeting/favorite-animals-short-meeting_nlu-custom-model-results.json"
custom_nlu_filename = "favorite-animals-short-meeting_nlu-custom-model-results.json"
urllib.request.urlretrieve( custom_nlu_url, custom_nlu_filename )

('favorite-animals-short-meeting_nlu-custom-model-results.json',
 <http.client.HTTPMessage at 0x7fae9520b100>)

In [135]:
# View the contents of the working directory
!ls

favorite-animals-short-meeting_nlu-custom-model-results.json
favorite-animals-short-meeting_sentences_arr_corrected.json


In [25]:
# Read the custom model results from the file
with open( custom_nlu_filename ) as json_file:
    nlu_results_arr_custom = json.load( json_file )
    
print( json.dumps( nlu_results_arr_custom[1:3], indent=2 ), "\n..." )

[
  {
    "sentence": "What I wanted to do, today, was to go around the room and asked people to share what is their favorite animal and why.",
    "nlu_result": {
      "entities": []
    }
  },
  {
    "sentence": "Okay. My name is Heather and my favorite animal is a dog.",
    "nlu_result": {
      "entities": [
        {
          "type": "Animal",
          "text": "dog",
          "relevance": 0.81586,
          "count": 1
        }
      ]
    }
  }
] 
...


In [None]:
# View all the raw, custom results for interest
print( json.dumps( nlu_results_arr_custom, indent=2 ) )

In [27]:
# A simple helper for crude singularization: strips one trailing "s"
def singular( word ):
    return re.sub( r"s$", "", word )

# Build an HTML table to make it easier to see the results
html_custom_entities = "<table><tr><th>Theme (by custom entities)</th><th style='text-align: left;'>Sentence</th></tr>"
for item in nlu_results_arr_custom:
    sentence = item["sentence"]
    entities_arr = []
    entities_results_arr = item["nlu_result"]["entities"]
    for entity_result in entities_results_arr:
        if re.match( r"Animal", entity_result["type"] ):
            entities_arr.append( singular( entity_result["text"] ).lower() )
    theme = ", ".join( entities_arr )
    html_custom_entities += "<tr><td>" + theme + "</td><td style='text-align: left;'>" + sentence + "</td></tr>"
html_custom_entities += "</table>"
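The `singular` helper above removes any trailing "s", which works for "dogs" but would also shorten words such as "walrus". The sketch below is a slightly more cautious variant under the hypothetical name `singular_cautious`. It is still a heuristic and will get some English plurals wrong; a proper lemmatizer would be more robust.

```python
import re

def singular_cautious(word):
    """Heuristically convert a plural English noun to its singular form."""
    lower = word.lower()
    # Leave words like "walrus", "class", or "ibis" unchanged
    if re.search(r"(ss|us|is)$", lower):
        return word
    # "puppies" -> "puppy"
    if lower.endswith("ies") and len(lower) > 3:
        return word[:-3] + "y"
    # "foxes" -> "fox", "finches" -> "finch", "classes" -> "class"
    if re.search(r"(xes|ches|shes|sses)$", lower):
        return word[:-2]
    # "dogs" -> "dog", "horses" -> "horse"
    if lower.endswith("s"):
        return word[:-1]
    return word
```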

In [28]:
display( HTML( html_custom_entities ) )

Theme (by custom entities),Sentence
,"Thanks, everybody, for joining me for this short meeting."
,"What I wanted to do, today, was to go around the room and asked people to share what is their favorite animal and why."
dog,Okay. My name is Heather and my favorite animal is a dog.
dog,"And the reason that it's my favorite animal, probably a lot of people's favorite animal, because dogs are such loving companions who are so very loyal."
,And I feel like they seem to know when you need them to come snuggle by you.
,They're very perceptive of your feelings and they want to please you.
dog,"And I don't think they're, well there are mean dogs, but overall most dogs are very lovable and only just want to please humans."
dog,And so I would say dogs are my favorite animal.
,I can go next.
,"So, my name Sara, and every once in a while, I actually reflect on a different animal and I feel like I connect with a different one."
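Once each sentence has an entity-based theme, the themes can be summarized for the whole meeting. The following sketch tallies "Animal" entities across all results, assuming the result structure shown in the raw output above (a list of items with `nlu_result.entities`, each entity having `type`, `text`, and `count`). `count_animal_mentions` is a hypothetical helper name.

```python
from collections import Counter

def count_animal_mentions(results):
    """Tally "Animal" entity mentions across all sentences."""
    counts = Counter()
    for item in results:
        for entity in item["nlu_result"]["entities"]:
            if entity["type"] == "Animal":
                # Fall back to 1 if an entity has no "count" field
                counts[entity["text"].lower()] += entity.get("count", 1)
    return counts
```

In this notebook, it could be called as `count_animal_mentions( nlu_results_arr_custom )`. Plural and singular forms are counted separately unless the text is normalized first.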
