
# Interacting with the IBM Watson Natural Language Understanding API

Another useful API, especially when dealing with text, is the [IBM Watson Natural Language Understanding API](https://console.bluemix.net/catalog/services/natural-language-understanding), which offers a variety of text analysis functions, such as sentiment analysis, entity extraction, and keyword extraction.

The examples below show how to take an unstructured piece of text (either the text itself or a URL pointing to a page with text) and analyze it.





## Sentiment and emotion analysis

We will start with the `/analyze` API call ([documentation](https://cloud.ibm.com/apidocs/natural-language-understanding#analyzeget)), which takes a piece of text as input and returns an analysis across various dimensions.

The API supports the following analyses:

`categories,classifications,concepts,emotion,entities,keywords,metadata,relations,semantic_roles,sentiment,summarization (experimental),syntax`

The API supports not only English, but also a [variety of non-English languages](https://cloud.ibm.com/docs/natural-language-understanding?topic=natural-language-understanding-detectable-languages).

In our introductory attempt, we will use the `sentiment` and `emotion` features and focus on English texts.



In [1]:
import requests

In [2]:
URL = 'https://api.us-south.natural-language-understanding.watson.cloud.ibm.com/instances/9e683088-0d12-4399-8118-518f3e60e8c4'

# My own API key. It may run out of quota
# You can register and get your own credentials
# The ones below have a quota of 1000 calls per day 
# and can run out quickly if multiple people use these
API_KEY = 'yx39wyiwPNGm7DoDUPCSJB4SzFkr0qurARfbGYyEdaoC'

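def analyzeText(text=None, url=None):
    """Run emotion and sentiment analysis on raw text or on the page at a URL.

    Pass either `text` or `url`; requests leaves out parameters whose value is None.
    """
    endpoint = f"{URL}/v1/analyze"
    # The API key is sent as the password, with the literal username "apikey"
    username = "apikey"
    password = API_KEY

    parameters = {
        'features': 'emotion,sentiment',
        'version': '2022-04-07',
        'text': text,
        'language': 'en',
        'url': url,  # this is an alternative to sending the text
    }

    resp = requests.get(endpoint, params=parameters, auth=(username, password))

    return resp.json()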
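
The JSON returned by `analyzeText` nests the document-level results under the `sentiment` and `emotion` keys. The sketch below shows one way to read them. The `sample` dictionary is hand-written to mirror the documented response shape, its numbers are made up for illustration and are not real API output, and `summarize_analysis` is a hypothetical helper name.

```python
def summarize_analysis(data):
    """Return (sentiment score, sentiment label, strongest emotion)."""
    sentiment = data['sentiment']['document']
    emotions = data['emotion']['document']['emotion']
    # Pick the emotion name with the highest score
    strongest = max(emotions, key=emotions.get)
    return sentiment['score'], sentiment['label'], strongest


# Hand-written stand-in for a response; the values are illustrative only.
sample = {
    'sentiment': {'document': {'score': -0.61, 'label': 'negative'}},
    'emotion': {'document': {'emotion': {
        'sadness': 0.62, 'joy': 0.10, 'fear': 0.08,
        'disgust': 0.05, 'anger': 0.07,
    }}},
}

score, label, strongest = summarize_analysis(sample)
print(score, label, strongest)
```

With a real response, `summarize_analysis(analyzeText(text="..."))` would be used in the same way, provided the call succeeded and both features were returned.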

### Exercise

* First of all, **get your own credentials for the IBM Watson API**. The demo key that we use above has a limited quota.
* Use an API to get news articles. 
    * Option 1: Use the API at https://newsapi.org to fetch the news from various sources. Print the entities that are currently being discussed in the news, together with their relevance value and the associated sentiment.
    * Option 2: Use the NY Times API to fetch the Top Stories. You can register and get an API key at https://developer.nytimes.com/. The `Top Stories V2 API` provides the details of the news of the day. The documentation is at https://developer.nytimes.com/docs/top-stories-product/1/overview and the API call is https://api.nytimes.com/svc/topstories/v2/home.json?api-key=PUTYOURKEYHERE. Repeat the entity extraction process described in Option 1.
    * Option 3: Use the Guardian API at https://open-platform.theguardian.com/documentation/ to fetch news from The Guardian.
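
For Option 2, the request URL can be assembled from the pieces given in the exercise. The following is a minimal sketch, not a full solution: `top_stories_url` is a hypothetical helper, and the `section` argument is an assumption that generalizes the `home` section shown in the URL above.

```python
from urllib.parse import urlencode

NYT_BASE = 'https://api.nytimes.com/svc/topstories/v2'


def top_stories_url(api_key, section='home'):
    """Build the Top Stories URL for one section (hypothetical helper)."""
    query = urlencode({'api-key': api_key})
    return f"{NYT_BASE}/{section}.json?{query}"


nyt_url = top_stories_url('PUTYOURKEYHERE')
print(nyt_url)
```

The resulting string can then be passed to `requests.get`, in the same way that `analyzeText` calls the Watson endpoint.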


In [3]:
# !sudo -H pip3 install newsapi-python
!pip install newsapi-python

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting newsapi-python
  Downloading newsapi_python-0.2.6-py2.py3-none-any.whl (7.9 kB)
Installing collected packages: newsapi-python
Successfully installed newsapi-python-0.2.6


In [12]:
from newsapi import NewsApiClient

# Init
newsapi = NewsApiClient(api_key='a7eab21c34e545dba418c4344d59a54f')

# /v2/top-headlines
top_headlines = newsapi.get_top_headlines(q='queen',
                                          language='en',
                                          page_size=50)

# top_headlines.keys()
articles = top_headlines['articles']
for article in articles:
#     article_url = article['url']
#     # print(article)
    print(article['content'])
    print(article['url'])
#     data = analyzeText(url=article_url)
#     print(data['sentiment']['document'])
#     print(data['emotion']['document'])
    print()

People queue to pay respects to the Queen, as the coffin of Britain’s Queen Elizabeth lies in state, in London, Britain, September 14, 2022. REUTERS
LONDON — Mourners from all walks of life filed pa… [+4165 chars]
https://www.inquirer.net

It was 3.30pm when Leigh, Thony and James arrived at the back of the queue at Millennium Bridge to see the Queen lie in state.
None of them identified as royalists but had different reasons to come.… [+4042 chars]
https://news.sky.com/story/sustained-by-pimms-lager-and-katsu-curry-but-only-respect-for-the-queen-in-the-queue-12697688

A member of the Royal Guard surrounding the coffin of Queen Elizabeth II has fainted off the podium during the 24-hour vigil for the late monarch.
https://www.skynews.com.au/world-news/united-kingdom/queens-guard-collapses-in-front-of-her-coffin/video/aec6340ccaad1f488dd855e7136d65c4

None
https://www.cnn.com/2022/09/14/uk/queen-elizabeth-silent-procession-intl-gbr/index.html

A member of the Royal Guard surrounding th

In [16]:
!sudo pip3 install -U -q PyMySQL sqlalchemy sql_magic

In [17]:
from sqlalchemy import create_engine

conn_string = "mysql+pymysql://{user}:{password}@{host}/".format(
    host="db.ipeirotis.org", user="student", password="dwdstudent2015"
)

engine = create_engine(conn_string)

In [18]:
# Query to create a database
db_name = "public"
create_db_query = (
    f"CREATE DATABASE IF NOT EXISTS {db_name} DEFAULT CHARACTER SET 'utf8'"
)

# Create a database
engine.execute(create_db_query)

<sqlalchemy.engine.cursor.LegacyCursorResult at 0x7f748e4ff610>

In [49]:
suffix = "rm5486"
table_name = f"{suffix}_news"
# Create a table
create_table_query = f"""CREATE TABLE IF NOT EXISTS {db_name}.{table_name} 
                                (content varchar(15000), 
                                url varchar(250), 
                                PRIMARY KEY(url)
                                )"""
engine.execute(create_table_query)

<sqlalchemy.engine.cursor.LegacyCursorResult at 0x7f7486468e50>

In [50]:
query_template = f"""
                    INSERT IGNORE INTO 
                    {db_name}.{table_name}(content,  url) 
                    VALUES (%s, %s)
                  """
for article in articles:
    content = article['content']
    url = article['url']

    print("Inserting article", content, "with", url, "as URL")
    query_parameters = (content, url)
    engine.execute(query_template, query_parameters)

Inserting article People queue to pay respects to the Queen, as the coffin of Britain’s Queen Elizabeth lies in state, in London, Britain, September 14, 2022. REUTERS
LONDON — Mourners from all walks of life filed pa… [+4165 chars] with https://www.inquirer.net as URL
Inserting article It was 3.30pm when Leigh, Thony and James arrived at the back of the queue at Millennium Bridge to see the Queen lie in state.
None of them identified as royalists but had different reasons to come.… [+4042 chars] with https://news.sky.com/story/sustained-by-pimms-lager-and-katsu-curry-but-only-respect-for-the-queen-in-the-queue-12697688 as URL
Inserting article A member of the Royal Guard surrounding the coffin of Queen Elizabeth II has fainted off the podium during the 24-hour vigil for the late monarch. with https://www.skynews.com.au/world-news/united-kingdom/queens-guard-collapses-in-front-of-her-coffin/video/aec6340ccaad1f488dd855e7136d65c4 as URL
Inserting article None with https://www.cnn.com/2
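
Because `url` is the primary key, `INSERT IGNORE` silently skips an article whose URL is already stored, and a `content` of `None` is written as `NULL`. The sketch below reproduces that behavior with Python's built-in `sqlite3` module as a stand-in for the MySQL server, so it runs without a database connection. This is an assumption made for illustration: SQLite spells the statement `INSERT OR IGNORE` and uses `?` placeholders instead of `%s`, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE news (content TEXT, url TEXT PRIMARY KEY)")

# Made-up rows; the second one repeats a URL to show duplicates being skipped
sample_articles = [
    {'content': 'First article text', 'url': 'https://example.com/a'},
    {'content': 'Same URL again', 'url': 'https://example.com/a'},
    {'content': None, 'url': 'https://example.com/b'},
]

insert_sql = "INSERT OR IGNORE INTO news (content, url) VALUES (?, ?)"
for article in sample_articles:
    # Values are passed separately from the SQL text, never formatted into it
    conn.execute(insert_sql, (article['content'], article['url']))
conn.commit()

stored = conn.execute("SELECT url, content FROM news ORDER BY url").fetchall()
print(stored)
```

Only two rows end up in the table: the repeated URL keeps its first content, and the missing content comes back as `None`.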

In [51]:
results = engine.execute(f"SELECT * FROM {db_name}.{table_name}")
rows = results.fetchall()
results.close()

In [47]:
for row in rows:
    print("Content:", row["content"])
    print("URL:", row["url"])
    print("=============================================")

Content: A member of the Royal Guard surrounding the coffin of Queen Elizabeth II has collapsed in front of hundreds of mourners.
It comes as members of the public have been filing into Westminster Hall to p… [+2415 chars]
URL: https://7news.com.au/entertainment/royal-family/member-of-the-royal-guard-surrounding-the-coffin-of-queen-elizabeth-collapses--c-8244739
Content: The Queens funeral is scheduled for 4 a.m., mountain daylight time, Monday 
Photos and a book of condolence for Queen Elizabeth II have been set up at Edmonton City Hall, Friday, Sept. 9, 2022. Phot… [+5325 chars]
URL: https://edmontonjournal.com/news/local-news/alberta-to-mark-day-of-mourning-but-no-holiday-for-queens-funeral
Content: It was 3.30pm when Leigh, Thony and James arrived at the back of the queue at Millennium Bridge to see the Queen lie in state.
None of them identified as royalists but had different reasons to come.… [+4042 chars]
URL: https://news.sky.com/story/sustained-by-pimms-lager-and-katsu-curr

In [85]:
sentiment_scores = []

for row in rows:
    article_url = row["url"]
    data = analyzeText(url=article_url)
    entry = {'url': article_url}
    try:
        entry['score'] = data['sentiment']['document']['score']
    except KeyError:
        # The response has no sentiment result; fall back to a score of 0
        entry['score'] = 0

    print(entry)
    sentiment_scores.append(entry)

{'url': 'https://7news.com.au/entertainment/royal-family/member-of-the-royal-guard-surrounding-the-coffin-of-queen-elizabeth-collapses--c-8244739', 'score': -0.608437}
{'url': 'https://edmontonjournal.com/news/local-news/alberta-to-mark-day-of-mourning-but-no-holiday-for-queens-funeral', 'score': 0.465382}
{'url': 'https://news.sky.com/story/sustained-by-pimms-lager-and-katsu-curry-but-only-respect-for-the-queen-in-the-queue-12697688', 'score': -0.32244}
{'url': 'https://www.abc.net.au/news/2022-09-15/live-updates-queen-elizabeth-ii-coffin-procession-westminster/101441332', 'score': -0.384216}
{'url': 'https://www.asiaone.com/entertainment/ex-hong-kong-actress-theresa-lee-ageing-gracefully-shows-wrinkles-and-pigmentation', 'score': 0.314691}
{'url': 'https://www.channelnewsasia.com/world/king-charles-william-and-harry-reunited-grief-escort-queens-coffin-2937566', 'score': 0.277354}
{'url': 'https://www.cnn.com/2022/09/14/uk/queen-elizabeth-silent-procession-intl-gbr/index.html', 'score
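
Falling back to a score of 0 makes a failed analysis indistinguishable from a genuinely neutral article. One possible alternative, sketched below, is to return `None` for missing scores and filter them out later. `extract_score` is a hypothetical helper, and the shape of the `failed` dictionary is an assumption about what an error response might look like, not something taken from the output above.

```python
def extract_score(data):
    """Return the document sentiment score, or None if it is missing."""
    try:
        return data['sentiment']['document']['score']
    except (KeyError, TypeError):
        return None


# A successful response (trimmed) and an assumed error payload
ok = {'sentiment': {'document': {'score': 0.465382}}}
failed = {'error': 'illustrative error message', 'code': 400}

scores = [extract_score(ok), extract_score(failed)]
valid = [s for s in scores if s is not None]
print(scores, valid)
```

Storing `None` also maps naturally to `NULL` in the database, which aggregate functions such as `AVG` skip.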

In [86]:
table_name = f"{suffix}_sentiment_score"
# Create a table
create_table_query = f"""CREATE TABLE IF NOT EXISTS {db_name}.{table_name} 
                                (url varchar(1000),
                                sentiment_score varchar(1000), 
                                PRIMARY KEY(url)
                                )"""
engine.execute(create_table_query)

<sqlalchemy.engine.cursor.LegacyCursorResult at 0x7f748d9e5c90>
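
The table above stores `sentiment_score` as text. If the scores are going to be sorted or averaged in SQL, a numeric column is usually a better fit. The sketch below illustrates the difference with the built-in `sqlite3` module as a stand-in for MySQL (an assumption made so the example runs offline). The three scores are taken from the output shown earlier, while the URLs are placeholders.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute(
    "CREATE TABLE sentiment (url TEXT PRIMARY KEY, sentiment_score REAL)"
)

# Scores from the earlier output; the URLs are placeholders
rows_to_store = [
    ('https://example.com/1', -0.608437),
    ('https://example.com/2', 0.465382),
    ('https://example.com/3', -0.32244),
]
conn.executemany("INSERT INTO sentiment VALUES (?, ?)", rows_to_store)

# With a numeric column, ordering and averaging work on the values themselves
lowest = conn.execute(
    "SELECT url, sentiment_score FROM sentiment "
    "ORDER BY sentiment_score LIMIT 1"
).fetchone()
average = conn.execute(
    "SELECT AVG(sentiment_score) FROM sentiment"
).fetchone()[0]

# Sorting the same scores as strings yields a different, misleading order
as_text = sorted(str(score) for _, score in rows_to_store)
print(lowest, average, as_text)
```

Sorted as text, `-0.32244` comes first because strings are compared character by character, while the numeric ordering correctly puts `-0.608437` first.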

In [87]:
query_template = f"""
                    INSERT IGNORE INTO 
                    {db_name}.{table_name}(url, sentiment_score) 
                    VALUES (%s, %s)
                  """
for score in sentiment_scores:
    url = score['url']
    sentiment_score = score['score']

    print("Inserting URL", url, "with", sentiment_score, "as score")
    query_parameters = (url, sentiment_score)
    engine.execute(query_template, query_parameters)

Inserting URL https://7news.com.au/entertainment/royal-family/member-of-the-royal-guard-surrounding-the-coffin-of-queen-elizabeth-collapses--c-8244739 with -0.608437 as score
Inserting URL https://edmontonjournal.com/news/local-news/alberta-to-mark-day-of-mourning-but-no-holiday-for-queens-funeral with 0.465382 as score
Inserting URL https://news.sky.com/story/sustained-by-pimms-lager-and-katsu-curry-but-only-respect-for-the-queen-in-the-queue-12697688 with -0.32244 as score
Inserting URL https://www.abc.net.au/news/2022-09-15/live-updates-queen-elizabeth-ii-coffin-procession-westminster/101441332 with -0.384216 as score
Inserting URL https://www.asiaone.com/entertainment/ex-hong-kong-actress-theresa-lee-ageing-gracefully-shows-wrinkles-and-pigmentation with 0.314691 as score
Inserting URL https://www.channelnewsasia.com/world/king-charles-william-and-harry-reunited-grief-escort-queens-coffin-2937566 with 0.277354 as score
Inserting URL https://www.cnn.com/2022/09/14/uk/queen-elizabeth

In [88]:
results = engine.execute(f"SELECT * FROM {db_name}.{table_name}")
rows = results.fetchall()
results.close()

In [89]:
for row in rows:
    print("URL:", row["url"])
    print("Sentiment Score:", row["sentiment_score"])
    print("=============================================")

URL: https://7news.com.au/entertainment/royal-family/member-of-the-royal-guard-surrounding-the-coffin-of-queen-elizabeth-collapses--c-8244739
Sentiment Score: -0.608437
URL: https://edmontonjournal.com/news/local-news/alberta-to-mark-day-of-mourning-but-no-holiday-for-queens-funeral
Sentiment Score: 0.465382
URL: https://news.sky.com/story/sustained-by-pimms-lager-and-katsu-curry-but-only-respect-for-the-queen-in-the-queue-12697688
Sentiment Score: -0.32244
URL: https://www.abc.net.au/news/2022-09-15/live-updates-queen-elizabeth-ii-coffin-procession-westminster/101441332
Sentiment Score: -0.384216
URL: https://www.asiaone.com/entertainment/ex-hong-kong-actress-theresa-lee-ageing-gracefully-shows-wrinkles-and-pigmentation
Sentiment Score: 0.314691
URL: https://www.channelnewsasia.com/world/king-charles-william-and-harry-reunited-grief-escort-queens-coffin-2937566
Sentiment Score: 0.277354
URL: https://www.cnn.com/2022/09/14/uk/queen-elizabeth-silent-procession-intl-gbr/index.html
Sentim