<a href="https://www.kaggle.com/code/thatbrock/hcde-530-monkeylearn-sentiment-demo?scriptVersionId=103086212" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

## Overview
This notebook demonstrates using [MonkeyLearn](https://monkeylearn.com/) for sentiment analysis of text via the MonkeyLearn API. Sentiment analysis is just one of many models that can be accessed via MonkeyLearn, and using the others generally follows the same pattern outlined here. The notebook works on Kaggle or on any other Jupyter Notebook server, but in the latter case you will need to store your API key safely using a different method.

## Getting Ready
Create a user account at MonkeyLearn and request an API key. In the notebook's Add-ons menu (on Kaggle), create a new Secret labeled `monkeyKey`, which is the name the code below looks up, and paste in the MonkeyLearn API key.

In [2]:
import pandas as pd
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
secret_value_0 = user_secrets.get_secret("monkeyKey")
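
Outside Kaggle, `kaggle_secrets` is not available, so the key has to come from somewhere else. One common option is an environment variable. The sketch below assumes a variable named `MONKEYLEARN_API_KEY`; that name and the `load_api_key` helper are illustrative choices, not part of MonkeyLearn or Kaggle.

```python
import os

# Hypothetical variable name; use whatever name you exported in your shell.
ENV_VAR_NAME = "MONKEYLEARN_API_KEY"


def load_api_key(env=None):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR_NAME, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {ENV_VAR_NAME} environment variable before running this notebook."
        )
    return key


# Demonstration with a stand-in mapping so the sketch runs without a real key.
demo_key = load_api_key({ENV_VAR_NAME: "not-a-real-key"})
print(len(demo_key))
```

On your own server, `secret_value_0 = load_api_key()` could stand in for the Kaggle lines above, which keeps the key out of the notebook file itself.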

Next you have to install the MonkeyLearn package by opening the Console and typing `pip install monkeylearn`. You need to do this each time you start the notebook's kernel, because external packages do not persist across kernel sessions.

After that, you can use the demonstration template provided by MonkeyLearn to make a query:

In [5]:
from monkeylearn import MonkeyLearn

ml = MonkeyLearn(secret_value_0)
# The MonkeyLearn example uses this code: data = ["This is a great tool!"]
# We are taking input from an input box
phrase = input('Type a phrase to analyze:')
data = [phrase]
model_id = 'cl_pi3C7JiL'
result = ml.classifiers.classify(model_id, data)
print(result.body)

Type a phrase to analyze: What an ordinary day.


[{'text': 'What an ordinary day.', 'external_id': None, 'error': False, 'classifications': [{'tag_name': 'Positive', 'tag_id': 122921383, 'confidence': 0.863}]}]


We can make this a little easier to read by using the `dumps()` function from Python's built-in `json` module.

In [6]:
import json
print(json.dumps(result.body, indent=1))

[
 {
  "text": "What an ordinary day.",
  "external_id": null,
  "error": false,
  "classifications": [
   {
    "tag_name": "Positive",
    "tag_id": 122921383,
    "confidence": 0.863
   }
  ]
 }
]


Note that the result returned from the MonkeyLearn API is a list with one entry per text submitted. Since we only sent a single text, there is only one item in the list.

In [17]:
type(result.body)

list
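
Because the body is an ordinary list of dictionaries, individual values can be pulled out with plain indexing. The sketch below uses a copy of the response printed earlier instead of calling the API, so it runs without a key. The variable names are illustrative.

```python
# Sample response copied from the output above, so no API call is needed.
body = [
    {
        "text": "What an ordinary day.",
        "external_id": None,
        "error": False,
        "classifications": [
            {"tag_name": "Positive", "tag_id": 122921383, "confidence": 0.863}
        ],
    }
]

first = body[0]                      # the only item, since we sent one text
top = first["classifications"][0]    # the first classification for that text
print(f'{first["text"]!r} -> {top["tag_name"]} ({top["confidence"]:.3f})')
```

With a live result, `result.body` would take the place of `body`.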

You can take the results of the list and import them into a pandas dataframe for further analysis.

In [8]:
import pandas as pd
df = pd.DataFrame(result.body)
df

Unnamed: 0,text,external_id,error,classifications
0,What an ordinary day.,,False,"[{'tag_name': 'Positive', 'tag_id': 122921383,..."


The result looks good, but the classifications are buried inside a list of dictionaries in a single column. To give each classification field its own column (in a new table), we can use `pd.json_normalize()`: `record_path` names the nested list to expand, and `meta` takes a list of the top-level fields to carry along with each row. (Here 'external_id' is `None` and 'error' is `False`, so we don't really need those columns.)

In [13]:
classifications = pd.json_normalize(result.body, record_path=['classifications'], meta=['text', 'external_id', 'error'])
classifications

Unnamed: 0,tag_name,tag_id,confidence,text,external_id,error
0,Positive,122921383,0.863,What an ordinary day.,,False
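
To see what `record_path` and `meta` each contribute, it can help to try `json_normalize()` on a record with more than one classification. The data below is invented for illustration (this notebook's results only ever show one classification per text): each entry in the nested list becomes its own row, and the `meta` field is repeated on every row.

```python
import pandas as pd

# Hypothetical response: the first text carries two classifications.
sample = [
    {
        "text": "Great app, terrible price.",
        "classifications": [
            {"tag_name": "Positive", "confidence": 0.61},
            {"tag_name": "Negative", "confidence": 0.39},
        ],
    },
    {
        "text": "It works.",
        "classifications": [{"tag_name": "Neutral", "confidence": 0.80}],
    },
]

# One row per classification; "text" is copied onto each row.
flat = pd.json_normalize(sample, record_path=["classifications"], meta=["text"])
print(flat)
```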


# Doing this from a file
The MonkeyLearn API expects to receive requests as a list. In the example above we sent it a list of one item taken from the text input field. To run the query on all of the comments in a file, we just need to read the file, build a list containing its comments, and send that list to the MonkeyLearn API.

In the `user_comments.csv` file, the first item on each line is the respondent number, followed by their comment. We need to retrieve only the comment and put it into a list. We can iterate through the lines in the file and split each line on the comma that separates the respondent number from the comment. The comment can then be added to a new list. The `strip()` method removes the invisible newline character from the text.

In [15]:
commentdata = []                      # create an empty list to store just the comments
fh = open('../input/user-comments/user_comments.csv', 'r') # open the file
for line in fh:                       # process each line in the file
    words = line.strip().split(",")   # split on the comma and strip out the newline character
    commentdata.append(words[1])      # append the second item in words to the new list
fh.close()                            # be nice and close the file properly
print(commentdata)

['It was great while it it was free. Now I am not so sure.', 'I hated the interface', 'I would use it again in a heartbeat', "It's not for me.", 'I would die for this app. Love it!', 'It has some real problems', 'I like the link sharing feature', 'Take it or leave it', 'My brother loves it', 'I really like the feedback option!']
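
Splitting each line on the comma works for this file, but it would cut off any comment that itself contains a comma. A sketch of an alternative using Python's built-in `csv` module is shown below. It assumes such comments are wrapped in double quotes, as spreadsheet exports usually do. The sample text stands in for the real file, and its second comment is invented.

```python
import csv
import io

# Stand-in for the file contents; the second comment is invented to show a
# quoted field that contains a comma.
sample_file = io.StringIO(
    '1,I hated the interface\n'
    '2,"Fast, simple, and free"\n'
)

comments = []
for row in csv.reader(sample_file):
    if len(row) >= 2:            # skip blank or malformed lines
        comments.append(row[1])  # second column holds the comment

print(comments)
```

For the real file, `open('../input/user-comments/user_comments.csv', newline='')` inside a `with` block could take the place of `sample_file`.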


Now that we have a list of the comments, we can pass all of these items to the API:

***Be careful what you do here. Each comment counts as a separate request to the API. If your list is longer than the number of API calls allowed on the free tier (currently 1000/month), the API will stop returning results until you pay for the service.***
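
One way to avoid an accidental overrun is to check the list length before sending it. The helper below is a hypothetical sketch: `check_request_budget` is not part of the MonkeyLearn package, and the remaining-request count is a number you track yourself, not one read from the API.

```python
def check_request_budget(texts, remaining_requests):
    """Raise ValueError if sending `texts` would exceed the remaining quota."""
    needed = len(texts)  # one request per text, as described above
    if needed > remaining_requests:
        raise ValueError(
            f"{needed} texts would exceed the {remaining_requests} requests left."
        )
    return remaining_requests - needed


# Hypothetical numbers: 1000 requests available, 10 comments to classify.
left = check_request_budget(["comment"] * 10, remaining_requests=1000)
print(left)
```

Calling it on `commentdata` just before `ml.classifiers.classify(...)` would stop the cell before any requests are spent.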

In [16]:
model_id = 'cl_pi3C7JiL'
result = ml.classifiers.classify(model_id, commentdata)
print(result.body)

[{'text': 'It was great while it it was free. Now I am not so sure.', 'external_id': None, 'error': False, 'classifications': [{'tag_name': 'Negative', 'tag_id': 122921385, 'confidence': 0.953}]}, {'text': 'I hated the interface', 'external_id': None, 'error': False, 'classifications': [{'tag_name': 'Negative', 'tag_id': 122921385, 'confidence': 0.892}]}, {'text': 'I would use it again in a heartbeat', 'external_id': None, 'error': False, 'classifications': [{'tag_name': 'Positive', 'tag_id': 122921383, 'confidence': 0.743}]}, {'text': "It's not for me.", 'external_id': None, 'error': False, 'classifications': [{'tag_name': 'Negative', 'tag_id': 122921385, 'confidence': 0.626}]}, {'text': 'I would die for this app. Love it!', 'external_id': None, 'error': False, 'classifications': [{'tag_name': 'Positive', 'tag_id': 122921383, 'confidence': 0.998}]}, {'text': 'It has some real problems', 'external_id': None, 'error': False, 'classifications': [{'tag_name': 'Negative', 'tag_id': 1229213

Again, we can examine the data in a more readable form with `json.dumps()`.

In [17]:
print(json.dumps(result.body, indent=1))

[
 {
  "text": "It was great while it it was free. Now I am not so sure.",
  "external_id": null,
  "error": false,
  "classifications": [
   {
    "tag_name": "Negative",
    "tag_id": 122921385,
    "confidence": 0.953
   }
  ]
 },
 {
  "text": "I hated the interface",
  "external_id": null,
  "error": false,
  "classifications": [
   {
    "tag_name": "Negative",
    "tag_id": 122921385,
    "confidence": 0.892
   }
  ]
 },
 {
  "text": "I would use it again in a heartbeat",
  "external_id": null,
  "error": false,
  "classifications": [
   {
    "tag_name": "Positive",
    "tag_id": 122921383,
    "confidence": 0.743
   }
  ]
 },
 {
  "text": "It's not for me.",
  "external_id": null,
  "error": false,
  "classifications": [
   {
    "tag_name": "Negative",
    "tag_id": 122921385,
    "confidence": 0.626
   }
  ]
 },
 {
  "text": "I would die for this app. Love it!",
  "external_id": null,
  "error": false,
  "classifications": [
   {
    "tag_name": "Positive",
    "tag_id": 1229

Now we can load the data into a new dataframe, where it is easier to manipulate. We use `json_normalize()` again to extract just the fields we want from the source data. Here, we are excluding `external_id` and `error` since they aren't useful.

In [39]:
# Flatten the nested classifications; keep only the original text as metadata
df = pd.json_normalize(result.body, record_path=['classifications'], meta=['text'])
df

Unnamed: 0,tag_name,tag_id,confidence,text
0,Negative,122921385,0.953,It was great while it it was free. Now I am no...
1,Negative,122921385,0.892,I hated the interface
2,Positive,122921383,0.743,I would use it again in a heartbeat
3,Negative,122921385,0.626,It's not for me.
4,Positive,122921383,0.998,I would die for this app. Love it!
5,Negative,122921385,0.907,It has some real problems
6,Positive,122921383,0.901,I like the link sharing feature
7,Neutral,122921384,0.622,Take it or leave it
8,Positive,122921383,0.978,My brother loves it
9,Positive,122921383,0.961,I really like the feedback option!


# Visualizing the data
Large tables are usually easier to read once the data is aggregated and visualized. Here we count the comments in each sentiment category and draw a simple bar chart with Plotly.

In [40]:
import plotly.express as px
# Here we use a column with categorical data
fig = px.histogram(df, x="tag_name", labels={"tag_name": "Sentiment"}, title="Sentiment Analysis of User Comments")
fig.show()
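
To double-check the bar heights without rendering the chart, the same tally can be done with `collections.Counter`. The sketch below copies the `tag_name` values from the dataframe output above, so it runs on its own. With the live dataframe, `df["tag_name"]` could be passed to `Counter` instead.

```python
from collections import Counter

# Tag names copied from the table above, in row order.
tags = ["Negative", "Negative", "Positive", "Negative", "Positive",
        "Negative", "Positive", "Neutral", "Positive", "Positive"]

counts = Counter(tags)
for tag, n in counts.most_common():
    print(f"{tag}: {n}")
```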