# Detecting Emotion within Discussion Comments 
 
The ability to automatically detect the sentiment expressed in textual data will give Cath's Cabs greater insight into the nuance of the comments associated with research notes. This functionality will support further analysis of the relationship between a comment and the research note it refers to. For example, a comment with a positive sentiment may indicate that a user gained valuable information from the note, and could therefore contribute to a system that ranks research notes by usefulness. Such a feature could add significant value to Cath's Cabs' software.

Sentiment analysis is an area of natural language processing (NLP) that aims to automatically identify and extract opinions in a given text by gauging the attitude, sentiment, evaluation and emotion of the speaker or writer through computational treatment of the text.

## Sentiment Analysis

For this proof of concept, we use an *Amazon* fine food reviews dataset.

In [6]:
import pandas as pd 
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# Step 1: Read in data and print

df = pd.read_csv('fine-food-reviews-ee.csv')
display(df.head())

# Step 2: Keep the first 10 reviews for demonstration purposes
df = df[:10]

# Step 3: Convert 'Text' column to list
review_list = df['Text'].tolist()
# print(review_list)

| Unnamed: 0 | Id | ProductId | UserId | ProfileName | HelpfulnessNumerator | HelpfulnessDenominator | Score | Time | Summary | Text |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | B001E4KFG0 | A3SGXH7AUHU8GW | delmartian | 1.0 | 1.0 | 5.0 | 1303862000.0 | Good Quality Dog Food | I have bought several of the Vitality canned d... |
| 1 | 2.0 | B00813GRG4 | A1D87F6ZCVE5NK | dll pa | 0.0 | 0.0 | 1.0 | 1346976000.0 | Not as Advertised | Product arrived labeled as Jumbo Salted Peanut... |
| 2 | 3.0 | B000LQOCH0 | ABXLMWJIXXAIN | Natalia Corres "Natalia Corres" | 1.0 | 1.0 | 4.0 | 1219018000.0 | "Delight" says it all | This is a confection that has been around a fe... |
| 3 | 4.0 | B000UA0QIQ | A395BORC6FGVXV | Karl | 3.0 | 3.0 | 2.0 | 1307923000.0 | Cough Medicine | If you are looking for the secret ingredient i... |
| 4 | 5.0 | B006K2ZZ7K | A1UQRSCLF8GW1T | Michael D. Bigham "M. Wassir" | 0.0 | 0.0 | 5.0 | 1350778000.0 | Great taffy | Great taffy at a great price. There was a wid... |



The demonstration reports the VADER and Tone Analyzer classifications for the five example comments. Both VADER and IBM's Tone Analyzer produce a score that corresponds to the associated sentiment or emotion. VADER's label is derived from the compound score using two thresholds: a compound score of 0.05 or above is classed as positive, -0.05 or below as negative, and anything in between as neutral. The Tone Analyzer instead scores each tone on a scale of zero to one, where a score of 0.75 or higher indicates a high likelihood that the tone is present in the text.

In [4]:
# Step 1: Initialise 'SentimentIntensityAnalyzer' object
analyser = SentimentIntensityAnalyzer()

# Step 2: Classify reviews by sentiment on *compound score*
def sentiment_analyzer_scores(review_comments):
    """Print each review with its VADER scores and overall sentiment label."""
    for review in review_comments:
        # polarity_scores returns a dict with 'neg', 'neu', 'pos' and 'compound'
        score = analyser.polarity_scores(review)
        if score['compound'] >= 0.05:
            sentiment = "Positive"
        elif score['compound'] <= -0.05:
            sentiment = "Negative"
        else:
            sentiment = "Neutral"
        print(review, "\n")
        print("Compound Score: ", score)
        print("Overall Sentiment: " + sentiment, "\n")
        print("------------------------------------------------------------------------------------------------")

# The function only prints its results, so there is no return value to store
sentiment_analyzer_scores(review_list)

I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than  most. 

Compound Score:  {'neg': 0.0, 'neu': 0.695, 'pos': 0.305, 'compound': 0.9441}
Overall Sentiment: Positive 

------------------------------------------------------------------------------------------------
Product arrived labeled as Jumbo Salted Peanuts...the peanuts were actually small sized unsalted. Not sure if this was an error or if the vendor intended to represent the product as "Jumbo". 

Compound Score:  {'neg': 0.138, 'neu': 0.862, 'pos': 0.0, 'compound': -0.5664}
Overall Sentiment: Negative 

------------------------------------------------------------------------------------------------
This is a confection that has been around a few centuries.  It is a light, pillowy citrus gelatin with nuts - in this case Filbert
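
The cell above only prints its labels. If the labels are needed for later analysis, the same thresholds could be wrapped in a small helper that returns the label instead. The following is a minimal sketch: the function name `label_from_compound` and the `threshold` parameter are illustrative and not part of the original notebook or of the VADER library.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score to a sentiment label.

    Uses the same cut-offs as the cell above: >= 0.05 is positive,
    <= -0.05 is negative, anything in between is neutral.
    """
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"


# Compound scores copied from the two complete outputs shown above
example_scores = {"dog food review": 0.9441, "peanuts review": -0.5664}
labels = {name: label_from_compound(score) for name, score in example_scores.items()}
print(labels)
```

Returning the label rather than printing it would make it possible to store the result alongside each comment, for instance as an extra column in the DataFrame.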

The given examples express their sentiments and emotions clearly. Note, however, that if a comment expresses two sentiments, for example "I found these notes really helpful, but they could have included a little more information", then more sophisticated techniques, such as IBM's Tone Analyzer, may be required. Overall, the results from VADER and the Tone Analyzer agree with how we, as human readers, would intuitively interpret these comments.
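
One simple way to surface a mixed comment like the example above, before reaching for a heavier tool, is to score each clause separately. The sketch below is a rough heuristic and not the Tone Analyzer's method: it splits on a few contrastive conjunctions and labels each clause with whatever scoring function is passed in. The names `clause_sentiments`, `is_mixed` and `score_fn` are hypothetical, and the list of conjunctions is an assumption.

```python
import re

# Assumed list of contrastive conjunctions; extend as needed
CONTRAST_PATTERN = re.compile(r"\b(?:but|however|although)\b", flags=re.IGNORECASE)


def clause_sentiments(comment, score_fn, threshold=0.05):
    """Split a comment into clauses and label each one.

    score_fn is any callable that takes a string and returns a
    compound-style score in [-1, 1].
    """
    results = []
    for clause in CONTRAST_PATTERN.split(comment):
        clause = clause.strip(" ,.;")
        if not clause:
            continue
        compound = score_fn(clause)
        if compound >= threshold:
            label = "Positive"
        elif compound <= -threshold:
            label = "Negative"
        else:
            label = "Neutral"
        results.append((clause, compound, label))
    return results


def is_mixed(results):
    """Return True if the clauses contain both positive and negative labels."""
    labels = {label for _, _, label in results}
    return "Positive" in labels and "Negative" in labels
```

With the `analyser` object from the notebook, `score_fn` could be `lambda text: analyser.polarity_scores(text)['compound']`. Whether VADER scores each clause of the example comment as a human reader would expect has not been verified here.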

## Interactive Visualisation Example

The visualisation below demonstrates `Plotly` with an interactive scatter plot of its built-in iris sample dataset.

In [5]:
import plotly.express as px

# Load the iris sample dataset bundled with Plotly Express
iris_df = px.data.iris()

# Scatter plot of sepal dimensions, coloured by species and sized by sepal length
fig = px.scatter(iris_df, x="sepal_width", y="sepal_length", color="species", size="sepal_length")
fig