# Sentiment Analysis - E2E example

![nlp](https://wrm5sysfkg-flywheel.netdna-ssl.com/wp-content/uploads/2019/01/NLP-Technology-in-Healthcare.jpg)

A full end-to-end (e2e) example that scores movie reviews with VADER and compares the predicted sentiment against the labels supplied with the data.

In [21]:
import numpy as np
import pandas as pd

In [22]:
df = pd.read_csv('./resources/moviereviews.tsv', sep='\t')
df.head()

|   | label | review |
|---|-------|--------|
| 0 | neg | how do films like mouse hunt get into theatres... |
| 1 | neg | some talented actresses are blessed with a dem... |
| 2 | pos | this has been an extraordinary year for austra... |
| 3 | pos | according to hollywood movies made in last few... |
| 4 | neg | my first press screening of 1998 and already i... |


## Clean the Data

In [23]:
df.dropna(inplace=True)
print("[-][001][Data cleaning]: removed missing values with dropna")

In [24]:
# collect the index of every row whose review consists
# only of whitespace (missing reviews were dropped above)

blanks = []

for i, lb, rv in df.itertuples():
    # index, label, review
    if isinstance(rv, str) and rv.isspace():
        blanks.append(i)

# f-string so that the count is interpolated into the message
print(f"[-][002][Data cleaning]: Building list of blanks. {len(blanks)} found.")

In [27]:
# drop the blanks and perform inplace
if blanks:
    df.drop(blanks, inplace=True)
    print("[-][003][Data cleaning]: rows with blanks removed")
    blanks = []
else:
    print("[r][003][Data cleaning]: blank removal already completed")

[r][003][Data cleaning]: blank removal already completed
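
The loop above can also be written as a vectorized pandas filter. The following is a minimal sketch on a hypothetical toy frame (`toy` and its rows are made up for illustration and are not taken from `moviereviews.tsv`); it assumes the same `label` and `review` columns as the notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for the movie reviews
toy = pd.DataFrame({
    "label": ["pos", "neg", "pos", "neg"],
    "review": ["a fine film", "   ", None, "dull and overlong"],
})

# Drop missing reviews first, as the notebook does with dropna
toy = toy.dropna(subset=["review"])

# True where the review is empty or whitespace only
is_blank = toy["review"].str.strip() == ""

# Keep only the rows that carry actual text
cleaned = toy[~is_blank]
print(len(cleaned))  # 2
```

Unlike `str.isspace()`, comparing the stripped text with `""` also catches zero-length strings, which may or may not matter depending on how the file was read.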


## Check the data

In [28]:
df['label'].value_counts()

pos    969
neg    969
Name: label, dtype: int64

## Sentiment Analysis

In [29]:
# requires the VADER lexicon; if it is missing, run nltk.download('vader_lexicon') once
from nltk.sentiment.vader import SentimentIntensityAnalyzer

In [30]:
sid = SentimentIntensityAnalyzer()

In [31]:
df['scores'] = df['review'].apply(lambda review: sid.polarity_scores(review))

In [32]:
df['compound'] = df['scores'].apply(lambda d: d['compound'])

In [33]:
df.head()

|   | label | review | scores | compound |
|---|-------|--------|--------|----------|
| 0 | neg | how do films like mouse hunt get into theatres... | {'neg': 0.121, 'neu': 0.778, 'pos': 0.101, 'co... | -0.9125 |
| 1 | neg | some talented actresses are blessed with a dem... | {'neg': 0.12, 'neu': 0.775, 'pos': 0.105, 'com... | -0.8618 |
| 2 | pos | this has been an extraordinary year for austra... | {'neg': 0.068, 'neu': 0.781, 'pos': 0.15, 'com... | 0.9951 |
| 3 | pos | according to hollywood movies made in last few... | {'neg': 0.071, 'neu': 0.782, 'pos': 0.147, 'co... | 0.9972 |
| 4 | neg | my first press screening of 1998 and already i... | {'neg': 0.091, 'neu': 0.817, 'pos': 0.093, 'co... | -0.2484 |
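
The `scores` column holds one dictionary per review, which is awkward to filter or plot. One possible alternative, sketched here on two rows copied from the output above, is to expand the dictionaries into separate columns. The names `toy` and `expanded` are illustrative only and do not appear in the notebook.

```python
import pandas as pd

# Two rows of VADER output taken from the table above (index 0 and 2)
toy = pd.DataFrame(
    {
        "label": ["neg", "pos"],
        "scores": [
            {"neg": 0.121, "neu": 0.778, "pos": 0.101, "compound": -0.9125},
            {"neg": 0.068, "neu": 0.781, "pos": 0.15, "compound": 0.9951},
        ],
    },
    index=[0, 2],
)

# Build one column per dictionary key, keeping the original row index
expanded = pd.DataFrame(toy["scores"].tolist(), index=toy.index)

# Attach the new columns next to the label
result = toy[["label"]].join(expanded)
print(result)
```

This gives `neg`, `neu`, `pos` and `compound` as ordinary numeric columns, so the separate `apply` call that extracts `compound` would no longer be needed.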


In [34]:
df['comp_score'] = df['compound'].apply(lambda score: 'pos' if score >= 0 else 'neg')

In [35]:
df.head()

|   | label | review | scores | compound | comp_score |
|---|-------|--------|--------|----------|------------|
| 0 | neg | how do films like mouse hunt get into theatres... | {'neg': 0.121, 'neu': 0.778, 'pos': 0.101, 'co... | -0.9125 | neg |
| 1 | neg | some talented actresses are blessed with a dem... | {'neg': 0.12, 'neu': 0.775, 'pos': 0.105, 'com... | -0.8618 | neg |
| 2 | pos | this has been an extraordinary year for austra... | {'neg': 0.068, 'neu': 0.781, 'pos': 0.15, 'com... | 0.9951 | pos |
| 3 | pos | according to hollywood movies made in last few... | {'neg': 0.071, 'neu': 0.782, 'pos': 0.147, 'co... | 0.9972 | pos |
| 4 | neg | my first press screening of 1998 and already i... | {'neg': 0.091, 'neu': 0.817, 'pos': 0.093, 'co... | -0.2484 | neg |
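
The rule in cell 34 labels every compound score at or above 0 as `pos`, so a review that VADER sees as essentially neutral still counts as positive. A common convention for VADER is to leave a small neutral band around zero instead. The sketch below shows both rules side by side; the function name `label_from_compound` and the 0.05 band are assumptions for illustration and are not used anywhere in this notebook.

```python
def label_from_compound(score: float, band: float = 0.0) -> str:
    """Map a compound score in [-1, 1] to 'pos', 'neg' or 'neu'.

    band=0.0 reproduces the notebook's rule (score >= 0 is 'pos').
    A positive band labels scores strictly inside (-band, band) as 'neu'.
    """
    if band > 0 and -band < score < band:
        return "neu"
    return "pos" if score >= 0 else "neg"


# Compound scores from the table above, plus one hypothetical near-zero score
for score in (-0.9125, 0.9951, -0.2484, 0.01):
    print(score, label_from_compound(score), label_from_compound(score, band=0.05))
```

Because the dataset only has `pos` and `neg` labels, a neutral band would leave some reviews without a comparable prediction, which is presumably why the notebook uses a single cut-off.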


## Reporting

In [36]:
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix

In [37]:
accuracy_score(df['label'], df['comp_score'])

0.6357069143446853

In [39]:
print(classification_report(df['label'], df['comp_score']))

              precision    recall  f1-score   support

         neg       0.72      0.44      0.55       969
         pos       0.60      0.83      0.70       969

    accuracy                           0.64      1938
   macro avg       0.66      0.64      0.62      1938
weighted avg       0.66      0.64      0.62      1938



In [40]:
print(confusion_matrix(df['label'], df['comp_score']))

[[427 542]
 [164 805]]
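
To see how the classification report follows from the confusion matrix, the figures can be recomputed by hand. In scikit-learn's layout the rows are the true labels and the columns the predicted labels, both in sorted order (`neg`, then `pos`). The variable names below are illustrative only.

```python
# Counts copied from the confusion matrix above
neg_as_neg, neg_as_pos = 427, 542   # true 'neg' reviews
pos_as_neg, pos_as_pos = 164, 805   # true 'pos' reviews

total = neg_as_neg + neg_as_pos + pos_as_neg + pos_as_pos

# Accuracy: share of reviews on the diagonal
accuracy = (neg_as_neg + pos_as_pos) / total

# Recall: share of each true class that was found
recall_neg = neg_as_neg / (neg_as_neg + neg_as_pos)
recall_pos = pos_as_pos / (pos_as_neg + pos_as_pos)

# Precision: share of each predicted class that was correct
precision_neg = neg_as_neg / (neg_as_neg + pos_as_neg)
precision_pos = pos_as_pos / (neg_as_pos + pos_as_pos)

print(f"accuracy      {accuracy:.2f}")       # 0.64
print(f"neg precision {precision_neg:.2f}")  # 0.72
print(f"neg recall    {recall_neg:.2f}")     # 0.44
print(f"pos precision {precision_pos:.2f}")  # 0.60
print(f"pos recall    {recall_pos:.2f}")     # 0.83
```

The top row carries the weak spot: 542 of the 969 negative reviews were scored as positive, which is what pulls the `neg` recall down to 0.44.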


## Summary

The results above show what VADER delivers out of the box, bearing in mind that this involved very little code, no training and the default configuration. They also illustrate a classic problem of working with text: many of the samples contain sarcasm, and even the best machine learning algorithms have a hard time dealing with sarcasm.

The effect is most visible in the negative reviews, where recall is only 0.44 compared with 0.83 for the positive reviews. Sarcasm correlates highly with negativity, and negative reviews are a common playground for it.

Finally, the results are fair for an import, call and evaluate use-case, but an accuracy of about 64% on a balanced two-class dataset is only modestly better than the 50% chance baseline, so we cannot claim that the results are in any way good or enlightening.