# Intro to NLP for AI | 04 - Sentiment Analysis

## Rule-Based Sentiment Analysis

Sentiment analysis identifies the emotions or opinions expressed in text. By analyzing a text, we can classify it as positive, negative or neutral. This helps businesses and researchers track public mood, brand reputation or reactions to events in real time.

The rule-based approach analyzes text without training or using machine learning models. Instead, it relies on hand-written rules and a lexicon, a dictionary that maps words to sentiment scores, to label text as positive, negative or neutral. For this reason it is also called the lexicon-based approach. TextBlob, VADER and SentiWordNet are widely used rule-based (lexicon-based) tools.
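
To make the idea concrete, here is a minimal sketch of a lexicon-based scorer. The three-word `TOY_LEXICON` and its weights are made up for illustration and do not come from TextBlob, VADER or SentiWordNet; real lexicons hold thousands of entries and add rules for negation, intensifiers and punctuation.

```python
# Hypothetical toy lexicon: word -> sentiment weight (values invented for this sketch).
TOY_LEXICON = {"great": 1.0, "funny": 0.5, "terrible": -1.0}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Average the weights of the words found in the lexicon (0.0 if none match)."""
    words = text.lower().split()
    hits = [lexicon[w] for w in words if w in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def label(score):
    """Map a score to positive / negative / neutral by its sign."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label(lexicon_score("i had a great time at the movie it was really funny")))
print(label(lexicon_score("i went to see a movie")))
```

Because this sketch has no negation rule, it would score "wasn't great" as if it were "great", which is the kind of case the rules in real tools try to handle.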

In [3]:
sentence1 = "i had a great time at the movie it was really funny"
sentence2 = "i had a great time at the movie but the parking was terrible"
sentence3 = "i had a great time at the movie but the parking wasn't great"
sentence4 = "i went to see a movie"

### TextBlob

TextBlob is a Python library for NLP built on top of NLTK (Natural Language Toolkit). Given a sentence, its `sentiment` property returns two values: `polarity` and `subjectivity`.

The polarity score ranges from -1 to 1. A score near -1 means the wording is strongly negative, like "disgusting" or "awful". A score near 1 means the wording is strongly positive, like "excellent" or "best".

The subjectivity score, on the other hand, ranges from 0 to 1. A value close to 1 means the sentence expresses a lot of personal opinion; a value close to 0 means it is mostly factual.
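
One way to read these two numbers is to turn them into tags. The sketch below is only an illustration: the helper name `interpret` and both cutoffs (`0.05` for polarity, `0.5` for subjectivity) are assumptions chosen for this example, not TextBlob defaults.

```python
def interpret(polarity, subjectivity, polarity_cutoff=0.05, subjectivity_cutoff=0.5):
    """Turn TextBlob-style scores into readable tags.

    Both cutoffs are illustrative assumptions, not part of TextBlob.
    """
    if polarity > polarity_cutoff:
        tone = "positive"
    elif polarity < -polarity_cutoff:
        tone = "negative"
    else:
        tone = "neutral"
    stance = "subjective" if subjectivity >= subjectivity_cutoff else "objective"
    return tone, stance


print(interpret(0.525, 0.875))  # a strongly opinionated, positive sentence
print(interpret(0.0, 0.0))      # a plain factual statement
```

With TextBlob, the inputs would be `TextBlob(text).sentiment.polarity` and `TextBlob(text).sentiment.subjectivity`.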

In [4]:
from textblob import TextBlob

In [28]:
sentiment_score1 = TextBlob(sentence1)
print(sentence1, "->", sentiment_score1.sentiment)

i had a great time at the movie it was really funny -> Sentiment(polarity=0.525, subjectivity=0.875)


In [30]:
sentiment_score2 = TextBlob(sentence2)
print(sentence2, "->", sentiment_score2.sentiment)

i had a great time at the movie but the parking was terrible -> Sentiment(polarity=-0.09999999999999998, subjectivity=0.875)


In [31]:
sentiment_score3 = TextBlob(sentence3)
print(sentence3, "->", sentiment_score3.sentiment)

i had a great time at the movie but the parking wasn't great -> Sentiment(polarity=0.8, subjectivity=0.75)


In [33]:
sentiment_score4 = TextBlob(sentence4)
print(sentence4, "->", sentiment_score4.sentiment)

i went to see a movie -> Sentiment(polarity=0.0, subjectivity=0.0)


### VADER

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a sentiment analysis tool designed for social media text and informal language. It works well on short pieces of text such as tweets, product reviews or user comments, which often contain slang, emojis and abbreviations. It uses a pre-built lexicon of words associated with sentiment values and applies specific rules to calculate sentiment scores.

VADER looks up each word in its lexicon and assigns it a sentiment score based on its emotional value. These individual word scores are then combined to calculate an overall sentiment score for the entire text. The result is reported as the `compound` score, a normalized value between -1 and +1 representing the overall sentiment, alongside `neg`, `neu` and `pos`, the proportions of the text that fall into each category.
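
The normalization step can be sketched as follows. This assumes the formula used in the reference `vaderSentiment` implementation, `x / sqrt(x^2 + alpha)` with `alpha = 15`, where `x` is the sum of the adjusted word scores; the function name `normalize_compound` is ours, so treat this as an approximation of the idea rather than the library's code.

```python
import math


def normalize_compound(raw_sum, alpha=15):
    """Squash an unbounded sum of word scores into the range [-1, 1].

    alpha = 15 is assumed to match the reference implementation; it controls
    how quickly the score approaches the limits.
    """
    norm = raw_sum / math.sqrt(raw_sum * raw_sum + alpha)
    # Clamp for safety so the result never leaves [-1, 1].
    return max(-1.0, min(1.0, norm))


for raw in (-8.0, -1.0, 0.0, 1.0, 8.0):
    print(raw, "->", round(normalize_compound(raw), 4))
```

The function is odd-symmetric and monotonic, so a larger raw sum always yields a larger compound score, but the score saturates as it approaches -1 or +1.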

In [9]:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

In [10]:
vader_sentiment = SentimentIntensityAnalyzer()
print(sentence1, "->", vader_sentiment.polarity_scores(sentence1))

i had a great time at the movie it was really funny -> {'neg': 0.0, 'neu': 0.578, 'pos': 0.422, 'compound': 0.807}


In [11]:
print(sentence2, "->", vader_sentiment.polarity_scores(sentence2))

i had a great time at the movie but the parking was terrible -> {'neg': 0.234, 'neu': 0.621, 'pos': 0.144, 'compound': -0.3818}


In [12]:
print(sentence3, "->", vader_sentiment.polarity_scores(sentence3))

i had a great time at the movie but the parking wasn't great -> {'neg': 0.247, 'neu': 0.611, 'pos': 0.142, 'compound': -0.4387}


In [13]:
print(sentence4, "->", vader_sentiment.polarity_scores(sentence4))

i went to see a movie -> {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}


### Pre-trained Transformer Models

After installing the `transformers` library from Hugging Face, you can use any publicly available transformer model to perform sentiment analysis on text.

In [14]:
from transformers import pipeline

In [15]:
sentiment_pipeline = pipeline("sentiment-analysis")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu
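
The warning above recommends naming the model and revision explicitly. A minimal sketch of doing so, using the model name and revision reported in that log, might look like this; the helper name `build_pinned_pipeline` is hypothetical, and calling it downloads the model on first use.

```python
# Values copied from the log message printed by the default pipeline.
MODEL_NAME = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
MODEL_REVISION = "714eb0f"


def build_pinned_pipeline():
    """Create a sentiment pipeline pinned to a fixed model and revision."""
    # Imported lazily so that defining this helper does not load the library.
    from transformers import pipeline

    return pipeline("sentiment-analysis", model=MODEL_NAME, revision=MODEL_REVISION)
```

Pinning both values keeps results reproducible if the library's default model changes in a later release.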


In [16]:
print(sentence1)
print(sentiment_pipeline(sentence1))

i had a great time at the movie it was really funny
[{'label': 'POSITIVE', 'score': 0.9998176693916321}]


In [17]:
print(sentence2)
print(sentiment_pipeline(sentence2))

i had a great time at the movie but the parking was terrible
[{'label': 'NEGATIVE', 'score': 0.9977464079856873}]


In [18]:
print(sentence3)
print(sentiment_pipeline(sentence3))

i had a great time at the movie but the parking wasn't great
[{'label': 'NEGATIVE', 'score': 0.9984902143478394}]


In [19]:
print(sentence4)
print(sentiment_pipeline(sentence4))

i went to see a movie
[{'label': 'POSITIVE', 'score': 0.9802700281143188}]


In [20]:
sentiment_analysis_with_model = pipeline("sentiment-analysis", model="finiteautomata/bertweet-base-sentiment-analysis")

emoji is not installed, thus not converting emoticons or emojis into text. Install emoji: pip3 install emoji==0.6.0
Device set to use cpu


In [21]:
print(sentence1)
print(sentiment_analysis_with_model(sentence1))

i had a great time at the movie it was really funny
[{'label': 'POS', 'score': 0.9923344254493713}]


In [22]:
print(sentence2)
print(sentiment_analysis_with_model(sentence2))

i had a great time at the movie but the parking was terrible
[{'label': 'NEG', 'score': 0.5355545878410339}]


In [23]:
print(sentence3)
print(sentiment_analysis_with_model(sentence3))

i had a great time at the movie but the parking wasn't great
[{'label': 'POS', 'score': 0.6234411001205444}]


In [24]:
print(sentence4)
print(sentiment_analysis_with_model(sentence4))

i went to see a movie
[{'label': 'NEU', 'score': 0.9007400274276733}]
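
The two models use different label sets (`POSITIVE`/`NEGATIVE` versus `POS`/`NEG`/`NEU`), which makes their outputs awkward to compare directly. A small sketch of one way to bring them to a common vocabulary is shown below; `LABEL_MAP` and `normalize_result` are names introduced here for illustration, and the mapping covers only the labels seen in the outputs above.

```python
# Maps the labels seen in the outputs above to a shared vocabulary.
LABEL_MAP = {
    "POSITIVE": "positive",
    "POS": "positive",
    "NEGATIVE": "negative",
    "NEG": "negative",
    "NEU": "neutral",
}


def normalize_result(result):
    """Convert one pipeline result dict ({'label': ..., 'score': ...}) to shared labels."""
    label = LABEL_MAP.get(result["label"].upper(), "unknown")
    return {"label": label, "score": result["score"]}


# Results for "i went to see a movie" copied from the two models above.
print(normalize_result({"label": "POSITIVE", "score": 0.9802700281143188}))
print(normalize_result({"label": "NEU", "score": 0.9007400274276733}))
```

Note that the default model has no neutral class, so it has to assign a factual sentence to one of its two labels.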
