### Sentiment Analysis ###

Sentiment analysis, also known as opinion mining, is the automated task of determining the subjective tone or mood of a given natural-language text. It aims to gauge the subjective judgments and feelings expressed in the text.

In its simplest form, it is a multiclass text classification task in which the text is classified as having positive, neutral, or negative sentiment. The number of classes can vary with the nature of the task; for example, the classes can be emotions such as happy, angry, and sad.



#### Applications of sentiment analysis
Sentiment analysis has applications in a wide variety of domains, including analyzing user reviews and tweet sentiment. Let’s go through some of them here:

- Social media monitoring: analyzing trends and opinions
- Movie reviews: analyzing online movie reviews and feedback to judge the quality of a movie
- News sentiment analysis: analyzing trends and their sentiments
- Brand monitoring and market research: understanding what users are saying about a product
- Medical uses: monitoring a person for signs of depression


#### Rule-based sentiment analysis
Rule-based sentiment analysis is one of the most basic approaches to calculating text sentiment. It requires minimal preparation, and the idea is simple: it uses no machine learning, relying instead on fixed rules. For example, we can estimate the sentiment of a tweet by counting the number of times the user has used the word “sad”.
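
The word-counting idea above can be sketched in a few lines of plain Python. This is only a toy illustration: the two word lists are hypothetical and far smaller than any real lexicon, and the function name `rule_based_sentiment` is made up for this example.

```python
import re

# Hypothetical mini-lexicons, for illustration only (real lexicons are much larger).
POSITIVE_WORDS = {"good", "happy", "great", "love"}
NEGATIVE_WORDS = {"sad", "angry", "bad", "hate"}


def rule_based_sentiment(text):
    """Label text by comparing counts of positive and negative words."""
    # Lowercase the text and keep only alphabetic tokens (and apostrophes).
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


for tweet in ["I am sad, so sad and angry",
              "What a great, happy day",
              "The train leaves at noon"]:
    print("{:.<40} {}".format(tweet, rule_based_sentiment(tweet)))
```

A counter like this ignores negation, intensity, and context, which is where the libraries below add their extra rules.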

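Now, let’s look at some Python packages. TextBlob and VADER work using this rule-based method; Flair, included for comparison, uses a trained model instead.

Install the following:

- textblob
- vaderSentiment (VADER)
- flair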

### TextBlob
TextBlob is a simple Python library that offers API access to different NLP tasks such as sentiment analysis and spelling correction.

The TextBlob sentiment analyzer returns two properties for a given input sentence: polarity and subjectivity.

The sentiment property returns a namedtuple of the form Sentiment(polarity, subjectivity). The polarity score is a float within the range [-1.0, 1.0], where -1.0 is most negative and 1.0 is most positive. The subjectivity is a float within the range [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective. Subjective sentences generally refer to personal opinion, emotion, or judgment.


In [None]:
!pip install textblob



In [None]:
from textblob import TextBlob

testimonial = TextBlob("I am the amazing Textblob. Textblob is wonderful!")
print(testimonial.sentiment)


Sentiment(polarity=0.8, subjectivity=0.95)


In [None]:
sentences = ["I am very kind and good",
             "I am very angry",
             ]
for sentence in sentences:
    vs = TextBlob(sentence)
    print("{:.<50} {}".format(sentence, str(vs.sentiment)))

I am very kind and good........................... Sentiment(polarity=0.74, subjectivity=0.8)
I am very angry................................... Sentiment(polarity=-0.65, subjectivity=1.0)


### VADER

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media. It is fully open-sourced under the MIT License.

VADER incorporates numerous lexical features common to sentiment expression in microblogs, including:

- a full list of Western-style emoticons, for example, :-) denotes a smiley face and generally indicates positive sentiment
- sentiment-related acronyms and initialisms (e.g., LOL and WTF are both examples of sentiment-laden initialisms)
- commonly used slang with sentiment value (e.g., nah, meh and giggly).

VADER also incorporates word-order sensitive relationships between terms. For example, degree modifiers (also called intensifiers, booster words, or degree adverbs) impact sentiment intensity by either increasing or decreasing it:

- "The service here is extremely good"
- "The service here is good"
- "The service here is marginally good"

Each lexicon entry carries a sentiment polarity (positive or negative) and a sentiment intensity on a scale from –4 to +4. For example, the word "okay" has a positive valence of 0.9, "good" is 1.9, and "great" is 3.1, whereas "horrible" is –2.5, the frowning emoticon :( is –2.2, and "sucks" and its slang derivative "sux" are both –1.5.
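
To make the interaction between word valence and degree modifiers concrete, here is a toy sketch in plain Python. It is not VADER's implementation: the valences are the ones quoted above, while the booster increments (±0.3) and the function name `score_tokens` are assumptions chosen only for illustration.

```python
# Toy illustration of degree modifiers; NOT VADER's actual implementation.
# Valences quoted in the text above.
VALENCE = {"okay": 0.9, "good": 1.9, "great": 3.1, "horrible": -2.5}

# Hypothetical booster increments, chosen only for this example.
BOOSTERS = {"extremely": 0.3, "very": 0.3, "marginally": -0.3}


def score_tokens(tokens):
    """Sum word valences, adjusting each by a booster word directly before it."""
    total = 0.0
    for i, token in enumerate(tokens):
        valence = VALENCE.get(token)
        if valence is None:
            continue  # word carries no sentiment in this toy lexicon
        if i > 0:
            boost = BOOSTERS.get(tokens[i - 1], 0.0)
            # Intensifiers push the valence away from zero, dampeners toward it.
            valence += boost if valence > 0 else -boost
        total += valence
    return total


for text in ["The service here is extremely good",
             "The service here is good",
             "The service here is marginally good"]:
    print("{:.<45} {:.1f}".format(text, score_tokens(text.lower().split())))
```

The three sentences come out in the expected order, with the intensified version scoring highest and the dampened one lowest.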

In [None]:
!pip install vaderSentiment



In [None]:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
# Note: depending on how you installed (e.g., using source code download versus pip install),
# you may need to import like this:
# from vaderSentiment import SentimentIntensityAnalyzer

# --- examples -------
sentences = ["VADER is smart, handsome, and funny.",  # positive sentence example
             "VADER is smart, handsome, and funny!",  # punctuation emphasis handled correctly (sentiment intensity adjusted)
             "VADER is very smart, handsome, and funny.", # booster words handled correctly (sentiment intensity adjusted)
             "VADER is VERY SMART, handsome, and FUNNY.",  # emphasis for ALLCAPS handled
             "VADER is VERY SMART, handsome, and FUNNY!!!", # combination of signals - VADER appropriately adjusts intensity
             "VADER is VERY SMART, uber handsome, and FRIGGIN FUNNY!!!", # booster words & punctuation make this close to ceiling for score
             "VADER is not smart, handsome, nor funny.",  # negation sentence example
             "The book was good.",  # positive sentence
             "At least it isn't a horrible book.",  # negated negative sentence with contraction
             "The book was only kind of good.", # qualified positive sentence is handled correctly (intensity adjusted)
             "The plot was good, but the characters are uncompelling and the dialog is not great.", # mixed negation sentence
             "Today SUX!",  # negative slang with capitalization emphasis
             "Today only kinda sux! But I'll get by, lol", # mixed sentiment example with slang and constrastive conjunction "but"
             "Make sure you :) or :D today!",  # emoticons handled
             "Catch utf-8 emoji such as such as 💘 and 💋 and 😁",  # emojis handled
             "Not bad at all"  # Capitalized negation
             ]

analyzer = SentimentIntensityAnalyzer()
for sentence in sentences:
    vs = analyzer.polarity_scores(sentence)
    print("{:.<65} {}".format(sentence, str(vs)))

VADER is smart, handsome, and funny.............................. {'neg': 0.0, 'neu': 0.254, 'pos': 0.746, 'compound': 0.8316}
VADER is smart, handsome, and funny!............................. {'neg': 0.0, 'neu': 0.248, 'pos': 0.752, 'compound': 0.8439}
VADER is very smart, handsome, and funny......................... {'neg': 0.0, 'neu': 0.299, 'pos': 0.701, 'compound': 0.8545}
VADER is VERY SMART, handsome, and FUNNY......................... {'neg': 0.0, 'neu': 0.246, 'pos': 0.754, 'compound': 0.9227}
VADER is VERY SMART, handsome, and FUNNY!!!...................... {'neg': 0.0, 'neu': 0.233, 'pos': 0.767, 'compound': 0.9342}
VADER is VERY SMART, uber handsome, and FRIGGIN FUNNY!!!......... {'neg': 0.0, 'neu': 0.294, 'pos': 0.706, 'compound': 0.9469}
VADER is not smart, handsome, nor funny.......................... {'neg': 0.646, 'neu': 0.354, 'pos': 0.0, 'compound': -0.7424}
The book was good................................................ {'neg': 0.0, 'neu': 0.508, 'pos': 0.492, 'co

In [None]:
vs

{'compound': 0.431, 'neg': 0.0, 'neu': 0.513, 'pos': 0.487}

### VADER Scoring
The compound score is computed by summing the valence scores of each word in the lexicon, adjusted according to the rules, and then normalized to be between -1 (most extreme negative) and +1 (most extreme positive). This is the most useful metric if you want a single unidimensional measure of sentiment for a given sentence. Calling it a 'normalized, weighted composite score' is accurate.
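
The normalization step can be sketched as follows. To the best of my understanding, the vaderSentiment source maps the summed valence `x` to `x / sqrt(x² + alpha)` with `alpha = 15`; treat both the formula and the constant as assumptions to check against the version you have installed. The function name `normalize` is used here only for illustration.

```python
import math


def normalize(score, alpha=15):
    """Squash an unbounded valence sum into the range [-1, 1].

    Assumed form of VADER's normalization: score / sqrt(score^2 + alpha).
    """
    norm_score = score / math.sqrt(score * score + alpha)
    # Guard against floating-point drift outside the intended range.
    return max(-1.0, min(1.0, norm_score))


# A sentence whose only sentiment-bearing word is "good" (valence 1.9):
print(round(normalize(1.9), 4))   # → 0.4404
print(round(normalize(-1.9), 4))  # → -0.4404
```

Because the function flattens out for large inputs, piling on more sentiment words moves the compound score toward ±1 without ever exceeding it.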

It is also useful for researchers who would like to set standardized thresholds for classifying sentences as either positive, neutral, or negative. Typical threshold values (used in the literature cited on this page) are:

- positive sentiment: compound score >= 0.05
- neutral sentiment: (compound score > -0.05) and (compound score < 0.05)
- negative sentiment: compound score <= -0.05

The pos, neu, and neg scores are ratios for proportions of text that fall in each category (so these should all add up to 1, or close to it given floating-point arithmetic). These are the most useful metrics if you want multidimensional measures of sentiment for a given sentence.
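
The thresholds listed above translate directly into a small helper. The function name `classify_compound` and its `threshold` parameter are hypothetical; only the ±0.05 cut-offs come from the text.

```python
def classify_compound(compound, threshold=0.05):
    """Map a VADER compound score to a label using the typical thresholds."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Example compound scores, including two taken from the VADER output shown earlier.
for compound in [0.8316, -0.7424, 0.0]:
    print("{:>8} {}".format(compound, classify_compound(compound)))
```

In practice you would pass in `analyzer.polarity_scores(sentence)['compound']`, and you can widen or narrow `threshold` depending on how much of the text you want to treat as neutral.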

### Flair 

Flair is a state-of-the-art natural language processing (NLP) library for text. It can perform named entity recognition (NER), part-of-speech (PoS) tagging, sense disambiguation, and classification, and it supports a rapidly growing number of languages. Flair's pretrained sentiment analysis model is trained on the IMDB dataset, but it can also be trained on your own dataset.

source: https://github.com/flairNLP/flair

In [1]:
# Install from an Anaconda command prompt run as administrator, then restart Jupyter Notebook

!pip install flair

Collecting flair
  Downloading flair-0.10-py3-none-any.whl (322 kB)
Collecting sentencepiece==0.1.95
  Downloading sentencepiece-0.1.95-cp37-cp37m-manylinux2014_x86_64.whl (1.2 MB)
Collecting konoha<5.0.0,>=4.0.0
  Downloading konoha-4.6.5-py3-none-any.whl (20 kB)
Collecting bpemb>=0.3.2
  Downloading bpemb-0.3.3-py3-none-any.whl (19 kB)
Collecting gdown==3.12.2
  Downloading gdown-3.12.2.tar.gz (8.2 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing wheel metadata ... done
Collecting ftfy
  Downloading ftfy-6.0.3.tar.gz (64 kB)
Collecting langdetect
  Downloading langdetect-1.0.9.tar.gz (981 kB)
Collecting mpld3==0.3
  Downloading mpld3-0.3.tar.gz

In [4]:
# Loading the model downloads it on first use, which may take a while (266M)
from flair.models import TextClassifier
from flair.data import Sentence

classifier = TextClassifier.load('en-sentiment')
sentences = ["I am very kind and good",
             "I am very angry",
             ]
for sentence in sentences:
    s = Sentence(sentence)
    classifier.predict(s)
    print('Sentence above is: ', s.labels)

2022-01-12 16:11:15,088 loading file /root/.flair/models/sentiment-en-mix-distillbert_4.pt
Sentence above is:  [POSITIVE (0.9923)]
Sentence above is:  [NEGATIVE (0.9995)]


### Question
Create a common set of sentences (similar to the sentences in the VADER example) and use the three libraries to extract the sentiments. Compare the results.
- Which library do you think is better?
- What improvements do you think could be incorporated into the library to improve the classification?

### Answer 
I think VADER is better, because it can analyze the intensity of the sentiment in the text.

In [None]:
#Q1
sentences1 = [" I was delighted at your promotion.",
              ' I’m pleased with my son’s grades this year.',
             " His insult drives me mad! ",
             'He was annoyed by my indifference. ',
              ]

# vader
for sentence in sentences1:
    vs = analyzer.polarity_scores(sentence)
    print("{:.<65} {}".format(sentence, str(vs)))

# textblob
for sentence in sentences1:
    vs = TextBlob(sentence)
    print("{:.<50} {}".format(sentence, str(vs.sentiment)))

# flair
for sentence in sentences1:
    s = Sentence(sentence)
    classifier.predict(s)
    #print("{:.<65} {}".format(sentence, vs.labels))
    print('Sentence above is: ', s.labels)

 I was delighted at your promotion............................... {'neg': 0.0, 'neu': 0.602, 'pos': 0.398, 'compound': 0.5106}
 I’m pleased with my son’s grades this year...................... {'neg': 0.0, 'neu': 0.707, 'pos': 0.293, 'compound': 0.4404}
 His insult drives me mad! ...................................... {'neg': 0.694, 'neu': 0.306, 'pos': 0.0, 'compound': -0.7777}
He was annoyed by my indifference. .............................. {'neg': 0.487, 'neu': 0.513, 'pos': 0.0, 'compound': -0.4215}
 I was delighted at your promotion................ Sentiment(polarity=0.7, subjectivity=0.7)
 I’m pleased with my son’s grades this year....... Sentiment(polarity=0.5, subjectivity=1.0)
 His insult drives me mad! ....................... Sentiment(polarity=-0.78125, subjectivity=1.0)
He was annoyed by my indifference. ............... Sentiment(polarity=-0.4, subjectivity=0.8)
Sentence above is:  [POSITIVE (0.9945)]
Sentence above is:  [POSITIVE (0.9808)]
Sentence above is:  [NEGATIVE (0

### Answer 
2. More data can be added. If an expression that conveys emotion is in common use, a larger amount of data allows it to be scored more accurately. For example, the word "interesting" was originally neutral; if it comes to be used much more often in a derogatory sense, the model will automatically discover its negative connotations.