# NLP Tutorial: Sentiment Analysis with Python
Author: Argyris Argyrou, PhD candidate @ Cyprus University of Technology

### What is Sentiment Analysis—Opinion Mining?
Sentiment Analysis, also known as Opinion Mining, is a field within Natural Language Processing (NLP) that builds systems to identify and extract opinions from text. Besides identifying the opinion itself, these systems usually extract attributes of the expression, e.g.:

- **Polarity**: whether the speaker expresses a positive or negative opinion,
- **Subject**: the thing that is being talked about,
- **Opinion holder**: the person or entity that expresses the opinion.


### Labeled texts
One approach to sentiment analysis starts with **labeled
texts** and uses supervised machine learning trained on the labeled text data to
classify the polarity of new texts. 
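
As a rough illustration of the supervised approach, the sketch below trains a tiny multinomial Naive Bayes classifier with add-one smoothing in plain Python. The training sentences, the `tokenize`, `train`, and `predict` helpers, and the choice of Naive Bayes itself are assumptions made for this example; the tutorial does not prescribe a particular model, and a real system would need far more labeled data.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set (text, label); not real data.
TRAIN = [
    ("I love this phone", "positive"),
    ("great picture quality", "positive"),
    ("what an amazing camera", "positive"),
    ("I hate this phone", "negative"),
    ("poor picture quality", "negative"),
    ("the earphone broke and I hate it", "negative"),
]

def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()

def train(examples):
    """Count word frequencies per label and documents per label."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of documents
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothed log likelihood of the word.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(predict("amazing picture quality", model))
print(predict("I hate the poor camera", model))
```

The classifier only knows the words it saw in the labeled examples, which is why the quality and coverage of the labeled texts largely determine how well this approach works.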

### Sentiment Lexicons

Another approach creates a sentiment
**lexicon** and scores the text based on some function that describes how the words
and phrases of the text match the lexicon.
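
A minimal sketch of lexicon-based scoring is shown below. The word list `LEXICON`, its valence values, and the helper names `lexicon_score` and `polarity_label` are hypothetical and chosen only for illustration; the scoring function here is a plain sum of word valences, which is one of many possible choices.

```python
import re

# Hypothetical mini-lexicon: word -> valence (negative < 0 < positive).
LEXICON = {
    "amazing": 3, "excellent": 3, "good": 2, "better": 1,
    "poor": -2, "bad": -2, "broke": -1, "terrible": -3,
}

def lexicon_score(text, lexicon=LEXICON):
    """Sum the valence of every token that appears in the lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)

def polarity_label(score):
    """Map a numeric score to a coarse polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for sentence in [
    "The voice quality of this phone is amazing.",
    "The picture quality of camera A is poor.",
    "I saw that yesterday.",
]:
    score = lexicon_score(sentence)
    print(sentence, score, polarity_label(score))
```

Words missing from the lexicon contribute nothing to the score, so a text with no matching words is labeled neutral in this sketch.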

### What Is an Opinion?
Text information can be broadly categorized into two main types: **facts** and **opinions**.<br>
<br>**Facts** are objective expressions about something. <br>**Opinions** are usually subjective expressions that describe people’s sentiments, appraisals, and feelings toward a subject or topic.

Sentiment analysis, like many other NLP problems, can be modeled as a *classification* problem in which two sub-problems must be resolved:

- **subjectivity classification**: Classifying a sentence as *subjective* or *objective*.
- **polarity classification**: Classifying a sentence as expressing a *positive*, *negative* or *neutral opinion*.

### Direct vs Comparative Opinions
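Direct opinions give an opinion about an entity directly, while comparative opinions express it by comparing one entity with another. In the examples below, the first sentence is a direct opinion and the second is a comparative one: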
<br><br>
> “The picture quality of camera A is poor.”<br>
> “The picture quality of camera A is better than that of camera B.”

###  Explicit vs Implicit Opinions

An explicit opinion on a subject is an opinion explicitly expressed in a subjective sentence. 
> “The voice quality of this phone is amazing.”<br>

An implicit opinion on a subject is an opinion implied in an objective sentence. The following sentence expresses an implicit negative opinion:
> “The earphone broke in two days.” <br>

### Scope

Sentiment analysis can be applied at different levels of scope: **Document** level, **Sentence** level, and **Sub-sentence** level (sub-expressions within a sentence).
<br>
Systems can also differ in what they aim to detect:

- Polarity (positive, negative, neutral)
- Emotions (angry, happy, sad, etc.)
- Intentions (e.g. interested vs. not interested)

### Challenges

- Subjectivity and Tone
- Context and Polarity
- Irony and Sarcasm
- Comparisons
- Emojis
- Defining Neutral


### Use Cases & Applications

- Social media monitoring
- Brand monitoring
- Voice of customer (VoC)
- Customer service
- Workforce analytics and voice of employee
- Product analytics
- Market research and analysis

### Sentiment Analysis Algorithms

- **Rule-based** systems that perform sentiment analysis based on a set of manually crafted rules, such as lexicons.
- **Automatic** systems that rely on machine learning techniques to learn from data, typically by training a classifier.
- **Hybrid** systems that combine both rule-based and automatic approaches.

### Sentiment Analysis with Afinn

The AFINN lexicon is one of the simplest and most popular lexicons for sentiment analysis. It is a list of words, each rated with an integer valence between -5 (negative) and +5 (positive).

In [None]:
# To install AFINN, simply run
# !pip install afinn

In [9]:
# initialize afinn sentiment analyzer
from afinn import Afinn
af = Afinn()
af.score('This is utterly excellent')

3.0

In [12]:
af = Afinn(emoticons=True)  # also score emoticons; for Danish text use Afinn(language='da')
af.score('I saw that yesterday :)')

2.0

### Sentiment Analysis with TextBlob
TextBlob is another excellent open-source library for performing NLP tasks with ease, including sentiment analysis. 

In [3]:
# To install TextBlob, simply run
# !pip install -U textblob

The sentiment property returns a namedtuple of the form Sentiment(polarity, subjectivity). The polarity score is a float within the range [-1.0, 1.0]. The subjectivity is a float within the range [0.0, 1.0] where 0.0 is very objective and 1.0 is very subjective.

In [17]:
# Each word in the lexicon has scores for:
# 1)     polarity: negative vs. positive    (-1.0 => +1.0)
# 2) subjectivity: objective vs. subjective (+0.0 => +1.0)
# 3)    intensity: modifies next word?      (x0.5 => x2.0)

from textblob import TextBlob

testimonial = TextBlob("Textblob is amazingly simple to use. What great fun!")
testimonial.sentiment

Sentiment(polarity=0.39166666666666666, subjectivity=0.4357142857142857)
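
To connect this output to the two sub-problems described earlier (subjectivity classification and polarity classification), the sketch below turns a `(polarity, subjectivity)` pair into a coarse label. The `describe` helper and both thresholds are assumptions made for illustration, not part of TextBlob; the local `Sentiment` namedtuple is only a stand-in with the same field names so the example runs without TextBlob installed.

```python
from collections import namedtuple

# Stand-in with the same field names as the tuple described above,
# so this sketch runs without TextBlob installed.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])

def describe(sentiment, subj_threshold=0.3, pol_threshold=0.1):
    """Label a (polarity, subjectivity) pair.

    Both thresholds are hypothetical and should be tuned on real data.
    """
    # Step 1: subjectivity classification.
    if sentiment.subjectivity < subj_threshold:
        return "objective"
    # Step 2: polarity classification for subjective text.
    if sentiment.polarity > pol_threshold:
        return "positive"
    if sentiment.polarity < -pol_threshold:
        return "negative"
    return "neutral"

# Values copied from the TextBlob output shown above.
result = Sentiment(polarity=0.39166666666666666, subjectivity=0.4357142857142857)
print(describe(result))
```

Because `describe` only reads the `polarity` and `subjectivity` attributes, passing `testimonial.sentiment` from the cell above should work in the same way.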

### Further reading
The following are among the most frequently cited papers on sentiment analysis:

- Opinion mining and sentiment analysis (Pang and Lee, 2008)
- Recognizing contextual polarity in phrase-level sentiment analysis (Wilson, Wiebe and Hoffmann, 2005).
- Sentiment analysis and subjectivity (Liu, 2010)
- A survey of opinion mining and sentiment analysis (Liu and Zhang, 2012)
- Sentiment analysis and opinion mining (Liu, 2012)


- Progress in NLP https://nlpprogress.com/english/sentiment_analysis.html

- Objective: Use sentiment scores of live tweets to evaluate the mood of each state. http://individual.utoronto.ca/zabet/twitter_sentiment_analysis.html

- This tutorial is based on https://monkeylearn.com/sentiment-analysis/