# Sentiment Analysis
Sentiment analysis is the process of extracting the sentiment expressed in a text, typically by classifying it into categories such as positive, negative, and neutral.
The classification can also capture how intense the sentiment is (e.g. very positive, positive, negative, very negative instead of just positive/negative), as well as more nuanced emotions.
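
As a rough illustration of intensity categories, the sketch below maps a numeric sentiment score to one of five labels. The function name `label_sentiment` and the threshold of 3 are assumptions made for this example, not part of any library; a real system would tune such cut-offs on data.

```python
def label_sentiment(score, strong_threshold=3):
    """Map a numeric sentiment score to a coarse intensity label.

    The threshold is an arbitrary, illustrative choice.
    """
    if score >= strong_threshold:
        return "very positive"
    if score > 0:
        return "positive"
    if score == 0:
        return "neutral"
    if score > -strong_threshold:
        return "negative"
    return "very negative"


for example_score in (4, 1, 0, -1, -4):
    print(example_score, "->", label_sentiment(example_score))
```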

### Lexicon-based Sentiment

Lexicons are dictionaries that contain a collection of words and their corresponding sentiment value (for example ‘happy’ - positive, ‘sad’ - negative).
The overall sentiment of a text is then usually measured naively, by adding up the sentiment values of the words it contains.
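
The naive summing approach can be sketched in a few lines. The lexicon below is a made-up toy dictionary (its words and scores are illustrative assumptions, not taken from a real lexicon), and `naive_lexicon_score` is a hypothetical helper name.

```python
# Toy lexicon: words and scores are invented for illustration only.
TOY_LEXICON = {"happy": 2, "great": 3, "sad": -2, "terrible": -3}


def naive_lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of the words in text; unknown words count as 0."""
    # Lowercase, split on whitespace, and drop surrounding punctuation
    tokens = [token.strip(".,;:!?\"'()") for token in text.lower().split()]
    return sum(lexicon.get(token, 0) for token in tokens)


print(naive_lexicon_score("A great day, but a sad ending."))  # 3 + (-2) = 1
```

Because every word is scored in isolation, this approach has no notion of context such as negation or sarcasm.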

### Afinn
AFINN is a lexicon-based sentiment analysis tool that uses a predefined list of English words, each assigned an integer sentiment score between -5 (most negative) and +5 (most positive). To analyze the sentiment of a given piece of text, AFINN sums up the sentiment scores of individual words found in the text. The resulting score provides an overall measure of the sentiment polarity of the input text.

In [1]:
#!pip install afinn


In [3]:
from afinn import Afinn
afinn = Afinn()
afinn.score('This is utterly excellent!') # Positive

3.0

In [4]:
afinn.score('Things were going pretty bad in there, some got sick and had to return.') # Negative

-4.0

In [5]:
afinn.score('This is not at all excellent!') # Negation is ignored, so this is naively scored as positive

3.0

AFINN is simple, fast, and reasonably effective for sentiment analysis of short, informal text. However, as the last example shows, it ignores negation, so it can be less accurate than machine learning models or lexicon-based approaches that take linguistic nuances into account.
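
One simple way a lexicon-based approach could account for negation is to flip the sign of a sentiment word when a negator appears shortly before it. The sketch below is a minimal illustration under stated assumptions: the toy lexicon, the negator list, the window of 3 tokens, and the name `negation_aware_score` are all choices made for this example and are not how the afinn package works.

```python
# Illustrative assumptions: a toy lexicon and a small list of negators.
TOY_LEXICON = {"excellent": 3, "bad": -3}
NEGATORS = {"not", "no", "never"}


def negation_aware_score(text, lexicon=TOY_LEXICON, window=3):
    """Sum word scores, flipping the sign of a word preceded by a negator.

    A word counts as negated if a negator occurs among the `window`
    tokens immediately before it.
    """
    tokens = [token.strip(".,;:!?\"'()") for token in text.lower().split()]
    total = 0
    for i, token in enumerate(tokens):
        value = lexicon.get(token, 0)
        preceding = set(tokens[max(0, i - window):i])
        if value and NEGATORS & preceding:
            value = -value  # negated sentiment word
        total += value
    return total


print(negation_aware_score("This is utterly excellent!"))     # 3
print(negation_aware_score("This is not at all excellent!"))  # -3
```

A fixed window is a crude heuristic: it can miss negators that are further away and can wrongly flip words in a following clause.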

### Exercise 1 - Build your own Afinn!

Let's create our own sentiment analyzer using the AFINN-en-165.txt data.
This file contains 3382 unique English words, each with an associated sentiment score between -5 and 5.

Create a function called get_sentiment_score(text) that takes a text as input and returns its sentiment score.

The output sentiment score is the sum of the sentiment scores of the words in the text (use 0 for words not found in AFINN).


In [6]:
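import nltk
from nltk.stem import WordNetLemmatizer
#nltk.download('punkt')
#nltk.download('wordnet')
lemmatizer = WordNetLemmatizer()

# Load AFINN lexicon into a dictionary
def load_afinn(filepath):
    """Read a tab-separated 'word<TAB>score' file into a {word: score} dict."""
    afinn = {}
    with open(filepath, "r", encoding="utf-8") as file:
        for line in file:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            word, score = line.split('\t')
            afinn[word] = int(score)
    return afinn

# Write your code here
def get_sentiment_score(text):
    """Sum the AFINN scores of the words in text; words not in the lexicon count as 0."""
    # split() with no argument also handles tabs, newlines and repeated spaces
    words = text.lower().split()
    sentiment_score = 0
    for word in words:
        # Remove surrounding punctuation so that "crashed." matches "crashed"
        word = word.strip(" .,;'?!`~-")
        # Reduce verb forms to their base form, e.g. "crashed" -> "crash"
        word = lemmatizer.lemmatize(word, pos="v")
        if word in my_afinn:
            sentiment_score += my_afinn[word]
    return sentiment_score

# Load AFINN lexicon
filepath = "AFINN-en-165.txt"
my_afinn = load_afinn(filepath)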

In [7]:
get_sentiment_score("The plane crashed.")

-2