# Background
The objective of this project is to classify the overall sentiment of a tweet as neutral, negative, or positive using NLP classifiers. To complete this project, we are given a dataset of 27,481 tweets, of which 22,464 were captured as having a neutral, negative, or positive sentiment. Our goal is to use this training data of ~27.5k tweets to predict the sentiment of the 3,534 tweets in our testing dataset.

# Objective
Run AFINN on the testing set and measure how accurately it predicts each sentiment class.

## Importing Libraries

In [1]:
from afinn import Afinn
import re
import numpy as np
import pandas as pd

## Read Testing Dataset

In [2]:
test = pd.read_csv("/Users/bethelikejiofor/Documents/GitHub/ENTITY-Final-Project/Data/test.csv")
test.head()

,textID,text,sentiment
0,f87dea47db,Last session of the day http://twitpic.com/67ezh,neutral
1,96d74cb729,Shanghai is also really exciting (precisely -...,positive
2,eee518ae67,"Recession hit Veronique Branquinho, she has to...",negative
3,01082688c6,happy bday!,positive
4,33987a8ee5,http://twitpic.com/4w75p - I like it!!,positive


In [3]:
# Instantiate AFINN
afn = Afinn()

## Wrangling Dataset

In [4]:
def removepunct(text):
    """Remove every character that is not a word character or whitespace."""
    return re.sub(r'[^\w\s]', '', text)

In [5]:
test = test[['text', 'sentiment']].copy()  # explicit copy so the column assignments below do not trigger SettingWithCopyWarning
test['text'] = test.text.astype(str).str.lower()
test['text_clean'] = test['text'].apply(removepunct)
test.head()

,text,sentiment,text_clean
0,last session of the day http://twitpic.com/67ezh,neutral,last session of the day httptwitpiccom67ezh
1,shanghai is also really exciting (precisely -...,positive,shanghai is also really exciting precisely s...
2,"recession hit veronique branquinho, she has to...",negative,recession hit veronique branquinho she has to ...
3,happy bday!,positive,happy bday
4,http://twitpic.com/4w75p - i like it!!,positive,httptwitpiccom4w75p i like it
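
The cleaned output shows that removing punctuation fuses each URL into a single token such as `httptwitpiccom67ezh`. A minimal sketch of one way to avoid this, assuming it is acceptable to drop links before punctuation is removed; `strip_urls` and `URL_PATTERN` are hypothetical names that do not appear in the original notebook:

```python
import re

# Hypothetical pattern: matches http(s):// links and bare www. links
# up to the next whitespace character.
URL_PATTERN = re.compile(r'(?:https?://|www\.)\S+')

def strip_urls(text):
    """Remove URLs, then collapse the whitespace they leave behind."""
    without_urls = URL_PATTERN.sub(' ', text)
    return ' '.join(without_urls.split())

print(strip_urls('last session of the day http://twitpic.com/67ezh'))
print(strip_urls('http://twitpic.com/4w75p - i like it!!'))
```

Fused URL tokens are unlikely to match any lexicon entry, so this step probably matters more for keeping `text_clean` readable and reusable by other classifiers than for the AFINN scores themselves.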


## AFINN Sentiment Analysis

In [8]:
def sentiment_analyzer(text):
    """Label text by the sign of its AFINN polarity score."""
    score = afn.score(text)
    if score > 0:
        return 'positive'
    if score < 0:
        return 'negative'
    return 'neutral'
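
AFINN is a lexicon-based scorer: each word in its list carries an integer valence, and a text's score is built from the valences of the words it contains. The sketch below illustrates the sign-based labelling on a toy lexicon. `TOY_LEXICON`, `toy_score`, and `label_from_score` are hypothetical names, and the valences are illustrative values, not entries copied from the AFINN word list.

```python
# Toy word-valence table for illustration only; NOT the real AFINN lexicon.
TOY_LEXICON = {'happy': 3, 'like': 2, 'exciting': 3, 'recession': -2, 'hate': -3}

def toy_score(text):
    """Sum the valence of every known word; unknown words contribute 0."""
    return sum(TOY_LEXICON.get(word, 0) for word in text.split())

def label_from_score(score):
    """Map a polarity score to a label by its sign."""
    if score > 0:
        return 'positive'
    if score < 0:
        return 'negative'
    return 'neutral'

for tweet in ['happy bday', 'recession hit', 'last session of the day']:
    print(tweet, '->', label_from_score(toy_score(tweet)))
```

Because any text without a known sentiment word scores exactly 0, a tweet is labelled neutral whenever the lexicon recognizes nothing in it, not only when it is genuinely neutral.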

In [9]:
test['sentiment_pred'] = test['text_clean'].apply(sentiment_analyzer)

In [11]:
# Share of truly positive tweets that AFINN also labels positive (per-class recall)
pos = len(test[test['sentiment'] == 'positive'])
pos1 = len(test[(test['sentiment'] == 'positive') & (test['sentiment_pred'] == 'positive')])
print('Positive accuracy: ', pos1 / pos * 100, '%')

Positive accuracy:  84.49682683590208 %


In [12]:
# Share of truly negative tweets that AFINN also labels negative (per-class recall)
neg = len(test[test['sentiment'] == 'negative'])
neg1 = len(test[(test['sentiment'] == 'negative') & (test['sentiment_pred'] == 'negative')])
print('Negative accuracy: ', neg1 / neg * 100, '%')

Negative accuracy:  62.437562437562434 %


In [13]:
# Share of truly neutral tweets that AFINN also labels neutral (per-class recall)
neu = len(test[test['sentiment'] == 'neutral'])
neu1 = len(test[(test['sentiment'] == 'neutral') & (test['sentiment_pred'] == 'neutral')])
print('Neutral accuracy: ', neu1 / neu * 100, '%')

Neutral accuracy:  54.05594405594406 %
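
Each of the three figures above is computed within one true class, so none of them is an overall accuracy for the testing set. A minimal sketch of computing both quantities in one pass, using plain Python lists; `per_class_recall` and `overall_accuracy` are hypothetical helper names, and the labels below are made up for illustration rather than taken from the notebook's results:

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Return {label: share of items with that true label predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

def overall_accuracy(y_true, y_pred):
    """Return the share of all items whose prediction matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Made-up labels for illustration only; not results from the notebook.
y_true = ['positive', 'positive', 'negative', 'neutral', 'neutral', 'neutral']
y_pred = ['positive', 'neutral', 'negative', 'neutral', 'positive', 'neutral']

print(per_class_recall(y_true, y_pred))
print(overall_accuracy(y_true, y_pred))
```

With the notebook's DataFrame, the same helpers could be called as `per_class_recall(test['sentiment'].tolist(), test['sentiment_pred'].tolist())`.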
