# Twitter US Sentiment Analysis using ULMFiT

## Step 1: Downloading the data

You can get the data from the official [Kaggle Page](https://www.kaggle.com/crowdflower/twitter-airline-sentiment/version/4#Tweets.csv)

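Download and install fastai. The code in this notebook uses the fastai v1 API (`from fastai.text import *`, `TextLMDataBunch`), so install a 1.x release:

In [None]:
!pip install "fastai<2"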

## Step 2: Preparing and Processing the data

In this notebook, we start with some initial data processing. First, we read the tweets from the CSV file into a single DataFrame and keep only the sentiment label and the tweet text. Then we clean the text and split the dataset into a training set and a test set.

In [1]:
import os
import glob
import pandas as pd
import numpy as np

def read_data(data_dir='tweets.csv'):
    #read data from the csv file
    data = pd.read_csv(data_dir)              
    return data

In [2]:
data = read_data()

#1. Get the number of positive, negative and neutral sentiments
overall_count = data['airline_sentiment'].count()
count_positive = data[data['airline_sentiment'] == 'positive']['airline_sentiment'].count()
count_negative = data[data['airline_sentiment'] == 'negative']['airline_sentiment'].count()
count_neutral = data[data['airline_sentiment'] == 'neutral']['airline_sentiment'].count()

print("The total number of Positive sentiments tweets are", count_positive)
print("The total number of Negative sentiments tweets are",count_negative)
print("The total number of Neutral sentiments tweets are",count_neutral)

assert (count_positive + count_negative + count_neutral) == overall_count, \
'AssertionError: There Seems to be some null values in the airline_sentiment column, please remove the null values or make sure to impute them appropriately'

# One more check: make sure no negative tweet is missing its tweet_id
negative_ids = data.loc[data['airline_sentiment'] == 'negative', 'tweet_id']
if negative_ids.notnull().sum() == count_negative:
    print()
    print('There is no null values associated with negative tweets')
else:
    print("There are {} null values associated with negative tweets".format(negative_ids.isnull().sum()))

# print(data)
# #Look into the data
# neutral_positive_sentiment = data[ ((data['airline_sentiment'] == 'neutral') | (data['airline_sentiment'] == 'positive')) ]
# neutral_positive_sentiment['negativereason'].isnull()

# Some of the other columns contain null values, so we keep only the 2 columns we are concerned with:
# 1. airline_sentiment 2. text
cleaned_data = data[['airline_sentiment', 'text']].dropna()
cleaned_data

The total number of Positive sentiments tweets are 2363
The total number of Negative sentiments tweets are 9178
The total number of Neutral sentiments tweets are 3099

There is no null values associated with negative tweets


Unnamed: 0,airline_sentiment,text
0,neutral,@VirginAmerica What @dhepburn said.
1,positive,@VirginAmerica plus you've added commercials t...
2,neutral,@VirginAmerica I didn't today... Must mean I n...
3,negative,@VirginAmerica it's really aggressive to blast...
4,negative,@VirginAmerica and it's a really big bad thing...
...,...,...
14635,positive,@AmericanAir thank you we got on a different f...
14636,negative,@AmericanAir leaving over 20 minutes Late Flig...
14637,neutral,@AmericanAir Please bring American Airlines to...
14638,negative,"@AmericanAir you have my money, you change my ..."


In [3]:
import re

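# Remove @mentions (when followed by whitespace), apostrophes, and the '#' sign from the tweets
def process_text(text):
    word = re.sub(r'@\w+\s+', "", text)   # drop "@someone" and the whitespace after it
    word = re.sub(r'\'', "", word)        # drop apostrophes ("didn't" -> "didnt")
    word = re.sub(r'#', '', word)         # drop '#' but keep the hashtag word
    return word


cleaned_data['text'] = cleaned_data['text'].apply(process_text)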
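
To see what `process_text` does and does not strip, here is a minimal sketch that applies the same three substitutions to a few made-up tweets (the sample strings are hypothetical, not taken from the dataset). Note that a mention is only removed when it is followed by whitespace, so a mention at the very end of a tweet, or directly before punctuation, survives.

```python
import re

def process_text(text):
    # Same three substitutions as the cell above.
    text = re.sub(r'@\w+\s+', "", text)   # drop "@user" plus the whitespace after it
    text = re.sub(r'\'', "", text)        # drop apostrophes
    text = re.sub(r'#', '', text)         # drop the "#" sign, keep the tag word
    return text

samples = [
    "@VirginAmerica What @dhepburn said.",
    "@united I didn't get my #refund",
    "great flight, thanks @JetBlue",      # mention at the end: no trailing whitespace
]
cleaned = [process_text(s) for s in samples]
for before, after in zip(samples, cleaned):
    print(repr(before), "->", repr(after))
```

If trailing mentions should also be removed, the pattern would need to allow an end-of-string match as well; that would change the cleaned data and is not done in this notebook.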

In [4]:
cleaned_data.columns=['labels', 'text']
cleaned_data

Unnamed: 0,labels,text
0,neutral,What said.
1,positive,plus youve added commercials to the experience...
2,neutral,I didnt today... Must mean I need to take anot...
3,negative,"its really aggressive to blast obnoxious ""ente..."
4,negative,and its a really big bad thing about it
...,...,...
14635,positive,thank you we got on a different flight to Chic...
14636,negative,leaving over 20 minutes Late Flight. No warnin...
14637,neutral,Please bring American Airlines to BlackBerry10
14638,negative,"you have my money, you change my flight, and d..."


In [5]:
# Split into Training and test set in the ratio 80:20
import numpy as np
from sklearn.model_selection import train_test_split
train_df, test_df = train_test_split(cleaned_data, test_size=0.20)
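
Because the three classes are far from balanced, a purely random split can leave the test set with slightly different class proportions than the training set. The sketch below illustrates the idea of a stratified 80:20 split using only the standard library and a made-up list of labels. `stratified_split` is a hypothetical helper and not part of this notebook; in practice, `train_test_split` accepts `stratify=` and `random_state=` arguments that serve the same purpose.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_size=0.2, seed=0):
    """Return (train_idx, test_idx) with roughly equal class shares in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)                       # shuffle within each class
        n_test = round(len(idxs) * test_size)   # per-class test quota
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Toy labels with a similar imbalance to the tweets (hypothetical data).
labels = ['negative'] * 60 + ['neutral'] * 25 + ['positive'] * 15
train_idx, test_idx = stratified_split(labels)
test_counts = Counter(labels[i] for i in test_idx)
print(test_counts)
```

Fixing the seed also makes the split reproducible, which the unseeded call above is not.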

Now that we have split the cleaned data into training and test sets, we reset their indices and save each set to a CSV file so that it can be accessed later.

In [6]:
train_df = train_df.reset_index(drop=True)
test_df = test_df.reset_index(drop=True)

In [7]:
# Save the training and testing data in a CSV file for future use
train_df.to_csv('train.csv', header=False, index=False)
test_df.to_csv('test.csv', header=False, index=False)

# Model Training

### Only run the cells below if you want to train the model from scratch
#### PS: it will take longer to finish

### Text models, data, and training
The text module of the fastai library contains all the necessary functions to define a Dataset suitable for the various NLP (Natural Language Processing) tasks and quickly generate models you can use for them. Specifically:

- `text.transform` contains all the scripts to preprocess your data, from raw text to token ids,
- `text.data` contains the definition of `TextDataBunch`, which is the main class you'll need in NLP,
- `text.learner` contains helper functions to quickly create a language model or an RNN classifier.

**We are using `text.learner` from fast.ai to create a sentiment classifier, which is essentially an RNN classifier**

In [9]:
from fastai.text import * 
# Language model data
data_lm = TextLMDataBunch.from_csv('', 'train.csv')
# Classifier model data
data_clas = TextClasDataBunch.from_csv('', 'train.csv', vocab=data_lm.train_ds.vocab, bs=32)

In [10]:
data_lm.save('data_lm_export.pkl')
data_clas.save('data_clas_export.pkl')

In [11]:
data_lm = load_data('', 'data_lm_export.pkl')
data_clas = load_data('', 'data_clas_export.pkl', bs=16)

In [13]:
data_clas.show_batch()

text,target
xxbos xxmaj hi have a question re future xxmaj flight xxmaj booking xxmaj problems . xxup xxunk - xxup xxunk 29 / 9 xxup xxunk - xxup lax 8 / 10 xxup lax - xxup xxunk 13 / 10 . i m * xxup xxunk xxmaj what is checked bag xxunk for xxup xxunk - xxup lax ?,neutral
xxbos xxup xxunk u xxup us xxup airways xxup with xxup yo xxup shitty xxup chicken xxup xxunk xxup xxunk xxup that xxup so xxup xxunk xxup and u xxup xxunk xxup make xxup me xxup wait xxup in a 6 xxup hr xxup layover xxup xxunk u xxup and,negative
"xxbos e xxrep 4 y ! xxmaj cancelled xxmaj flightlations , xxmaj flight xxmaj booking xxmaj problemss , reflight xxmaj booking xxmaj problemss , but y all got me on the same flight out tonight ( not tomorrow ) & & the xxup fc upgrade . xxmaj thx !",positive
xxbos 7 xxup weeks xxmaj late flightr xxup and i xxup still xxup have xxup not xxup received xxup my xxup miles xxup from xxup the mileageplus xxmaj gift xxmaj card $ 150 xxup xxunk xxup card i xxup handed xxup over ! ! !,negative
xxbos xxmaj my xxmaj flight xxmaj booking xxmaj problems xxup xxunk just times out when i select it under xxmaj manage xxmaj my xxmaj flight xxmaj booking xxmaj problems for months now . i have emailed but no response . xxmaj help ?,negative
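
The `xx`-prefixed tokens in the batch above are special markers inserted by fastai's tokenizer: `xxbos` marks the beginning of a text, `xxmaj` means the next word was capitalized, `xxup` means the next word was all upper case, `xxrep` marks a repeated character, and `xxunk` replaces words outside the vocabulary. The following is a rough, simplified imitation of the capitalization rules only, written to make the output easier to read. `mark_case` is a hypothetical helper, not fastai's actual implementation.

```python
def mark_case(words):
    """Lower-case words and record their original casing with marker tokens."""
    out = ['xxbos']                      # beginning-of-text marker
    for w in words:
        if len(w) > 1 and w.isupper():
            out += ['xxup', w.lower()]   # e.g. "LAX" -> xxup lax
        elif len(w) > 1 and w[0].isupper():
            out += ['xxmaj', w.lower()]  # e.g. "Hi" -> xxmaj hi
        else:
            out.append(w.lower())        # single letters are simply lower-cased
    return out

tokens = mark_case("Hi have a question for LAX I".split())
print(" ".join(tokens))
```

Encoding case as separate tokens lets the model share one embedding for `hi`, `Hi` and `HI` while still seeing that the original was capitalized.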


In [12]:
# First create the text classifier and train it for one epoch using the one-cycle policy
learn = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn.fit_one_cycle(1, 1e-2)

epoch,train_loss,valid_loss,accuracy,time
0,0.909974,0.986095,0.621852,19:57


In [14]:
learn.predict("This was a great movie!")

(Category negative, tensor(0), tensor([0.8158, 0.0961, 0.0880]))
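
`learn.predict` returns a triple: the predicted category, its index, and a tensor of class probabilities. Here is a small sketch of how to read that output, assuming the classes are ordered alphabetically (`negative`, `neutral`, `positive`), which is consistent with `negative` having index 0 above. The `classes` list is written out by hand for illustration rather than read from the learner.

```python
classes = ['negative', 'neutral', 'positive']   # assumed class order
probs = [0.8158, 0.0961, 0.0880]                # copied from the output above

by_class = dict(zip(classes, probs))
best = max(by_class, key=by_class.get)          # class with the highest probability
print(by_class)
print("predicted:", best)
```

At this stage the model gives about 82% probability to `negative` for a clearly positive sentence, so one epoch with a frozen encoder is not enough.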

In [15]:
learn.freeze_to(-2)
learn.fit_one_cycle(1, slice(5e-3/2., 5e-3))

epoch,train_loss,valid_loss,accuracy,time
0,0.835096,0.890275,0.626547,20:54


In [16]:
learn.predict("best")

(Category negative, tensor(0), tensor([0.8608, 0.0862, 0.0530]))

In [17]:
learn.unfreeze()
learn.fit_one_cycle(1, slice(2e-3/100, 2e-3))

epoch,train_loss,valid_loss,accuracy,time
0,0.721916,0.739642,0.673069,21:53
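
Passing `slice(2e-3/100, 2e-3)` asks for discriminative learning rates: the earliest layer group trains with the smallest rate and the last group with the largest. The sketch below shows one way such a range can be spread geometrically over the layer groups. The number of groups (five) and the helper name `spread_lrs` are assumptions for illustration; check `learn.layer_groups` for the actual count.

```python
def spread_lrs(start, stop, n_groups):
    """Geometrically spaced learning rates from start to stop (inclusive)."""
    if n_groups == 1:
        return [stop]
    step = (stop / start) ** (1 / (n_groups - 1))   # constant ratio between groups
    return [start * step ** i for i in range(n_groups)]

lrs = spread_lrs(2e-3 / 100, 2e-3, n_groups=5)
for i, lr in enumerate(lrs):
    print(f"group {i}: {lr:.2e}")
```

The early layers hold general language knowledge from pretraining, so they get small updates, while the classifier head is adjusted more aggressively.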


In [18]:
learn.predict("This was a great movie!")

(Category negative, tensor(0), tensor([0.6062, 0.0696, 0.3242]))

In [19]:
learn.fit_one_cycle(1, slice(2e-3/100, 2e-3))

epoch,train_loss,valid_loss,accuracy,time
0,0.674309,0.651649,0.71831,23:38


In [28]:
learn.predict("This was a great movie!")

(Category negative, tensor(0), tensor([0.4941, 0.3924, 0.1135]))

In [29]:
learn.fit_one_cycle(1, slice(2e-3/100, 2e-3))

epoch,train_loss,valid_loss,accuracy,time
0,0.667502,0.654175,0.722578,20:00


In [61]:
learn.predict("This was a great movie!")

str

In [30]:
# Save the fine-tuned encoder (save_encoder stores only the encoder weights, not the classifier head)
learn.save_encoder('ft_enc')

## Load the trained Model

In [None]:
learn = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn.load_encoder('ft_enc')

In [57]:
test_data = test_df.text.values
test_data

array(['to allow payment by Apple Pay - Patriarc http://t.co/bdUaUZFhW2',
       'Definitely! Lots of announcements and the app is great.', 'AIRPORT CODE TEST GO. SLC.  BOS MCO',
       'to add insult to injury, I have to go pick it up myself. Real class act! Ill stick the Delta from now on.',
       ..., 'sucks!! Teco teco reclameaqui TripAdvisor http://t.co/auGJsCmoLu',
       'Lol, k. “@JetBlue: Our fleets on fleek. http://t.co/IUX94Rgc83”',
       'Im here airport.  Ive waited 3 hours for my bag. No one knows shit. Mgmt knows nothing.  Very mad customer',
       'thanks!!'], dtype=object)

In [82]:
predicted_labels = []
for i in test_data:
    label = str(learn.predict(i)[0])
    predicted_labels.append(label)
predicted_labels

['neutral',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'positive',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'positive',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'neutral',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'neutral',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'neutral',
 'negative',
 'negative',
 'negative',
 'positive',
 'negative',
 'neutral',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'negative',
 'positive',
 'negative',
 'positive',
 ...]

# Conclusion/Output

## Evaluation Metrics

## Accuracy
Accuracy measures the share of tweets that were classified correctly. However, since our dataset is not well balanced, accuracy alone is not a sufficient evaluation metric.

In [86]:
from sklearn.metrics import accuracy_score
y_true = test_df['labels']
accuracy_score(y_true, predicted_labels)

0.7353142076502732

We get an accuracy of about 73.5% after the training above, but accuracy alone does not tell us how the model performs on each class, so we also compute precision, recall and F1-score to evaluate our results.
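
To see why accuracy alone is misleading on imbalanced data, consider a trivial baseline that labels every tweet as negative. Using the class counts printed in Step 2, this sketch computes the accuracy and the macro-averaged recall such a baseline would reach on the full dataset. It is an illustration only and not part of the original evaluation.

```python
counts = {'negative': 9178, 'neutral': 3099, 'positive': 2363}  # from Step 2
total = sum(counts.values())

# A "classifier" that always answers with the most frequent class.
majority = max(counts, key=counts.get)
baseline_accuracy = counts[majority] / total

# Recall is 1 for the majority class and 0 for every other class.
recall = {c: (1.0 if c == majority else 0.0) for c in counts}
macro_recall = sum(recall.values()) / len(recall)

print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"macro-averaged recall: {macro_recall:.3f}")
```

The baseline already reaches about 63% accuracy without finding a single neutral or positive tweet, which is why the per-class metrics matter.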

In [89]:
from sklearn.metrics import classification_report
target_names = ['negative', 'neutral', 'positive']  # must follow the sorted label order used by classification_report
print(classification_report(y_true, predicted_labels, target_names=target_names))

              precision    recall  f1-score   support

    negative       0.73      0.96      0.83      1809
     neutral       0.70      0.30      0.42       613
    positive       0.85      0.44      0.58       506

    accuracy                           0.74      2928
   macro avg       0.76      0.57      0.61      2928
weighted avg       0.74      0.74      0.70      2928



Precision tells us what fraction of the tweets predicted as a given class truly belong to that class, while recall is the fraction of the tweets that truly belong to a class that our model classified correctly. From the table above, the **precision** values for all three classes are reasonably good: 73% of the tweets classified as negative are truly negative, 70% of the tweets classified as neutral are truly neutral, and 85% of the tweets classified as positive are truly positive.
The **recall** values tell us that about 96% of the negative tweets were classified correctly, whereas only 30% of the neutral tweets and 44% of the positive tweets were.
Since the precision and recall scores point to different conclusions about our model, we use the F1-score to find out how good the model actually is.
The **F1-score** is the harmonic mean of precision and recall. In other words, it balances precision and recall for our classifier: if either precision or recall is low, the F1-score is low as well.

**The F1-score for negative sentiment is good (0.83), which means our model identifies negative tweets well, but it is low for neutral (0.42) and positive (0.58) sentiment, which means the model performs poorly at identifying these two sentiments. Since the dataset contains comparatively few positive and neutral tweets, we can accept these results for now, but to improve them we will need more positive and neutral tweets.**
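
As a quick check of the harmonic-mean definition, the sketch below recomputes the F1-scores from the precision and recall values in the three class rows of the report, in the order they appear. The helper `f1` is written here for illustration; small differences from the report could arise because the inputs are already rounded to two decimals.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) for the three class rows of the report above.
rows = [(0.73, 0.96), (0.70, 0.30), (0.85, 0.44)]
scores = [round(f1(p, r), 2) for p, r in rows]

# The macro average is the unweighted mean of the per-class F1-scores.
macro_f1 = round(sum(scores) / len(scores), 2)
print(scores, macro_f1)
```

The harmonic mean is pulled towards the smaller of the two values, so a class with high precision but low recall still ends up with a low F1-score.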