# Sentiment Analysis performance benchmark

In general, documents with similar sentiment tend to lie close to each other in the embedding feature space. This proximity offers another way to judge the performance of sentiment analysis models.
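
As a rough illustration of this idea, the sketch below compares the average cosine similarity of same-label pairs with that of different-label pairs. The helper names (`cosine_similarity`, `mean_pairwise_similarity`) and the toy 2-D vectors are assumptions made for illustration only; they are not embeddings produced by any model evaluated in this work.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(embeddings, labels):
    """Return (mean same-label similarity, mean different-label similarity)."""
    same, different = [], []
    for i, j in combinations(range(len(embeddings)), 2):
        sim = cosine_similarity(embeddings[i], embeddings[j])
        if labels[i] == labels[j]:
            same.append(sim)
        else:
            different.append(sim)
    return sum(same) / len(same), sum(different) / len(different)


# Toy 2-D vectors, invented for illustration (not real model embeddings).
toy_embeddings = [[1.0, 0.1], [0.9, 0.2], [-1.0, 0.1], [-0.8, -0.1]]
toy_labels = ["positive", "positive", "negative", "negative"]

intra, inter = mean_pairwise_similarity(toy_embeddings, toy_labels)
print(f"same-label: {intra:.3f}, different-label: {inter:.3f}")
```

If a model separates sentiments well in its embedding space, the same-label average would be expected to exceed the different-label average. How large a gap counts as good is a judgment call that this sketch does not settle.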

In this work, we benchmark recent sentiment analysis models, reproduce their reported results, and compare their performance against baseline methods.

This work follows this plan:

- Understanding the data
- 

Throughout this project, we work on Ubuntu 20.04 with Python 3.7. We use libraries such as PyTorch 1.8 and TensorFlow 2, both running on the GPU. The specifications of the GPU used for our experiments are shown below.

In [1]:
!nvidia-smi

Mon Dec  7 22:09:06 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.05    Driver Version: 450.51.05    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  GeForce GTX 1060    On   | 00000000:01:00.0 Off |                  N/A |
| N/A   62C    P0    27W /  N/A |    396MiB /  6078MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+---------------------------------------------------------------------------
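
Besides `nvidia-smi`, it can be useful to confirm that the frameworks themselves can see the GPU. The sketch below is one possible check and was not part of the original experiments. The function name `describe_environment` is hypothetical, and each framework is queried only if it is installed.

```python
import importlib.util
import sys


def describe_environment():
    """Collect Python and framework versions plus GPU visibility."""
    info = {"python": sys.version.split()[0]}

    # Query PyTorch only if it is installed in this environment.
    if importlib.util.find_spec("torch") is not None:
        import torch
        info["torch"] = torch.__version__
        info["torch_cuda_available"] = torch.cuda.is_available()

    # Query TensorFlow only if it is installed in this environment.
    if importlib.util.find_spec("tensorflow") is not None:
        import tensorflow as tf
        info["tensorflow"] = tf.__version__
        info["tensorflow_gpu_count"] = len(tf.config.list_physical_devices("GPU"))

    return info


environment = describe_environment()
for key, value in environment.items():
    print(f"{key}: {value}")
```

Recording these values alongside the results makes it easier to tell whether a later run used the same software stack.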

### I - Understanding the data

Throughout this experiment, we work with a Twitter dataset of complaints and reviews that people posted about airline companies. These reviews got a 

In [2]:
import pandas as pd

In [3]:
df = pd.read_csv('tweets.csv')

In [4]:
df.head()

| | tweet_id | airline_sentiment | airline_sentiment_confidence | negativereason | negativereason_confidence | airline | airline_sentiment_gold | name | negativereason_gold | retweet_count | text | tweet_coord | tweet_created | tweet_location | user_timezone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 570306133677760513 | neutral | 1.0 | | | Virgin America | | cairdin | | 0 | @VirginAmerica What @dhepburn said. | | 2015-02-24 11:35:52 -0800 | | Eastern Time (US & Canada) |
| 1 | 570301130888122368 | positive | 0.3486 | | 0.0 | Virgin America | | jnardino | | 0 | @VirginAmerica plus you've added commercials t... | | 2015-02-24 11:15:59 -0800 | | Pacific Time (US & Canada) |
| 2 | 570301083672813571 | neutral | 0.6837 | | | Virgin America | | yvonnalynn | | 0 | @VirginAmerica I didn't today... Must mean I n... | | 2015-02-24 11:15:48 -0800 | Lets Play | Central Time (US & Canada) |
| 3 | 570301031407624196 | negative | 1.0 | Bad Flight | 0.7033 | Virgin America | | jnardino | | 0 | @VirginAmerica it's really aggressive to blast... | | 2015-02-24 11:15:36 -0800 | | Pacific Time (US & Canada) |
| 4 | 570300817074462722 | negative | 1.0 | Can't Tell | 1.0 | Virgin America | | jnardino | | 0 | @VirginAmerica and it's a really big bad thing... | | 2015-02-24 11:14:45 -0800 | | Pacific Time (US & Canada) |


In [5]:
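# Keep only the columns needed here; .copy() avoids a SettingWithCopyWarning if tweets is modified later
tweets = df[['tweet_id', 'text', 'airline_sentiment', 'airline_sentiment_confidence']].copy()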

In [6]:
tweets.head()

| | tweet_id | text | airline_sentiment | airline_sentiment_confidence |
|---|---|---|---|---|
| 0 | 570306133677760513 | @VirginAmerica What @dhepburn said. | neutral | 1.0 |
| 1 | 570301130888122368 | @VirginAmerica plus you've added commercials t... | positive | 0.3486 |
| 2 | 570301083672813571 | @VirginAmerica I didn't today... Must mean I n... | neutral | 0.6837 |
| 3 | 570301031407624196 | @VirginAmerica it's really aggressive to blast... | negative | 1.0 |
| 4 | 570300817074462722 | @VirginAmerica and it's a really big bad thing... | negative | 1.0 |


In [7]:
tweets.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14640 entries, 0 to 14639
Data columns (total 4 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   tweet_id                      14640 non-null  int64  
 1   text                          14640 non-null  object 
 2   airline_sentiment             14640 non-null  object 
 3   airline_sentiment_confidence  14640 non-null  float64
dtypes: float64(1), int64(1), object(2)
memory usage: 457.6+ KB


In [8]:
tweets.isnull().sum()

tweet_id                        0
text                            0
airline_sentiment               0
airline_sentiment_confidence    0
dtype: int64

In [9]:
tweets['airline_sentiment'].unique()

array(['neutral', 'positive', 'negative'], dtype=object)

In [10]:
# Count the tweets in each sentiment class
for label in ['neutral', 'positive', 'negative']:
    count = (tweets['airline_sentiment'] == label).sum()
    print(f"Size of {label}s is :: {count}")

Size of neutrals is :: 3099
Size of positives is :: 2363
Size of negatives is :: 9178
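
These counts show that the classes are imbalanced, with negative tweets making up the majority. As a minimal sketch, the snippet below turns the counts printed above into class shares and derives the accuracy that a trivial always-predict-the-majority classifier would reach on this same data. Treating that figure as a reference point for the baselines is an assumption made here, not a result from the experiments.

```python
# Class counts copied from the output above.
class_counts = {"neutral": 3099, "positive": 2363, "negative": 9178}

total = sum(class_counts.values())
shares = {label: count / total for label, count in class_counts.items()}

# A classifier that always predicts the most frequent class is correct
# exactly as often as that class occurs.
majority_label = max(class_counts, key=class_counts.get)
majority_baseline_accuracy = shares[majority_label]

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"Majority-class baseline ({majority_label}): {majority_baseline_accuracy:.1%}")
```

With this much imbalance, plain accuracy can look better than a model deserves, so per-class metrics such as precision, recall, or macro-averaged F1 may be more informative when comparing models.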
