# Sentiment Evaluation of Twitter and YouTube Data
## Tasks

1. Install packages and load evaluation datasets with Google NLP scores
2. Run VADER over evaluation texts
3. Run BERT over evaluation texts
4. Evaluate against sentiment annotations and compare with Google NLP

### Install requirements. 

The following cell contains all the necessary dependencies needed for this task. If you run the cell everything will be installed. 

* [`vaderSentiment`](https://github.com/cjhutto/vaderSentiment) is a Python package for a Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text.
* [`transformers`](https://huggingface.co/) is a Python package for creating and working with transformers. [Here](https://huggingface.co/docs) is the documentation of `transformers`.
* [`torch`](https://pytorch.org/) is a Python machine learning framework. We need this here for `transformers` since this package uses internally `torch`. [Here](https://pytorch.org/docs/stable/index.html) is the documentation of `torch`.
* [`pandas`](https://pandas.pydata.org/docs/index.html) is a Python package for creating and working with tabular data. [Here](https://pandas.pydata.org/docs/reference/index.html) is the documentation of `pandas`.

In [2]:
! pip install vaderSentiment
! pip install transformers sentencepiece
! pip install torch torchvision torchaudio
! pip install pandas



You may need to restart the Kernel after installing the dependencies!

### Import requirements
The cell below imports all necessary dependancies. Make sure they are installed (see cell above).

In [4]:
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from transformers import pipeline

# 1. Load evaluation datasets and Google NLP scores

## 1.1 Load datasets
First read the Twitter and Youtube Comments CSV files (`Twitter-Sentiment.csv` and `YouTubeComments-Sentiment.csv`) and save them in a pandas Dataframe.

In [5]:
# Read Twitter data
twitter_data = pd.read_csv("Twitter-Sentiment.csv")
# print(twitter_data)

# Read Youtube data
youtube_data = pd.read_csv("YouTubeComments-Sentiment.csv")
# print(youtube_data)

# 2. Run VADER over evaluation texts

## 2.1 Run VADER over the first tweet

In this task you should use VADER for sentiment analysis. For this we use the `vaderSentiment` package. You first have to intatiate a new `SentimentIntensityAnalyzer` and use the `polarity_scores` method of it for the analysis. Apply this for the first tweet. Is it a good classification?

[Here](https://github.com/cjhutto/vaderSentiment) under 'Code Examples' you can find some example code how to use this package.

In [6]:
#  Intatiate a new SentimentIntensityAnalyzer
vader = SentimentIntensityAnalyzer()

# Cassify first tweet and print
first_tweet = twitter_data["text"][0]
first_tweet_classification = vader.polarity_scores(first_tweet)

print(f"First Tweet: {first_tweet}\n")
print(f"Classification first Tweet: {first_tweet_classification}\n")

First Tweet: ?RT @justinbiebcr: The bigger the better....if you know what I mean ;)

Classification first Tweet: {'neg': 0.0, 'neu': 0.853, 'pos': 0.147, 'compound': 0.2263}



The analyzed tweet is predominantly neutral (neu: 0.853) but leans slightly positive overall (compound: 0.2263 and pos: 0.147).
There’s no detectable negativity (neg: 0.0), so the tone of the tweet is likely neutral to mildly positive.

The classification is reasonable but not perfect. VADER captures the neutral structure and mild positivity but misses the playful, suggestive tone implied by the wink emoji and double entendre. It also overlooks the broader context and cultural nuances, such as the implied humor in "if you know what I mean." While suitable for general analysis, it lacks the sophistication to interpret subtle humor, innuendo, or contextual cues in tweets like this.

## 2.2 Run VADER over each text

Now use VADER for all the text data of the Twitter and the Youtube dataframe. Create a new column in the dataframes called `VADER_compound` where you save the `compound` result (look at the output dictonary of the `polarity_scores` method).

*Important: Make sure `compound` is a float*

If this runs slow on your computer you can use the precomputed values in the provided CSV files which are present in the column `VADER_compund_precomputed` for further tasks.

In [7]:
# Using VADER for sentiment analysis of twitter data
vader = SentimentIntensityAnalyzer()
twitter_data["VADER_compound"] = 0.0

#for i in range(10):
for i in range(len(twitter_data["text"])):
    # use polarity_scores method to get the sentiment scores
    sentiment_dict = vader.polarity_scores(twitter_data["text"][i])
    # Save the compound result as float in the dataset. 
    # Notice: .loc is way slower here.... but worked for us ;)
    twitter_data.loc[i, "VADER_compound"] = sentiment_dict["compound"]

In [8]:
# Using VADER for sentiment analysis of YouTube data
vader = SentimentIntensityAnalyzer()
youtube_data["VADER_compound"] = 0.0

#for i in range(10):
for i in range(len(youtube_data["text"])):
    # use polarity_scores method to get the sentiment scores
    sentiment_dict = vader.polarity_scores(youtube_data["text"][i])
    # Save the compound result as float in the dataset. 
    # Notice: .loc is way slower here.... but worked for us ;)
    youtube_data.loc[i, "VADER_compound"] = sentiment_dict["compound"]

## 2.3 VADER as a classifier

To get the three Classes `Positive`, `Negative` and `Neutral` we use the compound score with the following thresholds:

* `compound > 0.5`: `"Positive"`
* `compound < -0.5`: `"Negative"`
* `else`: `"Neutral"`

Create a new column called `VADER_class` which contains the three computed classes.

In [9]:
# Create new column for computed classes
twitter_data["VADER_class"] = "Neutral"
youtube_data["VADER_class"] = "Neutral"

# Classify Twitter Data
twitter_data.loc[twitter_data["VADER_compound"] > 0.5, "VADER_class"] = "Positive"
twitter_data.loc[twitter_data["VADER_compound"] < 0.5, "VADER_class"] = "Negative"

# Classify YouTube Data
youtube_data.loc[youtube_data["VADER_compound"] > 0.5, "VADER_class"] = "Positive"
youtube_data.loc[youtube_data["VADER_compound"] < 0.5, "VADER_class"] = "Negative"

# 3. Use a BERT based model for sentiment analysis

## 3.1 BERT
BERT (Bidirectional Encoder Representation from Transformers) is a machine learning technique for natural language processing. There are already pretrained models available in the `transformers` package. You can look [here](https://huggingface.co/models?sort=downloads&search=sentiment) and choose a model for the next tasks. (We suggest [this](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest) (`"cardiffnlp/twitter-roberta-base-sentiment-latest"`) model, but you can use any available, just make sure it is suitable for sentiment analysis).

First create a `pipeline` where you set your model by the `model` keyword argument. You can then use this method to pass text which should be classified. [Here](https://huggingface.co/blog/sentiment-analysis-python#2-how-to-use-pre-trained-sentiment-analysis-models-with-python) is a tutorial how to use this.

As before save the classes in a new row 'BERT_class'. The call to your pipeline returns a dictionary where there is a key `label` which contains already the `Positive`, `Negative` or `Neutral` class (Be aware that this is based on the model you choose, sometimes these classes are named differently so you have to rename them by hand, this is not the case if you use the suggested model).

Based on you computer this may take some time, if it is too slow for you, you can again use the precomputed classes `'BERT_class_precomputed'` in the CSV Files for further tasks.

In [10]:
# Using BERT-Base-Uncased model for sentiment analysis
#sentiment_pipeline = pipeline(model=f"cardiffnlp/twitter-roberta-base-sentiment-latest")

# Create new column for computed BERT classes
#twitter_data["BERT_class"] = "Neutral"
#youtube_data["BERT_class"] = "Neutral"

# twitter_data

# column_to_classify = "BERT_class_precomputed"
# column_to_classify = "BERT_class"


# 4. Evaluate against sentiment annotations and compare with Google NLP

## 4.1 Convert GoogleNLP scores to classes

As with VADER and BERT, compute classes from the GoogleNLP score, which is given in the column `googleScore`. For this use following thresholds:

* `googleScore > 0.3`: `"Positive"`
* `googleScore < -0.3`: `"Negativ"`
* `else`: `"Neutral"`

Save the classes in a new column named `GoogleNLP_class`.


In [11]:
# Create new column for Google NLP classes
twitter_data["GoogleNLP_class"] = "Neutral"
youtube_data["GoogleNLP_class"] = "Neutral"

# Classify Twitter Data
twitter_data.loc[twitter_data["googleScore"] > 0.3, "GoogleNLP_class"] = "Positive"
twitter_data.loc[twitter_data["googleScore"] < -0.3, "GoogleNLP_class"] = "Negative"

# Classify YouTube Data
youtube_data.loc[youtube_data["googleScore"] > 0.3, "GoogleNLP_class"] = "Positive"
youtube_data.loc[youtube_data["googleScore"] < -0.3, "GoogleNLP_class"] = "Negative"

youtube_data
twitter_data

Unnamed: 0,label,text,googleScore,VADER_compound_precomputed,BERT_class_precomputed,VADER_compound,VADER_class,GoogleNLP_class
0,Positive,?RT @justinbiebcr: The bigger the better....if...,0.3,0.2263,Positive,0.2263,Negative,Neutral
1,Positive,"Listening to the ""New Age"" station on @Slacker...",0.2,0.0000,Neutral,0.0000,Negative,Neutral
2,Neutral,I favorited a YouTube video -- Drake and Josh ...,0.0,0.4019,Positive,0.4019,Negative,Neutral
3,Positive,i didnt mean knee high I ment in lengt it goes...,0.8,0.8632,Positive,0.8632,Positive,Positive
4,Neutral,I wana see the vid Kyan,0.0,0.0000,Neutral,0.0000,Negative,Neutral
...,...,...,...,...,...,...,...,...
4204,Neutral,"So far, i'm seeing the opposite of what you're...",0.4,0.0000,Negative,0.0000,Negative,Positive
4205,Neutral,RT @Nescreation I'm Yours w/ hearts Ladies Cam...,0.3,0.6486,Neutral,0.6486,Positive,Neutral
4206,Positive,"RT @JoseCarol: If you fall, GET UP!, if you're...",0.3,0.6531,Positive,0.6531,Positive,Neutral
4207,Neutral,@MakikiGirl I'm giving my 2 Japanese Chins a b...,-0.1,-0.2023,Negative,-0.2023,Negative,Neutral


## 4.2 Evaluate on Twitter

First, let's calculate the accuracy for all three classifiers on the Twitter and Youtube data, print the results.

### Accuracy Formula
$$
\text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Samples}}
$$

In [24]:
# Get number of Samples
number_twitter_samples = len(twitter_data.index)
number_youtube_samples = len(youtube_data.index)
print(f"Total Number of Twitter Samples: {number_twitter_samples}")
print(f"Total Number of YouTube Samples: {number_youtube_samples}\n")

Total Number of Twitter Samples: 4209
Total Number of YouTube Samples: 3293



In [25]:
# Choose colum to evaluate of BERT calssifications
column_to_validate = "BERT_class_precomputed"
# column_to_validate = "BERT_class"

In [26]:
# Get number of Correct Predictions for Twitter Samples
correct_predictiions_VADER_on_twitter_data = twitter_data[twitter_data["VADER_class"]==twitter_data["label"]].shape[0]
print(f"Number of Correct Predictions for Twitter by VADER: {correct_predictiions_VADER_on_twitter_data}")
correct_predictiions_BERT_on_twitter_data = twitter_data[twitter_data[column_to_validate]==twitter_data["label"]].shape[0]
print(f"Number of Correct Predictions for Twitter by BERT: {correct_predictiions_BERT_on_twitter_data}")
correct_predictiions_GoogleNLP_on_twitter_data = twitter_data[twitter_data["GoogleNLP_class"]==twitter_data["label"]].shape[0]
print(f"Number of Correct Predictions for Twitter by Google NLP: {correct_predictiions_GoogleNLP_on_twitter_data}\n")

Number of Correct Predictions for Twitter by VADER: 758
Number of Correct Predictions for Twitter by BERT: 2672
Number of Correct Predictions for Twitter by Google NLP: 2825



In [27]:
# Get number of Correct Predictions for YouTube Samples
correct_predictiions_VADER_on_youtube_data = youtube_data[youtube_data["VADER_class"]==youtube_data["label"]].shape[0]
print(f"Number of Correct Predictions for YouTube by VADER: {correct_predictiions_VADER_on_youtube_data}")
correct_predictiions_BERT_on_youtube_data = youtube_data[youtube_data[column_to_validate]==youtube_data["label"]].shape[0]
print(f"Number of Correct Predictions for YouTube by BERT: {correct_predictiions_BERT_on_youtube_data}")
correct_predictiions_GoogleNLP_on_youtube_data = youtube_data[youtube_data["GoogleNLP_class"]==youtube_data["label"]].shape[0]
print(f"Number of Correct Predictions for YouTube by Google NLP: {correct_predictiions_GoogleNLP_on_youtube_data}\n")

Number of Correct Predictions for YouTube by VADER: 1374
Number of Correct Predictions for YouTube by BERT: 2448
Number of Correct Predictions for YouTube by Google NLP: 2172



In [28]:
# Calculate Accuracy of Pedictions on Twitter Samples
accuracy_VADER_on_twitter = correct_predictiions_VADER_on_twitter_data / number_twitter_samples
accuracy_BERT_on_twitter = correct_predictiions_BERT_on_twitter_data / number_twitter_samples
accuracy_GoogleNLP_on_twitter = correct_predictiions_GoogleNLP_on_twitter_data / number_twitter_samples

print(f"Accuracy of VADER on Twitter Samples: {accuracy_VADER_on_twitter} meaning {accuracy_VADER_on_twitter*100:.2f}%")
print(f"Accuracy of BERT on Twitter Samples: {accuracy_BERT_on_twitter} meaning {accuracy_BERT_on_twitter*100:.2f}%")
print(f"Accuracy of Google NLP on Twitter Samples: {accuracy_GoogleNLP_on_twitter} meaning {accuracy_GoogleNLP_on_twitter*100:.2f}%\n")

Accuracy of VADER on Twitter Samples: 0.18009028272748873 meaning 18.01%
Accuracy of BERT on Twitter Samples: 0.6348301259206462 meaning 63.48%
Accuracy of Google NLP on Twitter Samples: 0.6711808030411024 meaning 67.12%



In [30]:
# Calculate Accuracy of Pedictions on YouTube Samples
accuracy_VADER_on_youtube = correct_predictiions_VADER_on_youtube_data / number_youtube_samples
accuracy_BERT_on_youtube = correct_predictiions_BERT_on_youtube_data / number_youtube_samples
accuracy_GoogleNLP_on_youtube = correct_predictiions_GoogleNLP_on_youtube_data / number_youtube_samples

print(f"Accuracy of VADER on YouTube Samples: {accuracy_VADER_on_youtube} meaning {accuracy_VADER_on_youtube*100:.2f}%")
print(f"Accuracy of BERT on YouTube Samples: {accuracy_BERT_on_youtube} meaning {accuracy_BERT_on_youtube*100:.2f}%")
print(f"Accuracy of Google NLP on YouTube Samples: {accuracy_GoogleNLP_on_youtube} meaning {accuracy_GoogleNLP_on_youtube*100:.2f}%\n")

Accuracy of VADER on YouTube Samples: 0.41724870938354086 meaning 41.72%
Accuracy of BERT on YouTube Samples: 0.7433950804737322 meaning 74.34%
Accuracy of Google NLP on YouTube Samples: 0.6595809292438506 meaning 65.96%



Next calculate the precision of the `"Positive"` class for the Twitter and Youtube data.
This is calculated as follows:
$
\begin{align}
    precision = \frac{TP}{TP + FP}
\end{align}
$
*Note: Here the Positive samples are the one with the the class `"Positive"`*

### True Positives (TP) for the "Positive" Class

To calculate the **True Positives (TP)** for the `"Positive"` class, we need to identify the cases where:
- The **actual class** is `"Positive"`, and
- The **predicted class** is also `"Positive"` (lable fits classification).

In [40]:
# Calculate True Positive
TP_VADER_on_twitter_data = twitter_data[(twitter_data["VADER_class"] == twitter_data["label"]) & (twitter_data["VADER_class"] == "Positive")].shape[0]
TP_BERT_on_twitter_data = twitter_data[(twitter_data[column_to_validate] == twitter_data["label"]) & (twitter_data[column_to_validate] == "Positive")].shape[0]
TP_GoogleNLP_on_twitter_data = twitter_data[(twitter_data["GoogleNLP_class"] == twitter_data["label"]) & (twitter_data["GoogleNLP_class"] == "Positive")].shape[0]

print(f"True Positive VADER on Twitter Samples: {TP_VADER_on_twitter_data}")
print(f"True Positive BERT on Twitter Samples: {TP_BERT_on_twitter_data}")
print(f"True Positive Google NLP on Twitter Samples: {TP_GoogleNLP_on_twitter_data}\n")

TP_VADER_on_youtube_data = youtube_data[(youtube_data["VADER_class"] == youtube_data["label"]) & (youtube_data["VADER_class"] == "Positive")].shape[0]
TP_BERT_on_youtube_data = youtube_data[(youtube_data[column_to_validate] == youtube_data["label"]) & (youtube_data[column_to_validate] == "Positive")].shape[0]
TP_GoogleNLP_on_youtube_data = youtube_data[(youtube_data["GoogleNLP_class"] == youtube_data["label"]) & (youtube_data["GoogleNLP_class"] == "Positive")].shape[0]

print(f"True Positive VADER on YouTube Samples: {TP_VADER_on_youtube_data}")
print(f"True Positive BERT on YouTube Samples: {TP_BERT_on_youtube_data}")
print(f"True Positive Google NLP on YouTube Samples: {TP_GoogleNLP_on_youtube_data}\n")


True Positive VADER on Twitter Samples: 427
True Positive BERT on Twitter Samples: 537
True Positive Google NLP on Twitter Samples: 328

True Positive VADER on YouTube Samples: 912
True Positive BERT on YouTube Samples: 1202
True Positive Google NLP on YouTube Samples: 914



### False Positives (FP) for the "Positive" Class

To calculate the **False Positives (FP)** for the `"Positive"` class, we need to identify the cases where:
- The **predicted class** is `"Positive"` but
- The **actual class** is **not** `"Positive"` (lable does not fit classification).

In [41]:
# Calculate False Positive
FP_VADER_on_twitter_data = twitter_data[(twitter_data["VADER_class"] != twitter_data["label"]) & (twitter_data["VADER_class"] == "Positive")].shape[0]
FP_BERT_on_twitter_data = twitter_data[(twitter_data[column_to_validate] != twitter_data["label"]) & (twitter_data[column_to_validate] == "Positive")].shape[0]
FP_GoogleNLP_on_twitter_data = twitter_data[(twitter_data["GoogleNLP_class"] != twitter_data["label"]) & (twitter_data["GoogleNLP_class"] == "Positive")].shape[0]

print(f"False Positive VADER on Twitter Samples: {FP_VADER_on_twitter_data}")
print(f"False Positive BERT on Twitter Samples: {FP_BERT_on_twitter_data}")
print(f"False Positive Google NLP on Twitter Samples: {FP_GoogleNLP_on_twitter_data}\n")

FP_VADER_on_youtube_data = youtube_data[(youtube_data["VADER_class"] != youtube_data["label"]) & (youtube_data["VADER_class"] == "Positive")].shape[0]
FP_BERT_on_youtube_data = youtube_data[(youtube_data[column_to_validate] != youtube_data["label"]) & (youtube_data[column_to_validate] == "Positive")].shape[0]
FP_GoogleNLP_on_youtube_data = youtube_data[(youtube_data["GoogleNLP_class"] != youtube_data["label"]) & (youtube_data["GoogleNLP_class"] == "Positive")].shape[0]

print(f"False Positive VADER on YouTube Samples: {FP_VADER_on_youtube_data}")
print(f"False Positive BERT on YouTube Samples: {FP_BERT_on_youtube_data}")
print(f"False Positive Google NLP on YouTube Samples: {FP_GoogleNLP_on_youtube_data}\n")


False Positive VADER on Twitter Samples: 774
False Positive BERT on Twitter Samples: 964
False Positive Google NLP on Twitter Samples: 651

False Positive VADER on YouTube Samples: 354
False Positive BERT on YouTube Samples: 372
False Positive Google NLP on YouTube Samples: 270



In [44]:
# Check the total number of positive predictions – just for verification
print(f"Positive VADER on Twitter Samples: {twitter_data[twitter_data["VADER_class"] == "Positive"].shape[0]}")
print(f"Positive BERT on Twitter Samples: {twitter_data[twitter_data["BERT_class_precomputed"] == "Positive"].shape[0]}")
print(f"Positive Google NLP on Twitter Samples: {twitter_data[twitter_data["GoogleNLP_class"] == "Positive"].shape[0]}\n")

print(f"Positive VADER on YouTube Samples: {youtube_data[youtube_data["VADER_class"] == "Positive"].shape[0]}")
print(f"Positive BERT on YouTube Samples: {youtube_data[youtube_data["BERT_class_precomputed"] == "Positive"].shape[0]}")
print(f"Positive Google NLP on YouTube Samples: {youtube_data[youtube_data["GoogleNLP_class"] == "Positive"].shape[0]}\n")

Positive VADER on Twitter Samples: 1201
Positive BERT on Twitter Samples: 1501
Positive Google NLP on Twitter Samples: 979

Positive VADER on YouTube Samples: 1266
Positive BERT on YouTube Samples: 1574
Positive Google NLP on YouTube Samples: 1184



Calculating
$
\begin{align}
    precision = \frac{TP}{TP + FP}
\end{align}
$

In [52]:
# Calculate Precision on Twitter Data
precision_VADER_on_twitter_data = TP_VADER_on_twitter_data / (TP_VADER_on_twitter_data + FP_VADER_on_twitter_data)
precision_BERT_on_twitter_data = TP_BERT_on_twitter_data / (TP_BERT_on_twitter_data + FP_BERT_on_twitter_data)
precision_GoogleNLP_on_twitter_data = TP_GoogleNLP_on_twitter_data / (TP_GoogleNLP_on_twitter_data + FP_GoogleNLP_on_twitter_data)

print(f"Precision VADER on Twitter Samples: {precision_VADER_on_twitter_data} meaning {precision_VADER_on_twitter_data*100:.2f}%")
print(f"Precision BERT on Twitter Samples: {precision_BERT_on_twitter_data} meaning {precision_BERT_on_twitter_data*100:.2f}%")
print(f"Precision Google NLP on Twitter Samples: {precision_GoogleNLP_on_twitter_data} meaning {precision_GoogleNLP_on_twitter_data*100:.2f}%\n") 	 

# Calculate Precision on YouTube Data
precision_VADER_on_youtube_data = TP_VADER_on_youtube_data / (TP_VADER_on_youtube_data + FP_VADER_on_youtube_data)
precision_BERT_on_youtube_data = TP_BERT_on_youtube_data / (TP_BERT_on_youtube_data + FP_BERT_on_youtube_data)
precision_GoogleNLP_on_youtube_data = TP_GoogleNLP_on_youtube_data / (TP_GoogleNLP_on_youtube_data + FP_GoogleNLP_on_youtube_data)

print(f"Precision VADER on YouTube Samples: {precision_VADER_on_youtube_data} meaning {precision_VADER_on_youtube_data*100:.2f}%")
print(f"Precision BERT on YouTube Samples: {precision_BERT_on_youtube_data} meaning {precision_BERT_on_youtube_data*100:.2f}%")
print(f"Precision Google NLP on YouTube Samples: {precision_GoogleNLP_on_youtube_data} meaning {precision_GoogleNLP_on_youtube_data*100:.2f}%\n") 	 

Precision VADER on Twitter Samples: 0.3555370524562864 meaning 35.55%
Precision BERT on Twitter Samples: 0.357761492338441 meaning 35.78%
Precision Google NLP on Twitter Samples: 0.3350357507660878 meaning 33.50%

Precision VADER on YouTube Samples: 0.7203791469194313 meaning 72.04%
Precision BERT on YouTube Samples: 0.7636594663278272 meaning 76.37%
Precision Google NLP on YouTube Samples: 0.7719594594594594 meaning 77.20%



Now calculate the recall score. This is done by:
$
\begin{align}
    recall = \frac{TP}{TP + FN}
\end{align}
$
*Note: Here the Positive samples are the one with the the class `"Positive"`*

### False Negatives (FN) for the "Positive" Class

To calculate the **False Negatives (FN)** for the `"Positive"` class, we need to identify the cases where:
- The **actual class** is `"Positive"`, but
- The **predicted class** is **not** `"Positive"` (i.e., it is `"Negative"` or `"Neutral"`).


In [54]:
# Calculate False Negative for class "Positive"
FN_VADER_on_twitter_data = twitter_data[(twitter_data["label"] == "Positive" ) & (twitter_data["VADER_class"] != "Positive")].shape[0]
FN_BERT_on_twitter_data = twitter_data[(twitter_data["label"] == "Positive") & (twitter_data[column_to_validate] != "Positive")].shape[0]
FN_GoogleNLP_on_twitter_data = twitter_data[(twitter_data["label"] == "Positive") & (twitter_data["GoogleNLP_class"] != "Positive")].shape[0]

print(f"False Negative (class \"Positive\") VADER on Twitter Samples: {FN_VADER_on_twitter_data}")
print(f"False Negative (class \"Positive\") BERT on Twitter Samples: {FN_BERT_on_twitter_data}")
print(f"False Negative (class \"Positive\") Google NLP on Twitter Samples: {FN_GoogleNLP_on_twitter_data}\n")

FN_VADER_on_youtube_data = youtube_data[(youtube_data["label"] == "Positive") & (youtube_data["VADER_class"] != "Positive")].shape[0]
FN_BERT_on_youtube_data = youtube_data[(youtube_data["label"] == "Positive") & (youtube_data[column_to_validate] != "Positive")].shape[0]
FN_GoogleNLP_on_youtube_data = youtube_data[(youtube_data["label"] == "Positive") & (youtube_data["GoogleNLP_class"] != "Positive")].shape[0]

print(f"False Negative (class \"Positive\") VADER on YouTube Samples: {TP_VADER_on_youtube_data}")
print(f"False Negative (class \"Positive\") BERT on YouTube Samples: {TP_BERT_on_youtube_data}")
print(f"False Negative (class \"Positive\") Google NLP on YouTube Samples: {TP_GoogleNLP_on_youtube_data}\n")


False Negative (class "Positive") VADER on Twitter Samples: 160
False Negative (class "Positive") BERT on Twitter Samples: 50
False Negative (class "Positive") Google NLP on Twitter Samples: 259

False Negative (class "Positive") VADER on YouTube Samples: 1565
False Negative (class "Positive") BERT on YouTube Samples: 473
False Negative (class "Positive") Google NLP on YouTube Samples: 851



Calculate 
$
\begin{align}
    recall = \frac{TP}{TP + FN}
\end{align}
$

In [57]:
# Calculate recall for class "Positive"
recall_VADER_on_twitter_data = TP_VADER_on_twitter_data / (TP_VADER_on_twitter_data + FN_VADER_on_twitter_data)
recall_BERT_on_twitter_data = TP_BERT_on_twitter_data / (TP_BERT_on_twitter_data + FN_BERT_on_twitter_data)
recall_GoogleNLP_on_twitter_data = TP_GoogleNLP_on_twitter_data / (TP_GoogleNLP_on_twitter_data + FN_GoogleNLP_on_twitter_data)

print(f"Recall (class \"Positive\") VADER on Twitter Samples: {recall_VADER_on_twitter_data} meaning {recall_VADER_on_twitter_data*100:.2f}%")
print(f"Recall (class \"Positive\") BERT on Twitter Samples: {recall_BERT_on_twitter_data} meaning {recall_BERT_on_twitter_data*100:.2f}%")
print(f"Recall (class \"Positive\") Google NLP on Twitter Samples: {recall_GoogleNLP_on_twitter_data} meaning {recall_GoogleNLP_on_twitter_data*100:.2f}%\n") 

recall_VADER_on_youtube_data = TP_VADER_on_youtube_data / (TP_VADER_on_youtube_data + FN_VADER_on_youtube_data)
recall_BERT_on_youtube_data = TP_BERT_on_youtube_data / (TP_BERT_on_youtube_data + FN_BERT_on_youtube_data)
recall_GoogleNLP_on_youtube_data = TP_GoogleNLP_on_youtube_data / (TP_GoogleNLP_on_youtube_data + FN_GoogleNLP_on_youtube_data)

print(f"Recall (class \"Positive\") VADER on YouTube Samples: {recall_VADER_on_youtube_data} meaning {recall_VADER_on_youtube_data*100:.2f}%")
print(f"Recall (class \"Positive\") BERT on YouTube Samples: {recall_BERT_on_youtube_data} meaning {recall_BERT_on_youtube_data*100:.2f}%")
print(f"Recall (class \"Positive\") Google NLP on YouTube Samples: {recall_GoogleNLP_on_youtube_data} meaning {recall_GoogleNLP_on_youtube_data*100:.2f}%\n") 

Recall (class "Positive") VADER on Twitter Samples: 0.727427597955707 meaning 72.74%
Recall (class "Positive") BERT on Twitter Samples: 0.9148211243611585 meaning 91.48%
Recall (class "Positive") Google NLP on Twitter Samples: 0.5587734241908007 meaning 55.88%

Recall (class "Positive") VADER on YouTube Samples: 0.7960325534079349 meaning 79.60%
Recall (class "Positive") BERT on YouTube Samples: 0.809931506849315 meaning 80.99%
Recall (class "Positive") Google NLP on YouTube Samples: 0.6808 meaning 68.08%



Calculate the Recall and the Precision score now also for the negative class. The Precision is calculated as:
$
\begin{align}
    precision = \frac{TP}{TP + FP}
\end{align}
$
*Note: Here the Positive samples are the one with the the class `"Negative"`*

And the Recall is calculated as:
$
\begin{align}
    recall = \frac{TP}{TP + FN}
\end{align}
$
*Note: Here the Positive samples are the one with the the class `"Negative"`*

In [None]:
# Your Code goes here!


# To learn more
1. What was the best performing method for Youtube? Did that fit your expectations?
2. What was the best performing method for Twitter? Did that fit your expectations?
4. Do you observe any differences between prediction of positive and negative sentiment? What is the role of the imbalance between postive and negative classes in the calculation of accuracy?
