<a href="https://colab.research.google.com/github/julrods/aggressive-tweet-analyzer/blob/main/3_Evaluation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Environment" data-toc-modified-id="Environment-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Environment</a></span><ul class="toc-item"><li><span><a href="#Libraries" data-toc-modified-id="Libraries-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Libraries</a></span></li><li><span><a href="#Functions" data-toc-modified-id="Functions-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Functions</a></span></li></ul></li><li><span><a href="#Evaluation" data-toc-modified-id="Evaluation-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Evaluation</a></span><ul class="toc-item"><li><span><a href="#BERT-setup" data-toc-modified-id="BERT-setup-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>BERT setup</a></span></li><li><span><a href="#Making-predictions-for-the-evaluation-data" data-toc-modified-id="Making-predictions-for-the-evaluation-data-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>Making predictions for the evaluation data</a></span></li><li><span><a href="#Assessing-the-results" data-toc-modified-id="Assessing-the-results-2.3"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>Assessing the results</a></span></li></ul></li><li><span><a href="#Conclusion" data-toc-modified-id="Conclusion-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Conclusion</a></span></li></ul></div>

# Evaluation

In this notebook I evaluate the BERT binary classifier that I fine-tuned on the Aggression dataset. The evaluation uses 17k+ comments that I scraped myself from Kevin Spacey's last 3 Instagram posts.

Results: 
- 83% of the comments were labeled as "not aggressive" and 17% as "aggressive".
- The model achieved a precision of 85.64% for the "aggressive" class (class 1).

## Environment

### Libraries

In [None]:
!pip install transformers

In [None]:
# Base libraries
import os
import tensorflow as tf
import pandas as pd
import numpy as np
import pickle

# ML and DL libraries
from sklearn.metrics import confusion_matrix, f1_score, classification_report
from transformers import TFBertModel, TFBertForSequenceClassification

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


### Functions

In [None]:
def load_vectors(dataset_name):
  """
  Input: name of the dataset
  Output: dataset tokenized into 2 vectors (input_ids and attention_masks) ready to be fed into the model
  """
  
  # Create the paths
  pickle_inp_path = f'/content/gdrive/MyDrive/Cyber-bullying-project/data/3_tokenized_data/bert_inp_{dataset_name}.pkl'
  pickle_mask_path = f'/content/gdrive/MyDrive/Cyber-bullying-project/data/3_tokenized_data/bert_mask_{dataset_name}.pkl'

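  # Load the files, closing each file handle once it has been read
  with open(pickle_inp_path, 'rb') as inp_file:
    input_ids = pickle.load(inp_file)
  with open(pickle_mask_path, 'rb') as mask_file:
    attention_masks = pickle.load(mask_file)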

  return input_ids, attention_masks

In [None]:
def bert_setup():
  """ 
  Loads BERT for sequence classification; sets the loss, metric and optimizer; compiles the model
  """
  
  base_model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
  
  loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
  metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
  optimizer = tf.keras.optimizers.Adam(learning_rate=2e-5,
                                       epsilon=1e-08)
  
  base_model.compile(loss = loss, optimizer = optimizer, metrics = [metric])
  
  return base_model

In [None]:
def evaluate_model(model_name, inputs, mask, base_model):
  """ 
  Input: name of the model we want to load; 
         evaluation data split into 2 vectors (inputs and mask); 
         base model to load the weights into
  Output: predicted labels
  """
 
  # Load the model weights
  model_save_path = f'/content/gdrive/MyDrive/Cyber-bullying-project/models/{model_name}.h5'
  base_model.load_weights(model_save_path)
  trained_model = base_model
  
  # Make predictions
  preds = trained_model.predict([inputs, mask],
                                batch_size=32)
  
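  # Find the predicted labels: preds[0] holds the logits (one row per comment),
  # and the predicted class is the index of the largest logit
  pred_labels = [np.argmax(pred) for pred in preds[0]]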

  return pred_labels
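
The last step of `evaluate_model` picks, for each comment, the class with the largest logit. As a minimal sketch of that step alone, the snippet below applies it to made-up logits (the values are illustrative and are not model output) using a single vectorized call:

```python
import numpy as np

# Hypothetical logits for three comments: one row per comment, one column per class
logits = np.array([[2.1, -1.3],
                   [-0.4, 0.9],
                   [0.2, 0.1]])

# Vectorized equivalent of [np.argmax(pred) for pred in logits]
pred_labels = np.argmax(logits, axis=1)
print(pred_labels.tolist())
```

Because only the position of the largest logit matters, no softmax is needed to obtain the labels.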

## Evaluation

### BERT setup

In [None]:
# BERT setup
base_model = bert_setup()

All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


### Making predictions for the evaluation data

In [None]:
# Load the vectorized evaluation data
input_ids, attention_masks = load_vectors('eval_data')

In [None]:
# Make predictions
pred_labels = evaluate_model('aggression_model_1epoch', input_ids, attention_masks, base_model)

In [None]:
# Load the clean evaluation dataframe and store the labels in a column
eval_data_path = '/content/gdrive/MyDrive/Cyber-bullying-project/data/4_evaluation_data/clean_evaluation_data.csv'
eval_data = pd.read_csv(eval_data_path)
eval_data['label'] = pred_labels

In [None]:
# Keep the original comment and the label only and save in a new file
labeled_eval_data = eval_data[['text', 'label']]
labeled_eval_data_path = '/content/gdrive/MyDrive/Cyber-bullying-project/data/4_evaluation_data/labeled_evaluation_data.csv'
labeled_eval_data.to_csv(labeled_eval_data_path, index = False)

### Assessing the results

In [None]:
labeled_eval_data.sample(10)

Unnamed: 0,text,label
5254,Please com bake kevin,0
10151,Thanks Kevin. You’ll be back.,0
12334,Merry christmas mr president 🌹💙,0
2932,Sex offender,1
13906,Super weird,0
842,Pedo,1
14643,Miss you frank Underwood @kevinspacey 😭😭😭,0
7855,We miss you!!!,0
10847,You are the best,0
5135,Need you back in HOC man,0


In the sample above, the model labeled the positive comments as not aggressive and the two aggressive comments ("Sex offender" and "Pedo") as aggressive. The "Super weird" comment is not positive, but it is not aggressive either, so the not-aggressive label is reasonable.

In [None]:
# Print the number and share of comments in each class (0 = not aggressive, 1 = aggressive)
number_total_comments = len(labeled_eval_data)
number_class0_comments = len(labeled_eval_data[labeled_eval_data['label'] == 0])
number_class1_comments = len(labeled_eval_data[labeled_eval_data['label'] == 1])
print(f'Out of {number_total_comments} comments, {number_class0_comments} ({round(number_class0_comments/number_total_comments*100)}%) were labeled as "not aggressive" and {number_class1_comments} ({round(number_class1_comments/number_total_comments*100)}%) as "aggressive".')

Out of 17749 comments, 14790 (83%) were labeled as "not aggressive" and 2959 (17%) as "aggressive".
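
The same per-class shares can also be read off with `value_counts`. This is only a sketch on a toy label column; `toy_labels` and `label_names` are hypothetical names introduced for illustration and are not part of the notebook's data:

```python
import pandas as pd

# Toy stand-in for labeled_eval_data['label'] (hypothetical values)
toy_labels = pd.Series([0, 0, 0, 1, 0, 1], name='label')

# Fraction of comments per class, ordered by class id
shares = toy_labels.value_counts(normalize=True).sort_index()

# Class ids as used in this notebook: 0 = not aggressive, 1 = aggressive
label_names = {0: 'not aggressive', 1: 'aggressive'}
for label, share in shares.items():
    print(f'{label_names[label]}: {share:.0%}')
```

Mapping the class ids to names in one place makes it harder to swap the two labels in the printed message.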


The next step was to go through the comments that were labeled as "aggressive" to check whether they were correctly classified. I did it manually in Google Sheets and created a new column named wrong_label with values 0 or 1 (0 = correct label, the comment is aggressive / 1 = wrong label, the comment is not aggressive). Due to time constraints, I did not do it for the comments that were labeled by the model as "not aggressive".

In [None]:
# Loading the dataset that contains the comments labeled as aggressive by the model, checked by me
labeled_eval_data_checked_path = '/content/gdrive/MyDrive/Cyber-bullying-project/data/4_evaluation_data/labeled_evaluation_data_checked.csv'
checked = pd.read_csv(labeled_eval_data_checked_path) 
checked.head()

Unnamed: 0,text,label,wrong_label
0,Rapist,1,
1,Racist pedo,1,
2,I so fucking love you Kevin! Happy to see you'...,1,1.0
3,fuck you asshole,1,
4,You are My perfec daddy ❤️😍😍,1,1.0


In [None]:
# I only wrote 1 for the mislabeled instances, so we have to fill the null values with 0 for the ones that are correctly labeled
checked['wrong_label'] = checked['wrong_label'].fillna(0).astype(int)
checked.head()

Unnamed: 0,text,label,wrong_label
0,Rapist,1,0
1,Racist pedo,1,0
2,I so fucking love you Kevin! Happy to see you'...,1,1
3,fuck you asshole,1,0
4,You are My perfec daddy ❤️😍😍,1,1


In [None]:
# Save the file with the filled values
checked.to_csv(labeled_eval_data_checked_path, index = False)

In [None]:
# Evaluating the precision for class 1 (classified as "aggressive" by the model)
# wrong_label marks the false positives, so precision = 1 - (false positives / all predicted positives)
precision = 1 - checked['wrong_label'].sum() / len(checked)
precision_percent = precision * 100
print(f'Out of all the instances labelled as class 1 (aggressive comments), {precision_percent:.2f}% were correct. The precision of the model for class 1 is {precision:.2f}')

Out of all the instances labelled as class 1 (aggressive comments), 85.64% were correct. The precision of the model for class 1 is 0.86
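
The calculation above is the usual precision formula, TP / (TP + FP), where every row in `checked` is a predicted positive and `wrong_label == 1` marks a false positive. A minimal sketch on a toy dataframe (the rows are hypothetical, not the real checked data) makes the counts explicit:

```python
import pandas as pd

# Toy stand-in for the `checked` dataframe (hypothetical rows, not the real data)
toy_checked = pd.DataFrame({
    'label': [1, 1, 1, 1, 1],
    'wrong_label': [0, 0, 1, 0, 1],
})

# Every row was predicted as class 1; wrong_label == 1 marks the false positives
false_positives = int(toy_checked['wrong_label'].sum())
true_positives = len(toy_checked) - false_positives

# precision = TP / (TP + FP)
toy_precision = true_positives / (true_positives + false_positives)
print(f'TP={true_positives}, FP={false_positives}, precision={toy_precision:.2f}')
```

Recall cannot be computed the same way here, because the comments labeled as "not aggressive" were not checked and the number of false negatives is therefore unknown.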


In [None]:
# Sample of false positives: 
checked[checked['wrong_label']==1].sample(5)

Unnamed: 0,text,label,wrong_label
2460,Welcomback boss🔥,1,1
1120,You are fucking AWESOME!,1,1
47,Awww yissss ima watch this shit,1,1
2312,"Y’all crazy, accuser was anonymous and died. I...",1,1
2271,Shit men. Come back to house of cards.,1,1


## Conclusion

The precision for class 1 was satisfactory at 85.64%. Recall was not measured, because the comments labeled as "not aggressive" were not checked manually.

Most of the false positives contain words that are negative but used in a "friendly" manner:  
- beast
- boss
- bullshit
- crack
- crap
- fuck/fucking
- goat
- motherfucker
- savage
- shit
- son of a bitch
- stupid
- sucks

Some recurrent examples: 
- I fucking love you Kevin
- You're a fucking God you son of a bitch
- Fuck yeah
- Hell yes
- House of Cards sucks without you
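
One way to back up the word list above is to count how many false positives contain each term. The snippet below is a rough sketch on a handful of hypothetical comments; in the notebook the input would be the `text` column of `checked[checked['wrong_label'] == 1]`, and `terms` is just a subset of the list chosen for illustration:

```python
import re
import pandas as pd

# Hypothetical false positives, modeled on the examples listed above
toy_false_positives = pd.Series([
    'I fucking love you Kevin',
    'Hell yes',
    'House of Cards sucks without you',
    'Welcome back boss',
])

# A few of the terms from the list above
terms = ['boss', 'fuck', 'shit', 'sucks']

# Count how many comments contain each term (case-insensitive substring match)
counts = {
    term: int(toy_false_positives.str.contains(re.escape(term), case=False).sum())
    for term in terms
}
print(counts)
```

Substring matching is deliberately loose so that "fuck" also catches "fucking"; a stricter word-boundary match would be an alternative design.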

To improve the model, we could retrain it with sentences that contain swear words, some that are truly aggressive and some that are just friendly banter.