# Part 2 (0.25) Emotion Regression: How angry are you?
BERT is one of the earliest pre-trained Transformer models, and it remains very popular and performs well. But, of course, there are many more
Transformers out there, and some perform better on particular tasks.
In the last exercise, you used CNNs to explore the emotion detection task. Now that you have the data loading ready, we ask you
to adapt it to read the Regression set of the same dataset in English. The idea of this part is straightforward: we expect you to
submit a fine-tuned Transformer model for the Affect of Anger in the English Regression task. We have uploaded the txt files of
the dataset to the exercise's material folder. We hope you will explore a few models (at least briefly read through their papers and
choose one to train). Of course, you can also choose experimentally. We leave this decision to you. You are allowed to complete
Part 2 using simpletransformers (easier to implement) by following their Regression Guide. Of course, you can always use
HuggingFace, just as you did for Part 1.

In [None]:
!pip install transformers
!pip install simpletransformers


In [None]:
import logging
import pandas as pd
import torch
from scipy.stats import pearsonr
from simpletransformers.classification import ClassificationModel, ClassificationArgs

## Loading and exploring dataset

IMPORTANT:
Download the dataset from the material section of OLAT for exercise 5. Rename the training file to ex05_train.txt and the test file to ex05_test.txt, then use these files to run the IPython notebook.

In [None]:
train_file = "ex05_train.txt"
test_file = "ex05_test.txt"

In [None]:
# read the dataset
columns = ['ID', 'Tweet', 'Affect Dimension', 'Intensity Score']
train_df = pd.read_csv(train_file, sep='\t', names=columns, header=None, skiprows=1)
eval_df = pd.read_csv(test_file, sep='\t', names=columns, header=None, skiprows=1)

In [None]:
# explore evaluation dataset
eval_df.head(50)

|  | ID | Tweet | Affect Dimension | Intensity Score |
|---|---|---|---|---|
| 0 | 2018-En-02328 | @PageShhh1 I know you mean well but I'm offend... | anger | 0.734 |
| 1 | 2018-En-02617 | Let go of resentment, it will hold you back, d... | anger | 0.422 |
| 2 | 2018-En-01021 | No, I'm not 'depressed because of the weather,... | anger | 0.663 |
| 3 | 2018-En-03737 | #AmarnathTerrorAttack Muslims are killing eve... | anger | 0.703 |
| 4 | 2018-En-03407 | Prepare to suffer the sting of Ghost Rider's p... | anger | 0.719 |
| 5 | 2018-En-02897 | @ajduey04303 We've been broken up a while, bot... | anger | 0.359 |
| 6 | 2018-En-04119 | Just know USA, all Canadians don't agree with ... | anger | 0.844 |
| 7 | 2018-En-01392 | I hate getting woken up out my sleep 😡 | anger | 0.823 |
| 8 | 2018-En-03023 | @Mothercarehelp after being assured we would g... | anger | 0.675 |
| 9 | 2018-En-01244 | #UKVI why are you so difficult to access? And ... | anger | 0.803 |


In [None]:
# explore training dataset
train_df.head(50)

|  | ID | Tweet | Affect Dimension | Intensity Score |
|---|---|---|---|---|
| 0 | 2017-En-10264 | @xandraaa5 @amayaallyn6 shut up hashtags are c... | anger | 0.562 |
| 1 | 2017-En-10072 | it makes me so fucking irate jesus. nobody is ... | anger | 0.75 |
| 2 | 2017-En-11383 | Lol Adam the Bull with his fake outrage... | anger | 0.417 |
| 3 | 2017-En-11102 | @THATSSHAWTYLO passed away early this morning ... | anger | 0.354 |
| 4 | 2017-En-11506 | @Kristiann1125 lol wow i was gonna say really?... | anger | 0.438 |
| 5 | 2017-En-10779 | I need a 🍱sushi date🍙 @AnzalduaG 🍝an olive gua... | anger | 0.271 |
| 6 | 2017-En-11588 | And Republicans, you, namely Graham, Flake, Sa... | anger | 0.354 |
| 7 | 2017-En-11282 | @leepg \n\nLike a rabid dog I pulled out the b... | anger | 0.333 |
| 8 | 2017-En-11507 | @MisterAK47 it's very telling that racist bigo... | anger | 0.556 |
| 9 | 2017-En-10849 | Follow up. Follow through. Be #relentless. #su... | anger | 0.125 |


## Preprocessing dataset

In [None]:
import re
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
import nltk

nltk.download('stopwords')
nltk.download('punkt')

# Remove stopwords (helper kept for experimentation; preprocess_data below does not call it)
def remove_stopwords(text):
  stop_words = set(stopwords.words('english'))
  words = word_tokenize(text)
  filtered_words = [word for word in words if word.lower() not in stop_words]
  return ' '.join(filtered_words)

def preprocess_data(df):
  # keep rows whose Affect Dimension is anger, reject the rest;
  # copy so the edits below do not write into a view of the original frame
  df = df[df['Affect Dimension'] == 'anger'].copy()

  # remove all the usernames
  df['Tweet'] = df['Tweet'].apply(lambda x: re.sub(r'@\w+', '', x))

  # remove special characters; this strips the '#' symbol but keeps the
  # hashtag word itself (e.g. '#offended' becomes 'offended')
  df['Tweet'] = df['Tweet'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x))

  # drop irrelevant columns
  df = df.drop(columns=['Affect Dimension', 'ID'])

  # rename columns to the 'text' / 'labels' names used for training
  df = df.rename(columns={'Tweet': 'text', 'Intensity Score': 'labels'})

  # return data
  return df

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
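
The order of the regex substitutions matters here. Once the pattern `[^a-zA-Z0-9\s]` has removed the `#` symbol, a later `#\w+` pattern has nothing left to match, which is why hashtag words such as `offended` show up as plain words in the preprocessed output. The sketch below contrasts the two orders on a made-up tweet; the function names and the sample text are hypothetical and only serve the illustration.

```python
import re

def strip_then_hashtags(text):
    """Special characters first, then hashtags: the hashtag step is a no-op."""
    text = re.sub(r'@\w+', '', text)
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)  # '#' is removed here
    text = re.sub(r'#\w+', '', text)            # no '#' left, so nothing matches
    return text

def hashtags_then_strip(text):
    """Hashtags first, then special characters: hashtag words are dropped."""
    text = re.sub(r'@\w+', '', text)
    text = re.sub(r'#\w+', '', text)
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)
    return text

sample = "@user shut up, hashtags are cool #offended"  # hypothetical tweet
kept = strip_then_hashtags(sample)
dropped = hashtags_then_strip(sample)

print(repr(kept))
print(repr(dropped))
```

Which order is preferable depends on the task. For emotion intensity, hashtag words often carry the emotion itself, so keeping them may well be the better choice.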


In [None]:
# preprocess train and evaluation dataset
train_df_preprocessed = preprocess_data(train_df)
eval_df_preprocessed  = preprocess_data(eval_df)

In [None]:
# explore preprocessed training dataset
train_df_preprocessed.head(4)

|  | text | labels |
|---|---|---|
| 0 | shut up hashtags are cool offended | 0.562 |
| 1 | it makes me so fucking irate jesus nobody is c... | 0.75 |
| 2 | Lol Adam the Bull with his fake outrage | 0.417 |
| 3 | passed away early this morning in a fast and ... | 0.354 |


In [None]:
# explore preprocessed evaluation dataset
eval_df_preprocessed.head(4)

|  | text | labels |
|---|---|---|
| 0 | I know you mean well but Im offended Prick | 0.734 |
| 1 | Let go of resentment it will hold you back do ... | 0.422 |
| 2 | No Im not depressed because of the weather Im ... | 0.663 |
| 3 | AmarnathTerrorAttack Muslims are killing ever... | 0.703 |


## Training and evaluation of model

In [None]:
def train(train_df, model_type, model_name, epochs=1):
  logging.basicConfig(level=logging.INFO)
  transformers_logger = logging.getLogger("transformers")
  transformers_logger.setLevel(logging.WARNING)

  # Model args: enable regression and allow overwriting earlier outputs
  model_args = ClassificationArgs()
  model_args.num_train_epochs = epochs
  model_args.regression = True
  model_args.overwrite_output_dir = True

  # Init a ClassificationModel; num_labels=1 gives a single regression output
  model = ClassificationModel(
    model_type,
    model_name,
    num_labels=1,
    args=model_args,
    use_cuda=torch.cuda.is_available()  # fall back to CPU when no GPU is present
  )

  # Train the model
  model.train_model(train_df)

  # predict the results on train data
  predictions, _ = model.predict(train_df["text"].tolist())

  # calculate Pearson r score on training data
  r_value, _ = pearsonr(predictions, train_df["labels"].values)
  return r_value, model

In [None]:
def evaluate(eval_df, model):
  # Evaluate the model
  _, predictions, _ = model.eval_model(eval_df)

  # calculate pearson r scores on evaluation dataset
  r_value, _ = pearsonr(predictions, eval_df["labels"].values)
  return r_value
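
For reference, the Pearson coefficient used above is r = Σ(x − x̄)(y − ȳ) / sqrt(Σ(x − x̄)² · Σ(y − ȳ)²). The following is a minimal pure-Python sketch of that formula, not a replacement for `scipy.stats.pearsonr`; the function name `pearson_r` and the example scores are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same length")
    if len(xs) < 2:
        raise ValueError("need at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: co-deviation of the two sequences from their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: squared deviations of each sequence
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

gold = [0.125, 0.354, 0.562, 0.750]      # illustrative intensity scores
scaled = [2 * g + 0.1 for g in gold]     # same ordering, different scale and offset
r_perfect = pearson_r(gold, scaled)
r_inverse = pearson_r(gold, [-g for g in gold])
```

Note that r is unchanged by rescaling or shifting the predictions, so a high r says the model ranks tweets well along a linear trend, not that its absolute intensity values are accurate.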

We will explore 3 popular models:
1. RoBERTa
2. BERT
3. XLNet

We will compare the models by their Pearson r value and choose the best one. We will then evaluate that model on the evaluation dataset.
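
A Pearson r measured on the same rows the model was trained on tends to reward memorization. One common alternative, which this notebook does not use, is to hold out part of the training set and compare models on that slice. The sketch below only produces the row indices for such a split; `split_indices`, the 10% fraction, the seed, and the row count of 1000 are all assumptions made for illustration.

```python
import random

def split_indices(n_rows, val_fraction=0.1, seed=42):
    """Return (train_idx, val_idx): a shuffled, disjoint partition of range(n_rows)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)       # seeded for reproducibility
    n_val = max(1, int(n_rows * val_fraction)) # at least one validation row
    return indices[n_val:], indices[:n_val]

train_idx, val_idx = split_indices(1000)
print(len(train_idx), len(val_idx))
```

With a pandas frame, the two index lists could then be applied as `df.iloc[train_idx]` and `df.iloc[val_idx]`, training on the first and computing r on the second.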

In [None]:
# train models
train_r_val_roberta, roberta_model = train(train_df_preprocessed, "roberta", "roberta-base", epochs=5)
train_r_val_bert, bert_model = train(train_df_preprocessed, "bert", "bert-base-uncased", epochs=5)
train_r_val_xlnet, xlnet_model = train(train_df_preprocessed, "xlnet", "xlnet-base-cased", epochs=5)

# choose the model based on best r_values
print("r values of roberta model:" , train_r_val_roberta)
print("r values of bert model:" , train_r_val_bert)
print("r values of xlnet model:" , train_r_val_xlnet)

Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.out_proj.bias', 'classifier.out_proj.weight', 'classifier.dense.weight', 'classifier.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



Some weights of XLNetForSequenceClassification were not initialized from the model checkpoint at xlnet-base-cased and are newly initialized: ['logits_proj.bias', 'sequence_summary.summary.bias', 'sequence_summary.summary.weight', 'logits_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



r values of roberta model: 0.9471595602563633
r values of bert model: 0.9686198124523652
r values of xlnet model: 0.9328781033095346
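
Rather than reading the three numbers by eye, the ranking can be computed directly. The short sketch below copies the r values printed above into a dictionary (the dictionary and variable names are illustrative) and sorts them from highest to lowest.

```python
# Training-set Pearson r values, copied from the output above
train_r = {
    "roberta": 0.9471595602563633,
    "bert": 0.9686198124523652,
    "xlnet": 0.9328781033095346,
}

# Sort model names by r, highest first
ranking = sorted(train_r.items(), key=lambda kv: kv[1], reverse=True)
best_name, best_r = ranking[0]

for name, r in ranking:
    print(f"{name}: {r:.3f}")
print("highest training r:", best_name)
```

By this criterion the order is BERT, RoBERTa, XLNet.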


In [None]:
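# Evaluate the fine-tuned RoBERTa model on our evaluation dataset.
# Note: the training r values printed above rank BERT (0.969) ahead of RoBERTa (0.947),
# so RoBERTa is not the top model by that criterion; only its evaluation score is reported here.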
eval_r_value = evaluate(eval_df_preprocessed, roberta_model)
print("roberta model r values: ", eval_r_value)

roberta model r values:  0.7719890188629034
