# RoBERTa Soft Voting

RoBERTa can process at most 512 tokens at once, so an article longer than that has to be split into parts and evaluated in chunks. These chunks are called windows. Previously, we evaluated every window, labeled it 0 or 1, and then took a hard vote over all windows (a tie was labeled 1).
In this notebook we use a soft vote to classify the article instead. We apply the softmax function to the values returned for each window to get per-window class probabilities, sum these probabilities over the article's windows (normalized by the number of windows), and classify the article from the result.
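
To make the difference between the two voting schemes concrete, here is a minimal sketch on made-up logits for one article with three windows. The function names `hard_vote` and `soft_vote` and the toy values are illustrative assumptions, not part of the notebook, which does the same computation with `scipy.special.softmax` and NumPy.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one window's raw outputs."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_vote(window_logits, tie_value=1):
    """Label each window by its larger logit, then take the majority label."""
    labels = [1 if w[1] > w[0] else 0 for w in window_logits]
    ones = sum(labels)
    zeros = len(labels) - ones
    if ones == zeros:
        return tie_value  # ties are labeled 1, as in the earlier hard-voting setup
    return 1 if ones > zeros else 0

def soft_vote(window_logits):
    """Average the per-window probabilities, then threshold P(class 0) at 0.5."""
    probs = [softmax(w) for w in window_logits]
    mean_p0 = sum(p[0] for p in probs) / len(probs)
    return 0 if mean_p0 > 0.5 else 1

# Hypothetical article: two windows lean weakly toward class 1,
# one window is strongly confident in class 0.
toy_windows = [[0.0, 0.2], [0.0, 0.2], [4.0, -4.0]]

hard_label = hard_vote(toy_windows)
soft_label = soft_vote(toy_windows)
print("hard vote:", hard_label, "soft vote:", soft_label)
```

In this toy case the two schemes disagree: the hard vote counts two windows against one, while the soft vote lets the single confident window outweigh the two uncertain ones.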

>**Note:** This notebook was run in Google Colab, so it does not reference the data files directly. The data used is the same as in the repository.

## Imports

In [1]:
from google.colab import drive
import glob

drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
!pip install simpletransformers -q

[?25l[K     |█▋                              | 10kB 30.5MB/s eta 0:00:01[K     |███▎                            | 20kB 15.3MB/s eta 0:00:01[K     |█████                           | 30kB 13.7MB/s eta 0:00:01[K     |██████▋                         | 40kB 12.6MB/s eta 0:00:01[K     |████████▏                       | 51kB 9.0MB/s eta 0:00:01[K     |█████████▉                      | 61kB 9.8MB/s eta 0:00:01[K     |███████████▌                    | 71kB 9.5MB/s eta 0:00:01[K     |█████████████▏                  | 81kB 9.7MB/s eta 0:00:01[K     |██████████████▊                 | 92kB 9.8MB/s eta 0:00:01[K     |████████████████▍               | 102kB 8.2MB/s eta 0:00:01[K     |██████████████████              | 112kB 8.2MB/s eta 0:00:01[K     |███████████████████▊            | 122kB 8.2MB/s eta 0:00:01[K     |█████████████████████▎          | 133kB 8.2MB/s eta 0:00:01[K     |███████████████████████         | 143kB 8.2MB/s eta 0:00:01[K     |███████████████████████

In [4]:
import pandas as pd
import numpy as np
import torch 
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, matthews_corrcoef
from sklearn.model_selection import KFold
from simpletransformers.classification import ClassificationModel, ClassificationArgs
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

## Load Data

In [5]:
# CHANGE TO YOUR PATH
colab_resources_path = "/content/drive/My Drive/Machine Learning/Project/colab_resources"

In [6]:
data_files = glob.glob(colab_resources_path + "/*.csv")
data_files += glob.glob(colab_resources_path + "/*.py")
for data_file in data_files:
  print('Copying file {} to colab root.'.format(data_file))
  !cp "$data_file" .

Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/test.csv to colab root.
Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/am_additional.csv to colab root.
Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/random.csv to colab root.
Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/am.csv to colab root.
Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/nam.csv to colab root.
Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/data_preprocess.py to colab root.
Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/data_preprocess_old.py to colab root.


In [7]:
from data_preprocess import getTrainData

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


In [8]:
train_data = getTrainData(include_random=True)
train_data = train_data.rename(columns={"label": "labels"})

In [9]:
train_data.labels.value_counts()

1    801
0    793
Name: labels, dtype: int64

## Test

In [10]:
from scipy.special import softmax
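def getProbabilitiesRoberta(pred):
  # pred holds one array of raw outputs per article, shaped (n_windows, n_classes).
  # Softmax each window, then average over windows to get article-level probabilities.
  return np.array([np.sum(softmax(j, axis=1), axis=0)/len(j) for j in pred])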

In [11]:
model_args = ClassificationArgs(sliding_window=True)
model_args.num_train_epochs = 4
model_args.save_best_model = True
model_args.tie_value = 1
model_args.batch_size = 16
model_args.learning_rate = 2e-5
model_args.overwrite_output_dir = True
model_args.max_seq_length = 512
model_args.no_cache = True
model_args.max_grad_norm = 1
model_args.use_multiprocessing = True
model_args.manual_seed = 4
model_args.reprocess_input_data = True
model_args.evaluate_during_training = True
model_args.labels_list = [0, 1]
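
With `sliding_window=True`, an article longer than `max_seq_length` is evaluated as several overlapping windows. The sketch below illustrates the general idea only and is not the library's actual implementation. The helper `split_into_windows`, the window length of 510 (512 minus two special tokens), and the stride fraction of 0.8 are all assumptions chosen for illustration.

```python
def split_into_windows(token_ids, window_len=510, stride_fraction=0.8):
    """Split a token list into overlapping windows (illustrative sketch).

    window_len and stride_fraction are assumed values, not read from
    the notebook's configuration.
    """
    if window_len <= 0 or not 0 < stride_fraction <= 1:
        raise ValueError("window_len must be positive and stride_fraction in (0, 1]")
    # Each new window starts `step` tokens after the previous one,
    # so consecutive windows overlap by window_len - step tokens.
    step = max(1, int(window_len * stride_fraction))
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + window_len])
        if start + window_len >= len(token_ids):
            break  # this window already reaches the end of the article
        start += step
    return windows

# Hypothetical article of 1200 tokens, represented by integer ids.
tokens = list(range(1200))
windows = split_into_windows(tokens)
window_lengths = [len(w) for w in windows]
print(len(windows), window_lengths)
```

Each window produces its own model output, which is why the evaluation step returns several rows per article that then have to be combined by voting.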

In [13]:
n = 6
seed = 42
kf = KFold(n_splits=n, random_state=seed, shuffle=True)
acc, prec, rec, f1, mcc = [], [], [], [], []

for train_index, val_index in kf.split(train_data): 
    train_df = train_data.iloc[train_index]
    val_df = train_data.iloc[val_index]

    #### RoBERTa
    model = ClassificationModel('roberta', 'roberta-base', args=model_args)
    model.train_model(train_df, eval_df=val_df, acc=matthews_corrcoef)
    result, model_outputs, wrong_predictions = model.eval_model(val_df, acc=matthews_corrcoef) 

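    # Soft vote: average window probabilities per article, then threshold P(class 0).
    prob = getProbabilitiesRoberta(model_outputs)
    prob = prob[:, 0]
    # An exact 0.5 tie falls to class 1, matching tie_value above.
    predictions = np.where(prob > 0.5, 0, 1)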

    acc.append(accuracy_score(val_df.labels, predictions))
    prec.append(precision_score(val_df.labels, predictions))
    rec.append(recall_score(val_df.labels, predictions))
    f1.append(f1_score(val_df.labels, predictions))
    mcc.append(matthews_corrcoef(val_df.labels, predictions))


Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.bias', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.out

### Results

In [14]:
print('RoBERTa score: ')
print('Accuracy: ', np.round(np.mean(acc), 4))
print('Precision: ', np.round(np.mean(prec), 4))
print('Recall: ', np.round(np.mean(rec), 4))
print('F1: ', np.round(np.mean(f1), 4))
print('MCC: ', np.round(np.mean(mcc), 4))

RoBERTa score: 
Accuracy:  0.9573
Precision:  0.9552
Recall:  0.9605
F1:  0.9576
MCC:  0.9147


## Conclusion

Soft voting over RoBERTa windows (**F1: 0.9576**) improved on hard voting over windows (**F1: 0.9505**).