# Ensemble - RoBERTa + Logistic Regression

This notebook builds an ensemble of logistic regression (LR) and RoBERTa (base) using soft voting. Each classifier's predicted probabilities are weighted by that classifier's F1 score.
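
As a minimal sketch of weighted soft voting, assume two binary classifiers that each output a probability for class 1. The helper name `soft_vote` and the probabilities below are made up for illustration and are not part of the notebook. The weights mirror the rounded F1 scores used later in the cross-validation loop.

```python
import numpy as np

def soft_vote(prob_a, prob_b, w_a, w_b, threshold=0.5):
    """Weighted average of two class-1 probability vectors, then threshold."""
    prob_a = np.asarray(prob_a, dtype=float)
    prob_b = np.asarray(prob_b, dtype=float)
    combined = (w_a * prob_a + w_b * prob_b) / (w_a + w_b)
    return combined, (combined > threshold).astype(int)

# Illustrative class-1 probabilities for three documents (made-up values).
p_lr = [0.90, 0.40, 0.20]
p_rb = [0.80, 0.70, 0.10]

combined, labels = soft_vote(p_lr, p_rb, w_a=0.94, w_b=0.95)
print(combined.round(3), labels)
```

The second document shows the effect of soft voting: LR alone would predict class 0 (0.40), but the weighted average crosses the threshold because RoBERTa is more confident. The notebook's loop applies the same idea to the class-0 column, which gives the same decision except for exact ties at 0.5.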

>**Note:** This notebook was run in Google Colab, so the data is read from a mounted Google Drive folder rather than from a path in the repository. The data is the same as in the repository.

## Imports

In [2]:
from google.colab import drive
import glob

drive.mount('/content/drive')

Mounted at /content/drive


In [1]:
!pip install simpletransformers -q


In [5]:
import pandas as pd
import numpy as np
import torch 
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, matthews_corrcoef
from sklearn.model_selection import KFold
from simpletransformers.classification import ClassificationModel, ClassificationArgs
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

## Load Data

In [3]:
# CHANGE TO YOUR PATH
colab_resources_path = "/content/drive/My Drive/Machine Learning/Project/colab_resources"

In [4]:
data_files = glob.glob(colab_resources_path + "/*.csv")
data_files += glob.glob(colab_resources_path + "/*.py")
for data_file in data_files:
  print('Copying file {} to colab root.'.format(data_file))
  !cp "$data_file" .

Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/nam.csv to colab root.
Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/am.csv to colab root.
Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/am_additional.csv to colab root.
Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/random.csv to colab root.
Copying file /content/drive/My Drive/Machine Learning/Project/colab_resources/data_preprocess.py to colab root.


In [6]:
from data_preprocess import getTrainData

In [12]:
train_data = getTrainData(include_random=True)
train_data = train_data.rename(columns={"label": "labels"})

## Test

In [8]:
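from scipy.special import softmax

def getProbabilitiesRoberta(pred):
  # pred holds one array of sliding-window logits per document, shape (n_windows, n_classes).
  # Softmax each window, then average over windows to get one probability row per document.
  return np.array([np.sum(softmax(j, axis=1), axis=0)/len(j) for j in pred])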
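
To make the aggregation concrete, here is a small NumPy-only sketch of the same idea on toy logits. It assumes, as the function above implies, that each document contributes an array of shape `(n_windows, n_classes)`. The names `softmax_rows` and `mean_window_probabilities` and the logit values are illustrative only.

```python
import numpy as np

def softmax_rows(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def mean_window_probabilities(outputs):
    """Average per-window class probabilities into one row per document."""
    return np.array(
        [softmax_rows(np.asarray(doc, dtype=float)).mean(axis=0) for doc in outputs]
    )

# Two toy documents: the first has three windows, the second has one.
toy_outputs = [
    [[2.0, 0.0], [1.0, 1.0], [0.0, 2.0]],
    [[3.0, 1.0]],
]
probs = mean_window_probabilities(toy_outputs)
print(probs)
```

Each row of `probs` sums to 1, so it can be combined directly with the output of `predict_proba` from the logistic regression model.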

In [9]:
model_args = ClassificationArgs(sliding_window=True)  # split long texts into several windows
model_args.num_train_epochs = 4
model_args.save_best_model = True
model_args.tie_value = 1  # label used when sliding-window predictions are tied
model_args.batch_size = 16
model_args.learning_rate = 2e-5
model_args.overwrite_output_dir = True
model_args.max_seq_length = 512
model_args.max_grad_norm = 1
model_args.use_multiprocessing = True
model_args.manual_seed = 4
model_args.reprocess_input_data = True
model_args.evaluate_during_training = True
model_args.labels_list = [0, 1]
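
With `sliding_window=True`, a text longer than `max_seq_length` is split into several sub-sequences that are classified separately, which is why the per-document outputs have to be aggregated later. The following is only a conceptual sketch of such a split on a plain token list. It does not reproduce Simple Transformers' tokenization, special-token handling, or its actual stride setting, and `sliding_windows`, `window_size`, and `step` are hypothetical names.

```python
def sliding_windows(tokens, window_size, step):
    """Split a token list into overlapping windows of at most window_size tokens."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window_size])
        # Stop once the current window reaches the end of the token list.
        if start + window_size >= len(tokens):
            break
        start += step
    return windows

tokens = list(range(10))  # stand-in for token ids
windows = sliding_windows(tokens, window_size=4, step=3)
print(windows)
```

A step smaller than the window size makes consecutive windows overlap, so text near a window boundary is seen with context on both sides.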

In [13]:
n=6
seed=42
kf = KFold(n_splits=n, random_state=seed, shuffle=True)
mcc_lr, f1_lr = [], []
mcc_rb, f1_rb = [], []
acc, prec, rec, f1, mcc = [], [],[],[], []

for train_index, val_index in kf.split(train_data): 
    train_df = train_data.iloc[train_index]
    val_df = train_data.iloc[val_index]

    ### LogReg
    cv = TfidfVectorizer(strip_accents='ascii', lowercase=True, stop_words='english')
    X_train_cv = cv.fit_transform(train_df.text)
    X_val_cv = cv.transform(val_df.text)
    
    lr = LogisticRegression(random_state=0, C=17, penalty='l2', max_iter=1000)
    lr.fit(X_train_cv, train_df.labels)
    predictions_lr = lr.predict(X_val_cv)

    f1_lr.append(f1_score(val_df.labels, predictions_lr))
    mcc_lr.append(matthews_corrcoef(val_df.labels, predictions_lr))

    #### RoBERTa
    model = ClassificationModel('roberta', 'roberta-base', args=model_args)
    model.train_model(train_df, eval_df=val_df, acc=matthews_corrcoef)
    result, model_outputs, wrong_predictions = model.eval_model(val_df, acc=matthews_corrcoef) 

    predictions_rb = np.array([np.rint(np.mean(np.argmax(j, axis=1))) for j in model_outputs]).astype(int)

    f1_rb.append(f1_score(val_df.labels, predictions_rb))
    mcc_rb.append(matthews_corrcoef(val_df.labels, predictions_rb))

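    ##### ENSEMBLE
    # Weights are the rounded mean F1 scores of each model (see Results).
    w_lr = 0.94 # LR F1 score
    w_rb = 0.95 # RoBERTa F1 score

    prob_lr = np.array(lr.predict_proba(X_val_cv))
    prob_rb = getProbabilitiesRoberta(model_outputs)

    # Keep only the probability of class 0 from each model.
    prob_lr = prob_lr[:, 0]
    prob_rb = prob_rb[:, 0]

    # Weighted soft vote on the class-0 probability.
    prob = (prob_lr*w_lr + prob_rb*w_rb)/(w_lr+w_rb)

    # Predict class 0 when its weighted probability exceeds 0.5, otherwise class 1.
    predictions = np.where(prob > 0.5, 0, 1)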

    acc.append(accuracy_score(val_df.labels, predictions))
    prec.append(precision_score(val_df.labels, predictions))
    rec.append(recall_score(val_df.labels, predictions))
    f1.append(f1_score(val_df.labels, predictions))
    mcc.append(matthews_corrcoef(val_df.labels, predictions))


Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.bias', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.out
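
The loop above hard-codes the ensemble weights as rounded F1 scores. One possible variant, shown here only as a sketch, derives normalized weights from per-fold scores such as those collected in `f1_lr` and `f1_rb`. The fold values below are made up, and `weights_from_scores` is a hypothetical helper that is not part of the notebook.

```python
import numpy as np

def weights_from_scores(scores_a, scores_b):
    """Return normalized weights proportional to each model's mean score."""
    mean_a = float(np.mean(scores_a))
    mean_b = float(np.mean(scores_b))
    total = mean_a + mean_b
    return mean_a / total, mean_b / total

# Made-up per-fold F1 scores for illustration.
f1_lr_folds = [0.93, 0.94, 0.95]
f1_rb_folds = [0.95, 0.95, 0.95]

w_lr, w_rb = weights_from_scores(f1_lr_folds, f1_rb_folds)
print(round(w_lr, 4), round(w_rb, 4))
```

Because the two F1 scores are so close, both weights land near 0.5, which means the weighted vote behaves almost like a plain average of the two models' probabilities.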

### Results

In [15]:
print('Logistic regression score: ')
print('F1 LR: ', np.round(np.mean(f1_lr), 4))
print('MCC LR: ', np.round(np.mean(mcc_lr), 4))
print()

print('RoBERTa score: ')
print('F1 RoBERTa: ', np.round(np.mean(f1_rb), 4))
print('MCC RoBERTa: ', np.round(np.mean(mcc_rb), 4))
print()

print('Ensemble score: ')
print('Accuracy Ensemble: ', np.round(np.mean(acc), 4))
print('Precision Ensemble: ', np.round(np.mean(prec), 4))
print('Recall Ensemble: ', np.round(np.mean(rec), 4))
print('F1 Ensemble: ', np.round(np.mean(f1), 4))
print('MCC Ensemble: ', np.round(np.mean(mcc), 4))

Logistic regression score: 
F1 LR:  0.9398
MCC LR:  0.8771

RoBERTa score: 
F1 RoBERTa:  0.9511
MCC RoBERTa:  0.9011

Ensemble score: 
Accuracy Ensemble:  0.9573
Precision Ensemble:  0.9552
Recall Ensemble:  0.9619
F1 Ensemble:  0.9583
MCC Ensemble:  0.915
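
For readers less familiar with the two headline metrics: F1 is the harmonic mean of precision and recall, while MCC uses all four cells of the confusion matrix. The sketch below computes both from confusion counts. The counts are made up and are not taken from this notebook's folds, and `f1_and_mcc` is a hypothetical helper.

```python
import math

def f1_and_mcc(tp, fp, fn, tn):
    """Compute F1 and MCC from binary confusion counts.

    Assumes at least one predicted positive and one actual positive,
    so precision and recall are well defined.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1_value = 2 * precision * recall / (precision + recall)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator is 0.
    mcc_value = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1_value, mcc_value

f1_value, mcc_value = f1_and_mcc(tp=90, fp=5, fn=10, tn=95)
print(round(f1_value, 4), round(mcc_value, 4))
```

MCC is typically lower than F1 on the same predictions because it also penalizes errors on the negative class, which matches the pattern in the scores reported above.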


## Conclusion

The ensemble is **better** than RoBERTa alone on both F1 (**0.9583** vs. 0.9511) and MCC (0.915 vs. 0.9011). It also outperforms logistic regression alone (F1: 0.9398, MCC: 0.8771).
