## Emotion dataset

This notebook generates the results on the [emotion dataset](https://huggingface.co/datasets/emotion) from Hugging Face. The model used in this work is [bhadresh-savani/bert-base-uncased-emotion](https://huggingface.co/bhadresh-savani/bert-base-uncased-emotion).

#### Import packages

In [None]:
import sys
sys.path.append('../src/')

import numpy as np
import pickle5 as pkl
import tensorflow_hub as hub
import util_funcs as uf
from nlx_babybear import RFBabyBear
from inference_triage import PapabearClassifierEmotion, TriagedClassifier

from transformers import AutoTokenizer, AutoModelForTokenClassification

from nltk.tokenize.treebank import TreebankWordDetokenizer
from sklearn.model_selection import KFold
import matplotlib.pyplot as plt

#### Loading the data

In [None]:
filename = '../data/emotion/train_emotion.pkl'
texts_train, y_train, _ = uf.open_pkl(filename)
doc, labels = np.asarray(texts_train), np.asarray(y_train)


filename = '../data/emotion/test_emotion.pkl'
texts_test, y_test, _ = uf.open_pkl(filename)
texts_test, y_test = np.asarray(texts_test), np.asarray(y_test)

There are 6 classes in this dataset (0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"). The distribution of these classes in the training dataset is shown in the following figure.

In [None]:
plt.hist(labels)
plt.xlabel('Class ID')
plt.ylabel('Frequency')
plt.title('Class distribution on training dataset')
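
To read the class distribution by name rather than by ID, the counts per class can also be tabulated directly. The following is a minimal sketch: `CLASS_NAMES` and `class_counts` are hypothetical helpers introduced here for illustration (they are not part of `util_funcs`), and the label list is made-up demo data rather than the real training labels.

```python
from collections import Counter

# Class IDs and names as listed in the dataset description above.
CLASS_NAMES = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}

def class_counts(labels):
    """Return {class name: count} for an iterable of integer class IDs."""
    counts = Counter(int(label) for label in labels)
    # Classes that never occur are reported with a count of 0.
    return {name: counts.get(class_id, 0) for class_id, name in CLASS_NAMES.items()}

# Made-up labels, only to show the call; in the notebook one would pass `labels`.
demo_counts = class_counts([0, 1, 1, 3, 5, 1])
print(demo_counts)
```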

#### Input parameters

`model`: The model used as the [papabear model](https://huggingface.co/bhadresh-savani/bert-base-uncased-emotion).

`confidence_th_options`: The candidate values for the confidence threshold.

`metric`: The metric used to measure performance. It can be one of "accuracy", "recall", "f1_score", or "precision".

`metric_threshold`: The minimum performance we expect the final model to reach.

In [None]:
model='bhadresh-savani/bert-base-uncased-emotion'
metric = "accuracy"
metric_threshold = .9
confidence_th_options = np.arange(0,1.005,.005)
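
For reference, the four metric names can be illustrated with a small pure-Python sketch. This is not the implementation used inside `TriagedClassifier`; `compute_metric` is a hypothetical helper, and the macro averaging over classes for "recall", "precision", and "f1_score" is an assumption made here, since the notebook does not state how multi-class scores are averaged.

```python
def compute_metric(y_true, y_pred, metric="accuracy"):
    """Score predictions with one of: accuracy, recall, f1_score, precision.

    Assumption: the per-class scores are macro-averaged (unweighted mean).
    """
    if metric == "accuracy":
        return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    if metric not in ("recall", "f1_score", "precision"):
        raise ValueError(f"unknown metric: {metric}")

    per_class = []
    for class_id in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == class_id and p == class_id)
        fp = sum(1 for t, p in pairs if t != class_id and p == class_id)
        fn = sum(1 for t, p in pairs if t == class_id and p != class_id)
        # Undefined ratios (empty denominators) are scored as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if metric == "precision":
            per_class.append(precision)
        elif metric == "recall":
            per_class.append(recall)
        else:  # f1_score: harmonic mean of precision and recall
            total = precision + recall
            per_class.append(2 * precision * recall / total if total else 0.0)
    return sum(per_class) / len(per_class)

# Made-up labels and predictions, only to show the call.
demo_true = [0, 0, 1, 1]
demo_pred = [0, 1, 1, 1]
demo_accuracy = compute_metric(demo_true, demo_pred, "accuracy")
```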

#### Instantiate babybear and papabear models

In [None]:
language_model = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
# language_model = SIF("latest-small")
papabear = PapabearClassifierEmotion(model)
babybear = RFBabyBear(language_model)

# Pass the `metric` variable defined above instead of repeating the literal "accuracy".
inf_traige = TriagedClassifier("classification", babybear, papabear, metric_threshold, metric, confidence_th_options)

#### Hyper-parameter tuning

Here we will train inference triage to find the confidence threshold.

In [None]:
inf_traige.train(doc, labels)

print(f"Confidence threshold is: {inf_traige.confidence_th}")

print("The following plots show saving vs. confidence threshold for each CV fold")
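
The idea behind choosing a confidence threshold can be sketched in plain Python. This is an illustration under stated assumptions, not the actual logic of `TriagedClassifier.train`: it assumes that a sample is kept by babybear when its confidence is at least the threshold and is otherwise passed to papabear, that "saving" is the fraction of samples babybear handles, that performance is accuracy, and that the chosen threshold is the one with the largest saving whose performance still meets `metric_threshold`. The names `triage_predictions` and `pick_threshold` are hypothetical.

```python
def triage_predictions(baby_preds, baby_conf, papa_preds, threshold):
    """Combine predictions: babybear answers when confident enough, else papabear."""
    kept = [conf >= threshold for conf in baby_conf]
    preds = [b if k else p for b, p, k in zip(baby_preds, papa_preds, kept)]
    saving = sum(kept) / len(kept)  # fraction of samples that skip papabear
    return preds, saving

def pick_threshold(labels, baby_preds, baby_conf, papa_preds, thresholds, metric_threshold):
    """Return (threshold, saving, accuracy) with the largest saving that
    still reaches metric_threshold, or None if no threshold qualifies."""
    best = None
    for th in thresholds:
        preds, saving = triage_predictions(baby_preds, baby_conf, papa_preds, th)
        accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if accuracy >= metric_threshold and (best is None or saving > best[1]):
            best = (th, saving, accuracy)
    return best

# Made-up toy data: babybear is wrong on the third sample, where it is least confident.
demo_best = pick_threshold(
    labels=[0, 1, 2, 3],
    baby_preds=[0, 1, 0, 3],
    baby_conf=[0.9, 0.8, 0.4, 0.6],
    papa_preds=[0, 1, 2, 3],
    thresholds=[0.0, 0.5, 1.0],
    metric_threshold=0.9,
)
print(demo_best)
```

In this toy case a threshold of 0.0 lets babybear answer everything and misses the performance target, while 0.5 routes only the low-confidence sample to papabear.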

#### Training babybear model
We train the babybear model on all the data.

In [None]:
babybear = RFBabyBear(language_model)
babybear.train(doc, labels, n_class=len(np.unique(labels)))

#### Applying inference triage on the test dataset
All the results are also saved in '../output/emotion.resullts'

In [None]:
inf_traige.babybear = babybear
# Score on the test split loaded above (texts_dev / y_dev are never defined in this notebook).
a = inf_traige.score(texts_test, y_test)

dump_data = {}
dump_data['result'] = a
dump_data['confidence_th'] = inf_traige.confidence_th
dump_data['indx_conf_th'] = inf_traige.indx_conf_th
dump_data['metric'] = inf_traige.metric
dump_data['metric_threshold'] = inf_traige.metric_threshold
dump_data['performance'] = inf_traige.performance
dump_data['saving'] = inf_traige.saving
dump_data['tot_time'] = inf_traige.tot_time
with open('../output/emotion.resullts', 'wb') as outp:  # Overwrites any existing file.
    pkl.dump(dump_data, outp, pkl.HIGHEST_PROTOCOL)
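
A saved results dictionary can later be read back for plotting without rerunning the models. The sketch below shows the round trip with the standard-library `pickle` module (the notebook itself uses `pickle5`); `save_results` and `load_results` are hypothetical helper names, and the dictionary and temporary path are demo values, not the notebook's real output.

```python
import os
import pickle
import tempfile

def save_results(results, path):
    """Write a results dictionary to disk, overwriting any existing file."""
    with open(path, "wb") as outp:
        pickle.dump(results, outp, pickle.HIGHEST_PROTOCOL)

def load_results(path):
    """Read a results dictionary back from disk."""
    with open(path, "rb") as inp:
        return pickle.load(inp)

# Demo values only; the real dictionary is `dump_data` from the cell above.
demo_results = {"confidence_th": 0.5, "saving": [1.0, 0.75, 0.0]}
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.results")
    save_results(demo_results, demo_path)
    reloaded = load_results(demo_path)
```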

#### Plot CPU/GPU run time

In [None]:
# Plot the series collected from inf_traige into dump_data in the previous cell.
tot_time = dump_data['tot_time']
performance = np.asarray(dump_data['performance']) * 100
idx = dump_data['indx_conf_th']

plt.scatter(tot_time, performance, color='r', label='GPU run time')
plt.xlabel('Time (sec)')
plt.ylabel(f"{dump_data['metric']} (%)")

# Vertical line: run time at the selected confidence threshold.
y = np.arange(0, 105, .1)
x = y * 0 + tot_time[idx]
plt.plot(x, y, '--', label='Time at confidence_th')
plt.ylim([min(performance) - 5, 105])

# Horizontal line: performance at the selected confidence threshold.
x = np.arange(-.5, max(tot_time) + .5, .1)
y = x * 0 + performance[idx]
plt.plot(x, y, '--', label=f"{dump_data['metric']} at confidence threshold = {performance[idx]}%")
plt.xlim([-.1, max(tot_time) + .5])
plt.legend(loc=0)

Saving vs confidence threshold

In [None]:
# Saving for each candidate threshold, as stored in dump_data.
plt.scatter(confidence_th_options, dump_data['saving'], color='r')
plt.xlabel('confidence threshold')
plt.ylabel('saving')

Performance vs confidence threshold

In [None]:
# Performance for each candidate threshold, as stored in dump_data.
plt.scatter(confidence_th_options, dump_data['performance'], color='r')
plt.xlabel('confidence threshold')
plt.ylabel(str(dump_data['metric']))

GPU run time vs confidence threshold

In [None]:
# Total run time for each candidate threshold, as stored in dump_data.
plt.scatter(confidence_th_options, dump_data['tot_time'], color='r', label='GPU run time')
plt.xlabel('confidence threshold')
plt.ylabel('time')