### Check Hardware & RAM availability:

The cells below check which GPU is attached to the runtime and how much RAM the runtime has.

In [1]:
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Select the Runtime > "Change runtime type" menu to enable a GPU accelerator, ')
  print('and then re-execute this cell.')
else:
  print(gpu_info)

Sat Sep 11 08:29:01 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.63.01    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   41C    P0    28W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces
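
The `!nvidia-smi` shell escape only works inside IPython or Colab. As a rough sketch of the same check in a plain Python script, the snippet below uses only the standard library; the helper name `gpu_summary` is hypothetical and introduced here purely for illustration.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the text printed by nvidia-smi, or None if it cannot be run."""
    # nvidia-smi is absent on machines without an NVIDIA driver.
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # A non-zero exit code usually means the driver could not be reached.
    return result.stdout if result.returncode == 0 else None


summary = gpu_summary()
if summary is None:
    print("No usable NVIDIA GPU was detected on this machine.")
else:
    print(summary)
```

Checking the exit code rather than searching the output for the word "failed" avoids depending on the exact wording of the error message.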

In [2]:
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

Your runtime has 27.3 gigabytes of available RAM
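
Note that `virtual_memory().total` is the total physical memory of the runtime, not the amount currently free (psutil reports that separately as `.available`), and that dividing by `1e9` yields decimal gigabytes. The sketch below shows how the same byte count reads in decimal GB and binary GiB. The byte count is an illustrative value chosen to match the rounded figure above, not the exact number reported by the runtime.

```python
def bytes_to_gb(n_bytes):
    """Decimal gigabytes (1 GB = 10**9 bytes), as used in the cell above."""
    return n_bytes / 1e9


def bytes_to_gib(n_bytes):
    """Binary gibibytes (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


# Illustrative assumption: a byte count consistent with the printed 27.3 GB.
example_total = 27.3e9

print(f"{bytes_to_gb(example_total):.1f} GB")
print(f"{bytes_to_gib(example_total):.1f} GiB")
```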



### References:
* https://huggingface.co/
* https://arxiv.org/abs/1907.11692

### Install Required Libraries for Transformer Models:

* Pre-trained Transformer models are available through the Hugging Face `transformers` library.
* Similarly, any dataset hosted on Hugging Face can be loaded with the `datasets` library.
* Finally, we use ktrain, a high-level wrapper library, to simplify modelling and prediction.

In [3]:
!pip install ktrain
!pip install transformers
!pip install datasets

Collecting ktrain
  Downloading ktrain-0.27.3.tar.gz (25.3 MB)
Collecting scikit-learn==0.23.2
  Downloading scikit_learn-0.23.2-cp37-cp37m-manylinux1_x86_64.whl (6.8 MB)
Collecting langdetect
  Downloading langdetect-1.0.9.tar.gz (981 kB)
Collecting cchardet
  Downloading cchardet-2.1.7-cp37-cp37m-manylinux2010_x86_64.whl (263 kB)
Collecting syntok
  Downloading syntok-1.3.1.tar.gz (23 kB)
Collecting seqeval==0.0.19
  Downloading seqeval-0.0.19.tar.gz (30 kB)
Collecting transformers<=4.3.3,>=4.0.0
  Downloading transformers-4.3.3-py3-none-any.whl (1.9 MB)
Collecting sentencepiece
  Downloading sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1
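
The install log shows that ktrain 0.27.3 pins `transformers<=4.3.3,>=4.0.0`, so it can be useful to confirm which versions actually ended up in the environment. The following is a minimal sketch using `importlib.metadata` (available from Python 3.8 onward, whereas the log above is from a Python 3.7 runtime); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of the three libraries installed above.
for name in ("ktrain", "transformers", "datasets"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```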

In [4]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import ktrain
from ktrain import text
import tensorflow as tf
from sklearn.model_selection import train_test_split
from datasets import list_datasets
from datasets import load_dataset
from sklearn.metrics import classification_report, confusion_matrix
import timeit
import warnings

pd.set_option('display.max_columns', None)
warnings.simplefilter(action="ignore")

### Loading Emotion Dataset:

In [5]:
emotion_train = load_dataset('emotion', split='train')
emotion_val = load_dataset('emotion', split='validation')
emotion_test = load_dataset('emotion', split='test')
print("Details for Emotion Train Dataset: ", emotion_train.shape)
print("Details for Emotion Validation Dataset: ", emotion_val.shape)
print("Details for Emotion Test Dataset: ", emotion_test.shape)

Using custom data configuration default


Downloading and preparing dataset emotion/default (download: 1.97 MiB, generated: 2.07 MiB, post-processed: Unknown size, total: 4.05 MiB) to /root/.cache/huggingface/datasets/emotion/default/0.0.0/348f63ca8e27b3713b6c04d723efe6d824a56fb3d1449794716c0f0296072705...


Dataset emotion downloaded and prepared to /root/.cache/huggingface/datasets/emotion/default/0.0.0/348f63ca8e27b3713b6c04d723efe6d824a56fb3d1449794716c0f0296072705. Subsequent calls will reuse this data.


Using custom data configuration default
Reusing dataset emotion (/root/.cache/huggingface/datasets/emotion/default/0.0.0/348f63ca8e27b3713b6c04d723efe6d824a56fb3d1449794716c0f0296072705)
Using custom data configuration default
Reusing dataset emotion (/root/.cache/huggingface/datasets/emotion/default/0.0.0/348f63ca8e27b3713b6c04d723efe6d824a56fb3d1449794716c0f0296072705)


Details for Emotion Train Dataset:  (16000, 2)
Details for Emotion Validation Dataset:  (2000, 2)
Details for Emotion Test Dataset:  (2000, 2)
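
As a quick sanity check on the shapes printed above, the split proportions can be worked out directly. This is plain arithmetic on the reported row counts and does not touch the dataset itself.

```python
# Row counts taken from the shapes printed above.
split_sizes = {"train": 16000, "validation": 2000, "test": 2000}

total_rows = sum(split_sizes.values())
fractions = {name: size / total_rows for name, size in split_sizes.items()}

for name, fraction in fractions.items():
    print(f"{name:<10} {split_sizes[name]:>6} rows  {fraction:.0%}")
```

The splits therefore follow an 80/10/10 ratio.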


In [6]:
print("\nTrain Dataset Features for Emotion: \n", emotion_train.features)
print("\nValidation Dataset Features for Emotion: \n", emotion_val.features)
print("\nTest Dataset Features for Emotion: \n", emotion_test.features)


Train Dataset Features for Emotion: 
 {'text': Value(dtype='string', id=None), 'label': ClassLabel(num_classes=6, names=['sadness', 'joy', 'love', 'anger', 'fear', 'surprise'], names_file=None, id=None)}

Validation Dataset Features for Emotion: 
 {'text': Value(dtype='string', id=None), 'label': ClassLabel(num_classes=6, names=['sadness', 'joy', 'love', 'anger', 'fear', 'surprise'], names_file=None, id=None)}

Test Dataset Features for Emotion: 
 {'text': Value(dtype='string', id=None), 'label': ClassLabel(num_classes=6, names=['sadness', 'joy', 'love', 'anger', 'fear', 'surprise'], names_file=None, id=None)}


In [7]:
emotion_train_df = pd.DataFrame(data=emotion_train)
emotion_val_df = pd.DataFrame(data=emotion_val)

In [8]:
class_label_names = ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']
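
The order of this list matters: position `i` must correspond to integer label `i` in the dataset, as shown by the `ClassLabel` names printed earlier. A small sketch of deriving the integer-to-name mapping from the list, rather than typing it out a second time by hand, might look as follows (the variable names `label_dict` and `name_to_id` are used here for illustration).

```python
class_label_names = ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']

# Integer id -> class name, derived from list position.
label_dict = dict(enumerate(class_label_names))

# Class name -> integer id, the inverse mapping.
name_to_id = {name: idx for idx, name in label_dict.items()}

print(label_dict)
print(name_to_id)
```

Deriving both mappings from one list keeps them consistent if the class order ever changes.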

### Instantiating a BERT Instance:

In [9]:
bert_transformer = text.Transformer('bert-base-uncased', maxlen=512, classes=class_label_names, batch_size=6)

In [10]:
# Note: X_test / y_test hold the validation split; the test split is kept for the final evaluation.
X_train = emotion_train_df["text"]
y_train = emotion_train_df["label"]
X_test = emotion_val_df["text"]
y_test = emotion_val_df["label"]
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

(16000,) (16000,) (2000,) (2000,)


In [11]:
bert_train = bert_transformer.preprocess_train(X_train.to_list(), y_train.to_list())
bert_val = bert_transformer.preprocess_test(X_test.to_list(), y_test.to_list())

preprocessing train...
language: en
train sequence lengths:
	mean : 19
	95percentile : 41
	99percentile : 52


Is Multi-Label? False
preprocessing test...
language: en
test sequence lengths:
	mean : 19
	95percentile : 40
	99percentile : 52
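
The log reports a 99th-percentile length of 52 tokens, far below the `maxlen=512` passed to the transformer, so a smaller `maxlen` may be worth trying if memory or training time is a concern. The sketch below shows one way to compute such percentiles. It assumes whitespace tokenization and a nearest-rank percentile, which are simplifications and not necessarily what ktrain does internally; the sample sentences are illustrative.

```python
import math


def nearest_rank_percentile(values, pct):
    """Smallest value such that at least pct percent of the data is <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Illustrative sample; in the notebook this would be X_train.to_list().
sample_texts = [
    "i was feeling a little vain when i did this one",
    "i feel fine",
    "i am not sure how i feel about this",
]

# Assumption: whitespace tokens stand in for the model's real tokenizer.
lengths = [len(text.split()) for text in sample_texts]

for pct in (50, 95, 99):
    print(f"{pct}th percentile: {nearest_rank_percentile(lengths, pct)}")
```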


### Wrap BERT in a ktrain Learner Object:

In [12]:
bert_model = bert_transformer.get_classifier()

In [13]:
bert_learner_ins = ktrain.get_learner(model=bert_model,
                            train_data=bert_train,
                            val_data=bert_val,
                            batch_size=6)

### BERT Model Details:

In [14]:
bert_learner_ins.model.summary()

Model: "tf_bert_for_sequence_classification"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
bert (TFBertMainLayer)       multiple                  109482240 
_________________________________________________________________
dropout_37 (Dropout)         multiple                  0         
_________________________________________________________________
classifier (Dense)           multiple                  4614      
Total params: 109,486,854
Trainable params: 109,486,854
Non-trainable params: 0
_________________________________________________________________
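
The parameter count of the classification head can be verified by hand. A dense layer has one weight per input-output pair plus one bias per output, and the hidden size of `bert-base-uncased` is 768. The sketch below reproduces the numbers in the summary; the encoder count is copied from the summary rather than derived.

```python
hidden_size = 768   # width of the BERT-base output fed to the classifier
num_classes = 6     # sadness, joy, love, anger, fear, surprise

# Dense layer: weight matrix plus bias vector.
classifier_params = hidden_size * num_classes + num_classes

# Encoder parameter count as reported in the model summary above.
encoder_params = 109_482_240

total_params = encoder_params + classifier_params
print(classifier_params, total_params)
```

Because every parameter is trainable, fine-tuning updates the whole encoder and not just the six-way classifier.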


### BERT Optimal Learning Rates:

The paper "**RoBERTa: A Robustly Optimized BERT Pretraining Approach**" (https://arxiv.org/abs/1907.11692) uses the following hyperparameter ranges when fine-tuning:

* Batch Sizes => {16, 32}
* Learning Rates => {1e-5, 2e-5, 3e-5}

We use the largest of these learning rates, 3e-5, for fine-tuning and evaluation. Note that the learner in this notebook was created with a batch size of 6, which is smaller than either batch size in the range above.
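
To make the effect of batch size concrete, the sketch below counts optimizer steps per epoch for the 16,000 training rows at the batch size used in this notebook and at the two batch sizes listed above. It assumes the final partial batch is kept, hence the rounding up.

```python
import math

train_rows = 16_000
epochs = 3


def steps_per_epoch(n_rows, batch_size):
    """Number of batches per epoch, keeping the final partial batch."""
    return math.ceil(n_rows / batch_size)


for batch_size in (6, 16, 32):
    per_epoch = steps_per_epoch(train_rows, batch_size)
    print(f"batch_size={batch_size:>2}: {per_epoch} steps/epoch, "
          f"{per_epoch * epochs} steps in total")
```

A batch size of 6 therefore takes several times as many weight updates per epoch as 16 or 32, which is worth keeping in mind when comparing results against the paper's settings.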

### Fine Tuning BERT on Emotion Dataset:

In [15]:
bert_fine_tune_start_time = timeit.default_timer()
bert_learner_ins.fit_onecycle(lr=3e-5, epochs=3)
bert_fine_tune_stop_time = timeit.default_timer()

print("\nTotal time in minutes for Fine-Tuning BERT on Emotion Dataset: \n", (bert_fine_tune_stop_time - bert_fine_tune_start_time)/60)



begin training using onecycle policy with max lr of 3e-05...
Epoch 1/3
Epoch 2/3
Epoch 3/3

Total time in minutes for Fine-Tuning BERT on Emotion Dataset: 
 58.0473978842


### Checking BERT performance metrics:

In [16]:
bert_learner_ins.validate()

              precision    recall  f1-score   support

           0       0.97      0.97      0.97       550
           1       0.96      0.96      0.96       704
           2       0.87      0.89      0.88       178
           3       0.97      0.91      0.94       275
           4       0.86      0.96      0.91       212
           5       0.89      0.81      0.85        81

    accuracy                           0.94      2000
   macro avg       0.92      0.92      0.92      2000
weighted avg       0.94      0.94      0.94      2000



array([[535,   2,   0,   2,  11,   0],
       [  0, 673,  23,   2,   2,   4],
       [  1,  19, 158,   0,   0,   0],
       [ 15,   2,   0, 249,   9,   0],
       [  1,   0,   0,   3, 204,   4],
       [  1,   4,   0,   0,  10,  66]])

In [17]:
bert_learner_ins.validate(class_names=class_label_names)

              precision    recall  f1-score   support

     sadness       0.97      0.97      0.97       550
         joy       0.96      0.96      0.96       704
        love       0.87      0.89      0.88       178
       anger       0.97      0.91      0.94       275
        fear       0.86      0.96      0.91       212
    surprise       0.89      0.81      0.85        81

    accuracy                           0.94      2000
   macro avg       0.92      0.92      0.92      2000
weighted avg       0.94      0.94      0.94      2000



array([[535,   2,   0,   2,  11,   0],
       [  0, 673,  23,   2,   2,   4],
       [  1,  19, 158,   0,   0,   0],
       [ 15,   2,   0, 249,   9,   0],
       [  1,   0,   0,   3, 204,   4],
       [  1,   4,   0,   0,  10,  66]])
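
The figures in the report follow directly from the confusion matrix, whose rows are true classes and whose columns are predicted classes. As a sketch of that relationship, the snippet below recomputes precision, recall, and accuracy from the matrix printed above using plain Python; the helper name `per_class_metrics` is hypothetical.

```python
class_label_names = ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']

# Validation confusion matrix copied from the output above.
val_confusion = [
    [535,   2,   0,   2,  11,   0],
    [  0, 673,  23,   2,   2,   4],
    [  1,  19, 158,   0,   0,   0],
    [ 15,   2,   0, 249,   9,   0],
    [  1,   0,   0,   3, 204,   4],
    [  1,   4,   0,   0,  10,  66],
]


def per_class_metrics(matrix):
    """Return a list of (precision, recall) tuples, one per class."""
    metrics = []
    for k in range(len(matrix)):
        true_pos = matrix[k][k]
        actual = sum(matrix[k])                     # row sum: true class k
        predicted = sum(row[k] for row in matrix)   # column sum: predicted k
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        metrics.append((precision, recall))
    return metrics


metrics = per_class_metrics(val_confusion)
correct = sum(val_confusion[k][k] for k in range(len(val_confusion)))
accuracy = correct / sum(sum(row) for row in val_confusion)

for name, (precision, recall) in zip(class_label_names, metrics):
    print(f"{name:<9} precision={precision:.2f} recall={recall:.2f}")
print(f"accuracy={accuracy:.4f}")
```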

In [18]:
bert_learner_ins.view_top_losses(preproc=bert_transformer)

----------
id:1599 | loss:6.68 | true:fear | pred:anger)

----------
id:1195 | loss:6.46 | true:anger | pred:joy)

----------
id:1111 | loss:6.11 | true:joy | pred:anger)

----------
id:1987 | loss:5.76 | true:joy | pred:anger)
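
To put these loss values in perspective: if the reported loss is the per-example categorical cross-entropy with a natural logarithm (an assumption, not verified against ktrain's source), then the probability the model assigned to the true class is `exp(-loss)`. The sketch below applies that conversion to the four examples listed above.

```python
import math


def true_class_probability(loss):
    """Invert cross-entropy: loss = -ln(p_true), so p_true = exp(-loss)."""
    return math.exp(-loss)


# Example ids and losses copied from the output above.
top_losses = {1599: 6.68, 1195: 6.46, 1111: 6.11, 1987: 5.76}

for example_id, loss in top_losses.items():
    prob = true_class_probability(loss)
    print(f"id {example_id}: loss {loss:.2f} -> p(true class) = {prob:.4%}")
```

Under that assumption, each of these examples received well under 1% probability for its labelled class, which makes them good candidates for checking whether the label itself is questionable.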



### Creating a BERT Predictor and Evaluating on the Test Set:

In [19]:
bert_predictor = ktrain.get_predictor(bert_learner_ins.model, preproc=bert_transformer)
bert_predictor.get_classes()

['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']

In [20]:
emotion_test_df = pd.DataFrame(data=emotion_test)
print("\nShape of Test Dataset: ", emotion_test_df.shape,"\n\n")
emotion_test_df.head()


Shape of Test Dataset:  (2000, 2) 




| | text | label |
|---|---|---|
| 0 | im feeling rather rotten so im not very ambiti... | 0 |
| 1 | im updating my blog because i feel shitty | 0 |
| 2 | i never make her separate from me because i do... | 0 |
| 3 | i left with my bouquet of red and yellow tulip... | 1 |
| 4 | i was feeling a little vain when i did this one | 0 |


In [21]:
emotion_test_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2000 entries, 0 to 1999
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   text    2000 non-null   object
 1   label   2000 non-null   int64 
dtypes: int64(1), object(1)
memory usage: 31.4+ KB


In [22]:
label_dict = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}
emotion_test_df["label"] = emotion_test_df["label"].map(label_dict)
emotion_test_df.head()

| | text | label |
|---|---|---|
| 0 | im feeling rather rotten so im not very ambiti... | sadness |
| 1 | im updating my blog because i feel shitty | sadness |
| 2 | i never make her separate from me because i do... | sadness |
| 3 | i left with my bouquet of red and yellow tulip... | joy |
| 4 | i was feeling a little vain when i did this one | sadness |


In [23]:
emotion_test_df[emotion_test_df.columns] = emotion_test_df[emotion_test_df.columns].astype(str)

In [24]:
X_test_new = emotion_test_df["text"]
y_test_new = emotion_test_df["label"]
print(X_test_new.shape, y_test_new.shape)

(2000,) (2000,)


In [25]:
test_predictions = bert_predictor.predict(X_test_new.to_list())

In [26]:
print(confusion_matrix(y_test_new, test_predictions))

[[246  13   2   0  14   0]
 [  1 212   0   0   3   8]
 [  3   0 661  27   1   3]
 [  1   0  22 136   0   0]
 [  3   6   2   0 570   0]
 [  0  18   3   0   2  43]]


In [27]:
print(classification_report(y_test_new, test_predictions))

              precision    recall  f1-score   support

       anger       0.97      0.89      0.93       275
        fear       0.85      0.95      0.90       224
         joy       0.96      0.95      0.95       695
        love       0.83      0.86      0.84       159
     sadness       0.97      0.98      0.97       581
    surprise       0.80      0.65      0.72        66

    accuracy                           0.93      2000
   macro avg       0.90      0.88      0.89      2000
weighted avg       0.93      0.93      0.93      2000
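
One detail is easy to miss when reading the test results: because the labels are now strings, scikit-learn orders the classes alphabetically, so the rows of the test confusion matrix are not in the dataset's original index order. The sketch below makes the row order explicit and recomputes per-class recall from the matrix printed above.

```python
class_label_names = ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']

# With string labels and no explicit `labels` argument, rows follow sorted order.
row_order = sorted(class_label_names)

# Test confusion matrix copied from the output above.
test_confusion = [
    [246,  13,   2,   0,  14,   0],
    [  1, 212,   0,   0,   3,   8],
    [  3,   0, 661,  27,   1,   3],
    [  1,   0,  22, 136,   0,   0],
    [  3,   6,   2,   0, 570,   0],
    [  0,  18,   3,   0,   2,  43],
]

# Recall is the diagonal entry divided by the row sum (the class support).
recall_by_label = {
    label: row[i] / sum(row)
    for i, (label, row) in enumerate(zip(row_order, test_confusion))
}

for label, recall in recall_by_label.items():
    print(f"{label:<9} recall={recall:.2f}")
```

To keep the dataset's own class order instead, `confusion_matrix` accepts a `labels` argument, for example `labels=class_label_names`.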

