# Emotion Classification in short texts with BERT

Applying BERT to the problem of multiclass text classification. Our dataset consists of written dialogs, messages and short stories. Each dialog utterance/message is labeled with one of the five emotion categories: joy, anger, sadness, fear, neutral. 

## Workflow: 
1. Import Data
2. Data preprocessing and downloading BERT
3. Training and validation
4. Saving the model

Multiclass text classification with BERT and [ktrain](https://github.com/amaiya/ktrain). Use google colab for a free GPU 

👋  **Let's start** 

In [ ]:
!pip install ktrain
!pip install tensorflow


In [1]:

import os
os.environ['TF_USE_LEGACY_KERAS'] = 'True'

In [2]:
import pandas as pd
import numpy as np



In [3]:
import ktrain
from ktrain import text



In [4]:
data = pd.read_csv("data/Annotated ABSA with Emotions Dataset.csv")

def merge_emotions(emotion):
    if emotion in ['Anger', 'Disgust', 'Fear', 'Sadness']:
        return 'Anger'
    elif emotion == 'Joy':
        return 'Joy'
    else:  # 'Surprise'
        return 'Surprise'
# Apply the function to the 'Emotion Class' column
data['Emotion'] = data['Emotion'].apply(merge_emotions)


In [6]:
data['Emotion'].value_counts()


Emotion
Joy         3132
Anger       1501
Surprise     199
Name: count, dtype: int64

In [7]:
from sklearn.model_selection import train_test_split

data = data[data['polarity'] != 'conflict']
data = data.drop(columns="polarity")

data_train, data_test = train_test_split(data, test_size=0.25)



## 1. Import Data

In [10]:
data_train

Unnamed: 0,Text,Emotion
1308,"The bruscetta is a bit soggy, but the salads w...",Anger
1886,The one vegetarian entree (Abby's treasure) wa...,Surprise
2511,"However, being foodies, we were utterly disapp...",Anger
3683,The place is small and cramped(1) but the food...,Joy
3099,then she made a fuss about not being able to a...,Surprise
...,...,...
1207,I had the cod with paella (spicy and very fill...,Joy
692,"I recommend the garlic shrimp, okra (bindi), a...",Joy
2604,"We been there and we really enjoy(1) the food,...",Joy
568,"However, they've got the most amazing pastrami...",Surprise


In [11]:
data_test

Unnamed: 0,Text,Emotion
1321,"Waiters tend to forget drinks completely, food...",Anger
2914,Save room for deserts - they're to die for.,Joy
1714,"While we enjoyed the food, we were highly disa...",Anger
3953,The service is descent(1) even when this small...,Joy
312,Try their plain pizza with fresh garlic or egg...,Joy
...,...,...
3559,The food itself was just ok - nothing spectacu...,Joy
1272,"Nha Trang, while being notorious for utter lac...",Anger
1479,Delicious food at a great price but do not go ...,Joy
1105,The food is delicious and beautifully prepared...,Joy


In [12]:
X_train = data_train.Text.tolist()
X_test = data_test.Text.tolist()


y_train = data_train.Emotion.tolist()
y_test = data_test.Emotion.tolist()

data = pd.concat([data_train,data_test],ignore_index=True)

class_names = ['Joy','Anger', 'Surprise']

print('size of training set: %s' % (len(data_train['Text'])))
print('size of validation set: %s' % (len(data_test['Text'])))
print(data.Emotion.value_counts())

data

size of training set: 3545
size of validation set: 1182
Emotion
Joy         3062
Anger       1470
Surprise     195
Name: count, dtype: int64


Unnamed: 0,Text,Emotion
0,"The bruscetta is a bit soggy, but the salads w...",Anger
1,The one vegetarian entree (Abby's treasure) wa...,Surprise
2,"However, being foodies, we were utterly disapp...",Anger
3,The place is small and cramped(1) but the food...,Joy
4,then she made a fuss about not being able to a...,Surprise
...,...,...
4722,The food itself was just ok - nothing spectacu...,Joy
4723,"Nha Trang, while being notorious for utter lac...",Anger
4724,Delicious food at a great price but do not go ...,Joy
4725,The food is delicious and beautifully prepared...,Joy


In [14]:
encoding = {'Anger': 0, 'Joy': 1, 'Surprise': 2}

# Integer values for each class
y_train = [encoding[x] for x in y_train]
y_test = [encoding[x] for x in y_test]

## 2. Data preprocessing

* The text must be preprocessed in a specific way for use with BERT. This is accomplished by setting preprocess_mode to ‘bert’. The BERT model and vocabulary will be automatically downloaded

* BERT can handle a maximum length of 512, but let's use less to reduce memory and improve speed. 

In [15]:
(x_train,  y_train), (x_test, y_test), preproc = text.texts_from_array(x_train=X_train, y_train=y_train,
                                                                       x_test=X_test, y_test=y_test,
                                                                       class_names=class_names,
                                                                       preprocess_mode='bert',
                                                                       maxlen=350, 
                                                                       max_features=35000)


preprocessing train...
language: en


Is Multi-Label? False
preprocessing test...
language: en


task: text classification


In [16]:
x_train

[array([[  101,  1996,  7987, ...,  3256,  4059,   102],
        [  101,  1996,  2028, ...,  2019, 26285,   102],
        [  101,  2174,  1010, ...,     0,     0,     0],
        ...,
        [  101,  2057,  2042, ...,  2428,  2204,   102],
        [  101,  2174,  1010, ...,     0,     0,     0],
        [  101,  1996, 25545, ...,  2307,  2005,   102]]),
 array([[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]])]

## 2. Training and validation


Loading the pretrained BERT for text classification 

In [17]:
model = text.text_classifier('bert', train_data=(x_train, y_train), preproc=preproc)

Is Multi-Label? False
maxlen is 30




done.


Wrap it in a Learner object

In [18]:
learner = ktrain.get_learner(model, train_data=(x_train, y_train), 
                             val_data=(x_test, y_test),
                             batch_size=6)

Train the model. More about tuning learning rates [here](https://github.com/amaiya/ktrain/blob/master/tutorial-02-tuning-learning-rates.ipynb)

In [19]:
learner.fit_onecycle(2e-5, 3)




begin training using onecycle policy with max lr of 2e-05...


<tf_keras.src.callbacks.History at 0x1f58c5393d0>

Validation

In [20]:
learner.validate(val_data=(x_test, y_test), class_names=class_names)

              precision    recall  f1-score   support

         Joy       0.78      0.85      0.81       360
       Anger       0.89      0.92      0.90       763
    Surprise       0.00      0.00      0.00        59

    accuracy                           0.85      1182
   macro avg       0.55      0.59      0.57      1182
weighted avg       0.81      0.85      0.83      1182


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


array([[305,  55,   0],
       [ 63, 700,   0],
       [ 25,  34,   0]], dtype=int64)

#### Testing with other inputs

In [21]:
predictor = ktrain.get_predictor(learner.model, preproc)
predictor.get_classes()

['Joy', 'Anger', 'Surprise']

In [27]:
import time 

message = 'Not only was the food outstanding, but the little perks were great.'

start_time = time.time() 
prediction = predictor.predict(message)

print('predicted: {} ({:.2f})'.format(prediction, (time.time() - start_time)))

predicted: Joy (0.71)


## 4. Saving Bert model


In [0]:
# let's save the predictor for later use
predictor.save("models/bert_model")

Done! to reload the predictor use: ktrain.load_predictor