<a href="https://colab.research.google.com/github/KelvinLam05/Sentiment-analysis-From-binary-to-multi-class-classification/blob/main/Multi_class_Emotion_Classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Goal of the project**

In this notebook, we will build an emotion classification model. More specifically, we will build a model that classifies text into six basic emotions: joy, sadness, anger, fear, love, and surprise.

**Data set information**

The dataset contains 16,000 short English texts, each labeled with one of six basic emotions.

In [None]:
# Importing libraries
import pandas as pd
import numpy as np
import ktrain
import tensorflow as tf
from ktrain import text
from sklearn.model_selection import train_test_split

In [None]:
# Load dataset
df = pd.read_csv('/content/Emotions dataset.txt', delimiter = ';', header = None, names = ['text','label'])

In [None]:
# Examine the data
df.head()

Unnamed: 0,text,label
0,i didnt feel humiliated,sadness
1,i can go from feeling so hopeless to so damned...,sadness
2,im grabbing a minute to post i feel greedy wrong,anger
3,i am ever feeling nostalgic about the fireplac...,love
4,i am feeling grouchy,anger


In [None]:
# Overview of all variables, their datatypes
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16000 entries, 0 to 15999
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   text    16000 non-null  object
 1   label   16000 non-null  object
dtypes: object(2)
memory usage: 250.1+ KB


**Preprocessing**

In [None]:
# Checking for missing values
df.isnull().sum().sort_values(ascending = False)

label    0
text     0
dtype: int64

In [None]:
# Checking the distribution of classes
df['label'].value_counts() 

joy         5362
sadness     4666
anger       2159
fear        1937
love        1304
surprise     572
Name: label, dtype: int64

The dataset is clearly imbalanced: joy and sadness together account for more than 60% of the samples, while surprise accounts for under 4%.
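
To make the imbalance concrete, the class shares can be computed directly from the counts printed above. This is a minimal sketch in plain Python; the counts are copied from the `value_counts()` output, and the variable names are only illustrative.

```python
# Class counts copied from the value_counts() output above
counts = {
    "joy": 5362,
    "sadness": 4666,
    "anger": 2159,
    "fear": 1937,
    "love": 1304,
    "surprise": 572,
}

total = sum(counts.values())

# Share of the dataset taken by each class
shares = {label: n / total for label, n in counts.items()}

# Ratio between the largest and the smallest class
imbalance_ratio = max(counts.values()) / min(counts.values())

for label, share in shares.items():
    print(f"{label:<9}{share:6.1%}")
print(f"largest / smallest class: {imbalance_ratio:.1f}x")
```

The largest class is roughly nine times the size of the smallest, which is why the split below is stratified and why macro-averaged metrics are reported later.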

**Split the train and test data**

In [None]:
X = df['text']

In [None]:
y = df['label']

In [None]:
# Isolate X and y variables, and perform train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42, shuffle = True, stratify = y)

In [None]:
from sklearn.preprocessing import LabelEncoder

In [None]:
le = LabelEncoder()

In [None]:
y_train_enc = y_train.copy()

In [None]:
y_train_enc = pd.DataFrame(data = y_train_enc, columns = ['label'])

In [None]:
y_train_enc['label_encoded'] = le.fit_transform(y_train_enc['label'].values)

In [None]:
y_train_enc

Unnamed: 0,label,label_encoded
3431,sadness,4
2664,joy,2
15,joy,2
10548,joy,2
1984,love,3
...,...,...
10110,sadness,4
10114,sadness,4
10506,sadness,4
6081,joy,2


In [None]:
y_train_enc['label_encoded'].unique() 

array([4, 2, 3, 0, 1, 5])

In [None]:
# Label encode the target variable 
le.fit(y_train)
y_train = le.transform(y_train)
y_test = le.transform(y_test)

**Preprocess data and build a transformer model**

In [None]:
# Transformer model
MODEL_NAME = 'roberta-base' 

In [None]:
t = text.Transformer(MODEL_NAME, maxlen = 500, class_names =  ['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'])

We must supply a class_names argument to the Transformer constructor, which tells ktrain how indices map to class names. In this case, class_names = ['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'] because LabelEncoder assigns indices to the labels in sorted (alphabetical) order: 0 = anger, 1 = fear, and so on.
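
The index-to-name mapping can be checked with a few lines of plain Python. The sketch below mirrors the sorted-order rule that `LabelEncoder` follows rather than calling scikit-learn itself; the `labels` list is a small made-up sample, not the real training data.

```python
# A small made-up sample of labels (not the real training data)
labels = ["sadness", "joy", "joy", "love", "anger", "fear", "surprise"]

# LabelEncoder stores the sorted unique labels; mirror that rule here
class_names = sorted(set(labels))

# Index of each class name, i.e. the integer the encoder would assign
label_to_index = {name: index for index, name in enumerate(class_names)}

encoded = [label_to_index[label] for label in labels]

print(class_names)
print(encoded)
```

With the fitted encoder from the cells above, `list(le.classes_)` should give the same ordering, so it could be passed as `class_names` instead of a hand-typed list.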

In [None]:
# Convert training set into a list
X_tr = pd.DataFrame(data = X_train, columns = ['text'])
X_tr = X_tr['text'].tolist()

In [None]:
y_tr = pd.DataFrame(data = y_train, columns = ['label'])
y_tr = y_tr['label'].tolist()

In [None]:
# Convert testing set into a list
X_te = pd.DataFrame(data = X_test, columns = ['text'])
X_te = X_te['text'].tolist()

In [None]:
y_te = pd.DataFrame(data = y_test, columns = ['label'])
y_te = y_te['label'].tolist()

In [None]:
# Preprocessing training and testing set 
trn = t.preprocess_train(X_tr, y_tr)
val = t.preprocess_test(X_te, y_te)

preprocessing train...
language: en
train sequence lengths:
	mean : 19
	95percentile : 40
	99percentile : 52


Is Multi-Label? False
preprocessing test...
language: en
test sequence lengths:
	mean : 19
	95percentile : 42
	99percentile : 53


In [None]:
# Model classifier
model = t.get_classifier()

In [None]:
# Wrap model and data in ktrain.Learner object
learner = ktrain.get_learner(model, train_data = trn, val_data = val, batch_size = 6)

**Estimate a good learning rate**

In Section 3.3 of “Cyclical Learning Rates for Training Neural Networks”, Leslie N. Smith argues that a good learning rate can be estimated by training the model briefly, starting from a very low learning rate and increasing it (either linearly or exponentially) at each iteration, while watching where the loss (or accuracy) improves fastest before it starts to degrade. The values 1e-3, 1e-4, 1e-5, and 0 that Smith suggests trying first apply to weight decay, not to the learning rate: they are starting points when there is no prior notion of a suitable weight decay value.
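
The range test described above can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure used in this notebook: the function names, the sweep bounds, and the `toy_losses` values are all hypothetical, and the "steepest drop" rule is only one common heuristic for reading the curve.

```python
def lr_range_schedule(start_lr, end_lr, num_steps):
    """Learning rates that grow exponentially from start_lr to end_lr."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    growth = (end_lr / start_lr) ** (1 / (num_steps - 1))
    return [start_lr * growth ** step for step in range(num_steps)]


def suggest_lr(lrs, losses):
    """Learning rate at which the loss dropped the most between two steps."""
    drops = [losses[i] - losses[i + 1] for i in range(len(losses) - 1)]
    steepest = max(range(len(drops)), key=drops.__getitem__)
    return lrs[steepest]


# Sweep from 1e-7 to 1e-1 in 7 steps (one decade per step)
lrs = lr_range_schedule(1e-7, 1e-1, 7)

# Made-up losses for illustration only: flat, then falling, then diverging
toy_losses = [1.80, 1.79, 1.75, 1.60, 1.00, 1.50, 4.00]

suggested = suggest_lr(lrs, toy_losses)
print(f"suggested learning rate: {suggested:.0e}")
```

In ktrain, a range test of this kind is available as `learner.lr_find()`. This notebook does not run it and uses a maximum learning rate of 1e-5 directly.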

**Train model**

In the paper “A disciplined approach to neural network hyper-parameters: Part 1 — learning rate, batch size, momentum, and weight decay”, Leslie Smith describes an approach to setting hyper-parameters (namely learning rate, momentum, and weight decay) and batch size. In particular, he suggests applying learning rates with the 1cycle policy.
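
As a rough illustration of the idea, the sketch below builds a triangular one-cycle schedule: the learning rate warms up linearly to a maximum and then decays linearly back. It is a simplified sketch, not ktrain's exact schedule; the function name and the `start_div` parameter are assumptions made for this example, and momentum cycling is left out.

```python
def one_cycle_lr(step, total_steps, max_lr, start_div=10.0):
    """Triangular one-cycle learning rate for a given training step.

    The rate rises linearly from max_lr / start_div to max_lr over the
    first half of training, then falls linearly back over the second half.
    """
    if total_steps < 2:
        raise ValueError("total_steps must be at least 2")
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    base_lr = max_lr / start_div
    half = (total_steps - 1) / 2
    # Normalized distance from the peak: 1 at both ends, 0 in the middle
    distance = abs(step - half) / half
    return max_lr - (max_lr - base_lr) * distance


# Example: 5 steps with the same maximum learning rate used in this notebook
schedule = [one_cycle_lr(step, total_steps=5, max_lr=1e-5) for step in range(5)]
print(schedule)
```

In the cells below, `learner.fit_onecycle(1e-5, 1)` plays this role for one epoch with a maximum learning rate of 1e-5.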

In [None]:
# Training using the 1cycle policy
learner.fit_onecycle(1e-5, 1)



begin training using onecycle policy with max lr of 1e-05...


<keras.callbacks.History at 0x7f2b67fdcbd0>

**Evaluate/Inspect model**

In [None]:
# Evaluate model
learner.validate(class_names = t.get_classes())

              precision    recall  f1-score   support

       anger       0.91      0.90      0.91       432
        fear       0.92      0.86      0.89       388
         joy       0.95      0.90      0.93      1072
        love       0.79      0.85      0.82       261
     sadness       0.94      0.95      0.94       933
    surprise       0.71      1.00      0.83       114

    accuracy                           0.91      3200
   macro avg       0.87      0.91      0.89      3200
weighted avg       0.92      0.91      0.91      3200



array([[390,  15,   4,   1,  22,   0],
       [  6, 334,   1,   0,  14,  33],
       [  7,   5, 969,  58,  21,  12],
       [  2,   1,  35, 221,   2,   0],
       [ 23,   9,  11,   1, 887,   2],
       [  0,   0,   0,   0,   0, 114]])

The macro-average is preferable when there is a class imbalance problem.

With the macro-average, every class contributes equally to the score, so a classifier is encouraged to recognize every class correctly. Since small classes are usually harder to identify, this often costs some performance on the large classes.

With the micro-average, by contrast, every sample contributes equally, so a classifier is encouraged to focus on the largest classes, possibly at the expense of the smallest ones.

We achieve a macro-averaged F1-score of 0.89 (overall accuracy: 91%).
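
The difference between the two averages can be checked by recomputing them from the confusion matrix printed above for roberta-base. This is a minimal sketch in plain Python, assuming rows are true classes and columns are predicted classes (the row sums match the support column of the report).

```python
# Confusion matrix copied from the roberta-base output above
# (rows: true class, columns: predicted class)
confusion = [
    [390, 15, 4, 1, 22, 0],    # anger
    [6, 334, 1, 0, 14, 33],    # fear
    [7, 5, 969, 58, 21, 12],   # joy
    [2, 1, 35, 221, 2, 0],     # love
    [23, 9, 11, 1, 887, 2],    # sadness
    [0, 0, 0, 0, 0, 114],      # surprise
]

num_classes = len(confusion)

per_class_f1 = []
for k in range(num_classes):
    true_positives = confusion[k][k]
    actual = sum(confusion[k])                     # samples truly in class k
    predicted = sum(row[k] for row in confusion)   # samples predicted as k
    # F1 = 2TP / (2TP + FP + FN) = 2TP / (actual + predicted)
    per_class_f1.append(2 * true_positives / (actual + predicted))

# Macro average: every class counts equally
macro_f1 = sum(per_class_f1) / num_classes

# Micro average: every sample counts equally; for single-label
# classification this equals overall accuracy
correct = sum(confusion[k][k] for k in range(num_classes))
total = sum(sum(row) for row in confusion)
accuracy = correct / total

print(f"macro F1: {macro_f1:.3f}")
print(f"accuracy (micro F1): {accuracy:.3f}")
```

The macro-averaged F1-score comes out lower than the accuracy because the two weakest classes, love and surprise, weigh as much in it as the much larger joy and sadness classes.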

**Preprocess data and build a transformer model**

In [None]:
# Transformer model
MODEL_NAME = 'bhadresh-savani/distilbert-base-uncased-emotion'  

In [None]:
t = text.Transformer(MODEL_NAME, maxlen = 500, class_names = ['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'])

In [None]:
# Convert training set into a list
X_tr = pd.DataFrame(data = X_train, columns = ['text'])
X_tr = X_tr['text'].tolist()

In [None]:
y_tr = pd.DataFrame(data = y_train, columns = ['label'])
y_tr = y_tr['label'].tolist()

In [None]:
# Convert testing set into a list
X_te = pd.DataFrame(data = X_test, columns = ['text'])
X_te = X_te['text'].tolist()

In [None]:
y_te = pd.DataFrame(data = y_test, columns = ['label'])
y_te = y_te['label'].tolist()

In [None]:
# Pre-process training and testing sets
trn = t.preprocess_train(X_tr, y_tr)
val = t.preprocess_test(X_te, y_te)

preprocessing train...
language: en
train sequence lengths:
	mean : 19
	95percentile : 40
	99percentile : 52


Is Multi-Label? False
preprocessing test...
language: en
test sequence lengths:
	mean : 19
	95percentile : 42
	99percentile : 53


In [None]:
# Model classifier
model = t.get_classifier()

In [None]:
# Wrap model and data in ktrain.Learner object
learner = ktrain.get_learner(model, train_data = trn, val_data = val, batch_size = 6)

**Train model**

In [None]:
# Training using the 1cycle policy
learner.fit_onecycle(1e-5, 1)



begin training using onecycle policy with max lr of 1e-05...


<keras.callbacks.History at 0x7f2bfa04fad0>

**Evaluate/Inspect model**

In [None]:
# Evaluate model
learner.validate(class_names = t.get_classes())

              precision    recall  f1-score   support

       anger       0.99      0.99      0.99       432
        fear       0.95      0.98      0.97       388
         joy       0.98      0.99      0.98      1072
        love       0.96      0.92      0.94       261
     sadness       1.00      1.00      1.00       933
    surprise       0.94      0.87      0.90       114

    accuracy                           0.98      3200
   macro avg       0.97      0.96      0.96      3200
weighted avg       0.98      0.98      0.98      3200



array([[ 426,    3,    0,    0,    3,    0],
       [   2,  382,    0,    0,    1,    3],
       [   0,    0, 1058,   11,    0,    3],
       [   1,    0,   20,  240,    0,    0],
       [   2,    1,    0,    0,  930,    0],
       [   0,   15,    0,    0,    0,   99]])

We achieve a macro-averaged F1-score of 0.96 (overall accuracy: 98%).

**Preprocess data and build a transformer model**

In [None]:
# Transformer model
MODEL_NAME = 'distilbert-base-uncased'  

In [None]:
t = text.Transformer(MODEL_NAME, maxlen = 500, class_names = ['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'])

In [None]:
# Convert training set into a list
X_tr = pd.DataFrame(data = X_train, columns = ['text'])
X_tr = X_tr['text'].tolist()

In [None]:
y_tr = pd.DataFrame(data = y_train, columns = ['label'])
y_tr = y_tr['label'].tolist()

In [None]:
# Convert testing set into a list
X_te = pd.DataFrame(data = X_test, columns = ['text'])
X_te = X_te['text'].tolist()

In [None]:
y_te = pd.DataFrame(data = y_test, columns = ['label'])
y_te = y_te['label'].tolist()

In [None]:
# Pre-process training and testing sets 
trn = t.preprocess_train(X_tr, y_tr)
val = t.preprocess_test(X_te, y_te)

preprocessing train...
language: en
train sequence lengths:
	mean : 19
	95percentile : 40
	99percentile : 52


Is Multi-Label? False
preprocessing test...
language: en
test sequence lengths:
	mean : 19
	95percentile : 42
	99percentile : 53


In [None]:
# Model classifier
model = t.get_classifier()

In [None]:
# Wrap model and data in ktrain.Learner object
learner = ktrain.get_learner(model, train_data = trn, val_data = val, batch_size = 6)

In [None]:
# Training using the 1cycle policy
learner.fit_onecycle(1e-5, 1)



begin training using onecycle policy with max lr of 1e-05...


<keras.callbacks.History at 0x7f5e67197a10>

In [None]:
# Evaluate model
learner.validate(class_names = t.get_classes())

              precision    recall  f1-score   support

       anger       0.94      0.90      0.92       432
        fear       0.90      0.91      0.90       388
         joy       0.95      0.90      0.93      1072
        love       0.75      0.91      0.82       261
     sadness       0.94      0.96      0.95       933
    surprise       0.82      0.81      0.81       114

    accuracy                           0.92      3200
   macro avg       0.88      0.90      0.89      3200
weighted avg       0.92      0.92      0.92      3200



array([[390,   5,   5,   3,  29,   0],
       [  9, 352,   3,   0,   8,  16],
       [  8,   3, 970,  76,  12,   3],
       [  0,   1,  18, 237,   5,   0],
       [  7,  15,  14,   2, 894,   1],
       [  0,  16,   6,   0,   0,  92]])

We achieve a macro-averaged F1-score of 0.89 (overall accuracy: 92%).

**Preprocess data and build a transformer model**

In [None]:
# Transformer model
MODEL_NAME = 'bert-base-uncased'  

In [None]:
t = text.Transformer(MODEL_NAME, maxlen = 500, class_names = ['anger', 'fear', 'joy', 'love', 'sadness', 'surprise'])

In [None]:
# Convert training set into a list
X_tr = pd.DataFrame(data = X_train, columns = ['text'])
X_tr = X_tr['text'].tolist()

In [None]:
y_tr = pd.DataFrame(data = y_train, columns = ['label'])
y_tr = y_tr['label'].tolist()

In [None]:
# Convert testing set into a list
X_te = pd.DataFrame(data = X_test, columns = ['text'])
X_te = X_te['text'].tolist()

In [None]:
y_te = pd.DataFrame(data = y_test, columns = ['label'])
y_te = y_te['label'].tolist()

In [None]:
# Pre-process training and testing sets 
trn = t.preprocess_train(X_tr, y_tr)
val = t.preprocess_test(X_te, y_te)

preprocessing train...
language: en
train sequence lengths:
	mean : 19
	95percentile : 40
	99percentile : 52


Is Multi-Label? False
preprocessing test...
language: en
test sequence lengths:
	mean : 19
	95percentile : 42
	99percentile : 53


In [None]:
# Model classifier
model = t.get_classifier()

In [None]:
# Wrap model and data in ktrain.Learner object
learner = ktrain.get_learner(model, train_data = trn, val_data = val, batch_size = 6)

In [None]:
# Training using the 1cycle policy
learner.fit_onecycle(1e-5, 1)



begin training using onecycle policy with max lr of 1e-05...


<keras.callbacks.History at 0x7f45041676d0>

In [None]:
# Evaluate model
learner.validate(class_names = t.get_classes())

              precision    recall  f1-score   support

       anger       0.92      0.91      0.91       432
        fear       0.87      0.94      0.90       388
         joy       0.95      0.92      0.93      1072
        love       0.78      0.83      0.81       261
     sadness       0.94      0.95      0.95       933
    surprise       0.89      0.75      0.82       114

    accuracy                           0.92      3200
   macro avg       0.89      0.88      0.89      3200
weighted avg       0.92      0.92      0.92      3200



array([[392,  14,   3,   2,  21,   0],
       [  9, 366,   0,   0,   9,   4],
       [  7,   3, 981,  57,  19,   5],
       [  1,   1,  38, 217,   4,   0],
       [ 18,  17,   8,   1, 887,   2],
       [  0,  22,   6,   0,   0,  86]])

We achieve a macro-averaged F1-score of 0.89 (overall accuracy: 92%).

**Conclusion**

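Of the four models we trained, DistilBERT (emotion), i.e. bhadresh-savani/distilbert-base-uncased-emotion, performs best: it achieves a macro-averaged F1-score of 0.96 (98% accuracy), with an F1-score of at least 0.90 for every class. The other three models all reach a macro-averaged F1-score of 0.89. As its name indicates, this checkpoint had already been fine-tuned for emotion classification before our training run, so part of its lead likely reflects that head start rather than the architecture alone.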
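
For reference, the four runs can be placed side by side. The sketch below only collects the accuracy and macro-averaged F1 values already printed in the classification reports above; the `results` dictionary and the ranking by macro F1 are conveniences for this summary.

```python
# Accuracy and macro-averaged F1 copied from the classification reports above
results = {
    "roberta-base": {"accuracy": 0.91, "macro_f1": 0.89},
    "bhadresh-savani/distilbert-base-uncased-emotion": {"accuracy": 0.98, "macro_f1": 0.96},
    "distilbert-base-uncased": {"accuracy": 0.92, "macro_f1": 0.89},
    "bert-base-uncased": {"accuracy": 0.92, "macro_f1": 0.89},
}

# Rank models by macro-averaged F1, the metric preferred for imbalanced data
ranked = sorted(results.items(), key=lambda item: item[1]["macro_f1"], reverse=True)
best_model = ranked[0][0]

for name, scores in ranked:
    print(f"{name:<50} acc={scores['accuracy']:.2f}  macro F1={scores['macro_f1']:.2f}")
```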