# BERT Transformer

In [None]:
# Import the BertModel wrapper class from the local classes module
from classes import BertModel

In [2]:
# Relevant libraries
import pandas as pd
import torch
from sklearn.utils import resample
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
from transformers import (
    DistilBertTokenizerFast,
    DistilBertForSequenceClassification,
    Trainer,
    TrainingArguments,
    EarlyStoppingCallback,
    get_scheduler,
)
from torch.utils.data import Dataset

## 1. Binary classification (Positive vs Negative emotion)

In [3]:
# For binary classification (Positive vs Negative emotion)
binary_model = BertModel(data_path="df_raw.csv", model_type="distilbert-base-uncased", mode="binary")

# Load and process data
binary_model.load_and_process_data()

# Tokenize the data
binary_model.tokenize_data()

# Build the model
binary_model.build_model()

# Train the model (you can adjust epochs and batch size as needed)
binary_model.train_model(epochs=3, batch_size=8)

# Evaluate the model
binary_model.evaluate_model()


Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss
1,0.6911,0.696872
2,0.6806,0.678519
3,0.6586,0.616132


Classification Report (binary classification):
              precision    recall  f1-score   support

           0       0.71      0.76      0.74        46
           1       0.69      0.64      0.67        39

    accuracy                           0.71        85
   macro avg       0.70      0.70      0.70        85
weighted avg       0.71      0.71      0.70        85

Confusion Matrix (binary classification):
[[35 11]
 [14 25]]


The DistilBERT binary classifier performs moderately on sentiment analysis of emotions in tweets. Training and validation loss both fall across the three epochs, but the validation loss stays close to ln 2 ≈ 0.693 (the cross-entropy of a chance-level two-class prediction) for the first two epochs, so the model has likely not converged and may benefit from further training. Precision is 0.71 for class 0 (likely negative sentiment) and 0.69 for class 1 (likely positive sentiment). Recall is higher for class 0 (0.76) than for class 1 (0.64), so the model misses more positive tweets than negative ones. The F1-scores follow the same pattern: 0.74 for class 0 against 0.67 for class 1. Reading the confusion matrix with rows as true labels and columns as predictions, there are 35 true negatives, 25 true positives, 11 false positives, and 14 false negatives. Overall accuracy is 71% on a test set of only 85 tweets, so these figures carry considerable uncertainty, and the clearest room for improvement is in detecting positive emotions.
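The per-class figures in the report can be re-derived from the confusion matrix by hand. The sketch below assumes scikit-learn's convention (rows are true labels, columns are predicted labels) and treats class 1 as the positive class; the helper name `binary_metrics` is illustrative and is not part of the notebook's `BertModel` class.

```python
def binary_metrics(cm):
    """Precision, recall, F1 and accuracy for class 1 from a 2x2 confusion matrix.

    Assumes rows are true labels and columns are predicted labels.
    """
    (tn, fp), (fn, tp) = cm
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "tn": tn, "fp": fp, "fn": fn, "tp": tp,
        "precision": precision, "recall": recall,
        "f1": f1, "accuracy": accuracy,
    }


# Confusion matrix printed by the binary model above
binary_cm = [[35, 11], [14, 25]]
metrics = binary_metrics(binary_cm)

for name, value in metrics.items():
    # Counts are integers, ratios are floats
    print(f"{name}: {value:.2f}" if isinstance(value, float) else f"{name}: {value}")
```

The rounded values agree with the class 1 row of the classification report (precision 0.69, recall 0.64, F1 0.67) and with the reported accuracy of 0.71.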


## 2. Multi-class classification (Negative, Neutral, Positive emotion)

In [4]:
# For multi-class classification (Negative, Neutral, Positive emotion)
multi_model = BertModel(data_path="df_raw.csv", model_type="distilbert-base-uncased", mode="multi")

# Load and process data
multi_model.load_and_process_data()

# Tokenize the data
multi_model.tokenize_data()

# Build the model
multi_model.build_model()

# Train the model (you can adjust epochs and batch size as needed)
multi_model.train_model(epochs=3, batch_size=8)

# Evaluate the model
multi_model.evaluate_model()


Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch,Training Loss,Validation Loss
1,0.5272,0.455341
2,0.1493,0.381938
3,0.1919,0.42089


Classification Report (multi classification):
              precision    recall  f1-score   support

           0       0.94      0.99      0.97       361
           1       0.87      0.85      0.86       332
           2       0.88      0.84      0.86       332

    accuracy                           0.90      1025
   macro avg       0.90      0.89      0.89      1025
weighted avg       0.90      0.90      0.90      1025

Confusion Matrix (multi classification):
[[359   0   2]
 [ 15 281  36]
 [  9  43 280]]


The multi-class model reaches 90% overall accuracy on 1,025 test examples. Class 0 is predicted most reliably, with precision 0.94 and recall 0.99. Classes 1 and 2 are weaker, with precision of 0.87 and 0.88 and recall of 0.85 and 0.84 respectively. The confusion matrix shows why: 79 of the 105 errors are confusions between class 1 and class 2 (36 class 1 examples predicted as class 2, and 43 the other way), while class 0 is almost never mistaken for either. Validation loss also rises from 0.38 to 0.42 in the third epoch, which may be an early sign of overfitting. Overall, performance is strong, although it is noticeably better on class 0 than on the other two classes.
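To see where the multi-class errors concentrate, the confusion matrix can be broken down directly. This is a minimal sketch that assumes rows are true labels and columns are predictions, as in scikit-learn; the variable names are illustrative.

```python
# Confusion matrix printed by the multi-class model above
multi_cm = [
    [359, 0, 2],
    [15, 281, 36],
    [9, 43, 280],
]

n_classes = len(multi_cm)
total = sum(sum(row) for row in multi_cm)
correct = sum(multi_cm[k][k] for k in range(n_classes))
errors = total - correct

# Per-class recall: correct predictions divided by the true count of that class
recall = [multi_cm[k][k] / sum(multi_cm[k]) for k in range(n_classes)]

# Errors in which class 1 and class 2 are confused with each other
confused_1_2 = multi_cm[1][2] + multi_cm[2][1]

print(f"accuracy: {correct / total:.2f}")
print("recall per class:", [round(r, 2) for r in recall])
print(f"class 1/2 confusions: {confused_1_2} of {errors} errors")
```

Roughly three quarters of the mistakes fall in the class 1 / class 2 cells, which is where extra training data or error analysis is most likely to pay off.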

### Results of the two BERT models

In [1]:
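# Summary metrics copied from the classification reports above.
# Binary row: precision, recall and F1 match the class 0 row of the report.
# Multiclass row: precision, recall and F1 are the macro averages.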
data = {
    'Model': ['BERT Binary', 'BERT Multiclass'],
    'Accuracy': [0.71, 0.90],
    'Precision': [0.71, 0.90],
    'Recall': [0.76, 0.89],
    'F1-Score': [0.74, 0.89]
}

# Create DataFrame
df = pd.DataFrame(data)

# Display the table
print(df)


             Model  Accuracy  Precision  Recall  F1-Score
0      BERT Binary      0.71       0.71    0.76      0.74
1  BERT Multiclass      0.90       0.90    0.89      0.89


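The BERT Multiclass model scores higher than the BERT Binary model on every metric in the table (accuracy 0.90 against 0.71). The comparison needs some care, though: the two models solve different tasks and are evaluated on test sets of very different size (1,025 examples against 85), and the binary row reports class 0 precision, recall and F1 while the multiclass row reports macro averages. The gap may therefore reflect the amount of data available to each model as much as BERT's ability to handle multi-class sentiment classification.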
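To put both models on the same footing, macro-averaged metrics can be recomputed from the two confusion matrices shown earlier. The sketch below assumes rows are true labels and columns are predictions; `macro_metrics` is a hypothetical helper written for this illustration, not a method of `BertModel`.

```python
def macro_metrics(cm):
    """Accuracy plus macro-averaged precision, recall and F1.

    Assumes a square confusion matrix with true labels as rows
    and predicted labels as columns.
    """
    n = len(cm)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    return (accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n)


confusion_matrices = {
    "BERT Binary": [[35, 11], [14, 25]],
    "BERT Multiclass": [[359, 0, 2], [15, 281, 36], [9, 43, 280]],
}

# One row per model: accuracy, macro precision, macro recall, macro F1
summary = {
    name: tuple(round(value, 2) for value in macro_metrics(cm))
    for name, cm in confusion_matrices.items()
}

for name, row in summary.items():
    print(name, row)
```

With the same averaging applied to both rows, the binary model comes out at about 0.70 for precision, recall and F1, slightly below the class 0 values used in the table above.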