# **Real-Time Facial Emotion Recognition (ITS - Computer Vision)**

## Project Content <a id = 0></a>

### Contents

- [Introduction](#1)
- [Setup](#2)
- [Dataset](#3)
- [Data pipeline](#4)
- [Preprocessing](#5)
- [Train/val split](#6)
- [Training utils](#7)
- [Model](#8)
- [Train](#9)
- [Load & history](#10)
- [Curves](#11)
- [Test eval](#12)

## Setup

## 1. Introduction <a id = 1></a>

### Project Overview

Goal: build a real-time facial emotion recognition demo for the ITS Computer Vision course. The pipeline covers face detection, preprocessing, emotion classification, and on-screen visualization (bounding box, label, confidence, FPS).

- Target classes: angry, disgust, fear, happy, sad, surprise, neutral.
- Real-time inference loop: camera → face detection → preprocessing (crop/align, normalize, resize) → emotion model → optional smoothing → visualization.

### Dataset and Model

- Primary dataset: FER2013 (see `data/link_to_data.txt` for the Kaggle source).
- 48×48 grayscale images; naturally imbalanced classes; consider augmentation and temporal smoothing.
- Suggested model: lightweight CNN (Keras/TensorFlow) or PyTorch equivalent. Target ≥ 20 FPS on a mid-range laptop.

### Tech Stack (suggested)

- Python 3.9+
- OpenCV for video I/O and basic CV ops
- TensorFlow/Keras or PyTorch for the emotion model
- NumPy/Pandas/Matplotlib/Seaborn for preprocessing/analysis

### Ethical Use and Limitations

- Emotion recognition is probabilistic and sensitive to lighting, pose, occlusions, and dataset bias.
- Educational use only; do not use for high-stakes decisions; obtain consent before capturing or analyzing video.

### Roadmap

1) Use the webcam to extract the face.
2) Convert the face crop to match the dataset style.
3) Train on FER2013 (or load a pretrained model).
4) Run the webcam image through the model.
5) Overlay the predicted label and confidence on the face box.

[Project Content](#0)

## 2. Setup <a id = 2></a>

Import libraries.

In [1]:
# Basic Python Packages
import os
import random
import time
import pickle

# Numpy Library
import numpy as np

# Pandas Library and Settings
import pandas as pd

# Visualization Libraries (Matplotlib, Seaborn)
import matplotlib.pyplot as plt
import seaborn as sns

# SKLearn Libarary
from sklearn.metrics import confusion_matrix as sk_confusion_matrix

# Tensorflow Library
import tensorflow as tf
    
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Flatten, BatchNormalization, Dropout
from tensorflow.keras import callbacks
from tensorflow.keras.callbacks import Callback
from tensorflow.keras.utils import plot_model, to_categorical
from keras.metrics.accuracy_metrics import Accuracy, CategoricalAccuracy
from tensorflow.keras.models import load_model

# OpenCV
import cv2

%matplotlib inline

ModuleNotFoundError: No module named 'tensorflow'

In [None]:
gpus = tf.config.experimental.list_physical_devices("GPU")

for gpu in gpus:
    print(gpu)

In [None]:
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

In [None]:
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
config.gpu_options.visible_device_list = "0,1"

session = tf.compat.v1.Session(config=config)

tf.compat.v1.keras.backend.set_session(session)

Changing tensorflow settings.

In [None]:
gpus = tf.config.experimental.list_physical_devices("GPU")
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

[Project Content](#0)

## 3. Dataset <a id = 3></a>

Set the dataset directory and check it.

In [None]:

data_directory = "../data/archive"

print(f"{os.listdir(data_directory)}")

['test', 'train']


In [None]:
train_directory = os.path.join(data_directory, "train")
test_directory = os.path.join(data_directory, "test")

print(f"Data directory     {data_directory}")
print(f"Train directory    {train_directory}")
print(f"Test directory     {test_directory}")

List train/test folders.

In [None]:
print(f"{os.listdir(train_directory)}")
print(f"{os.listdir(test_directory)}")

We have seven classes. Count images per class.

Define class names.

In [None]:
expressions_list = os.listdir(train_directory)

In [None]:
train_dataset_info_df = pd.DataFrame(columns=["Expression", "Size", "Proportion %"])
train_expression_size = []

for expression in expressions_list:
    
    index = expressions_list.index(expression)
    
    train_expression_directory = os.path.join(train_directory, expression)
    train_expression_size.append(len(os.listdir(train_expression_directory)))
    
train_expression_proportion = [round((expression_size / sum(train_expression_size)) * 100, 2)\
                               for expression_size in train_expression_size]

train_dataset_info_df["Expression"] = expressions_list
train_dataset_info_df["Size"] = train_expression_size
train_dataset_info_df["Proportion %"] = train_expression_proportion

total_size = train_dataset_info_df["Size"].sum()
total_proportion = train_dataset_info_df["Proportion %"].sum()

total_row = pd.DataFrame({"Expression": ["Total"],
                          "Size": [total_size],
                          "Proportion %": [total_proportion]})

train_dataset_info_df = pd.concat([train_dataset_info_df, total_row], ignore_index=True)

train_dataset_info_df = train_dataset_info_df.style
train_dataset_info_df = train_dataset_info_df.apply(lambda x: ['background-color: green' if\
                                                    i == len(x)-1 else ''\
                                                    for i in range(len(x))], axis=0)

train_dataset_info_df

In [None]:
test_dataset_info_df = pd.DataFrame(columns=["Expression", "Size", "Proportion %"])
test_expression_size = []

for expression in expressions_list:
    
    index = expressions_list.index(expression)
    
    test_expression_directory = os.path.join(test_directory, expression)
    test_expression_size.append(len(os.listdir(test_expression_directory)))
    
test_expression_proportion = [round((expression_size / sum(test_expression_size)) * 100, 2)\
                              for expression_size in test_expression_size]

test_dataset_info_df["Expression"] = expressions_list
test_dataset_info_df["Size"] = test_expression_size
test_dataset_info_df["Proportion %"] = test_expression_proportion

total_size = test_dataset_info_df["Size"].sum()
total_proportion = test_dataset_info_df["Proportion %"].sum()

total_row = pd.DataFrame({"Expression": ["Total"],
                          "Size": [total_size],
                          "Proportion %": [total_proportion]})

test_dataset_info_df = pd.concat([test_dataset_info_df, total_row], ignore_index=True)

test_dataset_info_df = test_dataset_info_df.style
test_dataset_info_df = test_dataset_info_df.apply(lambda x: ['background-color: green'\
                                                  if i == len(x)-1 else ''\
                                                  for i in range(len(x))], axis=0)

test_dataset_info_df

It seems that about 20% of the data is in the test set and the rest are in the train set.

The training dataset is imbalanced, with different proportions for each expression category.</br>
Imbalanced datasets can pose challenges during training, as the model may be biased towards the majority class(es) and struggle to accurately classify the minority classes.

Let's check if the distribution of different facial expressions images the same in both train and test sets.

In [None]:
title_font = {"family" : "arial", "color" : "k", "weight" : "bold", "size" : 14}
axes_font = {"family" : "arial", "color" : "grey", "weight" : "bold", "size" : 12}

sorted_index = np.argsort(train_expression_size)[::-1]

values = np.array(train_expression_size)[sorted_index]
label = np.array(expressions_list)[sorted_index]

colors = plt.cm.Set3(np.linspace(0, 1, len(values)))

plt.bar(label, values, color=colors)

plt.xlabel("Expressions", fontdict=axes_font)
plt.ylabel("Train Expression Size", fontdict=axes_font)
plt.title("Train Expression Size by Expression", fontdict=title_font)

plt.gca().spines["top"].set_visible(False)
plt.gca().spines["right"].set_visible(False)

In [None]:
title_font = {"family" : "arial", "color" : "k", "weight" : "bold", "size" : 14}
axes_font = {"family" : "arial", "color" : "grey", "weight" : "bold", "size" : 12}

sorted_index = np.argsort(test_expression_size)[::-1]

values = np.array(test_expression_size)[sorted_index]
label = np.array(expressions_list)[sorted_index]

colors = plt.cm.Set3(np.linspace(0, 1, len(values)))

plt.bar(label, values, color=colors)

plt.xlabel("Expressions", fontdict=axes_font)
plt.ylabel("Test Expression Size", fontdict=axes_font)
plt.title("Test Expression Size by Expression", fontdict=title_font)

plt.gca().spines["top"].set_visible(False)
plt.gca().spines["right"].set_visible(False)

Let's check one image for each expression to get better understanding of the dataset.

In [None]:
title_font = {"family" : "arial", "color" : "k", "weight" : "bold", "size" : 14}

fig, axes = plt.subplots(1, len(expressions_list), figsize=(len(expressions_list) * 4, 4))

i = 0

while i < len(expressions_list):
    
    expression = expressions_list[i]

    expression_directory = os.path.join(train_directory, expression)
    images_list = os.listdir(expression_directory)
    
    image_directory = os.path.join(expression_directory, random.choice(images_list))
    
    image = cv2.imread(image_directory, cv2.IMREAD_GRAYSCALE)
        
    axes[i].imshow(image, cmap="gray")
    axes[i].set_title(expression, fontdict=title_font)
    
    i += 1

plt.show()

Nice. Let's get deeper to check the images as numpy arrays.

In [None]:
print(f"Images shape is: {image.shape}")

And for the last step, we want to get assured that all field have the same format type.

In [None]:
formats = []

for dir in [train_directory, test_directory]:

    i = 0

    print(f"Checking {dir.split('/')[1]} data:")
    
    while i < len(expressions_list):
        
        expression = expressions_list[i]

        expression_directory = os.path.join(dir, expression)
        images_list = os.listdir(expression_directory)
        
        for image in images_list:
                    
            format = image.split(".")[1]
            
            if format not in formats:
                formats.append(format)

        print(f"    {expression} Checked.")
        
        i += 1

print("-"*30)
print(f"File formats are: {formats}")

Since everything is ok, we can get further and build a data pypeline.

[Project Content](#0)

## 4. Data pipeline <a id = 4></a>

First of all, let's create a dataset from image files to use them for training the neural network.</br>
In this step, we also label the data.

In [None]:
data = tf.keras.preprocessing.image_dataset_from_directory(train_directory,
                                                           image_size=(48, 48),
                                                           batch_size=64,
                                                           color_mode="grayscale")

print(f"Data type is:       {type(data)}")

The calsses are assigned like the way below.

In [None]:
expressions_list = data.class_names

expressions_list

In order to use the data, we should iterate over it using the numpy iterator.

In [None]:
data_iterator = data.as_numpy_iterator()

data_iterator

To use this iterated data, we have to get batch from it to feed the neural network.</br>
This step will be completed in the training stage.

In [None]:
batch = data_iterator.next()

print(f"Each batch has {len(batch)} parts of data.")
print(f"Each batch's images part has the shape of {batch[0].shape}")
print(f"Each batch's images part has the shape of {batch[1].shape}")

This means that each batch has two parts of data, images and labels.</br>
Images part has 32 images which are 

Before going further, let's check the train dataset by their labels.

In [None]:
title_font = {"family" : "arial", "color" : "k", "weight" : "bold", "size" : 14}

print("Facial expressions are:")

for expression in expressions_list:
    print(f"    {expressions_list.index(expression)}. {expression}")

indexes = np.random.randint(0, batch[0].shape[0], 14)

fig, axes = plt.subplots(2, 7, figsize=(28, 8))

i = 0
j = 0

for index in indexes:
    
    axes[i, j].imshow(batch[0][index].astype(int), cmap="gray")
    axes[i, j].set_title(batch[1][index], fontdict=title_font)

    j += 1
    
    if j==7:
        i = 1
        j = 0
        
plt.show()

[Project Content](#0)

## Preprocessing

## 5. Scaling The Dataset <a id = 5></a>

Firstly, let's check the data values' minimum and maximum.

In [None]:
print(f"Data Minimum: {batch[0].min()}")
print(f"Data Maximum: {batch[0].max()}")

Like most of the times the range of values is between 0 and 255.</br>
Now we can scale the data by deviding its values by 255 to make the learning process faster.

We can do this by using the map function in data pipeline.

In [None]:
data = data.map(lambda x, y: (x/255., y))

Now, we've created the data scaler and the scaler is now embedded in the data pypeline.

Let's check the next batch.

In [None]:
batch = data.as_numpy_iterator().next()

print(f"Data Minimum: {batch[0].min()}")
print(f"Data Maximum: {batch[0].max()}")

As we know, this won't make any change in the images appearance.

[Project Content](#0)

## 6. Train/val split <a id = 6></a>

Since the train and validation data is a package of data named train we should split that to validate the model when training it.

In [None]:
train_size = int(len(data)*0.875)
validation_size = int(len(data)*0.125)

print(f"The train dataset size will be {train_size}.")
print(f"The validation dataset size will be {validation_size}.")

In [None]:
train = data.take(train_size)
validation = data.skip(train_size).take(validation_size)

[Project Content](#0)

## Modeling

## 7. Training utils <a id = 7></a>

In order to create a deep neural network, we define a variable taking the whole model in it and then add settings and features to it.

In [None]:
def get_compiled_model(input_shape, optimizer, loss, metrics):
    """
    This is a Python function that compiles and returns a neural network model using the Keras library.

    Args:
        input_shape (tuple)             The shape of the input data for the model.
        optimizer (str or callable)     The optimizer to use for training the model.
        loss (str or callable)          The loss function to use during training.
        metrics (list)                  The list of evaluation metrics for the model.

    Returns:
        model ()                        The compiled Keras model object that can be used for training.
    """
        
    model = Sequential()
    
    # Convolutional and pooling layers
    
    model.add(Conv2D(32, (3, 3), strides=1, activation="relu",
                     padding="same", input_shape=(48, 48, 1)))
    
    model.add(Conv2D(64, (3, 3), strides=1, activation="relu",
                     padding="same"))

    model.add(BatchNormalization())
    
    model.add(MaxPool2D(pool_size=(2, 2)))
    
    model.add(Dropout(0.25))

    # Convolutional and pooling layers
    
    model.add(Conv2D(128, (3, 3), strides=1, activation="relu",
                     padding="same", kernel_regularizer=tf.keras.regularizers.l2(0.01)))
    
    model.add(Conv2D(256, (3, 3), strides=1, activation="relu",
                     padding="same", kernel_regularizer=tf.keras.regularizers.l2(0.01)))
    
    model.add(BatchNormalization())
    
    model.add(MaxPool2D(pool_size=(2, 2)))
    
    model.add(Dropout(0.25))

    # Flatten and dense layer
    
    model.add(Flatten())
    
    model.add(Dense(256, activation="relu"))
    
    model.add(BatchNormalization())
    
    model.add(Dropout(0.25))
    
    # Flatten and dense layer
    
    model.add(Dense(512, activation="relu"))
    
    model.add(BatchNormalization())
    
    model.add(Dropout(0.25))
    
    # Final layer
    
    model.add(Dense(7, activation="softmax"))
    
    # Compiler
    
    model.compile(optimizer=optimizer, loss=loss, metrics=metrics)

    return model

In [None]:
def train_model(model, train_data, epochs, validation_data, callbacks):
    """
    Summary:
    This is a Python function that trains a given neural network model
    on the provided training data and returns the training history.

    Args:
        model (tensorflow.keras model)          The neural network model object to train.
        train_data (numpy.ndarray)              The training data to use for training the model.
        train_target (numpy.ndarray)            The target values for the training data.
        epochs (int)                            The number of epochs to train the model.
        validation_data (numpy.ndarray)         The validation data to use for evaluation.
        callbacks (list)                        A list of Keras callbacks to use during training.

    Returns:
        history ()                              The training history object that contains information
                                                about the training and validation metrics over each epoch.
    """

    history = model.fit(train_data,
                        epochs=epochs,
                        validation_data=validation_data,
                        callbacks=callbacks,
                        verbose=1)
    
    return history

Here, we define some callbacks to get more insights when training or validating the model.

In [None]:
# Chemins des répertoires pour le modèle, les checkpoints et l'historique
model_base_dir = os.path.join("..", "model")
checkpoints_dir = os.path.join(model_base_dir, "checkpoints")
history_dir = os.path.join(model_base_dir, "history")

os.makedirs(model_base_dir, exist_ok=True)
os.makedirs(checkpoints_dir, exist_ok=True)
os.makedirs(history_dir, exist_ok=True)

class TrainingCallbacks(Callback):
    
    def __init__(self):
        self.start_time = None

    def on_train_begin(self, logs=None):
        self.start_time = time.time()
        print("Starting training ...")

    def on_epoch_end(self, epoch, logs=None):
        elapsed_time = time.time() - self.start_time
        print(f"Epoch {epoch + 1} completed in {elapsed_time:.2f} seconds")

    def on_train_end(self, logs=None):
        total_time = time.time() - self.start_time
        print(f"Training finished in {total_time:.2f} seconds")
        
logs_cb = callbacks.TensorBoard(log_dir="logs")

checkpoint = callbacks.ModelCheckpoint(filepath=os.path.join(checkpoints_dir, "model.h5"),
                                       save_best_only=True,
                                       monitor="val_accuracy")
        
callbacks_list = [TrainingCallbacks(), logs_cb, checkpoint]

[Project Content](#0)

## 8. Model <a id = 8></a>

By using the function above, now we can builld the architecture defined in the function and parameters determined as the arguments.

In [None]:
model = get_compiled_model((48, 48, 1),
                           optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
                           loss="sparse_categorical_crossentropy",
                           metrics=["accuracy"])

model.summary()

It's better to plot the architecture to understand the network better.

In [None]:
plot_model(model, show_shapes=True,
           show_layer_names=False,
           expand_nested=True,
           rankdir="TB",
           dpi=100)

[Project Content](#0)

## 9. Training <a id = 9></a>

Now we can train the model and save its information in a variable to check it out later.

In [None]:
history = train_model(model,
                      train_data=train,
                      epochs=60,
                      validation_data=validation,
                      callbacks=callbacks_list)

In [None]:
# Sauvegarde du modèle final et de l'historique dans ../model
# (en plus du meilleur modèle sauvegardé automatiquement par ModelCheckpoint)
model.save(os.path.join(model_base_dir, "final_model.keras"))
model.save(os.path.join(model_base_dir, "final_model.h5"))

with open(os.path.join(history_dir, "history.pkl"), "wb") as f:
    pickle.dump(history.history, f)


In the end, we should save the model's history data in a readable format too.

In [None]:
# The model's history can be saved using this block of code.

"""with open(os.path.join("history", "history.pkl"), "wb") as f:
    pickle.dump(history.history, f)"""

[Project Content](#0)

## 10. Loading The Model and its History <a id = 10></a>

Firslty, we open the history file we have saved after training the model.

In [None]:
# Chargement du modèle et de l'historique depuis ../model
"""
from tensorflow.keras.models import load_model

# Charger le meilleur modèle sauvegardé par le callback
model = load_model(os.path.join("..", "model", "checkpoints", "model.h5"))
# ou pour charger le modèle final
# model = load_model(os.path.join("..", "model", "final_model.keras"))

# Charger l'historique d'entraînement
with open(os.path.join("..", "model", "history", "history.pkl"), "rb") as f:
    history = pickle.load(f)
"""

# Si 'history' est un objet History (après entraînement dans la même session),
# convertir en dict. Si c'est déjà un dict chargé depuis pickle, on ne touche pas.
try:
    history = history.history
except AttributeError:
    pass


In [None]:
# The model can be loaded and the model's history can be read using this block of code.


"""model = load_model(os.path.join("checkpoints", "model.h5"))

with open(os.path.join("history", "history.pkl"), "rb") as f:
    history = pickle.load(f)"""

history = history.history

Now we have access the data saved during the training process.

In [None]:
history_df = pd.DataFrame(history)

history_df

[Project Content](#0)

## 11. Plotting The Models's Loss and Accuracy <a id = 11></a>

Let's check how the model performs in both training and validation datasets through epochs.

In [None]:
title_font = {"family" : "arial", "color" : "k", "weight" : "bold", "size" : 14}
axes_font = {"family" : "arial", "color" : "#023553", "weight" : "bold", "size" : 12}

fig = plt.figure(figsize=(25, 8))

plt.plot(history["loss"], color="#3BB47E", label="Training loss")
plt.plot(history["val_loss"], color="#FF605C", label="Validation Loss")

plt.xticks(range(len(history["loss"])))

plt.legend(loc="upper right")

plt.title("Loss vs Epochs", fontdict = title_font)
plt.xlabel("Epoch Number", fontdict = axes_font)
plt.ylabel("Loss", fontdict = axes_font)

plt.grid(True, axis="x", alpha=0.5, linestyle="--")

max_val_acc_row_index = history_df[history_df["val_accuracy"] == max(history_df["val_accuracy"])].index[-1]

plt.scatter(max_val_acc_row_index,
            history_df.loc[max_val_acc_row_index, "val_loss"])

plt.show()

It's obvious that model suffers from overfitting problem.</br>
This can be the result of many issues, which will be addressed later in this notebook.

We can also check its loss values.

In [None]:
title_font = {"family" : "arial", "color" : "k", "weight" : "bold", "size" : 14}
axes_font = {"family" : "arial", "color" : "#023553", "weight" : "bold", "size" : 12}

fig = plt.figure(figsize=(25, 8))

plt.plot(history["accuracy"], color="#3BB47E", label="Training accuracy")
plt.plot(history["val_accuracy"], color="#FF605C", label="Validation accuracy")

plt.xticks(range(len(history["accuracy"])))

plt.legend(loc="upper left")

plt.title("Accuracy vs Epochs", fontdict = title_font)
plt.xlabel("Epoch Number", fontdict = axes_font)
plt.ylabel("Accuracy", fontdict = axes_font)

plt.grid(True, axis="x", alpha=0.5, linestyle="--")

plt.scatter(max_val_acc_row_index,
            history_df.loc[max_val_acc_row_index, "val_accuracy"])

plt.show()

[Project Content](#0)

## 12. Model's Performance Evaluation <a id = 12></a>

The last step is to check how the model performs on the testing dataset.</br>
For this goal we will evaluate the model using three metrics.

In [None]:
accuracy = Accuracy()
categorical_accuracy = CategoricalAccuracy()

We haven't load the test dataset in the model, so we should build a data pipeline for that.</br>
And then we should update the metrics for each batch of the test set.

In [None]:
test_directory = os.path.join("/kaggle/input/fer2013", "test")

test = tf.keras.preprocessing.image_dataset_from_directory(test_directory,
                                                           image_size=(48, 48),
                                                           batch_size=64,
                                                           color_mode="grayscale")

Now we can evaluate the model.

In [None]:
num_classes = 7
confusion_matrix = np.zeros((num_classes, num_classes))

for batch in test.as_numpy_iterator():

    test_data, test_target = batch
    test_target = to_categorical(test_target, num_classes=num_classes)

    test_target_pred = model.predict(test_data, verbose=0)

    categorical_accuracy.update_state(test_target, test_target_pred)
    accuracy.update_state(test_target, test_target_pred)

    test_target_pred_labels = np.argmax(test_target_pred, axis=1)

    batch_confusion_matrix = sk_confusion_matrix(np.argmax(test_target, axis=1), test_target_pred_labels, labels=range(num_classes))
    confusion_matrix += batch_confusion_matrix

Let's check the results.

In [None]:
print("Testing Results")
print("-"*30)

print(f"Accuracy               {(accuracy.result()*100):.4f}")
print(f"Categorical Accuracy   {(categorical_accuracy.result()*100):.4f}")

Looking at the testing results, there are a couple of observations:

**Accuracy**</br>

The overall accuracy of 77.7276% indicates that the model is correctly predicting the facial expression for approximately 77.7% of the test samples.</br>
However, it's important to consider the class distribution to gain a deeper understanding of the model's performance.

**Categorical Accuracy**</br>

The categorical accuracy of 30.3706% suggests that the model is struggling to accurately classify the test samples into their respective facial expression categories.</br>
This metric measures the percentage of samples for which the highest predicted class matches the true class.

The low categorical accuracy indicates that the model might be biased towards the majority class(es) or facing difficulties in distinguishing between the different facial expressions, especially the minority classes.

Let's check the confusion matrix to understand how the model performed the predictions.

In [None]:
title_font = {"family" : "arial", "color" : "k", "weight" : "bold", "size" : 14}
axes_font = {"family" : "arial", "color" : "#023553", "weight" : "bold", "size" : 12}

normalized_confusion_matrix = confusion_matrix / confusion_matrix.sum(axis=1, keepdims=True)

fig, ax = plt.subplots(figsize=(10, 8))

heatmap = sns.heatmap(normalized_confusion_matrix, annot=True, fmt=".2f", cmap="Blues")

ax.set_xlabel("Predicted Labels", fontdict=axes_font)
ax.set_ylabel("True Labels", fontdict=axes_font)
ax.set_title("Confusion Matrix", fontdict=title_font)

ax.xaxis.set_ticklabels(expressions_list)
ax.yaxis.set_ticklabels(expressions_list)

plt.xticks(rotation=45)

plt.show()

Looking at the confusion matrix, a few observations can be made:

**Majority Class Bias**</br>

The diagonal elements of the confusion matrix (from top left to bottom right) represent the correctly classified samples.</br>
It appears that the model performs relatively well in predicting the "happy" class, as it has the highest percentage (64%) of correct predictions among all the classes.</br>
On the other hand, the model struggles to accurately classify the "neutral" and "sad" classes, as indicated by lower percentages in the corresponding diagonal elements.

**Misclassifications**</br>

The off-diagonal elements of the confusion matrix represent misclassifications.</br>
For example, the model tends to confuse "angry" samples with "disgust," "fear," and "happy" classes.</br>
Similarly, "sad" samples are often misclassified as "angry," "disgust," and "fear."

**Imbalanced Misclassifications**</br>

It's worth noting that the misclassifications are not evenly distributed.</br>
For instance, the "neutral" class has a relatively high proportion of misclassifications, particularly being confused with "disgust," "fear," and "sad."

Based on the confusion matrix, it seems that the model struggles to accurately differentiate between certain facial expressions, especially those with similar visual characteristics.</br>
This highlights the need to further fine-tune the model, consider data augmentation techniques, and potentially explore more complex architectures or advanced techniques to enhance its performance in distinguishing between these classes.

[Project Content](#0)