# Serena Emotion Detector - Training Notebook

This notebook sets up and trains the Serena Emotion Detector on Vertex AI. The trained model is saved to the `models` folder of our GCS bucket, `serena-shsw-datasets`.  
To evaluate the model, use the `evaluate.ipynb` notebook in this directory.


## Background

Serena Emotion Detector is a CNN model that detects 7 emotions (`angry`, `disgust`, `fear`, `happy`, `neutral`, `sad`, `surprise`) from a person's front-facing photo. We use the [FER2013](https://www.kaggle.com/deadskull7/fer2013) dataset because it is a widely used dataset for emotion detection.

When we first started building the model, we designed the architecture from scratch. After a lot of trial and error, the best we could reach was around 64% accuracy, and every 10-epoch training session took about 3 hours on Vertex AI. On top of that, our own models often predicted the wrong emotion or were biased towards a single emotion class.

After learning from those mistakes, reading more about CNNs, and following tutorials, we decided to use transfer learning. It is quicker to train, and we don't have to design the architecture from scratch: we only need to adapt the dataset and a few layers to detect our 7 emotion classes.

The model we use for transfer learning is [MobileNetV2](https://www.ict-srilanka.com/blog/what-is-mobilenetv2). We chose MobileNetV2 because it is designed to be lightweight for devices with limited resources (e.g. mobile phones and IoT devices). We expect this to make predictions quicker when deployed on Cloud Run, or later when we embed the model directly into our IoT device, `SerenBox`.
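
Much of MobileNetV2's small size comes from using depthwise separable convolutions in place of standard ones. As a rough illustration (the layer sizes below are made up for the example and are not taken from the actual network), the weight counts compare like this:

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
print(f"standard: {standard}, separable: {separable}, "
      f"ratio: {standard / separable:.1f}x")
```

Fewer weights mean less memory and fewer multiplications per prediction, which is the property we care about for Cloud Run and `SerenBox`.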


## Setup

We store our dataset in GCS. There are 7 classes, each separated into its own folder. We save the model in the newer `.keras` format instead of `.h5`, since it stores the weights and the model configuration in a single file, which makes it easier to move around.  
If you want to try it out yourself, replace `train_dataset_path` and `test_dataset_path` with your own FER-2013 dataset paths. You can download the FER-2013 dataset [here](https://www.kaggle.com/msambare/fer2013).
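
Before pointing the notebook at your own copy of the dataset, it can help to confirm that the folder layout matches what is described above: one sub-folder per class. The sketch below is only an illustration; `count_images_per_class` is a hypothetical helper (not part of this notebook), and the demo runs against a throwaway temporary directory rather than the real dataset.

```python
import os
import tempfile

CLASSES = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]


def count_images_per_class(dataset_dir, classes=CLASSES):
    """Return {class_name: file_count}; raise if a class folder is missing."""
    counts = {}
    for name in classes:
        class_dir = os.path.join(dataset_dir, name)
        if not os.path.isdir(class_dir):
            raise FileNotFoundError(f"Missing class folder: {class_dir}")
        # Count regular files only; nested folders are ignored.
        counts[name] = sum(1 for entry in os.scandir(class_dir) if entry.is_file())
    return counts


# Demo on a temporary directory that mimics the expected layout.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in CLASSES:
        os.makedirs(os.path.join(demo_dir, name))
    open(os.path.join(demo_dir, "happy", "example.jpg"), "w").close()
    demo_counts = count_images_per_class(demo_dir)
    print(demo_counts)
```

On a real dataset you would call it with your own `train_dataset_path` and check that no class is empty.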

> 🚧 Warning
>
> This notebook was designed to be run in OUR Vertex AI environment. If you want to run it yourself, you need to change some code to fit your environment.  
> You can directly use our model without having to train it first by following the steps in `evaluate.ipynb` notebook.


Run this `gcsfuse` cell if you are using Vertex AI Workbench and can't list the folders inside `/home/jupyter/gcs`.

In [1]:
!gcsfuse --implicit-dirs "~/gcs"

I1212 17:24:38.773088 2023/12/12 17:24:38.773037 Start gcsfuse/0.42.5 (Go version go1.20.3) for app "" using mount point: /home/jupyter/gcs
daemonize.Run: readFromProcess: sub-process: mountWithArgs: mountWithConn: Mount: mount: running /usr/bin/fusermount: exit status 1


In [2]:
import os

import cv2
import numpy as np
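from tensorflow import keras, data
# Import everything through tensorflow.keras so that layers, models and
# `keras.utils` all come from the same Keras implementation.
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.callbacks import ModelCheckpoint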

gcs_path = "/home/jupyter/gcs/serena-shsw-datasets/"
train_dataset_path = os.path.join(
    gcs_path, "FER-2013/train"  # TODO: change this to your own dataset
)
test_dataset_path = os.path.join(
    gcs_path, "FER-2013/test"  # TODO: change this to your own dataset
)
model_save_path = os.path.join(
    gcs_path,
    "models/serena-emotion-detector.keras",  # TODO: change this to your own path
)

classes = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

2023-12-12 17:24:46.626947: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/conda/lib/python3.7/site-packages/cv2/../../lib64:
2023-12-12 17:24:46.626990: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


## Processing Training Data


Load training & validation data

In [3]:
# Kwargs for image_dataset_from_directory
img_size = 224
labels = "inferred"
label_mode = "int"
class_names = classes
color_mode = "rgb"
batch_size = 32 * 6
image_size = (img_size, img_size)
shuffle = True
interpolation = "bilinear"
follow_links = False


def create_training_data():
    training_data = keras.utils.image_dataset_from_directory(
        directory=train_dataset_path,
        labels=labels,
        label_mode=label_mode,
        class_names=class_names,
        color_mode=color_mode,
        batch_size=batch_size,
        image_size=image_size,
        shuffle=shuffle,
        interpolation=interpolation,
        follow_links=follow_links,
        seed=123,
    )

    return training_data


def create_validation_data():
    validation_data = keras.utils.image_dataset_from_directory(
        directory=test_dataset_path,
        labels=labels,
        label_mode=label_mode,
        class_names=class_names,
        color_mode=color_mode,
        batch_size=batch_size,
        image_size=image_size,
        shuffle=shuffle,
        interpolation=interpolation,
        follow_links=follow_links,
        seed=321,
    )

    return validation_data

In [4]:
training_data = create_training_data()

Found 28709 files belonging to 7 classes.


2023-12-12 17:26:31.440464: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/conda/lib/python3.7/site-packages/cv2/../../lib64:
2023-12-12 17:26:31.440509: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2023-12-12 17:26:31.440540: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (emotion-detector-nb): /proc/driver/nvidia/version does not exist
2023-12-12 17:26:31.440971: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [5]:
validation_data = create_validation_data()

Found 7178 files belonging to 7 classes.
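
With the two file counts reported above and the batch size from the kwargs cell, the number of batches per epoch can be worked out by hand. This is a quick sanity check, assuming the last partial batch is kept (we believe this is the default behaviour of `image_dataset_from_directory`):

```python
import math

train_images = 28709  # from "Found 28709 files" above
test_images = 7178    # from "Found 7178 files" above
batch_size = 32 * 6   # same value as in the kwargs cell

train_steps = math.ceil(train_images / batch_size)
validation_steps = math.ceil(test_images / batch_size)
validation_share = test_images / (train_images + test_images)

print(f"training batches per epoch:   {train_steps}")
print(f"validation batches per epoch: {validation_steps}")
print(f"share of images used for validation: {validation_share:.1%}")
```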


In [6]:
print(training_data.class_names)
print(validation_data.class_names)
print("Same order: ", training_data.class_names == validation_data.class_names)

['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']
['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']
Same order:  True


Set up the image loading strategy

In [7]:
AUTOTUNE = data.AUTOTUNE

training_data = training_data.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
validation_data = validation_data.cache().prefetch(buffer_size=AUTOTUNE)
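
Because `cache()` and `shuffle(1000)` are applied after batching, both operate on whole batches of resized images rather than on single files. A back-of-the-envelope estimate gives a feel for how much RAM that needs. It assumes the images are held as `float32` tensors of shape 224x224x3, which is what `image_dataset_from_directory` normally yields; it is an estimate, not a measurement of this notebook.

```python
BYTES_PER_FLOAT32 = 4
GIB = 1024 ** 3


def image_bytes(height, width, channels):
    """Size in bytes of one decoded float32 image."""
    return height * width * channels * BYTES_PER_FLOAT32


batch_size = 32 * 6
train_images = 28709  # file count reported when loading the training data

per_image = image_bytes(224, 224, 3)
per_batch = per_image * batch_size
cached_total = per_image * train_images  # every training image held once in RAM

print(f"one image : {per_image / 1024:.0f} KiB")
print(f"one batch : {per_batch / 1024 ** 2:.1f} MiB")
print(f"cache()   : {cached_total / GIB:.1f} GiB")
```

The training set is well under 1000 batches, so `shuffle(1000)` can also end up buffering every batch at once. Either way the whole resized dataset has to fit in RAM, which is a plausible reason for the `ResourceExhaustedError` on a small machine.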

Normalize the dataset to the range 0-1

In [8]:
training_normalization_layer = layers.Rescaling(1./255)
validation_normalization_layer = layers.Rescaling(1./255)

normalized_training_data = training_data.map(lambda x, y: (training_normalization_layer(x), y))
normalized_validation_data = validation_data.map(lambda x, y: (validation_normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_training_data))
first_image = image_batch[0]

print("min: ",np.min(first_image), "max: ",np.max(first_image))


2023-12-12 17:27:09.139268: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:390] Filling up shuffle buffer (this may take a while): 922 of 1536
2023-12-12 17:27:15.447941: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:415] Shuffle buffer filled.
2023-12-12 17:27:17.487597: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:390] Filling up shuffle buffer (this may take a while): 1 of 1000
2023-12-12 17:27:19.452084: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:390] Filling up shuffle buffer (this may take a while): 2 of 1000
2023-12-12 17:27:29.261075: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:390] Filling up shuffle buffer (this may take a while): 7 of 1000
2023-12-12 17:27:40.885606: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:390] Filling up shuffle buffer (this may take a while): 13 of 1000
2023-12-12 17:27:50.622667: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:390] Filling up shuffle buffer (this may take a while): 18 of 1000
2023-

ResourceExhaustedError: Failed to allocate memory for the batch of component 0 [Op:IteratorGetNext]
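
The normalization cell rescales pixels to [0, 1]. For comparison, Keras's own `mobilenet_v2.preprocess_input` scales pixels to [-1, 1]. Whether that difference matters for this model has not been measured here; the plain-Python sketch below only shows the two mappings side by side. `rescale_unit` and `rescale_signed` are hypothetical helper names used for illustration.

```python
def rescale_unit(pixel):
    """Map a 0-255 pixel value to [0, 1], like Rescaling(1./255)."""
    return pixel / 255.0


def rescale_signed(pixel):
    """Map a 0-255 pixel value to [-1, 1], like Rescaling(1./127.5, offset=-1)."""
    return pixel / 127.5 - 1.0


for pixel in (0, 127.5, 255):
    print(pixel, rescale_unit(pixel), rescale_signed(pixel))
```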

## Creating Transfer Learning Model


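Create the pretrained model from `MobileNetV2`. Called without arguments, it loads the ImageNet weights and expects 224x224 RGB input, which matches `img_size` above.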


In [None]:
pretrained_model = MobileNetV2()
pretrained_model.summary()

Add new classification layers on top of the pretrained model.


In [11]:
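# Reuse the pretrained input, and take the output of the second-to-last layer
# (the layer right before MobileNetV2's original ImageNet classifier).
input_layer = pretrained_model.layers[0].input
base_output_layer = pretrained_model.layers[-2].output

# New classification head ending in 7 softmax outputs, one per emotion class.
output_layer = layers.Dense(128)(base_output_layer)
output_layer = layers.Activation("relu")(output_layer)
output_layer = layers.Dense(64)(output_layer)
output_layer = layers.Activation("relu")(output_layer)
output_layer = layers.Dense(7, activation="softmax")(output_layer)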

print(output_layer)

new_model = Model(
    inputs=input_layer,
    outputs=output_layer,
)
new_model.summary()
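
To get a feel for how small the new head is compared to the pretrained base, the parameter count of the three `Dense` layers can be worked out by hand. This sketch assumes the second-to-last layer of `MobileNetV2()` outputs 1280 features (the pooled feature size with the default width multiplier); the totals can be compared against what `new_model.summary()` prints.

```python
def dense_params(inputs, units):
    """Parameters of a Dense layer: one weight per input/unit pair plus one bias per unit."""
    return inputs * units + units


# Assumed feature size 1280, followed by the three Dense layers defined above.
layer_sizes = [1280, 128, 64, 7]
head_params = [
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
print(head_params, sum(head_params))
```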

In [12]:
new_model.compile(
    loss="sparse_categorical_crossentropy", optimizer="adam", metrics=["accuracy"]
)
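
We use `sparse_categorical_crossentropy` because the datasets were loaded with `label_mode = "int"`, so each label is a class index rather than a one-hot vector. For a single image, the loss is the negative log of the probability the model assigns to the true class. A minimal plain-Python sketch, ignoring the small numerical clipping Keras applies and using made-up probabilities:

```python
import math


def sparse_categorical_crossentropy(label, probabilities):
    """Loss for one sample: negative log of the probability of the true class."""
    return -math.log(probabilities[label])


classes = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

# Made-up softmax output for a single image (sums to 1).
probabilities = [0.05, 0.05, 0.10, 0.60, 0.10, 0.05, 0.05]
label = classes.index("happy")  # integer label, as produced by label_mode="int"

loss = sparse_categorical_crossentropy(label, probabilities)
print(f"loss for true class '{classes[label]}': {loss:.3f}")
```

The more probability the model puts on the correct class, the closer the loss gets to 0.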

## Train Model

Train the model, then save the trained model to `model_save_path`.

> 🚧 Warning
>
> DO NOT TRAIN DIRECTLY ON YOUR LOCAL COMPUTER, unless you have a really beefy computer with at least 100GB of RAM. Why? Because the dataset is huge and it would take a very long time to train locally.
> To train, run `train.sh` to package this notebook and train it on Vertex AI using an `n1-highmem-8` VM with 1 `NVIDIA_TESLA_T4` accelerator.


In [None]:
history = new_model.fit(normalized_training_data, validation_data=normalized_validation_data, epochs=25)

In [None]:
new_model.save(model_save_path)
print("Saved model to: " + model_save_path)
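
`new_model.save` stores the weights from the last epoch, which is not necessarily the epoch with the best validation score. `history.history` is a dict of per-epoch metric lists, so the best epoch can be looked up afterwards. A minimal sketch, where `best_epoch` is a hypothetical helper and the numbers are made up for illustration:

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (1-based epoch number, value) for the highest value of `metric`."""
    values = history_dict[metric]
    index = max(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


# Made-up numbers for illustration only, NOT results from this notebook.
example_history = {
    "accuracy": [0.40, 0.55, 0.63, 0.70],
    "val_accuracy": [0.38, 0.52, 0.58, 0.56],
}

epoch, value = best_epoch(example_history)
print(f"best epoch: {epoch} (val_accuracy={value})")
```

To actually keep the best weights rather than just find the epoch, Keras provides the `ModelCheckpoint` callback (already imported at the top of this notebook), which accepts `save_best_only=True` and is passed to `fit` through its `callbacks` argument.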

## Evaluate Training Results


In [None]:
import io
from google.cloud import storage

def upload_plot_to_gcs(
    buffer, bucket_name, folder_name, plot_name="plot", format="png"
):
    """
    Uploads a matplotlib plot from a buffer to a Google Cloud Storage bucket.

    Args:
      buffer: A BytesIO object containing the plot data.
      bucket_name: The name of the bucket to upload the plot to.
      folder_name: The name of the folder within the bucket to upload the plot to.
      plot_name: The filename of the uploaded plot, without extension. (Default: "plot")
      format: The format of the plot image, also used as the file extension. (Default: "png")

    Returns:
      None
    """
    # Create GCS client
    client = storage.Client()

    # Upload the plot to GCS
    blob = client.bucket(bucket_name).blob(f"{folder_name}/{plot_name}.{format}")
    blob.upload_from_string(buffer.getvalue(), content_type=f"image/{format}")

    # Print confirmation message
    print(
        f"Plot uploaded to GCS: gs://{bucket_name}/{folder_name}/{plot_name}.{format}"
    )
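
The blob name above is built as `{folder_name}/{plot_name}.{format}`, so a `plot_name` that already carries an extension would end up with it twice. One way to guard against that is sketched below; `build_blob_name` is a hypothetical helper, not something the notebook currently uses.

```python
def build_blob_name(folder_name, plot_name, format="png"):
    """Build '<folder>/<name>.<format>' without doubling the extension."""
    suffix = f".{format}"
    if plot_name.endswith(suffix):
        plot_name = plot_name[: -len(suffix)]
    # Strip stray slashes so the folder and file name are joined by exactly one.
    return f"{folder_name.strip('/')}/{plot_name}{suffix}"


print(build_blob_name("plots", "serena-emotion-detector-eval"))
print(build_blob_name("plots", "plot.png"))
```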

In [1]:
import matplotlib.pyplot as plt

# Plot the training and validation curves recorded in `history`
plt.figure(figsize=(20, 10))
plt.suptitle('Optimizer : Adam', fontsize=10)

plt.subplot(1, 2, 1)
plt.ylabel('Loss', fontsize=16)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend(loc='upper right')

plt.subplot(1, 2, 2)
plt.ylabel('Accuracy', fontsize=16)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.legend(loc='lower right')

plt.tight_layout()

# Save to GCS before calling plt.show(); saving after show() can produce an empty image
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
buffer.seek(0)
upload_plot_to_gcs(buffer, "serena-shsw-datasets", "plots", "serena-emotion-detector-eval", "png")

# Display the plot
plt.show()

NameError: name 'history' is not defined