# Modeling Using .csv Data:

The models I built on the image (.jpg) data performed poorly and took about an hour per epoch to train, so here I recreate those models and build new ones on the exact same data in .csv format. The hope is that the models will learn better from the .csv representation and train much faster than they did on the image files.
____________________________

### Imports

In [None]:
import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import (
    Conv2D, Dense, Dropout, Flatten, GlobalAveragePooling2D, MaxPooling2D
)
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.regularizers import l2
from tensorflow.keras.utils import to_categorical

______________________________
## Data Cleaning/EDA

In [None]:
# Mounting my drive

from google.colab import drive

drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
# Reading in the data

train_data = pd.read_csv("/content/drive/MyDrive/GA Data Science Bootcamp/Projects/Capstone Project/NEW_facial_expression_recognition_data/train_data.csv")
test_data = pd.read_csv("/content/drive/MyDrive/GA Data Science Bootcamp/Projects/Capstone Project/NEW_facial_expression_recognition_data/test_data.csv")

In [None]:
train_data.head()

Unnamed: 0,emotion,pixels
0,0,70 80 82 72 58 58 60 63 54 58 60 48 89 115 121...
1,0,151 150 147 155 148 133 111 140 170 174 182 15...
2,2,231 212 156 164 174 138 161 173 182 200 106 38...
3,4,24 32 36 30 32 23 19 20 30 41 21 22 32 34 21 1...
4,6,4 0 0 0 0 0 0 0 0 0 0 0 3 15 23 28 48 50 58 84...


In [None]:
test_data.head()

Unnamed: 0,emotion,pixels
0,0,254 254 254 254 254 249 255 160 2 58 53 70 77 ...
1,1,156 184 198 202 204 207 210 212 213 214 215 21...
2,4,69 118 61 60 96 121 103 87 103 88 70 90 115 12...
3,6,205 203 236 157 83 158 120 116 94 86 155 180 2...
4,3,87 79 74 66 74 96 77 80 80 84 83 89 102 91 84 ...


In [None]:
print(train_data['emotion'].value_counts())
print(test_data['emotion'].value_counts())

3    7215
6    4965
4    4830
2    4097
0    3995
5    3171
1     436
Name: emotion, dtype: int64
3    895
4    653
6    607
2    496
0    467
5    415
1     56
Name: emotion, dtype: int64
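
The counts above are easier to read as shares of the whole. The sketch below is plain Python and uses only the training counts printed above; the helper name `class_shares` is made up for illustration. It shows that the largest class (3) makes up roughly a quarter of the training rows, so a model that always predicted class 3 would already score about 25% on this split, while the smallest class (1) accounts for under 2%.

```python
# Class counts copied from the training value_counts() output above.
train_counts = {3: 7215, 6: 4965, 4: 4830, 2: 4097, 0: 3995, 5: 3171, 1: 436}


def class_shares(counts):
    """Return each class's fraction of the total, largest class first."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: -item[1])
    return {label: n / total for label, n in ordered}


shares = class_shares(train_counts)
for label, share in shares.items():
    print(f"class {label}: {share:.1%}")
```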


In [None]:
#Extracting and preprocessing the pixel values

train_pixels = train_data['pixels'].apply(lambda x: np.fromstring(x, dtype=int, sep=' ') / 255.0)
test_pixels = test_data['pixels'].apply(lambda x: np.fromstring(x, dtype=int, sep=' ') / 255.0)

In [None]:
#Reshaping the pixel arrays

image_size = 48
train_images = np.vstack(train_pixels).reshape(-1, image_size, image_size, 1)
test_images = np.vstack(test_pixels).reshape(-1, image_size, image_size, 1)

In [None]:
#Encoding the emotion labels

train_labels = to_categorical(train_data['emotion'], num_classes=7)

_____________________________
## Modeling

### Basic CNN Model - Two Convolutional Blocks, No Early Stopping, 10 Epochs:

In [None]:
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(image_size, image_size, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(7, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=10, validation_split=0.2)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7b2faaa8c9d0>

As we can see above, this basic CNN finished with a training accuracy of about 0.57 and a validation accuracy of about 0.53 (the score on the 20% of the training data held out by `validation_split`). Although our previous model (a CNN with early stopping, trained on the .jpg image data) scored higher on held-out data (~0.70), the model above shows far less variance: the gap between acc and val_acc was ~0.15 for the previous model versus ~0.04 here. This suggests that training on the .csv data may be moving us toward a better model.
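
The comparison above comes down to the gap between the last training accuracy and the last validation accuracy. Here is a minimal sketch of that calculation, assuming the metrics are available as a dict shaped like Keras' `History.history` with the keys `accuracy` and `val_accuracy` (older versions may use `acc` and `val_acc`). The helper name `final_gap` and the example numbers are illustrative only; the last entries are chosen to match the rounded figures quoted above.

```python
def final_gap(history):
    """Return (train_acc, val_acc, gap) for the last recorded epoch.

    `history` is a dict shaped like Keras' History.history,
    e.g. {"accuracy": [...], "val_accuracy": [...]}.
    """
    train_acc = history["accuracy"][-1]
    val_acc = history["val_accuracy"][-1]
    return train_acc, val_acc, train_acc - val_acc


# Illustrative numbers only, not the real per-epoch metrics from the run.
example = {"accuracy": [0.40, 0.50, 0.57], "val_accuracy": [0.42, 0.50, 0.53]}
train_acc, val_acc, gap = final_gap(example)
print(f"train={train_acc:.2f} val={val_acc:.2f} gap={gap:.2f}")
```

To use this with a real run, the return value of `model.fit(...)` would need to be kept (for example `history = model.fit(...)`) so that `history.history` can be passed in.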

### Repeating Above Model for 25 Epochs:

In [None]:
model2 = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(image_size, image_size, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(7, activation='softmax')
])

model2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

model2.fit(train_images, train_labels, epochs=25, validation_split=0.2)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


<keras.src.callbacks.History at 0x7e7e7449c190>

Rerunning the previous model for 25 epochs instead of 10 shows that more adjustments are needed to meet the criterion of 70% accuracy on both the training and testing sets. The model performed well until about the 12th epoch, where it began to overfit. Below, I add early stopping and a third convolutional block to see whether they make a difference.
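
One way to pin down where overfitting sets in is to look for the epoch with the lowest validation loss; after that point, training loss keeps falling while validation loss drifts upward. The sketch below is a plain-Python illustration with a made-up loss curve, not the real metrics from this run, and `best_epoch` is a hypothetical helper name.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


# Made-up validation-loss curve that improves, then drifts upward.
val_losses = [1.60, 1.45, 1.35, 1.28, 1.24, 1.22, 1.23, 1.27, 1.33]
print(best_epoch(val_losses))
```

With a stored Keras history, the same function could be applied to `history.history["val_loss"]`, assuming that key is present.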

### CNN Model - 3 Convolutional Blocks, Early Stopping, set for 25 Epochs:

In [None]:
# Define your model with additional layers
model3 = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(image_size, image_size, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),  # Additional Conv2D layer
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(7, activation='softmax')
])

model3.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Implement early stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Fit the model with early stopping
history = model3.fit(
    train_images, train_labels,
    epochs=25,
    validation_split=0.2,
    callbacks=[early_stopping]  # Add early stopping callback
)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25


As we can see above, adding a third convolutional block and early stopping did little to reduce the high variance. Early stopping ended training at epoch 13. Below, I try a regularization technique to see whether it resolves the overfitting.
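
To make the stopping rule concrete, here is a minimal plain-Python sketch of the patience logic. It assumes an epoch counts as an improvement only when its validation loss is strictly lower than the best seen so far, with no minimum delta. It is not the Keras implementation, the function name `early_stop_epoch` is hypothetical, and the loss values are made up. Under these assumptions, a run with `patience=5` that stops at epoch 13 would have seen its best validation loss at epoch 8, which is the epoch `restore_best_weights=True` rolls back to.

```python
def early_stop_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once `patience` consecutive epochs fail to beat the best
    validation loss seen so far. If that never happens, stop_epoch is the
    last epoch in the list.
    """
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# Made-up curve: best at epoch 8, then five epochs without improvement.
losses = [1.7, 1.5, 1.4, 1.3, 1.25, 1.2, 1.18, 1.15,
          1.16, 1.17, 1.2, 1.22, 1.25, 1.3]
print(early_stop_epoch(losses))
```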

### Repeating Above Model, Adding L2 Regularizers to Combat Overfitting:

In [None]:
# Define your model with L2 regularization
model4 = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(image_size, image_size, 1),
           kernel_regularizer=l2(0.01)),  # Add L2 regularization
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu', kernel_regularizer=l2(0.01)),  # Add L2 regularization
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu', kernel_regularizer=l2(0.01)),  # Add L2 regularization
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu', kernel_regularizer=l2(0.01)),  # Add L2 regularization
    Dropout(0.5),
    Dense(7, activation='softmax')
])

model4.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Implement early stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Fit the model with early stopping
history = model4.fit(
    train_images, train_labels,
    epochs=25,
    validation_split=0.2,
    callbacks=[early_stopping]  # Add early stopping callback
)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25


This model overfit much less, but training stopped at the 10th epoch because validation loss had not improved for 5 straight epochs, and the accuracy scores were very poor. It appears we introduced too much bias in an attempt to combat high variance.
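
For intuition about how strong this penalty is: an L2 kernel regularizer adds the regularization factor times the sum of squared weights to the loss for each regularized layer, and that total is part of the loss Keras reports. The sketch below works through the arithmetic in plain Python for a hypothetical four-weight kernel. The helper `l2_penalty` and the weight values are made up for illustration.

```python
def l2_penalty(weights, factor=0.01):
    """Return factor * sum of squared weights for a flat list of weights."""
    return factor * sum(w * w for w in weights)


# Hypothetical kernel with four weights (not taken from the trained model).
kernel = [0.5, -1.0, 2.0, 0.0]
print(l2_penalty(kernel))                # penalty with the factor used above
print(l2_penalty(kernel, factor=0.001))  # a ten-times weaker penalty
```

Because the penalty scales linearly with the factor, dropping it from 0.01 to 0.001 shrinks the added loss tenfold. That is one knob to try if the regularization turns out to be too strong.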

At this point, after building six different models on the same data (2 with the .jpg data, 4 with the .csv data), a new approach might be necessary. It is possible that the FER2013 labels are not accurate enough for a model to reasonably reach 70% accuracy on both the training and testing sets.

With this in mind, I will fit new models on a different dataset: CK+ (Extended Cohn-Kanade), which is designed specifically for facial expression analysis and is often used for more detailed studies. We will see whether this dataset improves our ability to build a model that scores 70% accuracy on both the training and testing sets.
____________________________________________

# Modeling CK+ Dataset

In [None]:
# Reading in the data

data2 = pd.read_csv('/content/drive/MyDrive/GA Data Science Bootcamp/Projects/Capstone Project/CK+_dataset/ckextended.csv')

In [None]:
pixel_data = data2['pixels'].apply(lambda x: np.fromstring(x, sep=' ')) # Extracting and parsing the pixel data
emotion_labels = data2['emotion'] # Extracting emotion labels from the 'emotion' column

In [None]:
# Stacking the pixel rows into a 2D NumPy array and reshaping it to (N, 48, 48, 1)

image_data = np.vstack(pixel_data.to_numpy()).reshape(-1, 48, 48, 1).astype('float32')

In [None]:
# Normalizing the pixel values

image_data /= 255.0

In [None]:
# Convert emotion labels to one-hot encoded categorical format
emotion_labels_categorical = to_categorical(emotion_labels, num_classes=8)

# Setting up training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    image_data,
    emotion_labels_categorical,
    test_size=0.2,
    random_state=42
)
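
Since the models below are also fit with `validation_split=0.2`, the data ends up in three parts: the rows the model trains on, the validation rows Keras takes from the end of `X_train`, and the `X_test` rows that `fit` never sees. The sketch below works out the sizes in plain Python. It assumes scikit-learn rounds the test share up and Keras rounds the training share down, and the dataset size of 1000 is hypothetical because the real row count is not shown here.

```python
import math


def split_sizes(n_samples, test_size=0.2, validation_split=0.2):
    """Return (fit, validation, test) sample counts for the two-stage split."""
    n_test = math.ceil(n_samples * test_size)   # held out by train_test_split
    n_train = n_samples - n_test                # passed to model.fit
    n_fit = math.floor(n_train * (1.0 - validation_split))  # trained on
    n_val = n_train - n_fit                     # scored as val_loss / val_accuracy
    return n_fit, n_val, n_test


# Hypothetical dataset size, for illustration only.
print(split_sizes(1000))
```

The `val_accuracy` that Keras prints during training refers to the middle slice. A score on the untouched test slice would come from a separate call such as `model5.evaluate(X_test, y_test)` after training.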

### Basic CNN Model - Two Convolutional Blocks with Early Stopping, 15 Epochs:

In [None]:
model5 = Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(48, 48, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(8, activation='softmax')
])

model5.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

In [None]:
# Define early stopping callback
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

In [None]:
# Fit the model with early stopping
history5 = model5.fit(
    X_train, y_train,
    epochs=15,
    validation_split=0.2,
    callbacks=[early_stopping]  # Add early stopping callback
)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


As we can see above, our model performed much better when fit on the CK+ dataset. With a training accuracy of about 0.87 and a validation accuracy of about 0.86, it significantly outperformed all of our previous models in terms of both accuracy and variance. Below, I try another model with more epochs and a third convolutional block to see whether we can push the training and validation accuracy even higher.

### CNN Model - 3 Convolutional Blocks, Early Stopping, set for 40 Epochs:

In [None]:
model6 = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(48, 48, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(8, activation='softmax')
])

# Compile the model
model6.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

In [None]:
# Fit the model with early stopping for 40 epochs
history6 = model6.fit(
    X_train, y_train,
    epochs=40,  # Increased to 40 epochs
    validation_split=0.2,
    callbacks=[early_stopping]
)

Epoch 1/40
Epoch 2/40
Epoch 3/40
Epoch 4/40
Epoch 5/40
Epoch 6/40
Epoch 7/40
Epoch 8/40
Epoch 9/40
Epoch 10/40
Epoch 11/40
Epoch 12/40
Epoch 13/40
Epoch 14/40
Epoch 15/40
Epoch 16/40
Epoch 17/40
Epoch 18/40
Epoch 19/40
Epoch 20/40
Epoch 21/40
Epoch 22/40
Epoch 23/40
Epoch 24/40
Epoch 25/40
Epoch 26/40
Epoch 27/40
Epoch 28/40
Epoch 29/40
Epoch 30/40
Epoch 31/40


As we can see above, the model performed even better with more epochs, and for the most part it kept variance in check as training progressed. With a training accuracy of about 0.90 and a peak validation accuracy of about 0.85 at epoch 29, we have comfortably exceeded the 70% accuracy target on both the training data and the held-out validation data, while keeping variance at an acceptable level.