# Homework_3_432001358_CSCE_633_600
## November 12, 2021
## CSCE 633 600 (Machine Learning) Homework 3
### Name: Rohan Chaudhury
### UIN: 432001358


### Question 1: Machine learning for facial emotion recognition
In this problem, we will process face images coming from the Facial Expression Recognition
Challenge (presented at the International Conference on Machine Learning in 2013). The data
is uploaded under Homework3 folder in the shared Google Drive. You are given three sets of
data: training set (i.e., Q1 Train Data.csv), testing set (i.e., Q1 Test Data.csv), and validation
set (i.e., Q1 Validation Data.csv).
The data consists of 48X48 pixel grayscale images of faces. The faces have been automatically registered so that the face is more or less centered and occupies about the same amount
of space in each image. The task is to categorize each face based on the emotion shown in the
facial expression in seven categories. More information on the data can also be found in this
link.
All three files contain two columns:
1. The column labeled as "emotion" contains the emotion class with numeric code ranging
from 0 to 6 (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral).
2. The column labeled as "pixels" contains the 2304 (i.e., 48 X 48) space-separated pixel
values of the image in row-wise order, i.e., the first 48 numbers correspond to the first row
of the image, the next 48 numbers to the second row of the image, etc.
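
The row-wise layout described above means a plain reshape recovers each image. The following is a minimal sketch of that parsing step, assuming only NumPy; the helper name `pixels_to_image` and the synthetic `demo_string` are illustrative and not part of the assignment data.

```python
import numpy as np

def pixels_to_image(pixel_string, side=48):
    """Convert a space-separated pixel string into a (side, side) uint8 array."""
    values = np.array([int(v) for v in pixel_string.split()], dtype=np.uint8)
    if values.size != side * side:
        raise ValueError(f"expected {side * side} pixel values, got {values.size}")
    # Row-wise order: the first `side` values form row 0, the next `side` row 1, etc.,
    # which is exactly what a default (C-order) reshape produces.
    return values.reshape(side, side)

# Synthetic stand-in for one entry of the "pixels" column (hypothetical data).
demo_string = " ".join(str(i % 256) for i in range(48 * 48))
demo_image = pixels_to_image(demo_string)
```

Here `demo_image[1, 0]` holds the 49th value of the string, i.e. the first pixel of the second row.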

In [3]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import tensorflow as tf


train_data = pd.read_csv('../input/face-emotion-recognition-data/Q1_Train_Data.csv')
test_data = pd.read_csv('../input/face-emotion-recognition-data/Q1_Test_Data.csv')
validation_data = pd.read_csv('../input/face-emotion-recognition-data/Q1_Validation_Data.csv')

print (" ")
print("Shape of Train Data: {}".format(train_data.shape))
print (" ")
print("Shape of Test Data: {}".format(test_data.shape))
print (" ")
print("Shape of Validation Data: {}".format(validation_data.shape))
print (" ")
print(train_data.head())




### (a) (1 points) Visualization: Randomly select and visualize 1-2 images per emotion. Note: You can find a useful link on image pre-processing here: https://www.tensorflow.org/api_docs/python/tf/image/per_image_standardization 

In [6]:
def string_to_int(st):
  ar= list((int(x)/255 - 0.5)  for x in st.split(' '))
  return ar

def get_pixels_int(df):
  df['pixels_int']=df.apply(lambda st: np.array(string_to_int(st['pixels'])),axis=1)

get_pixels_int(train_data)
get_pixels_int(test_data)
get_pixels_int(validation_data)


print(train_data.head())
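
The cell above rescales every pixel to the range [-0.5, 0.5]. The question also points to per-image standardization; a rough NumPy sketch of that alternative follows, assuming the formula documented for `tf.image.per_image_standardization` (subtract the image mean, divide by the standard deviation floored at 1/sqrt(N)). The function name `per_image_standardize` and the random `demo` image are illustrative only.

```python
import numpy as np

def per_image_standardize(image):
    """Scale one image to zero mean and (approximately) unit variance."""
    image = np.asarray(image, dtype=np.float64)
    mean = image.mean()
    # The floor 1/sqrt(N) avoids division by zero for perfectly uniform images.
    adjusted_std = max(image.std(), 1.0 / np.sqrt(image.size))
    return (image - mean) / adjusted_std

rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(48, 48))       # hypothetical grayscale image
standardized = per_image_standardize(demo)
flat = per_image_standardize(np.full((48, 48), 7))  # uniform image maps to all zeros
```

Unlike a fixed rescaling, this removes per-image brightness and contrast differences, which may or may not help on this dataset.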




In [1]:
# plt.rcParams["figure.figsize"] = (2,300)


emotion_category={0:'Angry', 1:'Disgust', 2:'Fear', 3:'Happy', 4:'Sad', 5:'Surprise', 6:'Neutral'}

emotions= train_data['emotion'].unique().tolist()

fig, ax=plt.subplots(len(emotions),2,figsize=(15,15))
fig.tight_layout(pad=3.0)
# print (emotions)
plot_num=0
for emotion in emotions:
  imgs=train_data[train_data["emotion"]==emotion]
  samples=imgs.sample(n=2)
  for i in range(samples.shape[0]):
    img=samples.iloc[i]
    ax[plot_num//2,plot_num%2].set_title("Emotion label: {}".format(emotion_category[emotion]))
    # cmap is an argument of imshow, not of np.array
    ax[plot_num//2,plot_num%2].imshow(np.array(img['pixels_int']).reshape(48,48),cmap=plt.get_cmap('gray'))
    plot_num+=1

  

### (b) (1 points) Data exploration: Count the number of samples per emotion in the training data.

In [9]:
samples_count=train_data.groupby(['emotion']).count()
samples_count=samples_count[['pixels']].rename({'pixels': 'size'}, axis=1)  
samples_count
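
As a cross-check on the `groupby` count above, the same tally can be sketched with `np.bincount`. The `labels` array below is toy data, not the homework labels; the approach assumes the classes are coded 0 to 6 as stated in the problem.

```python
import numpy as np

# Toy emotion codes (hypothetical); in the notebook this would be the "emotion" column.
labels = np.array([0, 3, 3, 6, 4, 3, 0, 5])

# minlength=7 keeps a slot for every emotion code, even ones that never occur.
counts = np.bincount(labels, minlength=7)
proportions = counts / counts.sum()
```

One difference from `groupby(...).count()` is that classes with zero samples still appear with a count of 0, which makes class imbalance easier to spot.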

### (c) (4 points) Image classification with FNNs: In this part, you will use a feedforward neural network (FNN) (also called "multilayer perceptron") to perform the emotion classification task. The input of the FNN comprises all the pixels of the image.



In [10]:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.utils import to_categorical
from tensorflow.keras import regularizers

train_emotions=np.array(list(train_data['emotion']))
train_pixels=np.array(list(train_data['pixels_int']))


import time

class TimeHistory(tf.keras.callbacks.Callback):
    """Record the total wall-clock time of each fit() call, in seconds."""
    def on_train_begin(self, logs=None):
        self.times = []
        self.train_time_start = time.time()

    def on_train_end(self, logs=None):
        # time.time() returns seconds, so the stored duration is in seconds
        self.times.append(time.time() - self.train_time_start)



model1 = Sequential([
  Dense(784*2, activation='relu', input_shape=(48*48,), name="first_hidden_layer"),
  Dense(784, activation='relu', name="second_hidden_layer"),
  Dense(784//2, activation='relu', name="third_hidden_layer"),
  Dense(784//4, activation='relu', name="fourth_hidden_layer"),
  Dense(len(emotions), activation='softmax'),
])

model2 = Sequential([
  Dense(784*2, activation='elu', input_shape=(48*48,), name="first_hidden_layer"),
  Dense(784, activation='elu', name="second_hidden_layer"),
  Dense(784//2, activation='elu', name="third_hidden_layer"),
  Dropout(0.25),
  Dense(len(emotions), activation='softmax'),
])


model3 = Sequential([
  Dense(2000, activation='elu', input_shape=(48*48,), name="first_hidden_layer"),
  Dense(1000, activation='elu', name="second_hidden_layer"),
  Dense(500, activation='elu', name="third_hidden_layer"),
  Dropout(0.25),
  Dense(len(emotions), activation='softmax'),
])


model4 = Sequential([
  Dense(2000, activation='elu', input_shape=(48*48,), name="first_hidden_layer"),
  Dense(1000, activation='elu', name="second_hidden_layer"),
  Dense(500, activation='elu', name="third_hidden_layer"),
  Dense(250, activation='elu', name="fourth_hidden_layer"),
  Dense(784, activation='relu', name="fifth_hidden_layer"),
  Dropout(0.25),
  Dense(len(emotions), activation='softmax'),
])


model5 = Sequential([
  Dense(2000, activation='elu', input_shape=(48*48,), name="first_hidden_layer", kernel_regularizer=regularizers.l2(0.0001)),
  Dense(1000, activation='elu', name="second_hidden_layer", kernel_regularizer=regularizers.l2(0.0001)),
  Dense(500, activation='elu', name="third_hidden_layer", kernel_regularizer=regularizers.l2(0.0001)),
  Dense(250, activation='elu', name="fourth_hidden_layer", kernel_regularizer=regularizers.l2(0.0001)),
  Dropout(0.25),
  Dense(len(emotions), activation='softmax'),
])









In [11]:
time_callback = TimeHistory()

model1.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])
model2.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])
model3.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])
model4.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])
model5.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])



print("Train image shape: ", train_pixels.shape)
print(train_emotions.shape)

time_to_train=[]
flatten_train_images = train_pixels

history1=model1.fit(np.array(flatten_train_images), to_categorical(train_emotions), epochs=20, batch_size=256,callbacks = [time_callback])
time_to_train.append(time_callback.times)


history2=model2.fit(np.array(flatten_train_images), to_categorical(train_emotions), epochs=20, batch_size=256,callbacks = [time_callback])
time_to_train.append(time_callback.times)
history3=model3.fit(np.array(flatten_train_images), to_categorical(train_emotions), epochs=20, batch_size=256,callbacks = [time_callback])
time_to_train.append(time_callback.times)
history4=model4.fit(np.array(flatten_train_images), to_categorical(train_emotions), epochs=20, batch_size=256,callbacks = [time_callback])
time_to_train.append(time_callback.times)
history5=model5.fit(np.array(flatten_train_images), to_categorical(train_emotions), epochs=20, batch_size=256,callbacks = [time_callback])
time_to_train.append(time_callback.times)
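
The callback above measures the wall-clock duration of each `fit` call. The same measurement can be sketched without Keras using a context manager; `stopwatch` and `timings` are hypothetical names introduced here, and `time.sleep` stands in for a training run.

```python
import time
from contextlib import contextmanager

@contextmanager
def stopwatch(results, label):
    """Store the elapsed wall-clock time (in seconds) of the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start

timings = {}
with stopwatch(timings, "demo"):
    time.sleep(0.05)  # placeholder for something like model.fit(...)
```

Both `time.time()` and `time.perf_counter()` return seconds, so durations measured this way are in seconds rather than milliseconds.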



### (c.i) (3 points) Experiment on the validation set with different FNN hyper-parameters, e.g., layers, nodes per layer, activation function, dropout, weight regularization, etc. For each hyper-parameter combination that you have used, please report the following: (1) emotion classification accuracy on the training and validation sets; (2) running time for training the FNN; (3) parameters for each FNN. For 2-3 hyper-parameter combinations, please also plot the cross-entropy loss over the number of iterations during training. Note: If running the FNN takes a long time, you can subsample the input images to a smaller size (e.g., 24 x 24).

In [12]:
# print (time_to_train)
# for i in time_to_train:
#   print (sum(i))

validation_emotions=np.array(list(validation_data['emotion']))
validation_pixels=np.array(list(validation_data['pixels_int']))

flatten_validation_images = validation_pixels
validation_performances=[]
models=[model1,model2,model3,model4,model5]
model_names=['model1','model2','model3','model4','model5']
histories=[history1,history2,history3,history4,history5]
for i in range(len(models)):
  print (" ")
  print ("Required details for {}".format(model_names[i]))
  print (" ")
  performance1 = models[i].evaluate(flatten_train_images, to_categorical(train_emotions))
  print("Emotion Classification Accuracy on the Training set: {0}".format(performance1[1]))
  print (" ")
  performance2 = models[i].evaluate(flatten_validation_images, to_categorical(validation_emotions))
  validation_performances.append(performance2[1])
  print("Emotion Classification Accuracy on the Validation set: {0}".format(performance2[1]))
  print (" ")
  print ("Running time for training the FNN: {} s".format(time_to_train[i][0]))
  print (" ")
  print ("Parameters for the model:")
  print (" ")
  print (models[i].get_config())
  print (" ")
  print (models[i].summary())


  print (" ")
  print('Number of Epochs used to train the model: ', len(histories[i].history['loss']))

  

  print(" ")
  # print(history.history.keys())
  # summarize history for loss
  plt.plot(histories[i].history['loss'])
  plt.title('{} loss vs epochs'.format(model_names[i]))
  plt.ylabel('cross-entropy loss')
  plt.xlabel('epoch')
  plt.legend(['train'], loc='upper left')
  plt.show()
  # summarize history for accuracy
  plt.plot(histories[i].history['accuracy'])
  plt.title('{} accuracy vs epochs'.format(model_names[i]))
  plt.ylabel('accuracy')
  plt.xlabel('epoch')
  plt.legend(['train'], loc='upper left')
  plt.show()

  print (" ")
  print (" ")

# performance = model1.evaluate(flatten_test_images, to_categorical(validation_emotions))
# print("Accuracy on Test samples: {0}".format(performance[1]))
# performance = model2.evaluate(flatten_test_images, to_categorical(validation_emotions))
# print("Accuracy on Test samples: {0}".format(performance[1]))
# performance = model3.evaluate(flatten_test_images, to_categorical(validation_emotions))
# print("Accuracy on Test samples: {0}".format(performance[1]))
# performance = model4.evaluate(flatten_test_images, to_categorical(validation_emotions))
# print("Accuracy on Test samples: {0}".format(performance[1]))
# performance = model5.evaluate(flatten_test_images, to_categorical(validation_emotions))
# print("Accuracy on Test samples: {0}".format(performance[1]))
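
The loss curves plotted by the cell above are categorical cross-entropy values on one-hot labels. A small NumPy sketch of that computation follows, assuming a clipping constant `eps` of 1e-7 for numerical safety; the helper names are illustrative and this is not the Keras implementation itself.

```python
import numpy as np

def one_hot(labels, num_classes):
    """One-hot encode integer labels, similar in spirit to to_categorical."""
    return np.eye(num_classes)[np.asarray(labels)]

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum_k y_true[k] * log(y_pred[k])."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

y_true = one_hot([3, 0], num_classes=7)
uniform = np.full((2, 7), 1.0 / 7.0)   # a model that guesses every class equally
loss_uniform = categorical_cross_entropy(y_true, uniform)
```

For seven classes, a uniform guess gives a loss of ln 7 (about 1.95), so a training curve that stays near that value indicates the network has not learned beyond chance.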

### Answer:

Five different FNN models have been trained on the training dataset and evaluated on the validation dataset. The required details are shown above. 
The values are shown in tabular form below:

Model name | Accuracy on Train dataset (%) | Accuracy on Validation dataset (%) | Training Time (s) | Parameters Count
-----------|-------------------------------|------------------------------------|-------------------|-----------------
model1 | 92.24 | 46.86 | 12.81 | 5,230,463
model2 | 88.66 | 48.59 | 11.98 | 5,154,807
model3 | 88.71 | 47.20 | 12.60 | 7,115,007
model4 | 88.01 | 45.03 | 13.95 | 7,439,029
model5 | 68.06 | 45.11 | 14.74 | 7,238,507
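
The parameter counts in the table can be reproduced by hand: each `Dense` layer contributes `inputs * units` weights plus `units` biases, and `Dropout` adds none. The sketch below assumes seven output classes and the layer widths from the model definitions; `dense_param_count` is an illustrative helper, not a Keras function.

```python
def dense_param_count(layer_sizes):
    """Total weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by the width of every Dense layer.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Input is 48*48 = 2304 pixels; 784*2 = 1568, 784//2 = 392, 784//4 = 196.
fnn_layer_sizes = {
    "model1": [2304, 1568, 784, 392, 196, 7],
    "model2": [2304, 1568, 784, 392, 7],
    "model3": [2304, 2000, 1000, 500, 7],
    "model4": [2304, 2000, 1000, 500, 250, 784, 7],
    "model5": [2304, 2000, 1000, 500, 250, 7],
}
fnn_param_counts = {name: dense_param_count(sizes)
                    for name, sizes in fnn_layer_sizes.items()}
```

In every model the first layer dominates the total because of the 2304-wide input.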

### (c.ii) (1 point) Run the best model that was found based on the validation set from question (c.i) on the testing set. Report the emotion classification accuracy on the testing set.

### Answer: 
### The best model that was found based on the validation set is:

In [13]:
# print (validation_performances)
max_validation=validation_performances.index(max(validation_performances))
print (model_names[max_validation])
print (models[max_validation].summary())

print("Emotion Classification Accuracy on the Validation set for the model: {} ".format(validation_performances[max_validation]))

### The emotion classification accuracy of the model on the testing dataset is: 

In [14]:
test_emotions=np.array(list(test_data['emotion']))
test_pixels=np.array(list(test_data['pixels_int']))

flatten_test_images = test_pixels

test_performance = models[max_validation].evaluate(flatten_test_images, to_categorical(test_emotions))
print("Emotion Classification Accuracy on the Testing set: {0}".format(test_performance[1]))

### (d) (4 points) Image classification with CNNs: In this part, you will use a convolutional neural network (CNN) to perform the emotion classification task.

In [15]:
from tensorflow.keras.layers import Conv2D, Flatten, MaxPooling2D


common_features_1 = [Conv2D(32, kernel_size=3, activation='relu', input_shape=(48,48,1)), 
            Conv2D(32, kernel_size=3, activation='relu'), 
            MaxPooling2D(pool_size=(2,2)),
            Conv2D(64, kernel_size=3, activation='relu'),
            Conv2D(64, kernel_size=3, activation='relu'),
            MaxPooling2D(pool_size=(2,2)), Flatten(),]
classifier_1 = [Dense(512, activation='relu'), Dense(len(emotions), activation='softmax'),]

cnn_model_1 = Sequential(common_features_1+classifier_1)



common_features_2 = [Conv2D(64, kernel_size=3, activation='relu', input_shape=(48,48,1),use_bias=True), 
            Conv2D(64, kernel_size=3, activation='relu',use_bias=True), 
            MaxPooling2D(pool_size=(2,2)),
            Conv2D(128, kernel_size=3, activation='relu',use_bias=True,  kernel_regularizer =tf.keras.regularizers.l2( l=0.001)),
            Conv2D(128, kernel_size=3, activation='relu',use_bias=True,  kernel_regularizer =tf.keras.regularizers.l2( l=0.001)),
            MaxPooling2D(pool_size=(2,2)), Flatten(),]
classifier_2 = [Dense(512, activation='relu',use_bias=True,  kernel_regularizer =tf.keras.regularizers.l2( l=0.01)), Dense(len(emotions), activation='softmax',use_bias=True),]

cnn_model_2 = Sequential(common_features_2+classifier_2)


common_features_3 = [Conv2D(64, kernel_size=3, activation='elu',input_shape=(48,48,1),use_bias=True), 
            Conv2D(64, kernel_size=3, activation='elu',padding='same',use_bias=True), 
            MaxPooling2D(pool_size=(2,2)),
            Conv2D(128, kernel_size=3, activation='elu',padding='same',use_bias=True), 
            MaxPooling2D(pool_size=(2,2)),
            Conv2D(128, kernel_size=3, activation='elu', strides=(2, 2),padding='same',use_bias=True,  kernel_regularizer =tf.keras.regularizers.l1( l=0.001)),
            Conv2D(128, kernel_size=3, activation='elu', strides=(2, 2),padding='same',use_bias=True, kernel_regularizer =tf.keras.regularizers.l1( l=0.001)),
            MaxPooling2D(pool_size=(2,2)), Flatten(),]
classifier_3 = [Dense(512, activation='elu',use_bias=True,  kernel_regularizer =tf.keras.regularizers.l1( l=0.01)),Dropout(0.25), Dense(len(emotions), activation='softmax',use_bias=True),]

cnn_model_3 = Sequential(common_features_3+classifier_3)



In [16]:
# print(cnn_model_1.summary())  # Compare number of parameters against FNN

time_callback = TimeHistory()
cnn_model_1.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'],)
cnn_model_2.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'],)
cnn_model_3.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'],)
time_to_train_cnn=[]

train_images_3d = flatten_train_images.reshape(len(flatten_train_images),48,48,1)
test_images_3d = flatten_test_images.reshape(len(flatten_test_images),48,48,1)


cnn_history_1=cnn_model_1.fit(train_images_3d, to_categorical(train_emotions), epochs=15, batch_size=256,callbacks = [time_callback])
time_to_train_cnn.append(time_callback.times)
cnn_history_2=cnn_model_2.fit(train_images_3d, to_categorical(train_emotions), epochs=20, batch_size=256,callbacks = [time_callback])
time_to_train_cnn.append(time_callback.times)
cnn_history_3=cnn_model_3.fit(train_images_3d, to_categorical(train_emotions), epochs=20, batch_size=256,callbacks = [time_callback])
time_to_train_cnn.append(time_callback.times)



### (d.i) (3 points) Experiment on the validation set with different CNN hyper-parameters, e.g. layers, filter size, stride size, activation function, dropout, weight regularization, etc. For each hyper-parameter combination that you have used, please report the following: (1) emotion classification accuracy on the training and validation sets; (2) running time for training the FNN; (3) parameters for each CNN. How do these metrics compare to the FNN?

In [17]:
validation_images_3d = flatten_validation_images.reshape(len(flatten_validation_images),48,48,1)

validation_performances_cnn=[]
cnn_models=[cnn_model_1,cnn_model_2,cnn_model_3]
cnn_model_names=['cnn_model_1','cnn_model_2','cnn_model_3']
cnn_histories=[cnn_history_1,cnn_history_2,cnn_history_3]
for i in range(len(cnn_models)):
  print (" ")
  print ("Required details for {}".format(cnn_model_names[i]))
  print (" ")
  performance1_cnn = cnn_models[i].evaluate(train_images_3d, to_categorical(train_emotions))
  print("Emotion Classification Accuracy on the Training set: {0}".format(performance1_cnn[1]))
  print (" ")
  performance2_cnn = cnn_models[i].evaluate(validation_images_3d, to_categorical(validation_emotions))
  validation_performances_cnn.append(performance2_cnn[1])
  print("Emotion Classification Accuracy on the Validation set: {0}".format(performance2_cnn[1]))
  print (" ")
  print ("Running time for training the CNN: {} s".format(time_to_train_cnn[i][0]))
  print (" ")
  print ("Parameters for the model:")
  print (" ")
  print (cnn_models[i].get_config())
  print (" ")
  print (cnn_models[i].summary())


  print (" ")
  print('Number of Epochs used to train the model: ', len(cnn_histories[i].history['loss']))

  

  print(" ")
  # print(history.history.keys())
  # summarize history for loss
  plt.plot(cnn_histories[i].history['loss'])
  plt.title('{} loss vs epochs'.format(cnn_model_names[i]))
  plt.ylabel('cross-entropy loss')
  plt.xlabel('epoch')
  plt.legend(['train'], loc='upper left')
  plt.show()
  # summarize history for accuracy
  plt.plot(cnn_histories[i].history['accuracy'])
  plt.title('{} accuracy vs epochs'.format(cnn_model_names[i]))
  plt.ylabel('accuracy')
  plt.xlabel('epoch')
  plt.legend(['train'], loc='upper left')
  plt.show()

  print (" ")
  print (" ")






# performance = cnn_model.evaluate(test_images_3d, to_categorical(test_emotions))

# print("Accuracy on Test samples: {0}".format(performance[1]))

### Answer:

Three different CNN models have been trained on the training dataset and evaluated on the validation dataset. The required details are shown above.
The values are shown in tabular form below:

Model name | Accuracy on Train dataset (%) | Accuracy on Validation dataset (%) | Training Time (s) | Parameters Count
-----------|-------------------------------|------------------------------------|-------------------|-----------------
cnn_model_1 | 99.45 | 57.23 | 37.71 | 2,723,303
cnn_model_2 | 74.28 | 55.75 | 85.32 | 5,571,527
cnn_model_3 | 38.88 | 39.73 | 91.26 | 476,231
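
The CNN parameter counts can also be checked by hand. The sketch below covers only architectures shaped like `cnn_model_1` and `cnn_model_2` (blocks of unpadded 3x3 convolutions, each block followed by 2x2 max pooling, then a 512-unit dense layer and a 7-way output); `small_cnn_param_count` is a hypothetical helper written for this check.

```python
def small_cnn_param_count(blocks, side=48, in_channels=1,
                          dense_units=512, num_classes=7, kernel=3):
    """Count parameters for conv blocks + one hidden Dense layer + softmax output."""
    total = 0
    channels = in_channels
    for block in blocks:
        for filters in block:
            # kernel*kernel*in_channels weights per filter, plus one bias per filter
            total += kernel * kernel * channels * filters + filters
            channels = filters
            side -= kernel - 1      # 'valid' padding shrinks each spatial side
        side //= 2                  # 2x2 max pooling halves each spatial side
    flat = side * side * channels   # size of the flattened feature vector
    total += flat * dense_units + dense_units
    total += dense_units * num_classes + num_classes
    return total

cnn_1_params = small_cnn_param_count([(32, 32), (64, 64)])
cnn_2_params = small_cnn_param_count([(64, 64), (128, 128)])
```

Most of the parameters sit in the dense layer after `Flatten` (for `cnn_model_1`, 9 x 9 x 64 = 5184 inputs into 512 units), not in the convolutions.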

### (d.ii) (1 point) Run the best model that was found based on the validation set from question (d.i) on the testing set. Report the emotion classification accuracy on the testing set. How does this metric compare to the FNN?

### Answer: 
### The best model that was found based on the validation set is:

In [18]:
max_validation_cnn=validation_performances_cnn.index(max(validation_performances_cnn))
print (cnn_model_names[max_validation_cnn])
print (cnn_models[max_validation_cnn].summary())

print("Emotion Classification Accuracy on the Validation set for the cnn model: {} ".format(validation_performances_cnn[max_validation_cnn]))

### The emotion classification accuracy of the model on the testing dataset is:

In [19]:
test_images_3d = flatten_test_images.reshape(len(flatten_test_images),48,48,1)

test_performance_cnn = cnn_models[max_validation_cnn].evaluate(test_images_3d, to_categorical(test_emotions))
print("Emotion Classification Accuracy on the Testing set for the CNN: {0}".format(test_performance_cnn[1]))
print ("  ")
print("And Emotion Classification Accuracy on the Testing set for the FNN: {0}".format(test_performance[1]))

### We can see that the Emotion Classification Accuracy on the Testing set for the CNN (57.25%) is better than that of the FNN (47.50%).



### (e) (1 point) Bayesian optimization for hyper-parameter tuning: Instead of performing grid or random search to tune the hyper-parameters of the CNN, we can also try a model-based method for finding the optimal hyper-parameters through Bayesian optimization. This method performs a more intelligent search on the hyper-parameter space in order to estimate the best set of hyper-parameters for the data. Use publicly available libraries (e.g., hyperopt in Python) to perform a Bayesian optimization on the hyper-parameter space using the validation set. Report the emotion classification accuracy on the testing set.


In [20]:
!pip install hyperopt

In [21]:
from hyperopt import hp, fmin, tpe, STATUS_OK, Trials

In [22]:
def optimize_cnn(hyperparameter):
  
  # Define model using hyperparameters 
  cnn_model = Sequential([Conv2D(32, kernel_size=hyperparameter['conv_kernel_size'], activation=hyperparameter['activation_fn'], input_shape=(48,48,1)), 
            Conv2D(32, kernel_size=hyperparameter['conv_kernel_size'], activation=hyperparameter['activation_fn']), 
            MaxPooling2D(pool_size=(2,2)), Dropout(hyperparameter['dropout_prob']),
            Conv2D(64, kernel_size=hyperparameter['conv_kernel_size'], activation=hyperparameter['activation_fn']),
            Conv2D(128, kernel_size=hyperparameter['conv_kernel_size'], activation=hyperparameter['activation_fn']), 
            MaxPooling2D(pool_size=(2,2)), Dropout(hyperparameter['dropout_prob']), 
            Flatten(),
            Dense(512, hyperparameter['activation_fn']), 
            Dense(len(emotions), activation='softmax'),])
  
  cnn_model.compile(optimizer=hyperparameter['optimizer'], loss='categorical_crossentropy', metrics=['accuracy'],)


  train_X, train_y = train_images_3d, train_emotions
  valid_X, valid_y = validation_images_3d, validation_emotions

  _ = cnn_model.fit(train_X, to_categorical(train_y), epochs=10, batch_size=256, verbose=0)
  # Evaluating accuracy on validation data
  performance = cnn_model.evaluate(valid_X, to_categorical(valid_y), verbose=0)

  print("Hyperparameters: ", hyperparameter, "Accuracy: ", performance[1])
  print("----------------------------------------------------")

  return({"status": STATUS_OK, "loss": -1*performance[1], "model":cnn_model})
  

# Define search space for hyper-parameters
space = {
    # The kernel_size for convolutions:
    'conv_kernel_size': hp.choice('conv_kernel_size', [1, 3, 5]),
    # Uniform distribution in finding appropriate dropout values
    'dropout_prob': hp.uniform('dropout_prob', 0.1, 0.35),
    # Choice of optimizer 
    'optimizer': hp.choice('optimizer', ['Adam', 'sgd']),
    #choice of activation function
    'activation_fn': hp.choice('activation', ['relu', 'sigmoid', 'elu']),
}

trials = Trials()

# Find the best hyperparameters
best = fmin(
        optimize_cnn,
        space,
        algo=tpe.suggest,
        trials=trials,
        max_evals=25,
    )

print("==================================")
print("Best Hyperparameters", best)


test_model = trials.results[np.argmin([r['loss'] for r in trials.results])]['model']

performance = test_model.evaluate(test_images_3d, to_categorical(test_emotions))

print("==================================")
print("Test Accuracy: ", performance[1])

### The best hyperparameters are: {'activation': 'relu', 'conv_kernel_size': 5, 'dropout_prob': 0.20422812367824345, 'optimizer': 'Adam'}
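
One practical note when reading the output of `fmin`: for `hp.choice` dimensions it reports the index of the selected option rather than the option itself, so the dictionary has to be decoded against the search space (hyperopt provides `space_eval` for this). The pure-Python sketch below illustrates the mapping; `decode_choices` is a hypothetical helper and the index values in `raw_best` are illustrative, chosen to correspond to the hyperparameters reported above.

```python
# Option lists copied from the search space; keys follow the hp.choice labels.
choice_options = {
    "conv_kernel_size": [1, 3, 5],
    "optimizer": ["Adam", "sgd"],
    "activation": ["relu", "sigmoid", "elu"],
}

def decode_choices(best, options):
    """Replace hp.choice indices with the options they refer to; keep other values."""
    return {key: options[key][int(value)] if key in options else value
            for key, value in best.items()}

# Illustrative index-form result (not copied from an actual run).
raw_best = {"activation": 0, "conv_kernel_size": 2,
            "dropout_prob": 0.204, "optimizer": 0}
decoded = decode_choices(raw_best, choice_options)
```

Continuous dimensions such as `dropout_prob` are already reported as values and pass through unchanged.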


### The emotion classification accuracy on the testing set with the best hyperparameters is as follows:

In [23]:
performance = test_model.evaluate(test_images_3d, to_categorical(test_emotions))

print("==================================")
print("Test Accuracy: ", performance[1])

### (f) (Bonus - 1 point) Fine-tuning: Use a pre-trained CNN (e.g., the pre-trained example of the MNIST dataset that we saw in class, or any other available pre-trained CNN) and fine-tune it on the FER data. Please experiment with different fine-tuning hyper-parameters (e.g., layers to fine-tune, regularization during fine-tuning) on the validation set. Report the classification accuracy for all hyper-parameter combinations on the validation set. Also report the classification accuracy with the best hyper-parameter combination on the testing set.

In [53]:
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.applications.efficientnet import EfficientNetB0
from tensorflow.keras.applications.resnet50 import ResNet50

from tensorflow.keras.applications.inception_resnet_v2 import InceptionResNetV2
# Use the public tf.keras API so these layers match the Sequential model imported earlier
from tensorflow.keras.layers import Dense, Flatten, GlobalAveragePooling2D, Activation, Dropout, BatchNormalization
from tensorflow.keras.layers import RandomFlip, RandomRotation
from tensorflow.keras import Input, Model
from tensorflow.image import grayscale_to_rgb
from tensorflow import convert_to_tensor


model_ft_1 = Sequential()
model_ft_1.add(ResNet50(input_shape=(48,48,3), include_top=False, pooling='avg', weights="imagenet"))
# model.trainable=False
model_ft_1.add(Dense(512))
model_ft_1.add(Activation('relu'))
model_ft_1.add(Dense(1024))
model_ft_1.add(Activation('relu'))
model_ft_1.add(Dense(512))
model_ft_1.add(Activation('relu'))
model_ft_1.add(Dropout(0.3))
model_ft_1.add(Dense(len(emotions), activation='softmax'))
model_ft_1.layers[0].trainable = False
model_ft_1.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])



model_ft_2 = Sequential()
model_ft_2.add( MobileNetV2(input_shape=(48,48,3), include_top=False, pooling='avg', weights="imagenet"))
# model.trainable=False
model_ft_2.add(Dense(512))
model_ft_2.add(Activation('relu'))
model_ft_2.add(Dense(1024))
model_ft_2.add(Activation('relu'))
model_ft_2.add(Dense(512))
model_ft_2.add(Activation('relu'))
model_ft_2.add(Dropout(0.4))
model_ft_2.add(Dense(len(emotions), activation='softmax'))
model_ft_2.layers[0].trainable = False
model_ft_2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])




model_ft_3 = Sequential()
model_ft_3 .add( MobileNetV2(input_shape=(48,48,3), include_top=False, pooling='avg', weights="imagenet"))
# model.trainable=False
model_ft_3 .add(Dense(512, kernel_regularizer=regularizers.l2(0.01)))
model_ft_3 .add(Activation('elu'))
model_ft_3 .add(Dense(1024, kernel_regularizer=regularizers.l2(0.01)))
model_ft_3 .add(Activation('elu'))
model_ft_3 .add(Dense(512, kernel_regularizer=regularizers.l2(0.01)))
model_ft_3 .add(Activation('elu'))
model_ft_3 .add(Dropout(0.3))
model_ft_3 .add(Dense(len(emotions), activation='softmax'))
model_ft_3 .layers[0].trainable = False
model_ft_3 .compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])




train_images_3d_3=grayscale_to_rgb(convert_to_tensor(train_images_3d))

history_ft_1 = model_ft_1.fit(train_images_3d_3, to_categorical(train_emotions), epochs=10, batch_size=256, verbose=1)
history_ft_2 = model_ft_2.fit(train_images_3d_3, to_categorical(train_emotions), epochs=15, batch_size=256, verbose=1)
history_ft_3 = model_ft_3.fit(train_images_3d_3, to_categorical(train_emotions), epochs=15, batch_size=256, verbose=1)
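
The ImageNet-pretrained backbones expect three input channels, which is why the grayscale images are converted with `grayscale_to_rgb` before training. Conceptually this just copies the single channel three times; a NumPy sketch of that idea follows, where `gray_to_rgb` and `demo_batch` are illustrative names.

```python
import numpy as np

def gray_to_rgb(batch):
    """Repeat the single gray channel of a (N, H, W, 1) batch to get (N, H, W, 3)."""
    batch = np.asarray(batch)
    if batch.shape[-1] != 1:
        raise ValueError("expected a trailing channel dimension of size 1")
    return np.repeat(batch, 3, axis=-1)

# Hypothetical batch of two 48x48 single-channel images.
demo_batch = np.arange(2 * 48 * 48, dtype=np.float32).reshape(2, 48, 48, 1)
demo_rgb = gray_to_rgb(demo_batch)
```

No new information is added by the conversion; it only makes the tensor shape compatible with the pretrained first convolution.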


In [54]:
validation_images_3d_3=grayscale_to_rgb(convert_to_tensor(validation_images_3d))

# performance = model_ft_3.evaluate(validation_images_3d_3, to_categorical(validation_emotions))

# print("==================================")
# print("Test Accuracy: ", performance[1])


validation_performances_ft=[]
ft_models=[model_ft_1,model_ft_2,model_ft_3]
ft_model_names=['model_ft_1','model_ft_2','model_ft_3']
ft_histories=[history_ft_1,history_ft_2,history_ft_3]
for i in range(len(ft_models)):
  print (" ")
  print ("Required details for {}".format(ft_model_names[i]))
  print (" ")
  performance1_ft = ft_models[i].evaluate(train_images_3d_3, to_categorical(train_emotions))
  print("Emotion Classification Accuracy on the Training set: {0}".format(performance1_ft[1]))
  print (" ")
  performance2_ft = ft_models[i].evaluate(validation_images_3d_3, to_categorical(validation_emotions))
  validation_performances_ft.append(performance2_ft[1])
  print("Emotion Classification Accuracy on the Validation set: {0}".format(performance2_ft[1]))
  print (" ")
  print ("Parameters for the model:")
  print (" ")
  print (ft_models[i].summary())


  print (" ")
  print('Number of Epochs used to train the model: ', len(ft_histories[i].history['loss']))

  

  print(" ")
  # print(history.history.keys())
  # summarize history for loss
  plt.plot(ft_histories[i].history['loss'])
  plt.title('{} loss vs epochs'.format(ft_model_names[i]))
  plt.ylabel('cross-entropy loss')
  plt.xlabel('epoch')
  plt.legend(['train'], loc='upper left')
  plt.show()
  # summarize history for accuracy
  plt.plot(ft_histories[i].history['accuracy'])
  plt.title('{} accuracy vs epochs'.format(ft_model_names[i]))
  plt.ylabel('accuracy')
  plt.xlabel('epoch')
  plt.legend(['train'], loc='upper left')
  plt.show()

  print (" ")
  print (" ")




### Answer:

Three different fine-tuned models have been trained on the training dataset and evaluated on the validation dataset. The required details are shown above.
The values are shown in tabular form below:

Model name | Accuracy on Train dataset (%) | Accuracy on Validation dataset (%) | Total Trainable Parameters Count
-----------|-------------------------------|------------------------------------|---------------------------------
model_ft_1 | 35.55 | 34.05 | 2,102,791
model_ft_2 | 98.22 | 42.74 | 1,709,575
model_ft_3 | 90.38 | 40.54 | 1,709,575


### The best fine-tuned model that was found based on the validation set is:

In [55]:
max_validation_ft=validation_performances_ft.index(max(validation_performances_ft))
print (ft_model_names[max_validation_ft])
print (ft_models[max_validation_ft].summary())

print("Emotion Classification Accuracy on the Validation set for the fine-tuned model: {} ".format(validation_performances_ft[max_validation_ft]))

### The emotion classification accuracy of the best fine-tuned model (created using mobilenetv2) on the testing dataset is:

In [58]:
test_images_3d_3=grayscale_to_rgb(convert_to_tensor(test_images_3d))

test_performance_ft= ft_models[max_validation_ft].evaluate(test_images_3d_3, to_categorical(test_emotions))
print("Emotion Classification Accuracy on the Testing set for the fine-tuned model: {0}".format(test_performance_ft[1]))


### (g) (Bonus - 1 point) Data augmentation: Data augmentation is a way to increase the size of our dataset and reduce overfitting, especially when we use complicated models with many parameters to learn. Using any available toolbox or your own code, implement some of these techniques and augment the original FER data.

In [80]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator


shift=0.2
datagen1 = ImageDataGenerator(featurewise_center=True, featurewise_std_normalization=True, zca_whitening=True, rotation_range=90, width_shift_range=0.2, height_shift_range=0.2, horizontal_flip=True, vertical_flip=True)
datagen2 = ImageDataGenerator(featurewise_center=True, featurewise_std_normalization=True)
datagen3 = ImageDataGenerator(zca_whitening=True)
datagen4 = ImageDataGenerator(rotation_range=90)
datagen5 = ImageDataGenerator(width_shift_range=shift, height_shift_range=shift)

datagens=[datagen1, datagen2,datagen3, datagen4, datagen5]
labels=["Augmentation 1", "Augmentation 2", "Augmentation 3","Augmentation 4", "Augmentation 5"]
for ij,datagen in enumerate(datagens):
    datagen.fit(train_images_3d[:9])

    for X_batch, y_batch in datagen.flow(train_images_3d, train_emotions, batch_size=9):
        for i in range(0, 9):
            plt.subplot(330 + 1 + i)
            plt.title(labels[ij])
            plt.imshow(X_batch[i].reshape(48, 48), cmap=plt.get_cmap('gray'))
        plt.show()
        break

#### Answer:
Some outputs of the following data augmentation techniques are shown above:

1. datagen1 = ImageDataGenerator(featurewise_center=True, featurewise_std_normalization=True, zca_whitening=True, rotation_range=90, width_shift_range=0.2, height_shift_range=0.2, horizontal_flip=True, vertical_flip=True)
2. datagen2 = ImageDataGenerator(featurewise_center=True, featurewise_std_normalization=True)
3. datagen3 = ImageDataGenerator(zca_whitening=True)
4. datagen4 = ImageDataGenerator(rotation_range=90)
5. datagen5 = ImageDataGenerator(width_shift_range=shift, height_shift_range=shift)
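
To make two of the transformations listed above concrete, here is a hand-written NumPy sketch of a horizontal flip and an integer pixel shift. It is a simplification under stated assumptions: shifts are whole pixels and vacated pixels are filled with a constant, which is not necessarily how `ImageDataGenerator` fills them. The helper names are illustrative.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror a 2-D image left to right."""
    return image[:, ::-1]

def shift_image(image, dy, dx, fill=0.0):
    """Shift a 2-D image by dy rows (down) and dx columns (right), padding with `fill`."""
    height, width = image.shape
    shifted = np.full_like(image, fill)
    # Part of the source image that remains visible after the shift.
    src = image[max(0, -dy):min(height, height - dy),
                max(0, -dx):min(width, width - dx)]
    top, left = max(0, dy), max(0, dx)
    shifted[top:top + src.shape[0], left:left + src.shape[1]] = src
    return shifted

demo = np.arange(16.0).reshape(4, 4)    # tiny stand-in for a 48x48 face image
flipped = horizontal_flip(demo)
shifted = shift_image(demo, dy=1, dx=-1)  # one row down, one column left
```

For face images, horizontal flips and small shifts keep the expression recognizable, whereas vertical flips and rotations up to 90 degrees produce poses that are unlikely to appear in the test set.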


Now, training a CNN model using datagen1:

In [73]:
#training a model on augmented dataset
datagen1.fit(train_images_3d)
it = datagen1.flow(train_images_3d, to_categorical(train_emotions))


common_features = [Conv2D(32, kernel_size=3, activation='relu', input_shape=(48,48,1)), 
            Conv2D(32, kernel_size=3, activation='relu'), 
            MaxPooling2D(pool_size=(2,2)),
            Conv2D(64, kernel_size=3, activation='relu'),
            Conv2D(64, kernel_size=3, activation='relu'),
            MaxPooling2D(pool_size=(2,2)), Flatten(),]
classifier = [Dense(512, activation='relu'), Dense(len(emotions), activation='softmax'),]

cnn_model = Sequential(common_features+classifier)
cnn_model.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'],)
# fit() accepts generators directly; fit_generator is deprecated
history_cnn = cnn_model.fit(it, epochs=10)

The accuracy on the test dataset using this model is as follows:

In [81]:
performance = cnn_model.evaluate(test_images_3d, to_categorical(test_emotions))

print("Accuracy on Test samples: {0}".format(performance[1]))