## CS 893 Advanced Computer Vision (Assignment No 02)

This assignment addresses the classification of eight distinct facial expressions, together with prediction of the dimensional model of affect by assigning valence and arousal values to each image. The dataset comprises 287,651 training images and 4999 testing images. The training dataset is long-tailed: most of the images belong to the happy and neutral classes, and very few belong to the disgust and contempt classes. In addition, a class-balanced validation dataset with 100 images from each class has been extracted from the training dataset.
For feature extraction, two separate convolutional neural network pipelines (VGG and ResNet) have been used with transfer learning, starting from the learned filter coefficients of VGG-Face. The models were retrained on the training dataset, and a multi-output model was compiled with three heads (one classifier and two regression heads).

## Required Modules to be installed

In [None]:
# ! pip install krippendorff
# ! pip install pandas
# ! pip install matplotlib
# ! pip install tensorflow
# ! pip install scikit-learn
# ! pip install opencv-python

## Import Requisite Modules

In [3]:
import pandas as pd
import numpy as np
import cv2
import random
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense, GlobalAveragePooling2D, Dropout
from tensorflow.keras.layers import BatchNormalization, ReLU, Input, ZeroPadding2D, add, concatenate 
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import plot_model
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix, auc, classification_report 
from sklearn.metrics import roc_auc_score, ConfusionMatrixDisplay, average_precision_score, mean_squared_error
from scipy.stats import pearsonr
import os
import krippendorff # install krippendorff module by executing (pip install krippendorff) 
%matplotlib inline

## Dataset Preprocessing

It is assumed that the dataset has been unzipped and placed in the Dataset folder, which contains the folders test_set and train_set.
The dataset (images and annotations) is read and saved to csv files for the test and training data with the columns filename, label, valence and arousal.
The training dataset is further divided into a validation set and a training set. The validation set contains 100 images of each class.
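
The class-balanced split described above can be illustrated on a toy table. This is only a sketch: the file names and labels are made up, and `groupby(...).sample` is used here as a compact alternative to the per-class loop in the notebook.

```python
import pandas as pd

# Hypothetical toy annotations: three classes with 6, 4 and 2 images
toy = pd.DataFrame({
    "filename": [f"img_{i}.jpg" for i in range(12)],
    "label": [0] * 6 + [1] * 4 + [2] * 2,
})

# Draw the same number of rows from every class for validation
val = toy.groupby("label", group_keys=False).sample(n=2, random_state=0)

# Everything that was not drawn stays in the training set
train = toy.drop(val.index)

print(val["label"].value_counts().sort_index().to_dict())
print(len(train), len(val))
```

Because the sampled rows are removed by index, no image can appear in both sets.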

In [4]:
# This function returns corresponding image annotation for input image file
def GetImageAnnotation(filePath,annotType):
    # Annotation suffix for each type: 0 = expression label, 1 = valence, 2 = arousal
    annotationSuffix = {0: "_exp.npy", 1: "_val.npy", 2: "_aro.npy"}
    # Split with os.path so that the function works with any path separator
    imageDir, fileName = os.path.split(filePath)
    imageId = os.path.splitext(fileName)[0]
    # The annotations folder sits next to the images folder
    annotationPath = os.path.join(os.path.dirname(imageDir),"annotations",imageId+annotationSuffix[annotType])
    annotVal = np.load(annotationPath)
    return annotVal
# This function extracts annotations of image files
def ExtractAnnotations(filePath):
    label = GetImageAnnotation(filePath,0)
    valence = GetImageAnnotation(filePath,1)
    arousal = GetImageAnnotation(filePath,2)
    return label, valence, arousal
# This function creates a csv file from extracted annotations
def CreateCSVFile(imageDir, filename):
    fileNamesList = os.listdir(imageDir)
    frames = []
    for fileName in fileNamesList:
        filePath = os.path.join(imageDir,fileName)
        label, valence, arousal = ExtractAnnotations(filePath)
        frames.append(pd.DataFrame({'filename': fileName, 'label': [label], 'valence' : [valence], 'arousal' : [arousal]}))
    # Concatenate once at the end instead of growing the dataframe inside the loop
    data = pd.concat(frames, axis=0)
    data.to_csv(filename, index = False)
# Split training dataset into training and validation dataset    
def ExtractValidationset(dataFrame, noOfSamples, noOfClasses):
    df = dataFrame.copy()
    dfList = []
    for i in range(noOfClasses):
        # Sample noOfSamples rows of class i and remove them from the training pool
        dfClass = df.loc[df['label'] == i].sample(n=noOfSamples)
        dfList.append(dfClass)
        df = df.drop(dfClass.index)
    extractedDF = pd.concat(dfList, axis=0)

    df.to_csv(r'train_set.csv', index = False)
    extractedDF.to_csv(r'val_set.csv', index = False)
    print("Training and validation dataset files have been saved")
# Calculate class frequency for visualization
def CalculateClassFrequency(dataFrame,noOfClasses):
    df = dataFrame.copy()
    classFrequencyList = []
    for i in range(noOfClasses):
        tempdf = df.loc[df['label'] == i]
        classFrequencyList.append(len(tempdf))
    return classFrequencyList
# Calculate respective class weights for weighted softmax loss function
def CalculateClassWeight(classFrequencyList, beta):
    classFrequencyArray = np.array(classFrequencyList)
    effectiveNumber = 1.0 - np.power(beta,classFrequencyArray)
    weights = (1-beta) / np.array(effectiveNumber)
    weights = weights / np.sum(weights) * len(classFrequencyArray)
    print("Calculated weights are: " + str(weights))
    return weights
# This function plots class histogram
def PlotClassHistogram(classNameList, classFrequencyList, plotTitle):
    plt.bar(classNameList,classFrequencyList)
    plt.xlabel("Class Names")
    plt.ylabel("Number of Images")
    plt.title("Image Distribution for " + plotTitle)
    plt.show()
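
The weighting in `CalculateClassWeight` appears to follow the "effective number of samples" idea: each class gets a weight proportional to (1 - beta) / (1 - beta^n), where n is the class frequency, and the weights are then rescaled so that they sum to the number of classes. The sketch below reproduces that arithmetic on made-up counts; `class_balanced_weights` and `toy_counts` are hypothetical names, and the counts are not the real dataset frequencies.

```python
import numpy as np

def class_balanced_weights(class_counts, beta):
    """Weights proportional to (1 - beta) / (1 - beta**n), rescaled to sum to the class count."""
    counts = np.asarray(class_counts, dtype=np.float64)
    # Effective number of samples per class
    effective_number = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    weights = 1.0 / effective_number
    # Rescale so that the weights sum to the number of classes
    return weights / weights.sum() * len(counts)

# Hypothetical counts for illustration only
toy_counts = [100000, 50000, 5000, 500]
toy_weights = class_balanced_weights(toy_counts, beta=0.9999)
print(np.round(toy_weights, 3))
```

With beta close to 1, very large classes end up with nearly identical weights, while the rarest classes receive clearly larger ones.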

## Run the following block to form csv files for training, validation and test dataset for further processing through dataframes

In [None]:
classNameList = np.array(['Neutral','Happy','Sad','Surprise','Fear','Disgust','Anger','Contempt'])
imageDirPath = os.path.join('Dataset','train_set','images')
CreateCSVFile(imageDirPath, 'train_setComplete.csv')
df = pd.read_csv('train_setComplete.csv')
ExtractValidationset(df, 100, 8)
df = pd.read_csv('train_set.csv')
classFrequencyList = CalculateClassFrequency(df,8)
classWeights = CalculateClassWeight(classFrequencyList, 0.9999)
PlotClassHistogram(classNameList, classFrequencyList, 'Training Dataset')
imageDirPath = os.path.join('Dataset','test_set','images')
CreateCSVFile(imageDirPath, 'test_set.csv')
df = pd.read_csv('test_set.csv')
classFrequencyList = CalculateClassFrequency(df,8)
PlotClassHistogram(classNameList, classFrequencyList, 'Test Dataset')

## Installing VGG-Face

The following packages, with the versions shown, must be installed to use VGG-Face.
Remember to switch back to the latest versions of tensorflow and keras before running the other parts of the notebook.

In [None]:
# ! pip install tensorflow==1.14.0
# ! pip install keras==2.2.4
# ! pip install h5py==2.10.0 --force-reinstall
# ! pip install keras_vggface

VGG-Face provides three models pretrained on face image datasets, with VGG-16, ResNet50 and SENet50 backbone networks.
The models are downloaded and their weights are saved so that they can be used with the latest version of the tensorflow/keras framework.

In [None]:
from keras_vggface.vggface import VGGFace
# Based on VGG16 architecture -> old paper(2015)
vggfaceModel = VGGFace(model='vgg16')
# Based on RESNET50 architecture -> new paper(2017)
resnetModel = VGGFace(model='resnet50')
# Based on SENET50 architecture -> new paper(2017)
senetModel = VGGFace(model='senet50')
#Save weights of models
vggfaceModel.save_weights('vggface-16.h5')
resnetModel.save_weights('vggface-resnet50.h5')
senetModel.save_weights('vggface-senet50.h5')

## Support functions for building VGG-16 based CNN

Run this block if VGG-16 based backbone is desired

In [5]:
# This function is used for adding convolutions layers followed by Maxpool layer
def ConvolutionBlock(noOfLayers, noOfFilters, x, convBlockName):
    for i in range(noOfLayers):
        layerName = convBlockName+'_'+str(i+1)+'_3x3'
        x = Conv2D(noOfFilters,(3,3), strides=(1,1),padding='same', activation='relu', name = layerName)(x)
    layerName = 'maxPool_'+convBlockName.split('_')[1]+'_'+convBlockName.split('_')[2]
    x = MaxPool2D(2,strides=2, name = layerName)(x)
    return x
# This function cascades Convolutions blocks
def BuildVGGLearner(metaParameters,inputImg):
    x = inputImg
    noOfConvBlocks = metaParameters['noOfConvBlocks']
    for i in range(noOfConvBlocks):
        noOfLayers = metaParameters['convBlock'+str(i+1)]['noOfLayers']
        noOfFilters = metaParameters['convBlock'+str(i+1)]['noOfFilters']
        convBlockName = metaParameters['convBlock'+str(i+1)]['name']
        x = ConvolutionBlock(noOfLayers,noOfFilters,x,convBlockName)
    return x
# This function adds a classifier at the end of Convolution blocks
def BuildVGGClassifier(metaParameters,inputFeatures):
    x = inputFeatures
    noOfFCBlocks = metaParameters['noOfFCBlocks']
    for i in range(noOfFCBlocks):
        noOfNodes = metaParameters['fcBlock'+str(i+1)]['noOfNodes']
        activationType = metaParameters['fcBlock'+str(i+1)]['activation']
        layerName = metaParameters['fcBlock'+str(i+1)]['name']+'_'+activationType
        x = Dense(noOfNodes, activation = activationType, name = layerName)(x)
        x = Dropout(0.2, seed = 210)(x)
    return x
# This function assembles convolution groups and classifier to form a complete VGG-16 model
def BuildVGGModel(metaParameters,inputImg):
    x = BuildVGGLearner(metaParameters,inputImg)
    layerName = 'flattenLayer_VGG'
    x = Flatten(name = layerName)(x)
    x =  BuildVGGClassifier(metaParameters,x)
    model = Model(inputs = inputImg, outputs = x)
    return model

Run the following block to build a VGG-16 model. The softmax layer comprises 2622 nodes, since VGG-Face (VGG-16) was trained to classify 2622 face classes. The architecture of the model (number of layers and filters) is defined by a dictionary of metaparameters.
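
As a quick sanity check of the metaparameters, the layer count and the size of the flattened feature vector can be worked out by hand. The sketch below copies only the layer and filter counts from the dictionary; the variable names are hypothetical.

```python
# Mirror of the VGG-16 entries in the metaparameter dictionary: (noOfLayers, noOfFilters)
vgg_blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
num_fc_layers = 3
input_size = 224

num_conv_layers = sum(n for n, _ in vgg_blocks)
num_weight_layers = num_conv_layers + num_fc_layers

# 'same' padding keeps the spatial size inside a block; each 2x2 max-pool halves it
feature_size = input_size // (2 ** len(vgg_blocks))
flattened_length = feature_size * feature_size * vgg_blocks[-1][1]

print(num_conv_layers, num_weight_layers)  # 13 16
print(feature_size, flattened_length)      # 7 25088
```

The 13 convolutional and 3 fully connected layers give the 16 weight layers that the name VGG-16 refers to.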

In [7]:
inputImg = Input(shape =(224,224,3), name = 'inputLayer')
metaParameters = {'noOfConvBlocks' : 5, 
                 'convBlock1' : {'name' : 'convBlock_VGG_1','noOfLayers' : 2, 'noOfFilters' : 64},
                 'convBlock2' : {'name' : 'convBlock_VGG_2','noOfLayers' : 2, 'noOfFilters' : 128},
                 'convBlock3' : {'name' : 'convBlock_VGG_3','noOfLayers' : 3, 'noOfFilters' : 256},
                 'convBlock4' : {'name' : 'convBlock_VGG_4','noOfLayers' : 3, 'noOfFilters' : 512},
                 'convBlock5' : {'name' : 'convBlock_VGG_5','noOfLayers' : 3, 'noOfFilters' : 512},
                 'noOfFCBlocks' : 3, 
                 'fcBlock1'   : {'name' : 'fcBlock_VGG_1','noOfNodes' : 4096, 'activation' : 'relu'},
                 'fcBlock2'   : {'name' : 'fcBlock_VGG_2','noOfNodes' : 4096, 'activation' : 'relu'},
                 'fcBlock3'   : {'name' : 'fcBlock_VGG_3','noOfNodes' : 2622, 'activation' : 'softmax'}}
model = BuildVGGModel(metaParameters,inputImg)
# model.summary()
# plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

## Support functions for building Resnet based CNN

Run this block for using the Resnet50 based backbone

In [8]:
def AddConvLayer(noOfFilters,kernel,stride,name,actFlag,inFeature,padding='valid'):
    x = Conv2D(noOfFilters,kernel_size=kernel, strides=stride,padding=padding, use_bias=False, name = name)(inFeature)
    x = BatchNormalization()(x)
    if actFlag:
        x = ReLU()(x)
    return x
    
def BuildResnetStem(metaParameters, inputImg):
    convBlockName = metaParameters['convBlock1']['name']
    noOfFilters = metaParameters['convBlock1']['noOfFilters']
    layerName = convBlockName  + '_1_7x7'
    x = AddConvLayer(noOfFilters=noOfFilters,kernel=(7,7),stride=(2,2),name=layerName,actFlag=True,inFeature=inputImg,padding='same')
    
    layerName = 'maxPool_'+convBlockName.split('_')[1]+'_'+convBlockName.split('_')[2]
    x = MaxPool2D(pool_size=(3,3), strides=(2,2), name = layerName)(x)
    return x

def ConvolutionBlock(noOfFilters, x, convBlockName, stride =(2,2)):
    layerName = convBlockName + '_1_1x1_proj'
    shortcutLayer = AddConvLayer(noOfFilters=4*noOfFilters,kernel=(1,1),stride=stride,name=layerName,actFlag=False,inFeature=x)

    layerName = convBlockName + '_1_1x1_reduce'
    x = AddConvLayer(noOfFilters=noOfFilters,kernel=(1,1),stride=stride,name=layerName,actFlag=True,inFeature=x)

    layerName = convBlockName + '_1_3x3'
    x = AddConvLayer(noOfFilters=noOfFilters,kernel=(3,3),stride=(1,1),name=layerName,actFlag=True,inFeature=x,padding='same')

    layerName = convBlockName + '_1_1x1_increase'
    x = AddConvLayer(noOfFilters=4*noOfFilters,kernel=(1,1),stride=(1,1),name=layerName,actFlag=False,inFeature=x)
    layerName = convBlockName+'_Add'
    x = add([x, shortcutLayer], name = layerName)
    x = ReLU()(x)
    return x

def BottleneckBlock(noOfFilters, x, convBlockName, bottleneckBlockNo):
    shortcutLayer = x
    layerName = convBlockName+'_'+str(bottleneckBlockNo)+'_1x1_reduce'
    x = AddConvLayer(noOfFilters=noOfFilters,kernel=(1,1),stride=(1,1),name=layerName,actFlag=True,inFeature=x)
    
    layerName = convBlockName+'_'+str(bottleneckBlockNo)+'_3x3'
    x = AddConvLayer(noOfFilters=noOfFilters,kernel=(3,3),stride=(1,1),name=layerName,actFlag=True,inFeature=x,padding='same')

    layerName = convBlockName+'_'+str(bottleneckBlockNo)+'_1x1_increase'
    x = AddConvLayer(noOfFilters=4*noOfFilters,kernel=(1,1),stride=(1,1),name=layerName,actFlag=False,inFeature=x)
    
    layerName = 'BottleneckBlock_'+convBlockName.split('_')[2]+'_'+str(bottleneckBlockNo)+'_Add'
    x = add([shortcutLayer, x], name=layerName)
    x = ReLU()(x)
    return x 

def BuildResnetClassifier(metaParameters,inputFeatures):
    x = inputFeatures
    noOfFCBlocks = metaParameters['noOfFCBlocks']
    for i in range(noOfFCBlocks):
        noOfNodes = metaParameters['fcBlock'+str(i+1)]['noOfNodes']
        activationType = metaParameters['fcBlock'+str(i+1)]['activation']
        layerName = metaParameters['fcBlock'+str(i+1)]['name'] +'_'+activationType
        x = Dense(noOfNodes, activation = activationType, name = layerName)(x)
        x = Dropout(0.3, seed = 210)(x)
    return x

def BuildResnetLearner(metaParameters,x):
    noOfConvBlocks = metaParameters['noOfConvBlocks']
    for i in range(2,noOfConvBlocks+1):
        convBlockName = metaParameters['convBlock'+str(i)]['name']
        noOfFilters = metaParameters['convBlock'+str(i)]['noOfFilters']
        noOfBottleneckBlocks = metaParameters['convBlock'+str(i)]['noOfBottleneckBlocks']
        # The first stage keeps the resolution; later stages downsample with stride 2
        if i == 2:
            x = ConvolutionBlock(noOfFilters, x, convBlockName, stride =(1,1))
        else:
            x = ConvolutionBlock(noOfFilters, x, convBlockName)

        # Bottleneck blocks are numbered from 2 because the convolution block is block 1
        for j in range(noOfBottleneckBlocks):
            x = BottleneckBlock(noOfFilters, x, convBlockName, j+2)
    return x

def BuildResnetModel(metaParameters,inputImg):
    x = BuildResnetStem(metaParameters,inputImg)
    x = BuildResnetLearner(metaParameters,x)
    layerName = 'globalAveragePooling_Resnet'
    x = GlobalAveragePooling2D(name=layerName)(x)
    x = BuildResnetClassifier(metaParameters,x)
    model = Model(inputs = inputImg, outputs = x)
    return model


Run the following block to build a ResNet50 model. The softmax layer comprises 8631 nodes, since VGG-Face (ResNet50) was trained to classify 8631 face classes. The architecture of the model (number of layers and filters) is defined by a dictionary of metaparameters.
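
The metaparameters can be checked against the "50" in the model name. The sketch below counts the convolutions on the main path, assuming each stage consists of one convolution block followed by the listed number of bottleneck blocks, as in `BuildResnetLearner`; the 1x1 projection shortcuts are not counted. Variable names are hypothetical.

```python
# Stage number -> noOfBottleneckBlocks, copied from the metaparameter dictionary
resnet_stages = {2: 2, 3: 3, 4: 5, 5: 2}
convs_per_block = 3  # 1x1 reduce, 3x3, 1x1 increase

# Every stage starts with one convolution block followed by its bottleneck blocks
blocks_per_stage = {stage: 1 + n for stage, n in resnet_stages.items()}
main_path_convs = convs_per_block * sum(blocks_per_stage.values())

# Add the 7x7 stem convolution and the final fully connected layer
total_weight_layers = 1 + main_path_convs + 1

# Each block expands its filter count by a factor of 4 at the output
final_channels = 4 * 512

print(blocks_per_stage)      # {2: 3, 3: 4, 4: 6, 5: 3}
print(total_weight_layers)   # 50
print(final_channels)        # 2048
```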

In [10]:
layerName = 'inputLayer'
inputImg = Input(shape =(224,224,3),name=layerName)
metaParameters = {'noOfConvBlocks' : 5, 
                 'convBlock1' : {'name' : 'convBlock_Resnet_1','noOfBottleneckBlocks' : 0, 'noOfFilters' : 64},
                 'convBlock2' : {'name' : 'convBlock_Resnet_2','noOfBottleneckBlocks' : 2, 'noOfFilters' : 64},
                 'convBlock3' : {'name' : 'convBlock_Resnet_3','noOfBottleneckBlocks' : 3, 'noOfFilters' : 128},
                 'convBlock4' : {'name' : 'convBlock_Resnet_4','noOfBottleneckBlocks' : 5, 'noOfFilters' : 256},
                 'convBlock5' : {'name' : 'convBlock_Resnet_5','noOfBottleneckBlocks' : 2, 'noOfFilters' : 512},
                 'noOfFCBlocks' : 1, 
                 'fcBlock1'   : {'name' : 'fcBlock_Resnet_1','noOfNodes' : 8631, 'activation' : 'softmax'}}
model = BuildResnetModel(metaParameters,inputImg)
# plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

## Support Functions for Facial Expression classifier

In [11]:
# This function defines a weighted softmax loss function for handling the long tail dataset.
def WeightedSoftmax(y_true, y_pred):
    class_weight = tf.convert_to_tensor(np.array([0.57261169,0.57229192,0.62097051,0.7573698,1.21360446,1.80900564,0.62412396,1.83002202]),dtype=tf.float32)
    # Flatten the labels to shape (batch,) and cast them to integer indices so that
    # the gathered weights line up one-to-one with the per-sample losses
    y_true = tf.cast(tf.reshape(y_true, [-1]), tf.int32)
    unreduced_scce = tf.keras.losses.SparseCategoricalCrossentropy(reduction = tf.keras.losses.Reduction.NONE)
    loss = unreduced_scce(y_true, y_pred)
    weight_mask = tf.gather(class_weight, y_true)
    loss = tf.math.multiply(loss, weight_mask)
    loss = tf.reduce_mean(loss)
    return loss
# This function adds a regression block by cascading dense layers after the CNN
def BuildRegressionBlock(metaParameters,inputFeatures, outputType):
    x = inputFeatures
    noOfRegressionBlocks = metaParameters['noOfRegressionBlocks']
    for i in range(noOfRegressionBlocks):
        # Read the dedicated regression entries (fcBlockReg1, fcBlockReg2, ...)
        noOfNodes = metaParameters['fcBlockReg'+str(i+1)]['noOfNodes']
        activationType = metaParameters['fcBlockReg'+str(i+1)]['activation']
        layerName = metaParameters['fcBlockReg'+str(i+1)]['name']+'_'+outputType+'_'+activationType
        x = Dense(noOfNodes, activation = activationType, name = layerName)(x)
        x = Dropout(0.2, seed = 210)(x)
    return x
# This function loads the saved weights, extracts the feature descriptor layers of VGG-Face and makes only the desired layers trainable
def ExtractFaceDescriptorModel(model,LastExtractedLayer, LastTrainableLayer,modelWeightPath):
    model.load_weights(modelWeightPath)
    faceDescriptorModel = Model(inputs=model.layers[0].input, outputs=model.get_layer(LastExtractedLayer).output)
    # vgg_face_descriptor.summary()
    for layer in faceDescriptorModel.layers:
        if (layer.name == LastTrainableLayer): break
        layer.trainable = False
    return faceDescriptorModel

Run this block to use VGG-16 based backbone

In [None]:
# This function assembles convolution groups and classifier to form a complete Facial Expression Recognition model with multi-output
def BuildFERModel(model,metaParameters):
    inputLayer = model.layers[0].input
    x = model.layers[-1].output
    layerName = 'globalAveragePooling_FER'
    x = GlobalAveragePooling2D(name=layerName)(x)
    classificationOutput = BuildVGGClassifier(metaParameters,x)
    classificationOutput = Dense(8, activation = 'softmax', name = 'classification_output')(classificationOutput)
    valenceOutput = BuildRegressionBlock(metaParameters,x, 'valence')
    valenceOutput = Dense(1, activation = 'tanh', name = 'valence_output')(valenceOutput)
    arousalOutput = BuildRegressionBlock(metaParameters,x, 'arousal')
    arousalOutput = Dense(1, activation = 'tanh', name = 'arousal_output')(arousalOutput)
    # model = Model(inputs = [inputLayer,landmarks], outputs = [classificationOutput, valenceOutput, arousalOutput])
    model = Model(inputs = inputLayer, outputs = [classificationOutput, valenceOutput, arousalOutput])
    return model

Run this block for using Resnet50 based backbone

In [None]:
def BuildFERModel(model,metaParameters):
    inputLayer = model.layers[0].input
    x = model.layers[-2].output
    x = Conv2D(1024,kernel_size=(1,1), strides=(1,1), activation='relu')(x)
    layerName = 'globalAveragePooling_FER'
    x = GlobalAveragePooling2D(name=layerName)(x)
    classificationOutput = BuildResnetClassifier(metaParameters,x)
    classificationOutput = Dense(8, activation = 'softmax', name = 'classification_output')(classificationOutput)
    valenceOutput = BuildRegressionBlock(metaParameters,x, 'valence')
    valenceOutput = Dense(1, activation = 'tanh', name = 'valence_output')(valenceOutput)
    arousalOutput = BuildRegressionBlock(metaParameters,x, 'arousal')
    arousalOutput = Dense(1, activation = 'tanh', name = 'arousal_output')(arousalOutput)
    # model = Model(inputs = [inputLayer,landmarks], outputs = [classificationOutput, valenceOutput, arousalOutput])
    model = Model(inputs = inputLayer, outputs = [classificationOutput, valenceOutput, arousalOutput])
    return model

## Load saved weights to the VGG-Face (VGG-16) model and extract desired layers

In [None]:
modelWeightPath = 'vggface-16.h5'
LastExtractedLayer = 'maxPool_VGG_5'
LastTrainableLayer = 'maxPool_VGG_4'
faceDescriptorModel = ExtractFaceDescriptorModel(model,LastExtractedLayer, LastTrainableLayer,modelWeightPath)

Run the following block to build a custom model for facial expression recognition with the VGG-16 backbone. The softmax layer comprises 8 nodes to classify the eight facial expressions. The model also has two regression heads for predicting valence and arousal.
Since the dataset is long-tailed, a custom loss function (weighted softmax loss) has been used to give more priority to the less represented classes.
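
The effect of the weighted loss can be seen on two samples that are predicted equally badly. The following is a NumPy sketch of a weighted sparse cross-entropy, not the TensorFlow loss used in the notebook; the function name, weights and probabilities are hypothetical.

```python
import numpy as np

def weighted_sparse_cross_entropy(y_true, y_prob, class_weight):
    """Mean of per-sample cross-entropy, each scaled by the weight of its true class."""
    y_true = np.asarray(y_true, dtype=np.int64).reshape(-1)
    y_prob = np.asarray(y_prob, dtype=np.float64)
    # Negative log-probability assigned to the true class of each sample
    per_sample = -np.log(y_prob[np.arange(len(y_true)), y_true])
    return float(np.mean(per_sample * np.asarray(class_weight)[y_true]))

toy_weights = np.array([0.5, 2.0])   # hypothetical: class 1 is the rare class
toy_probs = np.array([[0.5, 0.5]])   # the model is equally unsure in both cases

loss_common = weighted_sparse_cross_entropy([0], toy_probs, toy_weights)
loss_rare = weighted_sparse_cross_entropy([1], toy_probs, toy_weights)
print(loss_common, loss_rare)
```

The prediction is identical in both cases, so the two losses differ only by the ratio of the class weights, which is what pushes the optimizer towards the rare class.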

In [None]:
metaParametersFER = {'noOfFCBlocks' : 2, 
                 'fcBlock1'   : {'name' : 'fcBlock_FER_1','noOfNodes' : 512, 'activation' : 'relu'},
                 'fcBlock2'   : {'name' : 'fcBlock_FER_2','noOfNodes' : 512, 'activation' : 'relu'},
                 'noOfRegressionBlocks' : 2, 
                 'fcBlockReg1'   : {'name' : 'fcBlock_Reg_1','noOfNodes' : 512, 'activation' : 'relu'},
                 'fcBlockReg2'   : {'name' : 'fcBlock_Reg_2','noOfNodes' : 512, 'activation' : 'relu'}}

model = BuildFERModel(model = faceDescriptorModel,metaParameters = metaParametersFER)
model.compile(optimizer='adam',loss={'classification_output': WeightedSoftmax,'valence_output': 'mse', 'arousal_output': 'mse'}, 
              loss_weights={'classification_output': 1.0, 'valence_output': 0.25, 'arousal_output': 0.25}, 
              metrics = {'classification_output': 'accuracy', 'valence_output': 'mse', 'arousal_output' : 'mse'})
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

## Load saved weights to the VGG-Face (Resnet50) model and extract desired layers

In [None]:
modelWeightPath = 'vggface-resnet50.h5'
LastExtractedLayer = 'globalAveragePooling_Resnet'
LastTrainableLayer = 'BottleneckBlock_3_4_Add'
faceDescriptorModel = ExtractFaceDescriptorModel(model,LastExtractedLayer, LastTrainableLayer,modelWeightPath)
# faceDescriptorModel.summary()

Run the following block to build a custom model for facial expression recognition using the ResNet50 backbone. The softmax layer comprises 8 nodes to classify the eight facial expressions. The model also has two regression heads for predicting valence and arousal. Since the dataset is long-tailed, a custom loss function (weighted softmax loss) has been used to give more priority to the less represented classes.

In [None]:
metaParametersFER = {'noOfFCBlocks' : 2, 
                 'fcBlock1'   : {'name' : 'fcBlock_FER_1','noOfNodes' : 512, 'activation' : 'relu'},
                 'fcBlock2'   : {'name' : 'fcBlock_FER_2','noOfNodes' : 512, 'activation' : 'relu'},
                 'noOfRegressionBlocks' : 2, 
                 'fcBlockReg1'   : {'name' : 'fcBlock_Reg_1','noOfNodes' : 512, 'activation' : 'relu'},
                 'fcBlockReg2'   : {'name' : 'fcBlock_Reg_2','noOfNodes' : 512, 'activation' : 'relu'}}

model = BuildFERModel(faceDescriptorModel,metaParametersFER)
model.compile(optimizer='adam',loss={'classification_output': WeightedSoftmax,'valence_output': 'mse', 'arousal_output': 'mse'}, 
              loss_weights={'classification_output': 1.0, 'valence_output': 0.25, 'arousal_output': 0.25}, 
              metrics = {'classification_output': 'accuracy', 'valence_output': 'mse', 'arousal_output' : 'mse'})
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

## Train the model

Run the following block to train the model. Here 'flow_from_dataframe' has been used to send image batches for training.

In [None]:
checkpoint = ModelCheckpoint ( 'FER-Checkpoint.h5' , save_best_only = True , monitor = 'val_loss' )
earlystop = EarlyStopping(monitor = 'val_loss', patience = 4)
train_dir = os.path.join('Dataset','train_set','images')
dfTrain=pd.read_csv(r"train_set.csv")
dfVal=pd.read_csv(r"val_set.csv")
datagenTrain=ImageDataGenerator(rescale=1./255, rotation_range=10, zoom_range=0.2, brightness_range=[0.8,1.2], fill_mode='nearest')
datagenVal=ImageDataGenerator(rescale=1./255)
train_generator=datagenTrain.flow_from_dataframe(dataframe=dfTrain, directory=train_dir, x_col="filename", 
                        y_col=["label","valence","arousal"], class_mode="multi_output", target_size=(224,224), batch_size=64, seed=210)
val_generator=datagenVal.flow_from_dataframe(dataframe=dfVal, directory=train_dir, x_col="filename", 
                      y_col=["label","valence","arousal"], class_mode="multi_output", target_size=(224,224), batch_size=64, shuffle = False)
history = model.fit(train_generator,epochs=31, validation_data= val_generator, callbacks =[checkpoint,earlystop])

## Support Functions for model evaluation

In [5]:
# This function is used for plotting the training progress of the model
def PlotTrainingGraph(history, graphType):
    trainingValue = history.history[graphType]
    validationValue = history.history['val_'+graphType]
    plot_epochs = range(1, len(trainingValue)+1)
    if (len(graphType.split('_')) == 3):
        temp1 = graphType.split('_')[0]
        temp2 = graphType.split('_')[2]
    else:
        temp1 = graphType
        temp2 = 'Cumulative'
    plt.plot(plot_epochs, trainingValue, 'r', label='Training')
    plt.plot(plot_epochs, validationValue, 'b', label='Validation')
    plt.title(temp1+' '+temp2)
    plt.ylabel(temp2)  #Y-axis label
    plt.xlabel('epoch')  #X-axis label
    plt.legend()
    plt.show()
# This function is used for printing Confusion Matrix     
def DisplayConfusions(labelsY, predictions,class_names):
    confMat = confusion_matrix(labelsY, predictions)
    print(confMat)
    plt.rcParams["figure.figsize"] = (14,7)
    disp = ConfusionMatrixDisplay(confusion_matrix=confMat, display_labels=class_names)
    
    disp.plot(xticks_rotation = 90)
    plt.show()
# This function is used for calculating concordance correlation coefficient    
def ConCorelationCoef(groundTruth, predicted):
    # means
    x_m = np.mean(groundTruth)
    y_m = np.mean(predicted)
    # covariance between groundTruth and predicted (divided by N, consistent with np.var below)
    s_xy = np.mean((groundTruth - x_m) * (predicted - y_m))
    # variances
    s_x_sq = np.var(groundTruth)
    s_y_sq = np.var(predicted)
    # concordance correlation coefficient
    ccc = (2.0*s_xy) / (s_x_sq + s_y_sq + (x_m-y_m)**2)
    return ccc
# This function is used for calculating the Sign Agreement score (SAGR)
def SAGRScore(groundTruth, predicted):
    # Fraction of samples whose ground truth and prediction have the same non-zero sign
    prod = np.multiply(groundTruth, predicted)
    sagr = np.mean(prod > 0)
    return sagr
# This function is used for extracting ground truth values from dataset
def ExtractGroundTruth(groundTruthType,groundTruthFile):
    df = pd.read_csv(groundTruthFile, index_col=False)
    return df[groundTruthType].to_numpy()
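# This function displays up to 25 images with their predicted labels
# (it relies on the global predictedLabels and is redefined later in the notebook)
def DisplayImages(indexes, filename):
    df = pd.read_csv(filename)
    filenames = df['filename']
    for i, index in enumerate(indexes[:25]):
        plt.subplot(5,5,i+1)
        imageFilePath = os.path.join('Dataset','val_set','images',filenames.loc[index])
        img = cv2.cvtColor(cv2.imread(imageFilePath), cv2.COLOR_BGR2RGB)
        plt.imshow(img)
        plt.title(predictedLabels[index][0])
    plt.show()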
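
The two agreement measures defined above can be checked on a small example. This is a sketch with made-up valence values and hypothetical function names; it shows that a constant offset leaves the Pearson correlation at 1 while lowering the concordance correlation coefficient.

```python
import numpy as np

def concordance_cc(x, y):
    """Concordance correlation coefficient using population variance and covariance."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    covariance = np.mean((x - x.mean()) * (y - y.mean()))
    return 2.0 * covariance / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

def sign_agreement(x, y):
    """Fraction of pairs whose product is positive, i.e. both values share a sign."""
    return float(np.mean(np.asarray(x) * np.asarray(y) > 0))

truth = np.array([-0.8, -0.2, 0.1, 0.5, 0.9])  # hypothetical ground-truth valence
shifted = truth + 0.5                          # predictions with a constant bias

pearson = np.corrcoef(truth, shifted)[0, 1]
ccc_same = concordance_cc(truth, truth)
ccc_shifted = concordance_cc(truth, shifted)
sagr_shifted = sign_agreement(truth, shifted)
print(pearson, ccc_same, ccc_shifted, sagr_shifted)
```

The bias term `(x.mean() - y.mean()) ** 2` in the denominator is what penalizes the offset, which is why CCC is reported alongside the plain correlation.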

## Plot training and validation graphs

In [None]:
PlotTrainingGraph(history, 'classification_output_accuracy')
PlotTrainingGraph(history, 'loss')
PlotTrainingGraph(history, 'classification_output_loss')
PlotTrainingGraph(history, 'valence_output_mse')
PlotTrainingGraph(history, 'arousal_output_mse')

## Predict on the test dataset

Run following cell to make predictions on the test dataset

In [None]:
test_dir = os.path.join('Dataset','test_set','images') # same folder that was used to create test_set.csv
dfTest=pd.read_csv(r"test_set.csv")
datagen=ImageDataGenerator(rescale=1./255)
test_generator=datagen.flow_from_dataframe(dataframe=dfTest, directory=test_dir, x_col="filename", 
                        y_col=["label","valence","arousal"], class_mode="multi_output", target_size=(224,224), batch_size=64, shuffle=False)
predictions = model.predict(test_generator,verbose=1)

Found 3999 validated image filenames.


## Extract predicted (labels/values) and ground truths

In [8]:
predictedLabels = np.argmax(predictions[0], axis=-1)
predictedLabelsProb = predictions[0]
groundTruthLabels = ExtractGroundTruth('label','test_set.csv')
groundTruthLabelsHotencoded = to_categorical(groundTruthLabels, 8)
groundTruthLabels = groundTruthLabels.reshape(groundTruthLabels.shape[0],1)
predictedLabels = predictedLabels.reshape(predictedLabels.shape[0],1)
valencePredicted = predictions[1]
arousalPredicted = predictions[2]
valencePredicted = np.squeeze(np.asarray(valencePredicted))
arousalPredicted = np.squeeze(np.asarray(arousalPredicted))
valenceGroundTruth = ExtractGroundTruth('valence','test_set.csv')
arousalGroundTruth = ExtractGroundTruth('arousal','test_set.csv')

## Extract metrics for test dataset evaluation

In [1]:
class_names = ["Neutral", "Happy", "Sad", "Surprise", "Fear", "Disgust", "Anger", "Contempt"]
accuracy = accuracy_score(groundTruthLabels, predictedLabels)
kappa = cohen_kappa_score(groundTruthLabels, predictedLabels)
confMatrix = confusion_matrix(groundTruthLabels, predictedLabels)
classificationReport = classification_report(groundTruthLabels,predictedLabels,target_names=class_names)
roc_auc = roc_auc_score(groundTruthLabelsHotencoded, predictedLabelsProb)
aucPR = average_precision_score(groundTruthLabelsHotencoded, predictedLabelsProb)

print('Accuracy of the trained model is '+str(accuracy))
print('Cohens Kappa of the trained model is '+str(kappa))
print('Area under the ROC of the trained model is '+str(roc_auc))
print('Area under the Precision-Recall curve of the trained model is '+str(aucPR))
print(classificationReport)
# print(confMatrix)
DisplayConfusions(groundTruthLabels,predictedLabels,class_names)

print("----------Regression metrics are depicted below----------")
# Take the square root explicitly; the squared=False argument is not available in recent scikit-learn releases
rmseValence = np.sqrt(mean_squared_error(valenceGroundTruth, valencePredicted))
rmseArousal = np.sqrt(mean_squared_error(arousalGroundTruth, arousalPredicted))
corrValence, _ = pearsonr(valenceGroundTruth, valencePredicted)
corrArousal, _ = pearsonr(arousalGroundTruth, arousalPredicted)
valenceCCC = ConCorelationCoef(valenceGroundTruth, valencePredicted)
arousalCCC = ConCorelationCoef(arousalGroundTruth, arousalPredicted)
valenceSagr = SAGRScore(valenceGroundTruth, valencePredicted)
arousalSagr = SAGRScore(arousalGroundTruth, arousalPredicted)

print('Root mean squared error for valence is '+str(rmseValence))
print('Root mean squared error for arousal is '+str(rmseArousal))
print('Correlation for valence is '+str(corrValence))
print('Correlation for arousal is '+str(corrArousal))
print('Concordance Correlation Coefficient for valence is '+str(valenceCCC))
print('Concordance Correlation Coefficient for arousal is '+str(arousalCCC))
print('Sign Agreement score for valence is '+str(valenceSagr))
print('Sign Agreement score for arousal is '+str(arousalSagr))

## Display Wrongly Classified Images

In [19]:
def DisplayImages(indexes, filename, groundTruthLabels,predictedLabels):
    class_names = {0:"Neutral", 1:"Happy", 2:"Sad", 3:"Surprise", 4:"Fear", 5:"Disgust", 6:"Anger", 7:"Contempt"}
    df = pd.read_csv(filename)
    filenames = df['filename']
    j=0
    for i, index in enumerate(indexes):
        if i%35 == 0:
            j=j+1
            if j>15: break
            imageName = filenames.loc[index]
            imageFilePath = os.path.join('Dataset','test_set','images',imageName)
            plt.subplot(5,3,j)
            img = cv2.imread(imageFilePath)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            plt.imshow(img)
            plt.title('Pred: ' + class_names[predictedLabels[index][0]] + ', GT: ' + class_names[groundTruthLabels[index][0]])
            frame1 = plt.gca()
            frame1.axes.get_xaxis().set_visible(False)
            frame1.axes.get_yaxis().set_visible(False)
    plt.subplots_adjust(left=0.1,
                    bottom=0.1, 
                    right=0.9, 
                    top=0.9, 
                    wspace=0.4, 
                    hspace=0.4)
    plt.show()

In [2]:
# Select the rows where the prediction differs from the ground truth
a = groundTruthLabels != predictedLabels
b = np.where(a)
wronglyClassifiedIndexes = b[0]
DisplayImages(wronglyClassifiedIndexes, 'test_set.csv', groundTruthLabels, predictedLabels)