<a href="https://colab.research.google.com/github/cgbartnik/comparison_between_visual_saliency_maps_of_CNN_and_those_of_human_beings/blob/master/Finetuning_pre_trained_VGGFace.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# A Comparison between Visual Saliency Maps of Convolutional Neural Networks and those of Human Beings: A Study on Facial Expression Recognition

**Overview**
-----------------

This notebook contains the code to fine-tune the CNN used in this master thesis.

In this master thesis, the feature extractor of a pre-trained Visual Geometry Group Face model (VGG Face model) was used. The model was developed by Parkhi, Vedaldi, and Zisserman (2015) for recognizing the faces of 2,622 celebrities. It was favoured over other widely used pre-trained models, such as AlexNet or other networks trained on ImageNet, because it was trained exclusively on faces.

Up to this point, there is no publicly available Keras implementation with the respective pre-trained weights for the new VGG-Face2 dataset. Therefore, this master thesis uses a Keras implementation of the VGG Face model trained on the original dataset.

The pre-trained VGG Face model is based on the VGG16 architecture. This master thesis uses the weights from the keras-vggface package, implemented with the deep learning library Keras (version 2.2.5) and a TensorFlow (version 1.15.0) backend. The network is fine-tuned with the training set in the notebook described above. Only the weights of the fully connected layers *fc6* and *fc7*, with 512 neurons each, were retrained, as well as *fc8*, whose seven neurons classify the images into one of the seven expressions.

For training, the following hyperparameters were used: the **ADAM optimizer** with a learning rate of *1e-5* and a decay of *1e-6*, **categorical crossentropy** as the loss function, and *20* epochs with *19* training steps and *5* validation steps each.

# Fine-tuning the pre-trained VGG Face model

## Loading packages for the network

In [None]:
# Loading required packages
%tensorflow_version 1.x
import os  # used below for os.chdir

import pandas as pd  # used below for reading and writing the CSV files
import tensorflow as tf
print(tf.__version__)


from keras_preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras import models
from keras import layers

from keras.engine import Model
from keras.layers import Input, Flatten, Dense, AveragePooling2D

from keras.callbacks import ModelCheckpoint

TensorFlow 1.x selected.
1.15.2


Using TensorFlow backend.


Installing the library containing the pretrained VGG Face Keras implementation

In [None]:
!pip install git+https://github.com/rcmalli/keras-vggface.git

Collecting git+https://github.com/rcmalli/keras-vggface.git
  Cloning https://github.com/rcmalli/keras-vggface.git to /tmp/pip-req-build-0fboi7j7
  Running command git clone -q https://github.com/rcmalli/keras-vggface.git /tmp/pip-req-build-0fboi7j7
Building wheels for collected packages: keras-vggface
  Building wheel for keras-vggface (setup.py) ... done
  Created wheel for keras-vggface: filename=keras_vggface-0.6-cp36-none-any.whl size=8311 sha256=b404484e6bceddcd690741fabdb171ddc36a5d5c43d0076a751a4310a40d6b9d
  Stored in directory: /tmp/pip-ephem-wheel-cache-y3loz0vi/wheels/36/07/46/06c25ce8e9cd396dabe151ea1d8a2bc28dafcb11321c1f3a6d
Successfully built keras-vggface
Installing collected packages: keras-vggface
Successfully installed keras-vggface-0.6


In [None]:
# Loading the VGGFace library
from keras_vggface.vggface import VGGFace

## Reading in the preprocessed training and test data

If you are not running the complete notebook, you can read in the preprocessed training and test data from the saved CSV files.

In [None]:
test_data = pd.read_csv('/content/gdrive/My Drive/Thesis/Train_Test_CSV/test_data.csv')
train_data = pd.read_csv('/content/gdrive/My Drive/Thesis/Train_Test_CSV/train_data.csv')

Shorten the entries of the **Image** column so that they contain only the image name instead of the full path.

In [None]:
new_train = train_data["Image"].str.split("Train/", n = 1, expand = True)
train_data['Image'] = new_train[1]

In [None]:
new_test = test_data["Image"].str.split("Test/", n = 1, expand = True)
test_data['Image'] = new_test[1]
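
To see what the split does to a single value, here is a minimal sketch on a plain Python string. The example path is hypothetical (the real entries of the CSV files are not shown in this notebook); `str.split` with a maximum of one split mirrors the `n = 1` argument used above.

```python
# Hypothetical full path, standing in for one entry of the Image column.
full_path = "/content/gdrive/My Drive/Thesis/Train/example_image.jpg"

# Split once at the first occurrence of "Train/" and keep the part after it,
# which corresponds to column 1 of the expanded pandas split.
prefix, image_name = full_path.split("Train/", 1)

print(prefix)      # everything before "Train/"
print(image_name)  # the bare image name
```

Only the second part is written back to the column, which is why the generators further down can combine it with the `directory` argument.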

In [None]:
# Saving the new CSV files with the shortened Image column
os.chdir("/content/gdrive/My Drive/Thesis/Train_Test_CSV/")
train_data.to_csv("train_data_2.csv", sep=',', encoding='utf-8')
test_data.to_csv("test_data_2.csv", sep=',', encoding='utf-8')

In [None]:
# Reading in the new dataframes.
test_data = pd.read_csv('/content/gdrive/My Drive/Thesis/Train_Test_CSV/test_data_2.csv')
train_data = pd.read_csv('/content/gdrive/My Drive/Thesis/Train_Test_CSV/train_data_2.csv')

## Image Data Generator

In [None]:
datagen = ImageDataGenerator(rescale=1./255.,
                             validation_split=0.2,
                             width_shift_range=0.2,
                             height_shift_range=0.2,
                             shear_range=0.2,
                             zoom_range=0.2,
                             horizontal_flip=True,
                             brightness_range=[0.5, 1.5],
                             fill_mode='nearest')

test_datagen = ImageDataGenerator(rescale = 1./255.)

Rotation, another classic augmentation technique, is not used because human recognition performance drops when faces are rotated. How CNNs are affected by rotated faces could be examined in further studies.

In [None]:
train_generator = datagen.flow_from_dataframe(
    dataframe = train_data,
    directory = "/content/gdrive/My Drive/Thesis/Train",
    x_col="Image",
    y_col="label",
    subset="training",
    color_mode="rgb",
    batch_size=32,
    seed=42,
    shuffle=True,
    class_mode="categorical",
    target_size=(224,224))

valid_generator = datagen.flow_from_dataframe(
    dataframe = train_data,
    directory = "/content/gdrive/My Drive/Thesis/Train",
    x_col="Image",
    y_col="label",
    subset="validation",
    batch_size=32,
    seed=42,
    shuffle=True,
    class_mode="categorical",
    target_size=(224,224))

Found 603 validated image filenames belonging to 7 classes.
Found 150 validated image filenames belonging to 7 classes.
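
The two counts above follow from `validation_split=0.2`. The sketch below reproduces the arithmetic under an assumption about the splitting rule: that the generator cuts the dataframe at the integer index `int(0.2 * n)` and assigns the rows before it to the validation subset. The total of 753 images is inferred from the two reported counts.

```python
n_images = 603 + 150          # total rows in train_data, inferred from the output above
validation_split = 0.2

# Assumed splitting rule: truncate the fractional boundary to an integer index.
n_valid = int(validation_split * n_images)
n_train = n_images - n_valid

print(n_train, n_valid)
```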


## Model definition


In [None]:
modelname = "FER_VGGFace_finetune"
nb_class = 7   # number of classes. 
hidden_dim = 512

vgg_model = VGGFace(include_top=False, input_shape=(224, 224, 3))
last_layer = vgg_model.get_layer('pool5').output
x = Flatten(name='flatten')(last_layer)
x = Dense(hidden_dim, activation='relu', name='fc6')(x)
x = Dense(hidden_dim, activation='relu', name='fc7')(x)
out = Dense(nb_class, activation='softmax', name='fc8')(x)
custom_vgg_model = Model(vgg_model.input, out)





Downloading data from https://github.com/rcmalli/keras-vggface/releases/download/v2.0/rcmalli_vggface_tf_notop_vgg16.h5








**Overview of the VGG Face model used for fine-tuning**

**fc6, fc7** and **fc8** are fine-tuned.

This approach is called transfer learning. The VGG Face model was originally trained for face recognition, recognizing the faces of 2,622 celebrities. In this work, the first part of the CNN, the feature extractor, is kept, and the second part, the classifier, is retrained to classify facial expressions.
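
To give a sense of the size of the retrained classifier, the parameter counts of the three dense layers can be worked out by hand. This is a sketch that assumes the standard VGG16 `pool5` output of 7 × 7 × 512 for a 224 × 224 input; the helper `dense_params` is introduced here for illustration only.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

flatten_dim = 7 * 7 * 512      # assumed pool5 output, flattened
hidden_dim = 512               # as in the model definition
nb_class = 7

fc6 = dense_params(flatten_dim, hidden_dim)
fc7 = dense_params(hidden_dim, hidden_dim)
fc8 = dense_params(hidden_dim, nb_class)

print(fc6, fc7, fc8, fc6 + fc7 + fc8)
```

Under this assumption, almost all of the classifier's parameters sit in *fc6*, the layer directly after the flattened feature maps.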

In [None]:
custom_vgg_model.summary()

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
conv1_1 (Conv2D)             (None, 224, 224, 64)      1792      
_________________________________________________________________
conv1_2 (Conv2D)             (None, 224, 224, 64)      36928     
_________________________________________________________________
pool1 (MaxPooling2D)         (None, 112, 112, 64)      0         
_________________________________________________________________
conv2_1 (Conv2D)             (None, 112, 112, 128)     73856     
_________________________________________________________________
conv2_2 (Conv2D)             (None, 112, 112, 128)     147584    
_________________________________________________________________
pool2 (MaxPooling2D)         (None, 56, 56, 128)       0   

**Setting the Hyperparameters:**

For training, the hyperparameters were chosen as follows. The ADAM optimizer was used with a learning rate of 1e-5 and a decay of 1e-6, and categorical crossentropy was used as the loss function. The network was fine-tuned for 20 epochs with 19 training steps and 5 validation steps each.
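
The step counts are consistent with the generator sizes and the batch size. A minimal sketch, assuming that one epoch is meant to cover every image once, so that the number of steps is the image count divided by the batch size, rounded up:

```python
import math

batch_size = 32
n_train, n_valid = 603, 150    # counts reported by flow_from_dataframe

steps_per_epoch = math.ceil(n_train / batch_size)
validation_steps = math.ceil(n_valid / batch_size)

print(steps_per_epoch, validation_steps)
```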

In [None]:
opt = optimizers.Adam(lr=1e-5, decay=1e-6)
  
custom_vgg_model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
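
The `decay` argument shrinks the learning rate slightly with every weight update. As a rough sketch, assuming the time-based rule `lr / (1 + decay * iterations)` that the Keras 2.x optimizers apply, the effective learning rate over the 20 × 19 = 380 updates of this run can be computed as follows:

```python
base_lr = 1e-5
decay = 1e-6
total_updates = 20 * 19        # epochs * steps per epoch

def effective_lr(iteration):
    """Time-based decay, assumed to match the Keras 2.x `decay` behaviour."""
    return base_lr / (1.0 + decay * iteration)

print(effective_lr(0))
print(effective_lr(total_updates))
```

With these values the decay lowers the learning rate by less than 0.04 % over the whole run, so the rate is close to constant.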

**Creating the Callbacks for the training process:**

After each epoch, the model is saved to the folder **Saved_models**.

In [None]:
os.chdir("/content/gdrive/My Drive/Thesis/Saved_models")

In [None]:
filepath = "saved-model-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=False, mode='max')
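
The placeholders in `filepath` use Python's `str.format` syntax. A small sketch with hypothetical values for `epoch` and `val_acc` shows how a file name is built from the template:

```python
filepath = "saved-model-{epoch:02d}-{val_acc:.2f}.hdf5"

# Hypothetical values; during training the callback supplies the epoch
# number and the logged validation accuracy.
name = filepath.format(epoch=5, val_acc=0.7894)
print(name)
```

The epoch number is zero-padded to two digits and the validation accuracy is rounded to two decimals, which is the pattern visible in the file names of the training log.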

In [None]:
history = custom_vgg_model.fit_generator(
    train_generator,
    steps_per_epoch=19,
    epochs=20,
    validation_data=valid_generator,
    validation_steps=5,
    callbacks=[checkpoint])

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


Epoch 1/20

Epoch 00001: saving model to saved-model-01-0.25.hdf5
Epoch 2/20

Epoch 00002: saving model to saved-model-02-0.43.hdf5
Epoch 3/20

Epoch 00003: saving model to saved-model-03-0.61.hdf5
Epoch 4/20

Epoch 00004: saving model to saved-model-04-0.60.hdf5
Epoch 5/20

Epoch 00005: saving model to saved-model-05-0.79.hdf5
Epoch 6/20

Epoch 00006: saving model to saved-model-06-0.82.hdf5
Epoch 7/20

Epoch 00007: saving model to saved-model-07-0.82.hdf5
Epoch 8/20

Epoch 00008: saving model to saved-model-08-0.84.hdf5
Epoch 9/20

Epoch 00009: saving model to saved-model-09-0.79.hdf5
Epoch 10/20

Epoch 00010: saving model to saved-model-10-0.83.hdf5
Epoch 11/20

Epoch 00011: saving model to saved-model-11-0.86.hdf5
Epoch 12/20

-------------------------

This code was used to fine-tune the pre-trained VGG Face model. Because the code was rerun, the outputs displayed above do not exactly match those in the master thesis. To replicate the exact results, the training history (*historyKDEF_model_100*) and the saved weights of the trained model (*saved-model-16-0.93.hdf5*) can be found in the public [GDrive folder](https://drive.google.com/drive/folders/10iGq7jdLaIG-VZZ1OGK6EleTJlkfq0x_?usp=sharing).

