# **TRAINING THE EMOTION RECOGNITION MODEL**

## PREPARATION
In this section we define a few constants that are used later on. We also mount Google Drive so that the training checkpoints can be saved to our Drive.

### MOUNTING GOOGLE DRIVE
The first step is to give Google Colab permission to access our Drive so that the training checkpoints can be saved there.

In [1]:
#MOUNTING GOOGLE DRIVE

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

# force_remount=True remounts the drive even if it is already mounted.

Mounted at /content/gdrive


### DOWNLOAD THE DATASET

We have packed all the face mood images into a `dataset.zip` file. We now download this archive from our Drive in order to start the training phase.

In [2]:
%%capture
# Download dataset from google drive

# -O fixes the local file name so the commands below can rely on it
!gdown --id 1H8XilueKOkQ57-9QB7SvxGgqubLFyiBP -O dataset.zip

# unzip the archive file
!unzip dataset.zip

# we don't need the archive file anymore
!rm dataset.zip

### IMPORT SOME NECESSARY LIBRARIES

To implement our model, we first import the libraries we need.

In [3]:
import os

# All Keras symbols are imported from tensorflow.keras so that
# the standalone keras package and tf.keras are not mixed.
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, BatchNormalization
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.optimizers import RMSprop, Adam, SGD
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau


## DATA LOADING AND GENERATION

Now we have everything we need to start the training phase.
Let's define some constants for our deep architecture. The purpose of each constant is summarized in the table below:

CONSTANT_NAME | APPLICATION
-------------------|------------------
`batch_size`       | number of samples passed through the network at one time
`img_rows`         | height of the input image, in pixels
`img_cols`         | width of the input image, in pixels
`num_classes`      | number of output classes, i.e. the size of the classification layer


In [13]:
num_classes = 7
img_rows,img_cols = 48,48
batch_size = 64

# The directory paths of the training and validation data
train_data_dir = '/content/train'
validation_data_dir = '/content/test'


### DATA GENERATOR
We have all encountered a situation where we try to load a dataset but there is not enough memory on our machine.

As the field of machine learning progresses, this problem becomes more and more common. It is already one of the challenges in computer vision, where large datasets of images and video files are processed.

Here, we will use `Keras` to build data generators for loading and processing our images.

The `ImageDataGenerator` class is very useful in image classification. There are several ways to use this generator. Here we focus on `flow_from_directory`, which takes the path to a directory containing images sorted into sub-directories (one per class) and applies the image augmentation parameters configured on the generator.
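
To make the expected layout concrete, the following sketch builds a throwaway directory tree with one sub-directory per class and derives the label mapping by sorting the sub-directory names, which is how `flow_from_directory` orders class indices when no explicit `classes` list is passed. The class names and the `build_class_indices` helper are hypothetical and only illustrate the convention; they are not taken from our dataset.

```python
import os
import tempfile

def build_class_indices(root):
    """Map each class sub-directory of `root` to an integer label,
    in alphanumeric order of the directory names."""
    class_names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {name: index for index, name in enumerate(class_names)}

# Hypothetical class names, used only for illustration.
example_classes = ["happy", "angry", "sad"]

with tempfile.TemporaryDirectory() as root:
    for name in example_classes:
        os.makedirs(os.path.join(root, "train", name))
    class_indices = build_class_indices(os.path.join(root, "train"))

print(class_indices)  # {'angry': 0, 'happy': 1, 'sad': 2}
```

On a real generator, the same mapping is available as the `class_indices` attribute.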

In [14]:
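train_datagen = ImageDataGenerator(
    rescale=1./255,          # map pixel values from [0, 255] to [0, 1]
    rotation_range=30,       # random rotation of up to 30 degrees
    shear_range=0.3,         # random shear (slanting) of the image
    zoom_range=0.3,          # random zoom in or out by up to 30%
    width_shift_range=0.4,   # random horizontal shift, as a fraction of the width
    height_shift_range=0.4,  # random vertical shift, as a fraction of the height
    horizontal_flip=True,    # randomly mirror images left to right
    fill_mode='nearest')     # fill newly exposed pixels with the nearest pixel value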

validation_datagen = ImageDataGenerator(rescale=1./255)
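
As a rough illustration of two of the transformations configured above, the sketch below applies the `rescale=1./255` scaling and a horizontal flip to a tiny hand-written 2x3 "image". It uses plain Python lists rather than Keras, and the pixel values and helper names are made up for the example.

```python
def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by `factor`, mapping [0, 255] to [0, 1]."""
    return [[pixel * factor for pixel in row] for row in image]

def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]

# Hypothetical 2x3 grayscale image with values in [0, 255].
image = [[0, 51, 255],
         [102, 153, 204]]

flipped = horizontal_flip(image)
scaled = rescale(image)

print(flipped)  # [[255, 51, 0], [204, 153, 102]]
print(scaled)
```

Note that the validation generator only rescales: augmentation is applied to the training data, while validation images are left otherwise unchanged.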


In [15]:
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    color_mode='grayscale',
    target_size=(img_rows, img_cols),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True)

validation_generator = validation_datagen.flow_from_directory(
    validation_data_dir,
    color_mode='grayscale',
    target_size=(img_rows, img_cols),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True)

Found 28709 images belonging to 7 classes.
Found 7178 images belonging to 7 classes.


As the output shows, our dataset contains **28709** training images and **7178** validation images, spread over **7** different moods.
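
These counts determine how many batches make up one epoch. The short calculation below is a sketch that only assumes the batch size of 64 defined earlier. It shows the number of full batches and how many images are left over when the step count is obtained by integer division, as it is later in the training call.

```python
def full_batches(num_samples, batch_size):
    """Return (number of full batches, leftover samples)."""
    return divmod(num_samples, batch_size)

batch_size = 64
train_steps, train_leftover = full_batches(28709, batch_size)
val_steps, val_leftover = full_batches(7178, batch_size)

print(train_steps, train_leftover)  # 448 37
print(val_steps, val_leftover)      # 112 10
```

So each epoch runs 448 training steps and 112 validation steps, and a handful of images may not be seen in a given pass.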



## MODEL ARCHITECTURE AND TRAINING
While deep learning is certainly not new, it is experiencing explosive growth because of the intersection of deeply layered neural networks and the use of GPUs to accelerate their execution.

---
We decided to build a deep architecture in order to train a model with good accuracy. To get the most out of training, we use three callbacks from the Keras API:
- <b>`ModelCheckpoint`</b> 🏁: used together with `model.fit()` to save the model or its weights to a checkpoint file at some interval, so that they can be loaded later to resume training from the saved state.

- <b>`EarlyStopping`</b> 🚦: stops training when a monitored metric stops improving. If the goal of training is to minimize the loss, the monitored metric is `loss` and the mode is `min`. At the end of every epoch, `model.fit()` checks whether the loss is still decreasing, taking `min_delta` and `patience` into account if they are set. Once it is no longer decreasing, training terminates.

- <b>`ReduceLROnPlateau`</b> 𒑈: models often benefit from reducing the learning rate by a factor of 2-10 once learning stagnates. This callback monitors a quantity and, if no improvement is seen for `patience` epochs, reduces the learning rate.
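
The patience mechanism behind the last two callbacks can be sketched in a few lines of plain Python. This is a simplified replay of the idea, not the Keras implementation, and the list of validation losses is hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

def reduced_learning_rates(val_losses, lr=0.001, factor=0.2,
                           patience=3, min_delta=1e-4):
    """Return the learning rate in effect after each epoch."""
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        if loss < best - min_delta:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor  # plateau detected: shrink the learning rate
                wait = 0
        schedule.append(lr)
    return schedule

# Hypothetical validation losses: three epochs without improvement after epoch 4.
losses = [1.79, 1.77, 1.76, 1.72, 1.73, 1.74, 1.75, 1.70]

stop_epoch = early_stopping_epoch(losses)
schedule = reduced_learning_rates(losses)

print(stop_epoch)  # 7
print(schedule)
```

In this replay, both rules react at epoch 7 because they share a patience of 3. When the two callbacks use the same patience, early stopping can therefore end training at about the point where the learning rate would first be reduced.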


In [17]:
# reduce the learning rate if no improvement is seen
learning_rate_reduction = ReduceLROnPlateau(monitor='val_loss',
                              factor=0.2,
                              patience=3,
                              verbose=1,
                              min_delta=0.0001)

# stop training if no improvements are seen
earlystop = EarlyStopping(monitor='val_loss',
                          min_delta=0,
                          patience=3,
                          verbose=1,
                          restore_best_weights=True
                          )

# save the full model to Drive whenever val_loss improves
checkpoint = ModelCheckpoint(os.path.join('/content/gdrive/MyDrive/Multimodal_Interaction/model', 'cp-{epoch:04d}.h5'),
                            monitor='val_loss',
                            verbose=1,
                            save_best_only=True,
                            mode='min',
                            )

callbacks = [earlystop,checkpoint,learning_rate_reduction]
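
The `{epoch:04d}` placeholder in the checkpoint path is an ordinary Python format field, which the callback fills in with the epoch number. A small sketch of how the file names come out, using plain string formatting rather than the callback itself:

```python
import os

checkpoint_dir = '/content/gdrive/MyDrive/Multimodal_Interaction/model'
pattern = os.path.join(checkpoint_dir, 'cp-{epoch:04d}.h5')

# Zero-padded to four digits, so the files sort in training order.
first = pattern.format(epoch=1)
later = pattern.format(epoch=17)

print(os.path.basename(first))  # cp-0001.h5
print(os.path.basename(later))  # cp-0017.h5
```

Because `save_best_only=True`, a file is written only for epochs in which `val_loss` improves, so the numbering can have gaps.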

### MODEL ARCHITECTURE

*   4 convolutional blocks, each with two convolutional layers followed by max pooling and dropout
*   2 dense layers
*   an output (classification) layer
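
The feature-map sizes and parameter counts of this architecture can be checked by hand. The sketch below assumes what the code in the next cell specifies: 3x3 convolutions with `padding='same'` (spatial size unchanged), one 2x2 max pooling per pair of convolutions (spatial size halved), and filter counts of 32, 64, 128 and 256. The helper names are our own.

```python
def conv2d_params(in_channels, out_channels, kernel=3):
    """Kernel weights plus one bias per output channel."""
    return (kernel * kernel * in_channels + 1) * out_channels

def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance per channel."""
    return 4 * channels

size = 48       # input images are 48x48x1
channels = 1
for filters in (32, 64, 128, 256):
    size //= 2  # one 2x2 max pooling per block
    channels = filters

flattened = size * size * channels

print(conv2d_params(1, 32))   # 320
print(conv2d_params(32, 32))  # 9248
print(batchnorm_params(32))   # 128
print(size, flattened)        # 3 2304
```

The first three numbers match the `Param #` column of the model summary printed after the model definition, and the `Flatten` layer hands a vector of 2304 values to the first dense layer.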



In [16]:
model = Sequential()

################################    1st CONVOLUTIONAL BLOCK     ################################

model.add(Conv2D(32,(3,3),padding='same',kernel_initializer='he_normal',input_shape=(img_rows,img_cols,1)))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Conv2D(32,(3,3),padding='same',kernel_initializer='he_normal'))  # input_shape is only needed on the first layer
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))

################################    2nd CONVOLUTIONAL BLOCK     ################################

model.add(Conv2D(64,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Conv2D(64,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))

################################    3rd CONVOLUTIONAL BLOCK     ################################

model.add(Conv2D(128,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Conv2D(128,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))

################################    4th CONVOLUTIONAL BLOCK     ################################

model.add(Conv2D(256,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Conv2D(256,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))

################################    FLATTEN, FOR DENSE LAYER     ################################

model.add(Flatten())

################################    DENSE LAYERS: 1st LAYER     ################################

model.add(Dense(64,kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

################################    DENSE LAYERS: 2nd LAYER     ################################

model.add(Dense(64,kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

################################    OUTPUT LAYER: CLASSES    ################################

model.add(Dense(num_classes,kernel_initializer='he_normal'))
model.add(Activation('softmax'))

model.summary()  # prints the summary itself; wrapping it in print() would also print "None"


Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_8 (Conv2D)           (None, 48, 48, 32)        320       
                                                                 
 activation_11 (Activation)  (None, 48, 48, 32)        0         
                                                                 
 batch_normalization_10 (Bat  (None, 48, 48, 32)       128       
 chNormalization)                                                
                                                                 
 conv2d_9 (Conv2D)           (None, 48, 48, 32)        9248      
                                                                 
 activation_12 (Activation)  (None, 48, 48, 32)        0         
                                                                 
 batch_normalization_11 (Bat  (None, 48, 48, 32)       128       
 chNormalization)                                     

### TRAINING

We are now ready to start the training phase. First, we compile the model.
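
The model is compiled with the categorical cross-entropy loss. As a reference point for reading the training log, the sketch below computes this loss by hand for a single one-hot label. The probability vectors are invented for illustration.

```python
import math

def categorical_crossentropy(one_hot, probabilities):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probabilities) if t)

num_classes = 7
label = [0, 0, 1, 0, 0, 0, 0]

# A model that guesses uniformly, and one that is confident and correct.
uniform = [1 / num_classes] * num_classes
confident = [0.02, 0.02, 0.88, 0.02, 0.02, 0.02, 0.02]

chance_loss = categorical_crossentropy(label, uniform)
confident_loss = categorical_crossentropy(label, confident)

print(round(chance_loss, 4))     # 1.9459
print(round(confident_loss, 4))  # 0.1278
```

A model that guesses uniformly over 7 classes scores ln 7 ≈ 1.946, so a validation loss of about 1.8 in the first epochs is only slightly better than chance.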


In [18]:
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(learning_rate=0.001),  # `lr` is a deprecated alias
              metrics=['accuracy'])



In [19]:
nb_train_samples = 28709
nb_validation_samples = 7178
epochs=150

In [20]:
# model.fit accepts generators directly; fit_generator is deprecated
history = model.fit(
                train_generator,
                steps_per_epoch=nb_train_samples//batch_size,
                epochs=epochs,
                callbacks=callbacks,
                validation_data=validation_generator,
                validation_steps=nb_validation_samples//batch_size)

Epoch 1/150




Epoch 1: val_loss improved from inf to 1.79494, saving model to /content/gdrive/MyDrive/Multimodal_Interaction/model/cp-0001.h5
Epoch 2/150
Epoch 2: val_loss improved from 1.79494 to 1.77302, saving model to /content/gdrive/MyDrive/Multimodal_Interaction/model/cp-0002.h5
Epoch 3/150
Epoch 3: val_loss improved from 1.77302 to 1.76187, saving model to /content/gdrive/MyDrive/Multimodal_Interaction/model/cp-0003.h5
Epoch 4/150
Epoch 4: val_loss improved from 1.76187 to 1.71743, saving model to /content/gdrive/MyDrive/Multimodal_Interaction/model/cp-0004.h5
Epoch 5/150
Epoch 5: val_loss did not improve from 1.71743
Epoch 6/150
Epoch 6: val_loss improved from 1.71743 to 1.70346, saving model to /content/gdrive/MyDrive/Multimodal_Interaction/model/cp-0006.h5
Epoch 7/150
Epoch 7: val_loss improved from 1.70346 to 1.61246, saving model to /content/gdrive/MyDrive/Multimodal_Interaction/model/cp-0007.h5
Epoch 8/150
Epoch 8: val_loss improved from 1.61246 to 1.40049, saving model to /content/gdri

# OUTPUT

The output of this notebook is the checkpoint `cp-0017.h5`, which we use as the model for **Live Emotion Detection** in our Fatcha application.