# GESTURE RECOGNITION


# Overview

In this group project, you are going to build a 3D Conv model that will be able to predict the 5 gestures correctly. Please import the following libraries to get started.

In [1]:
# Import the needed Libraries

import numpy as np
import os
from scipy.misc import imread, imresize
import datetime
import os

# Supress all the warnings

import warnings
warnings.filterwarnings('ignore')


import cv2
import matplotlib.pyplot as plt
% matplotlib inline

We set the random seed so that the results don't vary drastically.

In [2]:
# Import the random seet and keras, tensorflow

np.random.seed(30)
import random as rn
rn.seed(30)
from keras import backend as K
import tensorflow as tf
tf.set_random_seed(30)

Using TensorFlow backend.


In this block, you read the folder names for training and validation. You also set the `batch_size` here. Note that you set the batch size in such a way that you are able to use the GPU in full capacity. You keep increasing the batch size until the machine throws an error.

In [3]:
## We have changed the right path where the files are stored above.

train_doc = np.random.permutation(open('./Project_data/train.csv').readlines())
val_doc = np.random.permutation(open('./Project_data/val.csv').readlines())
batch_size =  40 #experiment with the batch size


## Generator
This is one of the most important part of the code. The overall structure of the generator has been given. In the generator, you are going to preprocess the images as you have images of 2 different dimensions as well as create a batch of video frames. You have to experiment with `img_idx`, `y`,`z` and normalization such that you get high accuracy.

In [4]:
##Generator function for input data without augmentation.

x = 30 # No. of frames images
y = 120 # Width of the image
z = 120 # height

def generator(source_path, folder_list, batch_size):
    img_idx = [x for x in range(0,x)] #create a list of image numbers you want to use for a particular video
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,x,y,z,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    # Let us resize all the images.Let's use PIL.Image.NEAREST (use nearest neighbour) resampling filter. 
                    resized_image = imresize(image,(y,z)) ##default resample=1 or 'P' which indicates PIL.Image.NEAREST
                    resized_image = resized_image/255
                    
                    batch_data[folder,idx,:,:,0] = (resized_image[:,:,0])#normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (resized_image[:,:,1])#normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (resized_image[:,:,2])#normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches        
        if (len(folder_list) != batch_size*num_batches):
            batch_size = len(folder_list) - (batch_size*num_batches)
            batch_data = np.zeros((batch_size,x,y,z,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    resized_image = imresize(image,(y,z)) ##default resample=1 or 'P' which indicates PIL.Image.NEAREST
                    resized_image = resized_image/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (resized_image[:,:,0])
                    batch_data[folder,idx,:,:,1] = (resized_image[:,:,1])
                    batch_data[folder,idx,:,:,2] = (resized_image[:,:,2])
                   
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels


Note here that a video is represented above in the generator as (number of images, height, width, number of channels). Take this into consideration while creating the model architecture.

In [5]:
curr_dt_time = datetime.datetime.now()
train_path = './Project_data/train'
val_path = './Project_data/val'
num_train_sequences = len(train_doc)
print('# Training_Sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# Validation_Sequences =', num_val_sequences)
num_epochs = 15 # choose the number of epochs
print ('# Epochs = ', num_epochs)

# Training_Sequences = 663
# Validation_Sequences = 100
# Epochs =  15


## Model
Here you make the model using different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D` and not `Conv2D` and `Maxpooling2D` for a 3D convolution model. You would want to use `TimeDistributed` while building a Conv2D + RNN model. Also remember that the last layer is the softmax. Design the network in such a way that the model is able to give good accuracy on the least number of parameters so that it can fit in the memory of the webcam.

## Model 1 : Basic Model to test if our network is working

### ***Model Summary***

- Batch Size : 40 
- Image Height : 120 
- Image Width : 120 
- Epochs - 15 
- Optimizer - Adam 

In [6]:
# Let us import all the needed libraries of Keras.

from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, Flatten, BatchNormalization, Activation,Dropout
from keras.layers.convolutional import Conv3D, MaxPooling3D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers

#write your model here
# Input all the images sequential by building the layer with dropouts and batchnormalisation

model_1 = Sequential()       
model_1.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(30, 120, 120, 3),padding='same'))
model_1.add(BatchNormalization())
model_1.add(Activation('relu'))

model_1.add(Conv3D(16, (3, 3, 3), padding='same'))
model_1.add(Activation('relu'))
model_1.add(BatchNormalization())
model_1.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_1.add(Conv3D(32, (2, 2, 2), padding='same'))
model_1.add(Activation('relu'))
model_1.add(BatchNormalization())
model_1.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_1.add(Conv3D(64, (2, 2, 2), padding='same'))
model_1.add(Activation('relu'))
model_1.add(BatchNormalization())
model_1.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_1.add(Conv3D(128, (2, 2, 2), padding='same'))
model_1.add(Activation('relu'))
model_1.add(BatchNormalization())
model_1.add(MaxPooling3D(pool_size=(2, 2, 2)))      

# Flatten layer 

model_1.add(Flatten())

model_1.add(Dense(1000, activation='relu'))
model_1.add(Dropout(0.5))

model_1.add(Dense(500, activation='relu'))
model_1.add(Dropout(0.5))

#Softmax layer

model_1.add(Dense(5, activation='softmax'))
        
        

Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

In [7]:
# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_1.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_1.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_1 (Conv3D)            (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_1 (Batch (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_1 (Activation)    (None, 30, 120, 120, 8)   0         
_________________________________________________________________
conv3d_2 (Conv3D)            (None, 30, 120, 120, 16)  3472      
_________________________________________________________________
activation_2 (Activation)    (None, 30, 120, 120, 16)  0         
_________________________________________________________________
batch_normalization_2 (Batch (None, 30, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 15, 60, 60, 16)    0         
__________

Let us create the `train_generator` and the `val_generator` which will be used in `.fit_generator`.

In [10]:
# Let us train and validate the model 

train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [11]:
# Let us see the Validate the Losses and put back the checkpoint

model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, cooldown=1, verbose=1) # write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

The `steps_per_epoch` and `validation_steps` are used by `fit_generator` to decide the number of next() calls it need to make.

In [12]:
# Let us see that the steps_per_epoch and validation steps are used by fit_generator to decide the no. of next()

if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

Let us now fit the model. This will start training the model and with the help of the checkpoints, you'll be able to save the model at the end of each epoch.

In [11]:
# Let us fit the model

model_1.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/15

Epoch 00001: saving model to model_init_2021-02-0709_59_17.824524/model-00001-7.52395-0.28808-4.66847-0.50000.h5
Epoch 2/15

Epoch 00002: saving model to model_init_2021-02-0709_59_17.824524/model-00002-4.32650-0.46547-6.08896-0.36667.h5
Epoch 3/15

Epoch 00003: saving model to model_init_2021-02-0709_59_17.824524/model-00003-2.81722-0.48501-2.62826-0.45000.h5
Epoch 4/15

Epoch 00004: saving model to model_init_2021-02-0709_59_17.824524/model-00004-2.26966-0.52322-2.05737-0.55000.h5
Epoch 5/15

Epoch 00005: saving model to model_init_2021-02-0709_59_17.824524/model-00005-1.69851-0.57508-2.03499-0.38333.h5
Epoch 6/15

Epoch 00006: saving model to model_init_2021-02-0709_59_17.824524/model-00006-1.48170-0.60208-1.15868-0.60000.h5
Epoch 7/15

Epoch 00007: saving model to model_init_2021-02-0709_59_17.824524/model-00007-1.24578-0.62976-2.08285-0.48333.h5
Epoch 8/15

Epoch 00008: saving model to model_init_2021-02-0709_59_17.824524/model-00008-1.05035-0.67820-1.53517-0.61667.h5


<keras.callbacks.History at 0x7f8a89509908>

### Results: 

- ***Best Training Accuracy - 86.85 %***
- ***Best Validation Accuracy - 75.00 %***


This is the most initial model, it is difficult to conclude that this will be our final model without hyper parameter tuning, so in upcoming models , we will experiment with different hyperparameters.


## Model 2 : Hyperparameter Tuned : Activation Function and Optimiser




- Changing activation function from 'Relu' to 'elu' since we are not working on non negative data, ELU can also be a good activation function. 

- Changing optimiser to SGD to check how the data is divided into smaller batches and uses a stochastic gradient descent algorithm

### ***Model Summary***

- Batch Size : 40 
- Image Height : 120 
- Image Width : 120 
- Epochs - 15 
- Optimizer - SGD
- Activation Function : ELU 

In [12]:
# Input all the images sequencial by building the layer with dropouts and batchnormalisation

model_2 = Sequential()       
model_2.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(30, 120, 120, 3),padding='same'))
model_2.add(BatchNormalization())
model_2.add(Activation('elu'))

model_2.add(Conv3D(16, (3, 3, 3), padding='same'))
model_2.add(Activation('elu'))
model_2.add(BatchNormalization())
model_2.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_2.add(Conv3D(32, (2, 2, 2), padding='same'))
model_2.add(Activation('elu'))
model_2.add(BatchNormalization())
model_2.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_2.add(Conv3D(64, (2, 2, 2), padding='same'))
model_2.add(Activation('elu'))
model_2.add(BatchNormalization())
model_2.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_2.add(Conv3D(128, (2, 2, 2), padding='same'))
model_2.add(Activation('elu'))
model_2.add(BatchNormalization())
model_2.add(MaxPooling3D(pool_size=(2, 2, 2)))      

# Flatten layer 

model_2.add(Flatten())

model_2.add(Dense(1000, activation='elu'))
model_2.add(Dropout(0.5))

model_2.add(Dense(500, activation='elu'))
model_2.add(Dropout(0.5))

#Softmax layer

model_2.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser =optimizers.SGD(lr=0.001, decay=1e-6, momentum=0.7, nesterov=True) #write your optimizer
model_2.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_2.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_6 (Conv3D)            (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_6 (Batch (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_6 (Activation)    (None, 30, 120, 120, 8)   0         
_________________________________________________________________
conv3d_7 (Conv3D)            (None, 30, 120, 120, 16)  3472      
_________________________________________________________________
activation_7 (Activation)    (None, 30, 120, 120, 16)  0         
_________________________________________________________________
batch_normalization_7 (Batch (None, 30, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_5 (MaxPooling3 (None, 15, 60, 60, 16)    0         
__________

In [13]:
# Let us train and validate the model 

train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [14]:
# Let us fit the model

model_2.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/15

Epoch 00001: saving model to model_init_2021-02-0709_59_17.824524/model-00001-3.20075-0.27602-1.45831-0.42000.h5
Epoch 2/15

Epoch 00002: saving model to model_init_2021-02-0709_59_17.824524/model-00002-2.15202-0.44757-1.00729-0.60000.h5
Epoch 3/15

Epoch 00003: saving model to model_init_2021-02-0709_59_17.824524/model-00003-1.80156-0.49591-1.11126-0.61667.h5
Epoch 4/15

Epoch 00004: saving model to model_init_2021-02-0709_59_17.824524/model-00004-1.54384-0.54799-0.92820-0.61667.h5
Epoch 5/15

Epoch 00005: saving model to model_init_2021-02-0709_59_17.824524/model-00005-1.29626-0.57188-1.25243-0.58333.h5
Epoch 6/15

Epoch 00006: saving model to model_init_2021-02-0709_59_17.824524/model-00006-1.21425-0.62284-1.38362-0.60000.h5

Epoch 00006: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 7/15

Epoch 00007: saving model to model_init_2021-02-0709_59_17.824524/model-00007-1.27329-0.61592-1.13523-0.58333.h5
Epoch 8/15

Epoch 00008: saving model to mod

<keras.callbacks.History at 0x7f8a2823a898>

### Results: 

- ***Best Training Accuracy - 75.4 %***
- ***Best Validation Accuracy - 71.67 %***


From our model we can see that our validation accuracy is considerably higher than training accuracy in most of the cases. 

- This could be due to high dropouts since we are using 0.5 
- Very basic model with inadequate amount of data
- Indicates high bias in the neural network 


For all the above reasons, it is better to tune more hyperparamters with the initial model (Model - 1) 

## Model -  3 Hyperparameter Tuned : Epochs (Model 1 - Initial Model)

### ***Model Summary***

- Batch Size : 40 
- Image Height : 120 
- Image Width : 120 
- Epochs - 25 
- Optimizer - Adam 

In [15]:
# Let us experiment different x,y,z value in the CNN network and find tune all the image size & Hyperparameters later

x = 30 # number of frames
y = 120 # image width
z = 120 # image height

# Input all the images sequencial by building the layer with dropouts and batchnormalisation

model_3 = Sequential()       
model_3.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(x,y,z,3),padding='same'))
model_3.add(BatchNormalization())
model_3.add(Activation('relu'))

model_3.add(Conv3D(16, (3, 3, 3), padding='same'))
model_3.add(Activation('relu'))
model_3.add(BatchNormalization())
model_3.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_3.add(Conv3D(32, (2, 2, 2), padding='same'))
model_3.add(Activation('relu'))
model_3.add(BatchNormalization())
model_3.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_3.add(Conv3D(64, (2, 2, 2), padding='same'))
model_3.add(Activation('relu'))
model_3.add(BatchNormalization())
model_3.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_3.add(Conv3D(128, (2, 2, 2), padding='same'))
model_3.add(Activation('relu'))
model_3.add(BatchNormalization())
model_3.add(MaxPooling3D(pool_size=(2, 2, 2)))      

# Flatten layer 

model_3.add(Flatten())

model_3.add(Dense(1000, activation='relu'))
model_3.add(Dropout(0.5))

model_3.add(Dense(500, activation='relu'))
model_3.add(Dropout(0.55))

#Softmax layer

model_3.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_3.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_3.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_11 (Conv3D)           (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_11 (Batc (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_11 (Activation)   (None, 30, 120, 120, 8)   0         
_________________________________________________________________
conv3d_12 (Conv3D)           (None, 30, 120, 120, 16)  3472      
_________________________________________________________________
activation_12 (Activation)   (None, 30, 120, 120, 16)  0         
_________________________________________________________________
batch_normalization_12 (Batc (None, 30, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_9 (MaxPooling3 (None, 15, 60, 60, 16)    0         
__________

In [16]:
# Let us train and validate the model 

train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [17]:
## Let us fit the model

model_3.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25

Epoch 00001: saving model to model_init_2021-02-0709_59_17.824524/model-00001-8.00286-0.30618-12.36093-0.19000.h5
Epoch 2/25

Epoch 00002: saving model to model_init_2021-02-0709_59_17.824524/model-00002-7.87575-0.38363-9.95596-0.33333.h5
Epoch 3/25

Epoch 00003: saving model to model_init_2021-02-0709_59_17.824524/model-00003-6.73049-0.44687-8.06205-0.35000.h5
Epoch 4/25

Epoch 00004: saving model to model_init_2021-02-0709_59_17.824524/model-00004-6.30702-0.42415-4.14065-0.58333.h5
Epoch 5/25

Epoch 00005: saving model to model_init_2021-02-0709_59_17.824524/model-00005-5.33075-0.42812-3.96228-0.46667.h5
Epoch 6/25

Epoch 00006: saving model to model_init_2021-02-0709_59_17.824524/model-00006-4.18274-0.49827-3.06646-0.55000.h5
Epoch 7/25

Epoch 00007: saving model to model_init_2021-02-0709_59_17.824524/model-00007-3.86401-0.50519-3.05288-0.51667.h5
Epoch 8/25

Epoch 00008: saving model to model_init_2021-02-0709_59_17.824524/model-00008-2.77618-0.53979-3.37590-0.46667.h5

<keras.callbacks.History at 0x7f8a730c6160>

### Results: 

- ***Best Training Accuracy - 82.7 %***
- ***Best Validation Accuracy - 80.00 %***

We can clearly see that increasing epoch have increased accuracy.

The above are best values we got in model-3. Going with epoch-25's values as the difference between training and validation accuracy is <5%.

The computation time increases with the number of epochs, however the accuracy also increases and gradually the model runs better. 

Currently, we have obtained our best h5 model file.

### Model 3 is considered as the Base Model and the hyper parameters are trained on this.

## Model - 4 : Hyperparameter Tuned : Increase image height and width

### ***Model Summary***

- Batch Size : 40 
- Image Height : 160 
- Image Width : 160 
- Epochs - 25
- Optimizer - Adam 

In [18]:
# Let us experiment different x,y,z value in the CNN network and find tune all the image size & Hyperparameters later

x = 30 # number of frames
y = 160 # image width
z = 160 # image height

# Input all the images sequencial by building the layer with dropouts and batchnormalisation

model_4 = Sequential()       
model_4.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(x,y,z,3),padding='same'))
model_4.add(BatchNormalization())
model_4.add(Activation('relu'))

model_4.add(Conv3D(16, (3, 3, 3), padding='same'))
model_4.add(Activation('relu'))
model_4.add(BatchNormalization())
model_4.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_4.add(Conv3D(32, (2, 2, 2), padding='same'))
model_4.add(Activation('relu'))
model_4.add(BatchNormalization())
model_4.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_4.add(Conv3D(64, (2, 2, 2), padding='same'))
model_4.add(Activation('relu'))
model_4.add(BatchNormalization())
model_4.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_4.add(Conv3D(128, (2, 2, 2), padding='same'))
model_4.add(Activation('relu'))
model_4.add(BatchNormalization())
model_4.add(MaxPooling3D(pool_size=(2, 2, 2)))      

# Flatten layer 

model_4.add(Flatten())

model_4.add(Dense(1000, activation='relu'))
model_4.add(Dropout(0.5))

model_4.add(Dense(500, activation='relu'))
model_4.add(Dropout(0.5))

#Softmax layer

model_4.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_4.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_4.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_16 (Conv3D)           (None, 30, 160, 160, 8)   656       
_________________________________________________________________
batch_normalization_16 (Batc (None, 30, 160, 160, 8)   32        
_________________________________________________________________
activation_16 (Activation)   (None, 30, 160, 160, 8)   0         
_________________________________________________________________
conv3d_17 (Conv3D)           (None, 30, 160, 160, 16)  3472      
_________________________________________________________________
activation_17 (Activation)   (None, 30, 160, 160, 16)  0         
_________________________________________________________________
batch_normalization_17 (Batc (None, 30, 160, 160, 16)  64        
_________________________________________________________________
max_pooling3d_13 (MaxPooling (None, 15, 80, 80, 16)    0         
__________

In [19]:
# Let us train and validate the model 

train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [20]:
##Let us fit the model

##Commenting as it will throw OOM error
##model_4.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
#                    callbacks=callbacks_list, validation_data=val_generator, 
#                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25


ResourceExhaustedError: OOM when allocating tensor with shape[40,16,30,160,160] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[Node: max_pooling3d_13/MaxPool3D = MaxPool3D[T=DT_FLOAT, _class=["loc:@training_3/Adam/gradients/batch_normalization_17/cond/Merge_grad/cond_grad"], data_format="NDHWC", ksize=[1, 2, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 2, 1], _device="/job:localhost/replica:0/task:0/device:GPU:0"](batch_normalization_17/cond/Merge)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[Node: metrics_3/categorical_accuracy/Mean/_2133 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_3226_metrics_3/categorical_accuracy/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


### Results: 

***OOM Error - ResourceExhaustedError: OOM when allocating tensor with shape[40,16,30,160,160] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc***

- ***Best Training Accuracy - None %***
- ***Best Validation Accuracy - None %***

By increasing the memory the tensor size increases and the GPU outputs results in OOM exception, the resources are not enough to run the tensor.

## Model- 5 : Hyperparameter Tuned  : Image Dimension 

### Since 160 x 160 threw OOM exception, reduced it to 140 x 140 

### ***Model Summary***

- Batch Size : 40 
- Image Height : 140 
- Image Width : 140 
- Epochs - 25
- Optimizer - Adam 

In [22]:
# Let us experiment different x,y,z value in the CNN network and find tune all the image size & Hyperparameters later

x = 30 # number of frames
y = 140 # image width
z = 140 # image height

# Input all the images sequencial by building the layer with dropouts and batchnormalisation

model_5 = Sequential()       
model_5.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(x,y,z,3),padding='same'))
model_5.add(BatchNormalization())
model_5.add(Activation('relu'))

model_5.add(Conv3D(16, (3, 3, 3), padding='same'))
model_5.add(Activation('relu'))
model_5.add(BatchNormalization())
model_5.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_5.add(Conv3D(32, (2, 2, 2), padding='same'))
model_5.add(Activation('relu'))
model_5.add(BatchNormalization())
model_5.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_5.add(Conv3D(64, (2, 2, 2), padding='same'))
model_5.add(Activation('relu'))
model_5.add(BatchNormalization())
model_5.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_5.add(Conv3D(128, (2, 2, 2), padding='same'))
model_5.add(Activation('relu'))
model_5.add(BatchNormalization())
model_5.add(MaxPooling3D(pool_size=(2, 2, 2)))      

# Flatten layer 

model_5.add(Flatten())

model_5.add(Dense(1000, activation='relu'))
model_5.add(Dropout(0.5))

model_5.add(Dense(500, activation='relu'))
model_5.add(Dropout(0.5))

#Softmax layer

model_5.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_5.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_5.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_26 (Conv3D)           (None, 30, 140, 140, 8)   656       
_________________________________________________________________
batch_normalization_26 (Batc (None, 30, 140, 140, 8)   32        
_________________________________________________________________
activation_26 (Activation)   (None, 30, 140, 140, 8)   0         
_________________________________________________________________
conv3d_27 (Conv3D)           (None, 30, 140, 140, 16)  3472      
_________________________________________________________________
activation_27 (Activation)   (None, 30, 140, 140, 16)  0         
_________________________________________________________________
batch_normalization_27 (Batc (None, 30, 140, 140, 16)  64        
_________________________________________________________________
max_pooling3d_21 (MaxPooling (None, 15, 70, 70, 16)    0         
__________

In [23]:
# Let us train and validate the model 

train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [24]:
## Let us fit the model

##Commenting as it will throw OOM error

#model_5.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
#                    callbacks=callbacks_list, validation_data=val_generator, 
#                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25


ResourceExhaustedError: OOM when allocating tensor with shape[40,16,30,140,140] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[Node: training_4/Adam/gradients/max_pooling3d_21/MaxPool3D_grad/MaxPool3DGrad = MaxPool3DGrad[T=DT_FLOAT, TInput=DT_FLOAT, _class=["loc:@training_4/Adam/gradients/batch_normalization_27/cond/Merge_grad/cond_grad"], data_format="NDHWC", ksize=[1, 2, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 2, 1], _device="/job:localhost/replica:0/task:0/device:GPU:0"](batch_normalization_27/cond/Merge, max_pooling3d_21/MaxPool3D, training_4/Adam/gradients/conv3d_28/convolution_grad/Conv3DBackpropInputV2)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


### Results: 

***OOM Error - ResourceExhaustedError: OOM when allocating tensor with shape[40,16,30,160,160] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc***

- ***Best Training Accuracy - None %***
- ***Best Validation Accuracy - None %***

By increasing the memory the tensor size increases and the GPU outputs results in OOM exception, the resources are not enough to run the tensor.

## Model - 6 : Hyperparameter Tuned : Batch Size ( Increased to 50 ) 

### ***Model Summary***

- Batch Size : 50
- Image Height : 120
- Image Width : 120
- Epochs - 25
- Optimizer - Adam 

In [26]:
# Let us experiment different x,y,z value in the CNN network and find tune all the image size & Hyperparameters later

x = 30 # number of frames
y = 120 # image width
z = 120 # image height

# Input all the images sequencial by building the layer with dropouts and batchnormalisation

model_6 = Sequential()       
model_6.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(x,y,z,3),padding='same'))
model_6.add(BatchNormalization())
model_6.add(Activation('relu'))

model_6.add(Conv3D(16, (3, 3, 3), padding='same'))
model_6.add(Activation('relu'))
model_6.add(BatchNormalization())
model_6.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_6.add(Conv3D(32, (2, 2, 2), padding='same'))
model_6.add(Activation('relu'))
model_6.add(BatchNormalization())
model_6.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_6.add(Conv3D(64, (2, 2, 2), padding='same'))
model_6.add(Activation('relu'))
model_6.add(BatchNormalization())
model_6.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_6.add(Conv3D(128, (2, 2, 2), padding='same'))
model_6.add(Activation('relu'))
model_6.add(BatchNormalization())
model_6.add(MaxPooling3D(pool_size=(2, 2, 2)))      

# Flatten layer 

model_6.add(Flatten())

model_6.add(Dense(1000, activation='relu'))
model_6.add(Dropout(0.5))

model_6.add(Dense(500, activation='relu'))
model_6.add(Dropout(0.5))

#Softmax layer

model_6.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_6.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_6.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_31 (Conv3D)           (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_31 (Batc (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_31 (Activation)   (None, 30, 120, 120, 8)   0         
_________________________________________________________________
conv3d_32 (Conv3D)           (None, 30, 120, 120, 16)  3472      
_________________________________________________________________
activation_32 (Activation)   (None, 30, 120, 120, 16)  0         
_________________________________________________________________
batch_normalization_32 (Batc (None, 30, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_25 (MaxPooling (None, 15, 60, 60, 16)    0         
__________

In [None]:
# Let us train and validate the model 
batch_size = 50
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [None]:
#Commenting out as it was causing OOM error.

#model_6.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
#                    callbacks=callbacks_list, validation_data=val_generator, 
#                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

### Results: 

***ResourceExhaustedError: OOM when allocating tensor with shape[50,16,30,120,120] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc***

- ***Best Training Accuracy - None %***
- ***Best Validation Accuracy - None %***

By increasing the memory the tensor size increases and the GPU outputs results in OOM exception, the resources are not enough to run the tensor.

## Model - 7 : Hyperparameter Tuned : Batch Size (Decreasing to 15)

### ***Model Summary***

- Batch Size : 15
- Image Height : 120
- Image Width : 120
- Epochs - 25
- Optimizer - Adam 

In [27]:
# Let us train and validate the model 
batch_size = 15
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

##Reusing model-6 as the overall architecture remains the same but only the batch_size is changed.
model_6.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25

Epoch 00001: saving model to model_init_2021-02-0709_59_17.824524/model-00001-10.40410-0.25098-11.81994-0.26667.h5
Epoch 2/25

Epoch 00002: saving model to model_init_2021-02-0709_59_17.824524/model-00002-11.11381-0.27059-12.53630-0.22222.h5
Epoch 3/25

Epoch 00003: saving model to model_init_2021-02-0709_59_17.824524/model-00003-8.92132-0.41520-12.89448-0.20000.h5

Epoch 00003: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 4/25

Epoch 00004: saving model to model_init_2021-02-0709_59_17.824524/model-00004-9.49549-0.39216-7.00489-0.56667.h5
Epoch 5/25

Epoch 00005: saving model to model_init_2021-02-0709_59_17.824524/model-00005-11.19090-0.29412-12.63675-0.20000.h5
Epoch 6/25

Epoch 00006: saving model to model_init_2021-02-0709_59_17.824524/model-00006-11.37748-0.29412-13.48479-0.13333.h5

Epoch 00006: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
Epoch 7/25

Epoch 00007: saving model to model_init_2021-02-0709_59_17.824524/m

<keras.callbacks.History at 0x7f8a6f694f98>

### Results: 


- ***Best Training Accuracy - 43.14 %***
- ***Best Validation Accuracy - 36.67 %***

By decreasing the number of video sequences in each batch, i.e batch size the model is not able to learn the data, and the overall accuracy is very poor to be selected as the base model.

The difference between training accuracy and validation accuracy is also very poor. 

# Architecture 2 - Conv2D + RNN 

## Model - 8 : Use of Conv2D + RNN architecture

### ***Model Summary***

- Batch Size : 40
- Image Height : 120
- Image Width : 120
- Epochs - 25
- 4 Layers of Conv2D + LSTM + Dense + Softmax

In [28]:
# Let us import all the needed libraries of Keras.

from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, Flatten, BatchNormalization, Activation,Dropout
from keras.layers.convolutional import Conv3D, MaxPooling3D, Conv2D, MaxPooling2D
from keras.layers.recurrent import LSTM
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers


# Let us experiment different x,y,z value in the CNNLSTM network and find tune all the image size & Hyperparameters later

x = 30 # number of frames
y = 120 # image width
z = 120 # image height

# Input all the images sequencial by building the layer with dropouts and batchnormalisation

model_8 = Sequential()   
model_8.add(TimeDistributed(Conv2D(16, (3, 3),padding='same', activation='relu'),input_shape=(x,y,z,3)))
model_8.add(TimeDistributed(BatchNormalization()))
model_8.add(TimeDistributed(MaxPooling2D((2, 2))))
        
model_8.add(TimeDistributed(Conv2D(32, (3, 3) , padding='same', activation='relu')))
model_8.add(TimeDistributed(BatchNormalization()))
model_8.add(TimeDistributed(MaxPooling2D((2, 2))))
        
model_8.add(TimeDistributed(Conv2D(64, (3, 3) , padding='same', activation='relu')))
model_8.add(TimeDistributed(BatchNormalization()))
model_8.add(TimeDistributed(MaxPooling2D((2, 2))))
        
model_8.add(TimeDistributed(Conv2D(128, (3, 3) , padding='same', activation='relu')))
model_8.add(TimeDistributed(BatchNormalization()))
model_8.add(TimeDistributed(MaxPooling2D((2, 2))))

# Flatten layer 

model_8.add(TimeDistributed(Flatten()))

model_8.add(LSTM(64))
model_8.add(Dropout(0.25))

# Dense layer 
model_8.add(Dense(64,activation='relu'))
model_8.add(Dropout(0.25))

# Softmax layer
model_8.add(Dense(5, activation='softmax'))

# Adam optimiser

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_8.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_8.summary())
        

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_1 (TimeDist (None, 30, 120, 120, 16)  448       
_________________________________________________________________
time_distributed_2 (TimeDist (None, 30, 120, 120, 16)  64        
_________________________________________________________________
time_distributed_3 (TimeDist (None, 30, 60, 60, 16)    0         
_________________________________________________________________
time_distributed_4 (TimeDist (None, 30, 60, 60, 32)    4640      
_________________________________________________________________
time_distributed_5 (TimeDist (None, 30, 60, 60, 32)    128       
_________________________________________________________________
time_distributed_6 (TimeDist (None, 30, 30, 30, 32)    0         
_________________________________________________________________
time_distributed_7 (TimeDist (None, 30, 30, 30, 64)    18496     
__________

In [29]:
# Let us train and validate the model 
batch_size = 40
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [30]:
# Let us fit the model

model_8.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25

Epoch 00001: saving model to model_init_2021-02-0709_59_17.824524/model-00001-1.53326-0.31222-1.34602-0.46000.h5
Epoch 2/25

Epoch 00002: saving model to model_init_2021-02-0709_59_17.824524/model-00002-1.30735-0.45013-1.32478-0.41667.h5
Epoch 3/25

Epoch 00003: saving model to model_init_2021-02-0709_59_17.824524/model-00003-1.22884-0.50136-1.30487-0.51667.h5
Epoch 4/25

Epoch 00004: saving model to model_init_2021-02-0709_59_17.824524/model-00004-1.18725-0.50155-1.66637-0.31667.h5
Epoch 5/25

Epoch 00005: saving model to model_init_2021-02-0709_59_17.824524/model-00005-1.18328-0.49201-1.35089-0.41667.h5

Epoch 00005: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 6/25

Epoch 00006: saving model to model_init_2021-02-0709_59_17.824524/model-00006-1.06754-0.59170-1.17173-0.56667.h5
Epoch 7/25

Epoch 00007: saving model to model_init_2021-02-0709_59_17.824524/model-00007-1.00869-0.61938-0.96120-0.61667.h5
Epoch 8/25

Epoch 00008: saving model to mod

<keras.callbacks.History at 0x7f8a6f6c1630>

### Results: 


- ***Best Training Accuracy - 78.89 %***
- ***Best Validation Accuracy - 68.33 %***

The above training accuracy is decent but fails in case of validation accuracy, this could be because the LSTM model used is very simple and introduction of more dense layers can help. 

## Model-9 : With GRU model

### ***Model Summary***

- Batch Size : 40
- Image Height : 120
- Image Width : 120
- Epochs - 25
- 4 Layers of Conv2D + GRU + Dense + Softmax

In [31]:
# Let us experiment different x,y,z value in the CNNLSTM network and find tune all the image size & Hyperparameters later

x = 30 # number of frames
y = 120 # image width
z = 120 # image height

# Input all the images sequencial by building the layer with dropouts and batchnormalisation

model_9 = Sequential()   
model_9.add(TimeDistributed(Conv2D(8, (3, 3),padding='same', activation='relu'),input_shape=(x,y,z,3)))
model_9.add(TimeDistributed(BatchNormalization()))
model_9.add(TimeDistributed(MaxPooling2D((2, 2))))
        
model_9.add(TimeDistributed(Conv2D(16, (3, 3) , padding='same', activation='relu')))
model_9.add(TimeDistributed(BatchNormalization()))
model_9.add(TimeDistributed(MaxPooling2D((2, 2))))
        
model_9.add(TimeDistributed(Conv2D(32, (3, 3) , padding='same', activation='relu')))
model_9.add(TimeDistributed(BatchNormalization()))
model_9.add(TimeDistributed(MaxPooling2D((2, 2))))
        
model_9.add(TimeDistributed(Conv2D(64, (3, 3) , padding='same', activation='relu')))
model_9.add(TimeDistributed(BatchNormalization()))
model_9.add(TimeDistributed(MaxPooling2D((2, 2))))
        
model_9.add(TimeDistributed(Conv2D(128, (3, 3) , padding='same', activation='relu')))
model_9.add(TimeDistributed(BatchNormalization()))
model_9.add(TimeDistributed(MaxPooling2D((2, 2))))

# Flatten layer 

model_9.add(TimeDistributed(Flatten()))

model_9.add(GRU(64))
model_9.add(Dropout(0.25))

# Dense layer 
model_9.add(Dense(64,activation='relu'))
model_9.add(Dropout(0.25))

# Softmax layer
model_9.add(Dense(5, activation='softmax'))

# Adam optimiser

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_9.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_9.summary())
        

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_14 (TimeDis (None, 30, 120, 120, 8)   224       
_________________________________________________________________
time_distributed_15 (TimeDis (None, 30, 120, 120, 8)   32        
_________________________________________________________________
time_distributed_16 (TimeDis (None, 30, 60, 60, 8)     0         
_________________________________________________________________
time_distributed_17 (TimeDis (None, 30, 60, 60, 16)    1168      
_________________________________________________________________
time_distributed_18 (TimeDis (None, 30, 60, 60, 16)    64        
_________________________________________________________________
time_distributed_19 (TimeDis (None, 30, 30, 30, 16)    0         
_________________________________________________________________
time_distributed_20 (TimeDis (None, 30, 30, 30, 32)    4640      
__________

In [32]:
# Let us train and validate the model 
batch_size = 40
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [33]:
model_9.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25

Epoch 00001: saving model to model_init_2021-02-0709_59_17.824524/model-00001-1.46695-0.39367-1.25016-0.52000.h5
Epoch 2/25

Epoch 00002: saving model to model_init_2021-02-0709_59_17.824524/model-00002-1.10179-0.57801-1.10797-0.61667.h5
Epoch 3/25

Epoch 00003: saving model to model_init_2021-02-0709_59_17.824524/model-00003-1.04822-0.58583-1.22596-0.50000.h5
Epoch 4/25

Epoch 00004: saving model to model_init_2021-02-0709_59_17.824524/model-00004-0.92755-0.63777-0.97438-0.60000.h5
Epoch 5/25

Epoch 00005: saving model to model_init_2021-02-0709_59_17.824524/model-00005-0.95843-0.60383-0.98401-0.60000.h5
Epoch 6/25

Epoch 00006: saving model to model_init_2021-02-0709_59_17.824524/model-00006-0.84464-0.68858-1.03267-0.56667.h5

Epoch 00006: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 7/25

Epoch 00007: saving model to model_init_2021-02-0709_59_17.824524/model-00007-0.84717-0.64360-0.84559-0.73333.h5
Epoch 8/25

Epoch 00008: saving model to mod

<keras.callbacks.History at 0x7f8a65472390>

### Results: 


- ***Best Training Accuracy - 93.08 %***
- ***Best Validation Accuracy - 70.00 %***

The difference between Training and validation accuracy is very huge (around 20%) and this indicates Overfitting i.e the model does not fit well for unseen data.

## Model - 10 : Hyper parameter tuned on Base Model (Model 3) - Data Augmentation

### ***Model Summary***

- Batch Size : 40
- Image Height : 120
- Image Width : 120
- Epochs - 25
- Conv3D 
- Cropping and Data Augmentation on Input data - Generator Function

In [4]:
# Let we do the generators and input the images as we see that our images have two different sizes. 
x = 30 # No. of frames images
y = 120 # Width of the image
z = 120 # height

def generatorWithAugmentation(source_path, folder_list, batch_size):
    img_idx = [x for x in range(0,x)] #create a list of image numbers you want to use for a particular video
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,x,y,z,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    # Let us resize all the images 
                    
                    shifted = cv2.warpAffine(image, np.float32([[1, 0, np.random.randint(-20,20)],[0, 1, np.random.randint(-20,20)]]),(image.shape[1], image.shape[0]))
                    gray = cv2.cvtColor(shifted,cv2.COLOR_BGR2GRAY)

                    x0, y0 = np.argwhere(gray > 0).min(axis=0)
                    x1, y1 = np.argwhere(gray > 0).max(axis=0) 
                    
                    cropped=shifted[x0:x1,y0:y1,:]
                    
                    resized_image = imresize(cropped,(y,z))
                    resized_image = resized_image/255
                    
                    batch_data[folder,idx,:,:,0] = (resized_image[:,:,0])#normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (resized_image[:,:,1])#normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (resized_image[:,:,2])#normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches        
        if (len(folder_list) != batch_size*num_batches):
            batch_size = len(folder_list) - (batch_size*num_batches)
            batch_data = np.zeros((batch_size,x,y,z,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    shifted = cv2.warpAffine(image, np.float32([[1, 0, np.random.randint(-20,20)],[0, 1, np.random.randint(-20,20)]]),(image.shape[1], image.shape[0]))
                    gray = cv2.cvtColor(shifted,cv2.COLOR_BGR2GRAY)

                    x0, y0 = np.argwhere(gray > 0).min(axis=0)
                    x1, y1 = np.argwhere(gray > 0).max(axis=0) 
                    
                    cropped=shifted[x0:x1,y0:y1,:]
                    
                    resized_image = imresize(cropped,(y,z))
                    resized_image = resized_image/255
                                        
                    batch_data[folder,idx,:,:,0] = (resized_image[:,:,0])
                    batch_data[folder,idx,:,:,1] = (resized_image[:,:,1])
                    batch_data[folder,idx,:,:,2] = (resized_image[:,:,2])
                   
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [8]:
# Let us import all the needed libraries of Keras.
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, Flatten, BatchNormalization, Activation,Dropout
from keras.layers.convolutional import Conv3D, MaxPooling3D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers

In [7]:
model_10 = Sequential()       
model_10.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(x,y,z,3),padding='same'))
model_10.add(BatchNormalization())
model_10.add(Activation('relu'))

model_10.add(Conv3D(16, (3, 3, 3), padding='same'))
model_10.add(Activation('relu'))
model_10.add(BatchNormalization())
model_10.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_10.add(Conv3D(32, (2, 2, 2), padding='same'))
model_10.add(Activation('relu'))
model_10.add(BatchNormalization())
model_10.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_10.add(Conv3D(64, (2, 2, 2), padding='same'))
model_10.add(Activation('relu'))
model_10.add(BatchNormalization())
model_10.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_10.add(Conv3D(128, (2, 2, 2), padding='same'))
model_10.add(Activation('relu'))
model_10.add(BatchNormalization())
model_10.add(MaxPooling3D(pool_size=(2, 2, 2)))      

# Flatten layer 

model_10.add(Flatten())

model_10.add(Dense(1000, activation='relu'))
model_10.add(Dropout(0.5))

model_10.add(Dense(500, activation='relu'))
model_10.add(Dropout(0.55))

#Softmax layer

model_10.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_10.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_10.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_1 (Conv3D)            (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_1 (Batch (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_1 (Activation)    (None, 30, 120, 120, 8)   0         
_________________________________________________________________
conv3d_2 (Conv3D)            (None, 30, 120, 120, 16)  3472      
_________________________________________________________________
activation_2 (Activation)    (None, 30, 120, 120, 16)  0         
_________________________________________________________________
batch_normalization_2 (Batch (None, 30, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 15, 60, 60, 16)    0         
__________

In [9]:
# Let us see the Validate the Losses and put back the checkpoint

model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, cooldown=1, verbose=1) # write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

In [10]:
# Let us see that the steps_per_epoch and validation steps are used by fit_generator to decide the no. of next()

if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [12]:
train_generator = generatorWithAugmentation(train_path, train_doc, batch_size)
val_generator = generatorWithAugmentation(val_path, val_doc, batch_size)

In [14]:
# Let us fit the model
model_10.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25

Epoch 00001: saving model to model_init_2021-02-0714_43_09.902843/model-00001-8.76739-0.29575-7.33396-0.48333.h5
Epoch 2/25

Epoch 00002: saving model to model_init_2021-02-0714_43_09.902843/model-00002-7.34197-0.40409-7.56350-0.43333.h5
Epoch 3/25

Epoch 00003: saving model to model_init_2021-02-0714_43_09.902843/model-00003-6.57397-0.45352-6.60649-0.35000.h5
Epoch 4/25

Epoch 00004: saving model to model_init_2021-02-0714_43_09.902843/model-00004-5.18537-0.41796-4.92736-0.38333.h5
Epoch 5/25

Epoch 00005: saving model to model_init_2021-02-0714_43_09.902843/model-00005-3.67416-0.48534-7.05714-0.40000.h5
Epoch 6/25

Epoch 00006: saving model to model_init_2021-02-0714_43_09.902843/model-00006-4.12345-0.41176-3.79681-0.43333.h5
Epoch 7/25

Epoch 00007: saving model to model_init_2021-02-0714_43_09.902843/model-00007-2.67785-0.53633-3.52961-0.43333.h5
Epoch 8/25

Epoch 00008: saving model to model_init_2021-02-0714_43_09.902843/model-00008-2.07575-0.58478-2.47751-0.60000.h5


<keras.callbacks.History at 0x7fe2ccd00a58>

### Results: 


- ***Best Training Accuracy - 82.01 %***
- ***Best Validation Accuracy - 73.33 %***

The training accuracy increases gradually throughout the model but fluctuates highly for validation accuracy, and this indicates it is not a stable model and would not perform well on unseen data. 

Data augmentation increases the computation time and is not a suitable model. 

**Hence we continue with Model 3 as our Base Model**

## Model 11 : Hyperparameter Tuned : Model Architecture - Added Dropouts

### ***Model Summary***

- Batch Size : 40
- Image Height : 120
- Image Width : 120
- Epochs - 25
- Conv3D - Add Dropouts at each Conv2D layer

In [16]:
# Let us train and validate the model 
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

**Added more dropouts, and each dropout value is changed from 0.5 to 0.25**

In [17]:
model_11 = Sequential()       
model_11.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(x,y,z,3),padding='same'))
model_11.add(BatchNormalization())
model_11.add(Activation('relu'))

model_11.add(Conv3D(16, (3, 3, 3), padding='same'))
model_11.add(Activation('relu'))
model_11.add(BatchNormalization())
model_11.add(MaxPooling3D(pool_size=(2, 2, 2)))
model_11.add(Dropout(0.25))

model_11.add(Conv3D(32, (2, 2, 2), padding='same'))
model_11.add(Activation('relu'))
model_11.add(BatchNormalization())
model_11.add(MaxPooling3D(pool_size=(2, 2, 2)))
model_11.add(Dropout(0.25))

model_11.add(Conv3D(64, (2, 2, 2), padding='same'))
model_11.add(Activation('relu'))
model_11.add(BatchNormalization())
model_11.add(MaxPooling3D(pool_size=(2, 2, 2)))
model_11.add(Dropout(0.25))

model_11.add(Conv3D(128, (2, 2, 2), padding='same'))
model_11.add(Activation('relu'))
model_11.add(BatchNormalization())
model_11.add(MaxPooling3D(pool_size=(2, 2, 2)))    
model_11.add(Dropout(0.25))  

# Flatten layer 

model_11.add(Flatten())

model_11.add(Dense(1000, activation='relu'))
model_11.add(Dropout(0.25))

model_11.add(Dense(500, activation='relu'))
model_11.add(Dropout(0.25))

#Softmax layer

model_11.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_11.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_11.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_6 (Conv3D)            (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_6 (Batch (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_6 (Activation)    (None, 30, 120, 120, 8)   0         
_________________________________________________________________
conv3d_7 (Conv3D)            (None, 30, 120, 120, 16)  3472      
_________________________________________________________________
activation_7 (Activation)    (None, 30, 120, 120, 16)  0         
_________________________________________________________________
batch_normalization_7 (Batch (None, 30, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_5 (MaxPooling3 (None, 15, 60, 60, 16)    0         
__________

In [18]:
# Let us fit the model
model_11.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25

Epoch 00001: saving model to model_init_2021-02-0714_43_09.902843/model-00001-12.25312-0.20814-12.24975-0.24000.h5
Epoch 2/25

Epoch 00002: saving model to model_init_2021-02-0714_43_09.902843/model-00002-13.15006-0.18414-12.89448-0.20000.h5
Epoch 3/25

Epoch 00003: saving model to model_init_2021-02-0714_43_09.902843/model-00003-12.78029-0.20708-12.34559-0.23333.h5

Epoch 00003: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 4/25

Epoch 00004: saving model to model_init_2021-02-0714_43_09.902843/model-00004-13.12402-0.18576-12.08857-0.25000.h5
Epoch 5/25

Epoch 00005: saving model to model_init_2021-02-0714_43_09.902843/model-00005-12.51341-0.22364-12.89448-0.20000.h5
Epoch 6/25

Epoch 00006: saving model to model_init_2021-02-0714_43_09.902843/model-00006-13.38527-0.16955-12.56553-0.21667.h5

Epoch 00006: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
Epoch 7/25

Epoch 00007: saving model to model_init_2021-02-0714_43_09.90284

<keras.callbacks.History at 0x7fe36881fcf8>

### Results: 


- ***Best Training Accuracy - 20.76 % approx***
- ***Best Validation Accuracy - 20.67 % approx***

The dropout layers increase the number of parameters that are assigned as 0 (dropping out 25% neurons in most of layers) and this underfits the model, as a result the model does not learn properly and it leads to a very poor accuracy.

**Hence we continue with Model 3 as our Base Model**

## Model 12 : Hyperparameter Tuned : Model Architecture - Added more dense layers

In [12]:
model_12 = Sequential()       
model_12.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(x,y,z,3),padding='same'))
model_12.add(BatchNormalization())
model_12.add(Activation('relu'))

model_12.add(Conv3D(16, (3, 3, 3), padding='same'))
model_12.add(Activation('relu'))
model_12.add(BatchNormalization())
model_12.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_12.add(Conv3D(32, (2, 2, 2), padding='same'))
model_12.add(Activation('relu'))
model_12.add(BatchNormalization())

model_12.add(Conv3D(32, (2, 2, 2), padding='same'))
model_12.add(Activation('relu'))
model_12.add(BatchNormalization())
model_12.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_12.add(Conv3D(64, (2, 2, 2), padding='same'))
model_12.add(Activation('relu'))
model_12.add(BatchNormalization())

model_12.add(Conv3D(64, (2, 2, 2), padding='same'))
model_12.add(Activation('relu'))
model_12.add(BatchNormalization())
model_12.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_12.add(Conv3D(128, (2, 2, 2), padding='same'))
model_12.add(Activation('relu'))
model_12.add(BatchNormalization())

model_12.add(Conv3D(128, (2, 2, 2), padding='same'))
model_12.add(Activation('relu'))
model_12.add(BatchNormalization())
model_12.add(MaxPooling3D(pool_size=(2, 2, 2)))    

# Flatten layer 

model_12.add(Flatten())

model_12.add(Dense(1000, activation='relu'))
model_12.add(Dropout(0.25))

model_12.add(Dense(500, activation='relu'))
model_12.add(Dropout(0.25))

#Softmax layer

model_12.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_12.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_12.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_1 (Conv3D)            (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_1 (Batch (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_1 (Activation)    (None, 30, 120, 120, 8)   0         
_________________________________________________________________
conv3d_2 (Conv3D)            (None, 30, 120, 120, 16)  3472      
_________________________________________________________________
activation_2 (Activation)    (None, 30, 120, 120, 16)  0         
_________________________________________________________________
batch_normalization_2 (Batch (None, 30, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 15, 60, 60, 16)    0         
__________

In [13]:
# Let us fit the model
model_12.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25

Epoch 00001: saving model to model_init_2021-02-0716_33_19.139208/model-00001-11.42872-0.21870-13.05566-0.19000.h5
Epoch 2/25

Epoch 00002: saving model to model_init_2021-02-0716_33_19.139208/model-00002-11.20997-0.29668-11.28267-0.30000.h5
Epoch 3/25

Epoch 00003: saving model to model_init_2021-02-0716_33_19.139208/model-00003-11.52736-0.27248-12.62584-0.21667.h5
Epoch 4/25

Epoch 00004: saving model to model_init_2021-02-0716_33_19.139208/model-00004-11.20947-0.29721-12.62584-0.21667.h5

Epoch 00004: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 5/25

Epoch 00005: saving model to model_init_2021-02-0716_33_19.139208/model-00005-10.36683-0.34824-11.28267-0.30000.h5
Epoch 6/25

Epoch 00006: saving model to model_init_2021-02-0716_33_19.139208/model-00006-10.05143-0.35294-10.84612-0.26667.h5
Epoch 7/25

Epoch 00007: saving model to model_init_2021-02-0716_33_19.139208/model-00007-10.75083-0.30796-12.23412-0.23333.h5
Epoch 8/25

Epoch 00008: savin

<keras.callbacks.History at 0x7ff43c3dbbe0>

### Results: 


- ***Best Training Accuracy - 54.33 %***
- ***Best Validation Accuracy - 45.00 %***

The model is undefitting as well as complex.The foremost objective of training machine learning based model is to keep a good trade-off between simplicity of the model and the performance accuracy which is not achieved with this model.

**Hence we continue with Model 3 as our Base Model**

## Model - 13 : To reduce memory footprint of Model 3 

### ***Model Summary***

- Batch Size : 40
- Image Height : 120
- Image Width : 120
- Epochs - 25
- Currently, the avg parameters that need to be trained = 7 Million , Reducing the parameters help us achieve a better model with less number of training parameters

In [19]:
# Let us experiment different x,y,z value in the CNN network and find tune all the image size & Hyperparameters later

x = 30 # number of frames
y = 120 # image width
z = 120 # image height

# Input all the images sequencial by building the layer with dropouts and batchnormalisation

model_13 = Sequential()       
model_13.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(x,y,z,3),padding='same'))
model_13.add(BatchNormalization())
model_13.add(Activation('relu'))

model_13.add(Conv3D(16, (3, 3, 3), padding='same'))
model_13.add(Activation('relu'))
model_13.add(BatchNormalization())
model_13.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_13.add(Conv3D(32, (2, 2, 2), padding='same'))
model_13.add(Activation('relu'))
model_13.add(BatchNormalization())
model_13.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_13.add(Conv3D(64, (2, 2, 2), padding='same'))
model_13.add(Activation('relu'))
model_13.add(BatchNormalization())
model_13.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_13.add(Conv3D(128, (2, 2, 2), padding='same'))
model_13.add(Activation('relu'))
model_13.add(BatchNormalization())
model_13.add(MaxPooling3D(pool_size=(2, 2, 2)))      

# Flatten layer 

model_13.add(Flatten())

model_13.add(Dense(256, activation='relu'))
model_13.add(Dropout(0.25))

model_13.add(Dense(256, activation='relu'))
model_13.add(Dropout(0.25))

#Softmax layer

model_13.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_13.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_13.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_9 (Conv3D)            (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_9 (Batch (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_9 (Activation)    (None, 30, 120, 120, 8)   0         
_________________________________________________________________
conv3d_10 (Conv3D)           (None, 30, 120, 120, 16)  3472      
_________________________________________________________________
activation_10 (Activation)   (None, 30, 120, 120, 16)  0         
_________________________________________________________________
batch_normalization_10 (Batc (None, 30, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_5 (MaxPooling3 (None, 15, 60, 60, 16)    0         
__________

In [20]:
# Let us fit the model
model_13.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25

Epoch 00001: saving model to model_init_2021-02-0716_53_40.973445/model-00001-2.60513-0.35747-1.74344-0.42000.h5
Epoch 2/25

Epoch 00002: saving model to model_init_2021-02-0716_53_40.973445/model-00002-1.18538-0.54476-2.08811-0.28333.h5
Epoch 3/25

Epoch 00003: saving model to model_init_2021-02-0716_53_40.973445/model-00003-1.02032-0.60218-1.90722-0.50000.h5

Epoch 00003: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 4/25

Epoch 00004: saving model to model_init_2021-02-0716_53_40.973445/model-00004-0.93048-0.66563-0.93835-0.60000.h5
Epoch 5/25

Epoch 00005: saving model to model_init_2021-02-0716_53_40.973445/model-00005-0.75794-0.72843-0.61726-0.76667.h5
Epoch 6/25

Epoch 00006: saving model to model_init_2021-02-0716_53_40.973445/model-00006-0.53558-0.80623-0.91126-0.58333.h5
Epoch 7/25

Epoch 00007: saving model to model_init_2021-02-0716_53_40.973445/model-00007-0.75382-0.74740-0.92161-0.68333.h5

Epoch 00007: ReduceLROnPlateau reducing lea

<keras.callbacks.History at 0x7ff436a7b048>

### Results: 


- ***Best Training Accuracy - 93.77 % approx***
- ***Best Validation Accuracy - 81.67% approx***

The model is overfitting as there is a huge difference between the training and validation accuracy.

**Hence we continue with Model 3 as our Base Model**

## Model 14 : Hyperparameter Tuned : Filter size and Dense neurons (128)

### ***Model Summary***

- Batch Size : 40
- Image Height : 120
- Image Width : 120
- Epochs - 25
- Dense Neurons : 128 
- Filter size = 2,2,2

In [21]:
# Let we do the generators and input the images as we see that our images have two different sizes. 
x = 20 # No. of frames images
y = 120 # Width of the image
z = 120 # height

def generator(source_path, folder_list, batch_size):
    img_idx = [x for x in range(0,x)] #create a list of image numbers you want to use for a particular video
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,x,y,z,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    # Let us resize all the images 
                    resized_image = imresize(image,(y,z))
                    resized_image = resized_image/255
                    
                    batch_data[folder,idx,:,:,0] = (resized_image[:,:,0])#normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (resized_image[:,:,1])#normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (resized_image[:,:,2])#normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches        
        if (len(folder_list) != batch_size*num_batches):
            batch_size = len(folder_list) - (batch_size*num_batches)
            batch_data = np.zeros((batch_size,x,y,z,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    resized_image = imresize(image,(y,z))
                    resized_image = resized_image/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (resized_image[:,:,0])
                    batch_data[folder,idx,:,:,1] = (resized_image[:,:,1])
                    batch_data[folder,idx,:,:,2] = (resized_image[:,:,2])
                   
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels


In [25]:
# Let us experiment different x,y,z value in the CNN network and find tune all the image size & Hyperparameters later

x = 20 # number of frames
y = 120 # image width
z = 120 # image height

# Input all the images sequencial by building the layer with dropouts and batchnormalisation

model_14 = Sequential()       
model_14.add(Conv3D(8,kernel_size=(2,2,2),input_shape=(x,y,z,3),padding='same'))
model_14.add(BatchNormalization())
model_14.add(Activation('relu'))

model_14.add(Conv3D(16, (2, 2, 2), padding='same'))
model_14.add(Activation('relu'))
model_14.add(BatchNormalization())
model_14.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_14.add(Conv3D(32, (2, 2, 2), padding='same'))
model_14.add(Activation('relu'))
model_14.add(BatchNormalization())
model_14.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_14.add(Conv3D(64, (2, 2, 2), padding='same'))
model_14.add(Activation('relu'))
model_14.add(BatchNormalization())
model_14.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_14.add(Conv3D(128, (2, 2, 2), padding='same'))
model_14.add(Activation('relu'))
model_14.add(BatchNormalization())
model_14.add(MaxPooling3D(pool_size=(2, 2, 2)))      

# Flatten layer 

model_14.add(Flatten())

model_14.add(Dense(128, activation='relu'))
model_14.add(Dropout(0.25))

model_14.add(Dense(128, activation='relu'))
model_14.add(Dropout(0.25))

#Softmax layer

model_14.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_14.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_14.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_29 (Conv3D)           (None, 20, 120, 120, 8)   200       
_________________________________________________________________
batch_normalization_29 (Batc (None, 20, 120, 120, 8)   32        
_________________________________________________________________
activation_29 (Activation)   (None, 20, 120, 120, 8)   0         
_________________________________________________________________
conv3d_30 (Conv3D)           (None, 20, 120, 120, 16)  1040      
_________________________________________________________________
activation_30 (Activation)   (None, 20, 120, 120, 16)  0         
_________________________________________________________________
batch_normalization_30 (Batc (None, 20, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_21 (MaxPooling (None, 10, 60, 60, 16)    0         
__________

In [26]:
# Let us train and validate the model 
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [27]:
# Let us fit the model
model_14.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25

Epoch 00001: saving model to model_init_2021-02-0716_53_40.973445/model-00001-2.57966-0.29563-1.42916-0.46000.h5
Epoch 2/25

Epoch 00002: saving model to model_init_2021-02-0716_53_40.973445/model-00002-1.35608-0.41176-1.25566-0.36667.h5
Epoch 3/25

Epoch 00003: saving model to model_init_2021-02-0716_53_40.973445/model-00003-1.29840-0.44959-1.14163-0.50000.h5
Epoch 4/25

Epoch 00004: saving model to model_init_2021-02-0716_53_40.973445/model-00004-1.18593-0.51703-1.10105-0.56667.h5
Epoch 5/25

Epoch 00005: saving model to model_init_2021-02-0716_53_40.973445/model-00005-1.27561-0.48562-1.06145-0.60000.h5
Epoch 6/25

Epoch 00006: saving model to model_init_2021-02-0716_53_40.973445/model-00006-1.27217-0.45675-1.11572-0.63333.h5
Epoch 7/25

Epoch 00007: saving model to model_init_2021-02-0716_53_40.973445/model-00007-1.18191-0.51903-1.19523-0.46667.h5

Epoch 00007: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 8/25

Epoch 00008: saving model to mod

<keras.callbacks.History at 0x7ff3f5366160>

### Results: 


- ***Best Training Accuracy - 80.00 % approx***
- ***Best Validation Accuracy - 75.00 % approx***

Computational Incapabilities can be handled and the model is a good fit.

**Best Model to be used incase there are memory constraints as it can achieve the same accuracy at a lower number of parameters.**

## Re-run Model 3 to check if the overall accuracy does not drop

This is an extra piece of code that was changed to check the model after restarting the model

In [7]:
# Let us experiment different x,y,z value in the CNN network and find tune all the image size & Hyperparameters later

x = 30 # number of frames
y = 120 # image width
z = 120 # image height

# Input all the images sequencial by building the layer with dropouts and batchnormalisation

model_Three = Sequential()       
model_Three.add(Conv3D(8,kernel_size=(3,3,3),input_shape=(x,y,z,3),padding='same'))
model_Three.add(BatchNormalization())
model_Three.add(Activation('relu'))

model_Three.add(Conv3D(16, (3, 3, 3), padding='same'))
model_Three.add(Activation('relu'))
model_Three.add(BatchNormalization())
model_Three.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_Three.add(Conv3D(32, (2, 2, 2), padding='same'))
model_Three.add(Activation('relu'))
model_Three.add(BatchNormalization())
model_Three.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_Three.add(Conv3D(64, (2, 2, 2), padding='same'))
model_Three.add(Activation('relu'))
model_Three.add(BatchNormalization())
model_Three.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_Three.add(Conv3D(128, (2, 2, 2), padding='same'))
model_Three.add(Activation('relu'))
model_Three.add(BatchNormalization())
model_Three.add(MaxPooling3D(pool_size=(2, 2, 2)))      

# Flatten layer 

model_Three.add(Flatten())

model_Three.add(Dense(1000, activation='relu'))
model_Three.add(Dropout(0.5))

model_Three.add(Dense(500, activation='relu'))
model_Three.add(Dropout(0.55))

#Softmax layer

model_Three.add(Dense(5, activation='softmax'))

# Let us use the Adam optimiser 

optimiser = optimizers.Adam(lr=0.001) #write your optimizer
model_Three.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_Three.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_6 (Conv3D)            (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_6 (Batch (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_6 (Activation)    (None, 30, 120, 120, 8)   0         
_________________________________________________________________
conv3d_7 (Conv3D)            (None, 30, 120, 120, 16)  3472      
_________________________________________________________________
activation_7 (Activation)    (None, 30, 120, 120, 16)  0         
_________________________________________________________________
batch_normalization_7 (Batch (None, 30, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_5 (MaxPooling3 (None, 15, 60, 60, 16)    0         
__________

In [8]:
# Let us train and validate the model 
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [13]:
## Let us fit the model
model_Three.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=25, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/25

Epoch 00001: saving model to model_init_2021-02-0717_41_34.923192/model-00001-8.64316-0.29412-7.42261-0.41000.h5
Epoch 2/25

Epoch 00002: saving model to model_init_2021-02-0717_41_34.923192/model-00002-6.96559-0.37340-10.11912-0.31667.h5
Epoch 3/25

Epoch 00003: saving model to model_init_2021-02-0717_41_34.923192/model-00003-5.88857-0.42234-4.70760-0.46667.h5
Epoch 4/25

Epoch 00004: saving model to model_init_2021-02-0717_41_34.923192/model-00004-3.40301-0.49226-2.43684-0.53333.h5
Epoch 5/25

Epoch 00005: saving model to model_init_2021-02-0717_41_34.923192/model-00005-2.44029-0.53355-6.41404-0.31667.h5
Epoch 6/25

Epoch 00006: saving model to model_init_2021-02-0717_41_34.923192/model-00006-2.07127-0.55363-2.12476-0.53333.h5
Epoch 7/25

Epoch 00007: saving model to model_init_2021-02-0717_41_34.923192/model-00007-1.81883-0.55709-1.61929-0.41667.h5
Epoch 8/25

Epoch 00008: saving model to model_init_2021-02-0717_41_34.923192/model-00008-1.39070-0.57439-2.62719-0.45000.h5

<keras.callbacks.History at 0x7f7709fe4ac8>

### Results: 

- ***Best Training Accuracy - 85.50 %***
- ***Best Validation Accuracy - 82.00 %***

We can clearly see that increasing epoch have increased accuracy.

The above are best values we got in model-3. Going with epoch-25's values as the difference between training and validation accuracy is <5%.

The computation time increases with the number of epochs, however the accuracy also increases and gradually the model runs better. 

Since, we are more emphasizing on performance now and not computation time, let’s still use model-3 as our final model. 

Currently, we have obtained our best h5 model file.

------------------------------------------------------------------------------------------------------------------------