# Gesture Recognition
In this group project, you are going to build a 3D Conv model that will be able to predict the 5 gestures correctly. Please import the following libraries to get started.

In [1]:
import numpy as np
import os
import datetime
import imageio
from imageio import imread
import cv2

def imread(path):
    from PIL import Image
    return np.array(Image.open(path))

def imresize(img, size):
    from PIL import Image
    return np.array(Image.fromarray(img).resize(size))

We set the random seed so that the results don't vary drastically.

In [2]:
np.random.seed(30)
import random as rn
rn.seed(30)
import tensorflow as tf
tf.random.set_seed(30)

In this block, you read the folder names for training and validation. You also set the `batch_size` here. Note that you set the batch size in such a way that you are able to use the GPU in full capacity. You keep increasing the batch size until the machine throws an error.

In [3]:
train_doc = np.random.permutation(open('Project_data/train.csv').readlines())
val_doc = np.random.permutation(open('Project_data/val.csv').readlines())
batch_size = 16 #experiment with the batch size

## Generator
This is one of the most important part of the code. The overall structure of the generator has been given. In the generator, you are going to preprocess the images as you have images of 2 different dimensions as well as create a batch of video frames. You have to experiment with `img_idx`, `y`,`z` and normalization such that you get high accuracy.

In [4]:
def generator(source_path, folder_list, batch_size, img_idx_v,img_h, img_w):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = img_idx_v  #list of image numbers you want to use for a particular video passed to generator
    y = img_h # image height we want to use with model passsed to generator
    z = img_w # image width we want to use with model passsed to generator
    x = len(img_idx)
    while True:
        t = np.random.permutation(folder_list)
        num_batches = int(len(folder_list)/batch_size) # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            #x=len(img_idx)
            batch_data = np.zeros((batch_size,x,y,z,3))  # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imageio.imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    height, width , channel = image.shape
                    if height == 120 or width == 120:
                        image=image[20:140,:120,:]
                    
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    image = cv2.resize(image,(y,z))
                    image_r = image
                    image_r[:,:,1:2] = 0
                    batch_data[folder,idx,:,:,0] = (image_r[:,:,0] - image_r.mean())/image_r.std()#normalise and feed in the image
                    
                    image_g = image
                    image_g[:,:,0:2] = 0
                    batch_data[folder,idx,:,:,1] = (image_g[:,:,1] - image_g.mean())/image_g.std()#normalise and feed in the image
                    
                    image_b = image
                    image_b[:,:,0:1] = 0
                    batch_data[folder,idx,:,:,2] = (image_b[:,:,2] - image_b.mean())/image_b.std() #normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        
        batch_size_r=len(folder_list)%batch_size
        if batch_size_r!=0:
            batch_data = np.zeros((batch_size_r,x,y,z,3))
            batch_labels = np.zeros((batch_size_r,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size_r): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imageio.imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    height, width , channel = image.shape
                    if height == 120 or width == 120:
                        image=image[20:140,:120,:]
                    
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    image = cv2.resize(image,(y,z))
                    image_r = image
                    image_r[:,:,1:2] = 0
                    batch_data[folder,idx,:,:,0] = (image_r[:,:,0] - image_r.mean())/image_r.std()#normalise and feed in the image
                    
                    image_g = image
                    image_g[:,:,0:2] = 0
                    batch_data[folder,idx,:,:,1] = (image_g[:,:,1] - image_g.mean())/image_g.std()#normalise and feed in the image
                    
                    image_b = image
                    image_b[:,:,0:1] = 0
                    batch_data[folder,idx,:,:,2] = (image_b[:,:,2] - image_b.mean())/image_b.std() #normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

Note here that a video is represented above in the generator as (number of images, height, width, number of channels). Take this into consideration while creating the model architecture.

In [5]:
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 15 # choose the number of epochs
print ('# epochs =', num_epochs)

# training sequences = 663
# validation sequences = 100
# epochs = 15


## Model
Here you make the model using different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D` and not `Conv2D` and `Maxpooling2D` for a 3D convolution model. You would want to use `TimeDistributed` while building a Conv2D + RNN model. Also remember that the last layer is the softmax. Design the network in such a way that the model is able to give good accuracy on the least number of parameters so that it can fit in the memory of the webcam.

In [6]:
import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, GRU, Flatten, TimeDistributed, Flatten, BatchNormalization, Activation
from tensorflow.keras.layers import Conv3D, MaxPooling3D
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping
from tensorflow.keras import optimizers

###

from tensorflow.keras.layers import Dropout, Conv2D, MaxPooling2D


#### We will experiment different conv3D models.**

### Base Model with Limited Data
### Model 1 Batch Size: 128, Number of images from each video: 15 Image Size: 160x160, Epochs: 15 - Tried with batch size 128 but resources exhausted






### Reducing Batch Size to 64
### Model 2 Batch Size: 64, Number of images from each video: 15 Image Size: 160x160, Epochs: 15 

In [8]:
img_per_video=15
img_h = 160
img_w = 160
batch_size = 64#experiment with the batch size
filtersize = (3,3,3)
img_idx_v = np.round(np.linspace(0,29,img_per_video)).astype(int)

model = Sequential()

model.add(Conv3D(16, filtersize, activation='relu', input_shape=(len(img_idx_v),img_h,img_w,3), padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))

model.add(Conv3D(32, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(64, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))

model.add(Conv3D(128, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))

model.add(Flatten())

model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())


model.add(Dense(5, activation='softmax'))

#Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

optimiser = 'adam' #write your optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 15, 160, 160, 16)  1312      
_________________________________________________________________
batch_normalization (BatchNo (None, 15, 160, 160, 16)  64        
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 8, 80, 80, 16)     0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 8, 80, 80, 32)     13856     
_________________________________________________________________
activation (Activation)      (None, 8, 80, 80, 32)     0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 8, 80, 80, 32)     128       
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 4, 40, 40, 32)     0

Let us create the `train_generator` and the `val_generator` which will be used in `.fit_generator`.

In [9]:
train_generator = generator(train_path, train_doc, batch_size, img_idx_v, img_h, img_w)
val_generator = generator(val_path, val_doc, batch_size, img_idx_v, img_h, img_w)

In [10]:
curr_dt_time = datetime.datetime.now()
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, min_lr=0.001)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]



In [11]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

Let us now fit the model. This will start training the model and with the help of the checkpoints, you'll be able to save the model at the end of each epoch.

In [12]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 64
Epoch 1/15

Epoch 00001: val_loss improved from inf to 5.46895, saving model to model_init_2021-11-1813_02_42.391809/model-00001-1.43153-0.47210-5.46895-0.21000.h5
Epoch 2/15

Epoch 00002: val_loss improved from 5.46895 to 2.51069, saving model to model_init_2021-11-1813_02_42.391809/model-00002-0.67901-0.77979-2.51069-0.24000.h5
Epoch 3/15

Epoch 00003: val_loss improved from 2.51069 to 1.48566, saving model to model_init_2021-11-1813_02_42.391809/model-00003-0.38799-0.87330-1.48566-0.32000.h5
Epoch 4/15

Epoch 00004: val_loss did not improve from 1.48566
Epoch 5/15

Epoch 00005: val_loss did not improve from 1.48566
Epoch 6/15

Epoch 00006: val_loss did not improve from 1.48566
Epoch 7/15

Epoch 00007: val_loss did not improve from 1.48566
Epoch 8/15

Epoch 00008: val_loss did not improve from 1.48566
Epoch 9/15

Epoch 00009: val_loss did not improve from 1.48566
Epoch 10/15

Epoch 00010: val_loss did not improve from 1.48566
Epoch 

<tensorflow.python.keras.callbacks.History at 0x7f2f6401cc40>

#### We see a clear overfit model. Let us try to add Dropouts
### Model 3 Batch Size: 64, Number of images from each video: 15 Image Size: 160x160, Epochs: 15 with Dropouts

In [8]:
img_per_video=15
img_h = 160
img_w = 160
batch_size = 64 #experiment with the batch size
filtersize = (3,3,3)
img_idx_v = np.round(np.linspace(0,29,img_per_video)).astype(int)

model = Sequential()

model.add(Conv3D(16, filtersize, activation='relu', input_shape=(len(img_idx_v),img_h,img_w,3), padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))


model.add(Conv3D(32, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.25))

model.add(Conv3D(64, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))

model.add(Conv3D(128, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))

model.add(Flatten())

model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))


model.add(Dense(5, activation='softmax'))

#Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

optimiser = 'adam' #write your optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 15, 160, 160, 16)  1312      
_________________________________________________________________
batch_normalization (BatchNo (None, 15, 160, 160, 16)  64        
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 8, 80, 80, 16)     0         
_________________________________________________________________
dropout (Dropout)            (None, 8, 80, 80, 16)     0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 8, 80, 80, 32)     13856     
_________________________________________________________________
activation (Activation)      (None, 8, 80, 80, 32)     0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 8, 80, 80, 32)     1

In [9]:
train_generator = generator(train_path, train_doc, batch_size, img_idx_v, img_h, img_w)
val_generator = generator(val_path, val_doc, batch_size, img_idx_v, img_h, img_w)

In [10]:
curr_dt_time = datetime.datetime.now()
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, min_lr=0.001)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]



In [11]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [12]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 64
Epoch 1/15

Epoch 00001: val_loss improved from inf to 1.63770, saving model to model_init_2021-11-1813_40_53.447876/model-00001-1.89984-0.34992-1.63770-0.39000.h5
Epoch 2/15

Epoch 00002: val_loss did not improve from 1.63770
Epoch 3/15

Epoch 00003: val_loss did not improve from 1.63770
Epoch 4/15

Epoch 00004: val_loss did not improve from 1.63770
Epoch 5/15

Epoch 00005: val_loss did not improve from 1.63770
Epoch 6/15

Epoch 00006: val_loss did not improve from 1.63770
Epoch 7/15

Epoch 00007: val_loss did not improve from 1.63770
Epoch 8/15

Epoch 00008: val_loss did not improve from 1.63770
Epoch 9/15

Epoch 00009: val_loss did not improve from 1.63770
Epoch 10/15

Epoch 00010: val_loss did not improve from 1.63770
Epoch 11/15

Epoch 00011: val_loss did not improve from 1.63770
Epoch 12/15

Epoch 00012: val_loss did not improve from 1.63770
Epoch 13/15

Epoch 00013: val_loss did not improve from 1.63770
Epoch 14/15

Epoch 00014

<tensorflow.python.keras.callbacks.History at 0x7ff44c4d53d0>

#### Dropouts have not helped that much. Let us add more data and try with 22 images per video

### Model 4 Batch Size: 64, Number of images from each video: 22 Image Size: 160x160, Epochs: 15 with Dropouts

#### Tried But Kernel Died so Reducing the batch Size in next model to 40


### Model 5 Batch Size: 40, Number of images from each video: 22 Image Size: 160x160, Epochs: 15 with Dropouts

In [19]:
img_per_video=22
img_h = 160
img_w = 160
batch_size = 40 #experiment with the batch size
filtersize = (3,3,3)
img_idx_v = np.round(np.linspace(0,29,img_per_video)).astype(int)

model = Sequential()

model.add(Conv3D(16, filtersize, activation='relu', input_shape=(len(img_idx_v),img_h,img_w,3), padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))


model.add(Conv3D(32, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.25))

model.add(Conv3D(64, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))

model.add(Conv3D(128, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))

model.add(Flatten())

model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))


model.add(Dense(5, activation='softmax'))

#Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

optimiser = 'adam' #write your optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_4 (Conv3D)            (None, 22, 160, 160, 16)  1312      
_________________________________________________________________
batch_normalization_5 (Batch (None, 22, 160, 160, 16)  64        
_________________________________________________________________
max_pooling3d_4 (MaxPooling3 (None, 11, 80, 80, 16)    0         
_________________________________________________________________
dropout_5 (Dropout)          (None, 11, 80, 80, 16)    0         
_________________________________________________________________
conv3d_5 (Conv3D)            (None, 11, 80, 80, 32)    13856     
_________________________________________________________________
activation_3 (Activation)    (None, 11, 80, 80, 32)    0         
_________________________________________________________________
batch_normalization_6 (Batch (None, 11, 80, 80, 32)   

In [20]:
train_generator = generator(train_path, train_doc, batch_size, img_idx_v, img_h, img_w)
val_generator = generator(val_path, val_doc, batch_size, img_idx_v, img_h, img_w)

In [21]:
curr_dt_time = datetime.datetime.now()
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, min_lr=0.001)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]



In [22]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [23]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 40
Epoch 1/15

Epoch 00001: val_loss improved from inf to 1.77818, saving model to model_init_2021-11-1814_01_50.031547/model-00001-1.70793-0.38763-1.77818-0.33000.h5
Epoch 2/15

Epoch 00002: val_loss did not improve from 1.77818
Epoch 3/15

Epoch 00003: val_loss did not improve from 1.77818
Epoch 4/15

Epoch 00004: val_loss did not improve from 1.77818
Epoch 5/15

Epoch 00005: val_loss did not improve from 1.77818
Epoch 6/15

Epoch 00006: val_loss did not improve from 1.77818
Epoch 7/15

Epoch 00007: val_loss did not improve from 1.77818
Epoch 8/15

Epoch 00008: val_loss did not improve from 1.77818
Epoch 9/15

Epoch 00009: val_loss did not improve from 1.77818
Epoch 10/15

Epoch 00010: val_loss did not improve from 1.77818
Epoch 11/15

Epoch 00011: val_loss did not improve from 1.77818
Epoch 12/15

Epoch 00012: val_loss did not improve from 1.77818
Epoch 13/15

Epoch 00013: val_loss did not improve from 1.77818
Epoch 14/15

Epoch 00014

<tensorflow.python.keras.callbacks.History at 0x7f29fc0c40a0>

#### Though we saw a little better model but still its overfitting.  Let us try to simplify by making change in dense layer

### Model 6 Batch Size: 40, Number of images from each video: 22 Image Size: 160x160, Epochs: 15 with Dropouts, simplied dense layer



In [9]:
img_per_video=22
img_h = 160
img_w = 160
batch_size = 40 #experiment with the batch size
filtersize = (3,3,3)
img_idx_v = np.round(np.linspace(0,29,img_per_video)).astype(int)

model = Sequential()

model.add(Conv3D(16, filtersize, activation='relu', input_shape=(len(img_idx_v),img_h,img_w,3), padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))


model.add(Conv3D(32, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.25))

model.add(Conv3D(64, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))

model.add(Conv3D(128, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))

model.add(Flatten())

model.add(Dense(64,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))


model.add(Dense(5, activation='softmax'))

#Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

optimiser = 'adam' #write your optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_4 (Conv3D)            (None, 22, 160, 160, 16)  1312      
_________________________________________________________________
batch_normalization_5 (Batch (None, 22, 160, 160, 16)  64        
_________________________________________________________________
max_pooling3d_4 (MaxPooling3 (None, 11, 80, 80, 16)    0         
_________________________________________________________________
dropout_5 (Dropout)          (None, 11, 80, 80, 16)    0         
_________________________________________________________________
conv3d_5 (Conv3D)            (None, 11, 80, 80, 32)    13856     
_________________________________________________________________
activation_3 (Activation)    (None, 11, 80, 80, 32)    0         
_________________________________________________________________
batch_normalization_6 (Batch (None, 11, 80, 80, 32)   

In [10]:
train_generator = generator(train_path, train_doc, batch_size, img_idx_v, img_h, img_w)
val_generator = generator(val_path, val_doc, batch_size, img_idx_v, img_h, img_w)

In [11]:
curr_dt_time = datetime.datetime.now()
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, min_lr=0.001)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]



In [12]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [13]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 40
Epoch 1/15

Epoch 00001: val_loss improved from inf to 1.52083, saving model to model_init_2021-11-1900_37_59.296895/model-00001-1.83561-0.37104-1.52083-0.28000.h5
Epoch 2/15

Epoch 00002: val_loss did not improve from 1.52083
Epoch 3/15

Epoch 00003: val_loss did not improve from 1.52083
Epoch 4/15

Epoch 00004: val_loss did not improve from 1.52083
Epoch 5/15

Epoch 00005: val_loss did not improve from 1.52083
Epoch 6/15

Epoch 00006: val_loss did not improve from 1.52083
Epoch 7/15

Epoch 00007: val_loss did not improve from 1.52083
Epoch 8/15

Epoch 00008: val_loss did not improve from 1.52083
Epoch 9/15

Epoch 00009: val_loss did not improve from 1.52083
Epoch 10/15

Epoch 00010: val_loss did not improve from 1.52083
Epoch 11/15

Epoch 00011: val_loss did not improve from 1.52083
Epoch 12/15

Epoch 00012: val_loss did not improve from 1.52083
Epoch 13/15

Epoch 00013: val_loss did not improve from 1.52083
Epoch 14/15

Epoch 00014

<tensorflow.python.keras.callbacks.History at 0x7fd390327dc0>

#### Try with reduced batch size, lower LR , additional dense layer and more Epochs
### Model 7 Batch Size: 20, Number of images from each video: 22 Image Size: 160x160, Epochs: 25 with Dropouts, two dense layer, LR=0.0002

In [8]:
img_per_video=22
img_h = 160
img_w = 160
batch_size = 20 #experiment with the batch size
filtersize = (3,3,3)
num_epochs = 25 # choose the number of epochs
img_idx_v = np.round(np.linspace(0,29,img_per_video)).astype(int)

model = Sequential()

model.add(Conv3D(16, filtersize, activation='relu', input_shape=(len(img_idx_v),img_h,img_w,3), padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))


model.add(Conv3D(32, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.25))

model.add(Conv3D(64, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))

model.add(Conv3D(128, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Dropout(0.25))

model.add(Flatten())

model.add(Dense(64,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

model.add(Dense(64,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))



model.add(Dense(5, activation='softmax'))

#Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

optimiser = tf.keras.optimizers.Adam(lr=0.0002) #write your optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 22, 160, 160, 16)  1312      
_________________________________________________________________
batch_normalization (BatchNo (None, 22, 160, 160, 16)  64        
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 11, 80, 80, 16)    0         
_________________________________________________________________
dropout (Dropout)            (None, 11, 80, 80, 16)    0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 11, 80, 80, 32)    13856     
_________________________________________________________________
activation (Activation)      (None, 11, 80, 80, 32)    0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 11, 80, 80, 32)    1

In [8]:
train_generator = generator(train_path, train_doc, batch_size, img_idx_v, img_h, img_w)
val_generator = generator(val_path, val_doc, batch_size, img_idx_v, img_h, img_w)

In [9]:
curr_dt_time = datetime.datetime.now()
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, min_lr=0.001)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]



In [10]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [11]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 20
Epoch 1/25

Epoch 00001: val_loss improved from inf to 2.89170, saving model to model_init_2021-11-1901_34_29.620632/model-00001-2.37912-0.25490-2.89170-0.16000.h5
Epoch 2/25

Epoch 00002: val_loss did not improve from 2.89170
Epoch 3/25

Epoch 00003: val_loss did not improve from 2.89170
Epoch 4/25

Epoch 00004: val_loss did not improve from 2.89170
Epoch 5/25

Epoch 00005: val_loss did not improve from 2.89170
Epoch 6/25

Epoch 00006: val_loss did not improve from 2.89170
Epoch 7/25

Epoch 00007: val_loss did not improve from 2.89170
Epoch 8/25

Epoch 00008: val_loss did not improve from 2.89170
Epoch 9/25

Epoch 00009: val_loss did not improve from 2.89170
Epoch 10/25

Epoch 00010: val_loss did not improve from 2.89170
Epoch 11/25

Epoch 00011: val_loss did not improve from 2.89170
Epoch 12/25

Epoch 00012: val_loss did not improve from 2.89170
Epoch 13/25

Epoch 00013: val_loss did not improve from 2.89170
Epoch 14/25

Epoch 00014

<tensorflow.python.keras.callbacks.History at 0x7eff967acf70>

#### Try with lower filter size and image size and more epochs, lower dropout after Conv3D and higher after Dense

### Model 8 Batch Size: 20, Number of images from each video: 22 Image Size: 120x120, Epochs: 40 with Dropouts, two dense layer, LR=0.0002, filter size =(2,2,2)

In [36]:
img_per_video=22
img_h = 120
img_w = 120
batch_size = 20 #experiment with the batch size
filtersize = (2,2,2)
num_epochs = 40 # choose the number of epochs
img_idx_v = np.round(np.linspace(0,29,img_per_video)).astype(int)

model = Sequential()

model.add(Conv3D(16, filtersize, activation='relu', input_shape=(len(img_idx_v),img_h,img_w,3), padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Dropout(0.1))

model.add(Conv3D(32, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.1))

model.add(Conv3D(64, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.1))

model.add(Conv3D(128, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.1))

model.add(Flatten())

model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))



model.add(Dense(5, activation='softmax'))

#Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

optimiser = tf.keras.optimizers.Adam(lr=0.0002) #write your optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_28 (Conv3D)           (None, 22, 120, 120, 16)  400       
_________________________________________________________________
batch_normalization_42 (Batc (None, 22, 120, 120, 16)  64        
_________________________________________________________________
max_pooling3d_28 (MaxPooling (None, 11, 60, 60, 16)    0         
_________________________________________________________________
dropout_34 (Dropout)         (None, 11, 60, 60, 16)    0         
_________________________________________________________________
conv3d_29 (Conv3D)           (None, 11, 60, 60, 32)    4128      
_________________________________________________________________
activation_21 (Activation)   (None, 11, 60, 60, 32)    0         
_________________________________________________________________
batch_normalization_43 (Batc (None, 11, 60, 60, 32)   

In [37]:
train_generator = generator(train_path, train_doc, batch_size, img_idx_v, img_h, img_w)
val_generator = generator(val_path, val_doc, batch_size, img_idx_v, img_h, img_w)

In [38]:
curr_dt_time = datetime.datetime.now()
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, min_lr=0.001)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]



In [39]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [40]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 20
Epoch 1/40

Epoch 00001: val_loss improved from inf to 1.71019, saving model to model_init_2021-11-1903_30_56.460227/model-00001-2.41480-0.26998-1.71019-0.16000.h5
Epoch 2/40

Epoch 00002: val_loss did not improve from 1.71019
Epoch 3/40

Epoch 00003: val_loss did not improve from 1.71019
Epoch 4/40

Epoch 00004: val_loss did not improve from 1.71019
Epoch 5/40

Epoch 00005: val_loss did not improve from 1.71019
Epoch 6/40

Epoch 00006: val_loss did not improve from 1.71019
Epoch 7/40

Epoch 00007: val_loss did not improve from 1.71019
Epoch 8/40

Epoch 00008: val_loss did not improve from 1.71019
Epoch 9/40

Epoch 00009: val_loss did not improve from 1.71019
Epoch 10/40

Epoch 00010: val_loss did not improve from 1.71019
Epoch 11/40

Epoch 00011: val_loss did not improve from 1.71019
Epoch 12/40

Epoch 00012: val_loss did not improve from 1.71019
Epoch 13/40

Epoch 00013: val_loss did not improve from 1.71019
Epoch 14/40

Epoch 00014

<tensorflow.python.keras.callbacks.History at 0x7f806fe567f0>

#### Further reduce parameters by lowering down image size
### Model 9 Batch Size: 20, Number of images from each video: 22 Image Size: 100x100, Epochs: 40 with Dropouts, two dense layer, LR=0.0002, filter size =(2,2,2)

In [46]:
img_per_video=22
img_h = 100
img_w = 100
batch_size = 20 #experiment with the batch size
filtersize = (2,2,2)
num_epochs = 40 # choose the number of epochs
img_idx_v = np.round(np.linspace(0,29,img_per_video)).astype(int)

model = Sequential()

model.add(Conv3D(16, filtersize, activation='relu', input_shape=(len(img_idx_v),img_h,img_w,3), padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Dropout(0.1))

model.add(Conv3D(32, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.1))

model.add(Conv3D(64, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.1))

model.add(Conv3D(128, filtersize, padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Dropout(0.1))

model.add(Flatten())

model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))



model.add(Dense(5, activation='softmax'))

#Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

optimiser = tf.keras.optimizers.Adam(lr=0.0002) #write your optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_36 (Conv3D)           (None, 22, 100, 100, 16)  400       
_________________________________________________________________
batch_normalization_54 (Batc (None, 22, 100, 100, 16)  64        
_________________________________________________________________
max_pooling3d_36 (MaxPooling (None, 11, 50, 50, 16)    0         
_________________________________________________________________
dropout_42 (Dropout)         (None, 11, 50, 50, 16)    0         
_________________________________________________________________
conv3d_37 (Conv3D)           (None, 11, 50, 50, 32)    4128      
_________________________________________________________________
activation_27 (Activation)   (None, 11, 50, 50, 32)    0         
_________________________________________________________________
batch_normalization_55 (Batc (None, 11, 50, 50, 32)   

In [47]:
train_generator = generator(train_path, train_doc, batch_size, img_idx_v, img_h, img_w)
val_generator = generator(val_path, val_doc, batch_size, img_idx_v, img_h, img_w)

In [48]:
curr_dt_time = datetime.datetime.now()
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, min_lr=0.001)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]



In [49]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [50]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 20
Epoch 1/40

Epoch 00001: val_loss improved from inf to 1.70670, saving model to model_init_2021-11-1906_12_19.098061/model-00001-2.46118-0.25792-1.70670-0.21000.h5
Epoch 2/40

Epoch 00002: val_loss did not improve from 1.70670
Epoch 3/40

Epoch 00003: val_loss did not improve from 1.70670
Epoch 4/40

Epoch 00004: val_loss did not improve from 1.70670
Epoch 5/40

Epoch 00005: val_loss did not improve from 1.70670
Epoch 6/40

Epoch 00006: val_loss did not improve from 1.70670
Epoch 7/40

Epoch 00007: val_loss did not improve from 1.70670
Epoch 8/40

Epoch 00008: val_loss did not improve from 1.70670
Epoch 9/40

Epoch 00009: val_loss did not improve from 1.70670
Epoch 10/40

Epoch 00010: val_loss did not improve from 1.70670
Epoch 11/40

Epoch 00011: val_loss did not improve from 1.70670
Epoch 12/40

Epoch 00012: val_loss did not improve from 1.70670
Epoch 13/40

Epoch 00013: val_loss did not improve from 1.70670
Epoch 14/40

Epoch 00014

<tensorflow.python.keras.callbacks.History at 0x7f806fe554f0>

### Model 10 CNN+RNN (Transfer Learning with MobileNet)

In [8]:
from tensorflow.keras.applications import mobilenet

img_idx_v = list(np.round(np.linspace(0,29,16)).astype(int))

img_h = 224
img_w = 224
batch_size = 15 #experiment with the batch size
num_epochs=25

base_model=mobilenet.MobileNet(input_shape=(img_h, img_w, 3), weights='imagenet',include_top=False) #imports the mobilenet model and discards the last 1000 neuron layer.
x = Flatten()(base_model.output)
mob_net = tf.keras.models.Model(base_model.input, x)

model = Sequential()
model.add(TimeDistributed(mob_net, input_shape=(len(img_idx_v), img_h, img_w,3)))
model.add(GRU(len(img_idx_v)))
model.add(Dropout(.25)) #added
model.add(Dense(5, activation='softmax'))
model.summary()

optimiser = tf.keras.optimizers.Adam(lr=0.0002) #write your optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())


Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_1 (TimeDist (None, 16, 50176)         3228864   
_________________________________________________________________
gru_1 (GRU)                  (None, 16)                2409312   
_________________________________________________________________
dropout_1 (Dropout)          (None, 16)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 85        
Total params: 5,638,261
Trainable params: 5,616,373
Non-trainable params: 21,888
_________________________________________________________________
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_1 (TimeDist (None, 16, 50176)         3228864   
_________________

In [9]:
train_generator = generator(train_path, train_doc, batch_size, img_idx_v, img_h, img_w)
val_generator = generator(val_path, val_doc, batch_size, img_idx_v, img_h, img_w)

In [10]:
curr_dt_time = datetime.datetime.now()
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, min_lr=0.001)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]




In [11]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [12]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)



Source path =  Project_data/train ; batch size = 15
Epoch 1/25

Epoch 00001: val_loss improved from inf to 1.21376, saving model to model_init_2021-11-1915_36_34.884718/model-00001-1.43848-0.35897-1.21376-0.53000.h5
Epoch 2/25

Epoch 00002: val_loss improved from 1.21376 to 0.92839, saving model to model_init_2021-11-1915_36_34.884718/model-00002-1.07321-0.53846-0.92839-0.65000.h5
Epoch 3/25

Epoch 00003: val_loss did not improve from 0.92839
Epoch 4/25

Epoch 00004: val_loss improved from 0.92839 to 0.80862, saving model to model_init_2021-11-1915_36_34.884718/model-00004-0.89601-0.64103-0.80862-0.71000.h5
Epoch 5/25

Epoch 00005: val_loss improved from 0.80862 to 0.75908, saving model to model_init_2021-11-1915_36_34.884718/model-00005-0.82124-0.66365-0.75908-0.72000.h5
Epoch 6/25

Epoch 00006: val_loss improved from 0.75908 to 0.72133, saving model to model_init_2021-11-1915_36_34.884718/model-00006-0.78187-0.66516-0.72133-0.71000.h5
Epoch 7/25

Epoch 00007: val_loss did not improve

<tensorflow.python.keras.callbacks.History at 0x7f31019d4ee0>

### Model 11 CNN+RNN (Without Transfer Learning)

In [8]:
img_idx_v = list(np.round(np.linspace(0,29,22)).astype(int))

img_h = 120
img_w = 120
batch_size = 20 #experiment with the batch size
num_epochs=25

model = Sequential()

model.add(TimeDistributed(Conv2D(16, (2, 2) , padding='same', activation='relu'), input_shape=(len(img_idx_v), img_h, img_w,3)))
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(Dropout(0.1))
        
model.add(TimeDistributed(Conv2D(32, (2, 2) , padding='same', activation='relu')))
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
        
model.add(TimeDistributed(Conv2D(64, (2, 2) , padding='same', activation='relu')))
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(Dropout(0.1))
        
model.add(TimeDistributed(Conv2D(128, (2, 2) , padding='same', activation='relu')))
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(Dropout(0.1))



model.add(TimeDistributed(Flatten()))


model.add(GRU(64))
model.add(Dropout(0.25))
        
model.add(Dense(64,activation='relu'))
model.add(Dropout(0.25))
        
model.add(Dense(5, activation='softmax'))

optimiser = tf.keras.optimizers.Adam(lr=0.001) #write your optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])

print (model.summary())


Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_13 (TimeDis (None, 22, 120, 120, 16)  208       
_________________________________________________________________
time_distributed_14 (TimeDis (None, 22, 120, 120, 16)  64        
_________________________________________________________________
time_distributed_15 (TimeDis (None, 22, 60, 60, 16)    0         
_________________________________________________________________
dropout_5 (Dropout)          (None, 22, 60, 60, 16)    0         
_________________________________________________________________
time_distributed_16 (TimeDis (None, 22, 60, 60, 32)    2080      
_________________________________________________________________
time_distributed_17 (TimeDis (None, 22, 60, 60, 32)    128       
_________________________________________________________________
time_distributed_18 (TimeDis (None, 22, 30, 30, 32)   

In [9]:
train_generator = generator(train_path, train_doc, batch_size, img_idx_v, img_h, img_w)
val_generator = generator(val_path, val_doc, batch_size, img_idx_v, img_h, img_w)

In [10]:
curr_dt_time = datetime.datetime.now()
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, min_lr=0.001)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]




In [11]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [12]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)



Source path =  Project_data/train ; batch size = 20
Epoch 1/25

Epoch 00001: val_loss improved from inf to 1.66784, saving model to model_init_2021-11-2001_45_15.111756/model-00001-1.60621-0.27149-1.66784-0.18000.h5
Epoch 2/25

Epoch 00002: val_loss improved from 1.66784 to 1.62300, saving model to model_init_2021-11-2001_45_15.111756/model-00002-1.32690-0.45098-1.62300-0.22000.h5
Epoch 3/25

Epoch 00003: val_loss did not improve from 1.62300
Epoch 4/25

Epoch 00004: val_loss did not improve from 1.62300
Epoch 5/25

Epoch 00005: val_loss did not improve from 1.62300
Epoch 6/25

Epoch 00006: val_loss did not improve from 1.62300
Epoch 7/25

Epoch 00007: val_loss did not improve from 1.62300
Epoch 8/25

Epoch 00008: val_loss did not improve from 1.62300
Epoch 9/25

Epoch 00009: val_loss did not improve from 1.62300
Epoch 10/25

Epoch 00010: val_loss did not improve from 1.62300
Epoch 11/25

Epoch 00011: val_loss did not improve from 1.62300
Epoch 12/25

Epoch 00012: val_loss did not impr

<tensorflow.python.keras.callbacks.History at 0x7f393c0562b0>

### Conclusion

With two architecture approaches where we have Conv3D and CNN+RNN, we had multiple experiments. Though we have shown multiple experiments with Conv3D in this notebook, we kept only 2 models with CNN+RNN giving pretty good performance to keep the notebook simlified.

We have now Two models with pretty good results and optimized parameters.

#### Model 8: Conv3D, Trainable params: 907,733, Training Accuracy--80.12%, Validation Accuracy--74%

### Model 11: CNN+RNN( without Transfer Learning), Trainable params: 1,269,781, Training Accuracy--86.57%, Validation Accuracy--81%

**Now since evaluation is based on number of parameters + accuracy of model we decided to submit Model 11 h5 file where though number of parameters a bit high compared to Model 8 but results of Model 11 are significantly better
