# Gesture Recognition
In this group project, you are going to build a 3D Conv model that will be able to predict the 5 gestures correctly. Please import the following libraries to get started.

In [4]:
import numpy as np
import os
from matplotlib.pyplot import imread
from skimage.transform import resize as imresize
import datetime
import os

We set the random seed so that the results don't vary drastically.

In [5]:
np.random.seed(30)
import random as rn
rn.seed(30)
from keras import backend as K
import tensorflow as tf
tf.random.set_seed(30)

In this block, you read the folder names for training and validation. You also set the `batch_size` here. Note that you set the batch size in such a way that you are able to use the GPU in full capacity. You keep increasing the batch size until the machine throws an error.

In [6]:
train_doc = np.random.permutation(open('C:\cnndatasets\Project_data/train.csv').readlines())
val_doc = np.random.permutation(open('C:\cnndatasets\Project_data/val.csv').readlines())
batch_size = 32

## Generator
This is one of the most important part of the code. The overall structure of the generator has been given. In the generator, you are going to preprocess the images as you have images of 2 different dimensions as well as create a batch of video frames. You have to experiment with `img_idx`, `y`,`z` and normalization such that you get high accuracy.

In [4]:
x = 30 # number of frames
y = 160 # image width
z = 160 # image height
channels=3
classes=5

In [5]:
def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx =[x for x in range(0,x)]   #create a list of image numbers you want to use for a particular video
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(t)//batch_size    # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,x,y,z,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    
                    temp = imresize(image,(y,z))
                    temp = temp/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (temp[:,:,0])  #normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (temp[:,:,1])  #normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (temp[:,:,2])  #normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        if (len(folder_list) != batch_size*num_batches):
            print("Batch: ",num_batches+1,"Index:", batch_size)
            batch_size = len(folder_list) - (batch_size*num_batches)
            batch_data = np.zeros((batch_size,x,y,z,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    temp = imresize(image,(y,z))
                    temp = temp/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (temp[:,:,0])
                    batch_data[folder,idx,:,:,1] = (temp[:,:,1])
                    batch_data[folder,idx,:,:,2] = (temp[:,:,2])
                   
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

Note here that a video is represented above in the generator as (number of images, height, width, number of channels). Take this into consideration while creating the model architecture.

In [6]:
curr_dt_time = datetime.datetime.now()
train_path = "C:\cnndatasets\Project_data/train"
val_path = "C:\cnndatasets\Project_data/val"
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs =   10  # choose the number of epochs
print ('# epochs =', num_epochs)
num_batches = num_train_sequences//batch_size 
print(num_batches)

# training sequences = 663
# validation sequences = 100
# epochs = 10
20


## Model
Here you make the model using different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D` and not `Conv2D` and `Maxpooling2D` for a 3D convolution model. You would want to use `TimeDistributed` while building a Conv2D + RNN model. Also remember that the last layer is the softmax. Design the network in such a way that the model is able to give good accuracy on the least number of parameters so that it can fit in the memory of the webcam.

#### Experiment 1: Creating a model with 160x160 image size,epochs=10 and batch_size=32

In [7]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, Flatten, BatchNormalization, Activation
from keras.layers.convolutional import Conv3D, MaxPooling3D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping
from keras import optimizers
from keras.layers import Dropout

#write your model here
model_1 = Sequential()

model_1.add(Conv3D(8, #number of filters 
                 kernel_size=(3,3,3), 
                 input_shape=(x,y,z,channels),
                 padding='same'))

model_1.add(Activation('relu'))
model_1.add(BatchNormalization())

model_1.add(MaxPooling3D(pool_size=(2,2,2)))

model_1.add(Conv3D(16, #Number of filters, 
                 kernel_size=(3,3,3), 
                 padding='same'))

model_1.add(Activation('relu'))
model_1.add(BatchNormalization())
model_1.add(MaxPooling3D(pool_size=(2,2,2)))

model_1.add(Conv3D(32, #Number of filters 
                 kernel_size=(1,3,3), 
                 padding='same'))

model_1.add(Activation('relu'))
model_1.add(BatchNormalization())
model_1.add(MaxPooling3D(pool_size=(2,2,2)))


model_1.add(Conv3D(64, #Number pf filters 
                 kernel_size=(1,3,3), 
                 padding='same'))
model_1.add(BatchNormalization())
model_1.add(Activation('relu'))



#Flatten Layers
model_1.add(Flatten())

model_1.add(Dense(100, activation='relu'))
model_1.add(Dropout(0.5))


#softmax layer
model_1.add(Dense(classes, activation='softmax'))

Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

In [100]:
optimiser = 'adam'
model_1.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_1.summary())

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 30, 160, 160, 8)   656       
_________________________________________________________________
activation (Activation)      (None, 30, 160, 160, 8)   0         
_________________________________________________________________
batch_normalization (BatchNo (None, 30, 160, 160, 8)   32        
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 15, 80, 80, 8)     0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 15, 80, 80, 16)    3472      
_________________________________________________________________
activation_1 (Activation)    (None, 15, 80, 80, 16)    0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 15, 80, 80, 16)    6

Let us create the `train_generator` and the `val_generator` which will be used in `.fit_generator`.

In [9]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [82]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, cooldown=1, verbose=1)# write the REducelronplateau code here
earlystop = EarlyStopping( monitor="val_loss", min_delta=0,patience=10,verbose=1)
callbacks_list = [checkpoint, LR, earlystop]



The `steps_per_epoch` and `validation_steps` are used by `fit_generator` to decide the number of next() calls it need to make.

In [11]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

Let us now fit the model. This will start training the model and with the help of the checkpoints, you'll be able to save the model at the end of each epoch.

In [12]:
model_1.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 32
Epoch 1/10


ResourceExhaustedError:  OOM when allocating tensor with shape[32,8,30,25600] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node sequential/batch_normalization/FusedBatchNormV3 (defined at <ipython-input-12-16518857c690>:1) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
 [Op:__inference_train_function_1425]

Function call stack:
train_function


### We had hit the limit on memory resources with image resolution of 160x160 with 30 frames and batch_size of 32...we get the below error

ResourceExhaustedError: OOM when allocating tensor with shape[32,8,30,25600] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

#### Exp -2 :   Reduce the batch size to 20 and image dimensions to 60x60 with 30 frames 

In [13]:
channels=3
classes=5


x = 30 # number of frames
y = 60 # image width
z = 60 # image height
batch_size=20

def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx =[x for x in range(0,x)]   #create a list of image numbers you want to use for a particular video
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(t)//batch_size    # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,x,y,z,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    
                    temp = imresize(image,(y,z))
                    temp = temp/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (temp[:,:,0])  #normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (temp[:,:,1])  #normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (temp[:,:,2])  #normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        if (len(folder_list) != batch_size*num_batches):
            print("Batch: ",num_batches+1,"Index:", batch_size)
            batch_size = len(folder_list) - (batch_size*num_batches)
            batch_data = np.zeros((batch_size,x,y,z,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    temp = imresize(image,(y,z))
                    temp = temp/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (temp[:,:,0])
                    batch_data[folder,idx,:,:,1] = (temp[:,:,1])
                    batch_data[folder,idx,:,:,2] = (temp[:,:,2])
                   
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels


In [14]:
curr_dt_time = datetime.datetime.now()
train_path = "C:\cnndatasets\Project_data/train"
val_path = "C:\cnndatasets\Project_data/val"
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs =   10  # choose the number of epochs
print ('# epochs =', num_epochs)
num_batches = num_train_sequences//batch_size 
print(num_batches)

# training sequences = 663
# validation sequences = 100
# epochs = 10
33


In [15]:
#define model
model_b = Sequential()
model_b.add(Conv3D(16, kernel_size=(3, 3, 3), input_shape=(x,y,z,channels), padding='same'))
model_b.add(Activation('relu'))
model_b.add(BatchNormalization())
model_b.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_b.add(Conv3D(32, kernel_size=(3,3,3), padding='same'))
model_b.add(Activation('relu'))
model_b.add(BatchNormalization())
model_b.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_b.add(Conv3D(64, kernel_size=(3,3,3), padding='same'))
model_b.add(Activation('relu'))
model_b.add(BatchNormalization())
model_b.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_b.add(Flatten())
model_b.add(Dense(128, activation='relu'))
model_b.add(Dropout(0.25))
model_b.add(Dense(64, activation='relu'))
model_b.add(Dropout(0.25))
model_b.add(Dense(classes, activation='softmax'))

In [16]:
optimiser = 'adam'
model_b.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_b.summary())

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_4 (Conv3D)            (None, 30, 60, 60, 16)    1312      
_________________________________________________________________
activation_4 (Activation)    (None, 30, 60, 60, 16)    0         
_________________________________________________________________
batch_normalization_4 (Batch (None, 30, 60, 60, 16)    64        
_________________________________________________________________
max_pooling3d_3 (MaxPooling3 (None, 15, 30, 30, 16)    0         
_________________________________________________________________
conv3d_5 (Conv3D)            (None, 15, 30, 30, 32)    13856     
_________________________________________________________________
activation_5 (Activation)    (None, 15, 30, 30, 32)    0         
_________________________________________________________________
batch_normalization_5 (Batch (None, 15, 30, 30, 32)   

In [17]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [18]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, cooldown=1, verbose=1)# write the REducelronplateau code here
earlystop = EarlyStopping(monitor="val_loss", min_delta=0,patience=10,verbose=1)
callbacks_list = [checkpoint, LR,earlystop]



In [19]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [20]:
model_b.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 20
Epoch 1/10

Epoch 00001: saving model to model_init_2021-10-2407_52_42.683254\model-00001-2.38754-0.28205-1.81556-0.23000.h5
Epoch 2/10

Epoch 00002: saving model to model_init_2021-10-2407_52_42.683254\model-00002-2.37569-0.27451-2.54966-0.21000.h5
Epoch 3/10

Epoch 00003: saving model to model_init_2021-10-2407_52_42.683254\model-00003-2.20736-0.25490-4.39838-0.20000.h5
Epoch 4/10

Epoch 00004: saving model to model_init_2021-10-2407_52_42.683254\model-00004-2.50209-0.21569-3.91546-0.17000.h5
Epoch 5/10

Epoch 00005: saving model to model_init_2021-10-2407_52_42.683254\model-00005-2.54770-0.16667-3.22769-0.21000.h5

Epoch 00005: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
Epoch 6/10

Epoch 00006: saving model to model_init_2021-10-2407_52_42.683254\model-00006-2.00954-0.23529-3.18391-0.18000.h5
Epoch 7/10

Epoch 00007: saving model to model_init_2021-10-2407_52_42.683254\model-00007-1.64712-0.3

<keras.callbacks.History at 0x1a4ea3762b0>

#### Exp-3 :adding more layers and increase the no of epochs to 15

In [21]:
x = 30 # number of frames
y = 60 # image width
z = 60 # image height

batch_size=20
num_epochs=15

In [22]:
model_b1 = Sequential()
model_b1.add(Conv3D(16, kernel_size=(3, 3, 3), input_shape=(x,y,z,channels), padding='same'))
model_b1.add(Activation('relu'))
model_b1.add(BatchNormalization())

model_b1.add(Conv3D(16, kernel_size=(3, 3, 3), input_shape=(x,y,z,channels), padding='same'))
model_b1.add(Activation('relu'))
model_b1.add(BatchNormalization())

model_b1.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_b1.add(Conv3D(32, kernel_size=(3,3,3), padding='same'))
model_b1.add(Activation('relu'))
model_b1.add(BatchNormalization())

model_b1.add(Conv3D(32, kernel_size=(3,3,3), padding='same'))
model_b1.add(Activation('relu'))
model_b1.add(BatchNormalization())

model_b1.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_b1.add(Conv3D(64, kernel_size=(3,3,3), padding='same'))
model_b1.add(Activation('relu'))
model_b1.add(BatchNormalization())

model_b1.add(Conv3D(64, kernel_size=(3,3,3), padding='same'))
model_b1.add(Activation('relu'))
model_b1.add(BatchNormalization())

model_b1.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_b1.add(Flatten())
model_b1.add(Dense(128, activation='relu'))
model_b1.add(Dropout(0.25))
model_b1.add(Dense(64, activation='relu'))
model_b1.add(Dropout(0.25))
model_b1.add(Dense(classes, activation='softmax'))

In [23]:
optimiser = 'adam'
model_b1.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_b1.summary())

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_7 (Conv3D)            (None, 30, 60, 60, 16)    1312      
_________________________________________________________________
activation_7 (Activation)    (None, 30, 60, 60, 16)    0         
_________________________________________________________________
batch_normalization_7 (Batch (None, 30, 60, 60, 16)    64        
_________________________________________________________________
conv3d_8 (Conv3D)            (None, 30, 60, 60, 16)    6928      
_________________________________________________________________
activation_8 (Activation)    (None, 30, 60, 60, 16)    0         
_________________________________________________________________
batch_normalization_8 (Batch (None, 30, 60, 60, 16)    64        
_________________________________________________________________
max_pooling3d_6 (MaxPooling3 (None, 15, 30, 30, 16)   

In [24]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [25]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [26]:
model_b1.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 20
Epoch 1/15

Epoch 00001: saving model to model_init_2021-10-2407_52_42.683254\model-00001-2.78396-0.31222-2.51257-0.16000.h5
Epoch 2/15

Epoch 00002: saving model to model_init_2021-10-2407_52_42.683254\model-00002-2.52636-0.27451-2.67990-0.23000.h5
Epoch 3/15

Epoch 00003: saving model to model_init_2021-10-2407_52_42.683254\model-00003-2.16703-0.35294-2.98488-0.18000.h5
Epoch 4/15

Epoch 00004: saving model to model_init_2021-10-2407_52_42.683254\model-00004-2.57890-0.24510-6.40638-0.16000.h5
Epoch 5/15

Epoch 00005: saving model to model_init_2021-10-2407_52_42.683254\model-00005-2.81370-0.23529-2.43718-0.24000.h5
Epoch 6/15

Epoch 00006: saving model to model_init_2021-10-2407_52_42.683254\model-00006-2.01413-0.25490-1.96117-0.21000.h5
Epoch 7/15

Epoch 00007: saving model to model_init_2021-10-2407_52_42.683254\model-00007-2.10662-0.20588-2.10004-0.25000.h5
Epoch 8/15

Epoch 00008: saving model to model_init_2021-1

<keras.callbacks.History at 0x1a405ea30d0>

From exp-2 and exp-3 there is very small change in training and validation accuracy even though we increase the no of layers and no of epochs.We can see the model is more impacted by image resolution,batch size and no of frames.

#### Exp 4:change the image resolution to 80x80  ,with frames=30 and  by keeping batchsize=10 and epochs=20

In [27]:
x = 30 # number of frames
y = 80 # image width
z = 80 # image height

batch_size=10
num_epochs=20

In [28]:
model_b2 = Sequential()
model_b2.add(Conv3D(8, kernel_size=(3, 3, 3), input_shape=(x,y,z,channels), padding='same'))
model_b2.add(Activation('relu'))
model_b2.add(BatchNormalization())



model_b2.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_b2.add(Conv3D(16, kernel_size=(3,3,3), padding='same'))
model_b2.add(Activation('relu'))
model_b2.add(BatchNormalization())

model_b2.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_b2.add(Conv3D(32, kernel_size=(3,3,3), padding='same'))
model_b2.add(Activation('relu'))
model_b2.add(BatchNormalization())

model_b2.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_b2.add(Conv3D(64, kernel_size=(3,3,3), padding='same'))
model_b2.add(Activation('relu'))
model_b2.add(BatchNormalization())

model_b2.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_b2.add(Flatten())
model_b2.add(Dense(1000, activation='relu'))
model_b2.add(Dropout(0.5))
model_b2.add(Dense(500, activation='relu'))
model_b2.add(Dropout(0.5))
model_b2.add(Dense(classes, activation='softmax'))

In [29]:
optimiser = 'adam'
model_b2.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_b2.summary())

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_13 (Conv3D)           (None, 30, 80, 80, 8)     656       
_________________________________________________________________
activation_13 (Activation)   (None, 30, 80, 80, 8)     0         
_________________________________________________________________
batch_normalization_13 (Batc (None, 30, 80, 80, 8)     32        
_________________________________________________________________
max_pooling3d_9 (MaxPooling3 (None, 15, 40, 40, 8)     0         
_________________________________________________________________
conv3d_14 (Conv3D)           (None, 15, 40, 40, 16)    3472      
_________________________________________________________________
activation_14 (Activation)   (None, 15, 40, 40, 16)    0         
_________________________________________________________________
batch_normalization_14 (Batc (None, 15, 40, 40, 16)   

In [30]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [31]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [32]:
model_b2.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 10
Epoch 1/20

Epoch 00001: saving model to model_init_2021-10-2407_52_42.683254\model-00001-3.00365-0.35747-7.41094-0.21000.h5
Epoch 2/20

Epoch 00002: saving model to model_init_2021-10-2407_52_42.683254\model-00002-2.25049-0.34328-10.29561-0.22000.h5
Epoch 3/20

Epoch 00003: saving model to model_init_2021-10-2407_52_42.683254\model-00003-2.34273-0.24876-10.49308-0.19000.h5
Epoch 4/20

Epoch 00004: saving model to model_init_2021-10-2407_52_42.683254\model-00004-1.69692-0.37811-7.29221-0.25000.h5
Epoch 5/20

Epoch 00005: saving model to model_init_2021-10-2407_52_42.683254\model-00005-1.58764-0.41791-7.77822-0.24000.h5
Epoch 6/20

Epoch 00006: saving model to model_init_2021-10-2407_52_42.683254\model-00006-1.42938-0.48259-8.36381-0.18000.h5
Epoch 7/20

Epoch 00007: saving model to model_init_2021-10-2407_52_42.683254\model-00007-1.43407-0.48756-7.00811-0.18000.h5
Epoch 8/20

Epoch 00008: saving model to model_init_2021

<keras.callbacks.History at 0x1a405eb16d0>

In exp-4 ,by increasing the trainable parameters,reducing the batch size  to 10,by changing the image dimensions,increasing the no of epochs ,we got the training accuracy and val accuracy are 79.1% and 79.0% respectively at the end of 20 epochs.



#### exp-5 : By changing  the image dimensions to 100x 100 ,batchsize=10 and epochs=30

In [40]:
x = 30 # number of frames
y = 100 # image width
z = 100 # image height
batch_size=10
num_epochs=30

In [41]:
def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx =[x for x in range(0,30)]   #create a list of image numbers you want to use for a particular video
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(t)//batch_size    # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,x,y,z,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    
                    temp = imresize(image,(y,z))
                    temp = temp/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (temp[:,:,0])  #normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (temp[:,:,1])  #normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (temp[:,:,2])  #normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        if (len(folder_list) != batch_size*num_batches):
            print("Batch: ",num_batches+1,"Index:", batch_size)
            batch_size = len(folder_list) - (batch_size*num_batches)
            batch_data = np.zeros((batch_size,x,y,z,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    temp = imresize(image,(y,z))
                    temp = temp/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (temp[:,:,0])
                    batch_data[folder,idx,:,:,1] = (temp[:,:,1])
                    batch_data[folder,idx,:,:,2] = (temp[:,:,2])
                   
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [42]:
model_c = Sequential()

model_c.add(Conv3D(8, kernel_size=(3, 3, 3), input_shape=(x,y,z,channels), padding='same'))
model_c.add(Activation('relu'))
model_c.add(BatchNormalization())


model_c.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_c.add(Conv3D(16, kernel_size=(3, 3, 3), padding='same'))
model_c.add(Activation('relu'))
model_c.add(BatchNormalization())



model_c.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_c.add(Conv3D(32, kernel_size=(3,3,3), padding='same'))
model_c.add(Activation('relu'))
model_c.add(BatchNormalization())

model_c.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_c.add(Conv3D(64, kernel_size=(3,3,3), padding='same'))
model_c.add(Activation('relu'))
model_c.add(BatchNormalization())

model_c.add(MaxPooling3D(pool_size=(2, 2, 2)))



model_c.add(Flatten())
model_c.add(Dense(1000, activation='relu'))
model_c.add(Dropout(0.5))
model_c.add(Dense(500, activation='relu'))
model_c.add(Dropout(0.5))
model_c.add(Dense(classes, activation='softmax'))

In [43]:
optimiser = tf.keras.optimizers.Adam()
model_c.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_c.summary())

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_21 (Conv3D)           (None, 30, 100, 100, 8)   656       
_________________________________________________________________
activation_21 (Activation)   (None, 30, 100, 100, 8)   0         
_________________________________________________________________
batch_normalization_21 (Batc (None, 30, 100, 100, 8)   32        
_________________________________________________________________
max_pooling3d_17 (MaxPooling (None, 15, 50, 50, 8)     0         
_________________________________________________________________
conv3d_22 (Conv3D)           (None, 15, 50, 50, 16)    3472      
_________________________________________________________________
activation_22 (Activation)   (None, 15, 50, 50, 16)    0         
_________________________________________________________________
batch_normalization_22 (Batc (None, 15, 50, 50, 16)   

In [44]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [45]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [46]:
model_c.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 10
Epoch 1/30

Epoch 00001: saving model to model_init_2021-10-2407_52_42.683254\model-00001-2.86356-0.41176-4.70224-0.16000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2021-10-2407_52_42.683254\model-00002-2.10496-0.43284-2.90019-0.17000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2021-10-2407_52_42.683254\model-00003-2.26446-0.38308-5.37779-0.15000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2021-10-2407_52_42.683254\model-00004-1.61122-0.57214-4.86445-0.20000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2021-10-2407_52_42.683254\model-00005-1.22808-0.56219-3.50140-0.12000.h5
Epoch 6/30

Epoch 00006: saving model to model_init_2021-10-2407_52_42.683254\model-00006-1.21734-0.54229-3.90085-0.22000.h5

Epoch 00006: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
Epoch 7/30

Epoch 00007: saving model to model_init_2021-10-2407_52_42.683254\model-00007-0.96329-0.6


Epoch 00028: saving model to model_init_2021-10-2407_52_42.683254\model-00028-0.20547-0.91542-0.47020-0.84000.h5
Epoch 29/30

Epoch 00029: saving model to model_init_2021-10-2407_52_42.683254\model-00029-0.15099-0.95025-0.42815-0.88000.h5
Epoch 30/30

Epoch 00030: saving model to model_init_2021-10-2407_52_42.683254\model-00030-0.20252-0.90547-0.33880-0.92000.h5


<keras.callbacks.History at 0x1a4efd88700>

In Exp-5, we increase the image dimensions to 100x100,the trainable parameters are increased ,hence we got the best results in this model

#### Exp-6 :Lets decrease the number of parameters by keeping batch size and epochs as same.That is by decreasing the dense neuron layers.

In [47]:
batch_size=10
num_epochs=30

In [50]:
model_c1 = Sequential()

model_c1.add(Conv3D(8, kernel_size=(3, 3, 3), input_shape=(x,y,z,channels), padding='same'))
model_c1.add(Activation('relu'))
model_c1.add(BatchNormalization())


model_c1.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_c1.add(Conv3D(16, kernel_size=(3, 3, 3), padding='same'))
model_c1.add(Activation('relu'))
model_c1.add(BatchNormalization())



model_c1.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_c1.add(Conv3D(32, kernel_size=(3,3,3), padding='same'))
model_c1.add(Activation('relu'))
model_c1.add(BatchNormalization())

model_c1.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_c1.add(Conv3D(64, kernel_size=(3,3,3), padding='same'))
model_c1.add(Activation('relu'))
model_c1.add(BatchNormalization())

model_c1.add(MaxPooling3D(pool_size=(2, 2, 2)))



model_c1.add(Flatten())
model_c1.add(Dense(256, activation='relu'))
model_c1.add(Dropout(0.25))
model_c1.add(Dense(128, activation='relu'))
model_c1.add(Dropout(0.25))
model_c1.add(Dense(classes, activation='softmax'))

In [51]:
optimiser=tf.keras.optimizers.Adam()
model_c1.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_c1.summary())

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_29 (Conv3D)           (None, 30, 100, 100, 8)   656       
_________________________________________________________________
activation_29 (Activation)   (None, 30, 100, 100, 8)   0         
_________________________________________________________________
batch_normalization_29 (Batc (None, 30, 100, 100, 8)   32        
_________________________________________________________________
max_pooling3d_25 (MaxPooling (None, 15, 50, 50, 8)     0         
_________________________________________________________________
conv3d_30 (Conv3D)           (None, 15, 50, 50, 16)    3472      
_________________________________________________________________
activation_30 (Activation)   (None, 15, 50, 50, 16)    0         
_________________________________________________________________
batch_normalization_30 (Batc (None, 15, 50, 50, 16)   

In [52]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [53]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [54]:
model_c1.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 10
Epoch 1/30

Epoch 00001: saving model to model_init_2021-10-2407_52_42.683254\model-00001-1.93815-0.35143-4.81388-0.21000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2021-10-2407_52_42.683254\model-00002-1.81124-0.36816-9.24781-0.18000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2021-10-2407_52_42.683254\model-00003-1.60591-0.37811-6.11457-0.23000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2021-10-2407_52_42.683254\model-00004-1.49827-0.46269-4.58979-0.14000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2021-10-2407_52_42.683254\model-00005-1.29107-0.51741-5.13388-0.15000.h5
Epoch 6/30

Epoch 00006: saving model to model_init_2021-10-2407_52_42.683254\model-00006-1.29795-0.52736-4.59421-0.21000.h5
Epoch 7/30

Epoch 00007: saving model to model_init_2021-10-2407_52_42.683254\model-00007-1.05103-0.58706-3.63846-0.18000.h5
Epoch 8/30

Epoch 00008: saving model to model_init_2021-1


Epoch 00029: saving model to model_init_2021-10-2407_52_42.683254\model-00029-0.12901-0.95522-0.55411-0.86000.h5
Epoch 30/30

Epoch 00030: saving model to model_init_2021-10-2407_52_42.683254\model-00030-0.16293-0.94030-0.42383-0.88000.h5


<keras.callbacks.History at 0x1a4f24dd580>

Above results clearly shows that when we use batch size=10 and also by having the trainable parametes decreased ,the training and validation accuracy are 94.03% and 88.0% respectively at the end of 30 epochs.<br>Since we use batch size=10 ,the iterations are more and it takes more computational time.

#### exp-7 :By  increasing the batchsize to 20 and rest remaining same.let us try to examine the results 

In [55]:
batch_size=20

model_c2 = Sequential()

model_c2.add(Conv3D(8, kernel_size=(3, 3, 3), input_shape=(x,y,z,channels), padding='same'))
model_c2.add(Activation('relu'))
model_c2.add(BatchNormalization())


model_c2.add(MaxPooling3D(pool_size=(2, 2, 2)))

model_c2.add(Conv3D(16, kernel_size=(3, 3, 3), padding='same'))
model_c2.add(Activation('relu'))
model_c2.add(BatchNormalization())



model_c2.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_c2.add(Conv3D(32, kernel_size=(3,3,3), padding='same'))
model_c2.add(Activation('relu'))
model_c2.add(BatchNormalization())

model_c2.add(MaxPooling3D(pool_size=(2, 2, 2)))


model_c2.add(Conv3D(64, kernel_size=(3,3,3), padding='same'))
model_c2.add(Activation('relu'))
model_c2.add(BatchNormalization())

model_c2.add(MaxPooling3D(pool_size=(2, 2, 2)))



model_c2.add(Flatten())
model_c2.add(Dense(256, activation='relu'))
model_c2.add(Dropout(0.25))
model_c2.add(Dense(128, activation='relu'))
model_c2.add(Dropout(0.25))
model_c2.add(Dense(classes, activation='softmax'))

In [59]:
optimiser=tf.keras.optimizers.Adam()
model_c2.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_c2.summary())

Model: "sequential_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_33 (Conv3D)           (None, 30, 100, 100, 8)   656       
_________________________________________________________________
activation_33 (Activation)   (None, 30, 100, 100, 8)   0         
_________________________________________________________________
batch_normalization_33 (Batc (None, 30, 100, 100, 8)   32        
_________________________________________________________________
max_pooling3d_29 (MaxPooling (None, 15, 50, 50, 8)     0         
_________________________________________________________________
conv3d_34 (Conv3D)           (None, 15, 50, 50, 16)    3472      
_________________________________________________________________
activation_34 (Activation)   (None, 15, 50, 50, 16)    0         
_________________________________________________________________
batch_normalization_34 (Batc (None, 15, 50, 50, 16)   

In [57]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [58]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [60]:
model_c2.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 20
Epoch 1/30

Epoch 00001: saving model to model_init_2021-10-2407_52_42.683254\model-00001-1.85071-0.38763-2.82922-0.21000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2021-10-2407_52_42.683254\model-00002-1.62302-0.40196-4.52054-0.23000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2021-10-2407_52_42.683254\model-00003-2.04910-0.30392-6.52720-0.18000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2021-10-2407_52_42.683254\model-00004-1.93807-0.32353-5.21118-0.21000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2021-10-2407_52_42.683254\model-00005-1.66374-0.35294-4.43092-0.24000.h5

Epoch 00005: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
Epoch 6/30

Epoch 00006: saving model to model_init_2021-10-2407_52_42.683254\model-00006-1.73611-0.38235-3.75270-0.21000.h5
Epoch 7/30

Epoch 00007: saving model to model_init_2021-10-2407_52_42.683254\model-00007-1.37444-0.4

<keras.callbacks.History at 0x1a4f26c74c0>

From exp-7 ,we see that as the batch size increase to 20,we saw the model overfit and the validation loss is not decreasing,and we saw the early stopping at the 11th epoch.

#### Exp-8 : Add dropout layers to the cnn model 

In [78]:
model_c3 = Sequential()

model_c3.add(Conv3D(8, kernel_size=(3, 3, 3), input_shape=(x,y,z,channels), padding='same'))
model_c3.add(Activation('relu'))
model_c3.add(BatchNormalization())


model_c3.add(MaxPooling3D(pool_size=(2, 2, 2)))
model_c3.add(Dropout(0.25))

model_c3.add(Conv3D(16, kernel_size=(3, 3, 3), padding='same'))
model_c3.add(Activation('relu'))
model_c3.add(BatchNormalization())



model_c3.add(MaxPooling3D(pool_size=(2, 2, 2)))
model_c3.add(Dropout(0.25))


model_c3.add(Conv3D(32, kernel_size=(3,3,3), padding='same'))
model_c3.add(Activation('relu'))
model_c3.add(BatchNormalization())

model_c3.add(MaxPooling3D(pool_size=(2, 2, 2)))
model_c3.add(Dropout(0.25))


model_c3.add(Conv3D(64, kernel_size=(3,3,3), padding='same'))
model_c3.add(Activation('relu'))
model_c3.add(BatchNormalization())

model_c3.add(MaxPooling3D(pool_size=(2, 2, 2)))
model_c3.add(Dropout(0.25))



model_c3.add(Flatten())
model_c3.add(Dense(256, activation='relu'))
model_c3.add(Dropout(0.25))
model_c3.add(Dense(128, activation='relu'))
model_c3.add(Dropout(0.25))
model_c3.add(Dense(classes, activation='softmax'))

In [79]:
optimiser=tf.keras.optimizers.Adam()
model_c3.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_c3.summary())

Model: "sequential_12"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_49 (Conv3D)           (None, 30, 100, 100, 8)   656       
_________________________________________________________________
activation_49 (Activation)   (None, 30, 100, 100, 8)   0         
_________________________________________________________________
batch_normalization_49 (Batc (None, 30, 100, 100, 8)   32        
_________________________________________________________________
max_pooling3d_45 (MaxPooling (None, 15, 50, 50, 8)     0         
_________________________________________________________________
dropout_35 (Dropout)         (None, 15, 50, 50, 8)     0         
_________________________________________________________________
conv3d_50 (Conv3D)           (None, 15, 50, 50, 16)    3472      
_________________________________________________________________
activation_50 (Activation)   (None, 15, 50, 50, 16)  

In [80]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [81]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [83]:
model_c3.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 20
Epoch 1/30

Epoch 00001: saving model to model_init_2021-10-2407_52_42.683254\model-00001-2.58797-0.28658-2.78104-0.16000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2021-10-2407_52_42.683254\model-00002-1.94981-0.32353-4.69980-0.18000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2021-10-2407_52_42.683254\model-00003-1.75467-0.35294-3.55451-0.17000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2021-10-2407_52_42.683254\model-00004-1.68464-0.35294-3.96897-0.16000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2021-10-2407_52_42.683254\model-00005-1.85709-0.35294-3.33143-0.25000.h5

Epoch 00005: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
Epoch 6/30

Epoch 00006: saving model to model_init_2021-10-2407_52_42.683254\model-00006-1.56617-0.38235-3.96056-0.21000.h5
Epoch 7/30

Epoch 00007: saving model to model_init_2021-10-2407_52_42.683254\model-00007-1.47554-0.4

<keras.callbacks.History at 0x1a4efdc3340>

 Even after adding drop out layers the training accuracy has reduced ,but the model still overfits as shown in the accuracies above.The validation loss is not decreasing and the model stops learning at 11th epoch

### Hence the final model is Exp-6 with batch size 10 and with the training and validation accuracy are 94.03% and 88.0% respectively for conv 3d  model with least no of parameters.

### Using Conv2D + RNN to build a model

Lets build a custom conv2d+ rnn model 

In [304]:
x = 30 # number of frames
y = 100 # image width
z = 100 # image height
batch_size=10
num_epochs=30
channels=3
classes=5


In [305]:
def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx =[x for x in range(0,x)]   #create a list of image numbers you want to use for a particular video
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(t)//batch_size    # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,x,y,z,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    
                    temp = imresize(image,(y,z))
                    temp = temp/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (temp[:,:,0])  #normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (temp[:,:,1])  #normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (temp[:,:,2])  #normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        if (len(folder_list) != batch_size*num_batches):
            print("Batch: ",num_batches+1,"Index:", batch_size)
            batch_size = len(folder_list) - (batch_size*num_batches)
            batch_data = np.zeros((batch_size,x,y,z,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    temp = imresize(image,(y,z))
                    temp = temp/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (temp[:,:,0])
                    batch_data[folder,idx,:,:,1] = (temp[:,:,1])
                    batch_data[folder,idx,:,:,2] = (temp[:,:,2])
                   
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [123]:
curr_dt_time = datetime.datetime.now()
train_path = "C:\cnndatasets\Project_data/train"
val_path = "C:\cnndatasets\Project_data/val"
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs =   30  # choose the number of epochs
print ('# epochs =', num_epochs)
num_batches = num_train_sequences//batch_size 
print(num_batches)

# training sequences = 663
# validation sequences = 100
# epochs = 30
66


#### Exp- 9 : Using custom conv2d + GRU to build a model

In [127]:
#define model

from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, Flatten, BatchNormalization, Activation
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping
from keras import optimizers
from keras.layers import Dropout


input_shape=(x,y,z,channels)
model_d= Sequential()

model_d.add(TimeDistributed(Conv2D(16, kernel_size=(3, 3),  padding='same'),input_shape=input_shape))
model_d.add(TimeDistributed(Activation('relu')))
model_d.add(TimeDistributed(BatchNormalization()))


model_d.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d.add(TimeDistributed(Conv2D(32, kernel_size=(3, 3), padding='same')))
model_d.add(TimeDistributed(Activation('relu')))
model_d.add(TimeDistributed(BatchNormalization()))


model_d.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d.add(TimeDistributed(Conv2D(64, kernel_size=(3, 3),  padding='same')))
model_d.add(TimeDistributed(Activation('relu')))
model_d.add(TimeDistributed(BatchNormalization()))


model_d.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d.add(TimeDistributed(Conv2D(128, kernel_size=(3, 3), padding='same')))
model_d.add(TimeDistributed(Activation('relu')))
model_d.add(TimeDistributed(BatchNormalization()))


model_d.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d.add(TimeDistributed(Flatten()))

model_d.add(GRU(64))
model_d.add(Dropout(0.25))

model_d.add(Dense(64,activation='relu'))
model_d.add(Dropout(0.25))

model_d.add(Dense(classes, activation='softmax'))





In [128]:
optimiser=tf.keras.optimizers.Adam()
model_d.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_d.summary())

Model: "sequential_26"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_54 (TimeDis (None, 30, 100, 100, 16)  448       
_________________________________________________________________
time_distributed_55 (TimeDis (None, 30, 100, 100, 16)  0         
_________________________________________________________________
time_distributed_56 (TimeDis (None, 30, 100, 100, 16)  64        
_________________________________________________________________
time_distributed_57 (TimeDis (None, 30, 50, 50, 16)    0         
_________________________________________________________________
time_distributed_58 (TimeDis (None, 30, 50, 50, 32)    4640      
_________________________________________________________________
time_distributed_59 (TimeDis (None, 30, 50, 50, 32)    0         
_________________________________________________________________
time_distributed_60 (TimeDis (None, 30, 50, 50, 32)  

In [129]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [130]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=4, cooldown=1, verbose=1)# write the REducelronplateau code here
earlystop = EarlyStopping( monitor="val_loss", min_delta=0,patience=10,verbose=1)
callbacks_list = [checkpoint, LR, earlystop]



In [131]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [132]:
model_d.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 10
Epoch 1/30

Epoch 00001: saving model to model_init_2021-10-2521_05_10.942506\model-00001-1.57467-0.30166-1.73278-0.21000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2021-10-2521_05_10.942506\model-00002-1.51832-0.37313-1.75715-0.20000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2021-10-2521_05_10.942506\model-00003-1.43003-0.37811-1.78984-0.21000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2021-10-2521_05_10.942506\model-00004-1.34199-0.44279-1.85099-0.20000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2021-10-2521_05_10.942506\model-00005-1.40677-0.41294-2.38586-0.19000.h5

Epoch 00005: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
Epoch 6/30

Epoch 00006: saving model to model_init_2021-10-2521_05_10.942506\model-00006-1.29550-0.45274-2.35519-0.12000.h5
Epoch 7/30

Epoch 00007: saving model to model_init_2021-10-2521_05_10.942506\model-00007-1.24587-0.4

<keras.callbacks.History at 0x1a4f3632cd0>

from the above model we can infer that the accuracy of training and validation are 63.14% and 69.0% respectively.<br> The model stops at 24th epoch as the validation loss does not decreases.

#### Exp-10 : Let us add more dense nuerons and gru cells and examine the results 

In [133]:
input_shape=(x,y,z,channels)
model_d1= Sequential()

model_d1.add(TimeDistributed(Conv2D(16, kernel_size=(3, 3),  padding='same'),input_shape=input_shape))
model_d1.add(TimeDistributed(Activation('relu')))
model_d1.add(TimeDistributed(BatchNormalization()))


model_d1.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d1.add(TimeDistributed(Conv2D(32, kernel_size=(3, 3), padding='same')))
model_d1.add(TimeDistributed(Activation('relu')))
model_d1.add(TimeDistributed(BatchNormalization()))


model_d1.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d1.add(TimeDistributed(Conv2D(64, kernel_size=(3, 3),  padding='same')))
model_d1.add(TimeDistributed(Activation('relu')))
model_d1.add(TimeDistributed(BatchNormalization()))


model_d1.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d1.add(TimeDistributed(Conv2D(128, kernel_size=(3, 3), padding='same')))
model_d1.add(TimeDistributed(Activation('relu')))
model_d1.add(TimeDistributed(BatchNormalization()))


model_d1.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d1.add(TimeDistributed(Flatten()))

model_d1.add(GRU(128))
model_d1.add(Dropout(0.25))

model_d1.add(Dense(128,activation='relu'))
model_d1.add(Dropout(0.25))

model_d1.add(Dense(classes, activation='softmax'))



In [137]:
optimiser=tf.keras.optimizers.Adam()
model_d1.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_d1.summary())

Model: "sequential_27"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_71 (TimeDis (None, 30, 100, 100, 16)  448       
_________________________________________________________________
time_distributed_72 (TimeDis (None, 30, 100, 100, 16)  0         
_________________________________________________________________
time_distributed_73 (TimeDis (None, 30, 100, 100, 16)  64        
_________________________________________________________________
time_distributed_74 (TimeDis (None, 30, 50, 50, 16)    0         
_________________________________________________________________
time_distributed_75 (TimeDis (None, 30, 50, 50, 32)    4640      
_________________________________________________________________
time_distributed_76 (TimeDis (None, 30, 50, 50, 32)    0         
_________________________________________________________________
time_distributed_77 (TimeDis (None, 30, 50, 50, 32)  

In [134]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [138]:
model_d1.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 10
Epoch 1/30

Epoch 00001: saving model to model_init_2021-10-2521_05_10.942506\model-00001-1.47912-0.36350-3.39767-0.18000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2021-10-2521_05_10.942506\model-00002-1.45055-0.37313-2.17741-0.16000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2021-10-2521_05_10.942506\model-00003-1.42087-0.46269-2.25097-0.15000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2021-10-2521_05_10.942506\model-00004-1.45185-0.37313-2.25518-0.19000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2021-10-2521_05_10.942506\model-00005-1.30278-0.45274-2.43397-0.12000.h5
Epoch 6/30

Epoch 00006: saving model to model_init_2021-10-2521_05_10.942506\model-00006-1.42101-0.38806-2.95852-0.21000.h5

Epoch 00006: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
Epoch 7/30

Epoch 00007: saving model to model_init_2021-10-2521_05_10.942506\model-00007-1.20201-0.5


Epoch 00028: saving model to model_init_2021-10-2521_05_10.942506\model-00028-0.83908-0.67164-0.77149-0.72000.h5
Epoch 29/30

Epoch 00029: saving model to model_init_2021-10-2521_05_10.942506\model-00029-0.72554-0.75622-0.75027-0.76000.h5

Epoch 00029: ReduceLROnPlateau reducing learning rate to 1.6000001778593287e-06.
Epoch 30/30

Epoch 00030: saving model to model_init_2021-10-2521_05_10.942506\model-00030-0.77122-0.72139-0.86692-0.69000.h5


<keras.callbacks.History at 0x1a4f36328e0>

from above experiment we saw an in increase in training accuracy but no increase in validation accuracy

#### Exp-11: lets add layers in GRU 

In [144]:
input_shape=(x,y,z,channels)
model_d2= Sequential()

model_d2.add(TimeDistributed(Conv2D(16, kernel_size=(3, 3),  padding='same'),input_shape=input_shape))
model_d2.add(TimeDistributed(Activation('relu')))
model_d2.add(TimeDistributed(BatchNormalization()))


model_d2.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d2.add(TimeDistributed(Conv2D(32, kernel_size=(3, 3), padding='same')))
model_d2.add(TimeDistributed(Activation('relu')))
model_d2.add(TimeDistributed(BatchNormalization()))


model_d2.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d2.add(TimeDistributed(Conv2D(64, kernel_size=(3, 3),  padding='same')))
model_d2.add(TimeDistributed(Activation('relu')))
model_d2.add(TimeDistributed(BatchNormalization()))


model_d2.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d2.add(TimeDistributed(Conv2D(128, kernel_size=(3, 3), padding='same')))
model_d2.add(TimeDistributed(Activation('relu')))
model_d2.add(TimeDistributed(BatchNormalization()))


model_d2.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d2.add(TimeDistributed(Flatten()))

model_d2.add(GRU(128,return_sequences=True))
model_d2.add(Dropout(0.25))

model_d2.add(GRU(128))
model_d2.add(Dropout(0.25))


model_d2.add(Dense(128,activation='relu'))
model_d2.add(Dropout(0.25))

model_d2.add(Dense(classes, activation='softmax'))



In [145]:
optimiser=tf.keras.optimizers.Adam()
model_d2.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_d2.summary())

Model: "sequential_32"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_156 (TimeDi (None, 30, 100, 100, 16)  448       
_________________________________________________________________
time_distributed_157 (TimeDi (None, 30, 100, 100, 16)  0         
_________________________________________________________________
time_distributed_158 (TimeDi (None, 30, 100, 100, 16)  64        
_________________________________________________________________
time_distributed_159 (TimeDi (None, 30, 50, 50, 16)    0         
_________________________________________________________________
time_distributed_160 (TimeDi (None, 30, 50, 50, 32)    4640      
_________________________________________________________________
time_distributed_161 (TimeDi (None, 30, 50, 50, 32)    0         
_________________________________________________________________
time_distributed_162 (TimeDi (None, 30, 50, 50, 32)  

In [146]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [148]:
model_d2.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 10
Epoch 1/30

Epoch 00001: saving model to model_init_2021-10-2521_05_10.942506\model-00001-1.44541-0.35445-1.93876-0.16000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2021-10-2521_05_10.942506\model-00002-1.38729-0.45771-1.69775-0.20000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2021-10-2521_05_10.942506\model-00003-1.43052-0.42289-1.92463-0.21000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2021-10-2521_05_10.942506\model-00004-1.29897-0.50249-2.22404-0.23000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2021-10-2521_05_10.942506\model-00005-1.15506-0.54229-1.82008-0.23000.h5
Epoch 6/30

Epoch 00006: saving model to model_init_2021-10-2521_05_10.942506\model-00006-1.27641-0.51244-2.04663-0.19000.h5

Epoch 00006: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
Epoch 7/30

Epoch 00007: saving model to model_init_2021-10-2521_05_10.942506\model-00007-1.09609-0.5

<keras.callbacks.History at 0x1a562f5d850>

We can clearly see the model overfits as training accuracy is 92.04% and validation accuracy is 68%.and the model stops learning at 28th epoch as validation loss is not reducing.

#### Exp-12 :Add dropouts,and set the learning rate to 0.0001

In [162]:
input_shape=(x,y,z,channels)
model_d3= Sequential()

model_d3.add(TimeDistributed(Conv2D(16, kernel_size=(3, 3),  padding='same'),input_shape=input_shape))
model_d3.add(TimeDistributed(Activation('relu')))
model_d3.add(TimeDistributed(BatchNormalization()))


model_d3.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))


model_d3.add(TimeDistributed(Conv2D(32, kernel_size=(3, 3), padding='same')))
model_d3.add(TimeDistributed(Activation('relu')))
model_d3.add(TimeDistributed(BatchNormalization()))


model_d3.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

model_d3.add(TimeDistributed(Conv2D(64, kernel_size=(3, 3),  padding='same')))
model_d3.add(TimeDistributed(Activation('relu')))
model_d3.add(TimeDistributed(BatchNormalization()))


model_d3.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))


model_d3.add(TimeDistributed(Conv2D(128, kernel_size=(3, 3), padding='same')))
model_d3.add(TimeDistributed(Activation('relu')))
model_d3.add(TimeDistributed(BatchNormalization()))


model_d3.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model_d3.add(Dropout(0.25))

model_d3.add(TimeDistributed(Flatten()))

model_d3.add(GRU(128,return_sequences=True))
model_d3.add(Dropout(0.25))

model_d3.add(GRU(128))
model_d3.add(Dropout(0.25))


model_d3.add(Dense(128,activation='relu'))
model_d3.add(Dropout(0.25))

model_d3.add(Dense(classes, activation='softmax'))



In [163]:
optimiser=tf.keras.optimizers.Adam(0.0001)
model_d3.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_d3.summary())

Model: "sequential_37"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_241 (TimeDi (None, 30, 100, 100, 16)  448       
_________________________________________________________________
time_distributed_242 (TimeDi (None, 30, 100, 100, 16)  0         
_________________________________________________________________
time_distributed_243 (TimeDi (None, 30, 100, 100, 16)  64        
_________________________________________________________________
time_distributed_244 (TimeDi (None, 30, 50, 50, 16)    0         
_________________________________________________________________
time_distributed_245 (TimeDi (None, 30, 50, 50, 32)    4640      
_________________________________________________________________
time_distributed_246 (TimeDi (None, 30, 50, 50, 32)    0         
_________________________________________________________________
time_distributed_247 (TimeDi (None, 30, 50, 50, 32)  

In [164]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [165]:
model_d3.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 10
Epoch 1/30

Epoch 00001: saving model to model_init_2021-10-2521_05_10.942506\model-00001-1.49089-0.35445-1.64185-0.22000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2021-10-2521_05_10.942506\model-00002-1.24750-0.49254-1.65047-0.20000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2021-10-2521_05_10.942506\model-00003-1.29653-0.49254-1.79601-0.20000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2021-10-2521_05_10.942506\model-00004-1.07243-0.58209-2.19028-0.15000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2021-10-2521_05_10.942506\model-00005-0.92690-0.68657-2.29860-0.14000.h5

Epoch 00005: ReduceLROnPlateau reducing learning rate to 1.9999999494757503e-05.
Epoch 6/30

Epoch 00006: saving model to model_init_2021-10-2521_05_10.942506\model-00006-0.88527-0.65672-2.37832-0.17000.h5
Epoch 7/30

Epoch 00007: saving model to model_init_2021-10-2521_05_10.942506\model-00007-0.80525-0.7


Epoch 00028: saving model to model_init_2021-10-2521_05_10.942506\model-00028-0.57930-0.85075-0.72186-0.70000.h5
Epoch 00028: early stopping


<keras.callbacks.History at 0x1a54c2b4640>

From above results we saw that adding dropouts and reducing the learning rate we decrease the training accuracy to 85.05% and the validation accuracy is increased to 70.0%.<br> But the model still overfits.

#### Exp-13 : lets build a model from a pretrained architecture by using transfer learning

In [91]:
x = 30 # number of frames
y = 100 # image width
z = 100 # image height
batch_size=5
num_epochs=30
channels=3
classes=5

In [92]:
def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx =[x for x in range(0,x)]   #create a list of image numbers you want to use for a particular video
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(t)//batch_size    # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,x,y,z,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    
                    temp = imresize(image,(y,z))
                    temp = temp/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (temp[:,:,0])  #normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (temp[:,:,1])  #normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (temp[:,:,2])  #normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        if (len(folder_list) != batch_size*num_batches):
            print("Batch: ",num_batches+1,"Index:", batch_size)
            batch_size = len(folder_list) - (batch_size*num_batches)
            batch_data = np.zeros((batch_size,x,y,z,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    temp = imresize(image,(y,z))
                    temp = temp/255 #Normalize data
                    
                    batch_data[folder,idx,:,:,0] = (temp[:,:,0])
                    batch_data[folder,idx,:,:,1] = (temp[:,:,1])
                    batch_data[folder,idx,:,:,2] = (temp[:,:,2])
                   
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [93]:
curr_dt_time = datetime.datetime.now()
train_path = "C:\cnndatasets\Project_data/train"
val_path = "C:\cnndatasets\Project_data/val"
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs =   30  # choose the number of epochs
print ('# epochs =', num_epochs)
num_batches = num_train_sequences//batch_size 
print(num_batches)

# training sequences = 663
# validation sequences = 100
# epochs = 30
132


In [108]:
from keras.applications import mobilenet
mobile_net = mobilenet.MobileNet(weights='imagenet', include_top=False)

for layer in mobile_net.layers:
    layer.trainable=False



In [95]:

from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, Flatten, BatchNormalization, Activation
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping
from keras import optimizers
from keras.layers import Dropout

In [109]:
model_e = Sequential()
model_e.add(TimeDistributed(mobile_net,input_shape=(x,y,z,channels)))



model_e.add(TimeDistributed(BatchNormalization()))
model_e.add(TimeDistributed(MaxPooling2D((2, 2))))
model_e.add(TimeDistributed(Flatten()))

model_e.add(GRU(128))
model_e.add(Dropout(0.25))




model_e.add(Dense(128,activation='relu'))
model_e.add(Dropout(0.25))




model_e.add(Dense(classes, activation='softmax'))

In [110]:
optimiser=tf.keras.optimizers.Adam()
model_e.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_e.summary())

Model: "sequential_19"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_73 (TimeDis (None, 30, 3, 3, 1024)    3228864   
_________________________________________________________________
time_distributed_74 (TimeDis (None, 30, 3, 3, 1024)    4096      
_________________________________________________________________
time_distributed_75 (TimeDis (None, 30, 1, 1, 1024)    0         
_________________________________________________________________
time_distributed_76 (TimeDis (None, 30, 1024)          0         
_________________________________________________________________
gru_22 (GRU)                 (None, 128)               443136    
_________________________________________________________________
dropout_38 (Dropout)         (None, 128)               0         
_________________________________________________________________
dense_34 (Dense)             (None, 128)             

In [111]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [112]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, cooldown=1, verbose=1)# write the REducelronplateau code here
earlystop = EarlyStopping( monitor="val_loss", min_delta=0,patience=10,verbose=1)
callbacks_list = [checkpoint, LR, earlystop]



In [113]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [114]:
model_e.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 5
Epoch 1/30

Epoch 00001: saving model to model_init_2021-10-2712_52_55.420864\model-00001-1.64179-0.22474-1.58165-0.27000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2021-10-2712_52_55.420864\model-00002-1.59939-0.27068-1.62482-0.24000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2021-10-2712_52_55.420864\model-00003-1.59950-0.31328-1.55735-0.27000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2021-10-2712_52_55.420864\model-00004-1.55993-0.28571-1.52149-0.32000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2021-10-2712_52_55.420864\model-00005-1.52281-0.30827-1.45850-0.32000.h5
Epoch 6/30

Epoch 00006: saving model to model_init_2021-10-2712_52_55.420864\model-00006-1.53572-0.27569-1.49337-0.29000.h5
Epoch 7/30

Epoch 00007: saving model to model_init_2021-10-2712_52_55.420864\model-00007-1.51569-0.32331-1.47405-0.39000.h5

Epoch 00007: ReduceLROnPlateau reducing learning rate to 0


Epoch 00027: saving model to model_init_2021-10-2712_52_55.420864\model-00027-1.24925-0.46115-1.39854-0.42000.h5

Epoch 00027: ReduceLROnPlateau reducing learning rate to 7.812500371073838e-06.
Epoch 00027: early stopping


<keras.callbacks.History at 0x1fc295de220>

After trying numerous times with resnet and mobilenet pretrained  models,we saw that the training loss and validation loss was not decreasing and hence the model stops learning for batch size=10.<br>So we kept batch size=5 and we got training accuracy of 46.12% and validation accuracy of 42%.<br>The model is generalisable but the acuracies are low ,so to improve the accuracies we train some layers of mobilenet parameters.

In [124]:
from keras.applications import mobilenet
mobile_net = mobilenet.MobileNet(weights='imagenet', include_top=False)

for layer in mobile_net.layers[:-10]:
    layer.trainable=False



In [132]:
model_e1 = Sequential()
model_e1.add(TimeDistributed(mobile_net,input_shape=(x,y,z,channels)))



model_e1.add(TimeDistributed(BatchNormalization()))
model_e1.add(TimeDistributed(MaxPooling2D((2, 2))))
model_e1.add(TimeDistributed(Flatten()))

model_e1.add(GRU(128))
model_e1.add(Dropout(0.25))




model_e1.add(Dense(128,activation='relu'))
model_e1.add(Dropout(0.25))




model_e1.add(Dense(classes, activation='softmax'))

In [133]:
optimiser=tf.keras.optimizers.Adam()
model_e1.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model_e1.summary())

Model: "sequential_22"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_85 (TimeDis (None, 30, 3, 3, 1024)    3228864   
_________________________________________________________________
time_distributed_86 (TimeDis (None, 30, 3, 3, 1024)    4096      
_________________________________________________________________
time_distributed_87 (TimeDis (None, 30, 1, 1, 1024)    0         
_________________________________________________________________
time_distributed_88 (TimeDis (None, 30, 1024)          0         
_________________________________________________________________
gru_25 (GRU)                 (None, 128)               443136    
_________________________________________________________________
dropout_44 (Dropout)         (None, 128)               0         
_________________________________________________________________
dense_40 (Dense)             (None, 128)             

In [134]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [135]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=4, cooldown=1, verbose=1)# write the REducelronplateau code here
earlystop = EarlyStopping( monitor="val_loss", min_delta=0,patience=10,verbose=1)
callbacks_list = [checkpoint, LR, earlystop]



In [136]:
model_e1.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  C:\cnndatasets\Project_data/train ; batch size = 5
Epoch 1/30

Epoch 00001: saving model to model_init_2021-10-2712_52_55.420864\model-00001-1.81990-0.18703-1.60919-0.23000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2021-10-2712_52_55.420864\model-00002-1.70652-0.19549-1.62296-0.16000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2021-10-2712_52_55.420864\model-00003-1.64702-0.24060-1.62290-0.25000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2021-10-2712_52_55.420864\model-00004-1.63691-0.20050-1.61448-0.23000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2021-10-2712_52_55.420864\model-00005-1.62070-0.21303-1.61181-0.23000.h5

Epoch 00005: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 6/30

Epoch 00006: saving model to model_init_2021-10-2712_52_55.420864\model-00006-1.62177-0.18296-1.60394-0.21000.h5
Epoch 7/30

Epoch 00007: saving model to model_init_2021-10-2712_52_55.420864\model-00007-1.61684-0.177

<keras.callbacks.History at 0x1fc3b067fd0>

We see even the trainable parameters increased the loss is not all decreasing.hence the model stops learning.

## Thus final model is Exp-6 with Training Accuracy: 0.94, Validation accuracy:-0.88 with least number of parameters. 