# Neural Networks Project - Gesture Recognition
In this group project, we build a 3D convolutional (Conv3D) model that classifies videos into one of five gestures.

The training data consists of a few hundred videos, each categorised into one of the five classes. Each video (typically 2-3 seconds long) is divided into a sequence of 30 frames (images). These videos have been recorded by various people performing one of the five gestures in front of a webcam, similar to what the smart TV will use.

The data is in a zip file. The zip file contains a 'train' and a 'val' folder with two CSV files for the two folders. These folders are in turn divided into subfolders where each subfolder represents a video of a particular gesture. Each subfolder, i.e. a video, contains 30 frames (or images).
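
The generator further down splits each CSV line on `;` and uses the first field as the folder name and the third field as the integer class label. A minimal sketch of that parsing, assuming a three-field row; the row shown is made up for illustration and `parse_row` is a hypothetical helper, not part of the project code:

```python
def parse_row(line):
    """Split one CSV line into (folder_name, integer_label)."""
    fields = line.strip().split(';')
    # Field 0 is the video folder, field 2 is the class index (0-4).
    return fields[0], int(fields[2])


# Made-up example row; real folder names will differ.
folder, label = parse_row("video_0001;Left_Swipe;0\n")
print(folder, label)
```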

Our task is to train a model on the 'train' folder which performs well on the 'val' folder as well (as usually done in ML projects). We have withheld the test folder for evaluation purposes - your final model's performance will be tested on the 'test' set.

In [None]:
import datetime
import os

import cv2
import numpy as np
from imageio import imread

We set the random seed so that the results don't vary drastically.

In [None]:
np.random.seed(30)
import random as rn
rn.seed(30)
from tensorflow.keras import backend as K 
import tensorflow as tf
tf.random.set_seed(30)
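
As a small illustration of why the seeds matter: re-seeding a generator with the same value reproduces the same shuffle. This sketch uses only Python's built-in `random` module, and the list of folder names is made up.

```python
import random as rn

folders = ["video_a", "video_b", "video_c", "video_d", "video_e"]  # hypothetical names

rn.seed(30)
first = rn.sample(folders, k=len(folders))   # shuffled copy

rn.seed(30)
second = rn.sample(folders, k=len(folders))  # same seed, so the same order

print(first == second)  # True
```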

In [None]:
from keras.models import Sequential
from keras.layers import Dense, GRU, Dropout, Flatten, BatchNormalization, Activation
from keras.layers import Conv3D, MaxPooling3D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers

In [None]:
#Mounting the drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
#Unzipping the content
!unzip /content/drive/MyDrive/Project_data.zip

In [None]:
#Reading the data
train_doc = np.random.permutation(open('Project_data/train.csv').readlines())
val_doc = np.random.permutation(open('Project_data/val.csv').readlines())

## Experiment 1 (Architecture - Conv 3D)
(Batch Size = 10, Image dimensions = 84*84, Epochs = 20, Frames = 18)

In [None]:
batch_size = 10
#experimenting with batch size to use GPU to its full capacity

## Generator
This is one of the most important parts of the code. The generator preprocesses the images, which come in two different dimensions, and assembles them into batches of video frames. The frame selection (`img_idx`), the image dimensions (`y`, `z`) and the normalization can all be tuned to improve accuracy.
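
To make the role of `img_idx` concrete, here is a sketch of how the index list picks 18 of the 30 frames in a video folder. The file names are hypothetical, and sorting them first is an assumption that keeps the frames in temporal order.

```python
# Frame indices used by the generator: consecutive frames at both ends,
# every second frame in between.
img_idx = [0, 1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 27, 28, 29]

# Hypothetical frame file names for one video folder (30 frames).
frames = sorted("frame_{:05d}.png".format(i) for i in range(30))

selected = [frames[i] for i in img_idx]
print(len(selected), selected[0], selected[-1])
```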

In [None]:
def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = [0,1,2,4,6,8,10,12,14,16,18,20,22,24,26,27,28,29]
    while True:
        t = np.random.permutation(folder_list)
        num_batches = int(len(t)/batch_size)
        for batch in range(num_batches):
            batch_data = np.zeros((batch_size,18,84,84,3))
            batch_labels = np.zeros((batch_size,5))
            for folder in range(batch_size):
                imgs = sorted(os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]))  # os.listdir order is arbitrary; sort to keep frames in order
                for idx,item in enumerate(img_idx):
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    if image.shape[1] == 160:
                        image = cv2.resize(image[:,20:140,:],(84,84)).astype(np.float32)
                    else:
                        image = cv2.resize(image,(84,84)).astype(np.float32)
                    
                    batch_data[folder,idx,:,:,0] = image[:,:,0]/255.0 #Red
                    batch_data[folder,idx,:,:,1] = image[:,:,1]/255.0 #Green
                    batch_data[folder,idx,:,:,2] = image[:,:,2]/255.0 #Blue
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

        if (len(t)%batch_size) != 0:
            batch_data = np.zeros((len(t)%batch_size,18,84,84,3))
            batch_labels = np.zeros((len(t)%batch_size,5))
            for folder in range(len(t)%batch_size):
                imgs = sorted(os.listdir(source_path+'/'+ t[folder + (num_batches*batch_size)].strip().split(';')[0]))  # os.listdir order is arbitrary; sort to keep frames in order
                for idx,item in enumerate(img_idx):
                    image = imread(source_path+'/'+ t[folder + (num_batches*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    if image.shape[1] == 160:
                        image = cv2.resize(image[:,20:140,:],(84,84)).astype(np.float32)
                    else:
                        image = cv2.resize(image,(84,84)).astype(np.float32)

                    batch_data[folder,idx,:,:,0] = image[:,:,0]/255.0 #Red
                    batch_data[folder,idx,:,:,1] = image[:,:,1]/255.0 #Green
                    batch_data[folder,idx,:,:,2] = image[:,:,2]/255.0 #Blue
                    
                batch_labels[folder, int(t[folder + (num_batches*batch_size)].strip().split(';')[2])] = 1

            yield batch_data, batch_labels


Note here that a video is represented above in the generator as (number of images, height, width, number of channels). Take this into consideration while creating the model architecture.
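
The shapes involved can be checked with a small NumPy sketch. It fills a batch with a single synthetic sequence of 120x160 frames, applies the same centre crop and 0-1 scaling as the generator, and one-hot encodes a label. The resize step is left out, so the target size here is 120x120; the random frames and the label value are made up.

```python
import numpy as np

num_frames, height, width, channels, num_classes = 18, 120, 120, 3, 5

# Synthetic video: 18 RGB frames of 120x160 with pixel values in 0..255.
rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(num_frames, 120, 160, 3)).astype(np.float32)

batch_data = np.zeros((1, num_frames, height, width, channels))
batch_labels = np.zeros((1, num_classes))

for idx, frame in enumerate(video):
    cropped = frame[:, 20:140, :]          # keep the central 120 columns
    batch_data[0, idx] = cropped / 255.0   # scale pixel values to [0, 1]

batch_labels[0, 2] = 1  # one-hot encoding of a made-up class index 2

print(batch_data.shape, batch_labels.shape)
```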

In [None]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 20 # choosing the number of epochs
print ('# epochs =', num_epochs)

# training sequences = 663
# validation sequences = 100
# epochs = 20


## Model 1
Here we make the model using different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D` and not `Conv2D` and `MaxPooling2D` for a 3D convolution model.
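
Because the input is ordered as (frames, height, width), each pooling tuple applies to those axes in that order. The following sketch reproduces the output-shape arithmetic for pooling without padding, using the pool sizes from the model below; `pool_output_shape` is a hypothetical helper written for illustration.

```python
def pool_output_shape(shape, pool_size, strides):
    """Size per axis after pooling without padding: floor((n - pool) / stride) + 1."""
    return tuple((n - p) // s + 1 for n, p, s in zip(shape, pool_size, strides))


shape = (18, 84, 84)  # (frames, height, width) for Experiment 1
pools = [(2, 2, 1), (2, 2, 2), (2, 2, 2), (2, 2, 2)]

for pool in pools:
    shape = pool_output_shape(shape, pool, pool)  # strides equal the pool size here
    print(pool, "->", shape)
```

The first step gives (9, 42, 84), which matches the `max_pooling3d` row of the model summary: the first pooling layer halves the frame and height axes but leaves the width at 84.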


In [None]:

from keras.models import Sequential
from keras.layers import Dense, GRU, Dropout, Flatten, BatchNormalization, Activation
from keras.layers import Conv3D, MaxPooling3D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers

model = Sequential()
model.add(Conv3D(64, (3,3,3), strides=(1,1,1), padding='same', input_shape=(18,84,84,3)))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,1), strides=(2,2,1)))  # axes are (frames, height, width): the width is not pooled here

model.add(Conv3D(128, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

# model.add(Dropout(0.25))

model.add(Conv3D(256, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

# model.add(Dropout(0.25))

model.add(Conv3D(256, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(512, activation='elu'))
model.add(Dropout(0.5))
model.add(Dense(5, activation='softmax'))

Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.
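
The parameter counts in the summary can be checked by hand. A sketch of the arithmetic for a `Conv3D` layer with bias (kernel volume times input channels, plus one bias, per filter) and for `BatchNormalization` (four values per channel); the helper names are hypothetical.

```python
def conv3d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    kd, kh, kw = kernel
    return (kd * kh * kw * in_channels + 1) * filters


def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance for each channel."""
    return 4 * channels


print(conv3d_params((3, 3, 3), 3, 64))    # 5248
print(conv3d_params((3, 3, 3), 64, 128))  # 221312
print(batchnorm_params(64))               # 256
```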

In [None]:
sgd = optimizers.SGD(learning_rate=0.001, decay=1e-6, momentum=0.7, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 18, 84, 84, 64)    5248      
_________________________________________________________________
batch_normalization (BatchNo (None, 18, 84, 84, 64)    256       
_________________________________________________________________
activation (Activation)      (None, 18, 84, 84, 64)    0         
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 9, 42, 84, 64)     0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 9, 42, 84, 128)    221312    
_________________________________________________________________
batch_normalization_1 (Batch (None, 9, 42, 84, 128)    512       
_________________________________________________________________
activation_1 (Activation)    (None, 9, 42, 84, 128)    0


Let us create the `train_generator` and the `val_generator` which will be used in `.fit_generator`.

In [None]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [None]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', 
                             verbose=1, save_best_only=False, 
                             save_weights_only=False, mode='auto')

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                       patience=2, verbose=1,
                       mode='min', min_delta=0.0001,
                       cooldown=0, min_lr=0.0001)  # halve the learning rate when val_loss plateaus
callbacks_list = [checkpoint, LR]
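
With these settings, each time `val_loss` fails to improve for `patience` epochs the learning rate is multiplied by `factor`, but it never drops below `min_lr`. A sketch of the resulting sequence, starting from the SGD learning rate of 0.001 used above; `reduced_rates` is a hypothetical helper that only mimics the arithmetic, not the Keras callback itself.

```python
def reduced_rates(initial_lr, factor, min_lr, reductions):
    """Learning rate after each successive reduction, clamped at min_lr."""
    rates = []
    lr = initial_lr
    for _ in range(reductions):
        lr = max(lr * factor, min_lr)
        rates.append(lr)
    return rates


for step, lr in enumerate(reduced_rates(0.001, 0.5, 0.0001, 5), start=1):
    print("reduction", step, "->", round(lr, 7))
```

After three halvings the rate is 0.000125; the next reduction would fall below `min_lr`, so it is clamped at 0.0001.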

The `steps_per_epoch` and `validation_steps` are used by `fit_generator` to decide the number of `next()` calls it needs to make.

In [None]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1
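
The two branches above amount to a ceiling division, so that the final, smaller batch is also consumed in each epoch. A compact equivalent, shown as a sketch with the sequence counts printed earlier (663 training, 100 validation); `steps_for` is a hypothetical helper name.

```python
import math


def steps_for(num_sequences, batch_size):
    """Number of batches needed to cover every sequence once."""
    return math.ceil(num_sequences / batch_size)


for bs in (10, 32):
    print("batch size", bs, "->", steps_for(663, bs), "train steps,",
          steps_for(100, bs), "validation steps")
```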

Let us now fit the model. This will start training the model and with the help of the checkpoints, you'll be able to save the model at the end of each epoch.
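
The checkpoint file names are produced by filling the `filepath` template with the epoch number and the logged metrics. Plain `str.format` shows the effect; the metric values are taken from the first epoch of the log below, and the directory prefix is omitted.

```python
template = ('model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}'
            '-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5')

name = template.format(epoch=1, loss=3.53227, categorical_accuracy=0.35143,
                       val_loss=5.74138, val_categorical_accuracy=0.25)
print(name)  # model-00001-3.53227-0.35143-5.74138-0.25000.h5
```

Reading a file name from left to right therefore gives the epoch, training loss, training accuracy, validation loss and validation accuracy.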

In [None]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)



Source path =  Project_data/train ; batch size = 10
Epoch 1/20

Epoch 00001: saving model to model_init_2021-08-0911_03_43.393103/model-00001-3.53227-0.35143-5.74138-0.25000.h5
Epoch 2/20

Epoch 00002: saving model to model_init_2021-08-0911_03_43.393103/model-00002-1.64333-0.50830-2.26180-0.49000.h5
Epoch 3/20

Epoch 00003: saving model to model_init_2021-08-0911_03_43.393103/model-00003-1.45283-0.52941-1.33794-0.50000.h5
Epoch 4/20

Epoch 00004: saving model to model_init_2021-08-0911_03_43.393103/model-00004-1.19555-0.58974-1.33191-0.48000.h5
Epoch 5/20

Epoch 00005: saving model to model_init_2021-08-0911_03_43.393103/model-00005-1.02983-0.62594-1.23252-0.58000.h5
Epoch 6/20

Epoch 00006: saving model to model_init_2021-08-0911_03_43.393103/model-00006-1.03378-0.61840-0.66741-0.71000.h5
Epoch 7/20

Epoch 00007: saving model to model_init_2021-08-0911_03_43.393103/model-00007-0.91695-0.67873-0.76079-0.70000.h5
Epoch 8/20

Epoch 00008: saving model to model_init_2021-08-0911_03_43.39

<keras.callbacks.History at 0x7ff8c055eb10>

#### Maximum validation accuracy = 82%
#### Corresponding training accuracy = 81.93%

The model already reaches reasonable accuracy. Next, we increase the batch size from 10 to 32.
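
Reading off the best epoch can be automated. The sketch below uses the training and validation accuracies recoverable from the checkpoint names of the first seven epochs (the log above is truncated after that, so the 82% epoch itself is not shown) and picks the epoch with the highest validation accuracy. In a real run, the same lists would come from the `history` attribute of the object returned by training.

```python
# Accuracies read from the checkpoint file names of epochs 1-7 in the log above.
train_acc = [0.35143, 0.50830, 0.52941, 0.58974, 0.62594, 0.61840, 0.67873]
val_acc = [0.25, 0.49, 0.50, 0.48, 0.58, 0.71, 0.70]

# Index of the epoch with the highest validation accuracy.
best = max(range(len(val_acc)), key=lambda i: val_acc[i])

print("best epoch:", best + 1)
print("validation accuracy:", val_acc[best])
print("corresponding training accuracy:", train_acc[best])
```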

## Experiment 2 (Architecture - Conv 3D)

(Batch Size = 32, Image dimensions = 84*84, Epochs = 20, Frames = 18)

In [None]:
batch_size = 32
#experimenting with batch size to use GPU to its full capacity

## Generator
The generator is the same as in Experiment 1: it selects 18 of the 30 frames, crops 160-pixel-wide frames to their central 120 columns, resizes every frame to 84x84 and scales the pixel values to the range 0-1.

In [None]:
def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = [0,1,2,4,6,8,10,12,14,16,18,20,22,24,26,27,28,29]
    while True:
        t = np.random.permutation(folder_list)
        num_batches = int(len(t)/batch_size)
        for batch in range(num_batches):
            batch_data = np.zeros((batch_size,18,84,84,3))
            batch_labels = np.zeros((batch_size,5))
            for folder in range(batch_size):
                imgs = sorted(os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]))  # os.listdir order is arbitrary; sort to keep frames in order
                for idx,item in enumerate(img_idx):
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    if image.shape[1] == 160:
                        image = cv2.resize(image[:,20:140,:],(84,84)).astype(np.float32)
                    else:
                        image = cv2.resize(image,(84,84)).astype(np.float32)
                    
                    batch_data[folder,idx,:,:,0] = image[:,:,0]/255.0 #Red
                    batch_data[folder,idx,:,:,1] = image[:,:,1]/255.0 #Green
                    batch_data[folder,idx,:,:,2] = image[:,:,2]/255.0 #Blue
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

        if (len(t)%batch_size) != 0:
            batch_data = np.zeros((len(t)%batch_size,18,84,84,3))
            batch_labels = np.zeros((len(t)%batch_size,5))
            for folder in range(len(t)%batch_size):
                imgs = sorted(os.listdir(source_path+'/'+ t[folder + (num_batches*batch_size)].strip().split(';')[0]))  # os.listdir order is arbitrary; sort to keep frames in order
                for idx,item in enumerate(img_idx):
                    image = imread(source_path+'/'+ t[folder + (num_batches*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    if image.shape[1] == 160:
                        image = cv2.resize(image[:,20:140,:],(84,84)).astype(np.float32)
                    else:
                        image = cv2.resize(image,(84,84)).astype(np.float32)

                    batch_data[folder,idx,:,:,0] = image[:,:,0]/255.0 #Red
                    batch_data[folder,idx,:,:,1] = image[:,:,1]/255.0 #Green
                    batch_data[folder,idx,:,:,2] = image[:,:,2]/255.0 #Blue

                batch_labels[folder, int(t[folder + (num_batches*batch_size)].strip().split(';')[2])] = 1

            yield batch_data, batch_labels


Note here that a video is represented above in the generator as (number of images, height, width, number of channels). Take this into consideration while creating the model architecture.

In [None]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 20 # choosing the number of epochs
print ('# epochs =', num_epochs)

# training sequences = 663
# validation sequences = 100
# epochs = 20


## Model 2
Here you make the model using different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D` and not `Conv2D` and `MaxPooling2D` for a 3D convolution model.

In [None]:

from keras.models import Sequential
from keras.layers import Dense, GRU, Dropout, Flatten, BatchNormalization, Activation
from keras.layers import Conv3D, MaxPooling3D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers

model = Sequential()
model.add(Conv3D(64, (3,3,3), strides=(1,1,1), padding='same', input_shape=(18,84,84,3)))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,1), strides=(2,2,1)))  # axes are (frames, height, width): the width is not pooled here

model.add(Conv3D(128, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

# model.add(Dropout(0.25))

model.add(Conv3D(256, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

# model.add(Dropout(0.25))

model.add(Conv3D(256, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(512, activation='elu'))
model.add(Dropout(0.5))
model.add(Dense(5, activation='softmax'))

Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

In [None]:
sgd = optimizers.SGD(learning_rate=0.001, decay=1e-6, momentum=0.7, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_4 (Conv3D)            (None, 18, 84, 84, 64)    5248      
_________________________________________________________________
batch_normalization_4 (Batch (None, 18, 84, 84, 64)    256       
_________________________________________________________________
activation_4 (Activation)    (None, 18, 84, 84, 64)    0         
_________________________________________________________________
max_pooling3d_4 (MaxPooling3 (None, 9, 42, 84, 64)     0         
_________________________________________________________________
conv3d_5 (Conv3D)            (None, 9, 42, 84, 128)    221312    
_________________________________________________________________
batch_normalization_5 (Batch (None, 9, 42, 84, 128)    512       
_________________________________________________________________
activation_5 (Activation)    (None, 9, 42, 84, 128)   


Let us create the `train_generator` and the `val_generator` which will be used in `.fit_generator`.

In [None]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [None]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', 
                             verbose=1, save_best_only=False, 
                             save_weights_only=False, mode='auto')

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                       patience=2, verbose=1,
                       mode='min', min_delta=0.0001,
                       cooldown=0, min_lr=0.0001)  # halve the learning rate when val_loss plateaus
callbacks_list = [checkpoint, LR]

The `steps_per_epoch` and `validation_steps` are used by `fit_generator` to decide the number of `next()` calls it needs to make.

In [None]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

Let us now fit the model. This will start training the model and with the help of the checkpoints, you'll be able to save the model at the end of each epoch.

In [None]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)



Source path =  Project_data/train ; batch size = 32
Epoch 1/20

Epoch 00001: saving model to model_init_2021-08-0911_19_29.133050/model-00001-3.31934-0.31071-11.03752-0.21000.h5
Epoch 2/20

Epoch 00002: saving model to model_init_2021-08-0911_19_29.133050/model-00002-1.79857-0.48869-7.67753-0.24000.h5
Epoch 3/20

Epoch 00003: saving model to model_init_2021-08-0911_19_29.133050/model-00003-1.35762-0.57919-6.91338-0.21000.h5
Epoch 4/20

Epoch 00004: saving model to model_init_2021-08-0911_19_29.133050/model-00004-1.22563-0.57768-3.58229-0.27000.h5
Epoch 5/20

Epoch 00005: saving model to model_init_2021-08-0911_19_29.133050/model-00005-1.08157-0.62293-2.26958-0.42000.h5
Epoch 6/20

Epoch 00006: saving model to model_init_2021-08-0911_19_29.133050/model-00006-1.02379-0.63198-1.78955-0.50000.h5
Epoch 7/20

Epoch 00007: saving model to model_init_2021-08-0911_19_29.133050/model-00007-0.95856-0.68477-1.02349-0.65000.h5
Epoch 8/20

Epoch 00008: saving model to model_init_2021-08-0911_19_29.1

<keras.callbacks.History at 0x7ff8d4fa7d50>

#### Maximum validation accuracy = 77%
#### Corresponding training accuracy = 87.39%

The maximum validation accuracy dropped from 82% to 77%, but training was faster with the larger batch size.

## Experiment 3 (Architecture - Conv 3D)

(Batch Size = 10, Image dimensions = 120*120, Epochs = 50, Frames = 18)

In [None]:
batch_size = 10
#experimenting with batch size to use GPU to its full capacity

## Generator
The generator has the same structure as in the earlier experiments and still selects 18 of the 30 frames, but it now resizes every frame to 120x120 instead of 84x84.

In [None]:
def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = [0,1,2,4,6,8,10,12,14,16,18,20,22,24,26,27,28,29]
    while True:
        t = np.random.permutation(folder_list)
        num_batches = int(len(t)/batch_size)
        for batch in range(num_batches):
            batch_data = np.zeros((batch_size,18,120,120,3))
            batch_labels = np.zeros((batch_size,5))
            for folder in range(batch_size):
                imgs = sorted(os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]))  # os.listdir order is arbitrary; sort to keep frames in order
                for idx,item in enumerate(img_idx):
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    if image.shape[1] == 160:
                        image = cv2.resize(image[:,20:140,:],(120,120)).astype(np.float32)
                    else:
                        image = cv2.resize(image,(120,120)).astype(np.float32)
                    
                    batch_data[folder,idx,:,:,0] = image[:,:,0]/255.0 #Red
                    batch_data[folder,idx,:,:,1] = image[:,:,1]/255.0 #Green
                    batch_data[folder,idx,:,:,2] = image[:,:,2]/255.0 #Blue
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

        if (len(t)%batch_size) != 0:
            batch_data = np.zeros((len(t)%batch_size,18,120,120,3))
            batch_labels = np.zeros((len(t)%batch_size,5))
            for folder in range(len(t)%batch_size):
                imgs = sorted(os.listdir(source_path+'/'+ t[folder + (num_batches*batch_size)].strip().split(';')[0]))  # os.listdir order is arbitrary; sort to keep frames in order
                for idx,item in enumerate(img_idx):
                    image = imread(source_path+'/'+ t[folder + (num_batches*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    if image.shape[1] == 160:
                        image = cv2.resize(image[:,20:140,:],(120,120)).astype(np.float32)
                    else:
                        image = cv2.resize(image,(120,120)).astype(np.float32)

                    batch_data[folder,idx,:,:,0] = image[:,:,0]/255.0 #Red
                    batch_data[folder,idx,:,:,1] = image[:,:,1]/255.0 #Green
                    batch_data[folder,idx,:,:,2] = image[:,:,2]/255.0 #Blue

                batch_labels[folder, int(t[folder + (num_batches*batch_size)].strip().split(';')[2])] = 1

            yield batch_data, batch_labels


Note here that a video is represented above in the generator as (number of images, height, width, number of channels). Take this into consideration while creating the model architecture.

In [None]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 50 # choosing the number of epochs
print ('# epochs =', num_epochs)

# training sequences = 663
# validation sequences = 100
# epochs = 50


## Model 3
Here you make the model using different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D` and not `Conv2D` and `MaxPooling2D` for a 3D convolution model.


In [None]:

from keras.models import Sequential
from keras.layers import Dense, GRU, Dropout, Flatten, BatchNormalization, Activation
from keras.layers import Conv3D, MaxPooling3D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers

model = Sequential()
model.add(Conv3D(64, (3,3,3), strides=(1,1,1), padding='same', input_shape=(18,120,120,3)))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,1), strides=(2,2,1)))  # axes are (frames, height, width): the width is not pooled here

model.add(Conv3D(128, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

# model.add(Dropout(0.25))

model.add(Conv3D(256, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

# model.add(Dropout(0.25))

model.add(Conv3D(256, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('elu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(512, activation='elu'))
model.add(Dropout(0.5))
model.add(Dense(5, activation='softmax'))

Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

In [None]:
sgd = optimizers.SGD(learning_rate=0.001, decay=1e-6, momentum=0.7, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_8 (Conv3D)            (None, 18, 120, 120, 64)  5248      
_________________________________________________________________
batch_normalization_8 (Batch (None, 18, 120, 120, 64)  256       
_________________________________________________________________
activation_8 (Activation)    (None, 18, 120, 120, 64)  0         
_________________________________________________________________
max_pooling3d_8 (MaxPooling3 (None, 9, 60, 120, 64)    0         
_________________________________________________________________
conv3d_9 (Conv3D)            (None, 9, 60, 120, 128)   221312    
_________________________________________________________________
batch_normalization_9 (Batch (None, 9, 60, 120, 128)   512       
_________________________________________________________________
activation_9 (Activation)    (None, 9, 60, 120, 128)  


Let us create the `train_generator` and the `val_generator` which will be used in `.fit_generator`.

In [None]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [None]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', 
                             verbose=1, save_best_only=False, 
                             save_weights_only=False, mode='auto')

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                       patience=2, verbose=1,
                       mode='min', min_delta=0.0001,
                       cooldown=0, min_lr=0.0001)  # halve the learning rate when val_loss plateaus
callbacks_list = [checkpoint, LR]

The `steps_per_epoch` and `validation_steps` are used by `fit_generator` to decide the number of `next()` calls it needs to make.

In [None]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

Let us now fit the model. This will start training the model and with the help of the checkpoints, you'll be able to save the model at the end of each epoch.

In [None]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)



Source path =  Project_data/train ; batch size = 10
Epoch 1/50

Epoch 00001: saving model to model_init_2021-08-0911_40_30.295443/model-00001-3.36651-0.38009-4.60359-0.27000.h5
Epoch 2/50

Epoch 00002: saving model to model_init_2021-08-0911_40_30.295443/model-00002-1.43341-0.57617-1.28888-0.55000.h5
Epoch 3/50

Epoch 00003: saving model to model_init_2021-08-0911_40_30.295443/model-00003-1.09862-0.61237-1.85598-0.46000.h5
Epoch 4/50

Epoch 00004: saving model to model_init_2021-08-0911_40_30.295443/model-00004-1.03714-0.64103-1.16823-0.58000.h5
Epoch 5/50

Epoch 00005: saving model to model_init_2021-08-0911_40_30.295443/model-00005-0.95584-0.66667-0.71718-0.73000.h5
Epoch 6/50

Epoch 00006: saving model to model_init_2021-08-0911_40_30.295443/model-00006-0.77860-0.70136-0.76363-0.77000.h5
Epoch 7/50

Epoch 00007: saving model to model_init_2021-08-0911_40_30.295443/model-00007-0.78424-0.73002-1.48736-0.63000.h5

Epoch 00007: ReduceLROnPlateau reducing learning rate to 0.0005000000237

<keras.callbacks.History at 0x7ff878128710>

#### Maximum validation accuracy = 83%
#### Corresponding training accuracy = 88% (Epoch 13)

Both the maximum validation accuracy and the corresponding training accuracy increased.
Also, increasing the number of epochs did not necessarily improve the score.