# Gesture Recognition
In this group project, you are going to build a 3D Conv model that will be able to predict the 5 gestures correctly. Please import the following libraries to get started.

In [1]:
# !pip install opencv-python

In [1]:
import numpy as np
import os
# from scipy.misc import imread, imresize
import datetime
import os

import imageio
from imageio import imread
from PIL import Image
import pathlib
import cv2

We set the random seed so that the results don't vary drastically.

In [2]:
np.random.seed(30)
import random as rn
rn.seed(30)
from keras import backend as K
import tensorflow as tf
# tf.random.set_random_seed(30)
tf.random.set_seed(30)

In this block, you read the folder names for training and validation. You also set the `batch_size` here. Note that you set the batch size in such a way that you are able to use the GPU in full capacity. You keep increasing the batch size until the machine throws an error.

In [3]:
train_doc = np.random.permutation(open('Project_data/train.csv').readlines())
val_doc = np.random.permutation(open('Project_data/val.csv').readlines())

In [4]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, BatchNormalization, Activation, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D, Conv3D, MaxPooling3D, AveragePooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.regularizers import l2
from keras.layers import LSTM, GRU, Bidirectional, SimpleRNN, RNN

## Generator
This is one of the most important part of the code. The overall structure of the generator has been given. In the generator, you are going to preprocess the images as you have images of 2 different dimensions as well as create a batch of video frames. You have to experiment with `img_idx`, `y`,`z` and normalization such that you get high accuracy.

In [21]:
img_height = 160
img_width = 160
channels = 3

def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = np.arange(0,30,3)
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    image = image.resize((img_height,img_width))
                    image = (image - np.percentile(image,5)) / (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                   
                    image = image.resize((img_height,img_width))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels



Note here that a video is represented above in the generator as (number of images, height, width, number of channels). Take this into consideration while creating the model architecture.

In [22]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 16

# training sequences = 663
# validation sequences = 100
# epochs = 10


## Model
Here you make the model using different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D` and not `Conv2D` and `Maxpooling2D` for a 3D convolution model. You would want to use `TimeDistributed` while building a Conv2D + RNN model. Also remember that the last layer is the softmax. Design the network in such a way that the model is able to give good accuracy on the least number of parameters so that it can fit in the memory of the webcam.

## Conv3D architecture - 

Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

## Final model for gesture recognition

In [5]:

def get_random_affine():
    dx, dy = np.random.randint(-1.7, 1.8, 2)
    M = np.float32([[1, 0, dx], [0, 1, dy]])
    return M

In [6]:
def aug_generator(source_path, folder_list, batch_size):
      
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = [x for x in range(0,nb_frames)] #create a list of image numbers you want to use for a particular video
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,nb_frames,nb_rows,nb_cols,nb_channel)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            
            batch_data_aug = np.zeros((batch_size,nb_frames,nb_rows,nb_cols,nb_channel)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
  
            batch_data_aug2 = np.zeros((batch_size,nb_frames,nb_rows,nb_cols,nb_channel)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug2 = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
 
            batch_data_aug3 = np.zeros((batch_size,nb_frames,nb_rows,nb_cols,nb_channel)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug3 = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
      
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                M = get_random_affine()
                M2 = get_random_affine()
                M3 = get_random_affine()
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = cv2.imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item], cv2.IMREAD_COLOR)
                    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes   
                    resized = cv2.resize(image, (nb_rows,nb_cols), interpolation = cv2.INTER_AREA)
                    batch_data[folder,idx] = resized
                    batch_data_aug[folder,idx] = cv2.warpAffine(resized, M, (resized.shape[0], resized.shape[1]))
                    batch_data_aug2[folder,idx] = cv2.warpAffine(resized, M2, (resized.shape[0], resized.shape[1]))
                    batch_data_aug3[folder,idx] = cv2.warpAffine(resized, M3, (resized.shape[0], resized.shape[1]))
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug2[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug3[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            
            batch_data = np.append(batch_data, batch_data_aug, axis = 0) 
            batch_data = np.append(batch_data, batch_data_aug2, axis = 0) 
            batch_data = np.append(batch_data, batch_data_aug3, axis = 0)
            batch_labels = np.append(batch_labels, batch_labels_aug, axis = 0) 
            batch_labels = np.append(batch_labels, batch_labels_aug2, axis = 0) 
            batch_labels = np.append(batch_labels, batch_labels_aug3, axis = 0)
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        if (len(folder_list) != batch_size*num_batches):
            print("Batch: ",num_batches+1,"Index:", batch_size)
            batch_size = len(folder_list) - (batch_size*num_batches)
            
            batch_data = np.zeros((batch_size,nb_frames,nb_rows,nb_cols,nb_channel)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            
            batch_data_aug = np.zeros((batch_size,nb_frames,nb_rows,nb_cols,nb_channel)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
  
            batch_data_aug2 = np.zeros((batch_size,nb_frames,nb_rows,nb_cols,nb_channel)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug2 = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
   
            batch_data_aug3 = np.zeros((batch_size,nb_frames,nb_rows,nb_cols,nb_channel)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug3 = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
      
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                M = get_random_affine()
                M2 = get_random_affine()
                M3 = get_random_affine()
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = cv2.imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item], cv2.IMREAD_COLOR)
                    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    resized = cv2.resize(image, (nb_rows,nb_cols), interpolation = cv2.INTER_AREA)
                    batch_data[folder,idx] = resized
                    batch_data_aug[folder,idx] = cv2.warpAffine(resized, M, (resized.shape[0], resized.shape[1]))
                    batch_data_aug2[folder,idx] = cv2.warpAffine(resized, M2, (resized.shape[0], resized.shape[1]))
                    batch_data_aug3[folder,idx] = cv2.warpAffine(resized, M3, (resized.shape[0], resized.shape[1]))
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug2[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug3[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            
            batch_data = np.append(batch_data, batch_data_aug, axis = 0) 
            batch_data = np.append(batch_data, batch_data_aug2, axis = 0) 
            batch_data = np.append(batch_data, batch_data_aug3, axis = 0)
            batch_labels = np.append(batch_labels, batch_labels_aug, axis = 0) 
            batch_labels = np.append(batch_labels, batch_labels_aug2, axis = 0) 
            batch_labels = np.append(batch_labels, batch_labels_aug3, axis = 0) 
            yield batch_data, batch_labels

In [7]:
nb_filters = [8,16,32,64]
nb_dense = [1000, 500, 5]
# input_shape = (30, 120, 120, c)

In [8]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, BatchNormalization, Activation, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D, Conv3D, MaxPooling3D, AveragePooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.regularizers import l2
from keras.layers import LSTM, GRU, Bidirectional, SimpleRNN, RNN


nb_frames = 30 # number of frames
nb_rows = 120 # image width
nb_cols = 120 # image height 

nb_classes = 5
nb_channel = 3

input_shape=(30,120,120,3)

# Define model
model = Sequential()

model.add(Conv3D(nb_filters[0], kernel_size=(3,3,3), input_shape=(30,120,120,3),
                 padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(nb_filters[1], kernel_size=(3,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(nb_filters[2], kernel_size=(1,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(nb_filters[3], kernel_size=(1,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))

model.add(MaxPooling3D(pool_size=(2,2,2)))

#Flatten Layers
model.add(Flatten())

model.add(Dense(nb_dense[0], activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(nb_dense[1], activation='relu'))
model.add(Dropout(0.5))

#softmax layer
model.add(Dense(nb_dense[2], activation='softmax'))
model.compile(optimizer="Adam", loss='categorical_crossentropy', metrics=['categorical_accuracy'])
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization (BatchNo (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation (Activation)      (None, 30, 120, 120, 8)   0         
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 15, 60, 60, 8)     0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 15, 60, 60, 16)    3472      
_________________________________________________________________
batch_normalization_1 (Batch (None, 15, 60, 60, 16)    64        
_________________________________________________________________
activation_1 (Activation)    (None, 15, 60, 60, 16)    0

In [9]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
# train_path = 'gdrive/MyDrive/Colab Notebooks/Gesture Recognition/Project_data/train'
# val_path = 'gdrive/MyDrive/Colab Notebooks/Gesture Recognition/Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
batch_size = 10
num_epochs = 20

# training sequences = 663
# validation sequences = 100


In [10]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [11]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, cooldown=1, verbose=1)# write the Reducelronplateau code here
callbacks_list = [checkpoint, LR]

W0329 15:22:30.157113 139998542677824 callbacks.py:1071] `period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.


In [12]:
batch_size = 10
train_generator = aug_generator(train_path, train_doc, batch_size)
val_generator = aug_generator(val_path, val_doc, batch_size)
num_epochs = 20
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

W0329 15:22:36.620363 139998542677824 deprecation.py:323] From <ipython-input-12-6667ca054671>:7: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.


Source path =  Project_data/train ; batch size = 10
Epoch 1/20

Epoch 00001: saving model to model_init_2021-03-2915_22_30.137713/model-00001-2.62928-0.33409-9.47401-0.23000.h5
Epoch 2/20
Epoch 00002: saving model to model_init_2021-03-2915_22_30.137713/model-00002-1.67134-0.34204-1.21487-0.58000.h5
Epoch 3/20
Epoch 00003: saving model to model_init_2021-03-2915_22_30.137713/model-00003-1.49376-0.36567-1.50779-0.34250.h5
Epoch 4/20
Epoch 00004: saving model to model_init_2021-03-2915_22_30.137713/model-00004-1.40351-0.39552-1.13348-0.50000.h5
Epoch 5/20
Epoch 00005: saving model to model_init_2021-03-2915_22_30.137713/model-00005-1.40435-0.40796-1.34120-0.43250.h5
Epoch 6/20
Epoch 00006: saving model to model_init_2021-03-2915_22_30.137713/model-00006-1.29369-0.47761-1.15542-0.47750.h5

Epoch 00006: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 7/20
Epoch 00007: saving model to model_init_2021-03-2915_22_30.137713/model-00007-1.08092-0.55473-1.14964-0.56000.h

<tensorflow.python.keras.callbacks.History at 0x7f53510c8d30>

### Experimentation models

#### Model 2 -
Optimiser - Adam <br>
Grayscale image (ie only one channel is used) is used <br>
Image is resized to the size 160/160
image is min-max normalised to using 95 and 5 percentile instead of max and min <br>
No of frames used per video = 10 (that too the frame from 0 to 31) <br>
Architecture - Architecture 1
batch_size = 16

In [30]:
img_height = 160
img_width = 160
channels = 1
img_idx = np.arange(0,30,3)

def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_height = 160
    img_width = 160
    while True:
        t = np.random.permutation(folder_list)
        # num_batches = # calculate the number of batches
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    image = image.resize((img_height,img_width))
                    image = image.convert('L')
                    image = (image - np.percentile(image,5)) / (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                   
                    image = image.resize((img_height,img_width))
                    image = image.convert('L')
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels



In [31]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 16

# training sequences = 663
# validation sequences = 100
# epochs = 10


In [32]:
model = Sequential()
model.add(Conv3D(16, (3, 3, 3), padding='same',
          input_shape=(len(img_idx),img_height,img_width,channels)))


model.add(Conv3D(64, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(128, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

model.add(Dense(64,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))


# model.add(Dense(num_classes,activation='softmax'))

model.add(Dense(num_classes))
model.add(Activation('softmax'))

In [33]:
optimiser = 'Adam'

# compile it
model.compile(loss='categorical_crossentropy', optimizer=optimiser, metrics=['categorical_accuracy'])

# summary of model
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_10 (Conv3D)           (None, 10, 160, 160, 16)  448       
_________________________________________________________________
conv3d_11 (Conv3D)           (None, 10, 160, 160, 64)  8256      
_________________________________________________________________
activation_10 (Activation)   (None, 10, 160, 160, 64)  0         
_________________________________________________________________
batch_normalization_11 (Batc (None, 10, 160, 160, 64)  256       
_________________________________________________________________
max_pooling3d_8 (MaxPooling3 (None, 5, 80, 80, 64)     0         
_________________________________________________________________
conv3d_12 (Conv3D)           (None, 5, 80, 80, 128)    65664     
_________________________________________________________________
activation_11 (Activation)   (None, 5, 80, 80, 128)   

In [34]:

train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [35]:
model_name = 'model' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', save_freq='epoch')

#LR = # write the REducelronplateau code here
LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

In [36]:

if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [37]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

W0329 12:29:43.155325 139856429156160 deprecation.py:323] From <ipython-input-37-bd77c9c60c14>:3: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.


Source path =  Project_data/train ; batch size = 16
Epoch 1/10

Epoch 00001: saving model to model_2021-03-2912_29_39.544224/model-00001-2.15752-0.25641-13.80201-0.30000.h5
Epoch 2/10
Epoch 00002: saving model to model_2021-03-2912_29_39.544224/model-00002-1.72195-0.36501-28.94143-0.23000.h5
Epoch 3/10
Epoch 00003: saving model to model_2021-03-2912_29_39.544224/model-00003-1.35284-0.47511-62.38010-0.20000.h5
Epoch 4/10
Epoch 00004: saving model to model_2021-03-2912_29_39.544224/model-00004-1.23616-0.52036-65.45608-0.20000.h5
Epoch 5/10
Epoch 00005: saving model to model_2021-03-2912_29_39.544224/model-00005-1.23290-0.50980-12.73339-0.27000.h5
Epoch 6/10
Epoch 00006: saving model to model_2021-03-2912_29_39.544224/model-00006-1.12259-0.55656-11.80759-0.24000.h5
Epoch 7/10
Epoch 00007: saving model to model_2021-03-2912_29_39.544224/model-00007-0.96210-0.64404-11.15958-0.23000.h5
Epoch 8/10
Epoch 00008: saving model to model_2021-03-2912_29_39.544224/model-00008-0.82656-0.68024-1.53739

<tensorflow.python.keras.callbacks.History at 0x7f31c02782e8>

#### Model 3 -
Optimiser - Adam <br>
RGB image (ie all the three channel is used) is used <br>
Image is cropped to the size 160/160 (for the image of 120x160 black padding will be provided)
image is min-max normalised to using 95 and 5 percentile instead of max and min <br>
No of frames used per video = 10 ( the frame from 0 to 30) <br>
Architecture - Architecture 1
batch_size = 16

In [13]:
img_height = 160
img_width = 160
channels = 3
# img_idx = np.arange(10,21,2)
img_idx = np.arange(0,30,3)

def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    # image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    mid_height = image.height//2
                    mid_width = image.width//2
                    
                    image = image.crop((mid_width-80,mid_height-80,mid_width+80,mid_height+80))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                   
                    mid_height = image.height//2
                    mid_width = image.width//2
                    
                    image = image.crop((mid_width-80,mid_height-80,mid_width+80,mid_height+80))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [14]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 16

# training sequences = 663
# validation sequences = 100
# epochs = 10


In [15]:
model = Sequential()
model.add(Conv3D(16, (2, 2, 2), padding='same',
          input_shape=(len(img_idx),img_height,img_width,channels)))


model.add(Conv3D(64, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(128, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

model.add(Dense(64,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))


# model.add(Dense(num_classes,activation='softmax'))

model.add(Dense(num_classes))
model.add(Activation('softmax'))

In [16]:
optimiser = 'Adam'

# compile it
model.compile(loss='categorical_crossentropy', optimizer=optimiser, metrics=['categorical_accuracy'])

# summary of model
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_8 (Conv3D)            (None, 6, 160, 160, 32)   2624      
_________________________________________________________________
activation_9 (Activation)    (None, 6, 160, 160, 32)   0         
_________________________________________________________________
conv3d_9 (Conv3D)            (None, 6, 160, 160, 32)   27680     
_________________________________________________________________
activation_10 (Activation)   (None, 6, 160, 160, 32)   0         
_________________________________________________________________
max_pooling3d_4 (MaxPooling3 (None, 2, 54, 54, 32)     0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 2, 54, 54, 32)     0         
_________________________________________________________________
conv3d_10 (Conv3D)           (None, 2, 54, 54, 64)    

In [17]:

train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [18]:
model_name = 'model' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', save_freq='epoch')

#LR = # write the REducelronplateau code here
LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

In [19]:

if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [20]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 64
Epoch 1/10

Epoch 00001: saving model to model_2021-03-2719_43_37.784431/model-00001-2.43851-0.30015-1.66938-0.31000.h5
Epoch 2/10
Epoch 00002: saving model to model_2021-03-2719_43_37.784431/model-00002-1.59296-0.39517-1.78173-0.23000.h5
Epoch 3/10
Epoch 00003: saving model to model_2021-03-2719_43_37.784431/model-00003-1.36509-0.46908-1.73309-0.29000.h5
Epoch 4/10
Epoch 00004: saving model to model_2021-03-2719_43_37.784431/model-00004-1.14526-0.57617-1.80857-0.30000.h5
Epoch 5/10
Epoch 00005: saving model to model_2021-03-2719_43_37.784431/model-00005-1.00442-0.63198-2.52000-0.21000.h5
Epoch 6/10
Epoch 00006: saving model to model_2021-03-2719_43_37.784431/model-00006-0.84128-0.69382-2.71930-0.18000.h5
Epoch 7/10
Epoch 00007: saving model to model_2021-03-2719_43_37.784431/model-00007-0.70830-0.74811-3.24413-0.26000.h5
Epoch 8/10
Epoch 00008: saving model to model_2021-03-2719_43_37.784431/model-00008-0.53451-0.81448-3.96322-0.1800

<tensorflow.python.keras.callbacks.History at 0x7f3f0418f8d0>

#### Model 4 -
Optimiser - Adam <br>
RGB image (ie all the three channel is used) is used <br>
Image is first resized to the standard size of 120x160 and then cropped to a size of 80x120
image is min-max normalised to using 95 and 5 percentile instead of max and min <br>
No of frames used per video = 10 ( the frame from 0 to 30) <br>
Architecture - Architecture 1
batch_size = 16

In [4]:
img_height = 80
img_width = 120
channels = 3
# img_idx = np.arange(10,21,2)
img_idx = np.arange(0,30,3)


def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    # image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    if image.height==360 and image.width==360:
                        image = image.resize((160,120))
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    mid_height = image.height//2
                    mid_width = image.width//2
                    
                    image = image.crop((mid_width-60,mid_height-40,mid_width+60,mid_height+40))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    if image.height==360 and image.width==360:
                        image = image.resize((160,120))
                    mid_height = image.height//2
                    mid_width = image.width//2
                    
                    image = image.crop((mid_width-60,mid_height-40,mid_width+60,mid_height+40))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [5]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 16

# training sequences = 663
# validation sequences = 100
# epochs = 10


In [7]:
model = Sequential()
model.add(Conv3D(16, (2, 2, 2), padding='same',
          input_shape=(len(img_idx),img_height,img_width,channels)))


model.add(Conv3D(64, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(128, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

model.add(Dense(64,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))


# model.add(Dense(num_classes,activation='softmax'))

model.add(Dense(num_classes))
model.add(Activation('softmax'))

In [8]:
optimiser = 'Adam'

# compile it
model.compile(loss='categorical_crossentropy', optimizer=optimiser, metrics=['categorical_accuracy'])

# summary of model
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_4 (Conv3D)            (None, 6, 80, 120, 32)    2624      
_________________________________________________________________
activation_4 (Activation)    (None, 6, 80, 120, 32)    0         
_________________________________________________________________
conv3d_5 (Conv3D)            (None, 6, 80, 120, 32)    27680     
_________________________________________________________________
activation_5 (Activation)    (None, 6, 80, 120, 32)    0         
_________________________________________________________________
max_pooling3d_2 (MaxPooling3 (None, 2, 27, 40, 32)     0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 2, 27, 40, 32)     0         
_________________________________________________________________
conv3d_6 (Conv3D)            (None, 2, 27, 40, 64)    

In [9]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [10]:
model_name = 'model' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', save_freq='epoch')

#LR = # write the REducelronplateau code here
LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

In [11]:

if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [12]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

W0327 19:38:33.229362 139911367006016 deprecation.py:323] From <ipython-input-12-bd77c9c60c14>:3: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.


Source path =  Project_data/train ; batch size = 64
Epoch 1/10

Epoch 00001: saving model to model_2021-03-2719_33_57.452747/model-00001-2.40011-0.31825-2.28037-0.24000.h5
Epoch 2/10
Epoch 00002: saving model to model_2021-03-2719_33_57.452747/model-00002-1.61950-0.38612-1.65325-0.25000.h5
Epoch 3/10
Epoch 00003: saving model to model_2021-03-2719_33_57.452747/model-00003-1.32212-0.48567-1.96197-0.21000.h5
Epoch 4/10
Epoch 00004: saving model to model_2021-03-2719_33_57.452747/model-00004-1.17540-0.53997-1.88818-0.40000.h5
Epoch 5/10
Epoch 00005: saving model to model_2021-03-2719_33_57.452747/model-00005-1.08628-0.60935-1.87936-0.38000.h5
Epoch 6/10
Epoch 00006: saving model to model_2021-03-2719_33_57.452747/model-00006-0.88605-0.65008-2.26286-0.26000.h5
Epoch 7/10
Epoch 00007: saving model to model_2021-03-2719_33_57.452747/model-00007-0.80909-0.68024-2.04607-0.31000.h5
Epoch 8/10
Epoch 00008: saving model to model_2021-03-2719_33_57.452747/model-00008-0.65599-0.75566-2.52142-0.3000

<tensorflow.python.keras.callbacks.History at 0x7f3f0c042780>

#### Model 5 -
Optimiser - Adam <br>
RGB image (ie all the three channel is used) is used <br>
Image is resized and then cropped to the standard size of 80x120<br>
image is min-max normalised to using 95 and 5 percentile instead of max and min <br>
No of frames used per video = 6 ( the frame from 10 to 20) <br>
Architecture - Architecture 2

In [6]:
img_height = 160
img_width = 160

def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = np.arange(10,21,2)

    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    # image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    if image.height==360 and image.width==360:
                        image = image.resize((160,160))
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    mid_height = image.height//2
                    mid_width = image.width//2
                    
                    image = image.crop((mid_width-80,mid_height-80,mid_width+80,mid_height+80))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                   
                    if image.height==360 and image.width==360:
                        image = image.resize((160,160))
                    mid_height = image.height//2
                    mid_width = image.width//2
                    
                    image = image.crop((mid_width-80,mid_height-80,mid_width+80,mid_height+80))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [7]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 32

# training sequences = 663
# validation sequences = 100
# epochs = 10


In [8]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, BatchNormalization, Activation, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D, Conv3D, MaxPooling3D, AveragePooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.regularizers import l2
from keras.layers import LSTM, GRU, Bidirectional, SimpleRNN, RNN


img_height = 160
img_width = 160
channels = 3

model = Sequential()
model.add(Conv3D(32, (3, 3, 3), padding='same', input_shape=(6,img_height,img_width,channels)))
model.add(Activation('relu'))
model.add(Conv3D(32, (3, 3, 3), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(3, 3, 3),padding='same'))
model.add(Dropout(0.25))

model.add(Conv3D(64, (3, 3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv3D(64, (3, 3, 3), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(3, 3, 3),padding='same'))

model.add(Flatten())
model.add(Dense(512,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

model.add(Dense(num_classes))
model.add(Activation('softmax'))

In [9]:
optimiser = 'Adam'

# compile it
model.compile(loss='categorical_crossentropy', optimizer=optimiser, metrics=['categorical_accuracy'])

# summary of model
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 6, 160, 160, 32)   2624      
_________________________________________________________________
activation (Activation)      (None, 6, 160, 160, 32)   0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 6, 160, 160, 32)   27680     
_________________________________________________________________
activation_1 (Activation)    (None, 6, 160, 160, 32)   0         
_________________________________________________________________
batch_normalization (BatchNo (None, 6, 160, 160, 32)   128       
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 2, 54, 54, 32)     0         
_________________________________________________________________
dropout (Dropout)            (None, 2, 54, 54, 32)     0

In [10]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [11]:
model_name = 'model' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', save_freq='epoch')

#LR = # write the REducelronplateau code here
LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

In [12]:

if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [13]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

W0329 13:06:19.899816 139947362854720 deprecation.py:323] From <ipython-input-13-bd77c9c60c14>:3: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.


Source path =  Project_data/train ; batch size = 32
Epoch 1/10

Epoch 00001: saving model to model_2021-03-2913_05_37.548373/model-00001-2.03669-0.37858-2.10024-0.21000.h5
Epoch 2/10
Epoch 00002: saving model to model_2021-03-2913_05_37.548373/model-00002-1.18269-0.55053-2.78735-0.32000.h5
Epoch 3/10
Epoch 00003: saving model to model_2021-03-2913_05_37.548373/model-00003-0.81399-0.68778-8.17916-0.28000.h5
Epoch 4/10
Epoch 00004: saving model to model_2021-03-2913_05_37.548373/model-00004-0.60898-0.75867-13.53042-0.24000.h5
Epoch 5/10
Epoch 00005: saving model to model_2021-03-2913_05_37.548373/model-00005-0.47789-0.81297-16.15639-0.23000.h5
Epoch 6/10
Epoch 00006: saving model to model_2021-03-2913_05_37.548373/model-00006-0.36451-0.86124-20.69443-0.21000.h5
Epoch 7/10
Epoch 00007: saving model to model_2021-03-2913_05_37.548373/model-00007-0.23929-0.91855-26.54247-0.22000.h5
Epoch 8/10
Epoch 00008: saving model to model_2021-03-2913_05_37.548373/model-00008-0.15373-0.95475-27.51917-0

<tensorflow.python.keras.callbacks.History at 0x7f474c4dbac8>

#### Model 6 -
Optimiser - Adam <br>
RGB image (ie all the three channel) are used <br>
Image is resized to the size 160/160
image is min-max normalised to using 95 and 5 percentile instead of max and min <br>
No of frames used per video = 10 (that too the frame from 0 to 31) <br>
batch size = 16

In [25]:
img_height = 160
img_width = 160
channels = 3
img_idx = np.arange(0,30,1)

def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    image = image.resize((img_height,img_width))
                    image = (image - np.percentile(image,5)) / (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                   
                    image = image.resize((img_height,img_width))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels



In [26]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 16

# training sequences = 663
# validation sequences = 100
# epochs = 10


In [27]:
model = Sequential()

model.add(Conv3D(8, kernel_size=(3,3,3), input_shape=(len(img_idx),img_height,img_width,channels), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(16, kernel_size=(3,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(32, kernel_size=(1,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(64, kernel_size=(1,3,3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

#Flatten Layers
model.add(Flatten())

model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))

#softmax layer
model.add(Dense(5, activation='softmax'))


In [28]:
optimiser = "Adam"
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_12 (Conv3D)           (None, 30, 160, 160, 8)   656       
_________________________________________________________________
batch_normalization_9 (Batch (None, 30, 160, 160, 8)   32        
_________________________________________________________________
activation_13 (Activation)   (None, 30, 160, 160, 8)   0         
_________________________________________________________________
dropout_12 (Dropout)         (None, 30, 160, 160, 8)   0         
_________________________________________________________________
max_pooling3d_10 (MaxPooling (None, 15, 80, 80, 8)     0         
_________________________________________________________________
conv3d_13 (Conv3D)           (None, 15, 80, 80, 16)    3472      
_________________________________________________________________
batch_normalization_10 (Batc (None, 15, 80, 80, 16)   

In [29]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [30]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

W0329 13:18:14.023926 139947362854720 callbacks.py:1071] `period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.


In [31]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [32]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 16
Epoch 1/10

Epoch 00001: saving model to model_init_2021-03-2913_18_11.848691/model-00001-2.51845-0.21719-1.60724-0.21000.h5
Epoch 2/10
Epoch 00002: saving model to model_init_2021-03-2913_18_11.848691/model-00002-1.49911-0.33635-1.58685-0.32000.h5
Epoch 3/10
Epoch 00003: saving model to model_init_2021-03-2913_18_11.848691/model-00003-1.33431-0.43137-1.62208-0.22000.h5
Epoch 4/10
Epoch 00004: saving model to model_init_2021-03-2913_18_11.848691/model-00004-1.15405-0.55807-1.63866-0.23000.h5
Epoch 5/10
Epoch 00005: saving model to model_init_2021-03-2913_18_11.848691/model-00005-0.98203-0.58371-1.73103-0.22000.h5
Epoch 6/10
Epoch 00006: saving model to model_init_2021-03-2913_18_11.848691/model-00006-0.89290-0.64857-1.82701-0.28000.h5
Epoch 7/10
Epoch 00007: saving model to model_init_2021-03-2913_18_11.848691/model-00007-0.85732-0.67421-1.84246-0.27000.h5
Epoch 8/10
Epoch 00008: saving model to model_init_2021-03-2913_18_11.848691/mo

<tensorflow.python.keras.callbacks.History at 0x7f47080ae518>

#### Model 7 -
Optimiser - Adam <br>
RGB image (ie all the three channel) are used <br>
Image is resized to the size 120/120
image is min-max normalised to using 95 and 5 percentile instead of max and min <br>
No of frames used per video = 10 (that too the frame from 0 to 31) <br>
batch size = 16

In [33]:
img_height = 120
img_width = 120
channels = 3
img_idx = np.arange(0,30,1)

def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    image = image.resize((img_height,img_width))
                    image = (image - np.percentile(image,5)) / (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                   
                    image = image.resize((img_height,img_width))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels



In [34]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 16

# training sequences = 663
# validation sequences = 100
# epochs = 10


In [35]:
model = Sequential()

model.add(Conv3D(8, kernel_size=(3,3,3), input_shape=(len(img_idx),img_height,img_width,channels), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(16, kernel_size=(3,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(32, kernel_size=(1,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(64, kernel_size=(1,3,3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

#Flatten Layers
model.add(Flatten())

model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))

#softmax layer
model.add(Dense(5, activation='softmax'))


In [36]:
optimiser = "Adam"
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_16 (Conv3D)           (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_12 (Batc (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_17 (Activation)   (None, 30, 120, 120, 8)   0         
_________________________________________________________________
dropout_18 (Dropout)         (None, 30, 120, 120, 8)   0         
_________________________________________________________________
max_pooling3d_14 (MaxPooling (None, 15, 60, 60, 8)     0         
_________________________________________________________________
conv3d_17 (Conv3D)           (None, 15, 60, 60, 16)    3472      
_________________________________________________________________
batch_normalization_13 (Batc (None, 15, 60, 60, 16)   

In [37]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [38]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

W0329 13:39:40.976706 139947362854720 callbacks.py:1071] `period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.


In [39]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [40]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 16
Epoch 1/10

Epoch 00001: saving model to model_init_2021-03-2913_39_38.561164/model-00001-2.27375-0.22021-1.60772-0.21000.h5
Epoch 2/10
Epoch 00002: saving model to model_init_2021-03-2913_39_38.561164/model-00002-1.58299-0.26998-1.60207-0.20000.h5
Epoch 3/10
Epoch 00003: saving model to model_init_2021-03-2913_39_38.561164/model-00003-1.44586-0.36350-1.59683-0.23000.h5
Epoch 4/10
Epoch 00004: saving model to model_init_2021-03-2913_39_38.561164/model-00004-1.34734-0.44042-1.59678-0.32000.h5
Epoch 5/10
Epoch 00005: saving model to model_init_2021-03-2913_39_38.561164/model-00005-1.20051-0.53695-1.56952-0.29000.h5
Epoch 6/10
Epoch 00006: saving model to model_init_2021-03-2913_39_38.561164/model-00006-1.09715-0.54902-1.54342-0.31000.h5
Epoch 7/10
Epoch 00007: saving model to model_init_2021-03-2913_39_38.561164/model-00007-1.01279-0.61086-1.48458-0.43000.h5
Epoch 8/10
Epoch 00008: saving model to model_init_2021-03-2913_39_38.561164/mo

<tensorflow.python.keras.callbacks.History at 0x7f46ec419a20>

#### Model 8 -
Optimiser - Adam <br>
RGB image (ie all the three channel) are used <br>
Image is cropped to the size 120/120
image is min-max normalised to using 95 and 5 percentile instead of max and min <br>
No of frames used per video = 10 (that too the frame from 0 to 31) <br>
batch size = 16

In [41]:
img_height = 120
img_width = 120
channels = 3
img_idx = np.arange(0,30,1)

def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    mid_height = image.height//2
                    mid_width = image.width//2
                    image = image.crop((mid_width-60,mid_height-60,mid_width+60,mid_height+60))
                    image = (image - np.percentile(image,5)) / (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    mid_height = image.height//2
                    mid_width = image.width//2
                    image = image.crop((mid_width-60,mid_height-60,mid_width+60,mid_height+60))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels



In [42]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 16

# training sequences = 663
# validation sequences = 100
# epochs = 10


In [43]:
model = Sequential()

model.add(Conv3D(8, kernel_size=(3,3,3), input_shape=(len(img_idx),img_height,img_width,channels), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(16, kernel_size=(3,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(32, kernel_size=(1,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(64, kernel_size=(1,3,3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

#Flatten Layers
model.add(Flatten())

model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))

#softmax layer
model.add(Dense(5, activation='softmax'))


In [44]:
optimiser = "Adam"
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_20 (Conv3D)           (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization_15 (Batc (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation_21 (Activation)   (None, 30, 120, 120, 8)   0         
_________________________________________________________________
dropout_24 (Dropout)         (None, 30, 120, 120, 8)   0         
_________________________________________________________________
max_pooling3d_18 (MaxPooling (None, 15, 60, 60, 8)     0         
_________________________________________________________________
conv3d_21 (Conv3D)           (None, 15, 60, 60, 16)    3472      
_________________________________________________________________
batch_normalization_16 (Batc (None, 15, 60, 60, 16)   

In [45]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [46]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

W0329 14:01:18.717895 139947362854720 callbacks.py:1071] `period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.


In [47]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [48]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 16
Epoch 1/10

Epoch 00001: saving model to model_init_2021-03-2914_01_17.209734/model-00001-2.27595-0.23077-1.60952-0.18000.h5
Epoch 2/10
Epoch 00002: saving model to model_init_2021-03-2914_01_17.209734/model-00002-1.58028-0.26998-1.61324-0.19000.h5
Epoch 3/10
Epoch 00003: saving model to model_init_2021-03-2914_01_17.209734/model-00003-1.48608-0.35596-1.61845-0.15000.h5
Epoch 4/10
Epoch 00004: saving model to model_init_2021-03-2914_01_17.209734/model-00004-1.34968-0.44646-1.65570-0.17000.h5
Epoch 5/10
Epoch 00005: saving model to model_init_2021-03-2914_01_17.209734/model-00005-1.29971-0.48718-1.59454-0.27000.h5
Epoch 6/10
Epoch 00006: saving model to model_init_2021-03-2914_01_17.209734/model-00006-1.25443-0.49774-1.59705-0.24000.h5
Epoch 7/10
Epoch 00007: saving model to model_init_2021-03-2914_01_17.209734/model-00007-1.16534-0.54299-1.56821-0.24000.h5
Epoch 8/10
Epoch 00008: saving model to model_init_2021-03-2914_01_17.209734/mo

<tensorflow.python.keras.callbacks.History at 0x7f46d2ef14a8>

### Modle 9

In [5]:

def get_random_affine():
    dx, dy = np.random.randint(-1.7, 1.8, 2)
    M = np.float32([[1, 0, dx], [0, 1, dy]])
    return M

In [6]:
img_height = 120
img_width = 120
channels = 3
img_idx = np.arange(0,30,1) #create a list of image numbers you want to use for a particular video
frame_count = len(img_idx)

def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,frame_count,img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            
            batch_data_aug = np.zeros((batch_size,frame_count,img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
  
            batch_data_aug2 = np.zeros((batch_size,frame_count,img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug2 = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
 
            batch_data_aug3 = np.zeros((batch_size,frame_count,img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug3 = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
      
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                M = get_random_affine()
                M2 = get_random_affine()
                M3 = get_random_affine()
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = cv2.imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item], cv2.IMREAD_COLOR)
                    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
                    
#                     height, width, channel = image.shape
#                     mid_height = height//2
#                     mid_width = width//2
                    
#                     resized = image[mid_height-60:mid_height+60, mid_width-60:mid_width+60]
                    # resized[:,:,0] = image[mid_height-60:mid_height+60, mid_width-60:mid_width+60,0]
                    # resized[:,:,1] = image[mid_height-60:mid_height+60, mid_width-60:mid_width+60,1]
                    # resized[:,:,2] = image[mid_height-60:mid_height+60, mid_width-60:mid_width+60,2]
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes   
                    resized = cv2.resize(image, (120,120), interpolation = cv2.INTER_AREA)
                    batch_data[folder,idx] = resized
                    batch_data_aug[folder,idx] = cv2.warpAffine(resized, M, (resized.shape[0], resized.shape[1]))
                    batch_data_aug2[folder,idx] = cv2.warpAffine(resized, M2, (resized.shape[0], resized.shape[1]))
                    batch_data_aug3[folder,idx] = cv2.warpAffine(resized, M3, (resized.shape[0], resized.shape[1]))
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug2[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug3[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            
            batch_data = np.append(batch_data, batch_data_aug, axis = 0) 
            batch_data = np.append(batch_data, batch_data_aug2, axis = 0) 
            batch_data = np.append(batch_data, batch_data_aug3, axis = 0)
            batch_labels = np.append(batch_labels, batch_labels_aug, axis = 0) 
            batch_labels = np.append(batch_labels, batch_labels_aug2, axis = 0) 
            batch_labels = np.append(batch_labels, batch_labels_aug3, axis = 0)
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        if (len(folder_list) != batch_size*num_batches):
            print("Batch: ",num_batches+1,"Index:", batch_size)
            batch_size = len(folder_list) - (batch_size*num_batches)
            
            batch_data = np.zeros((batch_size,frame_count,img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            
            batch_data_aug = np.zeros((batch_size,frame_count,img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
  
            batch_data_aug2 = np.zeros((batch_size,frame_count,img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug2 = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
   
            batch_data_aug3 = np.zeros((batch_size,frame_count,img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels_aug3 = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
      
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                M = get_random_affine()
                M2 = get_random_affine()
                M3 = get_random_affine()
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = cv2.imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item], cv2.IMREAD_COLOR)
                    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
                    
#                     height, width, channel = image.shape
#                     mid_height = height//2
#                     mid_width = width//2
                    
#                     resized = image[mid_height-60:mid_height+60, mid_width-60:mid_width+60]
                    # resized[:,:,0] = image[mid_height-60:mid_height+60, mid_width-60:mid_width+60,0]
                    # resized[:,:,1] = image[mid_height-60:mid_height+60, mid_width-60:mid_width+60,1]
                    # resized[:,:,2] = image[mid_height-60:mid_height+60, mid_width-60:mid_width+60,2]
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes   
                    resized = cv2.resize(image, (120,120), interpolation = cv2.INTER_AREA)
                    batch_data[folder,idx] = resized
                    batch_data_aug[folder,idx] = cv2.warpAffine(resized, M, (resized.shape[0], resized.shape[1]))
                    batch_data_aug2[folder,idx] = cv2.warpAffine(resized, M2, (resized.shape[0], resized.shape[1]))
                    batch_data_aug3[folder,idx] = cv2.warpAffine(resized, M3, (resized.shape[0], resized.shape[1]))
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug2[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
                batch_labels_aug3[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            
            batch_data = np.append(batch_data, batch_data_aug, axis = 0) 
            batch_data = np.append(batch_data, batch_data_aug2, axis = 0) 
            batch_data = np.append(batch_data, batch_data_aug3, axis = 0)
            batch_labels = np.append(batch_labels, batch_labels_aug, axis = 0) 
            batch_labels = np.append(batch_labels, batch_labels_aug2, axis = 0) 
            batch_labels = np.append(batch_labels, batch_labels_aug3, axis = 0) 
            yield batch_data, batch_labels

In [7]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 20 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 16

# training sequences = 663
# validation sequences = 100
# epochs = 20


In [8]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, BatchNormalization, Activation, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D, Conv3D, MaxPooling3D, AveragePooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.regularizers import l2
from keras.layers import LSTM, GRU, Bidirectional, SimpleRNN, RNN


model = Sequential()

model.add(Conv3D(8, kernel_size=(3,3,3), input_shape=(len(img_idx),img_height,img_width,channels), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(16, kernel_size=(3,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(32, kernel_size=(1,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(64, kernel_size=(1,3,3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

#Flatten Layers
model.add(Flatten())

model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))

#softmax layer
model.add(Dense(5, activation='softmax'))


In [9]:
optimiser = "Adam"
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 30, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization (BatchNo (None, 30, 120, 120, 8)   32        
_________________________________________________________________
activation (Activation)      (None, 30, 120, 120, 8)   0         
_________________________________________________________________
dropout (Dropout)            (None, 30, 120, 120, 8)   0         
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 15, 60, 60, 8)     0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 15, 60, 60, 16)    3472      
_________________________________________________________________
batch_normalization_1 (Batch (None, 15, 60, 60, 16)    6

In [10]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [11]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

W0329 14:37:36.719548 140555769276224 callbacks.py:1071] `period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.


In [12]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [13]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 16
Epoch 1/20
Batch:  7 Index: 16

Epoch 00001: saving model to model_init_2021-03-2914_37_17.500815/model-00001-1.96327-0.22926-1.59864-0.23000.h5
Epoch 2/20
Epoch 00002: saving model to model_init_2021-03-2914_37_17.500815/model-00002-1.53325-0.31293-1.61987-0.14286.h5
Epoch 3/20
Epoch 00003: saving model to model_init_2021-03-2914_37_17.500815/model-00003-1.39825-0.39966-1.53328-0.17857.h5
Epoch 4/20
 9/42 [=====>........................] - ETA: 42s - loss: 1.4433 - categorical_accuracy: 0.3889Batch:  95 Index: 7
Epoch 00004: saving model to model_init_2021-03-2914_37_17.500815/model-00004-1.36698-0.44891-1.36111-0.50000.h5
Epoch 5/20
Epoch 00005: saving model to model_init_2021-03-2914_37_17.500815/model-00005-1.26427-0.46071-1.32020-0.64286.h5
Epoch 6/20
Epoch 00006: saving model to model_init_2021-03-2914_37_17.500815/model-00006-1.08792-0.57024-1.41525-0.39286.h5
Epoch 7/20
Epoch 00007: saving model to model_init_2021-03-2914_37_1

<tensorflow.python.keras.callbacks.History at 0x7fd50e5c4588>

##### Model 10

conv3d without dropout
in color image with size 80x120 but cropped images after resizing
frames used - 6
image is normalised to the (95-5)

In [4]:
def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = np.arange(10,21,2)
    img_height = 160
    img_width = 160
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    # image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    if image.height==360 and image.width==360:
                        image = image.resize((160,160))
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    mid_height = image.height//2
                    mid_width = image.width//2
                    
                    image = image.crop((mid_width-80,mid_height-80,mid_width+80,mid_height+80))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                   
                    if image.height==360 and image.width==360:
                        image = image.resize((160,160))
                    mid_height = image.height//2
                    mid_width = image.width//2
                    
                    image = image.crop((mid_width-80,mid_height-80,mid_width+80,mid_height+80))
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [5]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 32

# training sequences = 663
# validation sequences = 100
# epochs = 10


In [6]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, BatchNormalization, Activation, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D, Conv3D, MaxPooling3D, AveragePooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.regularizers import l2
from keras.layers import LSTM, GRU, Bidirectional, SimpleRNN, RNN


img_height = 160
img_width = 160
channels = 3

model = Sequential()
model.add(Conv3D(32, (3, 3, 3), padding='same', input_shape=(6,img_height,img_width,channels)))
model.add(Activation('relu'))
model.add(Conv3D(32, (3, 3, 3), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(3, 3, 3),padding='same'))
# model.add(Dropout(0.25))

model.add(Conv3D(64, (3, 3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv3D(64, (3, 3, 3), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(3, 3, 3),padding='same'))

model.add(Flatten())
model.add(Dense(512,activation='relu'))
model.add(BatchNormalization())
# model.add(Dropout(0.5))

model.add(Dense(num_classes))
model.add(Activation('softmax'))

In [7]:
optimiser = 'Adam'

# compile it
model.compile(loss='categorical_crossentropy', optimizer=optimiser, metrics=['categorical_accuracy'])

# summary of model
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 6, 160, 160, 32)   2624      
_________________________________________________________________
activation (Activation)      (None, 6, 160, 160, 32)   0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 6, 160, 160, 32)   27680     
_________________________________________________________________
activation_1 (Activation)    (None, 6, 160, 160, 32)   0         
_________________________________________________________________
batch_normalization (BatchNo (None, 6, 160, 160, 32)   128       
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 2, 54, 54, 32)     0         
_________________________________________________________________
conv3d_2 (Conv3D)            (None, 2, 54, 54, 64)     5

In [8]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [9]:
model_name = 'model' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', save_freq='epoch')

#LR = # write the REducelronplateau code here
LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

In [10]:

if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [11]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

W0328 06:18:09.168770 139986328037184 deprecation.py:323] From <ipython-input-11-bd77c9c60c14>:3: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.


Source path =  Project_data/train ; batch size = 32
Epoch 1/10

Epoch 00001: saving model to model_2021-03-2806_18_05.764754/model-00001-2.11729-0.40573-1.86506-0.25000.h5
Epoch 2/10
Epoch 00002: saving model to model_2021-03-2806_18_05.764754/model-00002-0.85158-0.67421-2.41959-0.27000.h5
Epoch 3/10
Epoch 00003: saving model to model_2021-03-2806_18_05.764754/model-00003-0.53699-0.80694-7.18828-0.22000.h5
Epoch 4/10
Epoch 00004: saving model to model_2021-03-2806_18_05.764754/model-00004-0.32116-0.89593-9.36341-0.25000.h5
Epoch 5/10
Epoch 00005: saving model to model_2021-03-2806_18_05.764754/model-00005-0.25665-0.91554-11.95388-0.22000.h5
Epoch 6/10
Epoch 00006: saving model to model_2021-03-2806_18_05.764754/model-00006-0.13285-0.96229-15.46944-0.24000.h5
Epoch 7/10
Epoch 00007: saving model to model_2021-03-2806_18_05.764754/model-00007-0.07733-0.98341-17.03047-0.21000.h5
Epoch 8/10
Epoch 00008: saving model to model_2021-03-2806_18_05.764754/model-00008-0.05902-0.98341-19.67587-0.

<tensorflow.python.keras.callbacks.History at 0x7f50804a2748>

#### Model 11 -
Optimiser - Adam <br>
RGB image (ie all the three channel is used) is used <br>
Image is first cropped to the standard size of 120x120<br>
image is min-max normalised to using 95 and 5 percentile instead of max and min <br>
No of frames used per video = 10 ( the frame from 10 to 20) <br>
Filter size - (3x3x3) <br>
With 3 dropouts each of value 0.5

In [12]:
img_height = 120
img_width = 120
channels = 3
# img_idx = np.arange(10,21,2)
img_idx = np.arange(10,20,1)


def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    # image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])

                    mid_height = image.height//2
                    mid_width = image.width//2
                    image = image.crop((mid_width-60,mid_height-60,mid_width+60,mid_height+60))
                    # image = image.resize((160,160))
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    mid_height = image.height//2
                    mid_width = image.width//2
                    image = image.crop((mid_width-60,mid_height-60,mid_width+60,mid_height+60))
                    # image = image.resize((160,160))
                    
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [13]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
# train_path = 'gdrive/MyDrive/Colab Notebooks/Gesture Recognition/Project_data/train'
# val_path = 'gdrive/MyDrive/Colab Notebooks/Gesture Recognition/Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 16

# training sequences = 663
# validation sequences = 100
# epochs = 10


In [20]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, BatchNormalization, Activation, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D, Conv3D, MaxPooling3D, AveragePooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.regularizers import l2
from keras.layers import LSTM, GRU, Bidirectional, SimpleRNN, RNN


model = Sequential()
model.add(Conv3D(32, (3, 3, 3), padding='same', input_shape=(len(img_idx),img_height,img_width,channels)))
model.add(Activation('relu'))
model.add(Conv3D(32, (3, 3, 3), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(3, 3, 3),padding='same'))
model.add(Dropout(0.5))

model.add(Conv3D(64, (3, 3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv3D(64, (3, 3, 3), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(3, 3, 3),padding='same'))
model.add(Dropout(0.5))

model.add(Flatten())
model.add(Dense(64,activation='relu',kernel_regularizer=l2(0.01)))
model.add(BatchNormalization())
model.add(Dropout(0.5))

# # model.add(Dense(64,activation='relu',kernel_regularizer=l2(0.01)))
# model.add(Dense(64,activation='relu'))
# model.add(BatchNormalization())
# model.add(Dropout(0.25))



model.add(Dense(num_classes))
model.add(Activation('softmax'))

In [21]:
optimiser = "Adam"
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_8 (Conv3D)            (None, 10, 120, 120, 32)  2624      
_________________________________________________________________
activation_10 (Activation)   (None, 10, 120, 120, 32)  0         
_________________________________________________________________
conv3d_9 (Conv3D)            (None, 10, 120, 120, 32)  27680     
_________________________________________________________________
activation_11 (Activation)   (None, 10, 120, 120, 32)  0         
_________________________________________________________________
batch_normalization_6 (Batch (None, 10, 120, 120, 32)  128       
_________________________________________________________________
max_pooling3d_4 (MaxPooling3 (None, 4, 40, 40, 32)     0         
_________________________________________________________________
dropout_6 (Dropout)          (None, 4, 40, 40, 32)    

In [22]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [23]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

W0328 12:36:38.387002 140095317186368 callbacks.py:1071] `period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.


In [24]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [25]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 16
Epoch 1/10

Epoch 00001: saving model to model_init_2021-03-2812_28_36.234442/model-00001-3.57962-0.25339-3.17095-0.25000.h5
Epoch 2/10
Epoch 00002: saving model to model_init_2021-03-2812_28_36.234442/model-00002-3.05909-0.41327-2.95739-0.37000.h5
Epoch 3/10
Epoch 00003: saving model to model_init_2021-03-2812_28_36.234442/model-00003-2.75783-0.44193-3.53797-0.30000.h5
Epoch 4/10
Epoch 00004: saving model to model_init_2021-03-2812_28_36.234442/model-00004-2.53087-0.53092-3.47905-0.28000.h5
Epoch 5/10
Epoch 00005: saving model to model_init_2021-03-2812_28_36.234442/model-00005-2.45669-0.56410-4.15611-0.23000.h5
Epoch 6/10
Epoch 00006: saving model to model_init_2021-03-2812_28_36.234442/model-00006-2.24737-0.64253-8.89534-0.21000.h5
Epoch 7/10
Epoch 00007: saving model to model_init_2021-03-2812_28_36.234442/model-00007-2.06527-0.69985-4.03745-0.26000.h5
Epoch 8/10
Epoch 00008: saving model to model_init_2021-03-2812_28_36.234442/mo

<tensorflow.python.keras.callbacks.History at 0x7f69d0463208>

#### Model 12 -
Optimiser - Adam <br>
RGB image (ie all the three channel is used) is used <br>
Image is first cropped to the standard size of 120x120<br>
image is min-max normalised to using 95 and 5 percentile instead of max and min <br>
No of frames used per video = 10 ( the frame from 5 to 25) <br>
Filter size - (3x3x3) <br>
With 2 0.5 dropouts and 1 L2 regularizer

In [34]:
img_height = 120
img_width = 120
channels = 3
# img_idx = np.arange(10,21,2)
img_idx = np.arange(5,25,1)


def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    # image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])

                    mid_height = image.height//2
                    mid_width = image.width//2
                    image = image.crop((mid_width-60,mid_height-60,mid_width+60,mid_height+60))
                    # image = image.resize((160,160))
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    mid_height = image.height//2
                    mid_width = image.width//2
                    image = image.crop((mid_width-60,mid_height-60,mid_width+60,mid_height+60))
                    # image = image.resize((160,160))
                    
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [35]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
# train_path = 'gdrive/MyDrive/Colab Notebooks/Gesture Recognition/Project_data/train'
# val_path = 'gdrive/MyDrive/Colab Notebooks/Gesture Recognition/Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 10 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 16

# training sequences = 663
# validation sequences = 100
# epochs = 10


In [36]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, BatchNormalization, Activation, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D, Conv3D, MaxPooling3D, AveragePooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.regularizers import l2
from keras.layers import LSTM, GRU, Bidirectional, SimpleRNN, RNN


model = Sequential()
model.add(Conv3D(32, (3, 3, 3), padding='same', input_shape=(len(img_idx),img_height,img_width,channels)))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(3, 3, 3),padding='same'))
model.add(Dropout(0.5))

model.add(Conv3D(64, (3, 3, 3), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(3, 3, 3),padding='same'))
model.add(Dropout(0.5))

model.add(Flatten())
model.add(Dense(64,activation='relu',kernel_regularizer=l2(0.01)))
model.add(BatchNormalization())
model.add(Dropout(0.5))


model.add(Dense(num_classes))
model.add(Activation('softmax'))

In [37]:
optimiser = "Adam"
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_14 (Conv3D)           (None, 20, 120, 120, 32)  2624      
_________________________________________________________________
activation_18 (Activation)   (None, 20, 120, 120, 32)  0         
_________________________________________________________________
batch_normalization_12 (Batc (None, 20, 120, 120, 32)  128       
_________________________________________________________________
max_pooling3d_8 (MaxPooling3 (None, 7, 40, 40, 32)     0         
_________________________________________________________________
dropout_12 (Dropout)         (None, 7, 40, 40, 32)     0         
_________________________________________________________________
conv3d_15 (Conv3D)           (None, 7, 40, 40, 64)     55360     
_________________________________________________________________
activation_19 (Activation)   (None, 7, 40, 40, 64)    

In [38]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [39]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

W0328 13:00:47.198309 140095317186368 callbacks.py:1071] `period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.


In [40]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [41]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 16
Epoch 1/10

Epoch 00001: saving model to model_init_2021-03-2813_00_46.057130/model-00001-3.52219-0.30015-5.63051-0.16000.h5
Epoch 2/10
Epoch 00002: saving model to model_init_2021-03-2813_00_46.057130/model-00002-2.90198-0.48718-18.33045-0.16000.h5
Epoch 3/10
Epoch 00003: saving model to model_init_2021-03-2813_00_46.057130/model-00003-2.47578-0.60935-22.50764-0.17000.h5
Epoch 4/10
Epoch 00004: saving model to model_init_2021-03-2813_00_46.057130/model-00004-2.26436-0.66968-26.32201-0.13000.h5
Epoch 5/10
Epoch 00005: saving model to model_init_2021-03-2813_00_46.057130/model-00005-2.04425-0.73002-21.73462-0.18000.h5
Epoch 6/10
Epoch 00006: saving model to model_init_2021-03-2813_00_46.057130/model-00006-1.97948-0.74962-15.33971-0.18000.h5
Epoch 7/10
Epoch 00007: saving model to model_init_2021-03-2813_00_46.057130/model-00007-1.82437-0.77979-26.39358-0.16000.h5
Epoch 8/10
Epoch 00008: saving model to model_init_2021-03-2813_00_46.057

<tensorflow.python.keras.callbacks.History at 0x7f69d04c16a0>

#### Model 13 -
Optimiser - Adam <br>
RGB image (ie all the three channel is used) is used <br>
Image is first cropped to the standard size of 120x120<br>
image is min-max normalised to using 95 and 5 percentile instead of max and min <br>
No of frames used per video = 10 ( the frame from 5 to 25) <br>
Filter size - (3x3x3) <br>
With 2 0.5 dropouts and 1 L2 regularizer

In [4]:
img_height = 120
img_width = 120
channels = 3
# img_idx = np.arange(10,21,2)
img_idx = np.arange(5,25,1)


def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list)//batch_size
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    # image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])

                    mid_height = image.height//2
                    mid_width = image.width//2
                    image = image.crop((mid_width-60,mid_height-60,mid_width+60,mid_height+60))
                    # image = image.resize((160,160))
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        addnl_image_count = len(folder_list) % batch_size
        batch = batch + 1
        if(addnl_image_count!=0):
            batch_data = np.zeros((addnl_image_count,len(img_idx),img_height,img_width,channels)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((addnl_image_count,5)) # batch_labels is the one hot representation of the output
            for folder in range(addnl_image_count): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = Image.open(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item])
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    mid_height = image.height//2
                    mid_width = image.width//2
                    image = image.crop((mid_width-60,mid_height-60,mid_width+60,mid_height+60))
                    # image = image.resize((160,160))
                    
                    image = (image - np.percentile(image,5))/ (np.percentile(image,95) - np.percentile(image,5))
                    batch_data[folder,idx,:,:,0] = image[:,:,0]
                    batch_data[folder,idx,:,:,1] = image[:,:,1]
                    batch_data[folder,idx,:,:,2] = image[:,:,2]
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

In [5]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
# train_path = 'gdrive/MyDrive/Colab Notebooks/Gesture Recognition/Project_data/train'
# val_path = 'gdrive/MyDrive/Colab Notebooks/Gesture Recognition/Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 25 # choose the number of epochs
print ('# epochs =', num_epochs)
num_classes = 5
batch_size = 64

# training sequences = 663
# validation sequences = 100
# epochs = 25


In [6]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, BatchNormalization, Activation, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D, Conv3D, MaxPooling3D, AveragePooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.regularizers import l2
from keras.layers import LSTM, GRU, Bidirectional, SimpleRNN, RNN


model = Sequential()

model.add(Conv3D(8, kernel_size=(3,3,3), input_shape=(len(img_idx),img_height,img_width,channels), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(16, kernel_size=(3,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(32, kernel_size=(1,3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

model.add(Conv3D(64, kernel_size=(1,3,3), padding='same'))
model.add(Activation('relu'))
model.add(Dropout(0.25))

model.add(MaxPooling3D(pool_size=(2,2,2)))

#Flatten Layers
model.add(Flatten())

model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))

#softmax layer
model.add(Dense(5, activation='softmax'))

In [7]:
optimiser = "Adam"
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 20, 120, 120, 8)   656       
_________________________________________________________________
batch_normalization (BatchNo (None, 20, 120, 120, 8)   32        
_________________________________________________________________
activation (Activation)      (None, 20, 120, 120, 8)   0         
_________________________________________________________________
dropout (Dropout)            (None, 20, 120, 120, 8)   0         
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 10, 60, 60, 8)     0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 10, 60, 60, 16)    3472      
_________________________________________________________________
batch_normalization_1 (Batch (None, 10, 60, 60, 16)    6

In [8]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [9]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.01)
callbacks_list = [checkpoint, LR]

W0328 14:09:24.889477 140539319473984 callbacks.py:1071] `period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.


In [10]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [11]:
model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/train ; batch size = 64
Epoch 1/25

Epoch 00001: saving model to model_init_2021-03-2814_09_04.469840/model-00001-2.91797-0.20362-1.60787-0.25000.h5
Epoch 2/25
Epoch 00002: saving model to model_init_2021-03-2814_09_04.469840/model-00002-1.61418-0.22926-1.60825-0.22000.h5
Epoch 3/25
Epoch 00003: saving model to model_init_2021-03-2814_09_04.469840/model-00003-1.59195-0.24284-1.60901-0.17000.h5
Epoch 4/25
Epoch 00004: saving model to model_init_2021-03-2814_09_04.469840/model-00004-1.56934-0.25943-1.60855-0.19000.h5
Epoch 5/25
Epoch 00005: saving model to model_init_2021-03-2814_09_04.469840/model-00005-1.53291-0.34842-1.60705-0.27000.h5
Epoch 6/25
Epoch 00006: saving model to model_init_2021-03-2814_09_04.469840/model-00006-1.46643-0.35445-1.59981-0.30000.h5
Epoch 7/25
Epoch 00007: saving model to model_init_2021-03-2814_09_04.469840/model-00007-1.36618-0.43590-1.59332-0.28000.h5
Epoch 8/25
Epoch 00008: saving model to model_init_2021-03-2814_09_04.469840/mo

Epoch 22/25
Epoch 00022: saving model to model_init_2021-03-2814_09_04.469840/model-00022-0.44306-0.82353-1.73488-0.30000.h5
Epoch 23/25
Epoch 00023: saving model to model_init_2021-03-2814_09_04.469840/model-00023-0.35028-0.88386-1.72763-0.31000.h5
Epoch 24/25
Epoch 00024: saving model to model_init_2021-03-2814_09_04.469840/model-00024-0.35105-0.85370-1.64928-0.28000.h5
Epoch 25/25
Epoch 00025: saving model to model_init_2021-03-2814_09_04.469840/model-00025-0.31942-0.88688-1.74837-0.30000.h5


<tensorflow.python.keras.callbacks.History at 0x7fd1380aaeb8>