# Gesture Recognition
In this group project, you are going to build a 3D Conv model that will be able to predict the 5 gestures correctly. Please import the following libraries to get started. Once you have completed the code you can download the notebook for making a submission.

In [1]:
import numpy as np
import os
from imageio import imread
from skimage.transform import resize
import datetime
import os

We set the random seed so that the results don't vary drastically.

In [2]:
np.random.seed(30)
import random as rn
rn.seed(30)
from tensorflow import keras
import tensorflow as tf
tf.random.set_seed(30)

In this block, you read the folder names for training and validation. You also set the `batch_size` here. Note that you set the batch size in such a way that you are able to use the GPU in full capacity. You keep increasing the batch size until the machine throws an error.

**data path: /home/datasets/Project_data**

In [3]:
train_doc = np.random.permutation(open('/home/datasets/Project_data/train.csv').readlines())
val_doc = np.random.permutation(open('/home/datasets/Project_data/val.csv').readlines())
batch_size = 32

## Generator
This is one of the most important part of the code. The overall structure of the generator has been given. In the generator, you are going to preprocess the images as you have images of 2 different dimensions as well as create a batch of video frames. You have to experiment with `img_idx`, `y`,`z` and normalization such that you get high accuracy.

In [7]:
def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = [2,4,6,8,10,12,14,16,18,20,22,24,26]
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(t) // batch_size # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,13,80,80,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    image = resize(image,(80,80)).astype(np.float32)
                    
                    batch_data[folder,idx,:,:,0] = (image[:,:,0])/255 #normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (image[:,:,1])/255 #normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (image[:,:,2])/255 #normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        if len(t) % batch_size != 0:
            batch_data = np.zeros((batch_size,13,80,80,3)) 
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    image = resize(image,(80,80)).astype(np.float32)
                    batch_data[folder,idx,:,:,0] = (image[:,:,0])/255
                    batch_data[folder,idx,:,:,1] = (image[:,:,1])/255
                    batch_data[folder,idx,:,:,2] = (image[:,:,2])/255
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1 
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield does  


Note here that a video is represented above in the generator as (number of images, height, width, number of channels). Take this into consideration while creating the model architecture.

In [4]:
curr_dt_time = datetime.datetime.now()
train_path = '/home/datasets/Project_data/train'
val_path = '/home/datasets/Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 20 # choose the number of epochs
print ('# epochs =', num_epochs)

# training sequences = 663
# validation sequences = 100
# epochs = 20


Here you make the model using different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D` and not `Conv2D` and `Maxpooling2D` for a 3D convolution model. You would want to use `TimeDistributed` while building a Conv2D + RNN model. Also remember that the last layer is the softmax. Design the network in such a way that the model is able to give good accuracy on the least number of parameters so that it can fit in the memory of the webcam.

In [5]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, GRU, Dropout, Flatten, BatchNormalization, Activation, Conv3D, MaxPooling3D
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras import optimizers
from keras.layers import Dropout
from tensorflow.keras.layers import Conv2D, MaxPooling2D, TimeDistributed, LSTM, Dense, Flatten, Dropout,GlobalAveragePooling2D
from tensorflow.keras.layers import Conv2D, MaxPooling2D, TimeDistributed, GRU, Dense, Dropout, Flatten
from tensorflow.keras.layers import BatchNormalization

#### First Model-Conv3D simple network

In [38]:
#write your model here
model = Sequential()
model.add(Conv3D(32, (3, 3, 3), padding='same',
         input_shape=(13,80,80,3)))
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(64, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Flatten())
model.add(Dense(32,activation='relu'))


model.add(Dense(5,activation='softmax'))

Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

In [39]:
optimiser = 'adam' #write your optimizer''
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv3d_12 (Conv3D)          (None, 13, 80, 80, 32)    2624      
                                                                 
 activation_12 (Activation)  (None, 13, 80, 80, 32)    0         
                                                                 
 max_pooling3d_12 (MaxPoolin  (None, 6, 40, 40, 32)    0         
 g3D)                                                            
                                                                 
 conv3d_13 (Conv3D)          (None, 6, 40, 40, 64)     16448     
                                                                 
 activation_13 (Activation)  (None, 6, 40, 40, 64)     0         
                                                                 
 max_pooling3d_13 (MaxPoolin  (None, 3, 20, 20, 64)    0         
 g3D)                                                 

Let us create the `train_generator` and the `val_generator` which will be used in `.fit_generator`.

In [40]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [41]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto',save_freq = 'epoch')

LR =ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1, patience=4) # write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

The `steps_per_epoch` and `validation_steps` are used by `fit` method to decide the number of next() calls it need to make.

In [42]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

Let us now fit the model. This will start training the model and with the help of the checkpoints, you'll be able to save the model at the end of each epoch.

In [43]:
history=model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  /home/datasets/Project_data/train ; batch size = 32
Epoch 1/20


2025-03-03 17:08:32.979381: I tensorflow/stream_executor/cuda/cuda_dnn.cc:377] Loaded cuDNN version 8302



Epoch 00001: saving model to model_init_2025-03-0317_07_55.516965/model-00001-2.21980-0.20387-1.57967-0.20312.h5
Epoch 2/20
Epoch 00002: saving model to model_init_2025-03-0317_07_55.516965/model-00002-1.54695-0.24107-1.50708-0.28906.h5
Epoch 3/20
Epoch 00003: saving model to model_init_2025-03-0317_07_55.516965/model-00003-1.44469-0.28720-1.29351-0.36719.h5
Epoch 4/20
Epoch 00004: saving model to model_init_2025-03-0317_07_55.516965/model-00004-1.35834-0.33929-1.32331-0.34375.h5
Epoch 5/20
Epoch 00005: saving model to model_init_2025-03-0317_07_55.516965/model-00005-1.26310-0.35119-1.26534-0.35156.h5
Epoch 6/20
Epoch 00006: saving model to model_init_2025-03-0317_07_55.516965/model-00006-1.19114-0.40476-1.25674-0.45312.h5
Epoch 7/20
Epoch 00007: saving model to model_init_2025-03-0317_07_55.516965/model-00007-1.09243-0.47619-1.17562-0.37500.h5
Epoch 8/20
Epoch 00008: saving model to model_init_2025-03-0317_07_55.516965/model-00008-1.01823-0.48363-1.12114-0.47656.h5
Epoch 9/20
Epoch 0

Overfitting: The difference between training and validation accuracy tends to increase as the epochs progress (e.g., from Epoch 10 onwards). This suggests that the model may be overfitting to the training data, as it performs significantly better on the training set compared to the validation set.

#### Second Model-Conv3D
Adding Dropouts at dense layer , adding another Conv3D layer to see if any increase in the Train and Validation accuracy

In [7]:
#write your model here
model = Sequential()
model.add(Conv3D(32, (3, 3, 3), padding='same',
         input_shape=(13,80,80,3)))
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(64, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(128, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(64,activation='relu'))
model.add(Dropout(0.25))

model.add(Dense(5,activation='softmax'))

2025-03-04 07:09:07.551002: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2025-03-04 07:09:07.551079: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14800 MB memory:  -> device: 0, name: Quadro RTX 5000, pci bus id: 0000:1c:00.0, compute capability: 7.5


In [11]:
optimiser = 'adam' #write your optimizer''
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv3d (Conv3D)             (None, 13, 80, 80, 32)    2624      
                                                                 
 activation (Activation)     (None, 13, 80, 80, 32)    0         
                                                                 
 max_pooling3d (MaxPooling3D  (None, 6, 40, 40, 32)    0         
 )                                                               
                                                                 
 conv3d_1 (Conv3D)           (None, 6, 40, 40, 64)     16448     
                                                                 
 activation_1 (Activation)   (None, 6, 40, 40, 64)     0         
                                                                 
 max_pooling3d_1 (MaxPooling  (None, 3, 20, 20, 64)    0         
 3D)                                                    

In [46]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [12]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto',save_freq = 'epoch')

LR =ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1, patience=4) # write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

In [9]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [49]:
history2=model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  /home/datasets/Project_data/train ; batch size = 32
Epoch 1/20

Epoch 00001: saving model to model_init_2025-03-0317_07_55.516965/model-00001-1.62840-0.18750-1.59943-0.24219.h5
Epoch 2/20
Epoch 00002: saving model to model_init_2025-03-0317_07_55.516965/model-00002-1.59026-0.22917-1.53570-0.24219.h5
Epoch 3/20
Epoch 00003: saving model to model_init_2025-03-0317_07_55.516965/model-00003-1.52280-0.27083-1.34918-0.30469.h5
Epoch 4/20
Epoch 00004: saving model to model_init_2025-03-0317_07_55.516965/model-00004-1.44189-0.35565-1.43476-0.31250.h5
Epoch 5/20
Epoch 00005: saving model to model_init_2025-03-0317_07_55.516965/model-00005-1.38239-0.38988-1.19910-0.44531.h5
Epoch 6/20
Epoch 00006: saving model to model_init_2025-03-0317_07_55.516965/model-00006-1.27831-0.40923-1.22349-0.46875.h5
Epoch 7/20
Epoch 00007: saving model to model_init_2025-03-0317_07_55.516965/model-00007-1.25523-0.41964-1.12269-0.59375.h5
Epoch 8/20
Epoch 00008: saving model to model_init_2025-03-0317_

#### Summary
Overfitting:

The difference between training and validation accuracy is relatively small in the early epochs but grows larger in later epochs (e.g., Epoch 20 has a difference of 0.1082).

This suggests that the model may be starting to overfit to the training data, as it performs significantly better on the training set compared to the validation set in later epochs.

Validation Accuracy Fluctuations:
Validation accuracy fluctuates slightly across epochs (e.g., drops at Epoch 17 and Epoch 20), which could indicate instability in the model's generalization performance.

#### Third Model-Conv3D
Adding more epochs may give the stable validation accuracy. Before doing that we would try what effect batch_size of 64 have on the model. For this model we will try batch_size 64. Used optimiser "SGD"

In [24]:
new_batch_size = 64
train_generator = generator(train_path, train_doc, new_batch_size)
val_generator = generator(val_path, val_doc, new_batch_size)

model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', save_freq='epoch')
LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1, patience=4)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

optimiser = 'sgd'
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv3d_3 (Conv3D)           (None, 13, 80, 80, 32)    2624      
                                                                 
 activation_3 (Activation)   (None, 13, 80, 80, 32)    0         
                                                                 
 max_pooling3d_3 (MaxPooling  (None, 6, 40, 40, 32)    0         
 3D)                                                             
                                                                 
 conv3d_4 (Conv3D)           (None, 6, 40, 40, 64)     16448     
                                                                 
 activation_4 (Activation)   (None, 6, 40, 40, 64)     0         
                                                                 
 max_pooling3d_4 (MaxPooling  (None, 3, 20, 20, 64)    0         
 3D)                                                  

In [25]:
num_epochs2=30
history3 = model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs2, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

  history3 = model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs2, verbose=1,


Source path =  /home/datasets/Project_data/train ; batch size = 64
Epoch 1/30

Epoch 00001: saving model to model_init_2025-03-0404_25_08.943004/model-00001-1.61226-0.19866-1.60078-0.23438.h5
Epoch 2/30
Epoch 00002: saving model to model_init_2025-03-0404_25_08.943004/model-00002-1.60552-0.21131-1.59640-0.26172.h5
Epoch 3/30
Epoch 00003: saving model to model_init_2025-03-0404_25_08.943004/model-00003-1.59958-0.24405-1.59530-0.21875.h5
Epoch 4/30
Epoch 00004: saving model to model_init_2025-03-0404_25_08.943004/model-00004-1.59560-0.23810-1.58146-0.24609.h5
Epoch 5/30
Epoch 00005: saving model to model_init_2025-03-0404_25_08.943004/model-00005-1.59174-0.25446-1.57707-0.34375.h5
Epoch 6/30
Epoch 00006: saving model to model_init_2025-03-0404_25_08.943004/model-00006-1.57688-0.29613-1.56395-0.29688.h5
Epoch 7/30
Epoch 00007: saving model to model_init_2025-03-0404_25_08.943004/model-00007-1.57351-0.27455-1.54901-0.27344.h5
Epoch 8/30
Epoch 00008: saving model to model_init_2025-03-0404_

#### Model 3 Summary:
The model shows gradual improvement in both training and validation accuracy, but the performance is inconsistent, and convergence is slow

#### Fourth Model-Conv3D
 We will now experiment in increasing the epochs and reducing the batch size from 64 to 32 to see if any increase in accuracy

In [8]:
new_batch_size = 32
train_generator = generator(train_path, train_doc, new_batch_size)
val_generator = generator(val_path, val_doc, new_batch_size)

model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', save_freq='epoch')
LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1, patience=4)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

optimiser = 'adam'
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv3d (Conv3D)             (None, 13, 80, 80, 32)    2624      
                                                                 
 activation (Activation)     (None, 13, 80, 80, 32)    0         
                                                                 
 max_pooling3d (MaxPooling3D  (None, 6, 40, 40, 32)    0         
 )                                                               
                                                                 
 conv3d_1 (Conv3D)           (None, 6, 40, 40, 64)     16448     
                                                                 
 activation_1 (Activation)   (None, 6, 40, 40, 64)     0         
                                                                 
 max_pooling3d_1 (MaxPooling  (None, 3, 20, 20, 64)    0         
 3D)                                                    

#### Summary:
improvement in both training and validation accuracy, but signs of overfitting emerge in later epochs

In [10]:
num_epochs3=40
history4 = model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs3, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  /home/datasets/Project_data/train ; batch size = 32
Epoch 1/40


2025-03-04 07:11:06.946001: I tensorflow/stream_executor/cuda/cuda_dnn.cc:377] Loaded cuDNN version 8302



Epoch 00001: saving model to model_init_2025-03-0407_08_44.828734/model-00001-1.62987-0.21726-1.59869-0.21094.h5
Epoch 2/40
Epoch 00002: saving model to model_init_2025-03-0407_08_44.828734/model-00002-1.58407-0.20982-1.52218-0.40625.h5
Epoch 3/40
Epoch 00003: saving model to model_init_2025-03-0407_08_44.828734/model-00003-1.51845-0.34673-1.34073-0.38281.h5
Epoch 4/40
Epoch 00004: saving model to model_init_2025-03-0407_08_44.828734/model-00004-1.41685-0.38542-1.27846-0.54688.h5
Epoch 5/40
Epoch 00005: saving model to model_init_2025-03-0407_08_44.828734/model-00005-1.36188-0.39881-1.22652-0.40625.h5
Epoch 6/40
Epoch 00006: saving model to model_init_2025-03-0407_08_44.828734/model-00006-1.26486-0.41815-1.08827-0.61719.h5
Epoch 7/40
Epoch 00007: saving model to model_init_2025-03-0407_08_44.828734/model-00007-1.14073-0.47173-0.94265-0.52344.h5
Epoch 8/40
Epoch 00008: saving model to model_init_2025-03-0407_08_44.828734/model-00008-1.07503-0.50744-0.99756-0.60156.h5
Epoch 9/40
Epoch 0

#### Summary:
Tried with number of epochs =40 with batch size=32.
Model Result: The model shows consistent improvement in both training and validation accuracy, but signs of overfitting emerge in later epochs so far this Con3D model gave best accuracy.


#### Fifth Model-TimeDistributed + Conv2D + LSTM

In [8]:
# Define the model
model = Sequential()

# Time Distributed Conv2D layers for spatial feature extraction
model.add(TimeDistributed(Conv2D(32, (3, 3), activation='relu'), input_shape=(None, 80, 80, 3)))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Conv2D(64, (3, 3), activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Conv2D(128, (3, 3), activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Flatten()))  # Flatten spatial features for each frame

# LSTM layers for temporal modeling
model.add(LSTM(128, return_sequences=True))  # First LSTM layer
model.add(LSTM(64))  # Second LSTM layer

# Fully connected layers for classification
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.25))  # Dropout for regularization
model.add(Dense(5,activation='softmax'))  # Output layer

# compile the model
optimiser = 'adam' #write your optimizer''
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])

# Summary of the model
print (model.summary())

2025-03-04 11:30:32.882827: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2025-03-04 11:30:32.882890: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 22855 MB memory:  -> device: 0, name: Quadro RTX 6000, pci bus id: 0000:1c:00.0, compute capability: 7.5


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 time_distributed (TimeDistr  (None, None, 78, 78, 32)  896      
 ibuted)                                                         
                                                                 
 time_distributed_1 (TimeDis  (None, None, 39, 39, 32)  0        
 tributed)                                                       
                                                                 
 time_distributed_2 (TimeDis  (None, None, 37, 37, 64)  18496    
 tributed)                                                       
                                                                 
 time_distributed_3 (TimeDis  (None, None, 18, 18, 64)  0        
 tributed)                                                       
                                                                 
 time_distributed_4 (TimeDis  (None, None, 16, 16, 128  

In [9]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [10]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto',save_freq = 'epoch')

LR =ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1, patience=4) # write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

In [11]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [12]:
history5=model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  /home/datasets/Project_data/train ; batch size = 32
Epoch 1/20


2025-03-04 11:31:44.586279: I tensorflow/stream_executor/cuda/cuda_dnn.cc:377] Loaded cuDNN version 8302



Epoch 00001: saving model to model_init_2025-03-0411_26_01.292544/model-00001-1.61651-0.22321-1.58128-0.25781.h5
Epoch 2/20
Epoch 00002: saving model to model_init_2025-03-0411_26_01.292544/model-00002-1.50881-0.32887-1.42635-0.42188.h5
Epoch 3/20
Epoch 00003: saving model to model_init_2025-03-0411_26_01.292544/model-00003-1.29357-0.45833-1.02721-0.59375.h5
Epoch 4/20
Epoch 00004: saving model to model_init_2025-03-0411_26_01.292544/model-00004-1.06975-0.56994-1.12861-0.52344.h5
Epoch 5/20
Epoch 00008: saving model to model_init_2025-03-0411_26_01.292544/model-00008-0.60018-0.79315-1.28811-0.59375.h5
Epoch 9/20
Epoch 00009: saving model to model_init_2025-03-0411_26_01.292544/model-00009-0.51852-0.82143-1.09956-0.67969.h5

Epoch 00009: ReduceLROnPlateau reducing learning rate to 0.00020000000949949026.
Epoch 10/20
Epoch 00010: saving model to model_init_2025-03-0411_26_01.292544/model-00010-0.29842-0.90476-1.23869-0.68750.h5
Epoch 11/20
Epoch 00012: saving model to model_init_2025-03

#### Summary:
Overfitting: The large gap between training and validation accuracy (e.g., 98.21% vs. 71.88%) suggests that the model is overfitting to the training data.

Validation Performance: The validation accuracy plateaus around 70-72%, indicating that the model struggles to generalize to unseen data.

Learning Rate: The learning rate adjustments helped stabilize training but did not significantly improve validation performance.

#### Sixth Model-Conv2D + GRU

In [7]:
# Define the model
model = Sequential()

# Time Distributed Conv2D layers for spatial feature extraction
model.add(TimeDistributed(Conv2D(32, (3, 3), activation='relu'), input_shape=(None, 80, 80, 3)))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Conv2D(64, (3, 3), activation='relu')))
model.add(BatchNormalization())
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Flatten()))  # Flatten spatial features for each frame

# GRU layers for temporal modeling
model.add(GRU(64,return_sequences=True))  # First GRU layer
model.add(GRU(32, return_sequences=False))  # Second GRU layer

# Fully connected layers for classification
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))  # Dropout for regularization
model.add(Dense(5, activation='softmax'))  # Output layer (5 classes)

# Compile the model
optimiser = 'adam'  # Optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])

# Summary of the model
print(model.summary())

2025-03-04 14:28:37.521030: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2025-03-04 14:28:37.521118: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14800 MB memory:  -> device: 0, name: Quadro RTX 5000, pci bus id: 0000:41:00.0, compute capability: 7.5


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 time_distributed (TimeDistr  (None, None, 78, 78, 32)  896      
 ibuted)                                                         
                                                                 
 time_distributed_1 (TimeDis  (None, None, 39, 39, 32)  0        
 tributed)                                                       
                                                                 
 time_distributed_2 (TimeDis  (None, None, 37, 37, 64)  18496    
 tributed)                                                       
                                                                 
 batch_normalization (BatchN  (None, None, 37, 37, 64)  256      
 ormalization)                                                   
                                                                 
 time_distributed_3 (TimeDis  (None, None, 18, 18, 64)  

In [8]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto',save_freq = 'epoch')

LR =ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1, patience=4) # write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

In [9]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [10]:

history6=model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  /home/datasets/Project_data/train ; batch size = 32
Epoch 1/20


2025-03-04 14:29:08.807106: I tensorflow/stream_executor/cuda/cuda_dnn.cc:377] Loaded cuDNN version 8302



Epoch 00001: saving model to model_init_2025-03-0414_25_22.012483/model-00001-1.56177-0.31250-1.61561-0.21875.h5
Epoch 2/20
Epoch 00002: saving model to model_init_2025-03-0414_25_22.012483/model-00002-1.28745-0.48661-1.60130-0.21875.h5
Epoch 3/20
Epoch 00003: saving model to model_init_2025-03-0414_25_22.012483/model-00003-1.07671-0.58482-1.45903-0.26562.h5
Epoch 4/20
Epoch 00004: saving model to model_init_2025-03-0414_25_22.012483/model-00004-0.87580-0.70685-1.21535-0.57031.h5
Epoch 5/20
Epoch 00005: saving model to model_init_2025-03-0414_25_22.012483/model-00005-0.70106-0.77679-1.53130-0.41406.h5
Epoch 6/20
Epoch 00006: saving model to model_init_2025-03-0414_25_22.012483/model-00006-0.60646-0.79167-1.76749-0.34375.h5
Epoch 7/20
Epoch 00007: saving model to model_init_2025-03-0414_25_22.012483/model-00007-0.45381-0.85565-1.17083-0.53125.h5
Epoch 8/20
Epoch 00008: saving model to model_init_2025-03-0414_25_22.012483/model-00008-0.36163-0.90327-2.03224-0.41406.h5
Epoch 9/20
Epoch 0

#### Summary:
Model result: Overfitting
The training accuracy reached 100%, while the validation accuracy plateaued around 60-65%.

#### Seventh Model-Conv2D + LSTM with Minimal filters at each Conv2D layer

In [20]:
# Define the model
model = Sequential()

# Time Distributed Conv2D layers for spatial feature extraction
model.add(TimeDistributed(Conv2D(8, (3, 3), activation='relu'), input_shape=(None, 80, 80, 3)))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Conv2D(16, (3, 3), activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Conv2D(32, (3, 3), activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Flatten()))  # Flatten spatial features for each frame

# LSTM layers for temporal modeling
model.add(LSTM(16, return_sequences=True))  # First LSTM layer
model.add(LSTM(32))  # Second LSTM layer



# Fully connected layers for classification
model.add(Dense(16, activation='relu'))
model.add(Dropout(0.25))  # Dropout for regularization
model.add(Dense(5,activation='softmax'))  # Output layer

# compile the model
optimiser = 'adam' #write your optimizer''
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])

# Summary of the model
print (model.summary())

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 time_distributed_19 (TimeDi  (None, None, 78, 78, 8)  224       
 stributed)                                                      
                                                                 
 time_distributed_20 (TimeDi  (None, None, 39, 39, 8)  0         
 stributed)                                                      
                                                                 
 time_distributed_21 (TimeDi  (None, None, 37, 37, 16)  1168     
 stributed)                                                      
                                                                 
 time_distributed_22 (TimeDi  (None, None, 18, 18, 16)  0        
 stributed)                                                      
                                                                 
 time_distributed_23 (TimeDi  (None, None, 16, 16, 32)

In [21]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto',save_freq = 'epoch')

LR =ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1, patience=4) # write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

In [22]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [23]:
history7=model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  /home/datasets/Project_data/train ; batch size = 32
Epoch 1/20


2025-03-04 16:34:00.191603: I tensorflow/stream_executor/cuda/cuda_dnn.cc:377] Loaded cuDNN version 8302



Epoch 00001: saving model to model_init_2025-03-0416_28_47.540319/model-00001-1.61071-0.19345-1.60215-0.22656.h5
Epoch 2/20
Epoch 00002: saving model to model_init_2025-03-0416_28_47.540319/model-00002-1.59620-0.21875-1.58270-0.28906.h5
Epoch 3/20
Epoch 00003: saving model to model_init_2025-03-0416_28_47.540319/model-00003-1.50633-0.30357-1.48594-0.35156.h5
Epoch 4/20
Epoch 00004: saving model to model_init_2025-03-0416_28_47.540319/model-00004-1.44009-0.35119-1.39532-0.38281.h5
Epoch 5/20
Epoch 00005: saving model to model_init_2025-03-0416_28_47.540319/model-00005-1.28091-0.45387-1.21776-0.42969.h5
Epoch 6/20
Epoch 00006: saving model to model_init_2025-03-0416_28_47.540319/model-00006-1.18870-0.49702-1.21430-0.48438.h5
Epoch 7/20
Epoch 00007: saving model to model_init_2025-03-0416_28_47.540319/model-00007-1.12869-0.53869-1.18614-0.55469.h5
Epoch 8/20
Epoch 00008: saving model to model_init_2025-03-0416_28_47.540319/model-00008-0.97441-0.60565-1.15266-0.52344.h5
Epoch 9/20
Epoch 0

#### Summary:
Tried with the less filters at each Conv2D layers but still the model result is showing Overfitting
Let’s try the same with Model with GRU to see if any increase in Validation accuracy

#### Eigth Model - Conv2D + GRU with minimal filters at Conv2D Layers

In [24]:
# Define the model
model = Sequential()

# Time Distributed Conv2D layers for spatial feature extraction
model.add(TimeDistributed(Conv2D(8, (3, 3), activation='relu'), input_shape=(None, 80, 80, 3)))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Conv2D(16, (3, 3), activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Conv2D(32, (3, 3), activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Flatten()))  # Flatten spatial features for each frame

# GRU layers for temporal modeling
model.add(GRU(64,return_sequences=True))  # First GRU layer
model.add(GRU(32, return_sequences=False))  # Second GRU layer


# Fully connected layers for classification
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.5))  # Dropout for regularization
model.add(Dense(5,activation='softmax'))  # Output layer

# compile the model
optimiser = 'adam' #write your optimizer''
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])

# Summary of the model
print (model.summary())

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 time_distributed_26 (TimeDi  (None, None, 78, 78, 8)  224       
 stributed)                                                      
                                                                 
 time_distributed_27 (TimeDi  (None, None, 39, 39, 8)  0         
 stributed)                                                      
                                                                 
 time_distributed_28 (TimeDi  (None, None, 37, 37, 16)  1168     
 stributed)                                                      
                                                                 
 time_distributed_29 (TimeDi  (None, None, 18, 18, 16)  0        
 stributed)                                                      
                                                                 
 time_distributed_30 (TimeDi  (None, None, 16, 16, 32)

In [25]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto',save_freq = 'epoch')

LR =ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1, patience=4) # write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

In [26]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [27]:
history8=model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  /home/datasets/Project_data/train ; batch size = 32
Epoch 1/20

Epoch 00001: saving model to model_init_2025-03-0416_28_47.540319/model-00001-1.60385-0.22917-1.57178-0.25000.h5
Epoch 2/20
Epoch 00002: saving model to model_init_2025-03-0416_28_47.540319/model-00002-1.55060-0.27679-1.49128-0.36719.h5
Epoch 3/20
Epoch 00003: saving model to model_init_2025-03-0416_28_47.540319/model-00003-1.42461-0.39286-1.32827-0.39844.h5
Epoch 4/20
Epoch 00004: saving model to model_init_2025-03-0416_28_47.540319/model-00004-1.21194-0.52827-1.16510-0.57812.h5
Epoch 5/20
Epoch 00005: saving model to model_init_2025-03-0416_28_47.540319/model-00005-1.04001-0.60714-1.09178-0.56250.h5
Epoch 6/20
Epoch 00006: saving model to model_init_2025-03-0416_28_47.540319/model-00006-0.90139-0.65774-1.03087-0.59375.h5
Epoch 7/20
Epoch 00007: saving model to model_init_2025-03-0416_28_47.540319/model-00007-0.78277-0.70833-1.05669-0.61719.h5
Epoch 8/20
Epoch 00008: saving model to model_init_2025-03-0416_

#### Ninth Model - Conv3D Model with more images:
Previously we tried with the 80 * 80 image resolution and only 13 images from the set we considered. Now we will use same image resolution and we will use 20 images from each video

In [26]:
def new_generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = [0,2,4,6,8,10,11,12,13,14,15,16,17,18,19,20,22,24,26,28]
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(t) // batch_size # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,20,80,80,3)) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    
                    #crop the images and resize them. Note that the images are of 2 different shape 
                    image = resize(image,(80,80)).astype(np.float32)
                    
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    batch_data[folder,idx,:,:,0] = (image[:,:,0])/255 #normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (image[:,:,1])/255 #normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (image[:,:,2])/255 #normalise and feed in the image
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do

        
        # write the code for the remaining data points which are left after full batches
        if len(t) % batch_size != 0:
            batch_data = np.zeros((batch_size,20,80,80,3)) 
            batch_labels = np.zeros((batch_size,5)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    image = resize(image,(80,80)).astype(np.float32)
                    batch_data[folder,idx,:,:,0] = (image[:,:,0])/255
                    batch_data[folder,idx,:,:,1] = (image[:,:,1])/255
                    batch_data[folder,idx,:,:,2] = (image[:,:,2])/255
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1 
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield does    

In [27]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, Flatten, BatchNormalization, Activation
from keras.layers.convolutional import Conv3D, MaxPooling3D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.layers import Dropout
from skimage.io import imread

#write your model here
model = Sequential()
model.add(Conv3D(32, (3, 3, 3), padding='same',
         input_shape=(20,80,80,3)))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(64, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(128, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

model.add(Dense(64,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))

model.add(Dense(5,activation='softmax'))

In [28]:
train_generator = new_generator(train_path, train_doc, batch_size)
val_generator = new_generator(val_path, val_doc, batch_size)

In [29]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', save_freq='epoch')
LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1, patience=4)# write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

optimiser = 'sgd'
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv3d_3 (Conv3D)           (None, 20, 80, 80, 32)    2624      
                                                                 
 activation_3 (Activation)   (None, 20, 80, 80, 32)    0         
                                                                 
 batch_normalization_5 (Batc  (None, 20, 80, 80, 32)   128       
 hNormalization)                                                 
                                                                 
 max_pooling3d_3 (MaxPooling  (None, 10, 40, 40, 32)   0         
 3D)                                                             
                                                                 
 conv3d_4 (Conv3D)           (None, 10, 40, 40, 64)    16448     
                                                                 
 activation_4 (Activation)   (None, 10, 40, 40, 64)   

In [30]:
steps_per_epoch=0
validation_steps=0
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [31]:
history8 = model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  /home/datasets/Project_data/train ; batch size = 32
Epoch 1/20

Epoch 00001: saving model to model_init_2025-03-0506_39_47.413366/model-00001-1.66340-0.40923-1.99692-0.18750.h5
Epoch 2/20
Epoch 00002: saving model to model_init_2025-03-0506_39_47.413366/model-00002-1.28552-0.50000-2.29828-0.33594.h5
Epoch 3/20
Epoch 00004: saving model to model_init_2025-03-0506_39_47.413366/model-00004-0.86064-0.67411-3.05868-0.21094.h5
Epoch 5/20
Epoch 00005: saving model to model_init_2025-03-0506_39_47.413366/model-00005-0.84744-0.67708-3.41666-0.23438.h5

Epoch 00005: ReduceLROnPlateau reducing learning rate to 0.0019999999552965165.
Epoch 6/20
Epoch 00006: saving model to model_init_2025-03-0506_39_47.413366/model-00006-0.63889-0.76042-3.92373-0.21094.h5
Epoch 7/20
Epoch 00007: saving model to model_init_2025-03-0506_39_47.413366/model-00007-0.59813-0.79018-4.66868-0.22656.h5
Epoch 8/20
Epoch 00008: saving model to model_init_2025-03-0506_39_47.413366/model-00008-0.60603-0.78869-5.

#### Tenth Model - Image Data augmentation with Conv3D

In [6]:
import imgaug.augmenters as iaa

def imageauggenerator(source_path, folder_list, batch_size):
    print('Source path = ', source_path, '; batch size =', batch_size)
    img_idx = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26]
    
    # Define augmentation sequence
    seq = iaa.Sequential([
        iaa.Fliplr(0.5),                     # Horizontal flip
        iaa.Affine(rotate=(-10, 10)),        # Random rotation
        iaa.Multiply((0.8, 1.2)),            # Random brightness
        iaa.CropAndPad(percent=(-0.05, 0.05)) # Random cropping and padding
    ])

    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(t) // batch_size
        
        for batch in range(num_batches):
            batch_data = np.zeros((batch_size, 13, 80, 80, 3))
            batch_labels = np.zeros((batch_size, 5))

            for folder in range(batch_size):
                imgs = os.listdir(source_path + '/' + t[folder + (batch * batch_size)].split(';')[0])
                
                for idx, item in enumerate(img_idx):
                    image_path = source_path + '/' + t[folder + (batch * batch_size)].strip().split(';')[0] + '/' + imgs[item]
                    image = imread(image_path).astype(np.float32)
                    image = resize(image, (80, 80)).astype(np.float32)
                    
                    # Apply augmentation
                    image = seq(image=image)
                    
                    batch_data[folder, idx, :, :, 0] = image[:, :, 0] / 255
                    batch_data[folder, idx, :, :, 1] = image[:, :, 1] / 255
                    batch_data[folder, idx, :, :, 2] = image[:, :, 2] / 255

                batch_labels[folder, int(t[folder + (batch * batch_size)].strip().split(';')[2])] = 1

            yield batch_data, batch_labels

        if len(t) % batch_size != 0:
            remaining_size = len(t) % batch_size
            batch_data = np.zeros((remaining_size, 13, 80, 80, 3))
            batch_labels = np.zeros((remaining_size, 5))
            
            for folder in range(remaining_size):
                imgs = os.listdir(source_path + '/' + t[folder + (batch * batch_size)].split(';')[0])
                
                for idx, item in enumerate(img_idx):
                    image_path = source_path + '/' + t[folder + (batch * batch_size)].strip().split(';')[0] + '/' + imgs[item]
                    image = imread(image_path).astype(np.float32)
                    image = resize(image, (80, 80)).astype(np.float32)
                    
                    # Apply augmentation
                    image = seq(image=image)
                    
                    batch_data[folder, idx, :, :, 0] = image[:, :, 0] / 255
                    batch_data[folder, idx, :, :, 1] = image[:, :, 1] / 255
                    batch_data[folder, idx, :, :, 2] = image[:, :, 2] / 255

                batch_labels[folder, int(t[folder + (batch * batch_size)].strip().split(';')[2])] = 1
            
            yield batch_data, batch_labels

# Let me know if you want me to tweak the augmentations or add anything else! 🚀


In [7]:
#write your model here
model = Sequential()
model.add(Conv3D(32, (3, 3, 3), padding='same',
         input_shape=(13,80,80,3)))
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(64, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Conv3D(128, (2, 2, 2), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))

model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(64,activation='relu'))
model.add(Dropout(0.25))

model.add(Dense(5,activation='softmax'))

2025-03-05 12:07:48.219053: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2025-03-05 12:07:48.219205: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14800 MB memory:  -> device: 0, name: Quadro RTX 5000, pci bus id: 0000:40:00.0, compute capability: 7.5


In [8]:
train_generator = imageauggenerator(train_path, train_doc, batch_size)
val_generator = imageauggenerator(val_path, val_doc, batch_size)

model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto',save_freq = 'epoch')

LR =ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1, patience=4) # write the REducelronplateau code here
callbacks_list = [checkpoint, LR]

In [9]:
optimiser = 'adam'
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv3d (Conv3D)             (None, 13, 80, 80, 32)    2624      
                                                                 
 activation (Activation)     (None, 13, 80, 80, 32)    0         
                                                                 
 max_pooling3d (MaxPooling3D  (None, 6, 40, 40, 32)    0         
 )                                                               
                                                                 
 conv3d_1 (Conv3D)           (None, 6, 40, 40, 64)     16448     
                                                                 
 activation_1 (Activation)   (None, 6, 40, 40, 64)     0         
                                                                 
 max_pooling3d_1 (MaxPooling  (None, 3, 20, 20, 64)    0         
 3D)                                                    

In [10]:
steps_per_epoch=0
validation_steps=0
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

In [11]:
num_epochs9=40
history9 = model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs9, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  /home/datasets/Project_data/train ; batch size = 32
Epoch 1/40


2025-03-05 12:08:33.522577: I tensorflow/stream_executor/cuda/cuda_dnn.cc:377] Loaded cuDNN version 8302



Epoch 00001: saving model to model_init_2025-03-0512_07_25.517768/model-00001-1.63378-0.22172-1.59368-0.21000.h5
Epoch 2/40
Epoch 00002: saving model to model_init_2025-03-0512_07_25.517768/model-00002-1.60628-0.20211-1.59891-0.35000.h5
Epoch 3/40
Epoch 00003: saving model to model_init_2025-03-0512_07_25.517768/model-00003-1.58148-0.26094-1.53511-0.21000.h5
Epoch 4/40
Epoch 00004: saving model to model_init_2025-03-0512_07_25.517768/model-00004-1.58043-0.25038-1.56874-0.27000.h5
Epoch 5/40
Epoch 00005: saving model to model_init_2025-03-0512_07_25.517768/model-00005-1.53949-0.27903-1.43353-0.43000.h5
Epoch 6/40
Epoch 00007: saving model to model_init_2025-03-0512_07_25.517768/model-00007-1.44876-0.35747-1.43355-0.37000.h5
Epoch 8/40
Epoch 00008: saving model to model_init_2025-03-0512_07_25.517768/model-00008-1.43035-0.36350-1.28813-0.47000.h5
Epoch 9/40
Epoch 00009: saving model to model_init_2025-03-0512_07_25.517768/model-00009-1.41967-0.33786-1.31083-0.49000.h5
Epoch 10/40
Epoch 

#### Summary:
Applied Image Data augmentation on Training data and left the validation data as-is.

Overall Model performance: The model steadily learned, with both training and validation accuracy improving over time
but still had room for improvement.