# PG Diploma in Machine Learning and AI from IIIT Bangalore | Upgrad

#### Developed by:
##### 1. Sanjay Belgaonkar - Group facilitator
##### 2. Yashraj Pathak

## Deep Learning Course Project - Gesture Recognition

### Problem Statement
Imagine you are working as a data scientist at a home electronics company that manufactures state-of-the-art smart televisions. You want to develop a feature for the smart TV that recognises five different gestures performed by the user, so that users can control the TV without a remote.

The gestures are continuously monitored by the webcam mounted on the TV. Each gesture corresponds to a specific command:
 
| Gesture | Corresponding Action |
| --- | --- | 
| Thumbs Up | Increase the volume. |
| Thumbs Down | Decrease the volume. |
| Left Swipe | 'Jump' backwards 10 seconds. |
| Right Swipe | 'Jump' forward 10 seconds. |
| Stop | Pause the movie. |

Each video is a sequence of 30 frames (or images).

### Objectives:
1. **Generator**:  The generator should be able to take a batch of videos as input without any error. Steps like cropping, resizing and normalization should be performed successfully.

2. **Model**: Develop a model that trains without errors. It will be judged on the total number of parameters (fewer parameters mean a shorter inference (prediction) time) and on the accuracy achieved. As suggested by Snehansu, start training on a small amount of data and then proceed further.

3. **Write up**: This should contain the detailed procedure followed in choosing the final model. The write up should start with the reason for choosing the base model, then highlight the reasons and metrics taken into consideration to modify and experiment to arrive at the final model. 

In [84]:
#!nvidia-smi

## Importing Data:

In [85]:
#from google.colab import drive
#drive.mount('/content/drive')

In [86]:
#!cp '/content/drive/MyDrive/Project_data.zip' '/content'
#!unzip '/content/Project_data.zip' -d '/content/Gesture'

### Importing all necessary Libraries:

In [87]:
import numpy as np
import pandas as pd
import tensorflow as tf
from keras import backend as K
import random as rn
import os
from imageio import imread
import cv2
import matplotlib.pyplot as plt
#% matplotlib inline
import datetime
import time

from keras.models import Sequential, Model
# All layers are imported from keras.layers; the keras.layers.convolutional and
# keras.layers.recurrent submodules may not be available in recent Keras releases.
from keras.layers import Dense, GRU, LSTM, Flatten, TimeDistributed, BatchNormalization, Activation, Dropout
from keras.layers import Conv3D, MaxPooling3D, Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.applications import mobilenet

We set the random seed so that the results don't vary drastically.

In [88]:
np.random.seed(30)
rn.seed(30)
tf.random.set_seed(30)

## Deriving Data Folder Path:


In [89]:
data_folder = 'datasets/Project_data'
train_path = data_folder + '/train'
val_path = data_folder + '/val'

total_frames = 30
num_gestures = 5

trial_count = 1

This block reads the folder names for training and validation. The `batch_size` is passed separately to each experiment further down. Choose it so that the GPU is used at full capacity: keep increasing the batch size until the machine throws an error.

In [90]:
train_doc = np.random.permutation(open(data_folder + '/train.csv').readlines())
val_doc = np.random.permutation(open(data_folder + '/val.csv').readlines())

## Generator
This is one of the most important parts of the code. The overall structure of the generator has been given. In the generator, you preprocess the images (they come in 2 different dimensions) and assemble them into batches of video frames. You have to experiment with `img_idx` (which frames are sampled), the image size (`y`, `z`) and the normalization to get high accuracy.
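
The effect of `img_idx` can be checked in isolation. The sketch below is a pure-Python stand-in for the `np.round(np.linspace(0, total_frames - 1, num_gesture_frames))` expression used in the generator. The helper name `sample_frame_indices` is hypothetical and not part of the notebook.

```python
def sample_frame_indices(total_frames, num_gesture_frames):
    """Pick evenly spaced frame indices, keeping the first and last frame."""
    if num_gesture_frames == 1:
        return [0]
    step = (total_frames - 1) / (num_gesture_frames - 1)
    return [round(i * step) for i in range(num_gesture_frames)]

# Using all 30 frames keeps every index; using 16 skips roughly every other frame.
all_frames = sample_frame_indices(30, 30)
half_frames = sample_frame_indices(30, 16)
print(all_frames)
print(half_frames)
```

Sampling fewer frames shrinks the input tensor, which is why Model - 3 can afford a larger image size with only 16 frames.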

In [91]:
# function to plot the training/validation accuracies/losses.

def plot_training_validation_graph(history):
    fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(15,4))
    axes[0].plot(history.history['loss'])   
    axes[0].plot(history.history['val_loss'])
    axes[0].legend(['loss','val_loss'])

    axes[1].plot(history.history['categorical_accuracy'])   
    axes[1].plot(history.history['val_categorical_accuracy'])
    axes[1].legend(['categorical_accuracy','val_categorical_accuracy'])
    plt.show()

In [92]:
def get_batch_data(source_path, folder_list, batch_size, num_gesture_frames, image_size, num_channels, batch, img_idx, t) :
  batch_data = np.zeros((batch_size,num_gesture_frames,image_size,image_size,num_channels)) # num_gesture_frames images per video, each resized to (image_size, image_size), with num_channels channels (3 for RGB, 1 for grayscale)
  batch_labels = np.zeros((batch_size,num_gestures)) # batch_labels is the one hot representation of the output
  for folder in range(batch_size): # iterate over the batch_size
      gesture_info = t[folder + batch * batch_size].strip().split(';')            
      imgs = sorted(os.listdir(source_path+'/'+ gesture_info[0])) # read the frame file names in sorted order (os.listdir gives no ordering guarantee)
      for idx,item in enumerate(img_idx): # iterate over the frames/images of a folder to read them in
          if num_channels == 3 :
            image = imread(source_path + '/' + gesture_info[0] +'/'+imgs[item]).astype(np.float32)
            
            #crop the images and resize them. Note that the images are of 2 different shape 
            #and the conv3D will throw error if the inputs in a batch have different shapes
            image = cv2.resize(image, (image_size, image_size))

            batch_data[folder,idx,:,:,0] = (image[:,:,0]) / 255.0 #normalise and feed in the image
            batch_data[folder,idx,:,:,1] = (image[:,:,1]) / 255.0 #normalise and feed in the image
            batch_data[folder,idx,:,:,2] = (image[:,:,2]) / 255.0  #normalise and feed in the image
          else :
            image = cv2.imread(source_path + '/' + gesture_info[0] +'/'+imgs[item], cv2.IMREAD_GRAYSCALE)
            image = cv2.resize(image, (image_size, image_size))
            batch_data[folder,idx,:,:,0] = image / 255.0

      batch_labels[folder, int(gesture_info[2])] = 1
  return batch_data, batch_labels # the generator yields this (batch_data, batch_labels) pair to Keras
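
As a rough illustration of the label handling in `get_batch_data`, the sketch below parses one CSV line and builds its one-hot vector. The sample line is hypothetical, and the middle field is only assumed to be a gesture name (the notebook ignores it and reads fields 0 and 2).

```python
NUM_GESTURES = 5

def parse_label_line(line, num_gestures=NUM_GESTURES):
    """Split 'folder;name;label' and return the folder with a one-hot label."""
    folder, _gesture_name, label = line.strip().split(';')
    one_hot = [0.0] * num_gestures
    one_hot[int(label)] = 1.0
    return folder, one_hot

# Hypothetical line in the same semicolon-separated layout as train.csv
folder, one_hot = parse_label_line("example_video_folder;Left_Swipe;0\n")
print(folder, one_hot)
```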


In [93]:
def generator(source_path, folder_list, batch_size, num_gesture_frames, image_size, num_channels):
    #print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx =  np.round(np.linspace(0, total_frames - 1, num_gesture_frames)).astype(int) #create a list of image numbers you want to use for a particular video

    while True:
        t = np.random.permutation(folder_list)
        num_batches =  len(t) // batch_size # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            yield get_batch_data(source_path, folder_list, batch_size, num_gesture_frames, image_size, num_channels, batch, img_idx, t)
        
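        # Handle the data points left over after the full batches (skipped when the split is exact)
        rem_batch_size = len(t) % batch_size
        if rem_batch_size:
            # Pass only the tail of t with batch index 0 so that the leftover folders are the ones read
            yield get_batch_data(source_path, folder_list, rem_batch_size, num_gesture_frames, image_size, num_channels, 0, img_idx, t[num_batches * batch_size:])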



### Model Function:

In [94]:
def build_model(model_text, model, batch_size, num_epochs, num_gesture_frames, image_size, num_channels, learning_rate=0.001) :
  optimiser = tf.keras.optimizers.Adam(learning_rate=learning_rate)
  model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])

  train_generator = generator(train_path, train_doc, batch_size, num_gesture_frames, image_size, num_channels)
  val_generator = generator(val_path, val_doc, batch_size, num_gesture_frames, image_size, num_channels)

  num_train_sequences = len(train_doc) 
  num_val_sequences = len(val_doc) 

  if (num_train_sequences % batch_size) == 0:
      steps_per_epoch = int(num_train_sequences/batch_size)
  else:
      steps_per_epoch = (num_train_sequences//batch_size) + 1

  if (num_val_sequences%batch_size) == 0:
      validation_steps = int(num_val_sequences/batch_size)
  else:
      validation_steps = (num_val_sequences//batch_size) + 1

  model_name = model_text + '_' + str(datetime.datetime.now()).replace(' ','').replace(':','_') + '/'
      
  if not os.path.exists(model_name):
      os.mkdir(model_name)

  filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

  checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=True, save_weights_only=False, mode='auto', save_freq='epoch')
  LR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=0, patience=4)
  callbacks_list = [checkpoint, LR] 

  history = model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=0, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)
  
  plot_training_validation_graph(history)
  return history
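
The two `if`/`else` blocks in `build_model` compute a ceiling division: one step per full batch plus one more for any leftover sequences. A minimal sketch of the same arithmetic follows; the sequence counts are hypothetical, not the actual dataset sizes.

```python
def steps_notebook(num_sequences, batch_size):
    """Same branching as in build_model."""
    if num_sequences % batch_size == 0:
        return num_sequences // batch_size
    return num_sequences // batch_size + 1

def steps_ceil(num_sequences, batch_size):
    """Equivalent integer ceiling division."""
    return -(-num_sequences // batch_size)

# Hypothetical counts: 663 training and 100 validation sequences, batch size 30
train_steps = steps_ceil(663, 30)
val_steps = steps_ceil(100, 30)
print(train_steps, val_steps)
```

The generator has to yield exactly this many batches per pass over the data, otherwise the epochs drift out of step with the shuffled folder list.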

### Model Memory function:

In [95]:
def get_model_memory_usage(batch_size, model):
    shapes_mem_count = 0
    internal_model_mem_count = 0
    for l in model.layers:
        layer_type = l.__class__.__name__
        if layer_type == 'Model':
            internal_model_mem_count += get_model_memory_usage(batch_size, l)
        single_layer_mem = 1
        out_shape = l.output_shape
        if type(out_shape) is list:
            out_shape = out_shape[0]
        for s in out_shape:
            if s is None:
                continue
            single_layer_mem *= s
        shapes_mem_count += single_layer_mem

    trainable_count = np.sum([K.count_params(p) for p in model.trainable_weights])
    non_trainable_count = np.sum([K.count_params(p) for p in model.non_trainable_weights])

    number_size = 4.0
    if K.floatx() == 'float16':
        number_size = 2.0
    if K.floatx() == 'float64':
        number_size = 8.0

    total_memory = number_size * (batch_size * shapes_mem_count + trainable_count + non_trainable_count)
    mbs = np.round(total_memory / (1024.0 ** 2), 3) + internal_model_mem_count
    return mbs
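
The estimate above boils down to one formula: bytes per number times (batch size times activations per sample, plus parameters). The sketch below restates that arithmetic without Keras. The function name and the activation count are hypothetical.

```python
def estimate_memory_mb(batch_size, activations_per_sample, param_count, bytes_per_number=4.0):
    """Approximate memory in MB for activations of one batch plus all weights."""
    total_bytes = bytes_per_number * (batch_size * activations_per_sample + param_count)
    return round(total_bytes / (1024.0 ** 2), 3)

# Hypothetical model: 1,000,000 activation values per sample, 357,541 parameters
print(estimate_memory_mb(batch_size=30, activations_per_sample=1_000_000, param_count=357_541))
```

Because the activation term is multiplied by the batch size, it usually dominates, which is why a larger batch size is the first thing to run the GPU out of memory.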

### Model:
Here you build the model using the different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D`, not `Conv2D` and `MaxPooling2D`, for a 3D convolution model. Use `TimeDistributed` when building a Conv2D + RNN model. Also remember that the last layer is the softmax. Design the network so that it gives good accuracy with the fewest parameters, so that it can fit in the memory of the webcam.

### CNN_3D function:

In [96]:
def try_cnn3d_only(batch_size, num_epochs, num_gesture_frames, image_size, num_channels, batch_normalize, learning_rate, dropout, dense_neurons) :
  model = Sequential()
  model.add(Conv3D(16, (3, 3, 3), padding='same', input_shape=(num_gesture_frames,image_size,image_size,num_channels)))
  model.add(Activation('relu'))
  model.add(MaxPooling3D(pool_size=(2, 2, 2)))

  model.add(Conv3D(32, (2, 2, 2), padding='same'))
  model.add(Activation('relu'))
  model.add(MaxPooling3D(pool_size=(2, 2, 2)))

  model.add(Conv3D(64, (2, 2, 2), padding='same'))
  model.add(Activation('relu'))
  model.add(MaxPooling3D(pool_size=(2, 2, 2)))

  model.add(Conv3D(128, (2, 2, 2), padding='same'))
  model.add(Activation('relu'))
  model.add(MaxPooling3D(pool_size=(2, 2, 2)))

  model.add(Flatten())
  model.add(Dense(dense_neurons,activation='relu'))
  model.add(Dropout(dropout))

  model.add(Dense(dense_neurons // 2,activation='relu'))
  if batch_normalize:
    model.add(BatchNormalization())   
  model.add(Dropout(dropout))

  model.add(Dense(num_gestures,activation='softmax'))

  print('Total params: {:,}'.format(model.count_params()))

  return build_model("cnn3d_only", model, batch_size, num_epochs, num_gesture_frames, image_size, num_channels, learning_rate),model

### Model - 1

In [None]:
st = time.time()
hist,model = try_cnn3d_only(batch_size=30, num_epochs=25, num_gesture_frames=30,
                                   image_size= 64, num_channels=1, batch_normalize= False,
                                   learning_rate= 0.001, dropout= 0.25, dense_neurons= 128)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(30, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)


Total params: 357,541
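
The reported total can be reproduced by hand from the layer definitions in `try_cnn3d_only`. The sketch below assumes `batch_normalize=False` (batch normalization would add parameters), `padding='same'` convolutions, and pooling that halves each dimension with floor division. The helper names are hypothetical.

```python
def conv_params(kernel_volume, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_volume * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

def cnn3d_param_count(frames, image_size, channels, dense_neurons, num_gestures=5):
    total = 0
    depth, side, in_ch = frames, image_size, channels
    # (kernel volume, filters): one 3x3x3 layer followed by three 2x2x2 layers
    for kernel_volume, filters in [(27, 16), (8, 32), (8, 64), (8, 128)]:
        total += conv_params(kernel_volume, in_ch, filters)
        depth, side, in_ch = depth // 2, side // 2, filters  # MaxPooling3D((2, 2, 2))
    flat = depth * side * side * in_ch
    total += dense_params(flat, dense_neurons)
    total += dense_params(dense_neurons, dense_neurons // 2)
    total += dense_params(dense_neurons // 2, num_gestures)
    return total

# Model - 1: 30 frames, 64x64 grayscale, 128 dense neurons
model_1_total = cnn3d_param_count(30, 64, 1, 128)
print(f"{model_1_total:,}")  # prints 357,541
```

Most of the parameters sit in the first dense layer (2048 inputs by 128 units), so the flattened feature size is the main lever for shrinking the model.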


### Model - 2

In [None]:
st = time.time()
hist,model = try_cnn3d_only(batch_size= 40, num_epochs= 25, num_gesture_frames= 30,
                                   image_size= 128, num_channels= 3, batch_normalize= False,
                                   learning_rate= 0.001, dropout= 0.25, dense_neurons= 256)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(40, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)
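
Each experiment cell reports the epoch with the lowest validation loss. The sketch below shows that selection on a plain dictionary shaped like `hist.history`. All values are hypothetical. Unlike the pandas filter in the cells, it returns only the first epoch when several tie for the minimum.

```python
def best_epoch_row(history):
    """Return the metrics of the epoch with the lowest val_loss (1-based epoch)."""
    val_losses = history['val_loss']
    best = min(range(len(val_losses)), key=lambda i: val_losses[i])
    row = {name: values[best] for name, values in history.items()}
    row['epoch'] = best + 1
    return row

# Hypothetical history for four epochs
history = {
    'loss': [1.2, 0.8, 0.5, 0.4],
    'categorical_accuracy': [0.45, 0.70, 0.82, 0.88],
    'val_loss': [1.1, 0.9, 0.7, 0.8],
    'val_categorical_accuracy': [0.50, 0.62, 0.75, 0.73],
}
best = best_epoch_row(history)
print(best)
```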



### Model - 3

In [None]:
st = time.time()
hist,model = try_cnn3d_only(batch_size= 40, num_epochs= 30, num_gesture_frames= 16,
                                   image_size= 140, num_channels= 3, batch_normalize= False,
                                   learning_rate= 0.001, dropout=0.25, dense_neurons=256)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(40, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)



### Model - 4

In [None]:
st = time.time()
hist,model = try_cnn3d_only(batch_size= 30, num_epochs=25, num_gesture_frames= 30,
                                   image_size= 64, num_channels=1, batch_normalize=False,
                                   learning_rate= 0.001, dropout= 0.25, dense_neurons=128)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(30, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)



### CNN_LSTM function:

In [None]:
def try_cnn_lstm(batch_size, num_epochs, num_gesture_frames, image_size, num_channels, batch_normalize, learning_rate, dropout, dense_neurons, rnn_cells) :
  model = Sequential()

  model.add(TimeDistributed(Conv2D(16, (3, 3) , padding='same', activation='relu'),
                            input_shape=(num_gesture_frames, image_size, image_size, num_channels)))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))
  
  model.add(TimeDistributed(Conv2D(32, (3, 3) , padding='same', activation='relu')))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))
  
  model.add(TimeDistributed(Conv2D(64, (3, 3) , padding='same', activation='relu')))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))
  
  model.add(TimeDistributed(Conv2D(128, (3, 3) , padding='same', activation='relu')))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))

  model.add(TimeDistributed(Conv2D(256, (3, 3) , padding='same', activation='relu')))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))

  model.add(TimeDistributed(Flatten()))

  model.add(LSTM(rnn_cells))
  
  model.add(Dense(dense_neurons, activation='relu'))
  model.add(Dropout(dropout))

  model.add(Dense(num_gestures, activation='softmax'))

  print('Total params: {:,}'.format(model.count_params()))

  return build_model("cnn_lstm", model, batch_size, num_epochs, num_gesture_frames, image_size, num_channels, learning_rate), model
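
The size of the vector that each frame feeds into the LSTM depends only on the image size, since the five `MaxPooling2D((2, 2))` layers each halve the spatial dimensions and the last convolution has 256 filters. A minimal sketch of that calculation, with a hypothetical helper name:

```python
def per_frame_features(image_size, num_pool_layers=5, last_filters=256):
    """Length of the flattened feature vector produced for one frame."""
    side = image_size
    for _ in range(num_pool_layers):
        side //= 2  # 2x2 max pooling with the default 'valid' padding floors odd sizes
    return side * side * last_filters

for size in (64, 100, 128, 140):
    print(size, per_frame_features(size))
```

Note that 128 and 140 end up with the same feature length, because the floor division discards the extra pixels along the way.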

### Model - 5

In [None]:
st = time.time()  # restart the timer so the elapsed time covers this run only
hist,model = try_cnn_lstm(batch_size=30, num_epochs=20, num_gesture_frames= 30,
                                      image_size=64, num_channels=3, batch_normalize=False,
                                      learning_rate=.001, dropout=0.25, dense_neurons=128, rnn_cells=64)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(30, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)

 

### Model - 6

In [None]:
st = time.time()  # restart the timer so the elapsed time covers this run only
hist,model = try_cnn_lstm(batch_size=40, num_epochs=25, num_gesture_frames=30,
                                      image_size=128 ,num_channels=1, batch_normalize=False,
                                      learning_rate= 0.001, dropout=0.25, dense_neurons=128, rnn_cells=128)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(40, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)

 

### Model - 7

In [None]:
st = time.time()  # restart the timer so the elapsed time covers this run only
hist,model = try_cnn_lstm(batch_size=25, num_epochs=20, num_gesture_frames=30,
                                      image_size=140, num_channels=1, batch_normalize=False,
                                      learning_rate=0.001, dropout= 0.25, dense_neurons= 256, rnn_cells= 256)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(25, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)

 

### Model - 8

In [None]:
st = time.time()  # restart the timer so the elapsed time covers this run only
hist,model = try_cnn_lstm(batch_size= 30, num_epochs= 30, num_gesture_frames= 30,
                                      image_size= 64, num_channels= 3, batch_normalize=False,
                                      learning_rate= 0.001, dropout= 0.25, dense_neurons= 512, rnn_cells= 256)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(30, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)

 

### CNN_GRU function:

In [None]:
def try_cnn_GRU(batch_size, num_epochs, num_gesture_frames, image_size, num_channels, batch_normalize, learning_rate, dropout, dense_neurons, rnn_cells) :
  model = Sequential()

  model.add(TimeDistributed(Conv2D(16, (3, 3) , padding='same', activation='relu'),
                            input_shape=(num_gesture_frames, image_size, image_size, num_channels)))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))
  
  model.add(TimeDistributed(Conv2D(32, (3, 3) , padding='same', activation='relu')))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))
  
  model.add(TimeDistributed(Conv2D(64, (3, 3) , padding='same', activation='relu')))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))
  
  model.add(TimeDistributed(Conv2D(128, (3, 3) , padding='same', activation='relu')))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))

  model.add(TimeDistributed(Conv2D(256, (3, 3) , padding='same', activation='relu')))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))

  model.add(TimeDistributed(Flatten()))

  model.add(GRU(rnn_cells))
  
  model.add(Dense(dense_neurons, activation='relu'))
  model.add(Dropout(dropout))

  model.add(Dense(num_gestures, activation='softmax'))

  print('Total params: {:,}'.format(model.count_params()))

  return build_model("cnn_gru", model, batch_size, num_epochs, num_gesture_frames, image_size, num_channels, learning_rate), model
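
Swapping the LSTM for a GRU reduces the recurrent layer's parameter count, since a GRU has three weight blocks where an LSTM has four. The sketch below compares the two for the same input size. The GRU formula assumes the `reset_after=True` default of TensorFlow 2 Keras (two bias vectors per block), and the function names are hypothetical.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and one bias."""
    return 4 * (units * (input_dim + units) + units)

def gru_params(input_dim, units):
    """Three gates, each with input weights, recurrent weights and two biases
    (assumes reset_after=True)."""
    return 3 * (units * (input_dim + units) + 2 * units)

# 1024 features per frame (64x64 input after five poolings), 128 recurrent cells
print("LSTM:", lstm_params(1024, 128))
print("GRU: ", gru_params(1024, 128))
```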

### Model - 9

In [None]:
st = time.time()  # restart the timer so the elapsed time covers this run only
hist,model = try_cnn_GRU(batch_size=30, num_epochs=25, num_gesture_frames=30,
                                      image_size= 64, num_channels= 3, batch_normalize=False,
                                      learning_rate= 0.001, dropout= 0.25, dense_neurons= 128, rnn_cells=128)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(30, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)


### Model - 10

In [None]:
st = time.time()  # restart the timer so the elapsed time covers this run only
hist,model = try_cnn_GRU(batch_size= 40, num_epochs=30, num_gesture_frames=30,
                                      image_size= 128, num_channels= 1, batch_normalize=False,
                                      learning_rate= 0.001, dropout= 0.25, dense_neurons= 256, rnn_cells=128)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(40, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)


### Model - 11

In [None]:
st = time.time()  # restart the timer so the elapsed time covers this run only
hist,model = try_cnn_GRU(batch_size= 30, num_epochs= 20, num_gesture_frames=30,
                                      image_size= 64, num_channels=3, batch_normalize=False,
                                      learning_rate= 0.001, dropout = 0.25, dense_neurons= 256, rnn_cells=256)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(30, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)


### Model - 12

In [None]:
st = time.time()  # restart the timer so the elapsed time covers this run only
hist,model = try_cnn_GRU(batch_size=40, num_epochs=25, num_gesture_frames= 30,
                                      image_size=100, num_channels=1, batch_normalize=False,
                                      learning_rate=0.001, dropout= 0.25, dense_neurons=512, rnn_cells=128)
df_train = pd.DataFrame(hist.history)
df_best = df_train[df_train['val_loss'] == df_train['val_loss'].min()]
et = time.time()
elapsed_time = et - st
mbs = get_model_memory_usage(40, model)  # same batch size as the training call above
print("Execution time (secs):", elapsed_time, "Memory usage(MB)", mbs)
print("Best stats:")
print(df_best)


### TL_MobileNet_LSTM:

In [None]:
def try_TL_MobileNet_LSTM(batch_size, num_epochs, num_gesture_frames, image_size, num_channels, batch_normalize, learning_rate, dropout, dense_neurons, rnn_cells) :

  mobilenet_tl = tf.keras.applications.MobileNetV2(weights='imagenet', include_top=False, classes=num_gestures)

  model = Sequential()
  model.add(TimeDistributed(mobilenet_tl, input_shape=(num_gesture_frames,image_size,image_size,num_channels)))
   
  for layer in model.layers:
      layer.trainable = False
  
  #model.add(TimeDistributed(BatchNormalization()))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))
  model.add(TimeDistributed(Flatten()))

  model.add(LSTM(rnn_cells))
  model.add(Dropout(dropout))
  
  model.add(Dense(dense_neurons,activation='relu'))
  model.add(Dropout(dropout))
  
  model.add(Dense(num_gestures, activation='softmax'))
  
  print('Total params: {:,}'.format(model.count_params()))

  return build_model("TL_MobileNet_LSTM", model, batch_size, num_epochs, num_gesture_frames, image_size, num_channels, learning_rate), model

### Model - 13

In [None]:
hist,model = try_TL_MobileNet_LSTM(batch_size=25, num_epochs= 20, num_gesture_frames=30,
                                      image_size=100, num_channels=3, batch_normalize=False,
                                      learning_rate= 0.001, dropout= 0.25, dense_neurons=128, rnn_cells=64)

### Model - 14

In [None]:
hist,model = try_TL_MobileNet_LSTM(batch_size=30, num_epochs=25, num_gesture_frames= 30,
                                      image_size=140, num_channels= 3, batch_normalize=False,
                                      learning_rate= 0.001, dropout= 0.25, dense_neurons= 256, rnn_cells=256)

### Model - 15

In [None]:
hist,model = try_TL_MobileNet_LSTM(batch_size= 30, num_epochs=20, num_gesture_frames= 30,
                                      image_size=64, num_channels=3, batch_normalize=False,
                                      learning_rate= 0.001, dropout= 0.25, dense_neurons= 128, rnn_cells=128)

### Model - 16

In [None]:
hist,model = try_TL_MobileNet_LSTM(batch_size= 40, num_epochs=20, num_gesture_frames=30,
                                      image_size=128, num_channels= 3, batch_normalize=False,
                                      learning_rate=0.001, dropout=0.25, dense_neurons=256, rnn_cells=128)

### TL_MobileNet_GRU:

In [None]:
def try_TL_MobileNet_GRU(batch_size, num_epochs, num_gesture_frames, image_size, num_channels, batch_normalize, learning_rate, dropout, dense_neurons, rnn_cells) :

  mobilenet_tl = tf.keras.applications.MobileNet(weights='imagenet', include_top=False, classes=num_gestures)

  model = Sequential()
  model.add(TimeDistributed(mobilenet_tl, input_shape=(num_gesture_frames,image_size,image_size,num_channels)))
   
  for layer in model.layers:
      layer.trainable = False
  
  #model.add(TimeDistributed(BatchNormalization()))
  model.add(TimeDistributed(MaxPooling2D((2, 2))))
  model.add(TimeDistributed(Flatten()))

  model.add(GRU(rnn_cells))
  model.add(Dropout(dropout))
  
  model.add(Dense(dense_neurons,activation='relu'))
  model.add(Dropout(dropout))
  
  model.add(Dense(num_gestures, activation='softmax'))
  
  print('Total params: {:,}'.format(model.count_params()))

  return build_model("TL_MobileNet_GRU", model, batch_size, num_epochs, num_gesture_frames, image_size, num_channels, learning_rate), model

### Model - 17

In [None]:
hist,model = try_TL_MobileNet_GRU(batch_size= 30, num_epochs= 25, num_gesture_frames= 30,
                                      image_size= 64, num_channels= 3, batch_normalize=False,
                                      learning_rate= 0.001, dropout= 0.25, dense_neurons= 128, rnn_cells=256)

### Model - 18

In [None]:
hist,model = try_TL_MobileNet_GRU(batch_size= 35, num_epochs= 25, num_gesture_frames= 30,
                                      image_size= 100, num_channels= 3, batch_normalize=False,
                                      learning_rate= 0.001, dropout= 0.25, dense_neurons= 64, rnn_cells=128)

### Model - 19

In [None]:
hist,model = try_TL_MobileNet_GRU(batch_size= 40, num_epochs= 25, num_gesture_frames= 30,
                                      image_size= 64, num_channels= 3, batch_normalize=False,
                                      learning_rate= 0.001, dropout= 0.25, dense_neurons= 128, rnn_cells=64)

### Model - 20

In [None]:
hist,model = try_TL_MobileNet_GRU(batch_size= 30, num_epochs= 20, num_gesture_frames= 30,
                                      image_size= 140, num_channels= 3, batch_normalize=False,
                                      learning_rate= 0.001, dropout= 0.25, dense_neurons= 512, rnn_cells=256)

## After all the experiments, we finalized Model 8 (CNN + LSTM), which performed well.
__Reasons:__

__- Training accuracy: 93%, validation accuracy: 85%__

__- Number of parameters (1,657,445) is low relative to the other models with comparable performance__

__- The learning rate decreased gradually after epoch 16__


__The best weights of the CNN-LSTM model are in model-00020-0.19649-0.93514-0.45695-0.85000.h5 (19 MB). We used these weights for model testing; the performance is shown below.__
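
The checkpoint file name encodes the epoch and metrics through the `filepath` template in `build_model` (epoch, loss, categorical accuracy, validation loss, validation categorical accuracy). A small sketch for reading them back follows; the helper name `parse_checkpoint_name` is hypothetical.

```python
FIELDS = ('epoch', 'loss', 'categorical_accuracy', 'val_loss', 'val_categorical_accuracy')

def parse_checkpoint_name(filename):
    """Recover the epoch and metrics encoded in a checkpoint file name."""
    stem = filename.rsplit('.', 1)[0]   # drop the ".h5" extension
    parts = stem.split('-')[1:]         # drop the leading "model"
    values = [int(parts[0])] + [float(p) for p in parts[1:]]
    return dict(zip(FIELDS, values))

stats = parse_checkpoint_name('model-00020-0.19649-0.93514-0.45695-0.85000.h5')
print(stats)
```

This works only because all the logged values are non-negative; a negative number would introduce an extra `-` separator.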


# Loading model and Testing

In [None]:
import time
from keras.models import load_model
model = load_model('model_init_2020-06-2522_00_52.036987/model-00020-0.19649-0.93514-0.45695-0.85000.h5')

In [None]:
# Draw one validation batch with the notebook's own generator().
# The frame count, image size and channel count must match the input shape of the loaded model.
g = generator(val_path, val_doc, batch_size=20, num_gesture_frames=18, image_size=120, num_channels=3)
batch_data, batch_labels = next(g)

In [None]:
batch_labels
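
To compare a model's output with `batch_labels`, the one-hot rows and the softmax rows can both be reduced to class indices. The sketch below does this on plain lists. The probabilities and labels are hypothetical values, not results from this notebook.

```python
def argmax(row):
    """Index of the largest value in a row."""
    return max(range(len(row)), key=lambda i: row[i])

def batch_accuracy(probabilities, one_hot_labels):
    """Fraction of rows where the predicted class matches the labelled class."""
    hits = sum(argmax(p) == argmax(y) for p, y in zip(probabilities, one_hot_labels))
    return hits / len(one_hot_labels)

# Hypothetical softmax outputs and one-hot labels for three videos
probabilities = [
    [0.7, 0.1, 0.1, 0.05, 0.05],
    [0.1, 0.2, 0.5, 0.1, 0.1],
    [0.3, 0.4, 0.1, 0.1, 0.1],
]
labels = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]
accuracy = batch_accuracy(probabilities, labels)
print(accuracy)
```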
