# Gesture Recognition

In [1]:
import numpy as np
import os
from scipy.misc import imread, imresize
import datetime
import os

We set the random seed so that the results don't vary drastically.

In [2]:
np.random.seed(30)
import random as rn
rn.seed(30)
from keras import backend as K
import tensorflow as tf
tf.set_random_seed(30)

Using TensorFlow backend.


In this block, you read the folder names for training and validation. You also set the `batch_size` here. Note that you set the batch size in such a way that you are able to use the GPU in full capacity. You keep increasing the batch size until the machine throws an error.

In [3]:
train_doc = np.random.permutation(open('Project_data/train.csv').readlines())
val_doc = np.random.permutation(open('Project_data/val.csv').readlines())
batch_size = 10

## Generator
This is one of the most important part of the code. The overall structure of the generator has been given. In the generator, you are going to preprocess the images as you have images of 2 different dimensions as well as create a batch of video frames. You have to experiment with some of the parts of the generator function such that you get high accuracy.

In [4]:
def generator(source_path, folder_list, batch_size):
    print( 'Source path = ', source_path, '; batch size =', batch_size)
    img_idx = [0,1,2,4,6,8,10,12,14,16,18,20,22,24,26,27,28,29]
    while True:
        t = np.random.permutation(folder_list)
        num_batches = int(len(t)/batch_size)
        for batch in range(num_batches):
            batch_data = np.zeros((batch_size,18,84,84,3))
            batch_labels = np.zeros((batch_size,5))
            for folder in range(batch_size):
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0])
                for idx,item in enumerate(img_idx):
                    image = imread(source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    if image.shape[1] == 160:
                        image = imresize(image[:,20:140,:],(84,84)).astype(np.float32)
                    else:
                        image = imresize(image,(84,84)).astype(np.float32)
                    
                    batch_data[folder,idx,:,:,0] = image[:,:,0] - 104
                    batch_data[folder,idx,:,:,1] = image[:,:,1] - 117
                    batch_data[folder,idx,:,:,2] = image[:,:,2] - 123
                    
                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

        if (len(t)%batch_size) != 0:
            batch_data = np.zeros((len(t)%batch_size,18,84,84,3))
            batch_labels = np.zeros((len(t)%batch_size,5))
            for folder in range(len(t)%batch_size):
                imgs = os.listdir(source_path+'/'+ t[folder + (num_batches*batch_size)].split(';')[0])
                for idx,item in enumerate(img_idx):
                    image = imread(source_path+'/'+ t[folder + (num_batches*batch_size)].strip().split(';')[0]+'/'+imgs[item]).astype(np.float32)
                    if image.shape[1] == 160:
                        image = imresize(image[:,20:140,:],(84,84)).astype(np.float32)
                    else:
                        image = imresize(image,(84,84)).astype(np.float32)

                    batch_data[folder,idx,:,:,0] = image[:,:,0] - 104
                    batch_data[folder,idx,:,:,1] = image[:,:,1] - 117
                    batch_data[folder,idx,:,:,2] = image[:,:,2] - 123

                batch_labels[folder, int(t[folder + (num_batches*batch_size)].strip().split(';')[2])] = 1

            yield batch_data, batch_labels

In [5]:
curr_dt_time = datetime.datetime.now()
train_path = 'Project_data/train'
val_path = 'Project_data/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
num_epochs = 30
print ('# epochs =', num_epochs)

# training sequences = 663
# validation sequences = 100
# epochs = 30


## Model
Here you make the model using different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D` and not `Conv2D` and `Maxpooling2D`. Also remember that the last layer is the softmax. Remember that the network is designed in such a way that the model is able to fit in the memory of the webcam.

In [6]:
from keras.models import Sequential
from keras.layers import Dense, GRU, Dropout, Flatten, BatchNormalization, Activation
from keras.layers.convolutional import Conv3D, MaxPooling3D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers

model = Sequential()
model.add(Conv3D(64, (3,3,3), strides=(1,1,1), padding='same', input_shape=(18,84,84,3)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2,2,1), strides=(2,2,1)))

model.add(Conv3D(128, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

model.add(Conv3D(256, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

model.add(Conv3D(256, (3,3,3), strides=(1,1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling3D(pool_size=(2,2,2), strides=(2,2,2)))

model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(5, activation='softmax'))

Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

In [7]:
sgd = optimizers.SGD(lr=0.001, decay=1e-6, momentum=0.7, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_1 (Conv3D)            (None, 18, 84, 84, 64)    5248      
_________________________________________________________________
batch_normalization_1 (Batch (None, 18, 84, 84, 64)    256       
_________________________________________________________________
activation_1 (Activation)    (None, 18, 84, 84, 64)    0         
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 9, 42, 84, 64)     0         
_________________________________________________________________
conv3d_2 (Conv3D)            (None, 9, 42, 84, 128)    221312    
_________________________________________________________________
batch_normalization_2 (Batch (None, 9, 42, 84, 128)    512       
_________________________________________________________________
activation_2 (Activation)    (None, 9, 42, 84, 128)    0         
__________

Let us create the `train_generator` and the `val_generator` which will be used in `.fit_generator`.

In [8]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [9]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, verbose=1, mode='min', epsilon=0.0001, cooldown=0, min_lr=0.00001)
callbacks_list = [checkpoint, LR]



In [10]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

Let us now fit the model. This will start training the model and with the help of the checkpoints, you'll be able to save the model at the end of each epoch.

In [11]:
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Source path =  Project_data/val ; batch size =Source path =  Project_data/train  10
; batch size = 10
Epoch 1/30


`imread` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``imageio.imread`` instead.
  del sys.path[0]
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.
  from ipykernel import kernelapp as app
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.




`imread` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``imageio.imread`` instead.
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.



Epoch 00001: saving model to model_init_2020-09-1314_28_24.024543/model-00001-3.09214-0.27903-1.46109-0.37000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_2020-09-1314_28_24.024543/model-00002-1.40423-0.43137-1.07115-0.57000.h5
Epoch 3/30

`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.



Epoch 00003: saving model to model_init_2020-09-1314_28_24.024543/model-00003-1.18665-0.47813-1.00266-0.58000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2020-09-1314_28_24.024543/model-00004-1.15081-0.49774-0.90191-0.67000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2020-09-1314_28_24.024543/model-00005-1.03422-0.57164-0.89288-0.66000.h5
Epoch 6/30

Epoch 00006: saving model to model_init_2020-09-1314_28_24.024543/model-00006-0.98713-0.56712-0.77348-0.72000.h5
Epoch 7/30

Epoch 00007: saving model to model_init_2020-09-1314_28_24.024543/model-00007-0.94621-0.62142-0.90208-0.65000.h5
Epoch 8/30

Epoch 00008: saving model to model_init_2020-09-1314_28_24.024543/model-00008-0.87226-0.64555-0.80177-0.61000.h5

Epoch 00008: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 9/30

Epoch 00009: saving model to model_init_2020-09-1314_28_24.024543/model-00009-0.77876-0.68477-0.70298-0.77000.h5
Epoch 10/30

Epoch 00010: saving model to model_init_20

<keras.callbacks.History at 0x7fdc420f5828>

In [14]:
from keras.models import load_model, Model

model_name = './model_init_2020-09-1314_28_24.024543/model-00028-0.38487-0.86425-0.49906-0.81000.h5'
    
test_doc = open('Project_data/val.csv').readlines()
test_path = 'Project_data/val'
num_test_sequences = len(test_doc)
print ('# testing sequences =', num_test_sequences)
test_generator = generator(test_path, test_doc, batch_size)
model = load_model(model_name)
print("Model loaded.")
model_func = Model(inputs=[model.input], outputs=[model.output])
    
acc = 0
    
num_batches = int(num_test_sequences/batch_size)
    
for i in range(num_batches):
    x,true_labels = test_generator.__next__()
    print ("shape of x:", x.shape, "and shape of true_labels:", true_labels.shape)
    pred_idx = np.argmax(model_func.predict_on_batch(x), axis=1)
    for j,k in enumerate(pred_idx):
        if true_labels[j,k] == 1:
            acc += 1
                
if (num_test_sequences%batch_size) != 0:
    x,true_labels = test_generator.__next__()
    print ("shape of x:", x.shape, "and shape of true_labels:", true_labels.shape)
    pred_idx = np.argmax(model_func.predict_on_batch(x), axis=1)
    for j,k in enumerate(pred_idx):
        if true_labels[j,k] == 1:
            acc += 1

print('Accuracy is =', acc/num_test_sequences) 


# testing sequences = 100
Model loaded.
Source path =  Project_data/val ; batch size = 10


`imread` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``imageio.imread`` instead.
  del sys.path[0]
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.
  from ipykernel import kernelapp as app
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.


shape of x: (10, 18, 84, 84, 3) and shape of true_labels: (10, 5)
shape of x: (10, 18, 84, 84, 3) and shape of true_labels: (10, 5)
shape of x: (10, 18, 84, 84, 3) and shape of true_labels: (10, 5)
shape of x: (10, 18, 84, 84, 3) and shape of true_labels: (10, 5)
shape of x: (10, 18, 84, 84, 3) and shape of true_labels: (10, 5)
shape of x: (10, 18, 84, 84, 3) and shape of true_labels: (10, 5)
shape of x: (10, 18, 84, 84, 3) and shape of true_labels: (10, 5)
shape of x: (10, 18, 84, 84, 3) and shape of true_labels: (10, 5)
shape of x: (10, 18, 84, 84, 3) and shape of true_labels: (10, 5)
shape of x: (10, 18, 84, 84, 3) and shape of true_labels: (10, 5)
Accuracy is = 0.81


#### Training model with batch size = 20 and epoch = 30

In [15]:
batch_size = 20
epoch = 30
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1
    
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)

Epoch 1/30


`imread` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``imageio.imread`` instead.
  del sys.path[0]
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.
  from ipykernel import kernelapp as app



Epoch 00001: saving model to model_init_2020-09-1314_28_24.024543/model-00001-0.38120-0.87353-0.56175-0.82000.h5
Epoch 2/30

`imread` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``imageio.imread`` instead.
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.



Epoch 00002: saving model to model_init_2020-09-1314_28_24.024543/model-00002-0.39616-0.84685-0.47737-0.78000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_2020-09-1314_28_24.024543/model-00003-0.35627-0.87941-0.43775-0.80000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_2020-09-1314_28_24.024543/model-00004-0.37383-0.86186-0.64602-0.78000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_2020-09-1314_28_24.024543/model-00005-0.39440-0.83235-0.45071-0.86000.h5

Epoch 00005: ReduceLROnPlateau reducing learning rate to 6.25000029685907e-05.
Epoch 6/30

Epoch 00006: saving model to model_init_2020-09-1314_28_24.024543/model-00006-0.36714-0.86186-0.53477-0.80000.h5
Epoch 7/30

Epoch 00007: saving model to model_init_2020-09-1314_28_24.024543/model-00007-0.37272-0.84412-0.53560-0.84000.h5

Epoch 00007: ReduceLROnPlateau reducing learning rate to 3.125000148429535e-05.
Epoch 8/30

Epoch 00008: saving model to model_init_2020-09-1314_28_24.024543/model-00008-0.31598-0.


Epoch 00030: saving model to model_init_2020-09-1314_28_24.024543/model-00030-0.31574-0.88889-0.49893-0.78000.h5


<keras.callbacks.History at 0x7fdc103ee4a8>

In [16]:
from keras.applications.vgg16 import VGG16
from keras.layers import TimeDistributed

base_model = VGG16(include_top=False, weights='imagenet', input_shape=(84,84,3))
x = base_model.output
x = Flatten()(x)
#x.add(Dropout(0.5))
features = Dense(64, activation='relu')(x)
conv_model = Model(inputs=base_model.input, outputs=features)
    
for layer in base_model.layers:
    layer.trainable = False
        
model = Sequential()
model.add(TimeDistributed(conv_model, input_shape=(18,84,84,3)))
model.add(GRU(32, return_sequences=True))
model.add(GRU(16))
model.add(Dropout(0.5))
model.add(Dense(8, activation='relu'))
model.add(Dense(5, activation='softmax'))

sgd = optimizers.SGD(lr=0.001, decay=1e-6, momentum=0.7, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
print (model.summary())

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
time_distributed_1 (TimeDist (None, 18, 64)            14845824  
_________________________________________________________________
gru_1 (GRU)                  (None, 18, 32)            9312      
_________________________________________________________________
gru_2 (GRU)                  (None, 16)                2352      
_________________________________________________________________
dropout_3 (Dropout)          (None, 16)                0         
_________________________________________________________________
dense_4 (Dense)              (None, 8)                 136       
_________________________________________________________________
dense_5 (Dense)              (None, 5)                 45       

In [17]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [18]:
model_name = 'model_init_conv_lstm' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)

LR = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, verbose=1, mode='min', epsilon=0.0001, cooldown=0, min_lr=0.00001)
callbacks_list = [checkpoint, LR]

if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1
    
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)



Source path =  Project_data/val ; batch size = 20
Source path =  Project_data/train ; batch size = Epoch 1/30
20


`imread` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``imageio.imread`` instead.
  del sys.path[0]
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.
  from ipykernel import kernelapp as app
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.




`imread` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``imageio.imread`` instead.
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.
`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.



Epoch 00001: saving model to model_init_conv_lstm_2020-09-1314_28_24.024543/model-00001-1.75868-0.17949-1.62500-0.17000.h5
Epoch 2/30

Epoch 00002: saving model to model_init_conv_lstm_2020-09-1314_28_24.024543/model-00002-1.65170-0.21418-1.59917-0.22000.h5
Epoch 3/30

Epoch 00003: saving model to model_init_conv_lstm_2020-09-1314_28_24.024543/model-00003-1.60561-0.23077-1.60843-0.18000.h5
Epoch 4/30

Epoch 00004: saving model to model_init_conv_lstm_2020-09-1314_28_24.024543/model-00004-1.57236-0.27300-1.58851-0.29000.h5
Epoch 5/30

Epoch 00005: saving model to model_init_conv_lstm_2020-09-1314_28_24.024543/model-00005-1.54740-0.27903-1.57890-0.24000.h5
Epoch 6/30

Epoch 00006: saving model to model_init_conv_lstm_2020-09-1314_28_24.024543/model-00006-1.54463-0.31674-1.56166-0.31000.h5
Epoch 7/30

Epoch 00007: saving model to model_init_conv_lstm_2020-09-1314_28_24.024543/model-00007-1.54118-0.31222-1.57411-0.26000.h5
Epoch 8/30

Epoch 00008: saving model to model_init_conv_lstm_2020


Epoch 00029: saving model to model_init_conv_lstm_2020-09-1314_28_24.024543/model-00029-1.29939-0.50528-1.50761-0.31000.h5
Epoch 30/30

Epoch 00030: saving model to model_init_conv_lstm_2020-09-1314_28_24.024543/model-00030-1.28808-0.50980-1.51055-0.31000.h5


<keras.callbacks.History at 0x7fdc30c19cf8>

### Conclusions
* To conclude, a smaller mini-batch size (not too small) usually leads not only to a smaller number of iterations of a training algorithm, than a large batch size, but also to a higher accuracy overall, i.e, a neural network that performs better, in the same amount of training time, or less. Here we find the optimal batch size as 20. Whereas batch size 10 is too small and also need more time for training.
* GRU is better than LSTM as it is easy to modify and doesn't need memory units, therefore, faster to train than LSTM and give as per performance.
* Conv3D gives better accuracy when compared with LSTM/GRU models
* We can **select the Conv3D model with 82% accuracy as our final model for Gesture Recognition**.