<a href="https://colab.research.google.com/github/s-ahuja/gesture_recognition/blob/master/Conv%203D%20Model%20Execution%20RESNET3D.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Checking whether we are in GPU mode
!nvidia-smi

Sun Jun 16 04:24:34 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   71C    P8    18W /  70W |      0MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|  No ru

In [0]:
# Clone the entire repo.
data_folder = '/content/cloned-repo'
root_dir = f'{data_folder}/Project_data'

In [3]:
# full_restart = True..will gitclone on numpy arrays of images to google colab repo for access. required once per session.
full_restart = False
if (full_restart):
  import shutil
  shutil.rmtree(data_folder,ignore_errors=True)
  import os
  os.makedirs(data_folder,exist_ok=True)
  os.chdir(data_folder)
  !git clone -l -s "https://github.com/s-ahuja/gesture_recognition.git" {data_folder}
%cd {data_folder}/Project_data
!ls

/content/cloned-repo/Project_data
model_init_2019-06-1604_17_16.550747  train.csv		val.csv
numpy				      train_images.pkl	val_images.pkl


In [0]:
# The First Notebook - Data Pre-Processing converted raw images to numpy arrays of standard format 100x100x3 
# and created 2 pkl files(train_images,val_images) for a list of files with paths of npy arrays.

import pickle
filehandler = open('train_images.pkl',"rb")
train_images = pickle.load(filehandler)
filehandler.close()
filehandler = open('val_images.pkl',"rb")
val_images = pickle.load(filehandler)
filehandler.close()

In [5]:
len(train_images[0]),len(val_images[0])

(19890, 3000)

In [0]:
# with limited disk space we need to delete model directories which didnt get any better experiment result
import glob, shutil
model_directories = glob.glob(f'{root_dir}/model*')
for directory in model_directories:
    shutil.rmtree(directory)

# Gesture Recognition
In this group project, you are going to build a 3D Conv model that will be able to predict the 5 gestures correctly. Please import the following libraries to get started.

In [0]:
%load_ext autoreload
%autoreload 2
import numpy as np
import os
import glob
from cv2 import imread,resize,cvtColor,COLOR_BGR2RGB,INTER_AREA
import datetime
import matplotlib.pyplot as plt
import warnings
warnings.simplefilter('ignore')

We set the random seed so that the results don't vary drastically.

In [8]:
np.random.seed(30)
import random as rn
rn.seed(30)
from keras import backend as K
import tensorflow as tf
tf.set_random_seed(30)

Using TensorFlow backend.


In [9]:
# setting up basic parameters for the model
img_indices = list(range(1,30,2))
print(img_indices)
input_shape=(len(img_indices),100,100,3) #15 images per video, each image is of 100x100x3
print(input_shape)
batch_size = 10
no_of_classes = 5
num_epochs = 50 # choose the number of epochs

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29]
(15, 100, 100, 3)


In [10]:
train_doc = np.random.permutation(open(f'{root_dir}/train.csv').readlines())
val_doc = np.random.permutation(open(f'{root_dir}/val.csv').readlines())

# since train.csv and val.csv contains all traning/validation data
# these statements filter out if we want to first train/val on a limited set only
# filter out records which are not considered during experiments
train_doc = [x for x in train_doc if x.split(';')[0] in [paths.split('/')[-2] for paths in train_images[0]]] 
val_doc = [x for x in val_doc if x.split(';')[0] in [paths.split('/')[-2] for paths in val_images[0]]]

print('no. of training video seq=',len(train_doc),'no. of validation video seq=',len(val_doc))

no. of training video seq= 663 no. of validation video seq= 100


## Generator
This is one of the most important part of the code. The overall structure of the generator has been given. In the generator, you are going to preprocess the images as you have images of 2 different dimensions as well as create a batch of video frames. You have to experiment with `img_idx`, `y`,`z` and normalization such that you get high accuracy.

In [0]:
def generator(source_path, folder_list, batch_size, ablation=False):
    #print('Source path = ', source_path, '; batch size =', batch_size)
    # if ablation is true then only take only 2 images else take much more.
    img_idx = img_indices #create a list of image numbers you want to use for a particular video. 
    x = len(img_idx)    
    y,z = input_shape[1],input_shape[2] # image_standard_size
    Model_Execution = True # this parameter was used to test the generator function. Model_Execution = FALSE, iterates 1 EPOCH and exits after 1 full EPOCH is complete
    while Model_Execution:
        t = np.random.permutation(folder_list)
        num_batches = int(len(folder_list)/batch_size) # calculate the number of batches
        for batch in range(num_batches): # we iterate over the number of batches
            batch_data = np.zeros((batch_size,x,y,z,input_shape[3])) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((batch_size,no_of_classes)) # batch_labels is the one hot representation of the output
            for folder in range(batch_size): # iterate over the batch_size
                imgs = os.listdir(source_path+'/'+ t[folder + (batch*batch_size)].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    img_file_path = source_path+'/'+ t[folder + (batch*batch_size)].strip().split(';')[0]+'/'+imgs[item]
                    image = np.load(img_file_path)#.astype(np.float32)
                    #crop the images and resize them. Note that the images are of 2 different shape
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    # IMPORTANT NOTE:  the CROPPING/RESIZING OF IMAGES IS DONE IN THE PRE-PROCESSING STEP ITSELF
                    # TO SAVE GENERATOR TIME AND TO RE-DO EACH MODEL EXECUTION RUN
                    batch_data[folder,idx,:,:,0] = (image[:,:,0] - image[:,:,0].min())/(image[:,:,0].max() - image[:,:,0].min()) #normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (image[:,:,1] - image[:,:,1].min())/(image[:,:,1].max() - image[:,:,1].min()) #normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (image[:,:,2] - image[:,:,2].min())/(image[:,:,2].max() - image[:,:,2].min()) #normalise and feed in the image

                batch_labels[folder, int(t[folder + (batch*batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels #you yield the batch_data and the batch_labels, remember what does yield do
        # write the code for the remaining data points which are left after full batches
        remaining_size = len(folder_list) - (batch_size*num_batches)
        if (remaining_size > 0):
            #print('remaining_size=',remaining_size)
            batch_data = np.zeros((remaining_size,x,y,z,input_shape[3])) # x is the number of images you use for each video, (y,z) is the final size of the input images and 3 is the number of channels RGB
            batch_labels = np.zeros((remaining_size,no_of_classes)) # batch_labels is the one hot representation of the output
            for folder in range(remaining_size):
                imgs = os.listdir(source_path+'/'+ t[folder].split(';')[0]) # read all the images in the folder
                for idx,item in enumerate(img_idx): #  Iterate iver the frames/images of a folder to read them in
                    img_file_path = source_path+'/'+ t[folder].strip().split(';')[0]+'/'+imgs[item]
                    # print('img_file_path=' + img_file_path)
                    image = np.load(img_file_path)#.astype(np.float32)                                        
                    #print(image.shape)
                    #crop the images and resize them. Note that the images are of 2 different shape
                    #and the conv3D will throw error if the inputs in a batch have different shapes
                    # IMPORTANT NOTE:  the CROPPING/RESIZING OF IMAGES IS DONE IN THE PRE-PROCESSING STEP ITSELF
                    # TO SAVE GENERATOR TIME AND TO RE-DO EACH MODEL EXECUTION RUN
                    batch_data[folder,idx,:,:,0] = (image[:,:,0] - image[:,:,0].min())/(image[:,:,0].max() - image[:,:,0].min()) #normalise and feed in the image
                    batch_data[folder,idx,:,:,1] = (image[:,:,1] - image[:,:,1].min())/(image[:,:,1].max() - image[:,:,1].min()) #normalise and feed in the image
                    batch_data[folder,idx,:,:,2] = (image[:,:,2] - image[:,:,2].min())/(image[:,:,2].max() - image[:,:,2].min()) #normalise and feed in the image
                batch_labels[folder, int(t[folder].strip().split(';')[2])] = 1
            yield batch_data, batch_labels
            Model_Execution = True 

# ## test code to test generator function
# train_path = f'{root_dir}/numpy/train'
# num_train_sequences = len(train_doc)
# train_generator = generator(source_path=train_path, folder_list=train_doc, batch_size=batch_size, ablation=False)
# index = 0
# for batch_data, batch_labels in train_generator: 
#     index += len(batch_data)    
#     print(batch_data.shape,batch_labels.shape) 

Note here that a video is represented above in the generator as (number of images, height, width, number of channels). Take this into consideration while creating the model architecture.

In [12]:
curr_dt_time = datetime.datetime.now()
train_path = f'{root_dir}/numpy/train'
val_path = f'{root_dir}/numpy/val'
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
print ('# epochs =', num_epochs)

# training sequences = 663
# validation sequences = 100
# epochs = 50


## Model
Here you make the model using different functionalities that Keras provides. Remember to use `Conv3D` and `MaxPooling3D` and not `Conv2D` and `Maxpooling2D` for a 3D convolution model. You would want to use `TimeDistributed` while building a Conv2D + RNN model. Also remember that the last layer is the softmax. Design the network in such a way that the model is able to give good accuracy on the least number of parameters so that it can fit in the memory of the webcam.

In [0]:
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, Flatten, BatchNormalization, Activation, Dropout
from keras.layers.convolutional import Conv3D, MaxPooling3D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, CSVLogger,EarlyStopping
from keras import optimizers

import matplotlib.pyplot as plt
import keras.backend as K
from keras.callbacks import Callback

In [14]:


"""A vanilla 3D resnet implementation.

Based on Raghavendra Kotikalapudi's 2D implementation
keras-resnet (See https://github.com/raghakot/keras-resnet.)
"""
from __future__ import (
    absolute_import,
    division,
    print_function,
    unicode_literals
)
import six
from math import ceil
from keras.models import Model
from keras.layers import (
    Input,
    Activation,
    Dense,
    Flatten
)
from keras.layers.convolutional import (
    Conv3D,
    AveragePooling3D,
    MaxPooling3D
)
from keras.layers.merge import add
from keras.layers.normalization import BatchNormalization
from keras.regularizers import l2
from keras import backend as K


def _bn_relu(input):
    """Helper to build a BN -> relu block (by @raghakot)."""
    norm = BatchNormalization(axis=CHANNEL_AXIS)(input)
    return Activation("relu")(norm)


def _conv_bn_relu3D(**conv_params):
    filters = conv_params["filters"]
    kernel_size = conv_params["kernel_size"]
    strides = conv_params.setdefault("strides", (1, 1, 1))
    kernel_initializer = conv_params.setdefault(
        "kernel_initializer", "he_normal")
    padding = conv_params.setdefault("padding", "same")
    kernel_regularizer = conv_params.setdefault("kernel_regularizer",
                                                l2(1e-4))

    def f(input):
        conv = Conv3D(filters=filters, kernel_size=kernel_size,
                      strides=strides, kernel_initializer=kernel_initializer,
                      padding=padding,
                      kernel_regularizer=kernel_regularizer)(input)
        return _bn_relu(conv)

    return f


def _bn_relu_conv3d(**conv_params):
    """Helper to build a  BN -> relu -> conv3d block."""
    filters = conv_params["filters"]
    kernel_size = conv_params["kernel_size"]
    strides = conv_params.setdefault("strides", (1, 1, 1))
    kernel_initializer = conv_params.setdefault("kernel_initializer",
                                                "he_normal")
    padding = conv_params.setdefault("padding", "same")
    kernel_regularizer = conv_params.setdefault("kernel_regularizer",
                                                l2(1e-4))

    def f(input):
        activation = _bn_relu(input)
        return Conv3D(filters=filters, kernel_size=kernel_size,
                      strides=strides, kernel_initializer=kernel_initializer,
                      padding=padding,
                      kernel_regularizer=kernel_regularizer)(activation)
    return f


def _shortcut3d(input, residual):
    """3D shortcut to match input and residual and merges them with "sum"."""
    stride_dim1 = ceil(input._keras_shape[DIM1_AXIS] \
        / residual._keras_shape[DIM1_AXIS])
    stride_dim2 = ceil(input._keras_shape[DIM2_AXIS] \
        / residual._keras_shape[DIM2_AXIS])
    stride_dim3 = ceil(input._keras_shape[DIM3_AXIS] \
        / residual._keras_shape[DIM3_AXIS])
    equal_channels = residual._keras_shape[CHANNEL_AXIS] \
        == input._keras_shape[CHANNEL_AXIS]

    shortcut = input
    if stride_dim1 > 1 or stride_dim2 > 1 or stride_dim3 > 1 \
            or not equal_channels:
        shortcut = Conv3D(
            filters=residual._keras_shape[CHANNEL_AXIS],
            kernel_size=(1, 1, 1),
            strides=(stride_dim1, stride_dim2, stride_dim3),
            kernel_initializer="he_normal", padding="valid",
            kernel_regularizer=l2(1e-4)
            )(input)
    return add([shortcut, residual])


def _residual_block3d(block_function, filters, kernel_regularizer, repetitions,
                      is_first_layer=False):
    def f(input):
        for i in range(repetitions):
            strides = (1, 1, 1)
            if i == 0 and not is_first_layer:
                strides = (2, 2, 2)
            input = block_function(filters=filters, strides=strides,
                                   kernel_regularizer=kernel_regularizer,
                                   is_first_block_of_first_layer=(
                                       is_first_layer and i == 0)
                                   )(input)
        return input

    return f


def basic_block(filters, strides=(1, 1, 1), kernel_regularizer=l2(1e-4),
                is_first_block_of_first_layer=False):
    """Basic 3 X 3 X 3 convolution blocks. Extended from raghakot's 2D impl."""
    def f(input):
        if is_first_block_of_first_layer:
            # don't repeat bn->relu since we just did bn->relu->maxpool
            conv1 = Conv3D(filters=filters, kernel_size=(3, 3, 3),
                           strides=strides, padding="same",
                           kernel_initializer="he_normal",
                           kernel_regularizer=kernel_regularizer
                           )(input)
        else:
            conv1 = _bn_relu_conv3d(filters=filters,
                                    kernel_size=(3, 3, 3),
                                    strides=strides,
                                    kernel_regularizer=kernel_regularizer
                                    )(input)

        residual = _bn_relu_conv3d(filters=filters, kernel_size=(3, 3, 3),
                                   kernel_regularizer=kernel_regularizer
                                   )(conv1)
        return _shortcut3d(input, residual)

    return f


def bottleneck(filters, strides=(1, 1, 1), kernel_regularizer=l2(1e-4),
               is_first_block_of_first_layer=False):
    """Basic 3 X 3 X 3 convolution blocks. Extended from raghakot's 2D impl."""
    def f(input):
        if is_first_block_of_first_layer:
            # don't repeat bn->relu since we just did bn->relu->maxpool
            conv_1_1 = Conv3D(filters=filters, kernel_size=(1, 1, 1),
                              strides=strides, padding="same",
                              kernel_initializer="he_normal",
                              kernel_regularizer=kernel_regularizer
                              )(input)
        else:
            conv_1_1 = _bn_relu_conv3d(filters=filters, kernel_size=(1, 1, 1),
                                       strides=strides,
                                       kernel_regularizer=kernel_regularizer
                                       )(input)

        conv_3_3 = _bn_relu_conv3d(filters=filters, kernel_size=(3, 3, 3),
                                   kernel_regularizer=kernel_regularizer
                                   )(conv_1_1)
        residual = _bn_relu_conv3d(filters=filters * 4, kernel_size=(1, 1, 1),
                                   kernel_regularizer=kernel_regularizer
                                   )(conv_3_3)

        return _shortcut3d(input, residual)

    return f


def _handle_data_format():
    global DIM1_AXIS
    global DIM2_AXIS
    global DIM3_AXIS
    global CHANNEL_AXIS
    if K.image_data_format() == 'channels_last':
        DIM1_AXIS = 1
        DIM2_AXIS = 2
        DIM3_AXIS = 3
        CHANNEL_AXIS = 4
    else:
        CHANNEL_AXIS = 1
        DIM1_AXIS = 2
        DIM2_AXIS = 3
        DIM3_AXIS = 4


def _get_block(identifier):
    if isinstance(identifier, six.string_types):
        res = globals().get(identifier)
        if not res:
            raise ValueError('Invalid {}'.format(identifier))
        return res
    return identifier


class Resnet3DBuilder(object):
    """ResNet3D."""

    @staticmethod
    def build(input_shape, num_outputs, block_fn, repetitions, reg_factor):
        """Instantiate a vanilla ResNet3D keras model.

        # Arguments
            input_shape: Tuple of input shape in the format
            (conv_dim1, conv_dim2, conv_dim3, channels) if dim_ordering='tf'
            (filter, conv_dim1, conv_dim2, conv_dim3) if dim_ordering='th'
            num_outputs: The number of outputs at the final softmax layer
            block_fn: Unit block to use {'basic_block', 'bottlenack_block'}
            repetitions: Repetitions of unit blocks
        # Returns
            model: a 3D ResNet model that takes a 5D tensor (volumetric images
            in batch) as input and returns a 1D vector (prediction) as output.
        """
        _handle_data_format()
        if len(input_shape) != 4:
            raise ValueError("Input shape should be a tuple "
                             "(conv_dim1, conv_dim2, conv_dim3, channels) "
                             "for tensorflow as backend or "
                             "(channels, conv_dim1, conv_dim2, conv_dim3) "
                             "for theano as backend")

        block_fn = _get_block(block_fn)
        input = Input(shape=input_shape)
        # first conv
        conv1 = _conv_bn_relu3D(filters=64, kernel_size=(7, 7, 7),
                                strides=(2, 2, 2),
                                kernel_regularizer=l2(reg_factor)
                                )(input)
        pool1 = MaxPooling3D(pool_size=(3, 3, 3), strides=(2, 2, 2),
                             padding="same")(conv1)

        # repeat blocks
        block = pool1
        filters = 64
        for i, r in enumerate(repetitions):
            block = _residual_block3d(block_fn, filters=filters,
                                      kernel_regularizer=l2(reg_factor),
                                      repetitions=r, is_first_layer=(i == 0)
                                      )(block)
            filters *= 2

        # last activation
        block_output = _bn_relu(block)

        # average poll and classification
        pool2 = AveragePooling3D(pool_size=(block._keras_shape[DIM1_AXIS],
                                            block._keras_shape[DIM2_AXIS],
                                            block._keras_shape[DIM3_AXIS]),
                                 strides=(1, 1, 1))(block_output)
        flatten1 = Flatten()(pool2)
        if num_outputs > 1:
            dense = Dense(units=num_outputs,
                          kernel_initializer="he_normal",
                          activation="softmax",
                          kernel_regularizer=l2(reg_factor))(flatten1)
        else:
            dense = Dense(units=num_outputs,
                          kernel_initializer="he_normal",
                          activation="sigmoid",
                          kernel_regularizer=l2(reg_factor))(flatten1)

        model = Model(inputs=input, outputs=dense)
        return model

    @staticmethod
    def build_resnet_18(input_shape, num_outputs, reg_factor=1e-4):
        """Build resnet 18."""
        return Resnet3DBuilder.build(input_shape, num_outputs, basic_block,
                                     [2, 2, 2, 2], reg_factor=reg_factor)

    @staticmethod
    def build_resnet_34(input_shape, num_outputs, reg_factor=1e-4):
        """Build resnet 34."""
        return Resnet3DBuilder.build(input_shape, num_outputs, basic_block,
                                     [3, 4, 6, 3], reg_factor=reg_factor)

    @staticmethod
    def build_resnet_50(input_shape, num_outputs, reg_factor=1e-4):
        """Build resnet 50."""
        return Resnet3DBuilder.build(input_shape, num_outputs, bottleneck,
                                     [3, 4, 6, 3], reg_factor=reg_factor)

    @staticmethod
    def build_resnet_101(input_shape, num_outputs, reg_factor=1e-4):
        """Build resnet 101."""
        return Resnet3DBuilder.build(input_shape, num_outputs, bottleneck,
                                     [3, 4, 23, 3], reg_factor=reg_factor)

    @staticmethod
    def build_resnet_152(input_shape, num_outputs, reg_factor=1e-4):
        """Build resnet 152."""
        return Resnet3DBuilder.build(input_shape, num_outputs, bottleneck,
                                     [3, 8, 36, 3], reg_factor=reg_factor)


if __name__ == '__main__':

    from keras.optimizers import Adam
    import numpy as np
    import os
    
    os.environ["CUDA_VISIBLE_DEVICES"] = '0' 
    os.environ['KERAS_BACKEND'] = 'tensorflow'

    target_shape = (151, 139, 139, 1)
    data_size = 10
    num_outputs = 2

    model = Resnet3DBuilder.build_resnet_18(target_shape, num_outputs)
    
    adam = Adam(lr=1e-3, amsgrad=True)
    model.compile(loss="binary_crossentropy", optimizer=adam, metrics=['accuracy'])
    print(model.summary())

    # Mimic training

    ## Data Preparation
    sample_data = np.random.random((data_size, *target_shape))
    sample_raw_labels = np.random.randint(0, num_outputs, data_size, dtype=int)
    sample_labels = np.zeros((data_size, num_outputs))
    sample_labels[np.arange(data_size), sample_raw_labels] = 1
    ## Training
    model.fit(sample_data, sample_labels, epochs=2, batch_size=1)

W0616 04:24:47.534686 140151224657792 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:74: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

W0616 04:24:47.536144 140151224657792 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

W0616 04:24:47.540486 140151224657792 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4185: The name tf.truncated_normal is deprecated. Please use tf.random.truncated_normal instead.

W0616 04:24:47.628473 140151224657792 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

W0616 04:24

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 151, 139, 139 0                                            
__________________________________________________________________________________________________
conv3d_1 (Conv3D)               (None, 76, 70, 70, 6 22016       input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 76, 70, 70, 6 256         conv3d_1[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 76, 70, 70, 6 0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
max_poolin

In [0]:
#model.add(Conv3D(filters=32, kernel_size=(3,3,3), strides=(1, 1, 1), padding='same', activation='relu',data_format='channels_last'))
#model.add(MaxPooling3D(pool_size=(2, 2, 1)))
#model.add(Dense(1024))


#write your model here

##################### model 1 ################
# model = Sequential()
# model.add(Conv3D(filters=32, kernel_size=(3,3,3), strides=(1, 1, 1), activation=None, input_shape=(len(list(range(0,30,1))),120,160,3),data_format='channels_last'))
# model.add(LeakyReLU(alpha=0.1))
# model.add(MaxPooling3D(pool_size=(1, 1, 1)))
# model.add(Conv3D(filters=64, kernel_size=(3,3,3), strides=(1, 1, 1), activation=None))
# model.add(LeakyReLU(alpha=0.1))
# model.add(Flatten())
# model.add(Dense(5, activation='softmax')) # 5 are number of classes

In [0]:
# ###################### model 2 ################
# model = Sequential()
# model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape,data_format='channels_last'))
# model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# model.add(Conv3D(128, (3,3,3), activation='relu'))
# model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# model.add(Conv3D(256, (3,3,3), activation='relu'))
# model.add(Conv3D(256, (3,3,3), activation='relu'))
# model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# model.add(Conv3D(512, (3,3,3), activation='relu'))
# model.add(Conv3D(512, (3,3,3), activation='relu'))
# model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# # model.add(Conv3D(512, (3,3,3), activation='relu'))
# # model.add(Conv3D(512, (3,3,3), activation='relu'))
# # model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# model.add(Flatten())
# model.add(Dense(2048,activation='relu'))
# model.add(Dense(2048,activation='relu'))
# model.add(Dense(no_of_classes, activation='softmax'))

In [0]:
# ###################### model 2 ################
# model = Sequential()
# model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape,data_format='channels_last'))
# model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# model.add(Conv3D(128, (3,3,3), activation='relu'))
# model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# model.add(Conv3D(256, (3,3,3), activation='relu'))
# model.add(Conv3D(256, (3,3,3), activation='relu'))
# model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# model.add(Conv3D(512, (3,3,3), activation='relu'))
# model.add(Conv3D(512, (3,3,3), activation='relu'))
# model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# # model.add(Conv3D(512, (3,3,3), activation='relu'))
# # model.add(Conv3D(512, (3,3,3), activation='relu'))
# # model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# model.add(Flatten())
# model.add(Dense(2048,activation='relu'))
# model.add(Dense(2048,activation='relu'))
# model.add(Dense(no_of_classes, activation='softmax'))

In [0]:
# ###################### model 3 - deep model ################
# model = Sequential()
# model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape,data_format='channels_last'))
# model.add(Conv3D(64, (3,3,3), padding='same',activation='relu'))
# model.add(MaxPooling3D(pool_size=(1,2,2), strides=(1,2,2)))
# model.add(Conv3D(128, (3,3,3), padding='same',activation='relu'))
# model.add(Conv3D(128, (3,3,3),  activation='relu'))
# model.add(MaxPooling3D(pool_size=(1,2,2), strides=(1,2,2)))
# model.add(Conv3D(256, (3,3,3), activation='relu'))
# model.add(Conv3D(256, (3,3,3), activation='relu'))
# model.add(Conv3D(256, (3,3,3),  activation='relu'))
# model.add(Conv3D(256, (3,3,3),  activation='relu'))
# model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
# model.add(Flatten())
# model.add(Dense(2048,activation='relu'))
# model.add(Dense(2048,activation='relu'))
# model.add(Dense(no_of_classes, activation='softmax'))
# model.summary()

In [19]:
##################### model 3 - deep model ################
model = Sequential()
model.add(Conv3D(16, kernel_size=(3,3,3), activation='relu', input_shape=input_shape,data_format='channels_last'))
model.add(BatchNormalization())
model.add(Conv3D(16, (3,3,3), padding='same',activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(MaxPooling3D(pool_size=(1,2,2), strides=(1,2,2)))
model.add(Conv3D(32, (3,3,3), padding='same',activation='relu'))
model.add(BatchNormalization())
model.add(Conv3D(32, (3,3,3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(MaxPooling3D(pool_size=(1,2,2), strides=(1,2,2)))
model.add(Conv3D(64, (3,3,3),padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv3D(64, (3,3,3),padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv3D(64, (3,3,3),padding='same',  activation='relu'))
model.add(BatchNormalization())
model.add(Conv3D(64, (3,3,3),padding='same',  activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
model.add(Conv3D(128, (3,3,3),padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv3D(128, (3,3,3),padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv3D(128, (3,3,3),padding='same',  activation='relu'))
model.add(BatchNormalization())
model.add(Conv3D(128, (3,3,3),padding='same',  activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(2048,activation='relu'))
model.add(BatchNormalization())
model.add(Dense(2048,activation='relu'))
model.add(BatchNormalization())
model.add(Dense(no_of_classes, activation='softmax'))
model.summary()

W0616 04:25:11.684997 140151224657792 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_21 (Conv3D)           (None, 13, 98, 98, 16)    1312      
_________________________________________________________________
batch_normalization_18 (Batc (None, 13, 98, 98, 16)    64        
_________________________________________________________________
conv3d_22 (Conv3D)           (None, 13, 98, 98, 16)    6928      
_________________________________________________________________
batch_normalization_19 (Batc (None, 13, 98, 98, 16)    64        
_________________________________________________________________
dropout_1 (Dropout)          (None, 13, 98, 98, 16)    0         
_________________________________________________________________
max_pooling3d_2 (MaxPooling3 (None, 13, 49, 49, 16)    0         
_________________________________________________________________
conv3d_23 (Conv3D)           (None, 13, 49, 49, 32)    13856     
__________

Now that you have written the model, the next step is to `compile` the model. When you print the `summary` of the model, you'll see the total number of parameters you have to train.

In [20]:
optimiser = optimizers.SGD(lr=1e-3, decay=1e-6, momentum=0.9, nesterov=True) #write your optimizer
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d_21 (Conv3D)           (None, 13, 98, 98, 16)    1312      
_________________________________________________________________
batch_normalization_18 (Batc (None, 13, 98, 98, 16)    64        
_________________________________________________________________
conv3d_22 (Conv3D)           (None, 13, 98, 98, 16)    6928      
_________________________________________________________________
batch_normalization_19 (Batc (None, 13, 98, 98, 16)    64        
_________________________________________________________________
dropout_1 (Dropout)          (None, 13, 98, 98, 16)    0         
_________________________________________________________________
max_pooling3d_2 (MaxPooling3 (None, 13, 49, 49, 16)    0         
_________________________________________________________________
conv3d_23 (Conv3D)           (None, 13, 49, 49, 32)    13856     
__________

Let us create the `train_generator` and the `val_generator` which will be used in `.fit_generator`.

In [0]:
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size)

In [0]:
model_name = 'model_init' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
    
if not os.path.exists(model_name):
    os.mkdir(model_name)
        
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'

#checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)
my_model_checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=True, mode='auto', period=1)
csv_logger = CSVLogger('training.cnn3d.txt',separator=',', append=True)
early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=20, verbose=1)
reduce_LR = ReduceLROnPlateau(monitor='val_loss', factor=0.8, patience=10, min_lr=1e-10,verbose=1) # write the REducelronplateau code here
callbacks_list = [reduce_LR,csv_logger,my_model_checkpoint,early_stopping] #checkpoint

The `steps_per_epoch` and `validation_steps` are used by `fit_generator` to decide the number of next() calls it need to make.

In [0]:
if (num_train_sequences%batch_size) == 0:
    steps_per_epoch = int(num_train_sequences/batch_size)
else:
    steps_per_epoch = (num_train_sequences//batch_size) + 1

if (num_val_sequences%batch_size) == 0:
    validation_steps = int(num_val_sequences/batch_size)
else:
    validation_steps = (num_val_sequences//batch_size) + 1

Let us now fit the model. This will start training the model and with the help of the checkpoints, you'll be able to save the model at the end of each epoch.

In [24]:
#from resnet3d import Resnet3DBuilder
model = Resnet3DBuilder.build_resnet_34((15, 100, 100, 3), 5)
# print(model.summary())
# optimiser = optimizers.SGD(lr=1e-3, decay=1e-6, momentum=0.9, nesterov=True) #write your optimizer
# model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
# model.summary()
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
                    callbacks=callbacks_list, validation_data=val_generator, 
                    validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)


# lr_finder = LRFinder(min_lr=1e-5, 
#                      max_lr=1e-2, 
#                      steps_per_epoch=steps_per_epoch, 
#                      epochs=3)
# callbacks_list.append(lr_finder)


# #model.load_weights('./03-model_init_2019-06-1205_47_36.337812/model-00010-0.84542-0.67270-1.15618-0.59000.h5')
# model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1, 
#                     callbacks=callbacks_list, validation_data=val_generator, 
#                     validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)
# lr_finder.plot_loss()





Epoch 1/50

Epoch 00001: val_loss improved from inf to 10.46454, saving model to model_init_2019-06-1604_24_46.937344/model-00001-2.49934-0.36199-10.46454-0.23000.h5
Epoch 2/50

Epoch 00002: val_loss improved from 10.46454 to 5.01731, saving model to model_init_2019-06-1604_24_46.937344/model-00002-2.06945-0.45551-5.01731-0.26000.h5
Epoch 3/50

Epoch 00003: val_loss did not improve from 5.01731
Epoch 4/50

Epoch 00004: val_loss did not improve from 5.01731
Epoch 5/50

Epoch 00005: val_loss improved from 5.01731 to 1.80194, saving model to model_init_2019-06-1604_24_46.937344/model-00005-1.69415-0.52036-1.80194-0.44000.h5
Epoch 6/50

Epoch 00006: val_loss did not improve from 1.80194
Epoch 7/50

Epoch 00007: val_loss did not improve from 1.80194
Epoch 8/50

Epoch 00008: val_loss did not improve from 1.80194
Epoch 9/50

Epoch 00009: val_loss did not improve from 1.80194
Epoch 10/50

Epoch 00010: val_loss did not improve from 1.80194
Epoch 11/50

Epoch 00011: val_loss did not improve from

<keras.callbacks.History at 0x7f74b20a2748>

In [0]:
model.save('resnet3d.h5')

In [26]:
os.rename('/content/cloned-repo/Project_data/model_init_2019-06-1604_24_46.937344/','/content/cloned-repo/Project_data/04-model_init_2019-06-1208_44_45.080481')


FileNotFoundError: ignored

In [0]:
model.save(model_name + 'COMPLETE_model-00020-0.27388-0.88688-1.37700-0.57000.h5')

In [27]:
os.makedirs('/content/gdrive',exist_ok=True)
from google.colab import drive
drive.mount('/content/gdrive')


Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=email%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocs.test%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.photos.readonly%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fpeopleapi.readonly&response_type=code

Enter your authorization code:
··········
Mounted at /content/gdrive


In [28]:
import shutil
shutil.copy('../../../content/cloned-repo/Project_data/model_init_2019-06-1604_24_46.937344/model-00042-0.53521-0.95626-1.27140-0.70000.h5', '../../../content/gdrive/My Drive/gesture_recognition/resnet3d-34.model-00042-0.53521-0.95626-1.27140-0.70000.h5')

'../../../content/gdrive/My Drive/gesture_recognition/resnet3d-34.model-00042-0.53521-0.95626-1.27140-0.70000.h5'

In [0]:
!pwd

In [0]:
os.makedirs('../../../content/gdrive/My Drive/gesture_recognition/')