#ResNet
ResNet is one of the most widely used architectures for image classification. It builds on the observation that simply stacking more layers onto a CNN does not, by itself, make the network more accurate. When tested against the CIFAR-10 dataset, a state-of-the-art ResNet predicts with an accuracy of more than 93%. This notebook is an effort to replicate the ResNet architecture and understand how it works.



#Packages
Load the necessary packages.
 

In [10]:
import numpy as np
import pandas as pd
import cv2
import keras
import tensorflow as tf
import matplotlib.pyplot as plt
from keras import layers
from keras.layers import Input, Add, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D
from keras.models import Model, load_model
from keras.preprocessing import image
from keras.utils import layer_utils
from keras.preprocessing.image import ImageDataGenerator
from keras.initializers import glorot_uniform
%matplotlib inline
import os

#Index Block
The index block, more commonly called the identity block, is a small convolutional stack whose main path contains 8 layers: three convolutional layers, each followed by a batch normalization layer, with a relu activation after the first two. The output of the main path is added to the shortcut, which here is simply the unchanged input of the block (the shortcut is explained further in the next section), and the sum is passed through a final relu activation.



In [11]:
def index_block(x, f, filters, level, layer):
    
    con_name = str('res' + str(level) + str(layer)) 
    bnz_name = str('bn' + str(level) + str(layer)) 
    
    F1, F2, F3 = filters
    
    x_shortcut = x
    
    x = Conv2D(filters = F1, kernel_size = (1, 1), strides = (1,1), padding = 'valid', name = con_name + '2a', kernel_initializer = glorot_uniform(seed=0))(x)
    x = BatchNormalization(axis = 3, name = bnz_name + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters = F2, kernel_size = (f, f), strides = (1,1), padding = 'same', name = con_name + '2b', kernel_initializer = glorot_uniform(seed=0))(x)
    x = BatchNormalization(axis = 3, name = bnz_name + '2b')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters = F3, kernel_size = (1, 1), strides = (1,1), padding = 'valid', name = con_name + '2c', kernel_initializer = glorot_uniform(seed=0))(x)
    x = BatchNormalization(axis = 3, name = bnz_name + '2c')(x)

    x = Add()([x, x_shortcut])
    x = Activation('relu')(x)
    
    return x
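
The effect of the final `Add` can be seen without Keras. The following is a minimal pure-Python sketch, not the notebook's implementation: `relu`, `residual_unit`, and `zero_path` are hypothetical names, and the main path is reduced to an arbitrary function on a flat list of numbers. It illustrates that when the main path contributes nothing, the block passes a non-negative input through unchanged, so a block of this shape can always fall back to the identity mapping.

```python
def relu(values):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def residual_unit(x, main_path):
    """Return relu(main_path(x) + x), the pattern used at the end of the block."""
    fx = main_path(x)
    return relu([a + b for a, b in zip(fx, x)])


def zero_path(values):
    """Stand-in for a main path whose output has shrunk to zero."""
    return [0.0 for _ in values]


# Inputs to a block are non-negative because the previous block ends in ReLU.
x = [0.5, 2.0, 0.0]
out = residual_unit(x, zero_path)
print(out == x)
```

With any other `main_path`, the block only has to learn the difference between its input and the desired output, which is the "residual" in the architecture's name.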

#Convolutional Block
The convolutional block contains 3 convolutional layers on its main path, each followed by a batch normalization layer, and the first 2 have a relu activation as well. The difference from the index block is the shortcut: instead of passing the input through unchanged, the shortcut runs it through a 1x1 convolutional layer with stride `s` followed by a batch normalization layer, so that its shape matches the output of the main path. The two are then added together and followed again by a relu activation layer.

In [12]:
def con_block(x, f, filters, level, layer, s = 2):

    con_name = str('res' + str(level) + str(layer))
    bnz_name = str('bn' + str(level) + str(layer))
    
    F1, F2, F3 = filters
    
    x_shortcut = x

    x = Conv2D(F1, (1, 1), strides = (s,s), name = con_name + '2a', kernel_initializer = glorot_uniform(seed=0))(x)
    x = BatchNormalization(axis = 3, name = bnz_name + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters = F2, kernel_size = (f, f), strides = (1,1), padding = 'same', name = con_name + '2b', kernel_initializer = glorot_uniform(seed=0))(x)
    x = BatchNormalization(axis = 3, name = bnz_name + '2b')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters = F3, kernel_size = (1, 1), strides = (1,1), padding = 'valid', name = con_name + '2c', kernel_initializer = glorot_uniform(seed=0))(x)
    x = BatchNormalization(axis = 3, name = bnz_name + '2c')(x)

    x_shortcut = Conv2D(filters = F3, kernel_size = (1, 1), strides = (s,s), padding = 'valid', name = con_name + '1',
                        kernel_initializer = glorot_uniform(seed=0))(x_shortcut)
    x_shortcut = BatchNormalization(axis = 3, name = bnz_name + '1')(x_shortcut)

    x = Add()([x, x_shortcut])
    x = Activation('relu')(x)
    
    return x
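
The projection on the shortcut is needed because `Add` requires both of its inputs to have the same shape. The sketch below checks this with plain arithmetic, assuming the standard output-size rules for `'valid'` and `'same'` padding. `conv_out`, `main_path_shape`, and `shortcut_shape` are hypothetical helpers, and the 15x15x256 input is the shape that would reach the second `con_block` for a 64x64 image.

```python
def conv_out(size, kernel, stride, padding="valid"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1


def main_path_shape(size, filters, f=3, s=2):
    """Shape after the 1x1 (stride s), fxf ('same'), and 1x1 convolutions."""
    size = conv_out(size, 1, s)
    size = conv_out(size, f, 1, padding="same")
    size = conv_out(size, 1, 1)
    return (size, size, filters[2])


def shortcut_shape(size, channels, filters=None, s=2):
    """Shape of the shortcut: projected if filters is given, else unchanged."""
    if filters is None:
        return (size, size, channels)
    out = conv_out(size, 1, s)
    return (out, out, filters[2])


filters = [128, 128, 512]
main = main_path_shape(15, filters)            # (8, 8, 512)
projected = shortcut_shape(15, 256, filters)   # (8, 8, 512)
unprojected = shortcut_shape(15, 256)          # (15, 15, 256)
print(main == projected, main == unprojected)
```

Only the projected shortcut matches the main path, which is why a `con_block` opens every stage where the stride or the number of filters changes.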

#Model
A deep ResNet model is created here in four stages. Each stage opens with a convolutional block followed by a number of index blocks: 2 in the first stage, then 3, 5, and 2. The advantage that ResNet offers with the shortcut layer is that going deeper into the network should not degrade the performance of the model, which eases the vanishing gradient problem that is often a worry with a deep network. The shortcut gives the signal, and its gradient, a direct path around each block, so that what the network has learned is not lost in between.

In [13]:
def ResNet(input_shape=(64, 64, 3), classes=6):
    
    x_input = Input(input_shape)

    x = ZeroPadding2D((3, 3))(x_input)

    x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1', kernel_initializer=glorot_uniform(seed=0))(x)
    x = BatchNormalization(axis=3, name='bn_conv1')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2))(x)

    x = con_block(x, f=3, filters=[64, 64, 256], level=2, layer='a', s=1)
    x = index_block(x, 3, [64, 64, 256], level=2, layer='b')
    x = index_block(x, 3, [64, 64, 256], level=2, layer='c')

    x = con_block(x, f = 3, filters = [128, 128, 512], level = 3, layer='a', s = 2)
    x = index_block(x, 3, [128, 128, 512], level=3, layer='b')
    x = index_block(x, 3, [128, 128, 512], level=3, layer='c')
    x = index_block(x, 3, [128, 128, 512], level=3, layer='d')

    x = con_block(x, f = 3, filters = [256, 256, 1024], level = 4, layer='a', s = 2)
    x = index_block(x, 3, [256, 256, 1024], level=4, layer='b')
    x = index_block(x, 3, [256, 256, 1024], level=4, layer='c')
    x = index_block(x, 3, [256, 256, 1024], level=4, layer='d')
    x = index_block(x, 3, [256, 256, 1024], level=4, layer='e')
    x = index_block(x, 3, [256, 256, 1024], level=4, layer='f')

    x = con_block(x, f = 3, filters = [512, 512, 2048], level = 5, layer='a', s = 2)
    x = index_block(x, 3, [512, 512, 2048], level=5, layer='b')
    x = index_block(x, 3, [512, 512, 2048], level=5, layer='c')

    x = AveragePooling2D((2,2), name="avg_pool")(x)

    x = Flatten()(x)
    x = Dense(classes, activation='softmax', name='fc' + str(classes), kernel_initializer = glorot_uniform(seed=0))(x)
    
    model = Model(inputs = x_input, outputs = x, name='ResNet50')

    return model
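
It can help to trace how the spatial size shrinks through the network. The sketch below does that arithmetic for the default 64x64 input, assuming `'valid'` padding wherever the code above does not set it. `conv_out` and `trace_sizes` are hypothetical helpers, not Keras functions. The last lines count the layers that carry weights in the conventional way (ignoring the shortcut projections), which is where the 50 in the model name `ResNet50` comes from.

```python
def conv_out(size, kernel, stride):
    """Output size of a 'valid' convolution or pooling along one dimension."""
    return (size - kernel) // stride + 1


def trace_sizes(size=64):
    """Return the spatial size at the input and after each step of the model."""
    sizes = [size]
    size += 2 * 3                      # ZeroPadding2D((3, 3))
    sizes.append(size)
    size = conv_out(size, 7, 2)        # conv1, 7x7, stride 2
    sizes.append(size)
    size = conv_out(size, 3, 2)        # MaxPooling2D, 3x3, stride 2
    sizes.append(size)
    for stride in (1, 2, 2, 2):        # s of the con_block opening each stage
        size = conv_out(size, 1, stride)
        sizes.append(size)
    size = conv_out(size, 2, 2)        # AveragePooling2D((2, 2))
    sizes.append(size)
    return sizes


blocks_per_stage = [3, 4, 6, 3]        # one con_block plus the index blocks
# conv1 + three main-path convolutions per block + the dense layer
weighted_layers = 1 + 3 * sum(blocks_per_stage) + 1

print(trace_sizes())
print(weighted_layers)
```

For a 64x64 input the final feature map is 1x1 with 2048 channels, so `Flatten` hands 2048 values to the dense layer. A different input size would change that number, and with it the size of the dense layer.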

#Compiling
The model is compiled with the 'Adam' optimizer. Categorical cross-entropy is used as the loss because it is the common choice for multi-class classification with one-hot encoded labels.

In [9]:
model = ResNet(input_shape = (64, 64, 3), classes = 10)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
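
As a rough illustration of what this loss measures, the following pure-Python sketch computes categorical cross-entropy for a single sample. It is a simplification of what Keras does (Keras works on batches and handles clipping internally), and `one_hot` and `categorical_crossentropy` here are hypothetical stand-ins, not the library functions. The loss expects one-hot labels; integer class labels would have to be converted first, or paired with the `'sparse_categorical_crossentropy'` loss instead.

```python
import math


def one_hot(label, classes):
    """Turn an integer class label into a one-hot list of floats."""
    return [1.0 if i == label else 0.0 for i in range(classes)]


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log-probability assigned to the true class (single sample)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


y_true = one_hot(0, 3)            # the sample belongs to class 0
y_pred = [0.7, 0.2, 0.1]          # softmax output of a hypothetical model
loss = categorical_crossentropy(y_true, y_pred)
print(round(loss, 4))
```

The loss is zero only when the model assigns probability 1 to the correct class, and it grows quickly as that probability falls.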

#Summary
The model built here has over 23 million trainable parameters and, when trained on a decently preprocessed set of images, can yield very good accuracy. Naming every layer distinguishes the layers from one another and also helps in debugging the code when needed, especially when building a deep learning model like this.
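
The parameter counts reported by `model.summary()` can be reproduced by hand. The sketch below assumes the usual formulas: a convolution has `k*k*c_in*c_out` weights plus one bias per filter, and batch normalization keeps four values per channel, two of them trainable. It also assumes a 64x64 RGB input, so that the dense layer sees 2048 features. The helper names are hypothetical.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus one bias per output channel."""
    return kernel * kernel * c_in * c_out + c_out


def bn_params(channels):
    """gamma and beta (trainable), moving mean and variance (non-trainable)."""
    return 4 * channels


def block_params(c_in, filters, f=3, projection=False):
    """Parameters of one index block, or one con_block if projection is True."""
    f1, f2, f3 = filters
    total = conv_params(1, c_in, f1) + bn_params(f1)
    total += conv_params(f, f1, f2) + bn_params(f2)
    total += conv_params(1, f2, f3) + bn_params(f3)
    if projection:
        total += conv_params(1, c_in, f3) + bn_params(f3)
    return total


def resnet_params(classes=10):
    """Total parameter count of the model defined in this notebook."""
    total = conv_params(7, 3, 64) + bn_params(64)       # conv1 and bn_conv1
    c_in = 64
    # (filters, number of index blocks after the opening con_block)
    stages = [([64, 64, 256], 2), ([128, 128, 512], 3),
              ([256, 256, 1024], 5), ([512, 512, 2048], 2)]
    for filters, n_index in stages:
        total += block_params(c_in, filters, projection=True)
        c_in = filters[2]
        total += n_index * block_params(c_in, filters)
    total += c_in * classes + classes                   # dense layer
    return total


print(conv_params(7, 3, 64), bn_params(64))
print(resnet_params(classes=10))
```

The first line of output matches the `conv1` and `bn_conv1` rows of the summary (9472 and 256). The total comes to roughly 23.6 million for 10 classes, a small fraction of which are non-trainable batch normalization statistics.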



In [6]:
model.summary()

Model: "ResNet50"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 64, 64, 3)]  0                                            
__________________________________________________________________________________________________
zero_padding2d (ZeroPadding2D)  (None, 70, 70, 3)    0           input_1[0][0]                    
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 32, 32, 64)   9472        zero_padding2d[0][0]             
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 32, 32, 64)   256         conv1[0][0]                      
___________________________________________________________________________________________