# Implementation of ResNet50 Explained

### This notebook explains the implementation of ResNet50 using Keras

### 1.Import Packages and Define Backend

Keras.backend:
Multi-backend Keras has three backend implementations available: the TensorFlow backend, the Theano backend, and the CNTK backend.

The default configuration file looks like this:<br>
{<br>
    "image_data_format": "channels_last",<br>
    "epsilon": 1e-07,<br>
    "floatx": "float32",<br>
    "backend": "tensorflow"<br>
}<br>
You can change these settings by editing $HOME/.keras/keras.json.

image_data_format: String, either "channels_last" or "channels_first". It specifies which data format convention Keras will follow. (keras.backend.image_data_format() returns it.)

For 2D data (e.g. image), "channels_last" assumes (rows, cols, channels) while "channels_first" assumes (channels, rows, cols).

For 3D data, "channels_last" assumes (conv_dim1, conv_dim2, conv_dim3, channels) while "channels_first" assumes (channels, conv_dim1, conv_dim2, conv_dim3).
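
To make the two conventions concrete, here is a minimal NumPy sketch (not Keras code) that converts a batch of 2D images between the layouts. The batch axis and the array sizes are assumptions chosen for illustration.

```python
import numpy as np

# A hypothetical batch of 2 RGB images of 4x5 pixels in channels_last layout:
# (batch, rows, cols, channels)
batch_last = np.arange(2 * 4 * 5 * 3, dtype="float32").reshape(2, 4, 5, 3)

# Move the channel axis to position 1 to get channels_first layout:
# (batch, channels, rows, cols)
batch_first = np.transpose(batch_last, (0, 3, 1, 2))

# Undo the permutation to return to channels_last
restored = np.transpose(batch_first, (0, 2, 3, 1))

print(batch_last.shape)   # (2, 4, 5, 3)
print(batch_first.shape)  # (2, 3, 4, 5)
```

The values are unchanged; only the axis order differs, which is why a model built for one convention cannot be fed data in the other without a transpose.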

epsilon: Float, a numeric fuzzing constant used to avoid dividing by zero in some operations.

floatx: String, "float16", "float32", or "float64". Default float precision.

backend: String, "tensorflow", "theano", or "cntk".
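
As a rough illustration of how these four settings fit together, the sketch below reads a `keras.json` file with the standard library and falls back to the defaults listed above. The helper `load_keras_config` is hypothetical and is not part of Keras; Keras performs its own loading internally.

```python
import json
import os

# Defaults copied from the configuration shown above
DEFAULT_CONFIG = {
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow",
}

def load_keras_config(path=None):
    """Return the settings in keras.json merged over the defaults (hypothetical helper)."""
    if path is None:
        path = os.path.join(os.path.expanduser("~"), ".keras", "keras.json")
    config = dict(DEFAULT_CONFIG)
    if os.path.exists(path):
        with open(path) as f:
            config.update(json.load(f))
    return config

config = load_keras_config()
print(config["backend"], config["image_data_format"])
```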

In [None]:
import numpy as np
from keras import layers
from keras.layers import Input, Add, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D
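from keras.models import Model, load_model
from keras.initializers import glorot_uniform  # initializer used by the Conv2D and Dense layers below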
from keras import optimizers
from keras.preprocessing import image
import keras.backend as K
K.set_image_data_format('channels_last')
K.set_learning_phase(1)

### 2.Identity Block
ResNet uses two kinds of blocks: the identity block and the convolutional block.

The identity block has the following structure:

<img src="identity_block.jpg">
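
The essential idea of the identity block is that the unchanged input is added back to the output of the main path before the final ReLU, so both must have the same shape. A minimal NumPy sketch of that idea follows; `main_path` is a hypothetical stand-in for the three CONV/BN stages, not the real layers.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def main_path(x):
    # Hypothetical stand-in for CONV -> BN -> RELU -> CONV -> BN -> RELU -> CONV -> BN.
    # Any shape-preserving function works for this illustration.
    return 0.5 * x - 1.0

def identity_block_sketch(x):
    shortcut = x                 # save the input
    out = main_path(x)           # transform along the main path
    return relu(out + shortcut)  # add the shortcut, then apply ReLU

x = np.array([[-2.0, 0.0, 4.0]])
print(identity_block_sketch(x))  # [[0. 0. 5.]]
```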
    


In [None]:
def identity_block(X, f, filters, stage, block):
    """
    Implementation of the identity block shown in the figure above (identity_block.jpg)
    
    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network
    
    Returns:
    X -- output of the identity block, tensor of shape (m, n_H, n_W, n_C)
    """
    
    # defining name basis e.g res1a_branch2a
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    
    # Retrieve Filters
    F1, F2, F3 = filters
    
    # Save the input value. You'll need this later to add back to the main path. 
    X_shortcut = X
    
    # First component of main path
    #e.g res1a_branch2a
    X = Conv2D(filters = F1, kernel_size = (1, 1), strides = (1,1), padding = 'valid',
                     name = conv_name_base + '2a', kernel_initializer = glorot_uniform(seed=0))(X)
    #e.g bn1a_branch2a
    X = BatchNormalization(axis = 3, name = bn_name_base + '2a')(X)
    X = Activation('relu')(X)
       
    # Second component of main path
    X = Conv2D(filters = F2, kernel_size = (f,f), strides = (1,1), padding = 'same',
                    name = conv_name_base + '2b',kernel_initializer = glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3,name = bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # Third component of main path
    X = Conv2D(filters = F3,kernel_size = (1,1),strides = (1,1),padding = 'valid',
                    name = conv_name_base + '2c',kernel_initializer = glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3,name = bn_name_base + '2c')(X)

    # Final step: Add shortcut value to main path, and pass it through a RELU activation
    X = layers.add([X,X_shortcut])
    X = Activation('relu')(X)
    
    return X

### 3.Convolutional Block

The convolutional block has the following structure:
<img src="conv_block.jpg">

The only difference from the identity block is that the shortcut path contains a convolutional layer, followed by batch normalization.
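
The shortcut needs its own convolution because the main path changes both the spatial size (through the stride `s`) and the number of channels (to `F3`), and the two paths can only be added if their shapes agree. The sketch below works through that shape arithmetic in plain Python. The helper names are hypothetical, and the output-size formulas assume the usual `'valid'` and `'same'` padding conventions.

```python
def conv_output_size(n, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return -(-n // stride)             # ceil(n / stride)
    if padding == "valid":
        return (n - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")

def conv_block_shapes(n, channels, f, filters, s):
    """Return (main path, untouched shortcut, projected shortcut) shapes."""
    _, _, F3 = filters
    n_main = conv_output_size(n, 1, s, "valid")       # 1x1 conv, stride s
    n_main = conv_output_size(n_main, f, 1, "same")   # fxf conv, stride 1
    n_main = conv_output_size(n_main, 1, 1, "valid")  # 1x1 conv, stride 1
    n_short = conv_output_size(n, 1, s, "same")       # 1x1 conv on the shortcut, stride s
    return (n_main, n_main, F3), (n, n, channels), (n_short, n_short, F3)

main, raw, projected = conv_block_shapes(n=15, channels=256, f=3,
                                         filters=[128, 128, 512], s=2)
print(main)       # (8, 8, 512)
print(raw)        # (15, 15, 256)
print(projected)  # (8, 8, 512)
```

Without the projection, the (15, 15, 256) input could not be added to the (8, 8, 512) main-path output.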


In [None]:
def convolutional_block(X, f, filters, stage, block, s = 2):
    """
    Implementation of the convolutional block shown in the figure above (conv_block.jpg)
    
    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network
    s -- Integer, specifying the stride to be used
    
    Returns:
    X -- output of the convolutional block, tensor of shape (m, n_H, n_W, n_C)
    """
    
    # defining name basis
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    
    # Retrieve Filters
    F1, F2, F3 = filters
    
    # Save the input value
    X_shortcut = X

    ##### MAIN PATH #####
    # First component of main path 
    X = Conv2D(filters = F1, kernel_size = (1, 1), strides = (s,s), padding='valid',
                   name = conv_name_base + '2a', kernel_initializer = glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis = 3, name = bn_name_base + '2a')(X)
    X = Activation('relu')(X)
    
    # Second component of main path
    X = Conv2D(filters = F2, kernel_size = (f,f),strides = (1,1),padding='same',
                   name = conv_name_base + '2b', kernel_initializer = glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3,name = bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # Third component of main path
    X = Conv2D(filters=F3, kernel_size=(1,1),strides=(1,1), padding='valid',
              name = conv_name_base + '2c', kernel_initializer = glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3,name = bn_name_base + '2c')(X)

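    ##### SHORTCUT PATH #####
    # 1x1 convolution with stride s so the shortcut matches the main path's output shape
    X_shortcut = Conv2D(filters=F3,kernel_size = (1,1),strides=(s,s),padding='same',
                       name = conv_name_base + '1',kernel_initializer = glorot_uniform(seed=0))(X_shortcut)
    # Normalize over the channel axis (axis 3 for channels_last), as in the main path
    X_shortcut = BatchNormalization(axis=3,name = bn_name_base + '1')(X_shortcut)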

    # Final step: Add shortcut value to main path, and pass it through a RELU activation
    X = layers.add([X_shortcut,X])
    X = Activation('relu')(X)
      
    return X

### 4.ResNet50 Summary

Stage 1:<br>
conv layer: conv1<br>
BN layer: bn_conv1<br>
Activation: relu<br>
MaxPooling: kernel=(3,3), stride=(2,2)
<br>
Stage 2:<br>
1 conv block: filters = [64, 64, 256]<br>
2 identity blocks: filters = [64, 64, 256]
<br>
<br>
Stage 3:<br>
1 conv block: filters=[128,128,512]<br>
3 identity blocks:  filters=[128,128,512]
<br>
<br>
Stage 4:<br>
1 conv block: filters=[256,256,1024]<br>
5 identity blocks: filters=[256,256,1024]
<br>
<br>
Stage 5:<br>
1 conv block: filters=[512,512,2048]<br>
2 identity blocks: filters=[512,512,2048]
<br>
<br>
Pooling:<br>
AveragePooling: pool_size=(2,2)
<br>
<br>
Output Layer:<br>
FC: activation='softmax'
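
The "50" in ResNet50 can be recovered from this summary. The short calculation below assumes the usual counting convention: only the stem convolution, the three main-path convolutions of each block, and the final fully connected layer are counted, while the shortcut convolutions and batch normalization layers are not.

```python
# Blocks per stage, taken from the summary above: 1 conv block + N identity blocks
blocks_per_stage = {2: 1 + 2, 3: 1 + 3, 4: 1 + 5, 5: 1 + 2}
convs_per_block = 3  # each block has three CONV layers on its main path

stem_conv = 1  # conv1 in stage 1
fc_layer = 1   # final fully connected layer

main_path_convs = sum(blocks_per_stage.values()) * convs_per_block
total = stem_conv + main_path_convs + fc_layer
print(main_path_convs)  # 48
print(total)            # 50
```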
     
        

In [None]:
def ResNet50(input_shape, classes):
    """
    Implementation of the popular ResNet50 with the following architecture:
    CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK*2 -> CONVBLOCK -> IDBLOCK*3
    -> CONVBLOCK -> IDBLOCK*5 -> CONVBLOCK -> IDBLOCK*2 -> AVGPOOL -> TOPLAYER

    Arguments:
    input_shape -- shape of the images of the dataset
    classes -- integer, number of classes

    Returns:
    model -- a Model() instance in Keras
    """
    
    # Define the input as a tensor with shape input_shape
    X_input = Input(input_shape)
   
    # Zero-Padding
    X = ZeroPadding2D(padding=(3, 3))(X_input)
    
    # Stage 1
    X = Conv2D(64, (7, 7), strides = (2, 2), name = 'conv1', kernel_initializer = glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis = 3, name = 'bn_conv1')(X)
    X = Activation('relu')(X)
    X = MaxPooling2D((3, 3), strides=(2, 2))(X)

    # Stage 2
    X = convolutional_block(X, f = 3, filters = [64, 64, 256], stage = 2, block='a', s = 1)
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='b')
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='c')

    # Stage 3
    # The convolutional block uses three sets of filters of size [128, 128, 512],
    # with f = 3, s = 2 and block "a". The 3 identity blocks use the same filter
    # sizes, with f = 3 and blocks "b", "c" and "d".
    X = convolutional_block(X,f = 3, filters=[128,128,512],stage=3,block='a',s = 2)
    X = identity_block(X, f = 3, filters=[128,128,512], stage=3,block='b')
    X = identity_block(X, f = 3, filters=[128,128,512], stage=3,block='c')
    X = identity_block(X, f = 3, filters=[128,128,512], stage=3,block='d')

    # Stage 4
    # The convolutional block uses three sets of filters of size [256, 256, 1024],
    # with f = 3, s = 2 and block "a". The 5 identity blocks use the same filter
    # sizes, with f = 3 and blocks "b", "c", "d", "e" and "f".
    X = convolutional_block(X,f = 3, filters=[256,256,1024],stage=4,block='a',s=2)
    X = identity_block(X,f = 3, filters=[256,256,1024],stage=4, block='b')
    X = identity_block(X,f = 3, filters=[256,256,1024],stage=4, block='c')
    X = identity_block(X,f = 3, filters=[256,256,1024],stage=4, block='d')
    X = identity_block(X,f = 3, filters=[256,256,1024],stage=4, block='e')
    X = identity_block(X,f = 3, filters=[256,256,1024],stage=4, block='f')

    # Stage 5
    # The convolutional block uses three sets of filters of size [512, 512, 2048],
    # with f = 3, s = 2 and block "a". The 2 identity blocks use the same filter
    # sizes, with f = 3 and blocks "b" and "c".
    X = convolutional_block(X,f=3,filters=[512,512,2048],stage=5,block='a',s=2)
    X = identity_block(X, f=3, filters=[512,512,2048],stage=5,block='b')
    X = identity_block(X, f=3, filters=[512,512,2048],stage=5,block='c')

    # AVGPOOL: 2D average pooling with a (2, 2) window, named "avg_pool"
    X = AveragePooling2D(pool_size=(2,2), name='avg_pool')(X)
    
    # output layer
    X = Flatten()(X)
    X = Dense(classes, activation='softmax', name='fc' + str(classes), kernel_initializer = glorot_uniform(seed=0))(X)
     
    # Create model
    model = Model(inputs = X_input, outputs = X, name='ResNet50')

    return model
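
To see why a (2,2) average pool is enough at the end of this network, it helps to trace the spatial size through each stage. The sketch below does so in plain Python for the 64x64 input used in section 5. It assumes Keras defaults where the code above leaves them unspecified: `'valid'` padding for `conv1` and the pooling layers, and a pooling stride equal to the pool size for `AveragePooling2D`.

```python
def valid_out(n, kernel, stride):
    """Output size of a 'valid' convolution or pooling along one dimension."""
    return (n - kernel) // stride + 1

n = 64                  # input height and width
n = n + 2 * 3           # ZeroPadding2D((3, 3))        -> 70
n = valid_out(n, 7, 2)  # conv1, 7x7, stride 2         -> 32
n = valid_out(n, 3, 2)  # MaxPooling2D, 3x3, stride 2  -> 15
sizes = {1: n}

# In each stage only the convolutional block changes the spatial size,
# through the stride s of its first 1x1 convolution.
for stage, s in [(2, 1), (3, 2), (4, 2), (5, 2)]:
    n = valid_out(n, 1, s)
    sizes[stage] = n

n = valid_out(n, 2, 2)  # AveragePooling2D(pool_size=(2, 2))
flattened = n * n * 2048

print(sizes)      # {1: 15, 2: 15, 3: 8, 4: 4, 5: 2}
print(flattened)  # 2048
```

With a different input size the final feature map may no longer be 2x2, so the pooling window or the input shape would need to be adjusted accordingly.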

### 5.Model Building and Compiling

In [None]:
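if __name__ == '__main__':
    input_shape = (64,64,3)
    class_num = 10
    resnet_model = ResNet50(input_shape, class_num)
    resnet_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    # X_train, Y_train, X_test and Y_test are not defined in this notebook; load your own
    # dataset first, with images of shape input_shape and one-hot labels of shape (m, class_num).
    resnet_model.fit(X_train, Y_train, epochs = 20, batch_size = 32)
    # evaluate() returns [loss, accuracy] for the loss and metric compiled above
    preds = resnet_model.evaluate(X_test, Y_test)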
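
The `categorical_crossentropy` loss expects one-hot labels with one column per class, and the model expects images shaped like `input_shape`. The notebook does not include a dataset, so the sketch below only shows the expected shapes using placeholder arrays. The names `to_one_hot`, `X_dummy` and `Y_dummy` are hypothetical, and the zero-valued images are not real data.

```python
import numpy as np

def to_one_hot(labels, num_classes):
    """Convert integer labels of shape (m,) to one-hot rows of shape (m, num_classes)."""
    labels = np.asarray(labels).reshape(-1)
    return np.eye(num_classes, dtype="float32")[labels]

# Placeholder arrays with the right shapes only; replace them with a real dataset
X_dummy = np.zeros((3, 64, 64, 3), dtype="float32")  # 3 images of shape (64, 64, 3)
Y_dummy = to_one_hot([0, 3, 9], num_classes=10)      # 3 labels, 10 classes

print(X_dummy.shape)  # (3, 64, 64, 3)
print(Y_dummy.shape)  # (3, 10)
print(Y_dummy[1])     # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```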

### 6.Notes
This is a simplified version of ResNet50 with randomly initialized weights. To use pre-trained weights, refer to [fchollet's GitHub](https://github.com/fchollet/deep-learning-models/blob/master/resnet50.py). Pre-trained weights can also be found in [fchollet's releases](https://github.com/fchollet/deep-learning-models/releases).

### References

1. [fchollet's Implementation of ResNet50](https://github.com/fchollet/deep-learning-models/blob/master/resnet50.py)
2. [deeplearning.ai CNN assignment: ResNet implementation](https://zhuanlan.zhihu.com/p/31820167)
3. [Building a residual network with Keras](https://zhuanlan.zhihu.com/p/31820167)
4. [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)