# CNN Architectures

In this tutorial, we will build several popular CNN architectures. We will take two approaches:

1.   Build each model from scratch with the Keras layers package.
2.   Customize the prebuilt models from the Keras applications package and use transfer learning with pretrained ImageNet weights.



# Dataset for classification of Dogs and Cats

**DataSource**:
https://www.kaggle.com/ayushsharma2k/dogcat-classificationcnn

This dataset contains 8000 images of dogs and cats in the training set and 2000 images in the test set.

# Mount the Drive

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# Import libraries

In [84]:
import keras
from keras.models import Sequential,Model
from keras.layers import Dense, Activation, Dropout, Flatten,Conv2D, MaxPooling2D,Input,AveragePooling2D,Concatenate,add,GlobalAveragePooling2D
from keras.layers import BatchNormalization, concatenate
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
import numpy as np
np.random.seed(1000)


# Data preprocessing

As you already know by now, data should be formatted into appropriately preprocessed floating-point tensors before being fed into the network. Currently, our data sits on a drive as JPEG files, so the steps for getting it into the network are roughly:

* Read the picture files.
* Decode the JPEG content to RGB grids of pixels.
* Convert these into floating point tensors.
* Rescale the pixel values (between 0 and 255) to the [0, 1] interval (as you know, neural networks prefer to deal with small input values).
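
As a minimal sketch of the last two steps, assuming the image has already been decoded into an 8-bit array (the tiny `pixels` array below is made up for illustration), the conversion and rescaling could look like this in NumPy:

```python
import numpy as np

# Hypothetical 2x2 RGB image with 8-bit channel values (0-255).
pixels = np.array(
    [[[0, 128, 255], [64, 64, 64]],
     [[255, 0, 0], [10, 20, 30]]],
    dtype=np.uint8,
)

# Convert to floating point, then rescale to the [0, 1] interval.
scaled = pixels.astype("float32") / 255.0

print(scaled.shape, scaled.dtype)
print(scaled.min(), scaled.max())
```

Dividing by 255 has the same intent as the `rescale=1./255` argument used later in this section.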

It may seem a bit daunting, but Keras has utilities that take care of these steps automatically. Its image-processing helpers live in the `keras.preprocessing.image` module. In particular, the `ImageDataGenerator` class lets you quickly set up Python generators that turn image files on disk into batches of preprocessed tensors. This is what we will use here.



In [3]:
train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

In [39]:
# Create the generators: each yields batches of 224x224 RGB images (shape `(32, 224, 224, 3)`) with categorical (one-hot) labels

training_set = train_datagen.flow_from_directory('/content/drive/My Drive/cat-dog dataset/training_set',
                                                 target_size=(224,224),
                                                 batch_size=32,
                                                 class_mode='categorical')
test_set = test_datagen.flow_from_directory('/content/drive/My Drive/cat-dog dataset/test_set',
                                            target_size=(224,224),
                                            batch_size=32,
                                            class_mode='categorical')

Found 8000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
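
To make the generator output more concrete, here is a small sketch of what `class_mode='categorical'` and `batch_size=32` imply for this dataset. The helper names `one_hot` and `batches_per_epoch` are hypothetical, and the class indices 0 and 1 are assumed to correspond to the two sub-folders.

```python
import math

def one_hot(label, num_classes):
    """Return a one-hot list for an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def batches_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

# Categorical labels for the two classes (indices assumed to be 0 and 1).
print(one_hot(0, 2), one_hot(1, 2))   # [1.0, 0.0] [0.0, 1.0]

# Batches per epoch for the training and test sets reported above.
print(batches_per_epoch(8000, 32))    # 250
print(batches_per_epoch(2000, 32))    # 63
```

The last test batch is smaller than 32 because 2000 is not a multiple of the batch size.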


# Alexnet - from scratch

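AlexNet consists of 5 convolutional layers and 3 fully connected layers. The implementation below is a simplified variant: it uses 2x2 max pooling, an 11x11 kernel in the second convolution, batch normalization after each block, and an extra 1000-unit dense layer before the 2-class output.

Image source: https://www.learnopencv.com/wp-content/uploads/2018/05/AlexNet-1.png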
![Alexnet architecture](https://www.learnopencv.com/wp-content/uploads/2018/05/AlexNet-1.png)

In [24]:
model = Sequential()

model.add(Conv2D(filters=96, input_shape=(224,224,3), kernel_size=(11,11),strides=(4,4), padding='valid',activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))
model.add(BatchNormalization())

model.add(Conv2D(filters=256, kernel_size=(11,11), strides=(1,1), padding='valid',activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))
model.add(BatchNormalization())

model.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding='valid',activation="relu"))
model.add(BatchNormalization())

model.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding='valid',activation="relu"))
model.add(BatchNormalization())


model.add(Conv2D(filters=256, kernel_size=(3,3), strides=(1,1), padding='valid',activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))
model.add(BatchNormalization())

# Passing it to a dense layer
model.add(Flatten())
model.add(Dense(4096, activation="relu"))
# Add Dropout to prevent overfitting
model.add(Dropout(0.4))
model.add(BatchNormalization())


model.add(Dense(4096,activation="relu"))
model.add(Dropout(0.4))
model.add(BatchNormalization())

model.add(Dense(1000,activation="relu"))
model.add(Dropout(0.4))
model.add(BatchNormalization())

# Output Layer
model.add(Dense(2,activation="softmax"))

model.summary()



Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_16 (Conv2D)           (None, 54, 54, 96)        34944     
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 27, 27, 96)        0         
_________________________________________________________________
batch_normalization_25 (Batc (None, 27, 27, 96)        384       
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 17, 17, 256)       2973952   
_________________________________________________________________
max_pooling2d_11 (MaxPooling (None, 8, 8, 256)         0         
_________________________________________________________________
batch_normalization_26 (Batc (None, 8, 8, 256)         1024      
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 6, 6, 384)        
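
The output shapes and parameter counts in the summary above can be checked by hand. The sketch below assumes `'valid'` padding and square kernels, as in the model definition; the helper names are made up for illustration.

```python
def conv_output_size(n, kernel, stride):
    """Spatial output size of a 'valid' convolution or pooling layer."""
    return (n - kernel) // stride + 1

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D."""
    return (kernel * kernel * in_channels + 1) * filters

# (kernel, stride) for the first five layers with a spatial effect.
layers = [(11, 4), (2, 2), (11, 1), (2, 2), (3, 1)]

size = 224
sizes = []
for kernel, stride in layers:
    size = conv_output_size(size, kernel, stride)
    sizes.append(size)

print(sizes)                      # [54, 27, 17, 8, 6]
print(conv_params(11, 3, 96))     # 34944
print(conv_params(11, 96, 256))   # 2973952
```

These values match the `Output Shape` and `Param #` columns for the first two convolutions and the pooling layers between them.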

In [25]:
#  Compile 
model.compile(loss='categorical_crossentropy', optimizer='adam',metrics=['accuracy'])

# Train
model.fit(training_set,
          epochs=1,
          validation_data=test_set)

Epoch 1/1


<keras.callbacks.callbacks.History at 0x7efec0b935f8>

# VGG Model from scratch

VGG16 is a convolutional neural network (CNN) architecture that was among the top entries in the ILSVRC (ImageNet) competition in 2014, and it is still regarded as a strong vision model. Its most distinctive trait is that, instead of tuning many hyperparameters, its authors used 3x3 convolution filters with stride 1 and "same" padding throughout, together with 2x2 max-pooling layers with stride 2. This arrangement of convolution and max-pooling layers is repeated consistently across the whole architecture. At the end come two fully connected (FC) layers followed by a softmax output layer. The 16 in VGG16 refers to its 16 layers that have weights. It is a large network, with about 138 million parameters.
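
The figure of roughly 138 million parameters can be reproduced with a short calculation. The sketch below assumes the standard VGG16 layout (3x3 convolutions, 2x2 pooling, two 4096-unit dense layers) and a 224x224x3 input; `VGG16_CFG` and `vgg16_param_count` are names invented for this example.

```python
# Numbers are Conv2D filter counts; "M" marks a 2x2 max-pooling layer.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

def vgg16_param_count(num_classes, input_size=224, in_channels=3):
    """Count weights and biases of VGG16 for a given number of classes."""
    total = 0
    channels, size = in_channels, input_size
    for item in VGG16_CFG:
        if item == "M":
            size //= 2                                # pooling halves height and width
        else:
            total += (3 * 3 * channels + 1) * item    # 3x3 kernel plus bias
            channels = item
    features = size * size * channels                 # flattened 7 * 7 * 512
    for units in (4096, 4096, num_classes):
        total += (features + 1) * units               # dense weights plus bias
        features = units
    return total

print(vgg16_param_count(1000))   # 138357544
print(vgg16_param_count(2))      # 134268738
```

The first number is the original 1000-class network; the second is the 2-class variant built in this section, which differs only in the size of the output layer.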

Image source: https://miro.medium.com/max/940/1*3-TqqkRQ4rWLOMX-gvkYwA.png
![alt text](https://miro.medium.com/max/940/1*3-TqqkRQ4rWLOMX-gvkYwA.png)

There are different configurations of the VGG model based on the number of layers. Here, we have implemented VGG-16.

Image source: https://qph.fs.quoracdn.net/main-qimg-30abbdf1982c8cb049ac65f3cf9d5640
![alt text](https://qph.fs.quoracdn.net/main-qimg-30abbdf1982c8cb049ac65f3cf9d5640)

**VGG16 Model**

In [89]:
model = Sequential()

model.add(Conv2D(input_shape=(224,224,3),filters=64,kernel_size=(3,3),padding="same", activation="relu"))
model.add(Conv2D(filters=64,kernel_size=(3,3),padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2),strides=(2,2)))

model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2),strides=(2,2)))

model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2),strides=(2,2)))

model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2),strides=(2,2)))

model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2),strides=(2,2)))

model.add(Flatten())
model.add(Dense(units=4096,activation="relu"))
model.add(Dense(units=4096,activation="relu"))
model.add(Dense(units=2, activation="softmax"))
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_520 (Conv2D)          (None, 224, 224, 64)      1792      
_________________________________________________________________
conv2d_521 (Conv2D)          (None, 224, 224, 64)      36928     
_________________________________________________________________
max_pooling2d_101 (MaxPoolin (None, 112, 112, 64)      0         
_________________________________________________________________
conv2d_522 (Conv2D)          (None, 112, 112, 128)     73856     
_________________________________________________________________
conv2d_523 (Conv2D)          (None, 112, 112, 128)     147584    
_________________________________________________________________
max_pooling2d_102 (MaxPoolin (None, 56, 56, 128)       0         
_________________________________________________________________
conv2d_524 (Conv2D)          (None, 56, 56, 256)      

In [90]:
#  Compile 
model.compile(loss='categorical_crossentropy', optimizer='adam',metrics=['accuracy'])

# Train
model.fit(training_set,
          epochs=1,
          validation_data=test_set)

Epoch 1/1


<keras.callbacks.callbacks.History at 0x7faaeb1a8f60>

# Transfer learning

Transfer learning is a technique in which we take an existing model that was trained on far more data and reuse the features it learned for our own problem. Because that model has seen a lot of data, it is already good at detecting general features. We can reuse those features and adapt part of the trained model to our use case. Instead of training all the layers, we lock (freeze) some of them and use their trained weights to extract features from our data. We don't need to lock every layer: we can choose to retrain some of the later layers, closest to the output, so that they become specialised for our data.
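
The idea of locking some layers and training only the rest can be illustrated without Keras. The following is a toy sketch, not the method used later in this section: `W_frozen` stands in for pretrained layers, the data is synthetic, and only a small logistic-regression head is updated by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for pretrained layers: these weights are never updated.
W_frozen = rng.normal(size=(8, 4))
# New classification head: the only trainable parameters.
w_head = np.zeros(4)
b_head = 0.0

# Synthetic data, purely illustrative.
X = rng.normal(size=(64, 8))
y = (X[:, 0] > 0).astype(float)

def features(x):
    """Forward pass through the frozen layers."""
    return np.tanh(x @ W_frozen)

def predict(w, b):
    """Sigmoid output of the head on top of the frozen features."""
    return 1.0 / (1.0 + np.exp(-(features(X) @ w + b)))

def loss(w, b):
    """Binary cross-entropy of the head's predictions."""
    p = predict(w, b)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

W_before = W_frozen.copy()
loss_before = loss(w_head, b_head)

for _ in range(100):
    grad = predict(w_head, b_head) - y                    # dLoss/dlogit
    w_head -= 0.5 * (features(X).T @ grad) / len(y)       # update head weights only
    b_head -= 0.5 * float(np.mean(grad))

loss_after = loss(w_head, b_head)
print(loss_before, loss_after)
print(np.array_equal(W_frozen, W_before))   # True: frozen weights are unchanged
```

Only five numbers are trained here (four weights and a bias), which is why training on top of frozen features is fast.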

# VGG Model - using keras function with transfer learning

The VGG model is already built into Keras applications.

We are also going to use the weights saved from a pretrained model. This process is called transfer learning.

Transfer learning helps improve accuracy and reduces training time, because most of the parameters are already trained.

We will use custom layers at the end of the model. These are called top layers. We will train only the parameters of the top layers on our current training set.

In [101]:
# We include the top layers and will train them. The weights were obtained by pretraining on the ImageNet dataset.
from keras.applications.vgg16 import VGG16
vggmodel = VGG16(weights='imagenet', include_top=True)

In [80]:
vggmodel.summary()

Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_22 (InputLayer)        (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0     

In [81]:
# all the layers except the top layers are made non trainable.
for layers in (vggmodel.layers)[:19]:
    print(layers)
    layers.trainable = False

<keras.engine.input_layer.InputLayer object at 0x7faae9d0c978>
<keras.layers.convolutional.Conv2D object at 0x7faae9d0cac8>
<keras.layers.convolutional.Conv2D object at 0x7faae9d0c5f8>
<keras.layers.pooling.MaxPooling2D object at 0x7faae9d0cdd8>
<keras.layers.convolutional.Conv2D object at 0x7faae9d0c2e8>
<keras.layers.convolutional.Conv2D object at 0x7faae9cc8a20>
<keras.layers.pooling.MaxPooling2D object at 0x7faae9d77400>
<keras.layers.convolutional.Conv2D object at 0x7faae9d770f0>
<keras.layers.convolutional.Conv2D object at 0x7faafd1305c0>
<keras.layers.convolutional.Conv2D object at 0x7faae9c544e0>
<keras.layers.pooling.MaxPooling2D object at 0x7faae9c54ef0>
<keras.layers.convolutional.Conv2D object at 0x7faae9c54cf8>
<keras.layers.convolutional.Conv2D object at 0x7faaea234a90>
<keras.layers.convolutional.Conv2D object at 0x7faaea1a9630>
<keras.layers.pooling.MaxPooling2D object at 0x7faaea1a9e48>
<keras.layers.convolutional.Conv2D object at 0x7faaea1a9fd0>
<keras.layers.convolut
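
To see what the slice `[:19]` covers, it helps to list the layers of VGG16 with `include_top=True`. The sketch below writes the layer names out by hand, following the Keras naming scheme visible in the summary (the input layer name is shortened to `"input"` here); it does not load the model.

```python
# Layer names of VGG16 with include_top=True, in order.
vgg16_layers = (
    ["input"]
    + ["block1_conv1", "block1_conv2", "block1_pool"]
    + ["block2_conv1", "block2_conv2", "block2_pool"]
    + ["block3_conv1", "block3_conv2", "block3_conv3", "block3_pool"]
    + ["block4_conv1", "block4_conv2", "block4_conv3", "block4_pool"]
    + ["block5_conv1", "block5_conv2", "block5_conv3", "block5_pool"]
    + ["flatten", "fc1", "fc2", "predictions"]
)

frozen = vgg16_layers[:19]       # input plus all convolution and pooling layers
remaining = vgg16_layers[19:]    # layers left trainable

print(len(vgg16_layers))   # 23
print(frozen[-1])          # block5_pool
print(remaining)           # ['flatten', 'fc1', 'fc2', 'predictions']
print(vgg16_layers[-2])    # fc2
```

So the loop freezes the whole convolutional base, and `layers[-2]`, used in the next cell, is the second 4096-unit dense layer, on top of which the new 2-class output is attached.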

In [82]:
X= vggmodel.layers[-2].output
predictions = Dense(2, activation="softmax")(X)
model_final = Model(inputs=vggmodel.input, outputs=predictions)



In [85]:
model_final.compile(loss = "categorical_crossentropy", optimizer = optimizers.SGD(lr=0.0001, momentum=0.9), metrics=["accuracy"])

In [86]:
model_final.summary()

Model: "model_12"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_22 (InputLayer)        (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0  

In [87]:
# Train
model_final.fit(training_set,
          epochs=1,
          validation_data = test_set)

Epoch 1/1


<keras.callbacks.callbacks.History at 0x7faaeaab6b38>

You can see that the accuracy improves significantly.

# GoogLenet - from scratch

The GoogLeNet architecture is very different from previous state-of-the-art architectures such as AlexNet and ZF-Net. It uses techniques such as 1×1 convolution and global average pooling that enable a deeper architecture.
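
Both techniques are simple to express in NumPy. This is a minimal sketch on a single made-up feature map, leaving out bias and activation: a 1×1 convolution is a matrix product over the channel axis at every pixel, and global average pooling is a mean over the spatial axes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature map: height 7, width 7, 192 channels.
feature_map = rng.normal(size=(7, 7, 192))

# 1x1 convolution: mixes channels independently at each pixel,
# here reducing 192 channels to 16.
kernel_1x1 = rng.normal(size=(192, 16))
reduced = feature_map @ kernel_1x1

# Global average pooling: one value per channel.
pooled = feature_map.mean(axis=(0, 1))

print(reduced.shape)   # (7, 7, 16)
print(pooled.shape)    # (192,)
```

The 1×1 convolution keeps the spatial size and changes only the channel count, while global average pooling removes the spatial dimensions without adding any parameters.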

Image source:https://miro.medium.com/max/1400/1*66hY3zZTf0Lw2ItybiRxyg.png

![alt text](https://miro.medium.com/max/1400/1*66hY3zZTf0Lw2ItybiRxyg.png)


GoogLeNet is also known as Inception v1, after the Inception module it is built from. Each module goes deeper along parallel paths with different receptive field sizes.

The idea of the Inception module is to cover a larger area while keeping fine resolution for small details in the image. To do this, it convolves in parallel with several filter sizes, from the finest (1x1) to a larger one (5x5).

Image source: https://iq.opengenus.org/content/images/2019/01/temp10.png
![alt text](https://iq.opengenus.org/content/images/2019/01/temp10.png)
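
Two consequences of this design can be worked out with plain arithmetic. The sketch below uses the same filter specification format as the `inception` function in this section; the helper names are hypothetical, and biases are ignored in the weight counts.

```python
def inception_out_channels(filters):
    """Channels after concatenating the four parallel paths."""
    f1, (_, f3), (_, f5), fpool = filters
    return f1 + f3 + f5 + fpool

def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a square-kernel convolution (no bias)."""
    return kernel * kernel * in_channels * out_channels

# Module 3a receives 192 channels from the previous stage.
print(inception_out_channels([64, (96, 128), (16, 32), 32]))    # 256
print(inception_out_channels([128, (128, 192), (32, 96), 64]))  # 480

# 5x5 path of module 3a: direct 5x5 versus 1x1 reduction followed by 5x5.
direct = conv_weights(5, 192, 32)
reduced = conv_weights(1, 192, 16) + conv_weights(5, 16, 32)
print(direct, reduced)   # 153600 15872
```

The 1x1 reduction cuts the cost of the 5x5 path by roughly a factor of ten, which is what makes the parallel paths affordable.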

In [40]:
def inception(x, filters):
    # 1x1
    path1 = Conv2D(filters=filters[0], kernel_size=(1,1), strides=1, padding='same', activation='relu')(x)

    # 1x1->3x3
    path2 = Conv2D(filters=filters[1][0], kernel_size=(1,1), strides=1, padding='same', activation='relu')(x)
    path2 = Conv2D(filters=filters[1][1], kernel_size=(3,3), strides=1, padding='same', activation='relu')(path2)
    
    # 1x1->5x5
    path3 = Conv2D(filters=filters[2][0], kernel_size=(1,1), strides=1, padding='same', activation='relu')(x)
    path3 = Conv2D(filters=filters[2][1], kernel_size=(5,5), strides=1, padding='same', activation='relu')(path3)

    # 3x3->1x1
    path4 = MaxPooling2D(pool_size=(3,3), strides=1, padding='same')(x)
    path4 = Conv2D(filters=filters[3], kernel_size=(1,1), strides=1, padding='same', activation='relu')(path4)

    return Concatenate(axis=-1)([path1,path2,path3,path4])

In [41]:
def auxiliary(x, name=None):
    layer = AveragePooling2D(pool_size=(5,5), strides=3, padding='valid')(x)
    layer = Conv2D(filters=128, kernel_size=(1,1), strides=1, padding='same', activation='relu')(layer)
    layer = Flatten()(layer)
    layer = Dense(units=256, activation='relu')(layer)
    layer = Dropout(0.4)(layer)
    layer = Dense(units=2, activation='softmax', name=name)(layer)
    return layer

In [45]:
def googlenet():
    layer_in = Input(shape=(224,224,3))
    
    # stage-1
    layer = Conv2D(filters=64, kernel_size=(7,7), strides=2, padding='same', activation='relu')(layer_in)
    layer = MaxPooling2D(pool_size=(3,3), strides=2, padding='same')(layer)
    layer = BatchNormalization()(layer)

    # stage-2
    layer = Conv2D(filters=64, kernel_size=(1,1), strides=1, padding='same', activation='relu')(layer)
    layer = Conv2D(filters=192, kernel_size=(3,3), strides=1, padding='same', activation='relu')(layer)
    layer = BatchNormalization()(layer)
    layer = MaxPooling2D(pool_size=(3,3), strides=2, padding='same')(layer)

    # stage-3
    layer = inception(layer, [ 64,  (96,128), (16,32), 32]) #3a
    layer = inception(layer, [128, (128,192), (32,96), 64]) #3b
    layer = MaxPooling2D(pool_size=(3,3), strides=2, padding='same')(layer)
    
    # stage-4
    layer = inception(layer, [192,  (96,208),  (16,48),  64]) #4a
    aux1  = auxiliary(layer, name='aux1')
    layer = inception(layer, [160, (112,224),  (24,64),  64]) #4b
    layer = inception(layer, [128, (128,256),  (24,64),  64]) #4c
    layer = inception(layer, [112, (144,288),  (32,64),  64]) #4d
    aux2  = auxiliary(layer, name='aux2')
    layer = inception(layer, [256, (160,320), (32,128), 128]) #4e
    layer = MaxPooling2D(pool_size=(3,3), strides=2, padding='same')(layer)
    
    # stage-5
    layer = inception(layer, [256, (160,320), (32,128), 128]) #5a
    layer = inception(layer, [384, (192,384), (48,128), 128]) #5b
    layer = AveragePooling2D(pool_size=(7,7), strides=1, padding='valid')(layer)
    
    # stage-6
    layer = Flatten()(layer)
    layer = Dropout(0.4)(layer)
    layer = Dense(units=256, activation='linear')(layer)
    main = Dense(units=2, activation='softmax', name='main')(layer)
    
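    # Only the main classifier is used as an output; aux1 and aux2 are built above but not attached to the model.
    model = Model(inputs=layer_in, outputs=[main])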
    
    return model

In [46]:
model = googlenet()
model.summary()

Model: "model_7"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_9 (InputLayer)            (None, 224, 224, 3)  0                                            
__________________________________________________________________________________________________
conv2d_331 (Conv2D)             (None, 112, 112, 64) 9472        input_9[0][0]                    
__________________________________________________________________________________________________
max_pooling2d_76 (MaxPooling2D) (None, 56, 56, 64)   0           conv2d_331[0][0]                 
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 56, 56, 64)   256         max_pooling2d_76[0][0]           
____________________________________________________________________________________________

In [48]:
#  Compile 
model.compile(loss='categorical_crossentropy', optimizer='adam',metrics=['accuracy'])

# Train
model.fit(training_set,
          epochs=1,
          validation_data = test_set)

Epoch 1/1


<keras.callbacks.callbacks.History at 0x7faaff53fbe0>

# GoogLeNet - using keras function with transfer learning

We apply transfer learning by creating an Inception model without the top layers, using InceptionV3, a later model in the same family as GoogLeNet.
The model is initialized with weights trained on the ImageNet dataset.

The layers (parameters) in this model are made non-trainable.

The top layers are added to the model and are trained on our dataset (cat-dog).


We can observe the decrease in trainable parameters and the improvement in accuracy. Training also took significantly less time.


In [95]:
from keras.applications.inception_v3 import InceptionV3

base_model = InceptionV3(weights='imagenet', 
                                include_top=False, 
                                input_shape=(224, 224,3))
base_model.trainable = False

add_model = Sequential()
add_model.add(base_model)
add_model.add(GlobalAveragePooling2D())
add_model.add(Dropout(0.5))
add_model.add(Dense(2, 
                    activation='softmax'))

model = add_model
model.compile(loss='categorical_crossentropy', 
              optimizer=optimizers.SGD(lr=1e-4, 
                                       momentum=0.9),
              metrics=['accuracy'])
model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
inception_v3 (Model)         (None, 5, 5, 2048)        21802784  
_________________________________________________________________
global_average_pooling2d_7 ( (None, 2048)              0         
_________________________________________________________________
dropout_27 (Dropout)         (None, 2048)              0         
_________________________________________________________________
dense_34 (Dense)             (None, 2)                 4098      
Total params: 21,806,882
Trainable params: 4,098
Non-trainable params: 21,802,784
_________________________________________________________________
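
The parameter totals in this summary follow directly from the layer sizes. As a quick check (the helper `dense_params` is made up for this example), the only trainable layer is the final `Dense(2)` on top of 2048 pooled features:

```python
def dense_params(inputs, units):
    """Weights plus one bias per unit for a Dense layer."""
    return (inputs + 1) * units

base_params = 21802784               # frozen InceptionV3 base, from the summary
head_params = dense_params(2048, 2)  # trainable Dense(2) head

print(head_params)                 # 4098
print(base_params + head_params)   # 21806882
```

Global average pooling and dropout add no parameters, so the head accounts for all 4,098 trainable parameters.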


In [96]:
model.fit(training_set,
          epochs=1,
          validation_data = test_set)

Epoch 1/1


<keras.callbacks.callbacks.History at 0x7faae10c6f28>

# Resnet - from scratch

ResNet, short for Residual Network, is a classic neural network used as a backbone for many computer vision tasks.

ResNet popularized the concept of the skip connection. There are two reasons why skip connections work:

1.   They mitigate the vanishing gradient problem by providing an alternate shortcut path for the gradient to flow through.
2.   They allow the model to learn an identity function, which ensures that a higher layer performs at least as well as the lower layer, and not worse.
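
The second point can be seen in a toy residual block. This is a simplified sketch that uses dense layers instead of convolutions, with hypothetical names: if the weights of the residual branch are zero, the block passes its (non-negative) input through unchanged.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """Compute relu(F(x) + x), where F(x) = relu(x @ W1) @ W2."""
    fx = relu(x @ W1) @ W2
    return relu(fx + x)

# Non-negative input, as it would be after a previous ReLU.
x = np.array([0.5, 1.0, 2.0, 0.0])

# With all-zero weights the residual branch contributes nothing.
zeros = np.zeros((4, 4))
y = residual_block(x, zeros, zeros)

print(y)   # [0.5 1.  2.  0. ]
```

A plain stack of layers would have to learn an identity mapping explicitly, whereas here it is the default behaviour when the branch weights are small.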





Image source :https://tariq-hasan.github.io/assets/images/resnet.png
![Resnet architecture](https://tariq-hasan.github.io/assets/images/resnet.png)

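There are many variations of ResNet depending on the number of layers. Here we implement a simplified network based on ResNet-50: it uses the same bottleneck-style convolutional block, but with far fewer blocks per stage.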

Image source : https://neurohive.io/en/popular-networks/resnet/
![resnet architecture](https://neurohive.io/wp-content/uploads/2019/01/resnet-architectures-34-101.png)

In [52]:
# function for creating an identity or projection residual module
def residual_module(layer_in, n_filters):
	merge_input = layer_in
	# check whether the number of filters needs to change; assumes channels-last format
	if layer_in.shape[-1] != n_filters:
		merge_input = Conv2D(n_filters, (1,1), padding='same', activation='relu', kernel_initializer='he_normal')(layer_in)
	# conv1
	conv1 = Conv2D(n_filters, (3,3), padding='same', activation='relu', kernel_initializer='he_normal')(layer_in)
	# conv2
	conv2 = Conv2D(n_filters, (3,3), padding='same', activation='linear', kernel_initializer='he_normal')(conv1)
	# add filters, assumes filters/channels last
	layer_out = add([conv2, merge_input])
	# activation function
	layer_out = Activation('relu')(layer_out)
	return layer_out

In [69]:
def convolutional_block(X, filters,s=2):
   
    # Retrieve Filters
    F1, F2, F3 = filters
    
    # Save the input value
    X_shortcut = X


    ##### MAIN PATH #####
    # First component of main path 
    X = Conv2D(F1, (1, 1), strides = (s,s))(X)
    X = BatchNormalization(axis = 3)(X)
    X = Activation('relu')(X)

    # Second component of main path
    X = Conv2D(filters = F2, kernel_size = (3, 3), strides = (1,1), padding = 'same')(X)
    X = BatchNormalization(axis = 3)(X)
    X = Activation('relu')(X)


    # Third component of main path
    X = Conv2D(filters = F3, kernel_size = (1, 1), strides = (1,1), padding = 'valid')(X)
    X = BatchNormalization(axis = 3)(X)


    ##### SHORTCUT PATH #####
    X_shortcut = Conv2D(filters = F3, kernel_size = (1, 1), strides = (s,s), padding = 'valid')(X_shortcut)
    X_shortcut = BatchNormalization(axis = 3)(X_shortcut)

    # Final step: add the shortcut value to the main path, and pass it through a ReLU activation
    X = add([X, X_shortcut])
    X = Activation('relu')(X)
    
    
    return X
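
For the final `add` to work, the main path and the shortcut path must produce the same spatial size. The sketch below traces the feature-map size through the stem and the four `convolutional_block` calls of the model defined in the next cell, assuming a 224x224 input; the helper names are hypothetical.

```python
import math

def same_size(n, stride):
    """Output size of a 'same'-padded convolution or pooling layer."""
    return math.ceil(n / stride)

def valid_size(n, kernel, stride):
    """Output size of a 'valid' convolution."""
    return (n - kernel) // stride + 1

size = same_size(224, 2)    # 7x7 convolution, stride 2, 'same' padding
size = same_size(size, 2)   # 3x3 max pooling, stride 2, 'same' padding

sizes = [size]
for s in (1, 2, 2, 2):      # stride of each convolutional_block
    main = valid_size(size, 1, s)       # first 1x1 convolution of the main path
    shortcut = valid_size(size, 1, s)   # 1x1 convolution on the shortcut path
    assert main == shortcut             # required by add([X, X_shortcut])
    size = main
    sizes.append(size)

print(sizes)   # [56, 56, 28, 14, 7]
```

Only the first 1x1 convolution of each path is strided; the 3x3 `'same'` and final 1x1 convolutions keep the size, so both paths always agree.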

In [71]:
# define model input
input_layer = Input(shape=(224, 224, 3))
layer = Conv2D(64, kernel_size=(7, 7), strides=(2, 2), padding='same',activation='relu')(input_layer)
layer = BatchNormalization()(layer)
layer = MaxPooling2D(pool_size=(3,3), strides=2, padding='same')(layer)
layer = convolutional_block(layer, filters=[64, 64, 256],s=1)
# add resnet
layer = residual_module(layer, 64)
layer = convolutional_block(layer, filters=[128, 128, 512])
layer = residual_module(layer, 128)
layer = convolutional_block(layer, filters=[256, 256, 1024])
layer = residual_module(layer, 256)
layer = convolutional_block(layer, filters=[512, 512, 2048])
layer = residual_module(layer, 512)
layer = GlobalAveragePooling2D()(layer)
layer = Dropout(0.7)(layer)
layer = Dense(2, activation= 'softmax')(layer)

# create model
model = Model(inputs=input_layer, outputs=layer)
# summarize model
model.summary()

Model: "model_11"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_21 (InputLayer)           (None, 224, 224, 3)  0                                            
__________________________________________________________________________________________________
conv2d_478 (Conv2D)             (None, 112, 112, 64) 9472        input_21[0][0]                   
__________________________________________________________________________________________________
batch_normalization_41 (BatchNo (None, 112, 112, 64) 256         conv2d_478[0][0]                 
__________________________________________________________________________________________________
max_pooling2d_95 (MaxPooling2D) (None, 56, 56, 64)   0           batch_normalization_41[0][0]     
___________________________________________________________________________________________

In [72]:
#  Compile 
model.compile(loss='categorical_crossentropy', optimizer='adam',metrics=['accuracy'])

# Train
model.fit(training_set,
          epochs=1,
          validation_data = test_set)

Epoch 1/1


<keras.callbacks.callbacks.History at 0x7faaff9cff98>

# Resnet - using keras function with transfer learning

We apply transfer learning by creating a ResNet50 model without the top layers.
The model is initialized with weights trained on the ImageNet dataset.

The layers (parameters) in this model are made non-trainable.

The top layers are added to the model and are trained on our dataset (cat-dog).


We can observe the decrease in trainable parameters and the improvement in accuracy. Training also took significantly less time.

In [99]:
from keras.applications.resnet50 import ResNet50

base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224,224,3))
base_model.trainable = False

add_model = Sequential()
add_model.add(base_model)
add_model.add(GlobalAveragePooling2D())
add_model.add(Dropout(0.5))
add_model.add(Dense(2, 
                    activation='softmax'))

model = add_model
model.compile(loss='categorical_crossentropy', 
              optimizer=optimizers.SGD(lr=1e-4, 
                                       momentum=0.9),
              metrics=['accuracy'])
model.summary()



Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
resnet50 (Model)             (None, 7, 7, 2048)        23587712  
_________________________________________________________________
global_average_pooling2d_9 ( (None, 2048)              0         
_________________________________________________________________
dropout_29 (Dropout)         (None, 2048)              0         
_________________________________________________________________
dense_36 (Dense)             (None, 2)                 4098      
Total params: 23,591,810
Trainable params: 4,098
Non-trainable params: 23,587,712
_________________________________________________________________


In [100]:
model.fit(training_set,
          epochs=1,
          validation_data = test_set)

Epoch 1/1


<keras.callbacks.callbacks.History at 0x7faadd9e67b8>