# Transfer Learning in Computer Vision

In this notebook, we apply transfer learning to the CIFAR-10 classification task. The CIFAR-10 dataset (https://www.cs.toronto.edu/~kriz/cifar.html) consists of 60000 32x32 color images in 10 classes (6000 images per class), split into 50000 training images and 10000 test images. We will use the VGG16 model pretrained on the ImageNet dataset.

In [48]:
import tensorflow.keras as K

Let's load the CIFAR-10 dataset, which is included in Keras. We use the 10000 test images as our validation set:

In [49]:
(train_X, train_y), (vaild_X, valid_y) = K.datasets.cifar10.load_data()

Below, we preprocess the training and validation sets. The vgg16.preprocess_input function converts the input images from RGB to BGR and then zero-centers each color channel with respect to the ImageNet dataset (without scaling). We also convert the labels to categorical form (one-hot encoding).

In [50]:
def preprocess_input(X, Y):
    X = X.astype('float32')
    X_preprocessed = K.applications.vgg16.preprocess_input(X)
    Y_preprocessed = K.utils.to_categorical(Y, 10)
    return X_preprocessed, Y_preprocessed
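
To make the two transformations above concrete, here is a NumPy-only sketch of what they do to a tiny batch. It assumes the Caffe-style convention that VGG16 preprocessing follows (channel flip to BGR, then subtraction of the ImageNet per-channel means, no scaling). The helper names `vgg_style_preprocess` and `one_hot` and the demo arrays are hypothetical and only serve as illustration; in the notebook itself we keep using the Keras functions.

```python
import numpy as np

# ImageNet per-channel means in BGR order (Caffe-style VGG preprocessing).
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg_style_preprocess(images_rgb):
    """Flip RGB to BGR and zero-center each channel (no scaling)."""
    x = images_rgb.astype(np.float32)
    x = x[..., ::-1]               # reverse the channel axis: RGB -> BGR
    return x - IMAGENET_MEAN_BGR   # broadcast over batch, height, width

def one_hot(labels, num_classes):
    """Turn integer labels of shape (n,) or (n, 1) into one-hot rows."""
    labels = np.asarray(labels).reshape(-1)
    return np.eye(num_classes, dtype=np.float32)[labels]

# Hypothetical demo batch: two pure-red 32x32 images with labels 3 and 9.
demo_images = np.zeros((2, 32, 32, 3), dtype=np.uint8)
demo_images[..., 0] = 255
demo_labels = np.array([[3], [9]])  # CIFAR-10 labels arrive with shape (n, 1)

x = vgg_style_preprocess(demo_images)
y = one_hot(demo_labels, 10)
print(x.shape, y.shape)
```

After the flip, the red channel ends up last, so the first pixel holds two negative values (blue and green, which were zero before centering) followed by a positive one.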

In [51]:
train_X, train_y = preprocess_input(train_X, train_y)
vaild_X, valid_y = preprocess_input(vaild_X, valid_y)

Let's load the VGG16 model without the top classification layers. The include_top parameter determines whether to include the 3 fully-connected layers at the top of the network. We will freeze the remaining convolutional layers further below, so that training on our new task does not change them.

In [61]:
base_model = K.applications.vgg16.VGG16(include_top=False,
                                        weights='imagenet',
                                        pooling='avg',
                                        input_shape=(32, 32, 3))

In [62]:
base_model.summary()

Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_6 (InputLayer)        [(None, 32, 32, 3)]       0         
                                                                 
 block1_conv1 (Conv2D)       (None, 32, 32, 64)        1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 32, 32, 64)        36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 16, 16, 64)        0         
                                                                 
 block2_conv1 (Conv2D)       (None, 16, 16, 128)       73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 16, 16, 128)       147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 8, 8, 128)         0     
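
The parameter counts in the summary above can be reproduced by hand. The following sketch assumes 3x3 kernels with one bias per output channel and 2x2 max pooling that halves the spatial size; the helper names are hypothetical.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Kernel weights (k * k * in * out) plus one bias per output channel."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels

def size_after_pools(size, num_pools):
    """Spatial size after repeated 2x2 max pooling with stride 2."""
    for _ in range(num_pools):
        size //= 2
    return size

# (layer name, input channels, output channels) for the first two blocks
layers = [
    ("block1_conv1", 3, 64),
    ("block1_conv2", 64, 64),
    ("block2_conv1", 64, 128),
    ("block2_conv2", 128, 128),
]
param_counts = {name: conv2d_params(3, c_in, c_out) for name, c_in, c_out in layers}

for name, count in param_counts.items():
    print(f"{name}: {count}")
print("spatial size after block2_pool:", size_after_pools(32, 2))
```

The pooling layers have no parameters, which is why they show 0 in the summary.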

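We will add new custom top layers to this pretrained model and train only those. Rather than using the full convolutional base, we take the output of the third block (block3_pool) and add a flatten layer, followed by batch normalization, two dense layers, a dropout layer, and finally a softmax layer for classification. The number of units in the softmax layer equals the number of classes (10). The layers after block3_pool, including the global average pooling requested above, are therefore not part of the final model.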

In [66]:
# Take the output of the last layer in the third block of the VGG16 model
last = base_model.get_layer('block3_pool').output
# Add our own new layers on top of it
x = K.layers.Flatten()(last)
x = K.layers.BatchNormalization()(x)
x = K.layers.Dense(256, activation='relu')(x)
x = K.layers.Dense(256, activation='relu')(x)
x = K.layers.Dropout(0.6)(x)
output_layer = K.layers.Dense(10, activation='softmax')(x)
model = K.Model(base_model.input, output_layer)
    
model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_6 (InputLayer)        [(None, 32, 32, 3)]       0         
                                                                 
 block1_conv1 (Conv2D)       (None, 32, 32, 64)        1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 32, 32, 64)        36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 16, 16, 64)        0         
                                                                 
 block2_conv1 (Conv2D)       (None, 16, 16, 128)       73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 16, 16, 128)       147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 8, 8, 128)         0     
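
As a rough check on the size of the new head, the sketch below derives its parameter counts by hand. It assumes that block3_pool produces a 4x4x256 feature map for 32x32 inputs (three poolings, 256 channels in the third block) and that batch normalization keeps four values per feature, two of them non-trainable moving statistics. The dictionary keys are hypothetical labels, not the layer names Keras assigns.

```python
def dense_params(n_in, n_out):
    """Weight matrix (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# Assumed output shape of block3_pool for 32x32 inputs.
height, width, channels = 4, 4, 256
flat_features = height * width * channels

head_params = {
    "batch_normalization": 4 * flat_features,  # gamma, beta, moving mean, moving variance
    "dense_1": dense_params(flat_features, 256),
    "dense_2": dense_params(256, 256),
    "softmax": dense_params(256, 10),
}
bn_non_trainable = 2 * flat_features           # moving mean and moving variance
total_head = sum(head_params.values())
trainable_head = total_head - bn_non_trainable

for name, count in head_params.items():
    print(f"{name}: {count}")
print("trainable head parameters:", trainable_head)
```

Flatten and dropout add no parameters, so most of the head's weights sit in the first dense layer.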

Before we train the model, we freeze the VGG16 layers so that their pretrained weights are not updated (we only want to train our newly added top layers):

In [67]:
for layer in base_model.layers:
    layer.trainable = False
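
One way to confirm that freezing took effect is to count the trainable and non-trainable weights of the combined model. The sketch below uses a tiny hypothetical stand-in for the pretrained base (a single convolution, not VGG16) so that it runs without downloading any weights; the layer names are made up for the illustration.

```python
from tensorflow import keras

# Tiny stand-in for a pretrained base (hypothetical, not VGG16).
inputs = keras.Input(shape=(32, 32, 3))
features = keras.layers.Conv2D(8, 3, padding="same", activation="relu",
                               name="toy_conv")(inputs)
features = keras.layers.MaxPooling2D(name="toy_pool")(features)
base = keras.Model(inputs, features, name="toy_base")

# New head on top of the base, sharing the base's layer objects.
flat = keras.layers.Flatten()(base.output)
outputs = keras.layers.Dense(10, activation="softmax", name="toy_head")(flat)
model = keras.Model(base.input, outputs)

# Freeze every layer of the base; the head stays trainable.
for layer in base.layers:
    layer.trainable = False

n_trainable = len(model.trainable_weights)      # kernel and bias of the head
n_frozen = len(model.non_trainable_weights)     # kernel and bias of the convolution
print(n_trainable, n_frozen)
```

Because the combined model reuses the same layer objects as the base, setting `trainable = False` on the base's layers also freezes them in the combined model.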

We will define a callback that saves the model to disk whenever the monitored metric (the validation loss by default) improves. A callback is an object that can perform actions at various stages of training, such as at the start or end of an epoch.
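
The "save only when improved" rule can be sketched in plain Python. This is a simplified model of the behavior with mode='min', not the Keras implementation, and the validation losses below are hypothetical numbers chosen for illustration.

```python
def epochs_that_trigger_save(monitored_values, mode="min"):
    """Return the 1-based epochs at which a best-only checkpoint would be written."""
    best = None
    saved_epochs = []
    for epoch, value in enumerate(monitored_values, start=1):
        if best is None:
            improved = True
        elif mode == "min":
            improved = value < best
        else:
            improved = value > best
        if improved:
            best = value
            saved_epochs.append(epoch)
    return saved_epochs

# Hypothetical validation losses for five epochs.
hypothetical_val_loss = [1.20, 0.95, 0.97, 0.90, 0.93]
saved_epochs = epochs_that_trigger_save(hypothetical_val_loss)
print(saved_epochs)  # [1, 2, 4]
```

Epochs 3 and 5 do not improve on the best loss seen so far, so the file on disk would still hold the model from the previous best epoch.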

In [69]:
callback = []
callback += [K.callbacks.ModelCheckpoint('cifar10.h5', monitor='val_loss', save_best_only=True, mode='min')]
LEARNING_RATE = 1e-4
model.compile(optimizer=K.optimizers.Adam(learning_rate=LEARNING_RATE), loss='categorical_crossentropy', metrics=['accuracy'])


In [70]:
model.fit(x=train_X, y=train_y,
          batch_size=128,
          validation_data=(vaild_X, valid_y),
          epochs=10, shuffle=True,
          callbacks=callback,
          verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7fc1dfbeba30>