# TRANSFER LEARNING - VGG 16

Transfer learning is a very useful idea in deep learning: we take a pretrained model and apply it to a similar task. It is particularly useful when we have too little data to train a deep model from scratch.

Typically, transfer learning involves the following steps:

1. Take a pretrained model
2. Freeze the model parameters, i.e. the weights of the pretrained model will not change during training.
3. Add trainable layers on top of the pretrained model; these are trained to achieve the final goal.
4. Optionally, fine-tune by training the complete model after setting the trainable attribute of the pretrained model to True. (In this project we will not do that.)
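Step 4 is not carried out in this project, so here is a minimal sketch of how the freeze and unfreeze switch could look in Keras. To keep it small and free of downloads, it uses a tiny hypothetical stand-in base (stand_in_base) instead of VGG 16, and the learning rate of 1e-5 is only an assumed "small" value, not a tuned one.

```python
import tensorflow as tf

# Hypothetical stand-in for a pretrained base; a real run would load VGG16
# with ImageNet weights instead of this randomly initialised layer stack.
base = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(32, 32, 3)),
        tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
    ],
    name="stand_in_base",
)

# Steps 2 and 3: freeze the base and add a trainable classification head.
base.trainable = False
inputs = tf.keras.Input(shape=(32, 32, 3))
features = base(inputs)
outputs = tf.keras.layers.Dense(10, activation="softmax")(features)
model = tf.keras.Model(inputs, outputs)
frozen_count = len(model.trainable_weights)  # head kernel and bias only

# Step 4: unfreeze the base, then recompile so the change takes effect.
# The small learning rate (an assumed value) limits how far the
# pretrained weights can drift during fine-tuning.
base.trainable = True
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="sparse_categorical_crossentropy",
    metrics=["sparse_categorical_accuracy"],
)
unfrozen_count = len(model.trainable_weights)  # base weights plus head weights

print(frozen_count, unfrozen_count)
```

Recompiling after changing the trainable attribute matters because the set of weights to update is fixed when the model is compiled.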

The intuition is simple. In a convolutional model, we can think of the initial layers as identifying simpler, more generic patterns such as lines and edges, while deeper layers identify more specialized features such as facial expressions. We give the pretrained model the job of identifying the generic patterns it has already learnt, and we train the added layers to find the specialized patterns.

Ideally, we should use a model trained on a similar dataset for transfer learning. Transfer learning is also most effective when our dataset is too small to train a model from scratch.

To demonstrate the transfer learning concept in this project, we will use the pretrained model "VGG 16" to solve a classification problem on the CIFAR10 dataset. The VGG 16 model is trained on the "Imagenet" dataset.

In [1]:
import tensorflow as tf

In [2]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 50s 0us/step


In [8]:
x_train_preprocess = tf.keras.applications.vgg16.preprocess_input(x_train, data_format=None)
x_test_preprocess = tf.keras.applications.vgg16.preprocess_input(x_test, data_format=None)

In [9]:
x_train_preprocess.shape

(50000, 32, 32, 3)
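Note that the shape is unchanged: the VGG 16 preprocessing only changes pixel values. As far as we understand the Keras implementation, it reorders the channels from RGB to BGR and subtracts the ImageNet per-channel means, without scaling to [0, 1]. The NumPy sketch below reproduces that behaviour for illustration only; the function name vgg16_preprocess_sketch is hypothetical, and the mean values are the ones commonly documented for this "caffe"-style preprocessing.

```python
import numpy as np

# ImageNet channel means in BGR order (assumed values, see the note above).
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg16_preprocess_sketch(images):
    """Illustrative re-implementation: RGB -> BGR, then subtract channel means."""
    images = np.asarray(images, dtype=np.float32)  # float copy of the uint8 input
    bgr = images[..., ::-1]  # reverse the channel axis
    return bgr - IMAGENET_BGR_MEANS


# A dummy batch of two 32x32 images with R=0, G=255, B=255 in every pixel.
batch = np.full((2, 32, 32, 3), 255, dtype=np.uint8)
batch[..., 0] = 0
out = vgg16_preprocess_sketch(batch)

print(out.shape)
print(out[0, 0, 0])
```

The output is zero-centred per channel rather than normalised, which is why the preprocessed arrays contain negative values.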

We initialize the VGG 16 model provided with Keras using the weights it learnt while being trained on the Imagenet dataset.
We set include_top = False so that the fully connected layers at the end of the model are left out, since we need to add our own trainable fully connected layers that classify across 10 classes.
We use pooling = "avg" so that the model gives a 2D output rather than a 4D output, which makes it easy to connect additional dense layers; otherwise we would need to Flatten it.
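To see what the 4D and 2D outputs look like for our 32x32 images, the short sketch below traces the spatial size through the five VGG 16 blocks. It assumes the standard architecture: "same"-padded convolutions that keep the size, followed by a 2x2 max pooling with stride 2 at the end of each block, and 512 channels in the last block. The helper name vgg16_spatial_sizes is made up for this illustration.

```python
def vgg16_spatial_sizes(input_size, num_blocks=5):
    """Spatial size after each block, assuming each block halves height and width."""
    sizes = [input_size]
    for _ in range(num_blocks):
        sizes.append(sizes[-1] // 2)
    return sizes


sizes = vgg16_spatial_sizes(32)
final_channels = 512  # channels produced by the last VGG16 block

# Without pooling the base returns a 4D tensor (batch, height, width, channels).
shape_without_pooling = (None, sizes[-1], sizes[-1], final_channels)
# With pooling="avg" height and width are averaged away, leaving (batch, channels).
shape_with_avg_pooling = (None, final_channels)

print(sizes)
print(shape_without_pooling, shape_with_avg_pooling)
```

For a 32x32 input the final feature map is already 1x1, so average pooling and Flatten would give the same 512 values here; for larger inputs they differ.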


In [79]:
base_model = tf.keras.applications.VGG16(
    include_top=False,
    weights="imagenet",
    input_shape=(32,32,3),
    pooling='avg',
    name="vgg16",
)

In [80]:
base_model.summary()

Below we freeze the layers of the model, as mentioned initially, and add additional trainable layers. Check the model summary to see the trainable and non-trainable parameters.

In [81]:
base_model.trainable = False

In [14]:
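# Build the classification head on top of the frozen base model.
inputs = tf.keras.layers.Input(shape=(32, 32, 3))
x = base_model(inputs)  # 2D output of shape (None, 512) because pooling='avg'
x = tf.keras.layers.Dense(64, activation='relu')(x)
x = tf.keras.layers.Dense(32, activation='relu')(x)
outputs = tf.keras.layers.Dense(10, activation='softmax')(x)  # one probability per class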

In [15]:
final_model = tf.keras.models.Model(inputs = inputs, outputs=outputs)

In [16]:
final_model.summary()
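The summary reports trainable and non-trainable parameter counts. As a cross-check, they can be worked out by hand. The sketch below assumes the standard VGG 16 layout of thirteen 3x3 convolutions (the channel list is written out from the published architecture, not read from the model) and the three Dense layers defined above; the helper names are hypothetical.

```python
# (input_channels, output_channels) of every 3x3 convolution in VGG16.
VGG16_CONVS = [
    (3, 64), (64, 64),                       # block1
    (64, 128), (128, 128),                   # block2
    (128, 256), (256, 256), (256, 256),      # block3
    (256, 512), (512, 512), (512, 512),      # block4
    (512, 512), (512, 512), (512, 512),      # block5
]


def conv3x3_params(c_in, c_out):
    """Weights of a 3x3 kernel plus one bias per output channel."""
    return 3 * 3 * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out


# Frozen part: all convolutional layers of the base model.
frozen = sum(conv3x3_params(c_in, c_out) for c_in, c_out in VGG16_CONVS)
# Trainable part: Dense(64) on the 512 pooled features, then Dense(32), Dense(10).
head = dense_params(512, 64) + dense_params(64, 32) + dense_params(32, 10)

print(frozen, head)
```

Under these assumptions only a small fraction of the parameters is updated during training, which is what makes this setup practical on a local machine.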

In [17]:
final_model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['sparse_categorical_accuracy'])

We trained for only a small number of epochs, since training is done on a local machine and we just want to show the idea, and achieved a val_accuracy of around 65%.

In [19]:
final_model.fit(x=x_train_preprocess, y=y_train, batch_size=64, epochs=5, validation_data=(x_test_preprocess, y_test))

Epoch 1/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 347s 444ms/step - loss: 0.9827 - sparse_categorical_accuracy: 0.6604 - val_loss: 1.0577 - val_sparse_categorical_accuracy: 0.6381
Epoch 2/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 334s 427ms/step - loss: 0.9253 - sparse_categorical_accuracy: 0.6797 - val_loss: 1.0203 - val_sparse_categorical_accuracy: 0.6509
Epoch 3/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 323s 413ms/step - loss: 0.8614 - sparse_categorical_accuracy: 0.7022 - val_loss: 1.0235 - val_sparse_categorical_accuracy: 0.6541
Epoch 4/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 339s 433ms/step - loss: 0.8175 - sparse_categorical_accuracy: 0.7139 - val_loss: 1.0350 - val_sparse_categorical_accuracy: 0.6515
Epoch 5/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 337s 431ms/step - loss: 0.7986 - sparse_categorical_accuracy: 0.7215 - val_loss: 1.0379 - val_sparse_categorical_ac

<keras.src.callbacks.history.History at 0x27d700c9cd0>
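The 782 shown in each progress line is the number of batches per epoch. It follows directly from the 50000 training images and the batch size of 64, as the small calculation below shows; the helper name steps_per_epoch is only for this illustration.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


train_steps = steps_per_epoch(50_000, 64)
# Size of the final, partially filled batch.
last_batch_size = 50_000 - (train_steps - 1) * 64

print(train_steps, last_batch_size)
```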

It is also possible to take the output of any intermediate layer of the pretrained model and add our own trainable layers from there, as shown below.

In [82]:
base_last_layer = base_model.get_layer('block4_conv3')

In [83]:
base_last_layer_output = base_last_layer.output

In [87]:
# block4_conv3 returns a 4D feature map, so flatten it before the dense layers.
z = tf.keras.layers.Flatten()(base_last_layer_output)
z = tf.keras.layers.Dense(64, activation='relu')(z)
z = tf.keras.layers.Dense(32, activation='relu')(z)
outputs1 = tf.keras.layers.Dense(10, activation='softmax')(z)


In [88]:
final_model1 = tf.keras.models.Model(inputs=base_model.input, outputs=outputs1)

In [89]:
final_model1.summary()
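Cutting the base model at block4_conv3 changes the size of the trainable head considerably, because the Flatten layer now feeds a whole feature map into the first Dense layer. The estimate below assumes the standard VGG 16 layout, in which three 2x2 max pooling layers come before block4_conv3 and that layer outputs 512 channels; the helper name dense_params is hypothetical.

```python
def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out


# Three pooling layers precede block4_conv3, each halving height and width.
spatial = 32 // 2 ** 3
channels = 512
flattened = spatial * spatial * channels  # length of the vector after Flatten

# Dense(64) on the flattened features, then Dense(32) and Dense(10).
head1 = dense_params(flattened, 64) + dense_params(64, 32) + dense_params(32, 10)

print(spatial, flattened, head1)
```

Compared with the 512 pooled features used by the first model, the first Dense layer here receives a much longer input vector, so the trainable head is correspondingly larger.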

In [90]:
final_model1.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['sparse_categorical_accuracy'])

In [93]:
final_model1.fit(x=x_train_preprocess, y=y_train, batch_size=64, epochs=5, validation_data=(x_test_preprocess, y_test))

Epoch 1/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 285s 364ms/step - loss: 1.0230 - sparse_categorical_accuracy: 0.6825 - val_loss: 1.3747 - val_sparse_categorical_accuracy: 0.6422
Epoch 2/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 284s 363ms/step - loss: 1.1718 - sparse_categorical_accuracy: 0.6416 - val_loss: 1.3340 - val_sparse_categorical_accuracy: 0.6110
Epoch 3/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 275s 351ms/step - loss: 1.0410 - sparse_categorical_accuracy: 0.6670 - val_loss: 1.2580 - val_sparse_categorical_accuracy: 0.6248
Epoch 4/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 277s 354ms/step - loss: 1.0460 - sparse_categorical_accuracy: 0.6553 - val_loss: 1.2053 - val_sparse_categorical_accuracy: 0.6223
Epoch 5/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 283s 362ms/step - loss: 1.0703 - sparse_categorical_accuracy: 0.6588 - val_loss: 1.3687 - val_sparse_categorical_ac

<keras.src.callbacks.history.History at 0x27de6b03590>