# Transfer Learning

# When and how to fine-tune
How do you decide what type of transfer learning you should perform on a new dataset? This is a function of several factors, but the two most important ones are the size of the new dataset (small or big), and its similarity to the original dataset (e.g. ImageNet-like in terms of the content of images and the classes, or very different, such as microscope images). Keeping in mind that ConvNet features are more generic in early layers and more original-dataset-specific in later layers, here are some common rules of thumb for navigating the 4 major scenarios:

1. New dataset is small and similar to the original dataset. Since the data is small, it is not a good idea to fine-tune the ConvNet due to overfitting concerns. Since the data is similar to the original data, we expect higher-level features in the ConvNet to be relevant to this dataset as well. Hence, the best idea might be to train a linear classifier on the CNN codes.
2. New dataset is large and similar to the original dataset. Since we have more data, we can have more confidence that we won't overfit if we were to try to fine-tune through the full network.
3. New dataset is small but very different from the original dataset. Since the data is small, it is likely best to only train a linear classifier. Since the dataset is very different, it might not be best to train the classifier from the top of the network, which contains more dataset-specific features. Instead, it might work better to train a linear classifier (e.g. an SVM) on activations from somewhere earlier in the network.
4. New dataset is large and very different from the original dataset. Since the dataset is very large, we may expect that we can afford to train a ConvNet from scratch. However, in practice it is very often still beneficial to initialize with weights from a pretrained model. In this case, we would have enough data and confidence to fine-tune through the entire network.
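
The four rules of thumb above can be read as a small lookup on two yes/no questions. The sketch below is only an illustration of that reading; the function name `choose_transfer_strategy` and its two boolean parameters are made up for this example and are not part of Keras.

```python
def choose_transfer_strategy(dataset_is_large: bool, similar_to_original: bool) -> str:
    """Map the two factors (size, similarity) to one of the four rules of thumb."""
    if not dataset_is_large and similar_to_original:
        # Scenario 1: small and similar
        return "train a linear classifier on the CNN codes"
    if dataset_is_large and similar_to_original:
        # Scenario 2: large and similar
        return "fine-tune through the full network"
    if not dataset_is_large and not similar_to_original:
        # Scenario 3: small and very different
        return "train a linear classifier on activations from an earlier layer"
    # Scenario 4: large and very different
    return "initialize from pretrained weights and fine-tune the entire network"


print(choose_transfer_strategy(dataset_is_large=False, similar_to_original=True))
```

What counts as "large" or "similar" is a judgment call, so the booleans here stand in for a decision the practitioner still has to make.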

In [28]:
import os

# Each sub-folder of images/ holds the pictures of one class
folders = os.listdir("images")
print(folders)

['cats', 'dogs', 'horses', 'humans']


In [29]:
image_data = []
labels = []

# Map each class folder name to an integer label
label_dict = {
    "cats": 0,
    "dogs": 1,
    "horses": 2,
    "humans": 3,
}

In [30]:
from keras.preprocessing import image

In [31]:
# Load every image, resize it to the 224x224 input used for ResNet50, and record its label
for ix in folders:
    path = os.path.join("images", ix)
    for im in os.listdir(path):
        img = image.load_img(os.path.join(path, im), target_size=(224, 224))
        img_array = image.img_to_array(img)
        image_data.append(img_array)
        labels.append(label_dict[ix])

In [32]:
print(len(image_data),len(labels))

580 580


In [33]:
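import random

# Shuffle images and labels together so each image keeps its own label
combined = list(zip(image_data, labels))
random.shuffle(combined)

# Unzip the shuffled pairs back into the two original lists (in place)
image_data[:], labels[:] = zip(*combined)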

In [34]:
print(labels[:5])

[1, 0, 1, 1, 1]
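
An alternative to the zip-and-unzip shuffle is to index both arrays with one shared random permutation, which works once the data is in NumPy arrays. This is a minimal sketch on toy data; `shuffle_together`, `toy_x`, and `toy_y` are names invented for the illustration.

```python
import numpy as np


def shuffle_together(x, y, seed=None):
    """Shuffle two equal-length arrays with the same random permutation."""
    x = np.asarray(x)
    y = np.asarray(y)
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))  # one permutation applied to both arrays
    return x[order], y[order]


# Toy stand-ins for image_data and labels: sample i carries label i % 4
toy_x = np.arange(8)
toy_y = toy_x % 4
shuffled_x, shuffled_y = shuffle_together(toy_x, toy_y, seed=0)

# Every sample still carries its own label after shuffling
print(np.array_equal(shuffled_x % 4, shuffled_y))
```

Passing a `seed` makes the shuffle reproducible, which helps when comparing training runs.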


In [35]:
import numpy as np
x_train = np.array(image_data)
y_train = np.array(labels)

print(x_train.shape)

(580, 224, 224, 3)
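
Note that `x_train` here still holds raw pixel values in the 0-255 range. Keras provides `keras.applications.resnet50.preprocess_input` for use with the ImageNet weights; as far as I know it applies "caffe"-style preprocessing, i.e. it flips RGB to BGR and subtracts the per-channel ImageNet mean without rescaling. The NumPy sketch below only illustrates that convention under this assumption; `caffe_style_preprocess` is a hypothetical helper, and in a real pipeline calling the Keras function directly would be the safer choice.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (assumed "caffe"-style convention)
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def caffe_style_preprocess(batch_rgb):
    """Convert an RGB batch (N, H, W, 3) to BGR and zero-center each channel."""
    batch = np.asarray(batch_rgb, dtype=np.float32)
    batch_bgr = batch[..., ::-1]          # reverse the channel axis: RGB -> BGR
    return batch_bgr - IMAGENET_MEAN_BGR  # subtract the mean, no scaling


# Toy batch: two all-black images, with one pure-red pixel in the first image
toy_batch = np.zeros((2, 224, 224, 3), dtype=np.float32)
toy_batch[0, 0, 0] = [255.0, 0.0, 0.0]
processed = caffe_style_preprocess(toy_batch)
print(processed.shape)
```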


In [36]:
from keras.utils import to_categorical

# One-hot encode the integer labels: (580,) -> (580, 4)
y_train = to_categorical(y_train)
print(y_train.shape)

(580, 4)
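
To make the one-hot step concrete, here is a small NumPy sketch of the same idea: each integer label becomes a row with a single 1 in the column of its class. The helper `one_hot` is written for this illustration and is not the Keras implementation.

```python
import numpy as np


def one_hot(labels, num_classes=None):
    """Turn integer class labels into one-hot rows of shape (n_samples, num_classes)."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        # Infer the number of classes from the largest label present
        num_classes = int(labels.max()) + 1
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0  # set one column per row
    return encoded


# The first five shuffled labels printed earlier, with the four classes made explicit
first_five = one_hot([1, 0, 1, 1, 1], num_classes=4)
print(first_five.shape)  # (5, 4)
```

Passing `num_classes` explicitly is the safer habit: when it is inferred from the largest label, a subset of the data that happens to miss the last class would produce fewer columns.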


In [37]:
# Imports for building the ResNet50-based model
from keras.applications.resnet50 import ResNet50
from keras.optimizers import Adam
from keras.layers import Dense, Dropout, GlobalAveragePooling2D
from keras.models import Model

import matplotlib.pyplot as plt


In [38]:
# Load ResNet50 with ImageNet weights, without its original 1000-class classification head
model = ResNet50(include_top=False, weights='imagenet', input_shape=(224, 224, 3))

In [39]:
model.summary()

Model: "resnet50"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_2[0][0]                    
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 112, 112, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 112, 112, 64) 256         conv1_conv[0][0]                 
___________________________________________________________________________________________

In [47]:
# New classification head on top of the pretrained convolutional base
av1 = GlobalAveragePooling2D()(model.output)
fc1 = Dense(256, activation='relu')(av1)
d1 = Dropout(0.5)(fc1)
fc2 = Dense(4, activation='softmax')(d1)  # one output per class

model_new = Model(inputs=model.input, outputs=fc2)
model_new.summary()



Model: "functional_4"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_2[0][0]                    
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 112, 112, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 112, 112, 64) 256         conv1_conv[0][0]                 
_______________________________________________________________________________________

In [48]:
# A small learning rate keeps the pretrained weights from changing too abruptly
adam = Adam(learning_rate=0.00003)
model_new.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])

In [49]:
for ix in range(len(model_new.layers)):
    print(ix, model_new.layers[ix])

0 <tensorflow.python.keras.engine.input_layer.InputLayer object at 0x000001A7F2004820>
1 <tensorflow.python.keras.layers.convolutional.ZeroPadding2D object at 0x000001A7EF59EAC0>
2 <tensorflow.python.keras.layers.convolutional.Conv2D object at 0x000001A7EF59E1F0>
3 <tensorflow.python.keras.layers.normalization_v2.BatchNormalization object at 0x000001A7EF46B850>
4 <tensorflow.python.keras.layers.core.Activation object at 0x000001A7C35859D0>
5 <tensorflow.python.keras.layers.convolutional.ZeroPadding2D object at 0x000001A7C3585F70>
6 <tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x000001A7EF5C4A00>
7 <tensorflow.python.keras.layers.convolutional.Conv2D object at 0x000001A7F1FF99A0>
8 <tensorflow.python.keras.layers.normalization_v2.BatchNormalization object at 0x000001A7F1C2E9A0>
9 <tensorflow.python.keras.layers.core.Activation object at 0x000001A78FD37B20>
10 <tensorflow.python.keras.layers.convolutional.Conv2D object at 0x000001A78FD3F1F0>
11 <tensorflow.python.keras.

In [50]:
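# Freeze the first 169 layers; only the remaining layers are updated during training
for ix in range(169):
    model_new.layers[ix].trainable = False

# Recompile so the changed trainable flags take effect
model_new.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
model_new.summary()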

Model: "functional_4"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_2[0][0]                    
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 112, 112, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 112, 112, 64) 256         conv1_conv[0][0]                 
_______________________________________________________________________________________

In [None]:
hist = model_new.fit(x_train,y_train,
                    shuffle = True,
                    batch_size = 16,
                    epochs = 20,
                    validation_split=0.20
                    )
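
One detail about `validation_split` is worth spelling out: Keras takes the validation samples from the end of the arrays, before any shuffling done by `fit`. That is why the images were shuffled earlier; otherwise the last 20% would come almost entirely from the last class folder. The sketch below mimics this tail split on toy data. `tail_validation_split` is a hypothetical helper, and its rounding may differ slightly from what Keras does internally.

```python
import numpy as np


def tail_validation_split(x, y, validation_split):
    """Hold out the last fraction of the samples as validation data."""
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be between 0 and 1")
    n_val = int(len(x) * validation_split)  # number of held-out samples
    split_at = len(x) - n_val
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])


# Toy data left in class order: without shuffling, the tail holds only the last labels
toy_x = np.arange(10)
toy_y = np.array([0, 0, 0, 1, 1, 1, 2, 2, 3, 3])
(train_x, train_y), (val_x, val_y) = tail_validation_split(toy_x, toy_y, 0.2)

print(val_y)  # only class 3 ends up in validation when the data is not shuffled
```

With unshuffled data the validation set above contains a single class, so the validation accuracy would say little about the other three.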