# Transfer learning

### What is Transfer Learning?

Transfer learning is a machine learning method in which a model developed for one task is reused as the starting point for a model on a second task.


### Why Transfer Learning?

- In practice, very few people train a convolutional network from scratch (with random initialisation), because a sufficiently large dataset is rarely available. Using pre-trained network weights as an initialisation, or the pre-trained network as a fixed feature extractor, helps with most problems at hand.


- Very deep networks are expensive to train. The most complex models take weeks to train on hundreds of machines equipped with expensive GPUs.


- Choosing the topology, flavour, training method and hyperparameters for deep learning is a black art, with little theory to guide you.


### How does Transfer Learning help?

When you look at what these deep learning networks learn (figure below), they detect edges in the earlier layers, shapes in the middle layers and high-level, data-specific features in the later layers. These trained networks are generally helpful for solving other computer vision problems. If you have a different set of images, it is usually better to start from the pre-trained weights than to train your architecture from scratch.

<img src = "https://cdn-images-1.medium.com/max/1000/1*L8NWufrce1Bt9aDIN7Tu4A.png">

### Let's look at how to do transfer learning using Keras, and at the various cases in transfer learning.

Start by importing the required libraries.

In [None]:
import os
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model 
from keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D
from keras import backend as k 
from keras.callbacks import ModelCheckpoint, LearningRateScheduler, TensorBoard, EarlyStopping

## Loading dataset

In this exercise, we use the cats vs dogs dataset.

In [None]:
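dataset_folder = 'catdog'

# Build the paths with os.path.join so that they work on any operating system
filename_path = os.path.join(os.getcwd(), dataset_folder)

model_train = os.path.join(filename_path, 'train')
model_val = os.path.join(filename_path, 'val')
model_test = os.path.join(filename_path, 'test')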

train_datagen = ImageDataGenerator(rescale=1./255, fill_mode='nearest')

valid_datagen = ImageDataGenerator(rescale=1./255)

test_datagen = ImageDataGenerator(rescale=1./255)

## Creating the training set
training_set = train_datagen.flow_from_directory(directory = model_train, # train dataset directory
                                                 color_mode = 'rgb',      # colour images; use 'grayscale' for grey images
                                                 target_size = (240, 240),# resize every image to 240 x 240 pixels
                                                 batch_size = 64, 
                                                 shuffle = True,          # shuffle so that each batch mixes cats and dogs
                                                 class_mode = 'binary')   # only two classes (cats and dogs), so labels are 0/1


##Creating Validation set
valid_set = valid_datagen.flow_from_directory(directory = model_val,
                                             color_mode = 'rgb',
                                             target_size = (240, 240),
                                             shuffle=False, 
                                             batch_size = 64,
                                             class_mode = 'binary')


##Creating Test set
test_set = test_datagen.flow_from_directory(directory=model_test,
                                            target_size=(240, 240),
                                            color_mode="rgb",
                                            batch_size=1,
                                            class_mode=None, # no labels: the generator yields images only
                                            shuffle=False,
                                            seed=42 )


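print(training_set.class_indices)   # mapping from class name to label index
print(len(training_set))            # number of batches per epoch
print(valid_set.n)                  # number of validation images
print(valid_set.batch_size)         # validation batch size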
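
`flow_from_directory` expects one subfolder per class inside each directory and, as far as we know, assigns label indices to the subfolder names in alphanumeric order. The sketch below reproduces that mapping with the standard library only. The folder names `cats` and `dogs` are assumptions for illustration, and `class_indices_for` is a hypothetical helper, not a Keras function.

```python
import os
import tempfile


def class_indices_for(directory):
    """Mimic how class labels are derived: sorted subfolder names -> 0, 1, ..."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}


# Build a throwaway directory tree shaped like catdog/train (hypothetical names).
with tempfile.TemporaryDirectory() as root:
    train_dir = os.path.join(root, "train")
    for class_name in ("dogs", "cats"):
        os.makedirs(os.path.join(train_dir, class_name))
    mapping = class_indices_for(train_dir)

print(mapping)  # {'cats': 0, 'dogs': 1}
```

Comparing this with the printed `class_indices` is a quick way to confirm which class the sigmoid output of 1 refers to.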

In [None]:
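img_width, img_height = 240, 240   # must match the target_size used by the generators above
# The values below are kept for reference only. The cells that follow use the
# generators defined above (batch size 64) and train for 5 epochs.
train_data_dir = "data/train"
validation_data_dir = "data/val"
nb_train_samples = 20000
nb_validation_samples = 5000
batch_size = 16
epochs = 50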

## 1.0 Transfer learning





As a first step, import the class for the chosen architecture (VGG16 in this case). For the list of architectures available in Keras, see the [Keras Applications page](https://keras.io/applications/).

In [None]:
from keras.applications.vgg16 import VGG16 # in this exercise we use the VGG16 architecture

Then instantiate the model and show its summary.



In [None]:
model = VGG16(weights = 'imagenet', include_top=False, input_shape = (img_width, img_height, 3))

In [None]:
model.summary()
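
With `include_top=False`, the summary ends at the last pooling layer. As a rough check of the output shape to expect, the sketch below assumes the standard VGG16 layout (five 2x2 max-pooling stages with stride 2, convolutions that preserve the spatial size, 512 channels in the last block). `vgg16_output_shape` is a hypothetical helper, not part of Keras.

```python
def vgg16_output_shape(width, height, n_pool=5, channels=512):
    """Halve each spatial dimension once per pooling stage (floor division)."""
    for _ in range(n_pool):
        width, height = width // 2, height // 2
    return (width, height, channels)


shape = vgg16_output_shape(240, 240)
flattened = shape[0] * shape[1] * shape[2]
print(shape)      # (7, 7, 512)
print(flattened)  # 25088
```

If the last line of `model.summary()` reports a different shape, the input size or the architecture differs from what is assumed here.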

The code further below determines which layers of VGG16 are frozen. The weights of a frozen layer are not updated during training.

There are **two** cases to consider when deciding whether to freeze layers or train the whole network again:

### 1 - New dataset is small and similar to original dataset:

If we try to train the entire network on a small dataset, we risk overfitting. Since the data is similar to the original data, we expect the higher-level features in the ConvNet to be relevant to this dataset as well. Hence, the best idea might be to freeze the convolutional layers and train only a linear classifier on the CNN codes (the features the network outputs).

<img src = "img/freeze.jpg">

### 2 - New dataset is large and similar to the original dataset

Since we have more data, we can have more confidence that we won’t overfit if we were to try to fine-tune through the full network.

<img src = "img/nf.jpg">


In [None]:
for layer in model.layers:
    layer.trainable = False
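
The loop above freezes every layer, which matches case 1. A common middle ground between the two cases is to unfreeze only the last convolutional block. The sketch below assumes the layer names used by the Keras VGG16 implementation (`block1_...` to `block5_...`) and builds a separate model called `base` with `weights=None` so that nothing is downloaded; in practice you would keep `weights='imagenet'`.

```python
from keras.applications.vgg16 import VGG16

# Separate, randomly initialised copy used only to illustrate the freezing logic.
base = VGG16(weights=None, include_top=False, input_shape=(240, 240, 3))

for layer in base.layers:
    # Train only the layers of the last convolutional block; freeze the rest.
    layer.trainable = layer.name.startswith("block5")

trainable_names = [layer.name for layer in base.layers if layer.trainable]
print(trainable_names)
```

Remember to compile the model again after changing `trainable`, and to use a small learning rate when fine-tuning pre-trained weights.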

Now let's define the new layers that will be trained on top of the frozen network.

In [None]:
x = model.output
x = Flatten()(x)
x = Dense(1024, activation="relu")(x)
x = Dropout(0.5)(x)
x = Dense(1024, activation="relu")(x)
predictions = Dense(1, activation="sigmoid")(x)
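
To get a feel for the size of this new head, the sketch below counts its trainable parameters. It assumes the VGG16 base outputs a 7 x 7 x 512 feature map for 240 x 240 inputs (five halvings of 240, 512 channels); `dense_params` is a hypothetical helper.

```python
def dense_params(n_inputs, n_units):
    """A Dense layer has one weight per input-unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units


flattened = 7 * 7 * 512                 # Flatten output, assuming a 7x7x512 feature map
head = [
    dense_params(flattened, 1024),      # first Dense(1024)
    dense_params(1024, 1024),           # second Dense(1024)
    dense_params(1024, 1),              # Dense(1) with sigmoid output
]
total = sum(head)
print(head)    # [25691136, 1049600, 1025]
print(total)   # 26741761
```

Almost all of these parameters sit in the first dense layer, which is one reason a small dataset can still overfit even when the base is frozen. Dropout adds no parameters.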

In [None]:
model_final = Model(inputs = model.input, outputs = predictions)

In [None]:
model_final.compile(loss = "binary_crossentropy", optimizer = optimizers.SGD(lr=0.0001, momentum=0.9), metrics=["accuracy"])
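
The `binary_crossentropy` loss compares the sigmoid output `p` with the 0/1 label `y` as `-(y*log(p) + (1-y)*log(1-p))`. A minimal plain-Python sketch of this per-sample formula follows. The clipping constant `eps` is an assumption added for numerical safety, and Keras computes the loss over whole batches rather than single values.

```python
import math


def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Per-sample binary cross-entropy for a label in {0, 1} and a probability."""
    p = min(max(p_pred, eps), 1.0 - eps)  # keep log() away from 0
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


confident_right = binary_crossentropy(1, 0.9)   # small loss
confident_wrong = binary_crossentropy(1, 0.1)   # large loss
print(round(confident_right, 4), round(confident_wrong, 4))  # 0.1054 2.3026
```

The loss grows quickly as a confident prediction moves away from the true label, which is what drives the weight updates in the new layers.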

Run it!

In [None]:
model_final.fit_generator(
        training_set,
        steps_per_epoch=training_set.n // training_set.batch_size,
        epochs=5,
        validation_data=valid_set,
        validation_steps=valid_set.n // valid_set.batch_size)
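
`steps_per_epoch` is the number of batches drawn per epoch. Using floor division, as in the call above, skips the final partial batch. The sketch below works through the arithmetic with the sample counts listed earlier in this notebook (20000 training and 5000 validation images) and the generators' batch size of 64. Whether your copy of the dataset has exactly these counts is an assumption.

```python
import math

batch_size = 64
n_train, n_val = 20000, 5000   # counts taken from nb_train_samples / nb_validation_samples

full_train_steps = n_train // batch_size             # full batches per training epoch
all_train_batches = math.ceil(n_train / batch_size)  # includes the final partial batch
full_val_steps = n_val // batch_size                 # full batches per validation pass

print(full_train_steps, all_train_batches, full_val_steps)  # 312 313 78
```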

**Expected output**: your output should be close to ours, your loss value should decrease, and the result should be better than in the previous exercise.

<table> 
<tr>
    <td> 
    **Train Accuracy   =**
    </td>

    <td> 
      0.8501
    </td> 
</tr> 

<tr>
    <td> 
    **Test Accuracy   =**
    </td>

    <td> 
      0.8814
    </td> 
</tr> 
</table>

### References:

- https://medium.com/@14prakash/transfer-learning-using-keras-d804b2e04ef8
- https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html