<a href="https://colab.research.google.com/github/Mikful/Coin-Collector/blob/master/Template_TensorFlow_2_0_Colab_6_Transfer_Learning_and_Fine_Tuning_with_MobileNetV2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![alt text](https://live.staticflickr.com/4544/38228876666_3782386ca7_b.jpg)

## Stage 1: Installing dependencies and setting up the GPU environment

In [0]:
!pip install tensorflow-gpu==2.0.0.alpha0

Collecting tensorflow-gpu==2.0.0.alpha0
  Downloading https://files.pythonhosted.org/packages/1a/66/32cffad095253219d53f6b6c2a436637bbe45ac4e7be0244557210dc3918/tensorflow_gpu-2.0.0a0-cp36-cp36m-manylinux1_x86_64.whl (332.1MB)
Collecting tf-estimator-nightly<1.14.0.dev2019030116,>=1.14.0.dev2019030115 (from tensorflow-gpu==2.0.0.alpha0)
  Downloading https://files.pythonhosted.org/packages/13/82/f16063b4eed210dc2ab057930ac1da4fbe1e91b7b051a6c8370b401e6ae7/tf_estimator_nightly-1.14.0.dev2019030115-py2.py3-none-any.whl (411kB)
Collecting tb-nightly<1.14.0a20190302,>=1.14.0a20190301 (from tensorflow-gpu==2.0.0.alpha0)
  Downloading https://files.pythonhosted.org/packages/a9/51/aa1d756644bf4624c03844115e4ac4058eff77acd786b26315f051a4b195/tb_nightly-1.14.0a20190301-py3-none-any.whl (3.0MB)
Installin

In [0]:
!pip install tqdm # progress bar



### Downloading the Dogs vs Cats dataset 

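- `!wget` downloads a file from a server to the local machine (the `!` prefix runs a shell command from the notebook)
- `--no-check-certificate` skips TLS certificate verification for the download
- the URL points to the hosted dataset archive
- `-O` (capital letter O, not zero) specifies the path the downloaded file is saved to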


In [0]:
!wget --no-check-certificate \
    https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip \
    -O ./cats_and_dogs_filtered.zip

--2019-10-13 19:30:26--  https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.124.128, 2607:f8b0:4001:c08::80
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.124.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 68606236 (65M) [application/zip]
Saving to: ‘./cats_and_dogs_filtered.zip’


2019-10-13 19:30:27 (156 MB/s) - ‘./cats_and_dogs_filtered.zip’ saved [68606236/68606236]
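
Outside Colab, where `wget` may not be available, the same download can be sketched with the Python standard library. This is only a minimal sketch and not part of the original notebook: `download_file` is a hypothetical helper name, and unlike the `wget` call above it leaves certificate verification switched on.

```python
import shutil
import urllib.request


def download_file(url, destination):
    """Stream the resource at `url` into the local file `destination`."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return destination


# Usage (commented out so the sketch does not hit the network when run):
# download_file(
#     "https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip",
#     "./cats_and_dogs_filtered.zip",
# )
```

Streaming with `shutil.copyfileobj` avoids holding the whole 65M archive in memory at once.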



## Stage 2: Dataset preprocessing

### Import project dependencies


In [0]:
import os # Operating System for paths, folders etc
import zipfile # For unzipping the dataset
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from tqdm import tqdm_notebook # progress bar
from tensorflow.keras.preprocessing.image import ImageDataGenerator # to create automatic image preprocessing pipeline

# Keep all visualizations inside the notebook. The comment sits on its own line
# because IPython treats any text after a magic command as its arguments.
%matplotlib inline
tf.__version__ # print tf version


### Unzipping the Dogs vs Cats dataset

In [0]:
dataset_path = "./cats_and_dogs_filtered.zip"

In [0]:
zip_object = zipfile.ZipFile(file=dataset_path, mode="r")

In [0]:
zip_object.extractall("./")

In [0]:
zip_object.close()
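
The open, extract, close sequence above can also be written with a context manager, which closes the archive even if extraction raises an error. A minimal sketch; `extract_archive` is a hypothetical helper name, not something the notebook defines.

```python
import zipfile


def extract_archive(archive_path, target_dir="./"):
    """Extract every member of a zip archive and return the member names."""
    with zipfile.ZipFile(archive_path, mode="r") as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Usage with the notebook's archive (requires the file to exist):
# extract_archive("./cats_and_dogs_filtered.zip", "./")
```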

### Setting up dataset paths

In [0]:
dataset_path_new = "./cats_and_dogs_filtered/"

In [0]:
train_dir = os.path.join(dataset_path_new, "train")
validation_dir = os.path.join(dataset_path_new, "validation")
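
Before building the generators it can help to confirm the folder layout, since `flow_from_directory` expects one sub-folder per class inside each split directory. The sketch below counts the files in each class sub-folder. `count_files_per_class` is a hypothetical helper, not part of the notebook.

```python
import os


def count_files_per_class(split_dir):
    """Map each class sub-folder of `split_dir` to the number of files in it."""
    counts = {}
    for entry in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, entry)
        if os.path.isdir(class_dir):
            counts[entry] = sum(
                1 for name in os.listdir(class_dir)
                if os.path.isfile(os.path.join(class_dir, name))
            )
    return counts


# Usage, assuming the archive has been extracted as in the cells above:
# print(count_files_per_class("./cats_and_dogs_filtered/train"))
```

For the training split, the counts should add up to the 2000 images that the training generator later reports.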

## Building the model

### Loading the pre-trained model (MobileNetV2)

In [0]:
IMG_SHAPE = (128, 128, 3) # (height, width, 3 RGB channels)

- `tf.keras.applications` provides many pre-trained models to choose from (Inception, ResNet, MobileNet, etc.)

- `input_shape=IMG_SHAPE` -- (height, width, 3 RGB channels)
- `include_top=False` -- drops the original ImageNet classification layers so that we can attach a custom head
- `weights="imagenet"` -- loads weights pre-trained on ImageNet


In [0]:
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE, include_top=False, weights="imagenet")

Downloading data from https://github.com/JonathanCMitchell/mobilenet_v2_keras/releases/download/v1.1/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_128_no_top.h5


In [0]:
base_model.summary()

Model: "mobilenetv2_1.00_128"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 128, 128, 3) 0                                            
__________________________________________________________________________________________________
Conv1_pad (ZeroPadding2D)       (None, 129, 129, 3)  0           input_1[0][0]                    
__________________________________________________________________________________________________
Conv1 (Conv2D)                  (None, 64, 64, 32)   864         Conv1_pad[0][0]                  
__________________________________________________________________________________________________
bn_Conv1 (BatchNormalizationV1) (None, 64, 64, 32)   128         Conv1[0][0]                      
_______________________________________________________________________________

### Freezing the base model

- Freeze the `base_model` first, so that training only updates the weights of the custom head we add on top

In [0]:
base_model.trainable = False

### Defining the custom head for our network

1. Check the output shape of the base model: `base_model.output`
2. Define `global_average_layer` with `GlobalAveragePooling2D`, which averages each feature map over its spatial dimensions (height and width)
3. The shape of `global_average_layer` should be (None, last dimension of `base_model.output`)
4. Define the output/prediction layer:

`prediction_layer = tf.keras.layers.Dense(units=1, activation='sigmoid')(global_average_layer)`

- `units=` -- normally the number of output classes in the dataset. Binary classification is the exception: a single unit is enough, because it outputs the probability of one class.
- `activation='sigmoid'` -- squashes the output into the range (0, 1) so it can be read as a probability for binary classification
- `(global_average_layer)` -- the layer takes `global_average_layer` as its input

In [0]:
base_model.output

<tf.Tensor 'out_relu/Relu6:0' shape=(None, 4, 4, 1280) dtype=float32>

In [0]:
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()(base_model.output)

In [0]:
global_average_layer # check shape (should be (None, last dimension of base_model.output))

<tf.Tensor 'global_average_pooling2d/Mean:0' shape=(None, 1280) dtype=float32>

In [0]:
prediction_layer = tf.keras.layers.Dense(units=1, activation='sigmoid')(global_average_layer)
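
To make the two head layers concrete, the sketch below reproduces their arithmetic in plain Python on a toy feature map. It is an illustration under stated assumptions, not the Keras implementation: the helper names and the toy numbers are made up, and a single image is used rather than a batch. In the notebook the same averaging turns the `(None, 4, 4, 1280)` base output into `(None, 1280)`.

```python
import math


def global_average_pool(feature_map):
    """Average an (H, W, C) nested list over H and W, giving C values."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    return [
        sum(feature_map[i][j][k] for i in range(height) for j in range(width))
        / (height * width)
        for k in range(channels)
    ]


def dense_sigmoid(features, weights, bias):
    """One Dense unit with sigmoid activation: sigmoid(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy 2x2 feature map with 3 channels (made-up numbers, not real activations).
toy_map = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 0.0, 2.0], [7.0, 0.0, 2.0]],
]
pooled = global_average_pool(toy_map)                       # one value per channel
probability = dense_sigmoid(pooled, [0.0, 0.0, 0.0], 0.0)   # all-zero weights
print(pooled, probability)  # [4.0, 0.0, 2.0] 0.5
```

With all-zero weights the unit outputs 0.5, i.e. it has no preference between the two classes. Training moves the weights so that the probability leans towards the correct class.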

### Defining the model

- `tf.keras.models.Model()` is used instead of `Sequential` because it lets us specify the inputs (`base_model.input`) and outputs (`prediction_layer`) explicitly, connecting the pre-trained base to the custom head

In [0]:
model = tf.keras.models.Model(inputs=base_model.input, outputs=prediction_layer)

In [0]:
model.summary()

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 128, 128, 3) 0                                            
__________________________________________________________________________________________________
Conv1_pad (ZeroPadding2D)       (None, 129, 129, 3)  0           input_1[0][0]                    
__________________________________________________________________________________________________
Conv1 (Conv2D)                  (None, 64, 64, 32)   864         Conv1_pad[0][0]                  
__________________________________________________________________________________________________
bn_Conv1 (BatchNormalizationV1) (None, 64, 64, 32)   128         Conv1[0][0]                      
______________________________________________________________________________________________

### Compiling the model

`compile()` takes three arguments here:

- Optimizer: `optimizer=tf.keras.optimizers.RMSprop(lr=0.0001)`. RMSprop is reported to work well with MobileNet (this choice was made using Stack Overflow info). The learning rate `lr` is kept small because the model is pre-trained.
- Loss function: `binary_crossentropy`, the standard loss for binary classification.
- Metrics: `accuracy`, a simple metric to monitor.

In [0]:
model.compile(optimizer=tf.keras.optimizers.RMSprop(lr=0.0001), loss="binary_crossentropy", metrics=["accuracy"])
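
As a rough illustration of what the `binary_crossentropy` loss measures, the sketch below computes it by hand for two toy predictions. It is a plain-Python approximation, not the Keras implementation: the function name matches the loss string only for readability, and the clipping constant `epsilon` is an assumption added to avoid `log(0)`.

```python
import math


def binary_crossentropy(y_true, y_pred, epsilon=1e-7):
    """Mean binary cross-entropy for labels in {0, 1} and probabilities in (0, 1)."""
    total = 0.0
    for label, prob in zip(y_true, y_pred):
        prob = min(max(prob, epsilon), 1.0 - epsilon)  # avoid log(0)
        total += -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))
    return total / len(y_true)


confident = binary_crossentropy([1, 0], [0.9, 0.1])  # correct and confident
mistaken = binary_crossentropy([1, 0], [0.1, 0.9])   # wrong and confident
print(confident, mistaken)
```

The first value is about 0.105 and the second about 2.303: confident wrong predictions are penalized far more heavily than confident correct ones are rewarded.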

### Creating Data Generators

**Resizing images:**

Large pre-trained architectures support only certain input sizes.

For example, MobileNet (the architecture we use) supports (96, 96), (128, 128), (160, 160), (192, 192) and (224, 224).

In [0]:
data_gen_train = ImageDataGenerator(rescale=1/255.)
data_gen_valid = ImageDataGenerator(rescale=1/255.)

**Create generators:**

`flow_from_directory` reads images from the folder in batches as they are needed, so the whole dataset never has to be held in RAM at once.

1.   Folder path: e.g. `train_dir`
2.   `target_size=(128,128)` -- should be the same as the `IMG_SHAPE` defined earlier (without the 3 RGB channels)
3. `batch_size=` -- how many images are fed to the model at a time
4. `class_mode=` -- the label format, e.g. `"binary"`, `"categorical"` or `"input"`



In [0]:
train_generator = data_gen_train.flow_from_directory(train_dir, target_size=(128,128), batch_size=128, class_mode="binary")

Found 2000 images belonging to 2 classes.


In [0]:
valid_generator = data_gen_valid.flow_from_directory(validation_dir, target_size=(128,128), batch_size=128, class_mode="binary")

Found 1000 images belonging to 2 classes.
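
The image counts and the batch size together determine how many batches make up one epoch. A small sketch of that arithmetic, using the counts reported above; `steps_per_epoch` here is a hypothetical helper, not a Keras call.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (last batch may be smaller)."""
    return math.ceil(num_images / batch_size)


train_steps = steps_per_epoch(2000, 128)  # 2000 training images
valid_steps = steps_per_epoch(1000, 128)  # 1000 validation images
print(train_steps, valid_steps)  # 16 8
```

Since 2000 is not a multiple of 128, the last training batch holds only 80 images (2000 - 15 * 128).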


### Training the model

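Because the data comes from generators, use `model.fit_generator()` rather than `model.fit()`. (In later TensorFlow 2.x releases `model.fit()` accepts generators directly and `fit_generator()` is deprecated.)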

Inputs:

1.   Training data: `train_generator`
2.   Epochs
3. Validation data: `validation_data=valid_generator`



In [0]:
model.fit_generator(train_generator, epochs=5, validation_data=valid_generator)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7fc615522c18>

### Transfer learning model evaluation

In [0]:
valid_loss, valid_accuracy = model.evaluate_generator(valid_generator)

In [0]:
print("Accuracy after transfer learning: {}".format(valid_accuracy))

Accuracy after transfer learning: 0.9549999833106995


## Fine tuning


There are a few pointers:

- Do NOT fine-tune the whole network; unfreezing a few top layers is enough. These layers are usually the most specialized, and the goal of fine-tuning is to adapt that specific part of the network to our custom (new) dataset.
- Start fine-tuning only AFTER the transfer learning step is finished. If we fine-tune immediately, the gradients of the untrained custom head and of the unfrozen base layers will differ too much, and the resulting large updates can damage the pre-trained weights.
- If you have a small dataset, fine-tuning may not be a good idea, because the model may overfit.

### Un-freeze a few top layers from the model

In [0]:
base_model.trainable = True

In [0]:
print("Number of layers in the base model: {}".format(len(base_model.layers)))

Number of layers in the base model: 155


In [0]:
fine_tune_at = 120

In [0]:
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False
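
To see how many layers this slicing leaves trainable, the sketch below repeats the loop on stand-in objects. `StandInLayer` is a hypothetical placeholder that only tracks a `trainable` flag, not a real Keras layer. The count of 155 comes from the output above.

```python
class StandInLayer:
    """Hypothetical stand-in for a Keras layer; only tracks `trainable`."""

    def __init__(self):
        self.trainable = True  # mirrors the effect of base_model.trainable = True


layers = [StandInLayer() for _ in range(155)]  # the base model reports 155 layers
fine_tune_at = 120

# Same slicing as the notebook: freeze every layer before index 120.
for layer in layers[:fine_tune_at]:
    layer.trainable = False

num_trainable = sum(layer.trainable for layer in layers)
num_frozen = len(layers) - num_trainable
print(num_trainable, num_frozen)  # 35 120
```

Only the top 35 layers (indices 120 to 154) are updated during fine-tuning, along with the custom head.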

### Compiling the model for fine-tuning

In [0]:
model.compile(optimizer=tf.keras.optimizers.RMSprop(lr=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

### Fine tuning

In [0]:
model.fit_generator(train_generator,  
                    epochs=5, 
                    validation_data=valid_generator)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7fc59f8fbf98>

### Evaluating the fine tuned model

In [0]:
valid_loss, valid_accuracy = model.evaluate_generator(valid_generator)

In [0]:
print("Validation accuracy after fine tuning: {}".format(valid_accuracy))

Validation accuracy after fine tuning: 0.9670000076293945
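
To put the two validation accuracies in perspective, the sketch below converts them into approximate counts of correctly classified images. It uses only the numbers reported above; `correct_predictions` is a hypothetical helper.

```python
def correct_predictions(accuracy, num_images):
    """Approximate number of correctly classified images for a given accuracy."""
    return round(accuracy * num_images)


num_validation_images = 1000  # "Found 1000 images" for the validation set
before = correct_predictions(0.9549999833106995, num_validation_images)
after = correct_predictions(0.9670000076293945, num_validation_images)
print(before, after, after - before)  # 955 967 12
```

On this run, fine-tuning corrected roughly 12 additional validation images out of 1000.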
