<a href="https://colab.research.google.com/github/rahiakela/deep_learning_for_vision_systems/blob/master/6-transfer-learning/2_pretrained_network_as_feature_extractor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Project 1: A pretrained network as a feature extractor

In this project, we use a very small amount of data to train a classifier that detects images of dogs and cats. This is a pretty simple project, but the goal of the exercise is to see how to implement transfer learning when you have a very small amount of data and the target domain is similar to the source domain.

We will use the pretrained convolutional network as a feature
extractor. This means we are going to freeze the feature extractor part of the network, add our own classifier, and then retrain the network on our new small dataset.

For this implementation, we’ll use VGG16. Although it didn’t record the
lowest error in the ILSVRC, I found that it worked well for the task and was quicker to train than other models. I got an accuracy of about 95%, but feel free to experiment with GoogLeNet or ResNet and compare results.

There are three major transfer learning approaches as follows:

1. Pretrained network as a classifier
2. Pretrained network as feature extractor
3. Fine tuning

Each approach can be effective and save significant time in developing and training a deep convolutional neural network model. It is not always clear in advance which of these approaches will yield the best results on a new computer vision task, so some experimentation may be required.

## Setup

In [1]:
import tensorflow as tf
from tensorflow import keras

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Flatten, Dropout, BatchNormalization
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array
from tensorflow.keras.applications import mobilenet, imagenet_utils
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.optimizers import Adam, SGD
from tensorflow.keras.metrics import CategoricalCrossentropy
from tensorflow.keras.utils import to_categorical
import tensorflow.keras.backend as K

from sklearn.metrics import confusion_matrix
from sklearn.datasets import load_files

import numpy as np
import itertools
from tqdm import tqdm

import matplotlib.pyplot as plt
%matplotlib inline

## Pretrained network as a feature extractor

Another important takeaway from this project is learning how to preprocess custom data and get it ready to train your neural network. In previous projects, we used the CIFAR and MNIST datasets, which Keras provides already preprocessed; all we had to do was download them from the Keras library and use them directly to train the network. In this project, I’ll show how to structure your data repository and use the Keras library to get your data ready.

The process to use a pretrained model as a feature extractor is well-established:

1. Preprocess the data to make it ready for the neural network.
2. Load in pretrained weights from the VGG16 network trained on a large dataset.
3. Freeze all the weights in the convolutional layers (the feature extraction part). Remember that the choice of layers to freeze depends on how similar the new task is to the original dataset. In our case, ImageNet contains many images of dogs and cats, so the network has already been trained to extract the detailed features of our target objects.

4. Replace the fully-connected layers of the network with a custom classifier. You can add as many FC layers as you see fit, each with as many hidden units as you want. For a simple problem like this, we are just going to add one hidden layer with 64 units. You can observe the results and tune the size up if the model is underfitting or down if it is overfitting. For the softmax layer, the number of units must equal the number of classes (2 units in our case).
5. Compile the network and run the training process on the new data of cats and dogs to optimize the model for the smaller dataset.
6. Evaluate the model.

Now let’s go through these steps one-by-one and implement this project.

## 1- Preprocess the data to make it ready for the neural network

Keras provides the ImageDataGenerator class, which makes it easy to perform image augmentation on the fly. You can read about it in Keras’s official [documentation](https://keras.io/api/preprocessing/image/). In this example, we are going to use the ImageDataGenerator class to generate our image tensors, but for simplicity we are not going to implement image augmentation.

The ImageDataGenerator class has a method called flow_from_directory() that reads images from a folder containing one subfolder per class. This method expects your data directory to be structured as follows:

<img src='https://github.com/rahiakela/img-repo/blob/master/deep_learning_for_vision_systems/directory-structure.png?raw=1' width='800'/>

I have structured the data in the GitHub repo so that it is ready to use with the
flow_from_directory() method.
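
As a rough illustration of the layout that flow_from_directory() expects, the sketch below builds an empty skeleton with one subfolder per class under each split. The `cats` and `dogs` folder names and the temporary root directory are assumptions made for illustration; check the extracted dataset for the actual names.

```python
import tempfile
from pathlib import Path

# Hypothetical class folder names; the real dataset may name them differently.
splits = ["train", "valid", "test"]
classes = ["cats", "dogs"]

root = Path(tempfile.mkdtemp()) / "data"
for split in splits:
    for cls in classes:
        # One subfolder per class; the images of that class go inside it.
        (root / split / cls).mkdir(parents=True, exist_ok=True)

# List the resulting directory tree relative to the root.
layout = sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir())
for entry in layout:
    print(entry)
```

Each class subfolder name becomes a class label, so no separate label file is needed.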

In [2]:
# download dataset
!git clone https://github.com/rahiakela/machine-learning-datasets -b dogs_vs_cats_dataset

Cloning into 'machine-learning-datasets'...
remote: Enumerating objects: 4119, done.
remote: Counting objects: 100% (4119/4119), done.
remote: Compressing objects: 100% (3465/3465), done.
remote: Total 4119 (delta 655), reused 4117 (delta 653), pack-reused 0
Receiving objects: 100% (4119/4119), 51.90 MiB | 29.52 MiB/s, done.
Resolving deltas: 100% (655/655), done.


In [3]:
!ls

machine-learning-datasets  sample_data


In [4]:
import zipfile

# Unzipping files
zip_file = "machine-learning-datasets/dogs_vs_cats_dataset.zip"
with zipfile.ZipFile(zip_file, 'r') as zip_ref:
    zip_ref.extractall(".")

Now, let’s load the data into train_path, valid_path, and
test_path variables then generate the train, valid, and test batches:

In [5]:
train_path  = 'dogs_vs_cats_dataset/data/train'
valid_path  = 'dogs_vs_cats_dataset/data/valid'
test_path  = 'dogs_vs_cats_dataset/data/test'

# ImageDataGenerator generates batches of tensor image data, optionally with real-time data augmentation.
# The data will be looped over (in batches). In this example, we won't be doing any image augmentation.
train_batches = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory(train_path, target_size=(224, 224), batch_size=10)
valid_batches = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory(valid_path, target_size=(224, 224), batch_size=30)
test_batches  = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory(test_path, target_size=(224, 224), batch_size=50, shuffle=False)

Found 202 images belonging to 2 classes.
Found 103 images belonging to 2 classes.
Found 451 images belonging to 2 classes.
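
To make the role of preprocess_input less of a black box, here is a NumPy sketch of what the VGG16 preprocessing is understood to do: reorder the channels from RGB to BGR and subtract the ImageNet per-channel means, without rescaling. The function name `vgg16_preprocess_sketch` and the test image are made up for illustration; in the project itself, keep using Keras’s preprocess_input.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by the VGG16 "caffe"-style preprocessing.
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg16_preprocess_sketch(batch_rgb):
    """Approximate VGG16 preprocessing for a batch of RGB images with values in 0-255."""
    batch = np.asarray(batch_rgb, dtype=np.float32)
    batch_bgr = batch[..., ::-1]           # RGB -> BGR
    return batch_bgr - IMAGENET_MEAN_BGR   # zero-center each channel, no rescaling

# Hypothetical input: one pure-red 2x2 image.
red = np.zeros((1, 2, 2, 3), dtype=np.float32)
red[..., 0] = 255.0
out = vgg16_preprocess_sketch(red)
print(out[0, 0, 0])
```

The same preprocessing has to be applied at training and at test time, which is why preprocess_input appears both in the generators and later when the test tensors are built.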


## 2- Download VGGNet and create network

We are going to download the VGG16 network from Keras along with its weights pretrained on the ImageNet dataset. Remember that we want to remove the classifier part of this network, so we set the parameter include_top=False.

<img src='https://github.com/rahiakela/img-repo/blob/master/deep_learning_for_vision_systems/transfer_network.png?raw=1' width='800'/>


In [6]:
base_model = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base_model.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)    

## 3- Freeze all the weights in the convolutional layers

We will freeze the convolutional layers from the base_model created from the previous step and use that as a feature extractor, then add a classifier on top of it in the next step.

In [7]:
# iterate through its layers and lock them to make them not trainable
for layer in base_model.layers:
  layer.trainable = False
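
To see what freezing means for training, here is a minimal NumPy sketch (not the Keras implementation): a fixed random projection stands in for the pretrained convolutional base, and only a small logistic-regression head is updated by gradient descent. The data, layer sizes, and learning rate are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples, 4 inputs, binary label (all hypothetical).
X = rng.normal(size=(200, 4))
y = (X[:, 0] > 0).astype(np.float64)

# Stand-in for the pretrained feature extractor: weights that are never updated.
W_frozen = 0.5 * rng.normal(size=(4, 8))
W_before = W_frozen.copy()

# Trainable classifier head, initialised at zero.
w_head = np.zeros(8)
b_head = 0.0

def forward(inputs):
    features = np.maximum(inputs @ W_frozen, 0.0)  # frozen ReLU features
    probs = 1.0 / (1.0 + np.exp(-(features @ w_head + b_head)))
    return features, probs

def log_loss(probs, targets):
    eps = 1e-12
    return -np.mean(targets * np.log(probs + eps) + (1 - targets) * np.log(1 - probs + eps))

_, probs = forward(X)
initial_loss = log_loss(probs, y)

for _ in range(200):
    features, probs = forward(X)
    error = probs - y
    # Gradients are computed and applied for the head only.
    w_head -= 0.1 * features.T @ error / len(y)
    b_head -= 0.1 * error.mean()

_, probs = forward(X)
final_loss = log_loss(probs, y)
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

The loss goes down even though `W_frozen` never changes, which is the same situation as training a new classifier on top of frozen VGG16 layers.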

In [8]:
base_model.summary()

Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0     

## 4- Replace the fully-connected layers of the network with a custom classifier

Now let's add a few layers on top of the base model. In this example, we will add one FC layer with 64 hidden units and a softmax layer with 2 units. We will also add batch norm and dropout layers to reduce overfitting.

In [9]:
# use “get_layer” method to save the last layer of the network
# save the output of the last layer to be the input of the next layer
last_layer = base_model.get_layer('block5_pool')
last_output = last_layer.output

# flatten the classifier input which is output of the last layer of VGG16 model
x = Flatten()(last_output)

# add 1 FC layer with 64 units, followed by batchnorm, dropout, and softmax layers
x = Dense(64, activation='relu', name='FC_2')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
# add our new softmax layer with 2 units (one per class)
x = Dense(2, activation='softmax', name='softmax')(x)

# instantiate a new_model using keras’s Model class
new_model = Model(inputs=base_model.input, outputs=x)
new_model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0     
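
Since the summary above is cut off, the parameter counts can be worked out by hand. The sketch below assumes the standard VGG16 layout (13 convolutional layers with 3x3 kernels) and the classifier defined in the previous cell; the helper name `conv3x3_params` is made up for illustration.

```python
def conv3x3_params(c_in, c_out):
    # 3x3 kernel weights plus one bias per output channel.
    return (3 * 3 * c_in + 1) * c_out

# (input channels, output channels) for the 13 conv layers of standard VGG16.
vgg16_convs = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]
frozen_params = sum(conv3x3_params(i, o) for i, o in vgg16_convs)

flat_features = 7 * 7 * 512           # block5_pool output for a 224x224 input, flattened
fc_params = flat_features * 64 + 64   # Dense(64): weights + biases
bn_params = 4 * 64                    # gamma, beta, moving mean, moving variance
softmax_params = 64 * 2 + 2           # Dense(2): weights + biases

# Only gamma and beta of batch norm are trained; the moving statistics are not.
trainable_params = fc_params + 2 * 64 + softmax_params

print("frozen (conv base):", frozen_params)
print("trainable (new classifier):", trainable_params)
```

Under these assumptions, roughly 1.6 million of the 16.3 million parameters are trained, almost all of them in the first FC layer.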

## 5- Compile the network and run the training process

In [10]:
new_model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
# Model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2.x
new_model.fit(train_batches, steps_per_epoch=4, validation_data=valid_batches, validation_steps=2, epochs=20, verbose=2)



Epoch 1/20
4/4 - 63s - loss: 1.1611 - accuracy: 0.4500 - val_loss: 3.6035 - val_accuracy: 0.5167
Epoch 2/20
4/4 - 51s - loss: 1.1480 - accuracy: 0.6250 - val_loss: 1.4930 - val_accuracy: 0.7000
Epoch 3/20
4/4 - 50s - loss: 0.5984 - accuracy: 0.7000 - val_loss: 0.7127 - val_accuracy: 0.8000
Epoch 4/20
4/4 - 50s - loss: 0.3281 - accuracy: 0.8250 - val_loss: 0.7115 - val_accuracy: 0.8167
Epoch 5/20
4/4 - 50s - loss: 0.5290 - accuracy: 0.7750 - val_loss: 0.3947 - val_accuracy: 0.8833
Epoch 6/20
4/4 - 47s - loss: 0.2074 - accuracy: 0.9375 - val_loss: 0.2994 - val_accuracy: 0.9000
Epoch 7/20
4/4 - 50s - loss: 0.1847 - accuracy: 0.9250 - val_loss: 0.1828 - val_accuracy: 0.9333
Epoch 8/20
4/4 - 50s - loss: 0.3079 - accuracy: 0.8500 - val_loss: 0.2062 - val_accuracy: 0.9333
Epoch 9/20
4/4 - 50s - loss: 0.2192 - accuracy: 0.9000 - val_loss: 0.3646 - val_accuracy: 0.8500
Epoch 10/20
4/4 - 61s - loss: 0.0332 - accuracy: 1.0000 - val_loss: 0.3331 - val_accuracy: 0.8833
Epoch 11/20
4/4 - 50s - loss:

<tensorflow.python.keras.callbacks.History at 0x7fd4cac937d0>
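
It is worth checking how much data each epoch actually sees with these settings. The arithmetic below uses the image counts and batch sizes reported earlier; the variable names are made up for illustration.

```python
import math

train_images, train_batch = 202, 10
valid_images, valid_batch = 103, 30

# With steps_per_epoch=4 and validation_steps=2, as in the training call above:
seen_per_epoch = 4 * train_batch        # training images used per epoch
validated_per_epoch = 2 * valid_batch   # validation images used per epoch

# Steps that would be needed to cover every image once per epoch.
full_train_steps = math.ceil(train_images / train_batch)
full_valid_steps = math.ceil(valid_images / valid_batch)

print(seen_per_epoch, validated_per_epoch)
print(full_train_steps, full_valid_steps)
```

Each epoch therefore covers only 40 of the 202 training images, which is consistent with the logged accuracies being multiples of 1/40 (for example, 0.4500 is 18 out of 40). Raising steps_per_epoch toward 21 would use the whole training set each epoch, at the cost of longer epochs.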

Notice that the model trained quickly using regular CPU computing power. In the log above, each epoch took roughly 50 to 60 seconds, which puts 20 epochs of training at about 15 to 20 minutes.

## 6- Evaluate the model

To evaluate the model, we first load the test file paths and their one-hot encoded labels:

In [13]:
def load_dataset(path):
  data = load_files(path)
  paths = np.array(data['filenames'])
  targets = to_categorical(np.array(data['target']))

  return paths, targets

In [14]:
test_files, test_targets = load_dataset("dogs_vs_cats_dataset/data/test")
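
The to_categorical call turns the integer class indices returned by load_files into one-hot rows that match the 2-unit softmax output. A minimal NumPy stand-in, with a hypothetical `one_hot` helper and made-up labels, looks like this (class indices are assumed to follow the alphabetical order of the class folder names):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Minimal stand-in for to_categorical: integer labels -> one-hot rows."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype=np.float32)[labels]

# Hypothetical integer targets for four images.
encoded = one_hot([0, 1, 1, 0], num_classes=2)
print(encoded)
```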

Then, we create test_tensors to evaluate the model on them:

In [15]:
def path_to_tensor(img_path): 
  # loads RGB image as PIL.Image.Image type
  img = load_img(img_path, target_size=(224, 224))

  # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
  x = img_to_array(img)

  # convert 3D tensor to 4D tensor with shape (1, 224, 224, 3) and return 4D tensor 
  return np.expand_dims(x, axis=0)

In [16]:
def paths_to_tensor(img_paths):
  list_of_tensors = [path_to_tensor(img_path) for img_path in tqdm(img_paths)]

  return np.vstack(list_of_tensors)
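
The shape bookkeeping in these two helpers can be checked without loading any images. In the sketch below, `fake_path_to_tensor` is a hypothetical stand-in that returns a blank image instead of reading a file:

```python
import numpy as np

def fake_path_to_tensor(height=224, width=224):
    # Stand-in for loading one RGB image: (H, W, 3) -> (1, H, W, 3).
    image = np.zeros((height, width, 3), dtype=np.float32)
    return np.expand_dims(image, axis=0)

single = fake_path_to_tensor()
# Stacking n tensors of shape (1, H, W, 3) along the first axis gives (n, H, W, 3).
batch = np.vstack([fake_path_to_tensor() for _ in range(5)])
print(single.shape, batch.shape)
```

The leading dimension is the batch axis that Keras models expect, so the stacked array can be passed straight to evaluate() or predict().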

In [17]:
test_tensors = preprocess_input(paths_to_tensor(test_files))

100%|██████████| 451/451 [00:03<00:00, 137.46it/s]


Now we can run Keras’s evaluate() method to calculate the model accuracy:

In [18]:
print('\nTesting loss: {:.4f}\nTesting accuracy: {:.4f}'.format(*new_model.evaluate(test_tensors, test_targets)))


Testing loss: 0.1069
Testing accuracy: 0.9512


In [19]:
# evaluate and print test accuracy
score = new_model.evaluate(test_tensors, test_targets)
print('\n', 'Test accuracy:', score[1])


 Test accuracy: 0.9512194991111755
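
The accuracy that evaluate() reports is the fraction of images whose highest-probability class matches the label. The sketch below reproduces that calculation and a confusion matrix in plain NumPy, using made-up softmax outputs rather than the model's real predictions; `accuracy_and_confusion` is a hypothetical helper.

```python
import numpy as np

def accuracy_and_confusion(probs, one_hot_targets):
    """Accuracy and confusion matrix from softmax outputs and one-hot labels."""
    predicted = np.argmax(probs, axis=1)
    actual = np.argmax(one_hot_targets, axis=1)
    n_classes = one_hot_targets.shape[1]
    confusion = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        confusion[a, p] += 1  # rows: true class, columns: predicted class
    return float(np.mean(predicted == actual)), confusion

# Hypothetical softmax outputs and labels for four images.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]])
targets = np.array([[1, 0], [0, 1], [0, 1], [0, 1]])
acc, cm = accuracy_and_confusion(probs, targets)
print(acc)
print(cm)

# Number of correct predictions implied by the reported accuracy on 451 test images.
correct = round(0.9512194991111755 * 451)
print(correct, 451 - correct)
```

Read this way, the reported test accuracy corresponds to 429 correct predictions and 22 mistakes out of 451 images.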


The model achieved a test accuracy of 95.12% after only 20 epochs of training.

This is very good, given our very small dataset.