# Transfer Learning 

(with a very large pre-trained convolutional network (VGG-16) on a small dataset of natural images)


We leave MNIST behind for a more complex dataset of natural images of ants and bees. The task is, of course, to tell ants from bees (A from B).

<img src="../NotebooksFigures/ants_and_bees.jpg" alt="drawing" width="600" >

Because the data are much more complex, we need a very large convolutional network to do the job. At the same time, since our dataset is relatively small (a few hundred images across the training and validation sets), we cannot train such a network from scratch.
The first layers of a large convolutional network learn representations that are broadly useful for object recognition, so we will reuse a network that has already been trained on a large dataset.
Specifically, we will use a VGG-16 pre-trained on ImageNet as a feature extractor.

<img src="../NotebooksFigures/vgg16.png" alt="drawing" width="200" >


On top of the VGG-16, stripped of its classifier stack, we will add a brand new classifier stack with the correct output shape, train it for a few epochs, and finally test it on our data.

This operation is commonly called *Transfer Learning*.
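
The two-stage idea (compute features once with a frozen base, then train only a new head on them) can be illustrated without any deep learning library. The following is a toy sketch under stated assumptions: `FROZEN_W`, `extract_features` and `train_head` are hypothetical names, the "base" is a tiny fixed linear map with a ReLU rather than a VGG-16, and the head is a plain logistic regression.

```python
import math

# Hypothetical stand-in for a pre-trained base: fixed weights that are never updated.
FROZEN_W = ((1.0, -1.0), (0.5, 0.5))

def extract_features(x):
    """Frozen 'base': a fixed linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(features, labels, epochs=200, lr=0.1):
    """Fit a logistic-regression head by gradient descent on fixed features."""
    w, b = [0.0] * len(features[0]), 0.0
    n = len(features)
    losses = []
    for _ in range(epochs):
        grad_w, grad_b, loss = [0.0] * len(w), 0.0, 0.0
        for f, y in zip(features, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
            grad_w = [g + (p - y) * fi for g, fi in zip(grad_w, f)]
            grad_b += p - y
        # Only the head parameters (w, b) are updated; FROZEN_W is untouched.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
        losses.append(loss / n)
    return w, b, losses

inputs = [(2.0, 0.0), (3.0, 1.0), (0.0, 2.0), (1.0, 3.0)]
labels = [1, 1, 0, 0]
features = [extract_features(x) for x in inputs]  # computed once, then reused every epoch
w, b, losses = train_head(features, labels)
print(losses[0], losses[-1])  # the mean loss at the last epoch is lower than at the first
```

Because the base is frozen, its output for each image never changes, which is why the features can be pre-computed once and stored instead of being recomputed at every epoch.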




In this exercise we will learn how to:

- download a pre-trained model
- prepare a dataset for a (binary) classification problem
- extract the representations (features) at the end of the convolutional stack
- fit a classifier stack to the new dataset; this classifier plays the same role as the classifier stack of the original VGG-16, but it is smaller and has a different output shape
- evaluate the model obtained on validation data
- look at the errors made by the network, i.e. false ants and false bees

In [1]:
from __future__ import print_function  # must come before any other statement in the cell
import os

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import keras
from keras.utils import to_categorical
from keras.preprocessing.image import ImageDataGenerator, load_img

# Instantiate a pre-trained VGG-16 without its classifier stack

In [2]:
from keras.applications import VGG16
vgg_conv = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(224, 224, 3))

Using TensorFlow backend.


Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5


In [3]:
vgg_conv.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
__________
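
The numbers in the summary can be checked by hand. The sketch below assumes the standard VGG-16 layout (3x3 convolutions with a bias term, five 2x2 max-pooling layers, 512 channels in the last block); the helper names `conv2d_params` and `pooled_size` are illustrative and not part of Keras.

```python
def conv2d_params(in_channels, out_channels, kernel_size=3):
    """Weights (k * k * in * out) plus one bias per output channel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels

def pooled_size(size, n_pools, pool=2):
    """Spatial size after n_pools non-overlapping pooling layers."""
    for _ in range(n_pools):
        size //= pool
    return size

print(conv2d_params(3, 64))     # block1_conv1 -> 1792
print(conv2d_params(64, 64))    # block1_conv2 -> 36928
print(conv2d_params(64, 128))   # block2_conv1 -> 73856
print(conv2d_params(128, 128))  # block2_conv2 -> 147584

side = pooled_size(224, n_pools=5)  # 224 -> 112 -> 56 -> 28 -> 14 -> 7
print(side, side * side * 512)      # 7 25088
```

The last line explains the `(7, 7, 512)` feature shape used in the following cells, and the length `7 * 7 * 512` of the flattened feature vectors.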

In [4]:
train_dir = './data/hymenoptera_data/train'
validation_dir = './data/hymenoptera_data/val'

nTrain = 600
nVal = 150

In [5]:
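# Note: delete 4 images from train/ants before running this cell

# Imported here as well so that this cell runs on its own
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values from [0, 255] to [0, 1]
datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20

# Pre-allocate arrays for the VGG-16 features (7 x 7 x 512 per image) and the one-hot labels
train_features = np.zeros(shape=(nTrain, 7, 7, 512))
train_labels = np.zeros(shape=(nTrain,2))

train_generator = datagen.flow_from_directory(
    train_dir,
    target_size=(224, 224),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True)

# The generator loops forever, so stop once nTrain samples have been collected
i = 0
for inputs_batch, labels_batch in train_generator:
    features_batch = vgg_conv.predict(inputs_batch)
    train_features[i * batch_size : (i + 1) * batch_size] = features_batch
    train_labels[i * batch_size : (i + 1) * batch_size] = labels_batch
    i += 1
    if i * batch_size >= nTrain:
        break

# Flatten each 7 x 7 x 512 feature map into a single vector
train_features = np.reshape(train_features, (nTrain, 7 * 7 * 512))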
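
The fixed-size slices in the loop above assume that every batch holds exactly `batch_size` items. A directory iterator keeps cycling over the data, and the last batch of each pass is shorter when the number of images is not a multiple of the batch size. A minimal sketch of a filling loop that tolerates this is shown below, using plain lists instead of arrays; `fill_from_batches` and `cycling_batches` are hypothetical helpers, the second one only imitating the iterator's behaviour.

```python
def fill_from_batches(batches, n_samples):
    """Collect exactly n_samples items from an endless stream of batches.

    The number of filled slots is tracked explicitly, so a shorter batch
    does not shift or overflow the positions of later items.
    """
    out = []
    for batch in batches:
        remaining = n_samples - len(out)
        out.extend(batch[:remaining])
        if len(out) >= n_samples:
            break
    return out

def cycling_batches(items, batch_size):
    """Stand-in for a directory iterator: loops forever, and the last
    batch of each pass may be shorter than batch_size."""
    while True:
        for start in range(0, len(items), batch_size):
            yield items[start:start + batch_size]

images = list(range(7))  # 7 "images" with batch size 3 give batches of 3, 3 and 1
collected = fill_from_batches(cycling_batches(images, 3), n_samples=10)
print(collected)  # [0, 1, 2, 3, 4, 5, 6, 0, 1, 2]
```

The same bookkeeping carries over to NumPy arrays by keeping a running offset instead of computing the slice from `i * batch_size`.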

In [None]:
batch_size = 10

validation_features = np.zeros(shape=(nVal, 7, 7, 512))
validation_labels = np.zeros(shape=(nVal,2))

validation_generator = datagen.flow_from_directory(
    validation_dir,
    target_size=(224, 224),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)

i = 0
for inputs_batch, labels_batch in validation_generator:
    features_batch = vgg_conv.predict(inputs_batch)
    validation_features[i * batch_size : (i + 1) * batch_size] = features_batch
    validation_labels[i * batch_size : (i + 1) * batch_size] = labels_batch
    i += 1
    if i * batch_size >= nVal:
        break

validation_features = np.reshape(validation_features, (nVal, 7 * 7 * 512))

In [None]:
from keras import models
from keras import layers
from keras import optimizers

model = models.Sequential()
model.add(layers.Dense(512, activation='relu', input_dim=7 * 7 * 512))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(2, activation='softmax'))

model.compile(optimizer=optimizers.RMSprop(lr=2e-4),
              loss='categorical_crossentropy',
              metrics=['acc'])

history = model.fit(train_features,
                    train_labels,
                    epochs=20,
                    batch_size=batch_size,
                    validation_data=(validation_features,validation_labels))
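
The classifier ends in a softmax layer and is trained with categorical cross-entropy on one-hot labels. As a rough illustration of what these two pieces compute for a single sample, here is a pure-Python sketch that is independent of Keras; the function names are illustrative only.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probs):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

probs = softmax([2.0, 0.0])                              # the network favours class 0
loss_right = categorical_crossentropy([1, 0], probs)     # true class is 0
loss_wrong = categorical_crossentropy([0, 1], probs)     # true class is 1
print(probs, loss_right, loss_wrong)
```

The loss is small when the probability of the true class is high and grows when the network is confidently wrong, which is what drives the weight updates during `model.fit`.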

In [None]:
fnames = validation_generator.filenames

ground_truth = validation_generator.classes

label2index = validation_generator.class_indices

# Getting the mapping from class index to class label
idx2label = dict((v,k) for k,v in label2index.items())

In [None]:
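prob = model.predict(validation_features)
# Predicted class = index of the highest probability
# (same result as model.predict_classes, which recent Keras versions no longer provide)
predictions = np.argmax(prob, axis=1)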

In [None]:
errors = np.where(predictions != ground_truth)[0]
print("No of errors = {}/{}".format(len(errors),nVal))
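
The error count can be split further into false ants and false bees. The sketch below works on plain lists and assumes the mapping `{0: 'ants', 1: 'bees'}` for `idx2label`; the probabilities and labels are made-up toy values, and `split_errors` is a hypothetical helper.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=lambda j: row[j])

def split_errors(prob, ground_truth, idx2label):
    """Group the indices of misclassified samples by the (wrong) predicted label."""
    errors = {label: [] for label in idx2label.values()}
    for i, (row, truth) in enumerate(zip(prob, ground_truth)):
        pred = argmax(row)
        if pred != truth:
            errors[idx2label[pred]].append(i)
    return errors

idx2label = {0: 'ants', 1: 'bees'}  # assumed class-index mapping
prob = [[0.9, 0.1], [0.3, 0.7], [0.2, 0.8], [0.6, 0.4]]  # toy probabilities
ground_truth = [0, 0, 1, 1]                              # toy labels
false_by_label = split_errors(prob, ground_truth, idx2label)
print(false_by_label)  # {'ants': [3], 'bees': [1]}
```

Here sample 1 is an ant predicted as a bee (a false bee) and sample 3 is a bee predicted as an ant (a false ant).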

In [None]:
for i in range(len(errors)):
    pred_class = np.argmax(prob[errors[i]])
    pred_label = idx2label[pred_class]
    
    print('Original label:{}, Prediction :{}, confidence : {:.3f}'.format(
        fnames[errors[i]].split('/')[0],
        pred_label,
        prob[errors[i]][pred_class]))
    
    original = load_img('{}/{}'.format(validation_dir,fnames[errors[i]]))
    plt.imshow(original)
    plt.xticks([])
    plt.yticks([])
    plt.show()