 # Lab 8:  Transfer Learning with a Pre-Trained Deep Neural Network

As we discussed earlier, state-of-the-art neural networks involve millions of parameters that are prohibitively difficult to train from scratch.  In this lab, we will illustrate a powerful technique called *fine-tuning* where we start with a large pre-trained network and then re-train only the final layers to adapt to a new task.  The method is also called *transfer learning* and can produce excellent results on very small datasets with very little computational time.  

This lab is based partially on this
[excellent blog](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html).  In performing the lab, you will learn to:
* Build a custom image dataset
* Fine-tune the final layers of an existing deep neural network for a new classification task.
* Load images with an `ImageDataGenerator`.

You may run the lab on a CPU machine (like your laptop) or a GPU.  See the [notes](../GCP/gpu_setup.md) on setting up a GPU instance on Google Cloud Platform.  The GPU training is much faster (< 1 minute).  But, even the CPU machine training time will be less than 20 minutes.

## Create a Dataset

In this example, we will try to develop a classifier that can discriminate between two classes:  `cars` and `bicycles`.  One could imagine this type of classifier would be useful in vehicle vision systems.   The first task is to build a dataset.  

TODO:  Create training and test datasets with:
* 1000 training images of cars
* 1000 training images of bicycles
* 300 test images of cars
* 300 test images of bicycles
* The images don't need to be the same size.  But, you can reduce the resolution if you need to save disk space.

The images should be organized in the following directory structure:

    ./train
        /car
           car_0000.jpg
           car_0001.jpg
           ...
           car_0999.jpg
        /bicycle
           bicycle_0000.jpg
           bicycle_0001.jpg
           ...
           bicycle_0999.jpg
    ./test
        /car
           car_0000.jpg
           car_0001.jpg
           ...
           car_0299.jpg
        /bicycle
           bicycle_0000.jpg
           bicycle_0001.jpg
           ...
           bicycle_0299.jpg
           
A nice automated way of building such a dataset is through the [FlickrAPI](flickr_images.ipynb).
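Before training, it can help to confirm that the folders really contain the expected number of images. The following is a minimal sketch that uses only the standard library; the helper name `count_images` and the list of accepted file extensions are assumptions made for illustration and are not part of the lab.

```python
import os

def count_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return {class_name: number_of_image_files} for each subfolder of root."""
    counts = {}
    for class_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files at the top level
        counts[class_name] = sum(
            1 for fn in os.listdir(class_dir)
            if fn.lower().endswith(extensions)
        )
    return counts

# Report the counts for whichever of the two dataset folders exist.
for split in ("train", "test"):
    if os.path.isdir(split):
        print(split, count_images(split))
```

With the layout above, the `train` folder should report 1000 images per class and the `test` folder 300 per class.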
        

In [1]:
import flickrapi
import urllib.request
import numpy as np
import skimage.io
import skimage.transform
import os
import shutil
import warnings
from IPython.display import clear_output

In [2]:
api_key = u'52c68de05a004c919bebf5cf337e8814'
api_secret = u'a3a39bf39168ba7c'
flickr = flickrapi.FlickrAPI(api_key, api_secret)

In [3]:
def resize_im(im0,nrow,ncol):
    nrow0 = im0.shape[0]
    ncol0 = im0.shape[1]
    nchan = im0.shape[2]
    if (ncol0 > nrow0):
        pad = (ncol0-nrow0)//2
        im = np.zeros((ncol0,ncol0,nchan),dtype=np.uint8)
        im[pad:pad+nrow0,:,:] = im0
    elif (nrow0 >= ncol0):
        pad = (nrow0-ncol0)//2
        im = np.zeros((nrow0,nrow0,nchan),dtype=np.uint8)        
        im[:,pad:pad+ncol0,:] = im0
    im = skimage.transform.resize(im,(nrow,ncol),mode='constant')
    return im
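The padding step in `resize_im` can be checked in isolation. The sketch below reproduces only the zero-padding to a centered square with NumPy and leaves out the final `skimage.transform.resize` call; the helper name `pad_to_square` is hypothetical.

```python
import numpy as np

def pad_to_square(im0):
    """Zero-pad a (rows, cols, channels) image to a centered square."""
    nrow0, ncol0, nchan = im0.shape
    side = max(nrow0, ncol0)
    im = np.zeros((side, side, nchan), dtype=im0.dtype)
    row_pad = (side - nrow0) // 2
    col_pad = (side - ncol0) // 2
    # Copy the original image into the middle of the square canvas.
    im[row_pad:row_pad + nrow0, col_pad:col_pad + ncol0, :] = im0
    return im

# A wide, all-white test image: 2 rows, 6 columns, 3 channels.
wide = np.full((2, 6, 3), 255, dtype=np.uint8)
square = pad_to_square(wide)
print(square.shape)
```

Padding before resizing preserves the aspect ratio of the object, at the cost of black bars along the shorter dimension.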

In [None]:
def load_images(keywords):
    if os.path.isdir('train'):
        shutil.rmtree('train')
    os.mkdir('train')
    
    if os.path.isdir('test'):
        shutil.rmtree('test')
    os.mkdir('test')
    
    for keyword in keywords:
        
        if os.path.isdir('train/' + keyword):
            shutil.rmtree('train/' + keyword)
        os.mkdir('train/' + keyword)
        
        if os.path.isdir('test/' + keyword):
            shutil.rmtree('test/' + keyword)
        os.mkdir('test/' + keyword)
        
        photos = flickr.walk(text=keyword,
                         tag_mode='all',
                         tags= keyword, 
                         extras='url_c', 
                         sort='relevance',
                         per_page=2000)
        
        full_size_fn = 'full_size'
        train_images = 1000
        test_images = 300
        nimage = 1300
        i = 0
        nrow = 150
        ncol = 150
        
        for photo in photos:
            url = photo.get('url_c')
            
            if not (url is None):
                try:
                    urllib.request.urlretrieve(url, full_size_fn)
                except urllib.error.HTTPError:
                    continue
                    
                im = skimage.io.imread(full_size_fn)
                
                if im.ndim != 3:  # skip images without a channel axis (e.g. grayscale)
                    continue
                    
                im1 = resize_im(im, nrow, ncol)
                
                with warnings.catch_warnings():
                    warnings.simplefilter("ignore")
                    im2 = skimage.img_as_ubyte(im1)
                    
                folder = 'train/' if i < train_images else 'test/'
                index = i if i < train_images else i - train_images
                local_name = '{0:s}/{1:s}_{2:04d}.jpg'.format(folder + keyword, keyword, index)
                skimage.io.imsave(local_name, im2) 
                clear_output()
                print(local_name)
                i += 1 
                
            if (i >= nimage):
                os.remove(full_size_fn)  # delete the temporary download file
                clear_output()
                print('Training and testing images loaded')
                break
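The way `load_images` routes each downloaded image to a folder and file name can be tested without any network access. This is a small sketch of that naming rule only; the function name `target_path` is hypothetical, and the default of 1000 training images mirrors the value used above.

```python
def target_path(keyword, i, train_images=1000):
    """Map the i-th accepted image for a keyword to its train/test file name."""
    folder = "train" if i < train_images else "test"
    # Test images restart their numbering at zero.
    index = i if i < train_images else i - train_images
    return "{0:s}/{1:s}/{1:s}_{2:04d}.jpg".format(folder, keyword, index)

print(target_path("car", 0))
print(target_path("car", 999))
print(target_path("car", 1000))
print(target_path("car", 1299))
```

The first 1000 accepted images of each keyword go to `train`, and the next 300 go to `test`.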

In [None]:
keywords = ['bicycle', 'car']
load_images(keywords)

test/bicycle/bicycle_0229.jpg


## Loading a Pre-Trained Deep Network

We follow the [VGG16 demo](./vgg16.ipynb) to load a pre-trained deep VGG16 network.  We first load the appropriate Keras packages.

In [None]:
import keras

In [None]:
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense

We also load some standard packages.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

Clear the Keras session.

In [None]:
keras.backend.clear_session()

Set the dimensions of the input image.  The 150 x 150 size used below is best suited to a GPU machine.  On a CPU machine, a smaller size such as 64 x 64 will train much faster.

In [None]:
nrow = 150
ncol = 150

Now we follow the [VGG16 demo](./vgg16.ipynb) and load the deep VGG16 network.  Alternatively, you can use any other pre-trained model in Keras.  When calling `applications.VGG16`, you will need to:
* Set `include_top=False` to exclude the fully-connected layers at the top of the network.
* Set `input_shape` based on the above dimensions.  Remember, `input_shape` should be height x width x 3 since the images are color.

In [None]:
input_shape = (nrow, ncol, 3)
base_model = applications.VGG16(weights='imagenet', input_shape = input_shape, include_top=False)
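The choice of input size determines the size of the feature map that the convolutional base hands to the new layers. The sketch below assumes the standard VGG16 layout: five blocks, each ending in a 2 x 2 max-pooling layer that halves the height and width (rounding down), with 512 filters in the last block. The helper name `vgg16_feature_shape` is hypothetical.

```python
def vgg16_feature_shape(nrow, ncol, n_pool=5, n_filters=512):
    """Output shape of the VGG16 convolutional base for an nrow x ncol input."""
    for _ in range(n_pool):
        nrow //= 2  # each 2 x 2 max-pooling layer halves the height
        ncol //= 2  # and the width, rounding down
    return (nrow, ncol, n_filters)

for size in (64, 150):
    shape = vgg16_feature_shape(size, size)
    print(size, shape, shape[0] * shape[1] * shape[2])
```

Under these assumptions a 150 x 150 input gives a 4 x 4 x 512 feature map (8192 values), and a 64 x 64 input gives 2 x 2 x 512 (2048 values).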

To create the new model, we first create a `Sequential` model.  Then, loop over the layers in `base_model.layers` and add each layer to the new model.

In [None]:
model = Sequential()
for layer in base_model.layers:
    model.add(layer)

Next, loop through the layers in `model`, and freeze each layer by setting `layer.trainable = False`.  This way, you will not have to *re-train* any of the existing layers.

In [None]:
for layer in model.layers:
    layer.trainable = False

Now, add the following layers to `model`:
* A `Flatten()` layer which reshapes the outputs to a single vector per image.
* A fully-connected layer with 256 output units and `relu` activation
* A `Dropout(0.5)` layer
* A final fully-connected layer.  Since this is a binary classification, there should be one output and `sigmoid` activation.

In [None]:
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

Print the model summary.  This will display the number of trainable parameters vs. the non-trainable parameters.

In [None]:
model.summary()
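The counts in the summary can be cross-checked by hand. The following sketch assumes 150 x 150 inputs (so the flattened feature vector has 4 x 4 x 512 = 8192 entries) and the standard VGG16 block layout with 3 x 3 convolutions; the helper names are hypothetical.

```python
def conv_params(n_in, n_out, k=3):
    """Weights plus biases of a k x k convolution layer."""
    return k * k * n_in * n_out + n_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully-connected layer."""
    return n_in * n_out + n_out

# (number of conv layers, filters) for the five VGG16 blocks
vgg16_blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

frozen = 0
n_in = 3  # RGB input
for n_layers, n_out in vgg16_blocks:
    for _ in range(n_layers):
        frozen += conv_params(n_in, n_out)
        n_in = n_out

n_features = 4 * 4 * 512  # flattened feature map for 150 x 150 inputs
trainable = dense_params(n_features, 256) + dense_params(256, 1)

print("non-trainable:", frozen)
print("trainable:", trainable)
```

If the numbers reported by `model.summary()` differ, the most likely cause is a different input size, which changes `n_features` and therefore the size of the first `Dense` layer.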

## Using Generators to Load Data

Up to now, the training data has been represented in a large matrix.  This is not possible for image data when the datasets are very large.  For these applications, the `keras` package provides an `ImageDataGenerator` class that can fetch images on the fly from a directory of images.  Using multi-threading, training can be performed on one mini-batch while the image reader reads files for the next mini-batch.  The code below creates an `ImageDataGenerator` for the training data.  In addition to reading the files, the `ImageDataGenerator` applies random deformations to the images to expand the effective dataset size.  This is a classic trick that was key in the early deep learning experiments.

In [None]:
train_data_dir = './train'
batch_size = 32
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
train_generator = train_datagen.flow_from_directory(
                        train_data_dir,
                        target_size=(nrow,ncol),
                        batch_size=batch_size,
                        class_mode='binary')
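To make the idea of a data generator concrete, here is a minimal NumPy sketch of a Python generator that yields rescaled, randomly flipped mini-batches forever. It is not how Keras implements `ImageDataGenerator`: it works on an in-memory array instead of reading files, it only performs horizontal flips, and the name `batch_generator` and its arguments are assumptions for illustration.

```python
import numpy as np

def batch_generator(X, y, batch_size, rng, flip_prob=0.5):
    """Yield shuffled mini-batches forever, randomly flipping images left-right."""
    n = X.shape[0]
    while True:
        order = rng.permutation(n)  # new shuffle on every pass over the data
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb = X[idx].astype(np.float32) / 255.0  # same role as rescale=1./255
            flip = rng.random(len(idx)) < flip_prob
            Xb[flip] = Xb[flip][:, :, ::-1, :]      # reverse the column axis
            yield Xb, y[idx]

# Ten random 8 x 8 color "images" with alternating labels.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)
y = np.arange(10) % 2

gen = batch_generator(X, y, batch_size=4, rng=rng)
Xb, yb = next(gen)
print(Xb.shape, yb.shape)
```

Because the flips are drawn again on every pass, the network rarely sees exactly the same batch twice, which is the sense in which augmentation expands the dataset.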

Now, create a similar `test_generator` for the test data.

In [None]:
test_data_dir = './test'
batch_size = 32
test_datagen = ImageDataGenerator(rescale=1./255,
                                 shear_range=0.2,
                                 zoom_range=0.2,
                                 horizontal_flip=True)
test_generator = test_datagen.flow_from_directory(
    test_data_dir, 
    target_size=(nrow,ncol),
    batch_size=batch_size, 
    class_mode='binary')

The following function, which will be useful below, displays an image.

In [None]:
def disp_image(im):
    if (len(im.shape) == 2):
        plt.imshow(im, cmap='gray') 
    else: 
        im1 = (im-np.min(im))/(np.max(im)-np.min(im))*255
        im1 = im1.astype(np.uint8)
        plt.imshow(im1)
    plt.xticks([])
    plt.yticks([])

To see how the `train_generator` works, use the `train_generator.next()` method to get a mini-batch of data `X,y`.  Display the first 8 images in this mini-batch and label each image with its class label.  You should see that bicycles have `y=0` and cars have `y=1`.

In [None]:
X, y = train_generator.next()
i = 8
for j in range(i):
    plt.subplot(2, 4, j+1)
    plt.imshow(X[j])
    plt.title(y[j])
    plt.xticks([])
    plt.yticks([])
plt.show()
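The labels come out as `bicycle=0` and `car=1` because `flow_from_directory` numbers the class subfolders in sorted order of their names; the resulting mapping is also available as `train_generator.class_indices`. The sketch below imitates that rule with a hypothetical helper, `class_indices`.

```python
def class_indices(class_names):
    """Assign integer labels to class names in sorted order."""
    return {name: i for i, name in enumerate(sorted(class_names))}

print(class_indices(["car", "bicycle"]))
```

Checking this mapping before interpreting predictions avoids mixing up which class the sigmoid output refers to.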

## Train the Model

Compile the model.  Select the correct `loss` function, `optimizer` and `metrics`.  Remember that we are performing binary classification.

In [None]:
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
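For reference, the binary cross-entropy loss and the accuracy metric can be written out in a few lines of NumPy. This is a sketch of the standard definitions rather than the Keras implementation; the clipping constant `eps` and the 0.5 decision threshold are assumptions.

```python
import numpy as np

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean binary cross-entropy between labels in {0,1} and probabilities."""
    p = np.clip(p_pred, eps, 1 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    return float(np.mean((p_pred > threshold) == (y_true == 1)))

y_true = np.array([0, 1, 1, 0])
p_pred = np.array([0.1, 0.9, 0.4, 0.2])  # example sigmoid outputs

print("loss:", binary_crossentropy(y_true, p_pred))
print("accuracy:", accuracy(y_true, p_pred))
```

In this example the third sample is misclassified (probability 0.4 for a true label of 1), so the accuracy is 0.75 and that sample contributes the largest term to the loss.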

When using an `ImageDataGenerator`, we have to set two parameters manually:
* `steps_per_epoch =  training data size // batch_size`
* `validation_steps =  test data size // batch_size`

We can obtain the training and test data sizes from `train_generator.n` and `test_generator.n`, respectively.

In [None]:
steps_per_epoch = train_generator.n // batch_size
validation_steps = test_generator.n // batch_size
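As a quick check of the arithmetic, assume the dataset sizes from the start of the lab (2000 training images and 600 test images in total) and a batch size of 32. The variable names below are chosen so they do not overwrite the ones used in the notebook.

```python
example_batch_size = 32
n_train = 2000   # 1000 cars + 1000 bicycles
n_test = 600     # 300 cars + 300 bicycles

# Integer division drops the final partial batch.
example_steps_per_epoch = n_train // example_batch_size
example_validation_steps = n_test // example_batch_size

print(example_steps_per_epoch, n_train - example_steps_per_epoch * example_batch_size)
print(example_validation_steps, n_test - example_validation_steps * example_batch_size)
```

With these sizes there are 62 training steps and 18 validation steps per epoch, and 16 training images and 24 test images are left over in each pass because of the integer division.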

Now, we run the fit.  If you are using a CPU on a regular laptop, each epoch will take about 3-4 minutes, so you should be able to finish 5 epochs or so within 20 minutes.  On a reasonable GPU, even with the larger images, it will take about 10 seconds per epoch.
* If you use `(nrow,ncol) = (64,64)` images, you should get over 94% accuracy after 5 epochs.
* If you use `(nrow,ncol) = (150,150)` images, you should get over 98% accuracy after 5 epochs.  But, this will need a GPU.

You will get full credit for either version.  With more epochs, you may get slightly higher accuracy, but you will have to play with the damping.

In [None]:
nepochs = 5

model.fit_generator(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=nepochs,
    validation_data=test_generator,
    validation_steps=validation_steps)