<a href="https://colab.research.google.com/github/jeyoor/iusb-applied-deep-learning-lecture/blob/feature%2F2019-09-18-lecture/dogs_and_cats_transfer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Dogs and Cats (Transfer Learning)

Extract training data (requires tarball to be available)

In [0]:
import tarfile
data_tarball = tarfile.open('/content/dogs_cats_data.tgz', 'r:gz')
data_tarball.extractall()
data_tarball.close()
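If the tarball has not been uploaded yet, the cell above fails with a fairly opaque error. A minimal sketch of a more defensive version follows; the helper name `extract_tarball` and the demo file names are hypothetical and not part of the original notebook.

```python
import os
import tarfile
import tempfile


def extract_tarball(tarball_path, dest_dir="."):
    """Extract a gzipped tarball into dest_dir and return its member names."""
    if not os.path.exists(tarball_path):
        raise FileNotFoundError(
            "Upload the data tarball first; expected it at " + tarball_path)
    # The context manager closes the archive even if extraction fails.
    with tarfile.open(tarball_path, "r:gz") as tarball:
        names = tarball.getnames()
        # Newer Python versions accept filter="data" to reject unsafe members.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tarball.extractall(dest_dir, **extra)
    return names


# Demonstration on a throwaway archive (hypothetical file names).
with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "cat.0.txt")
    with open(src, "w") as handle:
        handle.write("placeholder")
    archive = os.path.join(workdir, "demo.tgz")
    with tarfile.open(archive, "w:gz") as tarball:
        tarball.add(src, arcname="data/train/cats/cat.0.txt")
    out_dir = os.path.join(workdir, "out")
    extracted = extract_tarball(archive, out_dir)
    extracted_ok = os.path.exists(
        os.path.join(out_dir, "data", "train", "cats", "cat.0.txt"))
```

In the notebook itself the call would presumably be `extract_tarball('/content/dogs_cats_data.tgz')`.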


Imports and helpers we'll use later

In [0]:
import os
import time as t
import numpy as np
import pandas as pd
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt

from keras.applications.vgg16 import VGG16
from keras.applications.xception import Xception
from keras.applications.resnet50 import ResNet50

from keras.models import Model
from keras.layers import Dense
from keras.optimizers import SGD
from keras.callbacks import CSVLogger, ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator, array_to_img

def preprocess_input_vgg(x):
    """Wrapper around keras.applications.vgg16.preprocess_input()
    to make it compatible for use with keras.preprocessing.image.ImageDataGenerator's
    `preprocessing_function` argument.
    
    Parameters
    ----------
    x : a numpy 3darray (a single image to be preprocessed)
    
    Note we cannot pass keras.applications.vgg16.preprocess_input()
    directly to keras.preprocessing.image.ImageDataGenerator's
    `preprocessing_function` argument because the former expects a
    4D tensor whereas the latter expects a 3D tensor. Hence the
    existence of this wrapper.
    
    Returns a numpy 3darray (the preprocessed image).
    
    """
    from keras.applications.vgg16 import preprocess_input
    X = np.expand_dims(x, axis=0)
    X = preprocess_input(X)
    return X[0]

def show_prediction_examples(num_examples=1, validation_image_generator=None, model=None, folder_path='results/', model_name='name'):
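    """Helper that plots the predicted Cat/Dog probabilities for a few validation images.

    `folder_path` and `model_name` are accepted but not used yet; nothing is saved to disk.
    """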
    X_val_sample, _ = next(validation_image_generator)
    y_pred = model.predict(X_val_sample)
    for idx, x, y in zip(range(num_examples), X_val_sample[:num_examples], y_pred.flatten()[:num_examples]):
        s = pd.Series({'Cat': 1-y, 'Dog': y})
        axes = s.plot(kind='bar')
        axes.set_xlabel('Class')
        axes.set_ylabel('Probability')
        axes.set_ylim([0, 1])
        plt.show()


For more information on the meaning of some of the constants below, see these links.

[Keras fit_generator documentation](https://keras.io/models/sequential/#fit_generator)

[StackExchange CrossValidated Question: "Tradeoff batch size vs. number of iterations"](https://stats.stackexchange.com/questions/164876/tradeoff-batch-size-vs-number-of-iterations-to-train-a-neural-network)

[StackExchange CrossValidated Question: "What is batch size in neural network?"](https://stats.stackexchange.com/questions/153531/what-is-batch-size-in-neural-network)

In [0]:
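#create constants
#number of images per batch (passed to the generators as batch_size)
SAMPLES_PER_EPOCH = 16
#using 10 epochs for a quick training test
EPOCHS = 10
STEPS_PER_EPOCH = 1
VALIDATION_STEPS = 32
#checkpoint the model every EPOCHS // 10 epochs (every epoch when EPOCHS = 10)
MODEL_SAVER_PERIOD = max(1, EPOCHS // 10)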

Create the data generators. The training generator augments its images (rotation, shifts, shear, zoom, and horizontal flips); the validation generator applies only the preprocessing function.

In [18]:
#create training and validation data generators
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input_vgg,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')
train_generator = train_datagen.flow_from_directory(directory='data/train',
                                                    target_size=[224, 224],
                                                    batch_size=SAMPLES_PER_EPOCH,
                                                    class_mode='binary')

validation_datagen = ImageDataGenerator(preprocessing_function=preprocess_input_vgg)
validation_generator = validation_datagen.flow_from_directory(directory='data/validation',
                                                              target_size=[224, 224],
                                                              batch_size=SAMPLES_PER_EPOCH,
                                                              class_mode='binary')

Found 4000 images belonging to 2 classes.
Found 1600 images belonging to 2 classes.
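To make the constants concrete, the arithmetic below works out how many images each epoch actually touches. It is a sketch that assumes the values defined earlier (with `SAMPLES_PER_EPOCH` used as the batch size) and the image counts reported in the output above.

```python
import math

BATCH_SIZE = 16           # the notebook passes SAMPLES_PER_EPOCH as batch_size
STEPS_PER_EPOCH = 1
EPOCHS = 10
VALIDATION_STEPS = 32
NUM_TRAIN_IMAGES = 4000   # from the "Found ..." output above
NUM_VAL_IMAGES = 1600

# Images drawn from each generator with the current settings.
train_images_per_epoch = BATCH_SIZE * STEPS_PER_EPOCH
train_images_total = train_images_per_epoch * EPOCHS
val_images_per_epoch = BATCH_SIZE * VALIDATION_STEPS

# Steps that one full pass over each dataset would need.
steps_for_full_train_pass = math.ceil(NUM_TRAIN_IMAGES / BATCH_SIZE)
steps_for_full_val_pass = math.ceil(NUM_VAL_IMAGES / BATCH_SIZE)

print(train_images_per_epoch)     # 16
print(train_images_total)         # 160
print(val_images_per_epoch)       # 512
print(steps_for_full_train_pass)  # 250
print(steps_for_full_val_pass)    # 100
```

With these settings the whole run sees only 160 training images, which suits a quick smoke test; training on the full dataset would need `STEPS_PER_EPOCH` closer to 250.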


Load three models (Xception, VGG16, and ResNet50) with weights pre-trained on the ImageNet dataset.

Keras source code for these models is available [here](https://github.com/keras-team/keras-applications/tree/master/keras_applications).

Switch the branch there to match your installed Keras version.


In [16]:
#initialize existing application nets
xception = Xception(weights='imagenet')
vgg16 = VGG16(weights='imagenet')
resnet50 = ResNet50(weights='imagenet')

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.4/xception_weights_tf_dim_ordering_tf_kernels.h5
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5


Print a summary of the layers of each model.

For explanation of why the various layers are used for each model, see their individual research papers.

[Xception](https://arxiv.org/abs/1610.02357)

[VGG16](https://arxiv.org/abs/1409.1556)

[ResNet-50](https://arxiv.org/abs/1512.03385)

For more recent state of the art results on tasks like ImageNet, the "[Papers With Code](https://paperswithcode.com/sota/image-classification-on-imagenet)" site is a good resource.


In [20]:
xception.summary()
vgg16.summary()
resnet50.summary()

Model: "xception"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 299, 299, 3)  0                                            
__________________________________________________________________________________________________
block1_conv1 (Conv2D)           (None, 149, 149, 32) 864         input_1[0][0]                    
__________________________________________________________________________________________________
block1_conv1_bn (BatchNormaliza (None, 149, 149, 32) 128         block1_conv1[0][0]               
__________________________________________________________________________________________________
block1_conv1_act (Activation)   (None, 149, 149, 32) 0           block1_conv1_bn[0][0]            
___________________________________________________________________________________________

Change the output of each model from predicting the 1,000 ImageNet categories to predicting a single probability that separates two categories (dogs vs. cats).

In [21]:
#attach a single sigmoid output unit to the second-to-last layer of each network
#(Keras 2 argument names: units, inputs, outputs)
final_vgg16_layer = vgg16.get_layer('fc2').output
vgg16_prediction = Dense(units=1, activation='sigmoid', name='logit')(final_vgg16_layer)
vgg16_model = Model(inputs=vgg16.input, outputs=vgg16_prediction)

final_xception_layer = xception.get_layer('avg_pool').output
xception_prediction = Dense(units=1, activation='sigmoid', name='logit')(final_xception_layer)
xception_model = Model(inputs=xception.input, outputs=xception_prediction)

final_resnet50_layer = resnet50.get_layer('avg_pool').output
resnet50_prediction = Dense(units=1, activation='sigmoid', name='logit')(final_resnet50_layer)
resnet50_model = Model(inputs=resnet50.input, outputs=resnet50_prediction)
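The new output layer is a single sigmoid unit, so each model emits one number between 0 and 1. The sketch below shows how that number maps to two class probabilities. It assumes, as the plotting helper earlier does, that the output is the probability of "Dog"; the function names are illustrative only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def class_probabilities(pre_activation):
    """Turn the pre-activation of the single output unit into two probabilities."""
    p_dog = sigmoid(pre_activation)
    return {"Cat": 1.0 - p_dog, "Dog": p_dog}


undecided = class_probabilities(0.0)    # both classes at 0.5
leaning_dog = class_probabilities(2.0)  # "Dog" is the more likely class
```

One output unit is enough for a two-class problem because the second probability is always one minus the first.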


Create a separate, timestamped results directory for training logs and intermediate model checkpoints.

Then create Keras callbacks that write CSV training logs to that directory and save a model checkpoint there whenever the validation loss improves.

In [0]:
timestamp = "results/" + t.strftime("%Y%m%d-%H%M%S")
if not os.path.exists(timestamp):
    os.makedirs(timestamp)

vgg16_csv_logger = CSVLogger(timestamp + '/vgg16_training_log.csv', append=True, separator=',')
vgg16_model_saver = ModelCheckpoint(timestamp + '/vgg16.weights.{epoch:02d}-{val_loss:.2f}.hdf5', monitor='val_loss', verbose=0, save_best_only=True, mode='auto', period=MODEL_SAVER_PERIOD)
xception_csv_logger = CSVLogger(timestamp + '/xception_training_log.csv', append=True, separator=',')
xception_model_saver = ModelCheckpoint(timestamp + '/xception.weights.{epoch:02d}-{val_loss:.2f}.hdf5', monitor='val_loss', verbose=0, save_best_only=True, mode='auto', period=MODEL_SAVER_PERIOD)
resnet50_csv_logger = CSVLogger(timestamp + '/resnet50_training_log.csv', append=True, separator=',')
resnet50_model_saver = ModelCheckpoint(timestamp + '/resnet50.weights.{epoch:02d}-{val_loss:.2f}.hdf5', monitor='val_loss', verbose=0, save_best_only=True, mode='auto', period=MODEL_SAVER_PERIOD)
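The checkpoint file names contain `{epoch:02d}` and `{val_loss:.2f}` placeholders, which Keras fills in using Python's `str.format` syntax when it saves a file. The sketch below reproduces that formatting in plain Python; the helper `checkpoint_path` and the example values are hypothetical.

```python
import os
import time

# Same directory layout as the cell above: results/<YYYYmmdd-HHMMSS>
results_dir = os.path.join("results", time.strftime("%Y%m%d-%H%M%S"))


def checkpoint_path(directory, model_name, epoch, val_loss):
    """Build the file name a checkpoint would get for a given epoch and loss."""
    template = os.path.join(
        directory, model_name + ".weights.{epoch:02d}-{val_loss:.2f}.hdf5")
    return template.format(epoch=epoch, val_loss=val_loss)


example_path = checkpoint_path(results_dir, "vgg16", epoch=3, val_loss=0.4567)
print(os.path.basename(example_path))  # vgg16.weights.03-0.46.hdf5
```

Because the validation loss is part of the name, checkpoints from different epochs do not overwrite each other.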


Freeze (turn off training for) the pre-trained feature layers so that only the last few layers of each model are updated.

In [0]:

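#freeze every vgg16 layer except the two fully connected layers and the new output
for layer in vgg16_model.layers:
    if layer.name in ['fc1', 'fc2', 'logit']:
        continue
    layer.trainable = False

#freeze every xception layer except the new output (avg_pool has no weights)
for layer in xception_model.layers:
    if layer.name in ['avg_pool', 'logit']:
        continue
    layer.trainable = False

#freeze every resnet50 layer except the new output
#(avg_pool and flatten_1, if present, have no weights)
for layer in resnet50_model.layers:
    if layer.name in ['avg_pool', 'flatten_1', 'logit']:
        continue
    layer.trainable = False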
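The freezing loops all follow the same pattern: walk the layers, skip the ones that should keep learning, and mark the rest as not trainable. The sketch below isolates that pattern with a stand-in class, so it runs without Keras; `FakeLayer` and `freeze_all_except` are hypothetical names, and the layer list is a shortened VGG16-like example.

```python
class FakeLayer:
    """Hypothetical stand-in for a Keras layer: a name and a trainable flag."""

    def __init__(self, name):
        self.name = name
        self.trainable = True


def freeze_all_except(layers, keep_trainable):
    """Mark every layer as frozen unless its name is in keep_trainable."""
    for layer in layers:
        if layer.name in keep_trainable:
            continue
        layer.trainable = False
    return [layer.name for layer in layers if layer.trainable]


vgg16_like = [FakeLayer(name) for name in
              ["block1_conv1", "block5_conv3", "flatten", "fc1", "fc2", "logit"]]
still_trainable = freeze_all_except(vgg16_like, {"fc1", "fc2", "logit"})
print(still_trainable)  # ['fc1', 'fc2', 'logit']
```

In Keras, changes to `trainable` take effect when the model is compiled, so freezing needs to happen before the `compile` call.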

Use stochastic gradient descent (SGD) as the training algorithm, with a learning rate of 0.0001 and a momentum of 0.9.

In [0]:
sgd = SGD(lr=1e-4, momentum=0.9)
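For intuition, the classical momentum update keeps a running "velocity" that accumulates past gradients. The sketch below applies it to a single weight on the toy loss `w**2`; the function name and the toy problem are assumptions for illustration, not code from Keras.

```python
def sgd_momentum_step(weight, velocity, gradient, lr=1e-4, momentum=0.9):
    """One classical (non-Nesterov) momentum update for a single weight."""
    velocity = momentum * velocity - lr * gradient
    return weight + velocity, velocity


# Toy problem: minimise loss(w) = w**2, whose gradient is 2 * w.
weight, velocity = 1.0, 0.0
history = []
for _ in range(3):
    gradient = 2.0 * weight
    weight, velocity = sgd_momentum_step(weight, velocity, gradient)
    history.append(weight)
```

With a learning rate this small each step moves the weight only slightly, which is usually what fine-tuning pre-trained weights calls for.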

Compile and train the models.

Note that the CSV and model saver callbacks we created earlier need to be passed in here.

We use "binary_crossentropy" as the loss function because this is a binary classification problem (dogs vs. cats).

With more than two categories (like dogs vs. cats vs. parrots vs. horses), we could instead use "categorical_crossentropy" as the loss function, together with a softmax output layer.

The full list of available Keras loss functions is documented [here](https://keras.io/losses/).
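As a worked example of the loss, binary cross-entropy averages `-(y*log(p) + (1-y)*log(1-p))` over the samples, where `y` is the true label (0 or 1) and `p` the predicted probability. The sketch below is a plain-Python version; the clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch of (label, probability) pairs."""
    total = 0.0
    for label, prob in zip(y_true, y_pred):
        prob = min(max(prob, eps), 1.0 - eps)  # keep log() finite
        total += -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))
    return total / len(y_true)


coin_flip = binary_crossentropy([1], [0.5])            # equals ln(2)
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
```

Confident wrong predictions are penalised far more heavily than uncertain ones, which is why the loss can stay high even when accuracy looks reasonable.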

In [28]:
#compile and train models
#(steps_per_epoch replaces the deprecated Keras 1 samples_per_epoch argument)
vgg16_model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
vgg16_model.fit_generator(train_generator,
                    epochs=EPOCHS,
                    steps_per_epoch=STEPS_PER_EPOCH,
                    validation_data=validation_generator,
                    validation_steps=VALIDATION_STEPS,
                    callbacks=[vgg16_csv_logger, vgg16_model_saver]);

#note: xception expects 299x299 inputs (see its summary above), but the
#generators yield 224x224 images preprocessed for vgg16
xception_model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
xception_model.fit_generator(train_generator,
                    epochs=EPOCHS,
                    steps_per_epoch=STEPS_PER_EPOCH,
                    validation_data=validation_generator,
                    validation_steps=VALIDATION_STEPS,
                    callbacks=[xception_csv_logger, xception_model_saver]);

resnet50_model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
resnet50_model.fit_generator(train_generator,
                    epochs=EPOCHS,
                    steps_per_epoch=STEPS_PER_EPOCH,
                    validation_data=validation_generator,
                    validation_steps=VALIDATION_STEPS,
                    callbacks=[resnet50_csv_logger, resnet50_model_saver]);


Epoch 1/10


OSError: ignored

Plot each model's predicted Cat/Dog probabilities for 4 validation images (each call draws a fresh batch from the validation generator, so the images differ between models).

In [9]:
show_prediction_examples(num_examples=4, validation_image_generator=validation_generator, model=vgg16_model, folder_path=timestamp, model_name='vgg16')
show_prediction_examples(num_examples=4, validation_image_generator=validation_generator, model=xception_model, folder_path=timestamp, model_name='xception')
show_prediction_examples(num_examples=4, validation_image_generator=validation_generator, model=resnet50_model, folder_path=timestamp, model_name='resnet50')


NameError: ignored