# Lesson 1: classifying cats and dogs

This notebook contains my implementation of the code in week 1 of the Fast AI course.

This is a from-scratch implementation of the steps needed to classify the given image data, aiming to be a bit cleaner than the one provided with the course.

## Setup

In [None]:
%matplotlib inline

Set the path to the data. This assumes there's a symlink to the data directory in the directory where this notebook is stored.

In [None]:
path = "data/dogscats/"
# path = "data/dogscats/sample/"
model_path = "http://files.fast.ai/models/"
validation_path = path + 'valid/'

Some imports we'll need in subsequent code. Note that the code in this notebook assumes Python 3.

In [None]:
import os
import numpy as np
np.set_printoptions(precision=4, linewidth=100)

Load (and reload) the utility code we use in later steps.

In [None]:
from importlib import reload
import utils; reload(utils)

## Model setup

In the subsequent steps, we implement the classification using the Keras API, not using the `Vgg16` helper code in the course notebook. First up: import the bits we need.

In [None]:
from numpy.random import random, permutation

import keras
from keras import backend as K
from keras.utils.data_utils import get_file
from keras.models import Sequential, Model
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers import Input
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import Adam
from keras.preprocessing import image

# SPECIAL GLOBAL MAGIC SETTING THAT IF WE DON'T SET IT MAKES THINGS BLOW UP!!!  :-/
K.set_image_dim_ordering('th')

## Model creation

We want to create a neural network that implements the VGG16 model, i.e. a well-known model that has been trained for image recognition using a defined architecture and set of published weights. The following sections will build us such a model which we can then use for our cats and dogs data.

First up is a function for adding a VGG16 convolutional block to a model. A block consists of one or more convolutional layers followed by a 2x2 max-pooling layer. Each convolutional layer is preceded by zero padding (1 pixel on each side), and uses a given number of filters, a 3x3 kernel, and 'relu' activation (a rectified linear unit, i.e. the ramp function $f(x)=\max(0,x)$).

In [None]:
def add_conv_block(model, layers, filters):
    for i in range(layers): 
        model.add(ZeroPadding2D((1, 1)))
        model.add(Convolution2D(filters, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))
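
The effect of these layers on the image size can be checked with a little arithmetic. The sketch below uses the standard output-size formula for convolution and pooling layers; the helper name `output_size` is made up for illustration and is not part of Keras.

```python
def output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1

# 1 pixel of zero padding followed by a 3x3 convolution keeps the size unchanged
after_conv = output_size(224, kernel=3, padding=1)
# 2x2 max pooling with stride 2 halves the size
after_pool = output_size(after_conv, kernel=2, stride=2)
print(after_conv, after_pool)
```

So the padding is what lets us stack several convolutions without shrinking the image; only the pooling layer at the end of a block reduces the size.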

This one adds a fully connected block: a dense layer with 4096 nodes, followed by dropout with a rate of 0.5:

In [None]:
def add_fully_connected_block(model):
    model.add(Dense(4096, activation = 'relu'))
    model.add(Dropout(0.5))
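
To get a feel for what a dropout rate of 0.5 means, here is a plain-Python sketch of the usual "inverted dropout" scheme: during training each activation is zeroed with probability `rate`, and the survivors are scaled up so the expected sum stays the same. This is an illustration only, not Keras's actual implementation, and the function name `dropout` is my own.

```python
import random

def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale the rest by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)           # seeded so the sketch is repeatable
activations = [1.0] * 10         # hypothetical layer outputs
dropped = dropout(activations, 0.5, rng)
print(dropped)
```

With a rate of 0.5, every output is either 0.0 or twice the input. At prediction time dropout is switched off and the activations pass through unchanged.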

Preprocess the image data to match what the VGG model expects: subtract the per-channel mean, then reverse the channel order from RGB to BGR.

In [None]:
# Mean of each channel as provided by VGG researchers
vgg_mean = np.array([123.68, 116.779, 103.939]).reshape((3,1,1))

def vgg_preprocess(x):
    x = x - vgg_mean     # subtract mean
    return x[:, ::-1]    # reverse channel axis: rgb->bgr
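
To make the preprocessing concrete, here is the same operation applied to a single pixel in plain Python. The pixel value and the helper name `preprocess_pixel` are made up for illustration; the means are the ones used in `vgg_preprocess`.

```python
# Per-channel means in RGB order, as used in vgg_preprocess
VGG_MEAN = (123.68, 116.779, 103.939)

def preprocess_pixel(rgb):
    """Subtract the per-channel mean, then reverse the channel order (RGB -> BGR)."""
    centred = [value - mean for value, mean in zip(rgb, VGG_MEAN)]
    return centred[::-1]

pixel = (255.0, 128.0, 0.0)      # a hypothetical orange-ish pixel, in RGB order
bgr = preprocess_pixel(pixel)
print(bgr)
```

The blue channel (now first) ends up negative because the pixel has no blue at all, while the red channel (now last) sits well above the mean.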

This function defines the network architecture:

In [None]:
def vgg16_model():
    model = Sequential()
    model.add(Lambda(vgg_preprocess, input_shape = (3, 224, 224)))
    
    add_conv_block(model, 2, 64)
    add_conv_block(model, 2, 128)
    add_conv_block(model, 3, 256)
    add_conv_block(model, 3, 512)
    add_conv_block(model, 3, 512)

    model.add(Flatten())
    add_fully_connected_block(model)
    add_fully_connected_block(model)
    model.add(Dense(1000, activation='softmax'))
    return model
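
As a sanity check on the architecture, we can trace the image size through the five blocks and count the parameters by hand, without building the model. This is a sketch in plain Python; the helper names are my own, and it assumes each convolution and dense layer has one bias per output.

```python
# (number of conv layers, filters) for the five blocks in vgg16_model
BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def conv_params(in_channels, filters, kernel=3):
    """Weights plus biases of one convolutional layer."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, outputs):
    """Weights plus biases of one dense layer."""
    return inputs * outputs + outputs

channels, size, total = 3, 224, 0
for layers, filters in BLOCKS:
    for _ in range(layers):
        total += conv_params(channels, filters)
        channels = filters
    size //= 2                       # each block ends with 2x2 max pooling

flattened = channels * size * size   # input size of the first dense layer
for inputs, outputs in [(flattened, 4096), (4096, 4096), (4096, 1000)]:
    total += dense_params(inputs, outputs)

print(size, flattened, total)
```

The totals should agree with what `model.summary()` reports for the full network. Note that the vast majority of the parameters live in the dense layers, not in the convolutions.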

Let's create us a model:

In [None]:
model = vgg16_model()

Load the weights into the model.

In [None]:
fpath = get_file('vgg16.h5', model_path + 'vgg16.h5', cache_subdir='models')
model.load_weights(fpath)

Next, we'll grab some images already classified as dogs or cats.

In [None]:
batch_size = 64

def get_batches(dirname, gen=image.ImageDataGenerator(), shuffle=True, 
                batch_size=batch_size, class_mode='categorical'):
    return gen.flow_from_directory(path + dirname, target_size=(224, 224), 
                class_mode=class_mode, shuffle=shuffle, batch_size=batch_size)

In [None]:
training_batches = get_batches('train', batch_size=batch_size)
validation_batches = get_batches('valid', batch_size=batch_size)

Let's have a look at some of the data:

In [None]:
sample_batches = get_batches('train', batch_size=batch_size)
imgs, labels = next(sample_batches)
utils.plot_images(imgs[:6], titles=labels[:6])

So, the model we have so far can predict ImageNet classes. We want to customise this model to predict cats or dogs instead.

We do this by replicating the "fine tune" and "fit" steps in the provided Vgg16 class, but we'll do this directly using the Keras API instead. Then, we'll use the resulting model to predict classifications for the test set.

## Fine tuning the model

First, pop off the last layer of the model (the 1000-node softmax layer), so that we can replace it with a 2-node one.

In [None]:
model.pop()

Mark all the remaining layers as non-trainable.

In [None]:
for layer in model.layers: layer.trainable=False

Add a new final layer, 2-node softmax.

In [None]:
model.add(Dense(2, activation='softmax'))
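
For reference, this is what a softmax does with the two raw scores of the final layer: it turns them into positive values that sum to 1, so they can be read as class probabilities. A minimal plain-Python sketch, with made-up scores:

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]   # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 0.5])   # hypothetical raw scores for the two classes
print(probs)
```

The larger score always gets the larger probability, and the gap between the scores controls how confident the output is.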

As a possible improvement over the basic case, we can fine-tune the previous dense layer as well.

In [None]:
model.summary()

# Make the last 3 layers trainable, i.e. the final dense layer we added but also the previous dense + dropout layers.
for layer in model.layers[-3:]: layer.trainable=True
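
To see how much more there is to train when we unfreeze the last three layers instead of just the new output layer, we can count the parameters by hand. A rough sketch, assuming one bias per output node (the helper name is my own); `model.summary()` should confirm the numbers.

```python
def dense_params(inputs, outputs):
    """Weights plus biases of one dense layer."""
    return inputs * outputs + outputs

new_output_layer = dense_params(4096, 2)     # the 2-node softmax layer we added
previous_dense = dense_params(4096, 4096)    # the second 4096-node dense layer
dropout = 0                                  # dropout has no parameters

trainable_last_only = new_output_layer
trainable_last_three = previous_dense + dropout + new_output_layer
print(trainable_last_only, trainable_last_three)
```

That is roughly a two-thousand-fold increase in trainable parameters, which is worth keeping in mind when comparing results after a single epoch.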

The model needs to be compiled before we can fit it on data.

In [None]:
model.compile(optimizer=Adam(lr=0.001),
                loss='categorical_crossentropy', metrics=['accuracy'])
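
The `categorical_crossentropy` loss compares the predicted probabilities with the one-hot label of each image. For a single image it boils down to the negative log of the probability assigned to the correct class, as this plain-Python sketch with made-up predictions shows:

```python
import math

def categorical_crossentropy(one_hot, probs):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

label = [0.0, 1.0]                                          # the image belongs to class 1
confident_right = categorical_crossentropy(label, [0.1, 0.9])
confident_wrong = categorical_crossentropy(label, [0.9, 0.1])
print(confident_right, confident_wrong)
```

A confident correct prediction costs very little, while a confident wrong one is punished heavily.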

Next, find the class labels we will use, ordered by their label index.

In [None]:
# We get class labels and indexes from the batches we read earlier
indexes_to_classes = dict((v,k) for k,v in training_batches.class_indices.items())
# Get the classes in order of their index values
# (sorting the keys explicitly avoids relying on dict ordering)
classes = [indexes_to_classes[i] for i in sorted(indexes_to_classes)]
print("Class labels: ", classes)
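
Here is the same inversion on a stand-alone dictionary, so it can be tried without loading any images. The mapping below is a hypothetical example of what `class_indices` might contain, assuming the training directory has `cats` and `dogs` subdirectories.

```python
# Hypothetical mapping from class name to label index
class_indices = {'dogs': 1, 'cats': 0}

# Invert it, then read the names back in index order
indexes_to_classes = {index: name for name, index in class_indices.items()}
classes = [indexes_to_classes[i] for i in sorted(indexes_to_classes)]
print(classes)
```

The position of a name in `classes` then matches the column of the model output that holds its probability.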

Now we can fit the updated model to our training data.

## Training the model

In [None]:
training_history = model.fit_generator(training_batches, samples_per_epoch=training_batches.nb_sample, nb_epoch=1,
                       validation_data=validation_batches, nb_val_samples=validation_batches.nb_sample)

Some example training stats, for future reference.

3 epochs on network, fine tuning last dense layer only:

```
Epoch 1/3
23000/23000 [=========================] - 647s - loss: 0.1163 - acc: 0.9719 - val_loss: 0.0509 - val_acc: 0.9875
Epoch 2/3
23000/23000 [=========================] - 643s - loss: 0.0924 - acc: 0.9785 - val_loss: 0.0453 - val_acc: 0.9910
Epoch 3/3
23000/23000 [=========================] - 643s - loss: 0.0940 - acc: 0.9792 - val_loss: 0.0599 - val_acc: 0.9875
```

1 epoch, fine tuning last 3 layers (last two dense layers and dropout in between):

```
Epoch 1/1
23000/23000 [==========================] - 654s - loss: 0.7241 - acc: 0.9522 - val_loss: 0.6085 - val_acc: 0.9615
```

Save the weights.

In [None]:
model.save_weights('fitted_weights_1epoch-finetune3layers.h5')

Save the whole model as well, so we can very quickly pick up where we left off.

In [None]:
model.save('cats-dogs-model_1epoch-finetune3layers.h5')

Alternatively, load previously saved weights (or a whole saved model) instead of training the model. Run this step in place of the training step above.

In [None]:
#model.load_weights('fitted_weights_1.h5')
#model = keras.models.load_model('cats-dogs-model_1.h5')

## Using the model for classification

Now we have a model that we can use to classify images! Let's try it out!

Here are some images:

In [None]:
batch_size = 50
sample_test_batches = get_batches('test', batch_size=batch_size, class_mode=None)
images = next(sample_test_batches)

Show the first few images so we can check the scores.

In [None]:
utils.plot_images(images[:8], titles=None)

In [None]:
predictions = model.predict(images)
print(predictions[:8])

Looking good eh!

So, now we need to run this across all the files in a given test set and produce a verdict on the cat-vs-dogness of the images. 

In [None]:
test_batches = get_batches('test', batch_size=batch_size, shuffle=False, class_mode=None)
all_test_predictions = model.predict_generator(test_batches, test_batches.nb_sample)

In [None]:
print(all_test_predictions[:8])
max_score_idxs = np.argmax(all_test_predictions, axis=1)
print(max_score_idxs[:8])
print(test_batches.filenames[:8], len(test_batches.filenames))
max_score_idxs

In [None]:
import pathlib
filenames = list(map((lambda fn: pathlib.Path(fn).stem), test_batches.filenames))
file_ids = np.array(filenames)[:test_batches.nb_sample]
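# Column 1 holds the probability of class index 1 ('dogs', assuming the usual cats/dogs directory names)
dog_probs = all_test_predictions[:, 1]
# Clip the probabilities rather than the argmax indices, so the submission keeps the model's confidence
clipped_probs = np.clip(dog_probs, 0.05, 0.95)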
results = np.column_stack([file_ids, clipped_probs])
print(results[:8])
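
Why clip at all? Assuming the submission is scored with log loss (as Kaggle's Dogs vs. Cats Redux competition is), a single confidently wrong answer can cost a lot. The plain-Python sketch below shows the trade-off for one image; the probabilities are made up and the helper names are my own.

```python
import math

def log_loss(label, prob):
    """Binary log loss for one image: label is 1 for dog, 0 for cat; prob is P(dog)."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

def clip(value, low=0.05, high=0.95):
    """Limit a value to the range [low, high]."""
    return max(low, min(high, value))

# A confidently wrong prediction: the image is a dog, but P(dog) is tiny
wrong_unclipped = log_loss(1, 1e-7)
wrong_clipped = log_loss(1, clip(1e-7))
# A confidently right prediction pays a small price for clipping
right_unclipped = log_loss(1, 1 - 1e-7)
right_clipped = log_loss(1, clip(1 - 1e-7))
print(wrong_unclipped, wrong_clipped, right_unclipped, right_clipped)
```

Clipping caps the worst-case penalty per image at the cost of a slightly higher loss on the images we get right.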

Now write out a CSV file in a suitable format for submission to Kaggle.

In [None]:
submission_filename = 'catsdogs-redux-7.csv'
np.savetxt(submission_filename, results, fmt='%s', delimiter=',', header='id,label', comments = '')
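
If you'd rather not go through NumPy for this step, the standard library's `csv` module produces the same kind of file. A minimal sketch with made-up rows, writing to an in-memory buffer instead of a real file:

```python
import csv
import io

rows = [('1', 0.95), ('2', 0.05)]    # hypothetical (id, label) pairs
buffer = io.StringIO()               # stands in for an open file
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(['id', 'label'])     # header row
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

To write an actual file, replace the buffer with `open(submission_filename, 'w', newline='')`.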

All done!

Here's a handy link to the generated file.

In [None]:
from IPython.display import FileLink
FileLink(submission_filename)

## Inspecting the results

Let's take a look at some of the results to get a feel for whether they are good or not.

First, calculate predictions on validation set, so we can find correct and incorrect examples:

In [None]:
#model.load_weights('fitted_weights_1.h5')
#val_batches, probs = vgg.test(valid_path, batch_size = batch_size)
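# shuffle=False keeps the predictions in the same order as eval_batches.filenames and eval_batches.classes
eval_batches = get_batches('valid', batch_size=batch_size, shuffle=False, class_mode=None)
eval_preds = model.predict_generator(eval_batches, eval_batches.nb_sample)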

Join up with the correct answers for these images.

In [None]:
filenames = eval_batches.filenames
expected_labels = eval_batches.classes #0 or 1

#Round our predictions to 0/1 to generate labels
our_predictions = eval_preds[:,0]
our_labels = np.round(1 - our_predictions)
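
The rounding trick is easier to follow on a handful of numbers. In this plain-Python sketch the probabilities and expected labels are made up; column 0 of the predictions is taken to be the probability of class 0, so `1 - p` is the probability of class 1.

```python
# Hypothetical P(class 0) values and true class indexes for four images
prob_class0 = [0.98, 0.10, 0.60, 0.30]
expected = [0, 1, 1, 1]

# Round P(class 1) = 1 - P(class 0) to get a 0/1 label
predicted = [int(round(1 - p)) for p in prob_class0]

# Split the image indexes into correct and incorrect predictions
correct = [i for i, (ours, truth) in enumerate(zip(predicted, expected)) if ours == truth]
incorrect = [i for i in range(len(expected)) if i not in correct]
print(predicted, correct, incorrect)
```

The third image is the interesting one: the model leans towards class 0 with a probability of 0.60, but the true label is 1.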

Some helper code for easily plotting results:

In [None]:
from keras.preprocessing import image

# Helper function to plot images by index in the validation set
# plot_images is a helper function in utils.py
def plots_idx(idx, titles=None):
    utils.plot_images([image.load_img(validation_path + filenames[i]) for i in idx], titles=titles)
    
#Number of images to view for each visualization task
n_view = 4

First, some random correct results:

In [None]:
print(our_labels[:4], expected_labels[:4])
# our_labels holds floats and expected_labels holds ints; NumPy compares them by value,
# so no explicit conversion is needed (our_labels.astype(int) would also work)
correct = np.where(our_labels == expected_labels)[0]
print("Found %d correct labels" % len(correct))
#idx = permutation(correct)[:n_view]
#plots_idx(idx, our_predictions[idx])