# Lesson 1: classifying cats and dogs

This notebook contains my implementation of the code in week 1 of the Fast AI course.

This is a from-scratch implementation of the steps needed to classify the given image data, aiming to be a bit cleaner than the one provided with the course.

## Setup

In [None]:
%matplotlib inline

Set the path to the data. This assumes there's a symlink to the data directory in the directory where this notebook is stored.

In [None]:
#path = "data/dogscats"
path = "data/dogscats/sample/"
model_path = "http://files.fast.ai/models/"

Some imports we'll need in subsequent code. Note that the code in this notebook assumes Python 3.

In [None]:
import os
import numpy as np
np.set_printoptions(precision=4, linewidth=100)

Load (and reload) the utility code we use in later steps.

In [None]:
from importlib import reload
import utils; reload(utils)

## Model setup

In the subsequent steps, we implement the classification using the Keras API directly, rather than the `Vgg16` helper code in the course notebook. First up: import the bits we need.

In [None]:
from numpy.random import random, permutation

import keras
from keras import backend as K
from keras.utils.data_utils import get_file
from keras.models import Sequential, Model
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers import Input
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import Adam
from keras.preprocessing import image

# Use Theano-style image ordering (channels, rows, cols), which matches the
# (3, 224, 224) input shape used below. If we don't set this, things blow up!  :-/
K.set_image_dim_ordering('th')

## Model creation

We want to create a neural network that implements the VGG16 model, i.e. a well-known model that has been trained for image recognition using a defined architecture and set of published weights. The following sections will build us such a model which we can then use for our cats and dogs data.

First up is a function for adding a VGG16 convolutional block to a model. Each block contains one or more convolutional layers, each preceded by zero padding (1 pixel on each side), and ends with a 2x2 max-pooling layer. Each convolutional layer uses the given number of filters, a 3x3 kernel and 'relu' activation (a rectified linear unit, i.e. the ramp function $f(x)=\max(0,x)$).

In [None]:
def add_conv_block(model, layers, filters):
    for i in range(layers): 
        model.add(ZeroPadding2D((1, 1)))
        model.add(Convolution2D(filters, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))
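
To see what a block does to the image size, here is a small sketch of the arithmetic in plain Python. The helper names (`relu`, `conv_output_size`, `pool_output_size`) are made up for this illustration and are not part of Keras; the size formula is the usual one for convolution and pooling outputs.

```python
def relu(x):
    """Rectified linear unit: f(x) = max(0, x)."""
    return max(0.0, x)

def conv_output_size(size, kernel=3, padding=1, stride=1):
    """Width/height of a convolution's output for a padded square input."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_output_size(size, pool=2, stride=2):
    """Width/height of a max-pooling layer's output."""
    return (size - pool) // stride + 1

size = 224
after_conv = conv_output_size(size)        # 1 pixel of padding offsets the 3x3 kernel
after_pool = pool_output_size(after_conv)  # 2x2 pooling with stride 2 halves the size
print(relu(-3.0), relu(2.5), after_conv, after_pool)
```

So the zero padding is there to stop the 3x3 convolution from shrinking the image; only the pooling layer at the end of the block changes the size.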

This one adds a fully connected block: a dense layer with 4096 nodes and 'relu' activation, followed by dropout with a rate of 0.5:

In [None]:
def add_fully_connected_block(model):
    model.add(Dense(4096, activation = 'relu'))
    model.add(Dropout(0.5))
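
For intuition about the `Dropout(0.5)` layer, here is a rough sketch of dropout in plain Python. It assumes the common "inverted dropout" formulation, where surviving activations are scaled by 1 / (1 - rate) during training; I have not checked that this matches the Keras internals exactly, and the `dropout` function below is a hypothetical helper, not a Keras call.

```python
import random

def dropout(values, rate=0.5, training=True, rng=None):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not training:
        # At prediction time the layer passes values through unchanged
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

activations = [0.5, 1.0, 2.0, 4.0]
dropped = dropout(activations, rate=0.5, rng=random.Random(0))
unchanged = dropout(activations, rate=0.5, training=False)
print(dropped, unchanged)
```

The scaling keeps the expected value of each activation the same whether or not dropout is active, which is why nothing special is needed at prediction time.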

Preprocess the image data to match what the VGG model expects: subtract the per-channel mean, then reverse the channel order from RGB to BGR.

In [None]:
# Mean of each channel as provided by VGG researchers
vgg_mean = np.array([123.68, 116.779, 103.939]).reshape((3,1,1))

def vgg_preprocess(x):
    x = x - vgg_mean     # subtract mean
    return x[:, ::-1]    # reverse channel axis rgb->bgr
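
As a quick sanity check, here is the same preprocessing applied to a tiny made-up batch with NumPy. The 2x2 "image" and its pixel values are invented for the illustration; the function body is the one from the cell above.

```python
import numpy as np

vgg_mean = np.array([123.68, 116.779, 103.939]).reshape((3, 1, 1))

def vgg_preprocess(x):
    x = x - vgg_mean      # broadcast the per-channel mean over rows and columns
    return x[:, ::-1]     # reverse axis 1, the channel axis

# One hypothetical 2x2 image, channels first: R=200, G=150, B=100 everywhere
red = np.full((2, 2), 200.0)
green = np.full((2, 2), 150.0)
blue = np.full((2, 2), 100.0)
batch = np.stack([red, green, blue])[np.newaxis]   # shape (1, 3, 2, 2)

out = vgg_preprocess(batch)
print(out.shape)
print(out[0, :, 0, 0])   # the top-left pixel, one value per channel
```

After preprocessing, channel 0 holds the mean-centred blue values and channel 2 the mean-centred red values.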

This function defines the network architecture:

In [None]:
def vgg16_model():
    model = Sequential()
    model.add(Lambda(vgg_preprocess, input_shape = (3, 224, 224)))
    
    add_conv_block(model, 2, 64)
    add_conv_block(model, 2, 128)
    add_conv_block(model, 3, 256)
    add_conv_block(model, 3, 512)
    add_conv_block(model, 3, 512)

    model.add(Flatten())
    add_fully_connected_block(model)
    add_fully_connected_block(model)
    model.add(Dense(1000, activation='softmax'))
    return model
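
To get a feel for the size of this network, here is a back-of-the-envelope sketch in plain Python that follows the same block layout and counts weights and biases. The helpers `conv_params` and `dense_params` are made up for this illustration; in the notebook itself `model.summary()` should report the real numbers.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of a square-kernel convolutional layer."""
    return in_channels * out_channels * kernel * kernel + out_channels

def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]  # (layers, filters)
channels, size, total = 3, 224, 0

for layers, filters in blocks:
    for _ in range(layers):
        total += conv_params(channels, filters)
        channels = filters
    size //= 2                      # each block ends with 2x2 max pooling

flattened = channels * size * size  # what Flatten() hands to the first Dense layer
inputs = flattened
for units in (4096, 4096, 1000):
    total += dense_params(inputs, units)
    inputs = units

print(size, flattened, total)
```

Most of the parameters sit in the first fully connected layer, which takes the flattened 512 x 7 x 7 output of the last convolutional block as its input.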

Let's create us a model:

In [None]:
model = vgg16_model()

Download the pretrained VGG16 weights (`get_file` caches them locally) and load them into the model.

In [None]:
fpath = get_file('vgg16.h5', model_path + 'vgg16.h5', cache_subdir='models')
model.load_weights(fpath)

Next, we'll grab some images already classified as dogs or cats.

In [None]:
batch_size = 64

def get_batches(dirname, gen=image.ImageDataGenerator(), shuffle=True, 
                batch_size=batch_size, class_mode='categorical'):
    return gen.flow_from_directory(path + dirname, target_size=(224, 224), 
                class_mode=class_mode, shuffle=shuffle, batch_size=batch_size)

In [None]:
training_batches = get_batches('train', batch_size=batch_size)
validation_batches = get_batches('valid', batch_size=batch_size)
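
`flow_from_directory` expects one subdirectory per class under the directory it is given. The sketch below mimics, in plain Python, how I understand the class indexes to be derived: subdirectory names sorted alphabetically and numbered from zero. That ordering is my assumption about Keras, and the `cats` and `dogs` directory names and the `class_indices_for` helper are made up for the example.

```python
import os
import tempfile

def class_indices_for(directory):
    """Map each class subdirectory name to an index, in sorted order."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}

with tempfile.TemporaryDirectory() as root:
    # Build an empty train/<class>/ layout like the one the notebook reads from
    for class_name in ("dogs", "cats"):
        os.makedirs(os.path.join(root, "train", class_name))
    indices = class_indices_for(os.path.join(root, "train"))

print(indices)
```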

Let's have a look at some of the data:

In [None]:
sample_batches = get_batches('train', batch_size=batch_size)
imgs, labels = next(sample_batches)
utils.plot_images(imgs[:6], titles=labels[:6])

So, the model we have so far can predict ImageNet classes. We want to customise this model to predict cats or dogs instead.

We do this by replicating the "fine tune" and "fit" steps in the provided Vgg16 class, but we'll do this directly using the Keras API instead. Then, we'll use the resulting model to predict classifications for the test set.

## Fine tuning the model

First, pop off the last layer of the model (the 1000-node softmax layer), so that we can replace it with a 2-node one.

In [None]:
model.pop()

Mark all the remaining layers as non-trainable.

In [None]:
for layer in model.layers: layer.trainable=False

Add a new final layer, 2-node softmax.

In [None]:
model.add(Dense(2, activation='softmax'))
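
With everything else frozen, only the new layer's weights get trained. A rough count in plain Python, assuming the layer sizes used in this notebook (`dense_params` is a helper made up for the illustration):

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

removed_head = dense_params(4096, 1000)  # the ImageNet softmax layer we popped
new_head = dense_params(4096, 2)         # the cats-vs-dogs softmax layer we added
print(removed_head, new_head)
```

That is a tiny number of trainable parameters compared to the rest of the network, which is why a single epoch on a modest amount of data can be enough here.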

The model needs to be compiled before we can fit it on data.

In [None]:
model.compile(optimizer=Adam(lr=0.001),
                loss='categorical_crossentropy', metrics=['accuracy'])
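
For reference, here is a sketch of what the softmax output and the categorical cross-entropy loss compute for a single example, in plain Python. The input scores are made up, and the `eps` clipping is my own guard against `log(0)` rather than something taken from Keras.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log-probability assigned to the true (one-hot) class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

probs = softmax([2.0, 0.0])                  # hypothetical scores for 2 classes
loss_if_first = categorical_crossentropy([1.0, 0.0], probs)
loss_if_second = categorical_crossentropy([0.0, 1.0], probs)
print(probs, loss_if_first, loss_if_second)
```

The loss is small when the model puts most of its probability on the true class and grows quickly when it is confidently wrong.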

Next, find the class labels we will use (ordered by their label index):

In [None]:
# We get class labels and indexes from the batches we read earlier
indexes_to_classes = dict((v,k) for k,v in training_batches.class_indices.items())
# Get the classes in order of their index values. Sort explicitly rather than
# relying on dict ordering, which is only guaranteed from Python 3.7 onwards.
classes = [indexes_to_classes[i] for i in sorted(indexes_to_classes)]
print("Class labels: ", classes)
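
Here is the same inversion on a hand-written mapping, so the result is easy to check by eye. The `class_indices` dict below is a made-up stand-in for `training_batches.class_indices`.

```python
# Hypothetical stand-in for training_batches.class_indices
class_indices = {"dogs": 1, "cats": 0}

# Sort the (name, index) pairs by index and keep just the names
classes = [name for name, index in sorted(class_indices.items(), key=lambda kv: kv[1])]

print(classes)
print(classes[1])   # look up the label for index 1
```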

Now we can fit the updated model to our training data.

In [None]:
model.fit_generator(training_batches, samples_per_epoch=training_batches.nb_sample, nb_epoch=1,
                validation_data=validation_batches, nb_val_samples=validation_batches.nb_sample)

## Using the model for classification

Now we have a model that we can use to classify images! Let's try it out!

Here are some images:

In [None]:
# shuffle=False keeps the batches in the same order as test_batches.filenames
test_batches = get_batches('test1', shuffle=False, batch_size=batch_size, class_mode=None)
images = next(test_batches)

Show the first few images so we can check the scores.

In [None]:
utils.plot_images(images[:8], titles=None)

In [None]:
predictions = model.predict(images)
print(predictions)

Looking good eh!

So, now we need to run this across all the files in a given test set and produce a verdict on the cat-vs-dogness of the images. 

In [None]:
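# TODO: Bump this up on the big box!
# num_items = 10000
num_items = 50
# Use a fresh, unshuffled generator so that the predictions start at the first
# file and line up with test_batches.filenames
test_batches = get_batches('test1', shuffle=False, batch_size=batch_size, class_mode=None)
all_test_predictions = model.predict_generator(test_batches, num_items)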

In [None]:
print(all_test_predictions)
len(all_test_predictions)

In [None]:
max_score_idxs = np.argmax(all_test_predictions, axis=1)
print(max_score_idxs[:8])
print(test_batches.filenames[:8], len(test_batches.filenames))
max_score_idxs
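
To make the `argmax` step concrete, here it is on a small made-up prediction array. The scores and the `classes` list are invented for the example; each row holds the two softmax outputs for one image.

```python
import numpy as np

# Hypothetical predictions for three images: one row per image, one column per class
scores = np.array([[0.9, 0.1],
                   [0.2, 0.8],
                   [0.4, 0.6]])
classes = ["cats", "dogs"]   # assumed label order, index 0 first

best = np.argmax(scores, axis=1)        # column index of the highest score in each row
labels = [classes[i] for i in best]
print(best, labels)
```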

In [None]:
import pathlib
filenames = list(map((lambda fn: pathlib.Path(fn).stem), test_batches.filenames))
file_ids = np.array(filenames)[:num_items]
results = np.column_stack([file_ids, max_score_idxs])
print(results[:4])
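
The same id and label pairing can be sketched in plain Python. The filenames and label indexes below are made up, and `PurePosixPath` is used only so the example behaves the same on every platform.

```python
from pathlib import PurePosixPath

# Hypothetical generator filenames and predicted label indexes
filenames = ["unknown/1.jpg", "unknown/10.jpg", "unknown/2.jpg"]
label_idxs = [0, 1, 1]

# The stem drops both the directory and the extension, leaving the file id
file_ids = [PurePosixPath(name).stem for name in filenames]
rows = list(zip(file_ids, label_idxs))
print(rows)
```

Note that this pairing only makes sense if the predictions are in the same order as the filenames, which is why the test generator must not shuffle.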

In [None]:
np.savetxt('catsdogs.csv', results, fmt='%s', delimiter=',')