# Dogs-vs-cats classification with CNNs

In this notebook, we'll train a convolutional neural network (CNN, ConvNet) to distinguish images of dogs from images of cats using Keras (version $\ge$ 2 is required). This notebook is largely based on the blog post [Building powerful image classification models using very little data](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html) by François Chollet.

**Note that using a GPU with this notebook is highly recommended.**

First, the needed imports. Keras tells us which backend (Theano, TensorFlow, CNTK) it will be using.

In [None]:
%matplotlib inline

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Flatten, MaxPooling2D
from keras.layers.convolutional import Conv2D 
from keras.preprocessing.image import (ImageDataGenerator, array_to_img, 
                                      img_to_array, load_img)
from keras import applications, optimizers

from keras.utils import np_utils
from keras import backend as K

from distutils.version import LooseVersion as LV
from keras import __version__

from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

print('Using Keras version:', __version__, 'backend:', K.backend())
assert(LV(__version__) >= LV("2.0.0"))
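
The version check above relies on `distutils`, which was removed from the standard library in Python 3.12. As a minimal sketch of the same comparison without it, assuming version strings begin with dot-separated integers (the helper names `version_tuple` and `is_at_least` are illustrative and not part of Keras):

```python
import re

def version_tuple(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric version prefix in %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))

def pad(components, width):
    """Right-pad a version tuple with zeros so that 2.0 compares equal to 2.0.0."""
    return components + (0,) * (width - len(components))

def is_at_least(version, minimum):
    """True if version is greater than or equal to minimum."""
    a, b = version_tuple(version), version_tuple(minimum)
    width = max(len(a), len(b))
    return pad(a, width) >= pad(b, width)

print(is_at_least("2.2.4", "2.0.0"))   # True
print(is_at_least("1.2.2", "2.0.0"))   # False
```

In the notebook, `is_at_least(__version__, "2.0.0")` could stand in for the `LooseVersion` comparison.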

If we are using TensorFlow as the backend, we can use TensorBoard to visualize our progress during training.

In [None]:
if K.backend() == "tensorflow":
    import tensorflow as tf
    from keras.callbacks import TensorBoard
    import os, datetime
    logdir = os.path.join(os.getcwd(), "logs",
                     datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
    print('TensorBoard log directory:', logdir)
    os.makedirs(logdir)
    callbacks = [TensorBoard(log_dir=logdir)]
else:
    callbacks = None

## Data

The training dataset consists of 2000 images of dogs and cats, split evenly between the two classes. In addition, the validation set consists of 1000 images and the test set of 22000 images.

### Downloading the data

In [None]:
datapath = "/home/cloud-user/dogs-vs-cats/train-2000"
(nimages_train, nimages_validation, nimages_test) = (2000, 1000, 22000)
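
Because the image counts above are hard-coded, a quick check against the directory tree can catch a mismatch early. The following is a minimal sketch, assuming each split directory contains one subdirectory per class (the layout that `flow_from_directory` expects). The helper `count_images` and its list of file extensions are illustrative assumptions, not part of the original notebook.

```python
import os

def count_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Count image files in each immediate subdirectory of root."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        class_dir = os.path.join(root, entry)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class directories
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(extensions))
    return counts

# Possible use with the notebook's variables (the split names are the ones
# used by the data loaders defined later in the notebook):
# for split in ("train", "validation", "test"):
#     print(split, count_images(os.path.join(datapath, split)))
```

The sum of the per-class counts for each split should then match `nimages_train`, `nimages_validation`, and `nimages_test`.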

### Data augmentation

First, we'll resize all training and validation images to a fixed size.

Then, to make the most of our limited number of training examples, we'll apply random transformations to the images each time we loop over them, so the network rarely sees exactly the same image twice. In this way, we "augment" our training dataset to contain more data. Keras provides various transformations out of the box; see [ImageDataGenerator](https://keras.io/preprocessing/image/) for more information.

In [None]:
input_image_size = (150, 150)

datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        #rotation_range=40,
        #width_shift_range=0.2,
        #height_shift_range=0.2,
        horizontal_flip=True)

noopgen = ImageDataGenerator(rescale=1./255)
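
To illustrate the idea behind the generators above, here is a minimal pure-Python sketch of two of the configured operations: rescaling pixel values and a random horizontal flip. It is only an illustration of the concept, not how `ImageDataGenerator` is implemented; the functions `rescale` and `random_horizontal_flip` and the tiny example image are made up for this purpose.

```python
import random

def rescale(image, factor=1.0 / 255):
    """Scale every pixel value, mirroring the rescale=1./255 argument."""
    return [[[value * factor for value in pixel] for pixel in row]
            for row in image]

def random_horizontal_flip(image, rng):
    """Mirror the image left-to-right with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image

# A tiny 1x3 "image" with three RGB pixels: red, green, blue.
image = [[[255, 0, 0], [0, 255, 0], [0, 0, 255]]]

rng = random.Random(0)
for epoch in range(3):
    # Each pass may yield a different variant of the same source image.
    augmented = random_horizontal_flip(rescale(image), rng)
    print(epoch, augmented[0][0])
```

Because the transformation is drawn anew on every pass, the same source image can look different from one epoch to the next.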

Let's see a couple of training images with and without the augmentation.

In [None]:
orig_generator = noopgen.flow_from_directory(
        datapath+'/train',  
        target_size=input_image_size,  
        batch_size=9)

augm_generator = datagen.flow_from_directory(
        datapath+'/train',  
        target_size=input_image_size,  
        batch_size=9)

for batch, _ in orig_generator:
    plt.figure(figsize=(10,10))
    for i in range(9):
        plt.subplot(3,3,i+1)
        plt.imshow(batch[i,:,:,:])
    plt.suptitle('only resized training images', fontsize=16, y=0.93)
    break

for batch, _ in augm_generator:
    plt.figure(figsize=(10,10))
    for i in range(9):
        plt.subplot(3,3,i+1)
        plt.imshow(batch[i,:,:,:])
    plt.suptitle('augmented training images', fontsize=16, y=0.93)
    break

Let's also write the augmented images to a TensorBoard event file.

In [None]:
if K.backend() == "tensorflow":
    imgs = tf.convert_to_tensor(batch)
    summary_op = tf.summary.image("augmented", imgs, max_outputs=9)
    with tf.Session() as sess:
        summary = sess.run(summary_op)
        writer = tf.summary.FileWriter(logdir)
        writer.add_summary(summary)
        writer.close()

### Data loaders

Let's now define our real data loaders for training and validation data.

In [None]:
batch_size = 25

print('Train: ', end="")
train_generator = datagen.flow_from_directory(
        datapath+'/train',  
        target_size=input_image_size,
        batch_size=batch_size, 
        class_mode='binary')

print('Validation: ', end="")
validation_generator = noopgen.flow_from_directory(
        datapath+'/validation',  
        target_size=input_image_size,
        batch_size=batch_size,
        class_mode='binary')

print('Test: ', end="")
test_generator = noopgen.flow_from_directory(
        datapath+'/test',  
        target_size=input_image_size,
        batch_size=batch_size,
        class_mode='binary')
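
With `class_mode='binary'`, each image gets a label of 0 or 1. When no explicit class list is given, `flow_from_directory` infers the classes from the subdirectory names and assigns indices in alphanumeric order. The sketch below mimics that ordering, assuming the subdirectories are named `cats` and `dogs` (the notebook does not state the names); the function `class_indices` is an illustrative stand-in.

```python
def class_indices(subdirectories):
    """Map class names to integer labels in alphanumeric order."""
    return {name: index for index, name in enumerate(sorted(subdirectories))}

print(class_indices(["dogs", "cats"]))   # {'cats': 0, 'dogs': 1}
```

The actual mapping used by a generator can be inspected through its `class_indices` attribute, for example `train_generator.class_indices`.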

## Option 1: Train a small CNN from scratch

As with MNIST digits, we can start from scratch and train a CNN for the classification task. However, because the number of training images is small, a large network will easily overfit even with data augmentation, so we keep the network small.

### Initialization

In [None]:
model = Sequential()

model.add(Conv2D(32, (3, 3), input_shape=input_image_size+(3,), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.summary()  # prints the summary itself; wrapping it in print() would also print "None"
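
The layer sizes reported in the summary can be checked by hand. The sketch below recomputes the feature-map sizes and parameter counts for the network defined above, assuming the Keras defaults of `'valid'` padding and stride 1 for `Conv2D` and non-overlapping windows for `MaxPooling2D`. The helper functions are illustrative only.

```python
def conv_output(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_output(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

size, channels, total = 150, 3, 0
for filters in (32, 32, 64):          # the three Conv2D + MaxPooling2D stages
    size = conv_output(size)
    total += conv_params(channels, filters)
    channels = filters
    size = pool_output(size)

flat = size * size * channels         # output length of Flatten()
total += dense_params(flat, 64) + dense_params(64, 1)

print("flattened features:", flat)    # 18496
print("total parameters:", total)     # 1212513
```

Almost all of the parameters sit in the first `Dense` layer, which is one reason the dropout layer is placed right after it.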

In [None]:
SVG(model_to_dot(model, show_shapes=True).create(prog='dot', format='svg'))

### Learning

In [None]:
%%time

epochs = 20 
        
history = model.fit_generator(train_generator,
                              steps_per_epoch=nimages_train // batch_size,
                              epochs=epochs,
                              validation_data=validation_generator,
                              validation_steps=nimages_validation // batch_size,
                              verbose=2, callbacks=callbacks)

model.save("dvc-small-cnn.h5")
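
The step counts passed to `fit_generator` above follow from integer division of the dataset size by the batch size. A small worked example with the numbers used in this notebook (the helper `steps_and_remainder` is illustrative):

```python
def steps_and_remainder(n_images, batch_size):
    """Return (full batches, images left over) for one pass over the data."""
    return divmod(n_images, batch_size)

batch_size = 25
for name, n_images in (("train", 2000), ("validation", 1000), ("test", 22000)):
    steps, leftover = steps_and_remainder(n_images, batch_size)
    print(name, steps, leftover)
# train 80 0
# validation 40 0
# test 880 0
```

Since 25 divides all three dataset sizes evenly, no images are skipped by the floor division; with a batch size that does not divide evenly, the last partial batch would be left out of each pass.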

In [None]:
plt.figure(figsize=(5,3))
plt.plot(history.epoch,history.history['loss'], label='training')
plt.plot(history.epoch,history.history['val_loss'], label='validation')
plt.title('loss')
plt.legend(loc='best')

plt.figure(figsize=(5,3))
plt.plot(history.epoch,history.history['acc'], label='training')
plt.plot(history.epoch,history.history['val_acc'], label='validation')
plt.title('accuracy')
plt.legend(loc='best');
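
The plotting code above reads the keys `'acc'` and `'val_acc'`. Newer Keras releases (2.3 and later) report the metric under the name given to `compile`, that is, `'accuracy'` and `'val_accuracy'`. A minimal sketch of a lookup that works with either naming; the helper `first_present` and the example dictionary with made-up values are illustrative assumptions:

```python
def first_present(history_dict, *candidates):
    """Return the first candidate key that exists in history_dict."""
    for key in candidates:
        if key in history_dict:
            return key
    raise KeyError("none of %r found" % (candidates,))

# Stand-in for history.history from a newer Keras release (made-up values).
example_history = {"loss": [0.7, 0.6], "accuracy": [0.55, 0.65],
                   "val_loss": [0.7, 0.65], "val_accuracy": [0.5, 0.6]}

acc_key = first_present(example_history, "acc", "accuracy")
val_acc_key = first_present(example_history, "val_acc", "val_accuracy")
print(acc_key, val_acc_key)   # accuracy val_accuracy
```

In the notebook, `history.history[acc_key]` could then replace `history.history['acc']` in the plotting calls.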

### Inference

In [None]:
%%time
scores = model.evaluate_generator(test_generator,
                                  steps=nimages_test // batch_size)
print("Test set %s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
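
For a single sigmoid output, the reported accuracy is the fraction of images whose predicted probability falls on the correct side of 0.5. The sketch below spells this out in plain Python with made-up labels and probabilities; it illustrates the idea rather than reproducing the Keras metric code.

```python
def binary_accuracy(labels, probabilities, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    predictions = [1 if p > threshold else 0 for p in probabilities]
    correct = sum(1 for y, y_hat in zip(labels, predictions) if y == y_hat)
    return correct / len(labels)

labels = [0, 1, 1, 0]
probabilities = [0.2, 0.9, 0.4, 0.6]   # made-up sigmoid outputs
print("%.2f%%" % (binary_accuracy(labels, probabilities) * 100))   # 50.00%
```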

## Option 2: Reuse a pre-trained CNN

Another option is to reuse a pre-trained network. Here we'll use the [VGG16](https://keras.io/applications/#vgg16) network architecture with weights learned on ImageNet. We remove the top (fully connected) layers and freeze the pre-trained convolutional weights.

### Initialization

In [None]:
model = Sequential()

vgg_model = applications.VGG16(weights='imagenet', 
                               include_top=False, 
                               input_shape=input_image_size+(3,))
for layer in vgg_model.layers:
    model.add(layer)

for layer in model.layers:
    layer.trainable = False

model.summary()  # prints the summary itself; wrapping it in print() would also print "None"

We then stack our own, randomly initialized layers on top of the VGG16 network.

In [None]:
model.add(Flatten())
model.add(Dense(64, activation='relu'))
#model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.summary()  # prints the summary itself; wrapping it in print() would also print "None"
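
The size of the new, trainable head can be worked out from the shape of the VGG16 output. The sketch below assumes the standard VGG16 layout: five 2x2 max-pooling stages, `'same'`-padded convolutions that preserve the spatial size, and 512 channels in the last block.

```python
size = 150
for _ in range(5):          # VGG16 has five 2x2 max-pooling stages
    size //= 2              # 'same'-padded convolutions keep the size

features = size * size * 512                     # output length of Flatten()
head = (features * 64 + 64) + (64 * 1 + 1)       # Dense(64) + Dense(1)

print("feature map:", (size, size, 512))         # (4, 4, 512)
print("flattened features:", features)           # 8192
print("trainable head parameters:", head)        # 524417
```

Only these roughly half a million head parameters are updated in the first training phase; the frozen convolutional base (about 14.7 million parameters) stays fixed.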

### Learning 1: New layers

In [None]:
%%time

epochs = 20

history = model.fit_generator(train_generator,
                              steps_per_epoch=nimages_train // batch_size,
                              epochs=epochs,
                              validation_data=validation_generator,
                              validation_steps=nimages_validation // batch_size,
                              verbose=2, callbacks=callbacks)

model.save("dvc-vgg16-reuse.h5")

In [None]:
plt.figure(figsize=(5,3))
plt.plot(history.epoch,history.history['loss'], label='training')
plt.plot(history.epoch,history.history['val_loss'], label='validation')
plt.title('loss')
plt.legend(loc='best')

plt.figure(figsize=(5,3))
plt.plot(history.epoch,history.history['acc'], label='training')
plt.plot(history.epoch,history.history['val_acc'], label='validation')
plt.title('accuracy')
plt.legend(loc='best');

### Learning 2: Fine-tuning

Once the top layers have learned some reasonable weights, we can continue training by unfreezing the last convolution block of VGG16 (`block5`) so that it can adapt to our data. We use a much smaller learning rate than before (`1e-5` below) so that the pre-trained weights are only adjusted slightly.

In [None]:
for layer in model.layers[15:]:
    layer.trainable = True
    print(layer.name, "now trainable")
    
model.compile(loss='binary_crossentropy',
    optimizer=optimizers.RMSprop(lr=1e-5),
    metrics=['accuracy'])

model.summary()  # prints the summary itself; wrapping it in print() would also print "None"
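
The slice `model.layers[15:]` above relies on the position of `block5` in the layer list, which may differ between Keras versions (for example, depending on whether an input layer is counted in `model.layers`). A hedged alternative is to select layers by name. The sketch below uses simple stand-in objects with `name` and `trainable` attributes so that it runs without Keras; the helper `unfreeze_by_prefix` is hypothetical, and in the notebook `model.layers` would be passed instead of the stand-ins.

```python
from types import SimpleNamespace

def unfreeze_by_prefix(layers, prefix):
    """Set trainable=True on layers whose name starts with prefix.

    Returns the names of the layers that were changed.
    """
    changed = []
    for layer in layers:
        if layer.name.startswith(prefix):
            layer.trainable = True
            changed.append(layer.name)
    return changed

# Stand-in objects mimicking the tail of the frozen VGG16 layer list.
layers = [SimpleNamespace(name=n, trainable=False)
          for n in ("block4_conv3", "block4_pool", "block5_conv1",
                    "block5_conv2", "block5_conv3", "block5_pool")]

print(unfreeze_by_prefix(layers, "block5"))
# ['block5_conv1', 'block5_conv2', 'block5_conv3', 'block5_pool']
```

As in the cell above, the model has to be compiled again after changing `trainable` for the change to take effect in training.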

Note that before continuing the training, we create a separate TensorBoard log directory:

In [None]:
%%time

epochs = 20

if K.backend() == "tensorflow":
    logdir_ft = logdir + "-ft"
    os.makedirs(logdir_ft)
    callbacks_ft = [TensorBoard(log_dir=logdir_ft)]
else:
    callbacks_ft = None

history = model.fit_generator(train_generator,
                              steps_per_epoch=nimages_train // batch_size,
                              epochs=epochs,
                              validation_data=validation_generator,
                              validation_steps=nimages_validation // batch_size,
                              verbose=2, callbacks=callbacks_ft)

model.save("dvc-vgg16-finetune.h5")

In [None]:
plt.figure(figsize=(5,3))
plt.plot(history.epoch,history.history['loss'], label='training')
plt.plot(history.epoch,history.history['val_loss'], label='validation')
plt.title('loss')
plt.legend(loc='best')

plt.figure(figsize=(5,3))
plt.plot(history.epoch,history.history['acc'], label='training')
plt.plot(history.epoch,history.history['val_acc'], label='validation')
plt.title('accuracy')
plt.legend(loc='best');

### Inference

In [None]:
%%time
scores = model.evaluate_generator(test_generator,
                                  steps=nimages_test // batch_size)
print("Test set %s: %.2f%%" % (model.metrics_names[1], scores[1]*100))