This notebook demonstrates the effect of data augmentation when training CNNs. In particular, we will load a subset of the CIFAR-10 dataset and train a CNN both with and without data augmentation. We will use Keras, which simplifies training CNNs considerably, with TensorFlow as the backend. Among other things, this makes it possible to use TensorBoard to visualize the model and the output generated.

This tutorial is based on several other tutorials that are available online. It might be worth checking them out as well at some point!
* [http://parneetk.github.io/blog/cnn-cifar10/](http://parneetk.github.io/blog/cnn-cifar10/)
* [https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html)
* [https://github.com/stratospark/food-101-keras](https://github.com/stratospark/food-101-keras)
* [https://chsasank.github.io/keras-tutorial.html](https://chsasank.github.io/keras-tutorial.html)

In [None]:
import matplotlib.pyplot as plt

import os
import time
import numpy

from keras import backend as K
# use the channels-first ('th') image layout, i.e., (channels, height, width)
K.set_image_dim_ordering('th')
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D, Cropping2D
from keras.preprocessing.image import ImageDataGenerator

Next, we will use the [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html), a labeled set of images belonging to 10 classes. For this tutorial, we will also reduce the size of the training set to (a) speed up training and (b) mimic a lack of labeled data.

In [None]:
from keras.datasets import cifar10
(train_features, train_labels), (test_features, test_labels) = cifar10.load_data()

# only consider a small subset of the training data
n_instances_train = 1000
n_instances_test = 2000
train_features = train_features[:n_instances_train, :, :, :]
train_labels = train_labels[:n_instances_train]
test_features = test_features[:n_instances_test, :, :, :]
test_labels = test_labels[:n_instances_test]
num_classes = len(numpy.unique(train_labels))

print("Number of training instances: %i" % train_features.shape[0])
print("Number of testing instances: %i" % test_features.shape[0])
print("Number of classes: %i" % num_classes)

Next, let's plot one image instance per class!

In [None]:
fig = plt.figure(figsize=(15,5))
all_class_names = ['airplane','automobile','bird','cat','deer', 'dog','frog','horse','ship','truck']

for i in range(num_classes):
    
    # get first image of class i
    idx = numpy.where(train_labels == i)[0]
    all_images_class_i = train_features[idx]
    # channels-first (3, 32, 32) -> channels-last (32, 32, 3), as expected by imshow
    im = numpy.transpose(all_images_class_i[0], (1, 2, 0))
    
    # plot image
    ax = fig.add_subplot(1, 10, 1 + i, xticks=[], yticks=[])
    ax.set_title(all_class_names[i])
    plt.imshow(im)
    
plt.show()

For training CNNs with Keras, it makes sense to rescale the pixel values to [0,1], so that the default parameter values of the layers and the optimizer are reasonable choices. In addition, we need to convert each label to a binary class vector. For example, an instance with label 1 is mapped to the vector (0,1,0,0,0,0,0,0,0,0).
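
To make the two preprocessing steps concrete, here is a minimal NumPy-only sketch on toy data. The helper `to_one_hot` and the `toy_*` arrays are hypothetical names introduced for illustration; the notebook itself relies on `np_utils.to_categorical` for the label conversion.

```python
import numpy

def to_one_hot(labels, num_classes):
    """Map integer labels of shape (n,) or (n, 1) to binary class vectors."""
    labels = numpy.asarray(labels, dtype=int).ravel()
    one_hot = numpy.zeros((labels.shape[0], num_classes), dtype='float32')
    # set a single 1 per row, at the column given by the label
    one_hot[numpy.arange(labels.shape[0]), labels] = 1.0
    return one_hot

# hypothetical toy data: three labels and one 2x2 "image" with uint8 pixels
toy_labels = numpy.array([[1], [0], [9]])
toy_pixels = numpy.array([[0, 51], [102, 255]], dtype='uint8')

toy_one_hot = to_one_hot(toy_labels, 10)
toy_scaled = toy_pixels.astype('float32') / 255

print(toy_one_hot[0])  # the row for label 1 has its single 1 at index 1
print(toy_scaled)      # all values now lie in [0, 1]
```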

In [None]:
# rescale pixel values from [0,255] to [0,1]
train_features = train_features.astype('float32') / 255
test_features = test_features.astype('float32') / 255

# convert each class label to a binary vector
train_labels = np_utils.to_categorical(train_labels, num_classes)
test_labels = np_utils.to_categorical(test_labels, num_classes)

We can now define a convolutional neural network with Keras and train it on the original (non-augmented) training data.

In [None]:
model = Sequential()

model.add(Convolution2D(32, 3, 3, border_mode='same', input_shape=(3, 32, 32)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

model.add(Flatten())
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# keep the initial weights so that the model can later be retrained from scratch
original_weights_save = model.get_weights()

print("Training model ...")
model_info = model.fit(train_features, 
                       train_labels, 
                       batch_size=128, 
                       nb_epoch=150, 
                       validation_data = (test_features, test_labels), 
                       verbose=1)
# keep the trained weights so that training can later be continued from this point
trained_weights_save = model.get_weights()
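
As a rough sanity check of the architecture above, the following sketch traces the spatial size of the feature maps through the two convolution blocks. It assumes stride-1 convolutions and non-overlapping 2x2 pooling that drops any remainder, which matches the layer arguments used above. The helper names are hypothetical and not part of Keras.

```python
def conv_output_size(size, kernel, same_padding):
    """Spatial size after a stride-1 convolution."""
    return size if same_padding else size - kernel + 1

def pool_output_size(size, pool):
    """Spatial size after non-overlapping pooling (remainder is dropped)."""
    return size // pool

# (layer type, kernel or pool size, 'same' padding?) in the order used above
layers = [
    ('conv', 3, True),
    ('conv', 3, False),
    ('pool', 2, None),
    ('conv', 3, True),
    ('conv', 3, False),
    ('pool', 2, None),
]

size = 32  # CIFAR-10 images are 32x32 pixels
sizes = []
for kind, k, same_padding in layers:
    if kind == 'conv':
        size = conv_output_size(size, k, same_padding)
    else:
        size = pool_output_size(size, k)
    sizes.append(size)

n_flat = 64 * size * size            # 64 feature maps enter Flatten()
n_dense_params = n_flat * 256 + 256  # weights plus biases of Dense(256)

print(sizes)           # [32, 30, 15, 15, 13, 6]
print(n_flat)          # 2304
print(n_dense_params)  # 590080
```

Most of the parameters therefore sit in the first dense layer, which is one reason why such a network overfits easily on only 1000 training images.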

In [None]:
def plot_model_history(model_history):

    fig, axs = plt.subplots(1,2,figsize=(15,5))

    # summarize history for accuracy
    axs[0].plot(range(1,len(model_history.history['acc'])+1),model_history.history['acc'])
    axs[0].plot(range(1,len(model_history.history['val_acc'])+1),model_history.history['val_acc'])
    axs[0].set_title('Model Accuracy')
    axs[0].set_ylabel('Accuracy')
    axs[0].set_xlabel('Epoch')
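    # place roughly ten ticks along the epoch axis (the step belongs to arange, not set_xticks)
    n_epochs = len(model_history.history['acc'])
    axs[0].set_xticks(numpy.arange(1, n_epochs + 1, max(1, n_epochs // 10)))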
    axs[0].legend(['train', 'val'], loc='best')
    axs[0].set_ylim([0,1.0])

    # summarize history for loss
    axs[1].plot(range(1,len(model_history.history['loss'])+1),model_history.history['loss'])
    axs[1].plot(range(1,len(model_history.history['val_loss'])+1),model_history.history['val_loss'])
    axs[1].set_title('Model Loss')
    axs[1].set_ylabel('Loss')
    axs[1].set_xlabel('Epoch')
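    # place roughly ten ticks along the epoch axis (the step belongs to arange, not set_xticks)
    n_epochs = len(model_history.history['loss'])
    axs[1].set_xticks(numpy.arange(1, n_epochs + 1, max(1, n_epochs // 10)))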
    axs[1].legend(['train', 'val'], loc='best')
    axs[1].set_ylim([0,1.1*max(numpy.array(model_history.history['loss']).max(), numpy.array(model_history.history['val_loss']).max())])
    
    plt.show()

def accuracy(test_x, test_y, model):
    
    result = model.predict(test_x)
    predicted_class = numpy.argmax(result, axis=1)
    true_class = numpy.argmax(test_y, axis=1)
    num_correct = numpy.sum(predicted_class == true_class) 
    accuracy = float(num_correct)/result.shape[0]
    
    return (accuracy * 100)


# plot model history
plot_model_history(model_info)

# compute test accuracy
print("Accuracy on test data is: %0.2f" % accuracy(test_features, test_labels, model))

The network defined above achieves a test accuracy of about 45-46 percent. A common way to improve the performance of such networks is to augment the data, i.e., to artificially create "new" training examples by rotating, shifting, mirroring, or otherwise transforming the given training instances. Keras provides the ImageDataGenerator class, which simplifies this process a lot! Have a look at the [documentation](https://keras.io/preprocessing/image/) and modify the ImageDataGenerator cell below to improve the test accuracy!

Note: You should be able to rerun only the ImageDataGenerator cell below. Further, 100 epochs should be enough to get slightly better performance.
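
To illustrate what mirroring and shifting do to a single image, here is a small NumPy-only sketch. It is not how ImageDataGenerator is implemented. The functions `horizontal_flip`, `shift_right`, and `random_augment` are hypothetical helpers, and the edge handling only loosely mimics `fill_mode='nearest'`. Images are assumed to be in the channels-first layout used in this notebook.

```python
import numpy

def horizontal_flip(image):
    """Mirror a channels-first image (channels, height, width) left to right."""
    return image[:, :, ::-1]

def shift_right(image, pixels):
    """Shift an image right by `pixels` columns, repeating the left edge column."""
    if pixels <= 0:
        return image.copy()
    shifted = numpy.empty_like(image)
    shifted[:, :, pixels:] = image[:, :, :-pixels]
    shifted[:, :, :pixels] = image[:, :, :1]  # fill the gap with the edge column
    return shifted

def random_augment(image, rng):
    """Flip with probability 0.5, then shift right by 0, 1, or 2 pixels."""
    if rng.rand() < 0.5:
        image = horizontal_flip(image)
    return shift_right(image, rng.randint(0, 3))

# hypothetical toy image: 1 channel, 3 rows, 4 columns
toy_image = numpy.arange(12, dtype='float32').reshape(1, 3, 4)

print(horizontal_flip(toy_image)[0, 0])  # first row reversed
print(shift_right(toy_image, 1)[0, 0])   # first row shifted, edge value repeated

rng = numpy.random.RandomState(0)
augmented = random_augment(toy_image, rng)
print(augmented.shape)  # the shape is unchanged, so the label stays valid
```

Each call to `random_augment` yields a slightly different variant of the same image. The generator in the next cell applies the same idea on the fly to every batch, with rotations and zooms in addition.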

In [None]:
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
        rotation_range=10,
        width_shift_range=0.15,
        height_shift_range=0.15,
        zoom_range=0.15,
        horizontal_flip=True,
        vertical_flip=False,
        fill_mode='nearest',
        )

# option 1: continue training from the saved weights; fewer epochs are enough
model.set_weights(trained_weights_save)
nb_epoch = 100

# option 2: retrain from scratch; here, more epochs are needed!
# model.set_weights(original_weights_save)
# nb_epoch = 400

model_info = model.fit_generator(datagen.flow(train_features, train_labels, batch_size = 128),
                                 samples_per_epoch = train_features.shape[0], 
                                 nb_epoch = nb_epoch, 
                                 validation_data = (test_features, test_labels), verbose=1)


In [None]:
# plot model history
plot_model_history(model_info)

# compute test accuracy
print("Accuracy on test data is: %0.2f" % accuracy(test_features, test_labels, model))