# Traffic sign classification with Convolutional Neural Networks

In this notebook, you will tackle the real-world problem of automatically classifying traffic signs. You will learn advanced techniques, such as image augmentation and learning rate scheduling, and the exercise will be a good benchmark to assess your understanding of Convolutional Neural Networks.

# 1. Setting up the environment
For simplicity, we import almost all the packages you need in this first code block. Note that we define a `data_format` string that holds the data format (or image ordering) used by Keras. We will use it to load the appropriate pre-defined dataset.

In [None]:
import numpy as np

from sklearn.model_selection import train_test_split

import h5py

from keras.models import Sequential, model_from_json
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D

from keras.optimizers import SGD
from keras.utils import np_utils
from keras.callbacks import LearningRateScheduler, ModelCheckpoint
from keras import backend as K
K.set_image_data_format('channels_first')

from matplotlib import pyplot as plt
%matplotlib inline

NUM_CLASSES = 43
IMG_SIZE = 48

if K.image_data_format() == 'channels_first':
    data_format = 'NCWH'
else:
    data_format = 'NWHC'

# 2. The data: GTSRB dataset

From [GTSRB's website](benchmark.ini.rub.de/?section=gtsrb):
>_The German Traffic Sign Benchmark is a multi-class, single-image classification challenge held at the International Joint Conference on Neural Networks (IJCNN) 2011. We cordially invite researchers from relevant fields to participate: The competition is designed to allow for participation without special domain knowledge. Our benchmark has the following properties:_
- _Single-image, multi-class classification problem_
- _More than 40 classes_
- _More than 50,000 images in total_
- _Large, lifelike database_

We will use a pre-processed version of the dataset, which we stored on scratch in channels-first (NCHW) format. The data is stored as an HDF5 file, so we will use `h5py` to load it. We then split the data into training and validation sets, using 20% of the original dataset for validation.

In [None]:
train_filename = 'scratch/data/X_'+data_format+'.h5'

# Open the file read-only: we only load the arrays, we never modify them
with h5py.File(train_filename, 'r') as hf:
    X, Y = hf['imgs'][:], hf['labels'][:]
print("Loaded images from {}".format(train_filename))

X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2, random_state=42)

We also have a separate file for the test dataset, which we load next.

In [None]:
test_filename = 'scratch/data/X_'+data_format+'_test.h5'

# Open the file read-only: we only load the arrays, we never modify them
with h5py.File(test_filename, 'r') as hf:
    X_test, Y_test = hf['imgs'][:], hf['labels'][:]
print("Loaded images from {}".format(test_filename))

Explore the dataset and extract some statistics, such as the number of images. Check that `X_train`, `X_val`, `X_test` and their labels look as you expect!
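
As a rough illustration of the kind of statistics meant here, the sketch below inspects stand-in arrays rather than the real data. The names `X_demo` and `Y_demo`, the channels-first 48x48 RGB layout, and the one-hot label encoding are all assumptions made for the example; check what the loaded arrays actually contain.

```python
import numpy as np

# Stand-in arrays (assumptions, not the real GTSRB data): 100 channels-first
# RGB images of 48x48 pixels with one-hot labels over 43 classes.
rng = np.random.default_rng(0)
X_demo = rng.random((100, 3, 48, 48))
Y_demo = np.eye(43)[rng.integers(0, 43, size=100)]

n_demo_samples = X_demo.shape[0]      # number of images
demo_input_shape = X_demo.shape[1:]   # shape of a single image
demo_output_dim = Y_demo.shape[1]     # width of the one-hot label vectors
samples_per_class = Y_demo.sum(axis=0).astype(int)

print(n_demo_samples, demo_input_shape, demo_output_dim)
print("value range:", X_demo.min(), X_demo.max())
print("largest class has", samples_per_class.max(), "samples")
```

If the labels turn out to be integer class indices instead of one-hot vectors, `np.bincount` would be the analogous way to count samples per class.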

In [None]:
# TODO assign the right values to these variables
n_train_samples = 
input_shape     = 
output_dim      = 

print("{} train samples".format(n_train_samples))
print("Input dimension: {} ".format(input_shape))
print("Output dimension: {}".format(output_dim))

# 3. Preprocessing the data

Nothing to do here! The data has already been prepared for us. If you are interested, here is the function which was used to prepare the data:

In [None]:
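# The helpers below come from scikit-image
from skimage import color, exposure, transform

def preprocess_img(img):
    # Histogram equalization of the V (brightness) channel in HSV space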
    hsv = color.rgb2hsv(img)
    hsv[:,:,2] = exposure.equalize_hist(hsv[:,:,2])
    img = color.hsv2rgb(hsv)

    # central crop
    min_side = min(img.shape[:-1])
    centre = img.shape[0]//2, img.shape[1]//2
    img = img[centre[0]-min_side//2:centre[0]+min_side//2,
              centre[1]-min_side//2:centre[1]+min_side//2,
              :]

    # rescale to standard size
    img = transform.resize(img, (IMG_SIZE, IMG_SIZE))

    # roll color axis to axis 0
    if data_format == 'NCWH':
        img = np.rollaxis(img,-1)

    return img

# 4. Visualizing the data

Go ahead and explore the dataset a little further! Use `plt.imshow` to show some samples. Be careful: the function expects each image in channels-last (height, width, channels) order, called `NWHC` in this notebook, so you will probably have to use `np.transpose` ([documentation](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.transpose.html)).
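
The axis reordering itself can be sketched on a small stand-in batch. The array `batch_cf` and its contents are hypothetical; only the `np.transpose` calls carry over to the real data, and only if that data is stored channels-first.

```python
import numpy as np

# Hypothetical stand-in batch: 4 channels-first images (N, C, H, W).
batch_cf = np.zeros((4, 3, 48, 48), dtype=np.float32)
batch_cf[:, 0] = 1.0  # fill the first channel so the result is easy to check

# Move the channel axis to the end: (N, C, H, W) -> (N, H, W, C).
batch_cl = np.transpose(batch_cf, (0, 2, 3, 1))

# A single image works the same way with one axis fewer: (C, H, W) -> (H, W, C).
img_cl = np.transpose(batch_cf[0], (1, 2, 0))

print(batch_cl.shape, img_cl.shape)
```

An array shaped like `img_cl` is what `plt.imshow` accepts for a color image.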

In [None]:
plt.figure(figsize=(10,10))

#TODO use plt.imshow to display one or more samples from the datasets!

plt.axis('off');

# 5. Applying Neural Networks to the problem

We need to classify images, so Convolutional Neural Networks are the natural starting point. This time, a starting CNN is given to you. Be careful: it expects the variable `input_shape` to hold the shape of the input images (you should have defined this above), and it sets `output_dim`, the dimension of the network output, to `NUM_CLASSES`. Now take time to analyze the CNN and write down the output dimensions and the number of parameters of each layer (take a piece of paper, or open a separate text file in this lab). Once you are done, add the classic call to `model.summary()` and see if it matches your expectations!

As you can see, we created a function that returns the CNN. This will be helpful later, when we change the training strategy and add data augmentation without touching the network topology!
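
The per-layer bookkeeping can be sketched with a few helper functions. The helper names are hypothetical, and the worked numbers assume 3-channel 48x48 inputs, stride 1 convolutions, and Keras' default `'valid'` padding wherever `padding='same'` is not given. Only the first block of the network is worked through, so the remaining layers are still yours to compute.

```python
def conv2d_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter for a square-kernel 2D convolution."""
    return (kernel * kernel * in_channels + 1) * filters

def conv2d_out_size(size, kernel=3, padding="same"):
    """Spatial output size for stride 1: unchanged for 'same', shrunk for 'valid'."""
    return size if padding == "same" else size - kernel + 1

def pool_out_size(size, pool=2):
    """Spatial output size of non-overlapping max pooling (floor division)."""
    return size // pool

# First block of the network, assuming 3-channel 48x48 inputs.
size = conv2d_out_size(48, padding="same")       # 48
p1 = conv2d_params(in_channels=3, filters=32)    # 896
size = conv2d_out_size(size, padding="valid")    # 46
p2 = conv2d_params(in_channels=32, filters=32)   # 9248
size = pool_out_size(size)                       # 23
print(p1, p2, size)
```

Dropout and pooling layers add no parameters, so only the convolutional and dense layers contribute to the total.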

In [None]:
def cnn_model():
    model = Sequential()
    
    output_dim = NUM_CLASSES
        
    model.add(Conv2D(32, (3, 3), padding='same',
                     input_shape=input_shape,
                     activation='relu'))
    model.add(Conv2D(32, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))

    model.add(Conv2D(64, (3, 3), padding='same',
                     activation='relu'))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))

    model.add(Conv2D(128, (3, 3), padding='same',
                     activation='relu'))
    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))

    model.add(Flatten())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(output_dim, activation='softmax'))
    return model

model = cnn_model()

#TODO: call summary() only after you've written down parameter count and output shape for each layer!

Time to compile the network! We start with a simple SGD optimizer, which we initialize with non-default parameters.

In [None]:
# let's train the model using SGD + Nesterov momentum
lr = 0.01
sgd = SGD(lr=lr, decay=1e-6, momentum=0.9, nesterov=True)
# TODO compile the model, using the optimizer we just defined, with optimizer=sgd

There is still one thing to do before training the network! We define an `lr_schedule` function, which returns a learning rate based on the epoch and which Keras calls at the beginning of each epoch during training. As defined now, it just returns a constant learning rate, which would not lead to good accuracy. Modify the function so that it multiplies the initial learning rate `lr` by a factor of $0.1^{\lfloor \text{epoch}/10 \rfloor}$, where $\lfloor \cdot \rfloor$ is the floor operator, i.e. rounding down to the nearest integer.
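
To see what this formula does, here is a standalone sketch of one possible step-decay function. The name `step_decay` and its keyword arguments are hypothetical and not part of the notebook; `base_lr = 0.01` is assumed only to mirror the initial learning rate used above.

```python
import math

base_lr = 0.01  # assumed initial learning rate for this illustration

def step_decay(epoch, initial_lr=base_lr, drop=0.1, epochs_per_drop=10):
    """Return initial_lr * drop ** floor(epoch / epochs_per_drop)."""
    return initial_lr * drop ** math.floor(epoch / epochs_per_drop)

# The rate stays flat within each group of 10 epochs and drops tenfold between groups.
for epoch in (0, 9, 10, 19, 20):
    print(epoch, step_decay(epoch))
```

Note that the exponent depends on the current `epoch` passed in by Keras, not on the total number of epochs.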

In [None]:
batch_size = 32
epochs = 20

def lr_schedule(epoch):
    # TODO reduce the learning rate by a factor of 10 every 10 epochs
    return lr

All callbacks are passed to the `fit` function as a list of callback objects. In our case, we create an object of type `LearningRateScheduler`, which takes as its argument the function we just defined. That's all!

Let's start the training!

In [None]:
history_callback = model.fit(X_train, Y_train,
                             batch_size=batch_size,
                             epochs=epochs,
                             validation_data=(X_val, Y_val),
                             callbacks=[LearningRateScheduler(lr_schedule)])

OK, those are very high numbers, aren't they? What about the test accuracy? Go ahead and check it!

In [None]:
#TODO use model.evaluate with the test dataset to obtain test accuracy
score = 

print("Test accuracy = {}".format(score[1]))

This accuracy should be lower than the training and validation ones. Can we do better? The dataset is small: even if we keep training, we will not have enough samples to achieve better generalization. So we turn to something different: data augmentation!

# 6. Data augmentation in Keras

In Keras, we can use `ImageDataGenerator`, which takes an array of images and applies random transformations to them. We can define some parameters for such transformations. [Look at the documentation](https://keras.io/preprocessing/image/), and understand what type of modifications we are introducing in the code block below, and why we need to call `datagen.fit`!

In [None]:
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(featurewise_center=False, 
                            featurewise_std_normalization=False, 
                            width_shift_range=0.1,
                            height_shift_range=0.1,
                            zoom_range=0.2,
                            shear_range=0.1,
                            rotation_range=10.,)

datagen.fit(X_train)

`ImageDataGenerator` does not only modify the images. It also acts as a generator, which yields batches of samples which can be consumed by the training functions. In the code block below, we use the function `flow` to get the first batch (properly transposed, so that we can easily display it below).
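
The generator idea can be illustrated without Keras. The sketch below is not how `ImageDataGenerator` is implemented: the function `shifted_batches` is hypothetical, it only applies random wrap-around shifts, and it assumes channels-first images. It does show the two key properties, namely that batches are produced lazily and that each pass over the data sees freshly randomized copies.

```python
import numpy as np

def shifted_batches(images, batch_size, max_shift=2, seed=0):
    """Yield batches forever, each image wrapped around by a random (dy, dx) shift.

    Illustration only: ImageDataGenerator applies richer affine transforms
    and fills the borders instead of wrapping pixels around.
    """
    rng = np.random.default_rng(seed)
    while True:
        for start in range(0, len(images), batch_size):
            # Copy so that the original images are never modified.
            batch = images[start:start + batch_size].copy()
            for i in range(len(batch)):
                dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
                # Axes -2 and -1 are height and width for channels-first images.
                batch[i] = np.roll(batch[i], (dy, dx), axis=(-2, -1))
            yield batch

demo_images = np.arange(8 * 3 * 6 * 6, dtype=np.float32).reshape(8, 3, 6, 6)
gen = shifted_batches(demo_images, batch_size=4)
first, second, third = next(gen), next(gen), next(gen)
print(first.shape, third.shape)
```

Here `third` is built from the same four source images as `first`, but with newly drawn shifts, which mirrors what happens from one training epoch to the next.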

In [None]:
if data_format == 'NCWH':
    X_aug = np.transpose(datagen.flow(X_train, batch_size=32, shuffle=False)[0][0:5], (0, 2, 3, 1))
else:
    X_aug = datagen.flow(X_train, batch_size=32, shuffle=False)[0][0:5]

Currently, `X_aug` holds the augmented versions of the first `5` training images. Plot the images and their augmented counterparts side by side and see what types of transformations have been applied.

In [None]:
plt.figure(figsize=(10,25))
for i in range(5):
    plt.subplot(5,2,2*i+1)
    # TODO use imshow to display the i-th original image (in NWHC format!)
    
    plt.axis('off')
    plt.subplot(5,2,2*i+2)
    # TODO use imshow to display the augmented image
    
    plt.axis('off')

Now go up by two code blocks, run the augmentation again and plot the resulting images. What happened? 

The augmented images should be different from the previous ones, because the generator has applied a new set of random transformations to them. During training, the images will be slightly different at every epoch, which helps the network focus on relevant features (and generalize much better). Let's do the training again and see what happens! First we re-initialize the model weights by re-instantiating and compiling the model.

In [None]:
# Re-initialize the model

model = cnn_model()

# TODO compile the model again, using the previously defined sgd optimizer


Next, we start the training. Instead of the function `fit`, which would feed the network samples taken from the original image tensors, we use `fit_generator`, which takes a generator as argument. Obviously, we will use our image augmenting generator. Let's see what happens.

In [None]:
aug_history_callback = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                            epochs=epochs,
                            validation_data=(X_val, Y_val),
                            callbacks=[LearningRateScheduler(lr_schedule)])

Again, get the test accuracy!

In [None]:
#TODO use model.evaluate with the test dataset to obtain test accuracy
score = 

print("Test accuracy = {}".format(score[1]))

You should now have reached a higher test accuracy, congratulations! Plot convergence data from `history_callback` and `aug_history_callback` and see how the two training curves differ... can you understand why?

In [None]:
# TODO plot training curves!

Now you can play around with the network, or adapt the code from the previous notebook to visualize the activations of the 2D layers!