# Traffic sign classification with CNNs

In this notebook, we'll train a convolutional neural network (CNN, ConvNet) to classify images of traffic signs from [The German Traffic Sign Recognition Benchmark](http://benchmark.ini.rub.de/?section=gtsrb&subsection=news) using TensorFlow 2.0 / Keras. This notebook is largely based on the blog post [Building powerful image classification models using very little data](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html) by François Chollet.

**Note that using a GPU with this notebook is highly recommended.**

First, the needed imports.

In [None]:
%matplotlib inline

import os, datetime
import random
import pathlib

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import applications, optimizers

from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.utils import plot_model

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
from PIL import Image

print('Using TensorFlow version:', tf.__version__,
      'Keras version:', tf.keras.__version__,
      'backend:', tf.keras.backend.backend())

## Data

The training dataset consists of 5535 images of traffic signs of varying size. There are 43 different types of traffic signs:

![title](imgs/gtsrb-montage.png)

The validation and test sets consist of 999 and 12630 images, respectively.

### Downloading the data

In [None]:
# Location of the dataset on the local file system; adjust to your environment
datapath = "/media/data/gtsrb/train-5535/"
# Expected number of images in each split
nimages = {'train': 5535, 'validation': 999, 'test': 12630}

### Image paths and labels

In [None]:
def get_paths(dataset):
    data_root = pathlib.Path(datapath+dataset)
    image_paths = list(data_root.glob('*/*'))
    image_paths = [str(path) for path in image_paths]
    image_count = len(image_paths)
    assert image_count == nimages[dataset], "Found {} images, expected {}".format(image_count, nimages[dataset])
    return image_paths

image_paths = dict()
image_paths['train'] = get_paths('train')
image_paths['validation'] = get_paths('validation')
image_paths['test'] = get_paths('test')

In [None]:
label_names = sorted(item.name for item in pathlib.Path(datapath+'train').glob('*/') if item.is_dir())
label_to_index = dict((name, index) for index,name in enumerate(label_names))

def get_labels(dataset):
    return [label_to_index[pathlib.Path(path).parent.name]
            for path in image_paths[dataset]]
    
image_labels = dict()
image_labels['train'] = get_labels('train')
image_labels['validation'] = get_labels('validation')
image_labels['test'] = get_labels('test')
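
The label mapping above relies on the directory layout: every class has its own sub-directory, and an image's label is the index of its parent directory name in the sorted list of class names. A minimal, dependency-free sketch of the same idea follows. The helper names `build_label_index` and `label_for` and the class and file names in the temporary directory tree are made up for illustration and are not part of the dataset.

```python
import pathlib
import tempfile

def build_label_index(train_root):
    """Map each class sub-directory name to an integer index, in sorted order."""
    names = sorted(p.name for p in pathlib.Path(train_root).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(names)}

def label_for(path, label_to_index):
    """The label of an image is the index of its parent directory's name."""
    return label_to_index[pathlib.Path(path).parent.name]

# Build a tiny, hypothetical directory tree: <root>/train/<class>/<file>
with tempfile.TemporaryDirectory() as tmp:
    train_root = pathlib.Path(tmp) / "train"
    for cls, fname in [("00014", "a.ppm"), ("00001", "b.ppm"), ("00014", "c.ppm")]:
        class_dir = train_root / cls
        class_dir.mkdir(parents=True, exist_ok=True)
        (class_dir / fname).touch()

    demo_index = build_label_index(train_root)
    demo_paths = sorted(str(p) for p in train_root.glob("*/*"))
    demo_labels = [label_for(p, demo_index) for p in demo_paths]

print(demo_index)    # {'00001': 0, '00014': 1}
print(demo_labels)   # [0, 1, 1]
```

Because the indices come from the sorted training directory names, the same mapping is reused for the validation and test splits, so a given class always gets the same integer label.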

### Data loading

We now define a function to load the images. The images are stored in PPM format, so we read them with the PIL library. Since the images vary in size, we also resize each one to a fixed size (`INPUT_IMAGE_SIZE`).

In [None]:
INPUT_IMAGE_SIZE = [80, 80]

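def _load_image(path, label):
    # `path` arrives as a scalar string tensor; decode it to a str for PIL
    image = Image.open(path.numpy().decode('utf-8'))
    # Return float32 pixels to match the dtype declared in tf.py_function
    return np.array(image, dtype=np.float32), label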

def load_image(path, label):
    image, label = tf.py_function(_load_image, (path, label), (tf.float32, tf.int32))
    image.set_shape([None, None, None])
    label.set_shape([])
    return tf.image.resize(image, INPUT_IMAGE_SIZE), label

### TF Datasets

Let's now define our [TF `Dataset`s](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/data/Dataset#class_dataset) for training, validation, and test data. 

In [None]:
BATCH_SIZE = 50

train_dataset = tf.data.Dataset.from_tensor_slices((image_paths['train'],
                                                    image_labels['train']))
train_dataset = train_dataset.map(load_image, num_parallel_calls=10)
train_dataset = train_dataset.shuffle(2000).batch(BATCH_SIZE, drop_remainder=True)
train_dataset = train_dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)

validation_dataset = tf.data.Dataset.from_tensor_slices((image_paths['validation'],
                                                         image_labels['validation']))
validation_dataset = validation_dataset.map(load_image, num_parallel_calls=10)
validation_dataset = validation_dataset.batch(BATCH_SIZE, drop_remainder=True)
validation_dataset = validation_dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)

test_dataset = tf.data.Dataset.from_tensor_slices((image_paths['test'],
                                                   image_labels['test']))
test_dataset = test_dataset.map(load_image, num_parallel_calls=10)
test_dataset = test_dataset.batch(BATCH_SIZE, drop_remainder=False)
test_dataset = test_dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
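
The `drop_remainder` argument of `batch()` decides whether the last, incomplete batch is kept or discarded. The small sketch below works out what that means for the three split sizes quoted in this notebook and a batch size of 50. The helper names `num_batches` and `num_examples_seen` are our own and not part of TensorFlow.

```python
import math

def num_batches(num_examples, batch_size, drop_remainder):
    """Number of batches produced by one pass over the data."""
    if drop_remainder:
        return num_examples // batch_size
    return math.ceil(num_examples / batch_size)

def num_examples_seen(num_examples, batch_size, drop_remainder):
    """Number of examples that actually end up in some batch."""
    if drop_remainder:
        return (num_examples // batch_size) * batch_size
    return num_examples

sizes = {"train": 5535, "validation": 999, "test": 12630}
for name, n in sizes.items():
    for drop in (True, False):
        print(f"{name:10s} drop_remainder={drop!s:5s} "
              f"batches={num_batches(n, 50, drop):4d} "
              f"examples={num_examples_seen(n, 50, drop):6d}")
```

With `drop_remainder=True`, 35 of the 5535 training images and 49 of the 999 validation images are left out of each pass. For a shuffled dataset the skipped images differ between epochs, whereas for an unshuffled dataset it is always the same ones. With `drop_remainder=False`, every image is used and the final batch is simply smaller.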

Let's look at nine of our training images:

In [None]:
plt.figure(figsize=(10,10))
for batch, labels in train_dataset.take(1):
    for i in range(9):    
        plt.subplot(3,3,i+1)
        plt.imshow(tf.cast(batch[i,:,:,:], tf.int32))
        plt.title(label_names[labels[i]])
        plt.grid(False)
        plt.xticks([])
        plt.yticks([])
    plt.suptitle('some training images', fontsize=16, y=0.93)

## Option 1: Train a small CNN from scratch

As with the MNIST digits, we can start from scratch and train a CNN for the classification task.

However, with so few training images, a large network will easily overfit.
Therefore, to make the most of our limited number of training examples, we apply random augmentation transformations (a small random crop and a contrast adjustment) to the images each time we loop over them. This way, we "augment" our training dataset with new variations of the original images.

The augmentation transformations are implemented as preprocessing layers in Keras.
Various such layers are readily available; see [https://keras.io/guides/preprocessing_layers/](https://keras.io/guides/preprocessing_layers/) for more information.
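
To make the two transformations concrete, here is a rough, dependency-free sketch of what a random crop and a random contrast adjustment do to a single image. It is not the Keras implementation: it works on a toy grayscale image stored as nested lists, whereas the Keras layers operate on batches of RGB tensors, adjust contrast per channel, and only apply the random transformations during training. The function names and the toy image are assumptions made for illustration.

```python
import random

def random_crop(image, out_h, out_w, rng):
    """Cut a randomly placed out_h x out_w window out of a 2-D image."""
    top = rng.randint(0, len(image) - out_h)       # inclusive bounds
    left = rng.randint(0, len(image[0]) - out_w)
    return [row[left:left + out_w] for row in image[top:top + out_h]]

def random_contrast(image, max_delta, rng):
    """Scale each pixel's distance from the image mean by a random factor
    drawn from [1 - max_delta, 1 + max_delta]."""
    factor = rng.uniform(1.0 - max_delta, 1.0 + max_delta)
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [[(p - mean) * factor + mean for p in row] for row in image]

rng = random.Random(0)

# Toy 80x80 grayscale "image" with values in [0, 1]
image = [[((r * 80 + c) % 256) / 255.0 for c in range(80)] for r in range(80)]

cropped = random_crop(image, 75, 75, rng)
augmented = random_contrast(cropped, 0.1, rng)
print(len(augmented), "x", len(augmented[0]))   # 75 x 75
```

Each call draws a new crop offset and a new contrast factor, so the network sees a slightly different version of every image in every epoch. The contrast adjustment leaves the mean brightness unchanged and only stretches or compresses the pixel values around it.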

### Initialization

In [None]:
inputs = keras.Input(shape=INPUT_IMAGE_SIZE+[3])
x = layers.Rescaling(scale=1./255)(inputs)

x = layers.RandomCrop(75, 75)(x)
x = layers.RandomContrast(0.1)(x)

x = layers.Conv2D(32, (3, 3), activation='relu')(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)

x = layers.Conv2D(32, (3, 3), activation='relu')(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)

x = layers.Conv2D(64, (3, 3), activation='relu')(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)

x = layers.Flatten()(x)
x = layers.Dense(128, activation='relu')(x)
x = layers.Dropout(0.5)(x)

outputs = layers.Dense(43, activation='softmax')(x)

model = keras.Model(inputs=inputs, outputs=outputs,
                    name="gtsrb-small-cnn")

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.summary()

In [None]:
plot_model(model, 'tf2-gtsrb-small-cnn.png', show_shapes=True)
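
The shapes and parameter counts reported for this model can be cross-checked by hand. The sketch below assumes the default Keras settings used above (unpadded `'valid'` 3x3 convolutions with stride 1, and 2x2 max pooling) and a 75x75x3 input after the random crop. The helper functions are our own and only do the arithmetic.

```python
def conv_valid(size, kernel=3):
    """Spatial size after an unpadded ('valid'), stride-1 convolution."""
    return size - kernel + 1

def pool(size, window=2):
    """Spatial size after non-overlapping max pooling (rounds down)."""
    return size // window

def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

size, channels, total = 75, 3, 0          # 75x75x3 after the random crop
for filters in (32, 32, 64):
    total += conv_params(3, channels, filters)
    size = pool(conv_valid(size))
    channels = filters
    print(f"after conv+pool with {filters} filters: {size}x{size}x{channels}")

flat = size * size * channels
total += dense_params(flat, 128) + dense_params(128, 43)
print("flattened features:", flat)        # 3136
print("trainable parameters:", total)     # 435723
```

Most of the parameters sit in the first dense layer (3136 x 128 weights), which is one reason dropout is applied right after it.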

### Learning

We'll use TensorBoard to visualize our progress during training.

In [None]:
logdir = os.path.join(os.getcwd(), "logs",
                      "gtsrb-small-cnn-"+datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
print('TensorBoard log directory:', logdir)
os.makedirs(logdir)
callbacks = [TensorBoard(log_dir=logdir)]

In [None]:
%%time

epochs = 20 

history = model.fit(train_dataset, epochs=epochs,
                    validation_data=validation_dataset,
                    callbacks=callbacks, verbose=2)

model.save("gtsrb-small-cnn.h5")

In [None]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10,3))

ax1.plot(history.epoch,history.history['loss'], label='training')
ax1.plot(history.epoch,history.history['val_loss'], label='validation')
ax1.set_title('loss')
ax1.set_xlabel('epoch')
ax1.legend(loc='best')

ax2.plot(history.epoch,history.history['accuracy'], label='training')
ax2.plot(history.epoch,history.history['val_accuracy'], label='validation')
ax2.set_title('accuracy')
ax2.set_xlabel('epoch')
ax2.legend(loc='best');

### Inference

In [None]:
%%time
scores = model.evaluate(test_dataset, verbose=2)
print("Test set %s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

## Option 2: Reuse a pre-trained CNN

Another option is to reuse a pre-trained network. Here we'll use the [VGG16](https://keras.io/applications/#vgg16) network architecture with weights learned on ImageNet.

### Initialization

We load the pre-trained VGG16 network without its top (fully connected) layers and freeze the pre-trained weights.

In [None]:
inputs = keras.Input(shape=INPUT_IMAGE_SIZE+[3])
x = layers.Rescaling(scale=1./255)(inputs)

x = layers.RandomCrop(75, 75)(x)
x = layers.RandomContrast(0.1)(x)

pt_model = applications.VGG16(weights='imagenet', include_top=False,      
                              input_tensor=x)
for layer in pt_model.layers:
    layer.trainable = False

We then stack our own, randomly initialized layers on top of the VGG16 network.

In [None]:
x = layers.Flatten()(pt_model.output)
x = layers.Dense(64, activation='relu')(x)

outputs = layers.Dense(43, activation='softmax')(x)

model = keras.Model(inputs=inputs, outputs=outputs,
                    name="gtsrb-vgg16-reuse")

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.summary()

In [None]:
plot_model(model, 'tf2-gtsrb-vgg16-reuse.png', show_shapes=True)
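
To see how little of this model is actually trained at this stage, we can redo the bookkeeping by hand. The sketch below assumes the standard VGG16 convolutional base (five blocks of 3x3 `'same'`-padded convolutions, each block followed by 2x2 max pooling) and a 75x75x3 input after the random crop. The block table and helper functions are written out here for illustration and are not read from the Keras model.

```python
# (number of 3x3 conv layers, filters) for each VGG16 block
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def conv_params(in_ch, out_ch, kernel=3):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

size, channels, frozen = 75, 3, 0
for n_convs, filters in VGG16_BLOCKS:
    for _ in range(n_convs):
        frozen += conv_params(channels, filters)   # 'same' padding keeps size
        channels = filters
    size //= 2                                     # 2x2 max pooling

flat = size * size * channels
trainable = dense_params(flat, 64) + dense_params(64, 43)

print(f"VGG16 output: {size}x{size}x{channels}")   # 2x2x512
print("frozen parameters:", frozen)                # 14714688
print("trainable parameters:", trainable)          # 133931
```

Under these assumptions, fewer than 1% of the parameters are updated while only the new layers are trained, which keeps the risk of overfitting the small training set low.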

### Learning 1: New layers

In [None]:
logdir = os.path.join(os.getcwd(), "logs",
                      "gtsrb-vgg16-"+datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
print('TensorBoard log directory:', logdir)
os.makedirs(logdir)
callbacks_pretrained = [TensorBoard(log_dir=logdir)]

In [None]:
%%time

epochs = 20 

history = model.fit(train_dataset, epochs=epochs,
                    validation_data=validation_dataset,
                    verbose=2, callbacks=callbacks_pretrained)

model.save("gtsrb-vgg16-reuse.h5")

In [None]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10,3))

ax1.plot(history.epoch,history.history['loss'], label='training')
ax1.plot(history.epoch,history.history['val_loss'], label='validation')
ax1.set_title('loss')
ax1.set_xlabel('epoch')
ax1.legend(loc='best')

ax2.plot(history.epoch,history.history['accuracy'], label='training')
ax2.plot(history.epoch,history.history['val_accuracy'], label='validation')
ax2.set_title('accuracy')
ax2.set_xlabel('epoch')
ax2.legend(loc='best');

#### Inference

In [None]:
%%time

scores = model.evaluate(test_dataset, verbose=2)
print("Test set %s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

### Learning 2: Fine-tuning

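Once the top layers have learned some reasonable weights, we can continue training by unfreezing the last convolutional block of VGG16 (`block5`) so that it can adapt to our data. We use a much smaller learning rate than before (1e-5 instead of the Keras default of 1e-3 for RMSprop) so that the updates do not destroy the pre-trained features.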

In [None]:
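# Keep all layers before block5 frozen; unfreeze block5_conv1 and
# every layer that comes after it (including our own top layers)
train_layer = False
for layer in model.layers:
    if layer.name == "block5_conv1":
        train_layer = True
    layer.trainable = train_layer
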
for i, layer in enumerate(model.layers):
    print(i, layer.name, "trainable:", layer.trainable)
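
The loop above relies on the order of `model.layers`: a flag is switched on at the first layer of `block5` and stays on for everything after it. The same logic can be tried out on a plain list of names, as in the sketch below. The VGG16 block names follow the usual Keras naming, but the names of the preprocessing and top layers are placeholders, since Keras generates those automatically and may choose different ones.

```python
def trainable_flags(layer_names, first_trainable):
    """Return {name: bool}; layers before `first_trainable` stay frozen."""
    flags, unfreeze = {}, False
    for name in layer_names:
        if name == first_trainable:
            unfreeze = True
        flags[name] = unfreeze
    return flags

# Assumed layer order: preprocessing layers, VGG16 blocks, then our own head
convs_per_block = [2, 2, 3, 3, 3]
layer_names = ["input", "rescaling", "random_crop", "random_contrast"]
for block, n_convs in enumerate(convs_per_block, start=1):
    layer_names += [f"block{block}_conv{i}" for i in range(1, n_convs + 1)]
    layer_names.append(f"block{block}_pool")
layer_names += ["flatten", "dense", "dense_1"]

flags = trainable_flags(layer_names, "block5_conv1")
unfrozen = [name for name, on in flags.items() if on]
print(unfrozen)

# A name that matches no layer silently leaves every layer frozen
all_frozen = not any(trainable_flags(layer_names, "block6_conv1").values())
print("no match keeps everything frozen:", all_frozen)
```

Printing the flags, as the notebook does, is therefore a useful sanity check: a typo in the layer name would not raise an error, it would just leave the whole network frozen. In Keras, a change to `trainable` takes effect once the model is compiled again.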

In [None]:
model.compile(loss='sparse_categorical_crossentropy',
    optimizer=optimizers.RMSprop(learning_rate=1e-5),
    metrics=['accuracy'])

In [None]:
logdir = os.path.join(os.getcwd(), "logs",
                      "gtsrb-vgg16-finetune-"+datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
print('TensorBoard log directory:', logdir)
os.makedirs(logdir)
callbacks_pretrained = [TensorBoard(log_dir=logdir)]

In [None]:
%%time

epochs = 20 

history = model.fit(train_dataset, epochs=epochs,
                    validation_data=validation_dataset,
                    verbose=2, callbacks=callbacks_pretrained)

model.save("gtsrb-vgg16-finetune.h5")

In [None]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10,3))

ax1.plot(history.epoch,history.history['loss'], label='training')
ax1.plot(history.epoch,history.history['val_loss'], label='validation')
ax1.set_title('loss')
ax1.set_xlabel('epoch')
ax1.legend(loc='best')

ax2.plot(history.epoch,history.history['accuracy'], label='training')
ax2.plot(history.epoch,history.history['val_accuracy'], label='validation')
ax2.set_title('accuracy')
ax2.set_xlabel('epoch')
ax2.legend(loc='best');

#### Inference

In [None]:
%%time
scores = model.evaluate(test_dataset, verbose=2)
print("Test set %s: %.2f%%" % (model.metrics_names[1], scores[1]*100))