# The German Traffic Sign Recognition Benchmark

For this assignment we'll work with the German Traffic Sign Recognition Benchmark (GTSRB), a benchmark for classifying images of different types of traffic signs, such as a stop sign. This kind of classification is useful for autonomous vehicles, for example so they know to come to a full stop at certain intersections. You can download the training data yourself [here](https://sid.erda.dk/public/archives/daaeac0d7ce1152aea9b61d9f1e19370/GTSRB-Training_fixed.zip).

Unzip the file and take a look at the images in the different folders. Also read the *Readme.txt* and make sure you understand the structure of the classification task before starting. The cell below downloads the training and testing data for this benchmark onto this machine.

*Note:* Remember to run this notebook on [Kaggle](https://www.kaggle.com/) or [Google Colab](https://colab.research.google.com) with GPU hardware acceleration enabled! This will make training your network *much* faster.

In [None]:
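# Download and unpack the training set, unless it is already present
![ ! -d 'GTSRB/Training' ] && curl -O https://sid.erda.dk/public/archives/daaeac0d7ce1152aea9b61d9f1e19370/GTSRB-Training_fixed.zip
![ ! -d 'GTSRB/Training' ] && unzip -q GTSRB-Training_fixed.zip

# Download and unpack the sorted test set, unless it is already present
![ ! -d 'GTSRB/Online-Test-sort' ] && curl -O https://sid.erda.dk/public/archives/daaeac0d7ce1152aea9b61d9f1e19370/GTSRB_Online-Test-Images-Sorted.zip
![ ! -d 'GTSRB/Online-Test-sort' ] && unzip -q GTSRB_Online-Test-Images-Sorted.zip

# Remove the Images folder; the loader below expects only numerically named class folders
!rm -rf GTSRB/Online-Test-sort/Images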

Below is some code we've already provided to load the data. You are not required to understand exactly what the functions in this cell do. They mostly deal with the slightly more complicated loading of the images, which are spread over different folders and stored in the `.ppm` format. Additionally, the code resizes all images to a standard *32 x 32* size, so they all fit a neural network with the same input layer size.

In [None]:
import os
import cv2
import numpy as np

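def build_image_path_list(data_dir):
    """Return (image_path, class_label) pairs for every .ppm file under data_dir."""
    image_path_list = []
    # Skip the first os.walk entry, which is data_dir itself; for every other
    # folder, the last path component is parsed as the integer class label
    for root, dirs, files in list(os.walk(data_dir))[1:]:
        image_path_list.extend([(os.path.join(root, f), int(root.rsplit(os.sep, 1)[1]))
                                for f in files if f.endswith('.ppm')])
    return image_path_list

def load_data(data_dir, size=32):
    """Load all images under data_dir, resized to size x size, with their labels."""
    image_list, target_list = [], []

    for image_path, target in build_image_path_list(data_dir):
        # cv2.imread returns a uint8 array with channels in BGR order
        image_list.append(cv2.resize(cv2.imread(image_path), (size, size)))
        target_list.append(target)

    return (np.array(image_list), np.array(target_list))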

train_images, train_labels = load_data(os.path.join('GTSRB', 'Training'))
test_images, test_labels = load_data(os.path.join('GTSRB', 'Online-Test-sort'))

print(f'Training images loaded: {train_images.shape}')
print(f'Training labels loaded: {train_labels.shape}')
print(f'Testing images loaded: {test_images.shape}')
print(f'Testing labels loaded: {test_labels.shape}')
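
The arrays returned by `load_data` hold raw pixel values, so a little preprocessing is usually applied before training. The following is a minimal sketch, not part of the provided code: the helper name `preprocess` is hypothetical, and random arrays stand in for the real images so the snippet runs on its own. It relies on `cv2.imread` returning `uint8` pixels in BGR channel order.

```python
import numpy as np

def preprocess(images):
    """Convert BGR uint8 images to RGB float32 images scaled to [0, 1]."""
    rgb = images[..., ::-1]              # reverse the channel axis: BGR -> RGB
    return rgb.astype(np.float32) / 255.0

# Stand-in data with the same layout as train_images / train_labels
# (assumed shapes; GTSRB has 43 classes, labelled 0 to 42)
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
demo_labels = rng.integers(0, 43, size=8)

demo_inputs = preprocess(demo_images)

# Count how many examples each class has, to spot class imbalance
classes, counts = np.unique(demo_labels, return_counts=True)
print(demo_inputs.shape, demo_inputs.dtype)
print(dict(zip(classes.tolist(), counts.tolist())))
```

The channel order mainly matters when plotting images; for training, what matters is that the training and test images go through exactly the same preprocessing.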

For this assignment you'll build and train a deep convolutional neural network yourself for the GTSRB data set. The goal is simply to get the highest possible accuracy on the provided test set, using only the training set to learn. You may of course reuse any code you find useful from the CIFAR notebook, and you are also free to look up any other functions or classes you want to try from the [TensorFlow Keras API](https://www.tensorflow.org/api_docs/python/tf/keras/) (which is what we used for the CIFAR notebook too).
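
As a possible starting point, here is a minimal sketch of a small convolutional network in Keras. The function name `build_baseline_model`, the layer sizes, and the optimizer are illustrative assumptions rather than a recommended solution. The input shape follows the *32 x 32* colour images produced by `load_data`, and the output size follows the 43 classes in GTSRB.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_baseline_model(input_shape=(32, 32, 3), num_classes=43):
    """Build and compile a small CNN (hypothetical baseline; sizes are arbitrary)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Conv2D(32, 3, padding='same', activation='relu'),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding='same', activation='relu'),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dense(num_classes, activation='softmax'),
    ])
    # The sparse loss accepts integer class labels, as returned by load_data
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = build_baseline_model()
model.summary()
```

Training could then look like `model.fit(train_images / 255.0, train_labels, epochs=10)`, followed by `model.evaluate(test_images / 255.0, test_labels)` to get the testing accuracy; the number of epochs here is only a placeholder.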

As you try different versions of your network, briefly document each version in the markdown cell below: what you tried, a very brief motivation for why you tried it, and what testing accuracy that version produced.

*Hint:* A common practice when designing larger neural networks is to start by looking at the architectures of models with state-of-the-art performance in benchmarking competitions. You can read a description of one such famous model, VGG16, in the article here: [Understanding VGG16: Concepts, Architecture, and Performance](https://datagen.tech/guides/computer-vision/vgg16/). Even though the input for this problem is much smaller than the VGG16 input, you can still experiment and see if there are any ideas or layer configurations that you can reuse to improve the results of your own network.
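
To make the hint concrete: the core pattern in VGG16 is a stack of 3 x 3 convolutions followed by 2 x 2 max pooling, with the number of filters growing as the feature maps shrink. The sketch below scales that pattern down to *32 x 32* inputs. The helper names `vgg_block` and `build_vgg_style_model`, the number of blocks, the filter counts, and the dropout rate are assumptions chosen for illustration, not the VGG16 configuration itself.

```python
import tensorflow as tf
from tensorflow.keras import layers

def vgg_block(x, filters, num_convs):
    """Apply num_convs 3x3 convolutions, then halve the spatial size by max pooling."""
    for _ in range(num_convs):
        x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    return layers.MaxPooling2D(pool_size=2)(x)

def build_vgg_style_model(input_shape=(32, 32, 3), num_classes=43):
    """Stack a few VGG-style blocks and a small dense classifier (illustrative sizes)."""
    inputs = tf.keras.Input(shape=input_shape)
    x = vgg_block(inputs, filters=32, num_convs=2)   # 32x32 -> 16x16
    x = vgg_block(x, filters=64, num_convs=2)        # 16x16 -> 8x8
    x = vgg_block(x, filters=128, num_convs=3)       # 8x8 -> 4x4
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation='relu')(x)
    x = layers.Dropout(0.5)(x)                       # regularise the dense layer
    outputs = layers.Dense(num_classes, activation='softmax')(x)
    return tf.keras.Model(inputs, outputs)

vgg_style_model = build_vgg_style_model()
print(vgg_style_model.output_shape)
```

Whether a deeper stack like this actually beats a smaller network on GTSRB is something to test and record in your version notes.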


#### Version 1

*Your description goes here.*


#### Version 2

*Your description goes here.*

...

...

In [None]:
# YOUR CODE HERE