# Self-Driving Car Engineer Nanodegree

## Deep Learning

## Project: Build a Traffic Sign Recognition Classifier

In this notebook, a template is provided for you to implement your functionality in stages which is required to successfully complete this project. If additional code is required that cannot be included in the notebook, be sure that the Python code is successfully imported and included in your submission, if necessary. Sections that begin with **'Implementation'** in the header indicate where you should begin your implementation for your project. Note that some sections of implementation are optional, and will be marked with **'Optional'** in the header.

In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a **'Question'** header. Carefully read each question and provide thorough answers in the following text boxes that begin with **'Answer:'**. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.

>**Note:** Code and Markdown cells can be executed using the **Shift + Enter** keyboard shortcut. In addition, Markdown cells can be edited by typically double-clicking the cell to enter edit mode.

---

## Step 1: Dataset Exploration

Visualize the German Traffic Signs Dataset. This is open ended, some suggestions include: plotting traffic signs images, plotting the count of each sign, etc. Be creative!


The pickled data is a dictionary with 4 key/value pairs:

- features -> the images pixel values, (width, height, channels)
- labels -> the label of the traffic sign
- sizes -> the original width and height of the image, (width, height)
- coords -> coordinates of a bounding box around the sign in the image, (x1, y1, x2, y2). Based the original image (not the resized version).

In [55]:
%matplotlib inline
import pickle
import matplotlib.pyplot as plt
import hashlib
import os
import pickle
from urllib.request import urlretrieve
import numpy as np
from PIL import Image
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer
from sklearn.utils import resample
from tqdm import tqdm
from zipfile import ZipFile
import cv2
from matplotlib import gridspec
import math
import tensorflow as tf
import random
from random import sample
import math
from tensorflow.contrib.layers import flatten
import time
import datetime
from datetime import timedelta
from sklearn.metrics import confusion_matrix
#from utils5 import *

In [56]:
# TODO: fill this in based on where you saved the training and testing data
training_file = "train.p"
testing_file = "test.p"

with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
    
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']

## Data Dimensions

The data-set has now been loaded and consists of 39,209 images and associated labels (i.e. classifications of the images). The data-set is split into 3 mutually exclusive sub-sets.

In [57]:
### To start off let's do a basic data summary.

# TODO: number of training examples
n_train = len(X_train)

# TODO: number of testing examples
n_test = len(X_test)

# TODO: number of training examples
n_train_label = len(y_train)

# TODO: number of testing examples
n_test_label = len(y_test)

# TODO: what's the shape of an image?
train_image_shape = X_train.shape
test_image_shape = X_test.shape
train_label_shape = y_train.shape
test_label_shape = y_test.shape

# TODO: how many classes are in the dataset
n_classes = max(y_train) + 1

print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Number of training labels =", n_train_label)
print("Number of testing labels =", n_test_label)
print("Training image data shape =", train_image_shape)
print("Testing image data shape =", test_image_shape)
print("Training label data shape =", train_label_shape)
print("Testing label data shape =", test_label_shape)
print("Number of classes =", n_classes)

Number of training examples = 39209
Number of testing examples = 12630
Number of training labels = 39209
Number of testing labels = 12630
Training image data shape = (39209, 32, 32, 3)
Testing image data shape = (12630, 32, 32, 3)
Training label data shape = (39209,)
Testing label data shape = (12630,)
Number of classes = 43


In [58]:
### Data exploration visualization goes here.
### Feel free to use as many code cells as needed.

### Plot the input images ###
def plot_input_images():
    sample_size = 5 #five images per class label
    count = 1 #book keeping for plots
    fig = plt.figure(figsize=(30, 30))

    for i in range(n_classes):
        ind = y_train == i
        '''print("i = ", i)
        print("ind = ", ind)
        print("y_train = ", y_train)'''
        subset_x = X_train[ind,] #get all images that belong to class i
        #print("subset_x.shape = ", subset_x.shape)

        for x in range(sample_size):
            img = random.choice(subset_x) #randomly pick on image from class i
            #print("img.shape = ", img.shape)
            #plt.subplot(n_classes, sample_size, count)
            #fig.add_subplot(n_classes,sample_size,count)

            #plt.axis('off')
            #plt.imshow(img)
            #count +=1
            #print("count = ", count)
            
    
### Display the images in the form of a grid
def display_grid(image_data, sample_size, n_labels):
    count = 0 #book keeping for plots
    
    fig = plt.figure(figsize=(sample_size, n_labels))

    grid = gridspec.GridSpec(n_labels, sample_size, wspace=0.0, hspace=0.0)
    labelset_pbar = tqdm(range(n_labels), desc='Sample test images', unit='labels')

    for k in labelset_pbar:
        ind = y_train == k
        subset = image_data[ind,] #get all images that belong to class k
        
        for x in range(sample_size):
            img = random.choice(subset) #randomly pick on image from class k
            ax = plt.Subplot(fig, grid[count])
            ax.set_xticks([])
            ax.set_yticks([])
            ax.imshow(img) #, cmap='gray')
            fig.add_subplot(ax)
            count +=1

    # hide the borders
    if k == (n_labels-1):
        all_axes = fig.get_axes()

    for ax in all_axes:
        for sp in ax.spines.values():
            sp.set_visible(False)
            plt.show()

----

## Step 2: Design and Test a Model Architecture

Design and implement a deep learning model that learns to recognize traffic signs. Train and test your model on the [German Traffic Sign Dataset](http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset).

There are various aspects to consider when thinking about this problem:

- Your model can be derived from a deep feedforward net or a deep convolutional network.
- Play around preprocessing techniques (normalization, rgb to grayscale, etc)
- Number of examples per label (some have more than others).
- Generate fake data.

Here is an example of a [published baseline model on this problem](http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf). It's not required to be familiar with the approach used in the paper but, it's good practice to try to read papers like these.

### Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [59]:
### Normalize
### Implement Min-Max scaling for grayscale image data
def normalize(image_data):
    normed_image = cv2.normalize(image_data, None, 0.1, 0.9, cv2.NORM_MINMAX, cv2.CV_32F)
    return(normed_image)
    
def One_Hot_Encode():
    # Turn labels into numbers and apply One-Hot Encoding
    encoder = LabelBinarizer()
    encoder.fit(y_train) # fitting the encoder on training labels
    train_labels = encoder.transform(y_train)
    test_labels = encoder.transform(y_test)

    # Change to float32, so it can be multiplied against the features in 
    # TensorFlow, which are float32
    hot_train_labels = train_labels.astype(np.float32)
    hot_test_labels = test_labels.astype(np.float32)
    
    is_labels_encod = True
    return hot_train_labels, hot_test_labels
    
### Use this function create batches
def batches(X, y, batch_size):
    # Number of batches includes a final "rest" batch
    # if division is not exact       
    total_batch = int(n_train/batch_size) + 1
    train_feature_batch = np.array_split(X, int(n_train/batch_size)+1)
    train_label_batch = np.array_split(y, int(n_train/batch_size)+1)
    # Loop over all batches
    for i in range(total_batch):
        batch_x = train_feature_batch[i]
        batch_y = train_label_batch[i]
        return batch_x, batch_y

def flatten_layer(layer):
    # Get the shape of the input layer.
    layer_shape = layer.get_shape()

    # The number of features is: img_height * img_width * num_channels
    num_features = layer_shape[1:4].num_elements()
    
    # Reshape the layer to [num_images, num_features].
    layer_flat = tf.reshape(layer, [-1, num_features])

    # Return both the flattened layer and the number of features.
    return layer_flat, num_features


In [60]:
### Preprocess the data here.
### Feel free to use as many code cells as needed.

### Normzlize the grayscaled images
X_train_norm = np.array([normalize(image) for image in X_train], dtype=np.float32)
X_test_norm = np.array([normalize(image) for image in X_test], dtype=np.float32)

### Generate OHE
y_train_hot, y_test_hot = One_Hot_Encode()

### Question 1 

_Describe the techniques used to preprocess the data._

**Answer:**

In [61]:
### Generate data additional (if you want to!)
### and split the data into training/validation/testing sets here.
### Feel free to use as many code cells as needed.
train_features, valid_features, train_labels, valid_labels = train_test_split(
    X_train_norm,
    y_train_hot,
    test_size=0.05,
    random_state=832)
    #random_state=832289)

#test_features = X_test_norm
test_features = X_test
test_labels = y_test_hot
#test_labels = y_test

### Question 2

_Describe how you set up the training, validation and testing data for your model. If you generated additional data, why?_

**Answer:**

In [62]:
### Define your architecture here.
### Feel free to use as many code cells as needed.
# Layer-1 - Convolutional Layer 1.
filter_size1 = 7          # Convolution filters are 7 x 7 pixels.
num_filters1 = 100        # There are 100 of these filters.

# Layer-2 - Max Pooling Layer 1.
max_pool_size = 2

# Layer-3 - Convolutional Layer 2.
filter_size2 = 4          # Convolution filters are 4 x 4 pixels.
num_filters2 = 150        # There are 150 of these filters.

# Layer-4 - Max Pooling Layer 2
max_pool_size = 2

# Layer-5 - Convolutional Layer 3.
filter_size3 = 4          # Convolution filters are 4 x 4 pixels.
num_filters3 = 250        # There are 250 of these filters.

# Layer-6 - Max Pooling Layer 3
max_pool_size = 2

# Layer-7 - Fully-connected layer 1.
fc_size1 = 300             # Number of neurons in fully-connected layer.

# Layer-8 - Fully-connected layer 2.
fc_size2 = 43              # Number of neurons in fully-connected layer.


In [63]:
# We know that images are 32 pixels in each dimension.
img_size = 32

# Number of colour channels for the images.
num_channels = 3

# Images are stored in one-dimensional arrays of this length.
img_size_flat = img_size*img_size*num_channels

# Tuple with height and width of images used to reshape arrays.
img_shape = (img_size, img_size)

# Number of classes, one class for each of 43 classes
num_classes = 43

x = tf.placeholder(np.float32, shape=[None, img_size_flat], name='x')
x_image = tf.reshape(x, [-1, img_size, img_size, num_channels])
y_true = tf.placeholder(tf.float32, shape=[None, 43], name='y_true')
y_true_cls = tf.argmax(y_true, dimension=1)

drop_prob = tf.placeholder(tf.float32)

In [64]:
def new_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05))
    
def new_biases(length):
    return tf.Variable(tf.zeros(shape=[length]))

def new_conv_layer(input,              # The previous layer.
                   num_input_channels, # Num. channels in prev. layer.
                   filter_size,        # Width and height of each filter.
                   num_filters,        # Number of filters.
                   use_pooling=True):  # Use 2x2 max-pooling.

    # Shape of the filter-weights for the convolution.
    shape = [filter_size, filter_size, num_input_channels, num_filters]

    # Create new weights aka. filters with the given shape.
    weights = new_weights(shape=shape)

    # Create new biases, one for each filter.
    biases = new_biases(length=num_filters)

    # Create the TensorFlow operation for convolution.
    layer = tf.nn.conv2d(input=input,
                         filter=weights,
                         strides=[1, 1, 1, 1],
                         padding='VALID')

    # Add the biases to the results of the convolution.
    # A bias-value is added to each filter-channel.
    layer += biases
    
    print("layer = ", layer)

    # Use pooling to down-sample the image resolution?
    if use_pooling:
        # This is 2x2 max-pooling
        layer = tf.nn.max_pool(value=layer,
                               ksize=[1, 2, 2, 1],
                               strides=[1, 2, 2, 1],
                               padding='VALID')

    print("layer = ", layer)
    
    # Rectified Linear Unit (ReLU).
    layer = tf.nn.relu(layer)

    print("layer = ", layer)
    
    # We return both the resulting layer and the filter-weights
    # because we will plot the weights later.
    return layer, weights

    
def new_fc_layer(input,          # The previous layer.
                 num_inputs,     # Num. inputs from prev. layer.
                 num_outputs,    # Num. outputs.
                 use_relu=True,
                 l2 = True,
                 dropout = None,
                 keep_prob = 1): # Use Rectified Linear Unit (ReLU)?

    # Create new weights and biases.
    weights = new_weights(shape=[num_inputs, num_outputs])
    biases = new_biases(length=num_outputs)

    # Calculate the layer as the matrix multiplication of
    # the input and weights, and then add the bias-values.
    layer = tf.matmul(input, weights) + biases

    if use_relu:
        layer = tf.nn.relu(layer)
        
    if l2:
        reg = tf.nn.l2_loss(weights) + tf.nn.l2_loss(biases)
    else:
        reg = 0
        
    if dropout:
        layer = tf.nn.dropout(layer, keep_prob=keep_prob)
        
    return layer, reg

In [65]:
layer_conv1, weights_conv1 = \
    new_conv_layer(input=x_image,
                   num_input_channels=num_channels,
                   filter_size=filter_size1,
                   num_filters=num_filters1,
                   use_pooling=True)

layer_conv2, weights_conv2 = \
    new_conv_layer(input=layer_conv1,
                   num_input_channels=num_filters1,
                   filter_size=filter_size2,
                   num_filters=num_filters2,
                   use_pooling=True)
    
layer_conv3, weights_conv3 = \
    new_conv_layer(input=layer_conv2,
                   num_input_channels=num_filters2,
                   filter_size=filter_size3,
                   num_filters=num_filters3,
                   use_pooling=True)

layer_flat, num_features = flatten_layer(layer_conv3)

layer_fc1, reg1 = new_fc_layer(input=layer_flat,
                         num_inputs=num_features,
                         num_outputs=fc_size1,
                         use_relu=True,
                         l2 = True,
                         dropout = True,
                         keep_prob = drop_prob
                        )

layer_fc2, reg2 = new_fc_layer(input=layer_fc1,
                         num_inputs=fc_size1,
                         num_outputs=num_classes,
                         use_relu=True,
                         l2 = True,
                         dropout = False,
                         keep_prob = drop_prob
                        )

reg = reg1 + reg2

y_pred = tf.nn.softmax(layer_fc2)

y_pred_cls = tf.argmax(y_pred, dimension=1)

cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_pred)

cost = tf.reduce_mean(cross_entropy)

cost += 1e-4 * reg

optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(cost)

correct_prediction = tf.equal(y_pred_cls, y_true_cls)

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


layer =  Tensor("add_29:0", shape=(?, 26, 26, 100), dtype=float32)
layer =  Tensor("MaxPool_11:0", shape=(?, 13, 13, 100), dtype=float32)
layer =  Tensor("Relu_17:0", shape=(?, 13, 13, 100), dtype=float32)
layer =  Tensor("add_30:0", shape=(?, 10, 10, 150), dtype=float32)
layer =  Tensor("MaxPool_12:0", shape=(?, 5, 5, 150), dtype=float32)
layer =  Tensor("Relu_18:0", shape=(?, 5, 5, 150), dtype=float32)
layer =  Tensor("add_31:0", shape=(?, 2, 2, 250), dtype=float32)
layer =  Tensor("MaxPool_13:0", shape=(?, 1, 1, 250), dtype=float32)
layer =  Tensor("Relu_19:0", shape=(?, 1, 1, 250), dtype=float32)


### Question 3

_What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.)  For reference on how to build a deep neural network using TensorFlow, see [Deep Neural Network in TensorFlow
](https://classroom.udacity.com/nanodegrees/nd013/parts/fbf77062-5703-404e-b60c-95b78b2f3f9e/modules/6df7ae49-c61c-4bb2-a23e-6527e69209ec/lessons/b516a270-8600-4f93-a0a3-20dfeabe5da6/concepts/83a3a2a2-a9bd-4b7b-95b0-eb924ab14432) from the classroom._


**Answer:**

In [66]:
### Train your model here.
### Feel free to use as many code cells as needed.
session = tf.Session()

session.run(tf.initialize_all_variables())

train_batch_size = 100

In [67]:
# Counter for total number of iterations performed so far.
total_iterations = 0

def optimize(num_iterations):
    # Ensure we update the global variable rather than a local copy.
    global total_iterations

    # Start-time used for printing time-usage below.
    start_time = time.time()

    for i in range(total_iterations,
                   total_iterations + num_iterations):

        # x_batch now holds a batch of images and
        # y_true_batch are the true labels for those images.
        x_batch, y_true_batch = batches(train_features, train_labels, 
                                            train_batch_size)
        
        
        x_batch_flat = np.reshape(x_batch, [-1, 1024*3])
        x_batch_flat = x_batch_flat.astype(np.float32)
        
        # Put the batch into a dict with the proper names
        # for placeholder variables in the TensorFlow graph.
        feed_dict_train = {x: x_batch_flat,
                           y_true: y_true_batch,
                           drop_prob: 0.5
                          }

        # Run the optimizer using this batch of training data.
        session.run(optimizer, feed_dict=feed_dict_train)

        # Print status every 100 iterations.
        if i % 100 == 0:
            # Calculate the accuracy on the training-set.
            acc = session.run(accuracy, feed_dict=feed_dict_train)

            # Message for printing.
            msg = "Optimization Iteration: {0:>6}, Training Accuracy: {1:>6.1%}"

            # Print it.
            print(msg.format(i + 1, acc))

    # Update the total number of iterations performed.
    total_iterations += num_iterations
    
    # Ending time.
    end_time = time.time()

    # Difference between start and end-times.
    time_dif = end_time - start_time

    # Print the time-usage.
    print("Time usage: " + str(timedelta(seconds=int(round(time_dif)))))

In [68]:
# Split the test-set into smaller batches of this size.
test_batch_size = 256

def print_test_accuracy(show_example_errors=False,
                        show_confusion_matrix=False):

    # Number of images in the test-set.
    #num_test = len(data.test.images)
    num_test = n_test

    # Allocate an array for the predicted classes which
    # will be calculated in batches and filled into this array.
    cls_pred = np.zeros(shape=num_test, dtype=np.int)

    # Now calculate the predicted classes for the batches.
    i = 0

    while i < num_test:
        # The ending index for the next batch is denoted j.
        j = min(i + test_batch_size, num_test)

        # Get the images from the test-set between index i and j.
        X_test_flat = np.reshape(X_test, [-1, 1024*3])
        
        images = X_test_flat[i:j, :]
        images = images.astype(np.float32)
        
        # Get the associated labels.
        #labels = y_test[i:j]
        labels = test_labels[i:j]
        
        # Create a feed-dict with these images and labels.
        feed_dict = {x: images,
                     y_true: labels,
                     drop_prob: 1.0
                    }

        # Calculate the predicted class using TensorFlow.
        cls_pred[i:j] = session.run(y_pred_cls, feed_dict=feed_dict)

        # Set the start-index for the next batch to the
        # end-index of the current batch.
        i = j

    # Convenience variable for the true class-numbers of the test-set.
    cls_true = test_labels
    cls_true = y_test
    
    # Create a boolean array whether each image is correctly classified.
    correct = (cls_true == cls_pred)
    print("correct = ", correct)

    # Calculate the number of correctly classified images.
    correct_sum = np.sum(correct)

    # Classification accuracy is the number of correctly classified
    # images divided by the total number of images in the test-set.
    acc = float(correct_sum) / num_test
    
    # Print the accuracy.
    msg = "Accuracy on Test-Set: {0:.1%} ({1} / {2})"
    print(msg.format(acc, correct_sum, num_test))

In [69]:
print_test_accuracy()

correct =  [False False False ..., False False False]
Accuracy on Test-Set: 6.5% (824 / 12630)


In [70]:
optimize(num_iterations=1)

Optimization Iteration:      1, Training Accuracy:   1.1%
Time usage: 0:00:02


In [71]:
print_test_accuracy()

correct =  [False False False ..., False False False]
Accuracy on Test-Set: 6.5% (824 / 12630)


In [72]:
optimize(num_iterations=99)

Time usage: 0:02:00


In [73]:
print_test_accuracy()

correct =  [False False False ..., False False False]
Accuracy on Test-Set: 6.9% (875 / 12630)


In [None]:
optimize(num_iterations=900) 

In [None]:
print_test_accuracy()

### Question 4

_How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)_


**Answer:**

### Question 5


_What approach did you take in coming up with a solution to this problem?_

**Answer:**

---

## Step 3: Test a Model on New Images

Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.

You may find `signnames.csv` useful as it contains mappings from the class id (integer) to the actual sign name.

### Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [3]:
### Load the images and plot them here.
### Feel free to use as many code cells as needed.

### Question 6

_Choose five candidate images of traffic signs and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult? It would be helpful to plot the images in the notebook._



**Answer:**

In [4]:
### Run the predictions here.
### Feel free to use as many code cells as needed.

### Question 7

_Is your model able to perform equally well on captured pictures or a live camera stream when compared to testing on the dataset?_


**Answer:**

In [None]:
### Visualize the softmax probabilities here.
### Feel free to use as many code cells as needed.

### Question 8

*Use the model's softmax probabilities to visualize the **certainty** of its predictions, [`tf.nn.top_k`](https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#top_k) could prove helpful here. Which predictions is the model certain of? Uncertain? If the model was incorrect in its initial prediction, does the correct prediction appear in the top k? (k should be 5 at most)*


**Answer:**

### Question 9
_If necessary, provide documentation for how an interface was built for your model to load and classify newly-acquired images._


**Answer:**

> **Note**: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the iPython Notebook as an HTML document. You can do this by using the menu above and navigating to  \n",
    "**File -> Download as -> HTML (.html)**. Include the finished document along with this notebook as your submission.