# Self-Driving Car Engineer Nanodegree

## Deep Learning

## Project: Build a Traffic Sign Recognition Classifier

This notebook provides a template for implementing the project in stages, all of which are required to complete it successfully. If you need additional code that cannot be included in the notebook, make sure the Python code is imported successfully and included in your submission. Sections whose header begins with **'Implementation'** indicate where you should begin your implementation. Some implementation sections are optional and are marked with **'Optional'** in the header.

In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a **'Question'** header. Carefully read each question and provide thorough answers in the following text boxes that begin with **'Answer:'**. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.

>**Note:** Code and Markdown cells can be executed using the **Shift + Enter** keyboard shortcut. In addition, Markdown cells can typically be edited by double-clicking the cell to enter edit mode.

---

## Step 1: Dataset Exploration

Visualize the German Traffic Signs Dataset. This is open ended; some suggestions include plotting traffic sign images, plotting the count of each sign, etc. Be creative!


The pickled data is a dictionary with 4 key/value pairs:

- features -> the image pixel values, (width, height, channels)
- labels -> the label of the traffic sign
- sizes -> the original width and height of the image, (width, height)
- coords -> coordinates of a bounding box around the sign in the image, (x1, y1, x2, y2). Based on the original image (not the resized version).
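
As a minimal sketch of what such a dictionary looks like, the snippet below builds a tiny synthetic stand-in instead of loading the real pickle files. The array sizes, the values, and the `fake_split` name are assumptions made only for illustration.

```python
import numpy as np

# Synthetic stand-in for one pickled split; the real files hold many more images.
rng = np.random.default_rng(0)
fake_split = {
    "features": rng.integers(0, 256, size=(6, 32, 32, 3), dtype=np.uint8),
    "labels": np.array([0, 1, 1, 2, 2, 2], dtype=np.uint8),
    "sizes": np.full((6, 2), 40, dtype=np.uint8),
    "coords": np.tile(np.array([5, 5, 35, 35], dtype=np.uint8), (6, 1)),
}

# Inspect the shape and dtype stored under each key.
for key, value in fake_split.items():
    print(key, value.shape, value.dtype)

# Count how many examples each class has.
classes, counts = np.unique(fake_split["labels"], return_counts=True)
class_counts = dict(zip(classes.tolist(), counts.tolist()))
print(class_counts)
```

The same loop works on a dictionary returned by `pickle.load`, which is a quick way to confirm the keys and shapes before plotting anything.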

In [None]:
import pandas as pd

In [None]:
import numpy as np
import tensorflow as tf
import time
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import os
import math
import cv2
%matplotlib inline

In [None]:
from multiprocessing import Queue

In [None]:
# Load pickled data
import pickle

training_file = "traffic-sign-data/train.p"
testing_file = "traffic-sign-data/test.p"
valid_file = "traffic-sign-data/valid.p"
with open(training_file, mode='rb') as f:
    X_extended_train=None
    X_train_orig = None
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
with open(valid_file, mode='rb') as f:
    valid = pickle.load(f)
    
with open("signnames.csv", mode='rb') as f:
    #sign_nums, sign_names = np.loadtxt(f, skiprows=1, delimiter=',', dtype=[np.uint8, np.str_], unpack=True)
    classnames = np.genfromtxt(f, delimiter=",", \
                              dtype=[('myint','i'), ('mystring','S50')], \
                              skip_header=1, usecols=(0,1))
    
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']
X_valid, y_valid = valid['features'], valid['labels']



In [None]:
# Basic data summary.

n_train = len(X_train)
n_test = len(X_test)
n_valid = len(X_valid)
_, height, width, channel = X_train.shape
image_shape = (height, width, channel)
n_classes = len(classnames)

print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Number of validation examples =", n_valid)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)

In [None]:
import copy
# augment the data set by flipping images of selected classes
X_extended_train = None
def select_flipping(images, labels):

    X=images
    y=labels
    # Classes of signs that, when flipped horizontally, should still be classified as the same class
    self_flippable_horizontally = np.array([11, 12, 13, 15, 17, 18, 22, 26, 30, 35])
    # Classes of signs that, when flipped vertically, should still be classified as the same class
    self_flippable_vertically = np.array([1, 5, 12, 15, 17])
    # Classes of signs that, when flipped horizontally and then vertically, should still be classified as the same class
    self_flippable_both = np.array([32, 40])
    # Classes of signs that, when flipped horizontally, would still be meaningful, but should be classified as some other class
    cross_flippable = np.array([
        [19, 20],
        [33, 34],
        [36, 37],
        [38, 39],
        [20, 19],
        [34, 33],
        [37, 36],
        [39, 38],
    ])
    n_classes = 43

    X_extended = np.empty([0, X.shape[1], X.shape[2], X.shape[3]], dtype=np.float32)
    y_extended = np.empty([0], dtype=np.int32)
    y_inds = np.empty([0], dtype=np.int32)
    
    for c in range(n_classes):
        # First copy existing data for this class
        X_extended = np.append(X_extended, X[y == c], axis=0)
        y_extended = np.append(y_extended, y[y==c], axis=0)
        y_tb_app, = np.where(y==c)
        y_inds = np.append(y_inds, np.ones([np.sum(y==c)])*-1, axis=0)
        
        # If we can flip images of this class horizontally and they would still belong to said class...
        if c in self_flippable_horizontally:
            # ...Copy their flipped versions into extended array.
            X_extended = np.append(X_extended, X[y == c][:, :, ::-1, :], axis=0)
            y_inds = np.append(y_inds, y_tb_app, axis=0)
            y_extended = np.append(y_extended, np.full((X_extended.shape[0] - y_extended.shape[0]), c, dtype=np.int32))
            if(y_inds.shape != y_extended.shape):
                print("sfh")
                print(c)
                break
        
        # If we can flip images of this class horizontally and they would belong to other class...
        if c in cross_flippable[:, 0]:
            # ...Copy flipped images of that other class to the extended array.
            flip_class = cross_flippable[cross_flippable[:, 0] == c][0][1]
            X_extended = np.append(X_extended, X[y == flip_class][:, :, ::-1, :], axis=0)
            flipped_orig_label_inds, = np.where(y == flip_class)
            y_inds = np.append(y_inds, flipped_orig_label_inds, axis=0)
            # Fill labels for added images set to current class.
            y_extended = np.append(y_extended, np.full((X_extended.shape[0] - y_extended.shape[0]), c, dtype=np.int32))
            if(y_inds.shape != y_extended.shape):
                print("cf")
                print(c)
                break
        
        # If we can flip images of this class vertically and they would still belong to said class...
        if c in self_flippable_vertically:
            # ...Copy their flipped versions into extended array.
            X_extended = np.append(X_extended, X[y == c][:, ::-1, :, :], axis=0)
            y_inds = np.append(y_inds, y_tb_app, axis=0)
            y_extended = np.append(y_extended, np.full((X_extended.shape[0] - y_extended.shape[0]), c, dtype=np.int32))
            if(y_inds.shape != y_extended.shape):
                print("sfv")
                print(c)
                break
    
        # If we can flip images of this class horizontally AND vertically and they would still belong to said class...
        if c in self_flippable_both:
            # ...Copy their flipped versions into extended array.
            X_extended = np.append(X_extended, X_extended[y_extended == c][:, ::-1, ::-1, :], axis=0)
            y_inds = np.append(y_inds, y_tb_app, axis=0)
            y_extended = np.append(y_extended, np.full((X_extended.shape[0] - y_extended.shape[0]), c, dtype=np.int32))
            if(y_inds.shape != y_extended.shape):
                print("sfb")
                print(c)
                break
    
        
    extend_datas  = X_extended
    extend_labels = y_extended
    return (extend_datas, extend_labels, y_inds)


# use opencv to do data augmentation via perturbation (translation, rotation, brightness, saturation, contrast)
# see also: https://github.com/dmlc/mxnet/blob/master/python/mxnet/image.py

def perturb_set(images, keep=0):
    return [perturb(im, keep) for im in images]
    
#def perturb(image, angle_limit=15, scale_limit=0.1, translate_limit=3, distort_limit=3, illumin_limit=0.2):
def perturb(image, keep=0, angle_limit=15, scale_limit=0.1, translate_limit=3, distort_limit=3, illumin_limit=0.2):

    if(np.random.uniform() < keep):
        return image
    (H, W, C) = image.shape  # NumPy image arrays are (height, width, channels)
    center = np.array([W / 2., H / 2.])
    da = np.random.uniform(low=-1, high=1) * angle_limit/180. * math.pi
    scale = np.random.uniform(low=-1, high=1) * scale_limit + 1

    # Use small angle approximation instead of sin/cos functions
    cc = scale*(1 - (da*da)/2.)
    ss = scale*da
    rotation    = np.array([[cc, ss],[-ss,cc]])
    translation = np.random.uniform(low=-1, high=1, size=(1,2)) * translate_limit
    distort     = np.random.standard_normal(size=(4,2)) * distort_limit

    pts1 = np.array([[0., 0.], [0., H], [W, H], [W, 0.]])
    pts2 = np.matmul(pts1-center, rotation) + center  + translation

    #add perspective noise
    pts2 = pts2 + distort


    #http://milindapro.blogspot.jp/2015/05/opencv-filters-copymakeborder.html
    matrix  = cv2.getPerspectiveTransform(pts1.astype(np.float32), pts2.astype(np.float32)) 
    perturb = cv2.warpPerspective(image, matrix, (W, H), flags=cv2.INTER_LINEAR,
                                  borderMode=cv2.BORDER_REFLECT_101)  # BORDER_WRAP  #BORDER_REFLECT_101  #cv2.BORDER_CONSTANT  BORDER_REPLICATE

    #brightness, contrast, saturation-------------
    #from mxnet code
    if 1:  #brightness
        alpha = 1.0 + illumin_limit*np.random.uniform(-1, 1)
        #alpha = 1.0 + illumin_limit*-1
        perturb = perturb * alpha
        perturb = np.clip(perturb,0.,255.)
        pass

    coef = np.array([[[0.299, 0.587, 0.114]]]) #rgb to gray (YCbCr) :  Y = 0.299R + 0.587G + 0.114B

    if 1:  #contrast
        alpha = illumin_limit*np.random.uniform(-1, 1)
        #alpha = illumin_limit*-1
        gray = perturb * coef
        gray = (3.0 * (alpha) / gray.size) * np.sum(gray)
        perturb = perturb * (1.0 + alpha)
        perturb += gray
        perturb = np.clip(perturb,0.,255.)
        pass

    if 1:  #saturation
        alpha = illumin_limit*np.random.uniform(-1, 1)
        #alpha = illumin_limit*-1
        gray = perturb * coef
        gray = np.sum(gray, axis=2, keepdims=True)
        gray *= alpha
        perturb = perturb * (1.0 + alpha)
        perturb += gray
        perturb = np.clip(perturb,0.,255.)
        pass

    return perturb

def insert_subimage(image, sub_image, y, x): 
    h, w, c = sub_image.shape
    image[y:y+h, x:x+w, :]=sub_image 
    return image 
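
The `perturb` function above replaces `cos` and `sin` with their small-angle approximations. The following sketch checks how close those approximations are at the default `angle_limit` of 15 degrees. The variable names are chosen here for illustration only.

```python
import math

angle_limit_deg = 15  # default angle_limit in perturb
da = math.radians(angle_limit_deg)

cos_approx = 1 - (da * da) / 2.0   # second-order approximation of cos(da)
sin_approx = da                    # first-order approximation of sin(da)

cos_error = abs(math.cos(da) - cos_approx)
sin_error = abs(math.sin(da) - sin_approx)

print("cos error: %.5f" % cos_error)
print("sin error: %.5f" % sin_error)
```

Both errors stay small at the largest allowed angle, so the resulting warp is only slightly non-rigid, which should be harmless for augmentation.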

In [None]:

#count
#h = np.histogram(train_labels, bins=np.arange(num_class))
#results image
if X_extended_train is None:
    X_extended_train, y_extended_train, y_extended_orig_inds = select_flipping(X_train, y_train)

num_sample=10
results_image = 255.*np.ones(shape=(n_classes*height,(num_sample+2+22)*width, channel),dtype=np.float32)
for c in range(n_classes):
    
    #make mean
    idx = list(np.where(y_train == c)[0])
    mean_image = np.average(X_train[idx], axis=0)
    insert_subimage(results_image, mean_image, c*height, width)

    ### make random sample from original
    #sample_im_inds = np.random.choice(idx, num_sample)
    #perturbed_sample_ims = perturb_set(X_train[sample_im_inds])
    ### make random sample from extended
    idx_ext = list(np.where((y_extended_train == c) & (y_extended_orig_inds != 0))[0])
    if(len(idx_ext) == 0):
        idx_ext = list(np.where(y_train == c)[0])
    ###
    
    sample_im_inds = np.random.choice(idx_ext, num_sample)
    perturbed_sample_ims = perturb_set(X_extended_train[sample_im_inds])
    
    i = 0
    for im in perturbed_sample_ims:
        insert_subimage(results_image, im, c*height, (2+i)*width)
        i = i+1

    #print summary
    count=len(idx_ext)
    percentage = float(count)/float(len(X_extended_train))
    cv2.putText(results_image, '%02d:%-6s'%(c, classnames[c]['mystring'].decode('utf-8')), ((2+num_sample)*width, int((c+0.7)*height)),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,0),1)
    cv2.putText(results_image, '[%4d]'%(count), ((2+num_sample+14)*width+40, int((c+0.7)*height)),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,255),1)
    

print("////")
#imshow('results_image',results_image)
#cv2.waitKey(0)


print('** training data summary **')
print('\t1st column: label(image)')
print('\t2nd column: mean image')
print('\tother column: example images')
print('\tblack text: label')
print('\tblue text: sample count for each class and histogram plot')
plt.rcParams["figure.figsize"] = (25,25)
plt.imshow(results_image.astype(np.uint8))
plt.axis('off') 
plt.show()

In [None]:
# Plot a histogram of the number of examples of each sign
# in the original and extended training sets
data = y_train

fig, ax0 = plt.subplots(nrows=1, ncols=1)

d = np.diff(np.unique(data)).min()
left_of_first_bin = data.min() - float(d)/2
right_of_last_bin = data.max() + float(d)/2
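colors = ['red', 'lime']
labels=['Original', 'Extended']
ax0.set_title('Class distribution: original vs. extended training set')
# density=True replaces the deprecated normed=True; draw the bars before requesting the legend
ax0.hist([y_train, y_extended_train], np.arange(left_of_first_bin, right_of_last_bin + d, d), density=True, color=colors, label=labels)
ax0.legend(prop={'size': 10})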

plt.show()

print("STD of original: ", np.std(y_train))
print("STD of extended: ", np.std(y_extended_train))
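
The standard deviation of the label ids printed above describes how spread out the class numbers are, which is only an indirect signal of class balance. As a sketch of a more direct check, the following counts examples per class on made-up labels. The `labels` list and the `imbalance_ratio` name are illustrative assumptions, not part of the notebook.

```python
from collections import Counter

# Made-up labels standing in for y_train; class 2 is deliberately rare.
labels = [0, 0, 0, 0, 1, 1, 1, 2]

counts = Counter(labels)
largest = max(counts.values())
smallest = min(counts.values())
imbalance_ratio = largest / smallest   # 1.0 would mean perfectly balanced

for cls in sorted(counts):
    print(cls, counts[cls])
print("imbalance ratio:", imbalance_ratio)
```

Comparing this ratio before and after augmentation gives a single number for whether the extended set is more or less balanced than the original.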

In [None]:
# Shuffle training examples
from sklearn.utils import shuffle

if X_train_orig is None:
    print(len(y_train), len(y_extended_train))
    X_train_orig, y_train_orig = X_train, y_train
    X_train, y_train = X_extended_train, y_extended_train

print(len(y_train_orig), len(y_train))    
X_train, y_train, y_extended_orig_inds = shuffle(X_train, y_train, y_extended_orig_inds, random_state=42)
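
The extended training set used above is produced by flipping. The sketch below shows the core idea on a toy array: reversing the width axis and, for a class pair listed in `cross_flippable`, swapping the label. The toy shapes and the `swap` dictionary are assumptions for illustration, and this is not the notebook's `select_flipping` function.

```python
import numpy as np

# Toy "images": 2 examples, 2x3 pixels, 1 channel. Labels 33 and 34 are one of
# the pairs listed in the cross_flippable table.
images = np.arange(12, dtype=np.float32).reshape(2, 2, 3, 1)
labels = np.array([33, 34])
swap = {33: 34, 34: 33}  # a horizontal flip turns one class into the other

flipped = images[:, :, ::-1, :]   # reverse the width axis
flipped_labels = np.array([swap[label] for label in labels.tolist()])

# Append the flipped copies and their swapped labels to the originals.
extended_images = np.concatenate([images, flipped], axis=0)
extended_labels = np.concatenate([labels, flipped_labels], axis=0)

print(extended_images.shape, extended_labels)
```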

----

## Step 2: Design and Test a Model Architecture

Design and implement a deep learning model that learns to recognize traffic signs. Train and test your model on the [German Traffic Sign Dataset](http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset).

There are various aspects to consider when thinking about this problem:

- Your model can be derived from a deep feedforward net or a deep convolutional network.
- Play around with preprocessing techniques (normalization, RGB to grayscale, etc.)
- Number of examples per label (some have more than others).
- Generate fake data.
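
Two of the preprocessing options listed above can be sketched as follows. This is one possible approach under stated assumptions, not necessarily the preprocessing this notebook ends up using: the function names and the `(x - 128) / 128` scaling are illustrative, while the grayscale weights are the same ones used in the augmentation code.

```python
import numpy as np

def to_grayscale(images):
    """Weighted sum of the RGB channels, keeping a trailing channel axis."""
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return np.sum(images.astype(np.float32) * weights, axis=3, keepdims=True)

def normalize(images):
    """Map pixel values from [0, 255] to roughly [-1, 1]."""
    return (images.astype(np.float32) - 128.0) / 128.0

# Random batch standing in for real images.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
gray = normalize(to_grayscale(batch))
print(gray.shape, gray.min(), gray.max())
```

Note that a grayscale input has one channel, so a model fed with it would need an input placeholder of shape `[None, 32, 32, 1]` rather than three channels.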

Here is an example of a [published baseline model on this problem](http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf). It's not required to be familiar with the approach used in the paper, but it's good practice to try to read papers like these.

### Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

### 2.1 Preprocess Data (includes shuffling)

### Question 1 

_Describe the techniques used to preprocess the data._

### Question 2

_Describe how you set up the training, validation and testing data for your model. If you generated additional data, why?_

### 2.3 Model Architecture

In [None]:
import functools as ft

# tf Graph input
keep_prob_conv = tf.placeholder(tf.float32)
keep_prob_fc   = tf.placeholder(tf.float32)
x = tf.placeholder("float", [None, 32, 32, 3])

y_rawlabels = tf.placeholder("int32", [None])
y = tf.one_hot(y_rawlabels, depth=43, on_value=1., off_value=0., axis=-1)

# Transformations

def flatten(x):
    x_shape = x.get_shape().as_list()
    new_shape = ft.reduce(lambda i,j: i*j, x_shape[1:])
    return tf.reshape(x, [-1, new_shape])
    
# Activation Functions

# ReLU = tf.nn.relu
# tanh = tf.tanh
# pReLU = parametric_relu
def parametric_relu(x, weights):
    pos = tf.nn.relu(x)
    neg = tf.multiply(weights, tf.multiply(tf.subtract(x, tf.abs(x)), 0.5))

    return tf.add(pos, neg)

def conv2d(x, W, b, strides=3, name='conv'):
    """Conv2D wrapper with bias; no activation is applied here."""
    # strides = [batch, in_height, in_width, channels]
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME', name=name)
    x = tf.nn.bias_add(x, b)
    return x


def skip_conv2d(x, weights, biases, epsilon=1e-3, name='skipper', strides=1):
    tbr=x
    for i in range(0,len(weights)):
        print("  ", i)
        tb_con = conv2d(x=tbr, W=weights[i], b=biases[i], strides=strides, name=name+'_%d'%i)
        print("    Should be equal (input to conv shape): ", tbr.get_shape().as_list(), tb_con.get_shape().as_list())
        print("    Weights and biases: ", weights[i].get_shape().as_list(), biases[i].get_shape().as_list())
        tb_con = bn(tb_con, epsilon)
        tb_con = tf.nn.relu(tb_con)
        tbr = concat(tbr, tb_con)
    return tbr

def residual_unit(x, weights, biases, epsilon=1e-3, name='resid', strides=1):
    print("  ", x.get_shape().as_list())
    tb_con1 = conv2d(x=x, W=weights[0], b=biases[0], strides=strides, name=name+'_1')
    tb_con1 = bn(tb_con1, epsilon)
    tb_con1 = tf.nn.relu(tb_con1)
    print("  ", tb_con1.get_shape().as_list())
    tb_con2 = conv2d(x=tb_con1, W=weights[1], b=biases[1], strides=strides, name=name+'_2')
    tb_con2 = bn(tb_con2, epsilon)
    tb_con2 = tf.nn.relu(tb_con2)
    print("  ", tb_con2.get_shape().as_list())
    return tf.add(x, tb_con2)
        

def maxpool2d(x, k=2, strides=1, padding_setting='SAME'):
    """MaxPool2D wrapper."""
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, strides, strides, 1],
                          padding=padding_setting)

def bn(x, epsilon=1e-3, offset=None, scale=None):
    mean, var = tf.nn.moments(x,[0])
    return tf.nn.batch_normalization(x, mean, var, offset, scale, epsilon)

### concatenate tensors along the channel axis (axis=3)
def concat(x, y, name='cat'):
    #? cat = tf.concat(axis=3, values=input, name=name)

    cat = tf.concat(axis=3,values=[x,y])
    return cat

def weight_variable(shape, weight_mean, weight_stddev):
    initial = tf.truncated_normal(shape, stddev=weight_stddev, mean=weight_mean)
    # alt: tf.random_normal(shape)
    return tf.Variable(initial)


def bias_variable(shape, bias_mean):
    initial = tf.constant(bias_mean, shape=shape)
    return tf.Variable(initial)

def generate_filter_weights(num_layers=1, nb_filters=1, skip_filter_ratio=1):
    tbr = []
    in_num = nb_filters
    for i in range(num_layers):
        out_num = int(nb_filters*(skip_filter_ratio)**(i+1))
        tbr.append([in_num, out_num])
        in_num = in_num+out_num
    return tbr
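
The `parametric_relu` helper above writes the negative branch as `weights * (x - |x|) / 2`. The NumPy sketch below checks that this matches the usual piecewise definition of a parametric ReLU. The function names and the scalar weight `w` are assumptions for illustration, and the TensorFlow ops are replaced by their NumPy equivalents.

```python
import numpy as np

def prelu_reference(x, w):
    """Piecewise definition: x where x > 0, otherwise w * x."""
    return np.where(x > 0, x, w * x)

def prelu_notebook_form(x, w):
    """Same arithmetic as parametric_relu, written with NumPy."""
    pos = np.maximum(x, 0.0)
    neg = w * (x - np.abs(x)) * 0.5   # (x - |x|) / 2 equals min(x, 0)
    return pos + neg

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
w = 0.25
print(prelu_notebook_form(x, w))
```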


In [None]:
# def LeNet_test(x):    
#     # Arguments used for tf.truncated_normal, randomly defines variables for the weights and biases for each layer
#     _, H, W, C = x.get_shape().as_list()
#     print(H, W, C)
    
#     mu = 0
#     sigma = 0.1
#     epsilon = 1e-3
#     # Layer 1: Convolutional. Input = 32x32x3. Output = 28x28x6.
#     conv1_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 3, 6), mean = mu, stddev = sigma))
#     conv1_b = tf.Variable(tf.zeros(6))
#     conv1   = tf.nn.conv2d(x, conv1_W, strides=[1, 1, 1, 1], padding='VALID') + conv1_b
    
#     # Regularization
#     #conv1   = tf.nn.dropout(conv1, keep_prob=keep_prob_conv)
#     l2 = tf.nn.l2_loss(conv1_W)

#     # Activation.
#     #ReLU
#     #conv1 = tf.nn.relu(conv1)
#     #parametric-ReLU
#     conv1_prelu_W = tf.Variable(tf.random_uniform(shape=(28, 28, 6), minval=0.1, maxval=0.3))
#     conv1 = parametric_relu(conv1, conv1_prelu_W)
#     #tanh
#     #conv1 = tf.tanh(conv1)
    
#     # Pooling. Input = 28x28x6. Output = 14x14x6.
#     #conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
#     conv1 = tf.nn.avg_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
    
#     # Layer 2: Convolutional. Output = 10x10x16.
#     conv2_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 6, 16), mean = mu, stddev = sigma))
#     conv2_b = tf.Variable(tf.zeros(16))
#     conv2   = tf.nn.conv2d(conv1, conv2_W, strides=[1, 1, 1, 1], padding='VALID') + conv2_b
    
#     # Regularization
#     #conv2   = tf.nn.dropout(conv2, keep_prob=keep_prob_conv)
#     l2 = l2 + tf.nn.l2_loss(conv2_W)
    
    
#     # Activation.
#     #ReLU
#     #conv2 = tf.nn.relu(conv2)
#     #parametric-ReLU
#     conv2_prelu_W = tf.Variable(tf.random_uniform(shape=(10, 10, 16), minval=0.1, maxval=0.3))
#     conv2 = parametric_relu(conv2, conv2_prelu_W)
#     #tanh
#     #conv2 = tf.tanh(conv2)
    
#     # Pooling. Input = 10x10x16. Output = 5x5x16.
#     #conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
#     conv2 = tf.nn.avg_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
    
#     # Flatten. Input = 5x5x16. Output = 400.
#     fc0   = flatten(conv2)
    
#     # Layer 3: Fully Connected. Input = 400. Output = 120.
#     fc1_W = tf.Variable(tf.truncated_normal(shape=(400, 120), mean = mu, stddev = sigma))
#     fc1_b = tf.Variable(tf.zeros(120))
#     fc1   = tf.matmul(fc0, fc1_W) + fc1_b
    
#     # Regularization
#     #fc1   = tf.nn.dropout(fc1, keep_prob=keep_prob_fc)
#     l2 = l2 + tf.nn.l2_loss(fc1_W)
    
     
#     # Activation.
#     #ReLU
#     #fc1    = tf.nn.relu(fc1)
#     #parametric-ReLU
#     fc1_prelu_W = tf.Variable(tf.random_uniform(shape=[120], minval=0.1, maxval=0.3))
#     fc1 = parametric_relu(fc1, fc1_prelu_W)
#     #tanh
#     #fc1 = tf.tanh(fc1)
    
#     # Layer 4: Fully Connected. Input = 120. Output = 84.
#     fc2_W  = tf.Variable(tf.truncated_normal(shape=(120, 84), mean = mu, stddev = sigma))
#     fc2_b  = tf.Variable(tf.zeros(84))
#     fc2    = tf.matmul(fc1, fc2_W) + fc2_b
    
#     # Regularization
#     #fc2   = tf.nn.dropout(fc2, keep_prob=keep_prob_fc)
#     l2 = l2 + tf.nn.l2_loss(fc2_W)
    
#     # Activation.
#     #ReLU
#     #fc2    = tf.nn.relu(fc2)
#     #parametric-ReLU
#     fc2_prelu_W = tf.Variable(tf.random_uniform(shape=[84], minval=0.1, maxval=0.3))
#     fc2 = parametric_relu(fc2, fc2_prelu_W)
#     #tanh
#     #fc2 = tf.tanh(fc2)
    
#     # Layer 5: Fully Connected. Input = 84. Output = 10.
#     fc3_W  = tf.Variable(tf.truncated_normal(shape=(84, 43), mean = mu, stddev = sigma))
#     fc3_b  = tf.Variable(tf.zeros(43))
#     logits = tf.matmul(fc2, fc3_W) + fc3_b
    
#     # Regularization
#     #logits   = tf.nn.dropout(logits, keep_prob=keep_prob_fc)
#     l2 = l2 + tf.nn.l2_loss(fc3_W)
    
#     return logits , l2

# #conv_net_le_test = LeNet_test(x)
# conv_net_le_test, l2 = LeNet_test(x)

In [None]:
# def LeNet_test_deep(x):    
#     # Arguments used for tf.truncated_normal, randomly defines variables for the weights and biases for each layer
#     _, H, W, C = x.get_shape().as_list()
#     print(H, W, C)
    
#     mu = 0
#     sigma = 0.1
#     epsilon = 1e-3
#     nb_filters_1 = 12
#     nb_filters_2 = 24
#     nb_filters_3 = 48
    
#     # Layer 1.1: 
#     # Convolutional. Input = 32x32x3. Output = 32x32x6.
#     conv1_W_1 = tf.Variable(tf.truncated_normal(shape=(5, 5, 3, nb_filters_1), mean = mu, stddev = sigma))
#     conv1_b_1 = tf.Variable(tf.zeros(nb_filters_1))
#     conv1_1   = tf.nn.conv2d(x, conv1_W_1, strides=[1, 1, 1, 1], padding='VALID') + conv1_b_1
#     # Batch Norm.
#     conv1_1 = bn(conv1_1, epsilon)
#     conv1_1   = tf.nn.relu(conv1_1)
#     # Dropout
#     conv1_1   = tf.nn.dropout(conv1_1, keep_prob=keep_prob_conv)
#     # Layer 1.2: 
#     # Convolutional. Input = 32x32x6. Output = 32x32x6.
#     conv1_W_2 = tf.Variable(tf.truncated_normal(shape=(5, 5, nb_filters_1, nb_filters_1), mean = mu, stddev = sigma))
#     conv1_b_2 = tf.Variable(tf.zeros(nb_filters_1))
#     conv1_2   = tf.nn.conv2d(conv1_1, conv1_W_2, strides=[1, 1, 1, 1], padding='SAME') + conv1_b_2
#     # Batch Norm.
#     #conv1_2 = bn(conv1_2, epsilon)
#     conv1_2   = tf.nn.relu(conv1_2)
#     # Dropout
#     conv1_2   = tf.nn.dropout(conv1_2, keep_prob=keep_prob_conv)
#     # Layer 1.3: 
#     # Convolutional. Input = 32x32x6. Output = 32x32x6.
#     conv1_W_3 = tf.Variable(tf.truncated_normal(shape=(5, 5, nb_filters_1, nb_filters_1), mean = mu, stddev = sigma))
#     conv1_b_3 = tf.Variable(tf.zeros(nb_filters_1))
#     conv1_3   = tf.nn.conv2d(conv1_2, conv1_W_3, strides=[1, 1, 1, 1], padding='SAME') + conv1_b_3
#     # Batch Norm.
#     #conv1_2 = bn(conv1_2, epsilon)
#     conv1_3   = tf.nn.relu(conv1_3)
#     # Dropout
#     conv1_3   = tf.nn.dropout(conv1_3, keep_prob=keep_prob_conv)
    
#     # Residuals
#     conv1 = tf.add(conv1_3, conv1_1)
#     # Activation.
#     conv1_prelu_W = tf.Variable(tf.random_uniform(shape=(28, 28, nb_filters_1), minval=0.1, maxval=0.3))
#     conv1 = parametric_relu(conv1, conv1_prelu_W)
#     # Dense. Output = 28x28x(2*nb_filters_1)
#     conv1 = tf.concat([conv1, conv1_2], axis=3)
#     # Pooling. Input = 28x28x24. Output = 14x14x24.
#     conv1 = tf.nn.avg_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
    
    
#     # Layer 2.1: 
#     # Convolutional. Output = 10x10x24.
#     conv2_W_1 = tf.Variable(tf.truncated_normal(shape=(5, 5, nb_filters_2, nb_filters_2), mean = mu, stddev = sigma))
#     conv2_b_1 = tf.Variable(tf.zeros(nb_filters_2))
#     conv2_1   = tf.nn.conv2d(conv1, conv2_W_1, strides=[1, 1, 1, 1], padding='VALID') + conv2_b_1
#     conv2_1   = tf.nn.relu(conv2_1)
#     # Dropout
#     conv2_1   = tf.nn.dropout(conv2_1, keep_prob=keep_prob_conv)
#     # Layer 2.2: 
#     # Convolutional. Output = 10x10x24.
#     conv2_W_2 = tf.Variable(tf.truncated_normal(shape=(5, 5, nb_filters_2, nb_filters_2), mean = mu, stddev = sigma))
#     conv2_b_2 = tf.Variable(tf.zeros(nb_filters_2))
#     conv2_2   = tf.nn.conv2d(conv2_1, conv2_W_2, strides=[1, 1, 1, 1], padding='SAME') + conv2_b_2
#     conv2_2   = tf.nn.relu(conv2_2)
#     # Dropout
#     conv2_2   = tf.nn.dropout(conv2_2, keep_prob=keep_prob_conv)
#     # Layer 2.3: 
#     # Convolutional. Output = 10x10x24.
#     conv2_W_3 = tf.Variable(tf.truncated_normal(shape=(5, 5, nb_filters_2, nb_filters_2), mean = mu, stddev = sigma))
#     conv2_b_3 = tf.Variable(tf.zeros(nb_filters_2))
#     conv2_3   = tf.nn.conv2d(conv2_2, conv2_W_3, strides=[1, 1, 1, 1], padding='SAME') + conv2_b_3
#     conv2_3   = tf.nn.relu(conv2_3)
#     # Dropout
#     conv2_3   = tf.nn.dropout(conv2_3, keep_prob=keep_prob_conv)
    
#     # Residuals
#     conv2 = tf.add(conv2_3, conv2_1)
#     # Activation.
#     conv2_prelu_W = tf.Variable(tf.random_uniform(shape=(10, 10, nb_filters_2), minval=0.1, maxval=0.3))
#     conv2 = parametric_relu(conv2, conv2_prelu_W)
#     # Dense. Output = 28x28x(2*nb_filters_2)
#     conv2 = tf.concat([conv2, conv2_2], axis=3)
    
#     # Layer 3.1: 
#     # Convolutional. Output = 10x10x(2*nb_filters_3)
#     conv3_W_1 = tf.Variable(tf.truncated_normal(shape=(5, 5, nb_filters_3, nb_filters_3), mean = mu, stddev = sigma))
#     conv3_b_1 = tf.Variable(tf.zeros(nb_filters_3))
#     conv3_1   = tf.nn.conv2d(conv2, conv3_W_1, strides=[1, 1, 1, 1], padding='SAME') + conv3_b_1
#     conv3_1   = tf.nn.relu(conv3_1)
#     # Dropout
#     conv3_1   = tf.nn.dropout(conv3_1, keep_prob=keep_prob_conv)
#     # Layer 2.2: 
#     # Convolutional. Output = 10x10x16.
#     conv3_W_2 = tf.Variable(tf.truncated_normal(shape=(5, 5, nb_filters_3, nb_filters_3), mean = mu, stddev = sigma))
#     conv3_b_2 = tf.Variable(tf.zeros(nb_filters_3))
#     conv3_2   = tf.nn.conv2d(conv3_1, conv3_W_2, strides=[1, 1, 1, 1], padding='SAME') + conv3_b_2
#     conv3_2   = tf.nn.relu(conv3_2)
#     # Dropout
#     conv3_2   = tf.nn.dropout(conv3_2, keep_prob=keep_prob_conv)
#     # Layer 2.3: 
#     # Convolutional. Output = 10x10x16.
#     conv3_W_3 = tf.Variable(tf.truncated_normal(shape=(5, 5, nb_filters_3, nb_filters_3), mean = mu, stddev = sigma))
#     conv3_b_3 = tf.Variable(tf.zeros(nb_filters_3))
#     conv3_3   = tf.nn.conv2d(conv3_2, conv3_W_3, strides=[1, 1, 1, 1], padding='SAME') + conv3_b_3
#     conv3_3   = tf.nn.relu(conv3_3)
#     # Dropout
#     conv3_3   = tf.nn.dropout(conv3_3, keep_prob=keep_prob_conv)
    
#     # Residuals
#     conv3 = tf.add(conv3_3, conv3_1)
#     # Activation.
#     conv3_prelu_W = tf.Variable(tf.random_uniform(shape=(10, 10, nb_filters_3), minval=0.1, maxval=0.3))
#     conv3 = parametric_relu(conv3, conv3_prelu_W)
#     # Pooling. Input = 10x10x16. Output = 5x5x16.
#     conv3 = tf.nn.avg_pool(conv3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    
#     # Flatten. Input = 5x5xnb_filters_n. Output = 400.
#     fc0   = flatten(conv3)
#     nb_fc1_in = 5 * 5 * nb_filters_3
    
#     # Layer 3: Fully Connected. Input = 400. Output = 240.
#     fc1_W = tf.Variable(tf.truncated_normal(shape=(nb_fc1_in, 240), mean = mu, stddev = sigma))
#     fc1_b = tf.Variable(tf.zeros(240))
#     fc1   = tf.matmul(fc0, fc1_W) + fc1_b
#     # Dropout
#     fc1   = tf.nn.dropout(fc1, keep_prob=keep_prob_fc) 
#     # Activation.
#     fc1_prelu_W = tf.Variable(tf.random_uniform(shape=[240], minval=0.1, maxval=0.3))
#     fc1 = parametric_relu(fc1, fc1_prelu_W)
    
#     # Layer 4: Fully Connected. Input = 240. Output = 120.
#     fc2_W  = tf.Variable(tf.truncated_normal(shape=(240, 120), mean = mu, stddev = sigma))
#     fc2_b  = tf.Variable(tf.zeros(120))
#     fc2    = tf.matmul(fc1, fc2_W) + fc2_b
#     # Dropout
#     fc2   = tf.nn.dropout(fc2, keep_prob=keep_prob_fc)
#     # Activation.
#     fc2_prelu_W = tf.Variable(tf.random_uniform(shape=[120], minval=0.1, maxval=0.3))
#     fc2 = parametric_relu(fc2, fc2_prelu_W)
    
#     # Layer 5: Fully Connected. Input = 84. Output = 10.
#     fc3_W  = tf.Variable(tf.truncated_normal(shape=(120, 43), mean = mu, stddev = sigma))
#     fc3_b  = tf.Variable(tf.zeros(43))
#     logits = tf.matmul(fc2, fc3_W) + fc3_b
    
#     # L2-Regularization
#     l2 = tf.nn.l2_loss(conv1_W_1) + tf.nn.l2_loss(conv1_W_2) + tf.nn.l2_loss(conv1_W_3)
#     l2 = l2 + tf.nn.l2_loss(conv2_W_1) + tf.nn.l2_loss(conv2_W_2) + tf.nn.l2_loss(conv2_W_3)
#     l2 = l2 + tf.nn.l2_loss(conv3_W_1) + tf.nn.l2_loss(conv3_W_2) + tf.nn.l2_loss(conv3_W_3)
#     l2 = l2 + tf.nn.l2_loss(fc1_W)
#     l2 = l2 + tf.nn.l2_loss(fc2_W)
#     l2 = l2 + tf.nn.l2_loss(fc3_W)
    
#     return logits , l2

# #conv_net_le_test = LeNet_test(x)
# conv_net_le_deep, l2 = LeNet_test_deep(x)

In [None]:
def LeNet_test_final(x):    
    # Static input shape: height, width, channels.
    _, H, W, C = x.get_shape().as_list()
    print(H, W, C)
    
    # Mean and standard deviation for tf.truncated_normal weight initialization.
    mu = 0
    sigma = 0.1
    epsilon = 1e-3
    nb_filters_1 = 12
    nb_filters_2 = 24
    
    # Layer 1.1: 
    # Convolutional. Input = 32x32x3. Output = 28x28x12.
    conv1_W_1 = tf.Variable(tf.truncated_normal(shape=(5, 5, 3, nb_filters_1), mean = mu, stddev = sigma))
    conv1_b_1 = tf.Variable(tf.zeros(nb_filters_1))
    conv1_1   = tf.nn.conv2d(x, conv1_W_1, strides=[1, 1, 1, 1], padding='VALID') + conv1_b_1
    conv1_1   = tf.tanh(conv1_1)
    # Dropout
    conv1_1   = tf.nn.dropout(conv1_1, keep_prob=keep_prob_conv)
    
    # Pooling. Input = 28x28x12. Output = 14x14x12.
    conv1 = tf.nn.avg_pool(conv1_1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
    
    
    # Layer 2.1: 
    # Convolutional. Output = 10x10x24.
    conv2_W_1 = tf.Variable(tf.truncated_normal(shape=(5, 5, nb_filters_1, nb_filters_2), mean = mu, stddev = sigma))
    conv2_b_1 = tf.Variable(tf.zeros(nb_filters_2))
    conv2_1   = tf.nn.conv2d(conv1, conv2_W_1, strides=[1, 1, 1, 1], padding='VALID') + conv2_b_1
    conv2_1   = tf.tanh(conv2_1)
    # Dropout
    conv2_1   = tf.nn.dropout(conv2_1, keep_prob=keep_prob_conv)
    
    # Pooling. Input = 10x10x24. Output = 5x5x24.
    conv2 = tf.nn.avg_pool(conv2_1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
    
    # Flatten. Input = 5x5x24. Output = 600.
    fc0   = flatten(conv2)
    nb_fc1_in = 5 * 5 * nb_filters_2
    
    # Layer 3: Fully Connected. Input = 600. Output = 240.
    fc1_W = tf.Variable(tf.truncated_normal(shape=(nb_fc1_in, 240), mean = mu, stddev = sigma))
    fc1_b = tf.Variable(tf.zeros(240))
    fc1   = tf.matmul(fc0, fc1_W) + fc1_b
    # Dropout
    fc1   = tf.nn.dropout(fc1, keep_prob=keep_prob_fc) 
    # Activation.
    fc1 = tf.tanh(fc1)
    
    # Layer 4: Fully Connected. Input = 240. Output = 120.
    fc2_W  = tf.Variable(tf.truncated_normal(shape=(240, 120), mean = mu, stddev = sigma))
    fc2_b  = tf.Variable(tf.zeros(120))
    fc2    = tf.matmul(fc1, fc2_W) + fc2_b
    # Dropout
    fc2   = tf.nn.dropout(fc2, keep_prob=keep_prob_fc)
    # Activation.
    fc2 = tf.tanh(fc2)
    
    # Layer 5: Fully Connected. Input = 120. Output = 43.
    fc3_W  = tf.Variable(tf.truncated_normal(shape=(120, 43), mean = mu, stddev = sigma))
    fc3_b  = tf.Variable(tf.zeros(43))
    logits = tf.matmul(fc2, fc3_W) + fc3_b
    
    # L2-Regularization
    l2 = tf.nn.l2_loss(conv1_W_1)
    l2 = l2 + tf.nn.l2_loss(conv2_W_1)
    l2 = l2 + tf.nn.l2_loss(fc1_W)
    l2 = l2 + tf.nn.l2_loss(fc2_W)
    l2 = l2 + tf.nn.l2_loss(fc3_W)
    
    return logits , l2

#conv_net_le_test = LeNet_test(x)
conv_net_le_final, l2 = LeNet_test_final(x)

References: 
https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/


### Question 3

_What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.)  For reference on how to build a deep neural network using TensorFlow, see [Deep Neural Network in TensorFlow](https://classroom.udacity.com/nanodegrees/nd013/parts/fbf77062-5703-404e-b60c-95b78b2f3f9e/modules/6df7ae49-c61c-4bb2-a23e-6527e69209ec/lessons/b516a270-8600-4f93-a0a3-20dfeabe5da6/concepts/83a3a2a2-a9bd-4b7b-95b0-eb924ab14432) from the classroom._
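
As a rough check on the layer sizes in `LeNet_test_final`, the sketch below traces the spatial shape through the two convolution and pooling stages. It assumes 32x32 inputs, 5x5 'VALID' convolutions with stride 1 and 2x2 pooling with stride 2, as in the code above; the helper names `valid_output_size` and `trace_shapes` are made up for this illustration.

```python
def valid_output_size(size, kernel, stride=1):
    """Spatial output size of a 'VALID' (no padding) convolution or pooling window."""
    return (size - kernel) // stride + 1


def trace_shapes(height=32, width=32, filters=(12, 24)):
    """Return (layer name, (H, W, C)) tuples for the conv/pool stack."""
    shapes = []
    h, w = height, width
    for i, channels in enumerate(filters, start=1):
        # 5x5 convolution, stride 1, no padding
        h, w = valid_output_size(h, 5), valid_output_size(w, 5)
        shapes.append(("conv%d" % i, (h, w, channels)))
        # 2x2 average pooling, stride 2
        h, w = valid_output_size(h, 2, 2), valid_output_size(w, 2, 2)
        shapes.append(("pool%d" % i, (h, w, channels)))
    return shapes


shapes = trace_shapes()
for name, shape in shapes:
    print(name, shape)

last_h, last_w, last_c = shapes[-1][1]
flattened = last_h * last_w * last_c
print("flattened:", flattened)
```

Under these assumptions the flattened size matches `nb_fc1_in = 5 * 5 * nb_filters_2`, which feeds the fully connected layers of 240, 120 and 43 units.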


### 2.4 Training the model
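
The training cell in this section minimises the mean cross-entropy plus `beta` times the summed L2 penalty returned by the network. A minimal NumPy sketch of that quantity follows, for illustration only: `l2_loss` mirrors what `tf.nn.l2_loss` computes (`sum(w ** 2) / 2`), while `regularised_cost` and the toy inputs are hypothetical and not part of the notebook.

```python
import numpy as np


def l2_loss(w):
    """Same quantity as tf.nn.l2_loss: sum(w ** 2) / 2."""
    return np.sum(np.square(w)) / 2.0


def regularised_cost(logits, labels, weights, beta=0.0005):
    """Mean sparse softmax cross-entropy plus beta times the summed L2 penalty."""
    # Numerically stable log-softmax
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each example
    cross_entropy = -log_probs[np.arange(len(labels)), labels].mean()
    penalty = sum(l2_loss(w) for w in weights)
    return cross_entropy + beta * penalty


# Toy inputs: two examples, 43 classes, one 2x2 weight matrix of ones
logits = np.zeros((2, 43))
labels = np.array([3, 7])
weights = [np.ones((2, 2))]
cost = regularised_cost(logits, labels, weights)
print(cost)
```

With all-zero logits every class is equally likely, so the cross-entropy term is `log(43)` and the penalty term is `0.0005 * 2`.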

In [None]:
def generate_uniform(images, labels, num_class, num_per_class):
    X_tbr = np.empty([0, images.shape[1], images.shape[2], images.shape[3]], dtype=np.float32)
    y_tbr = np.empty([0], dtype=np.int32)
    for c in range(num_class):
        idx_all, = np.where(labels == c)
        idx = np.random.choice(idx_all, num_per_class)
        X_tbr = np.append(X_tbr, images[idx], axis=0)
        y_tbr = np.append(y_tbr, labels[idx], axis=0)
    
    return shuffle(X_tbr, y_tbr)

In [None]:
#pred = conv_net_le_test
pred = conv_net_le_final

pred_probs = tf.nn.softmax(pred)
# Training parameters
initial_learning_rate = 0.001
lr = initial_learning_rate
anneal_rate = 0.5
batch_size = 128
training_epochs = 50
# Epochs after which the learning rate is multiplied by anneal_rate.
# Defined after training_epochs because the last entry refers to it.
anneal_steps = [6, 15, 30, training_epochs]
pos = 0
n_train = len(X_train) * 2
num_per_class = int(n_train / n_classes)
print("Training Parameters: ")
print("  Initial Learning Rate: ", initial_learning_rate)
print("  Batch Size: ", batch_size)
print("  Epochs: ", training_epochs)
print("  Training set size: ", n_train)
print("  Number of samples per class: ", num_per_class)
display_step = 1

# Define cost function and optimizer
cost = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=pred, labels=y_rawlabels))

# Regularization
beta = 0.0005
cost = cost + beta * l2

learning_rate = tf.placeholder(tf.float32, shape=[])
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Function to initialise the variables
init = tf.global_variables_initializer()
# Launch the graph
sess = tf.Session()
# Initialise variables
sess.run(init)

# Initialise time logs
init_time = time.time()
epoch_accuracies = []
learning_rates = []
current_acc = 0.0
anneal_count = 0
max_epoch_count = 3
finish_count = 0
max_no_change_count = 3
req_change = 0.05
print("Training...")
lr = initial_learning_rate
# Training cycle
for epoch in range(training_epochs):
    avg_cost = 0.
    print("Epoch:", '%04d' % (epoch + 1)) 
    arg_im, arg_lab = generate_uniform(X_train, y_train, num_class=n_classes, num_per_class=num_per_class) 
    arg_im = perturb_set(arg_im, keep=0.8)
    #arg_lab = y_train
    n_train = len(arg_im)
    total_batch = int(n_train / batch_size)
    # Loop over all batches
    batch_avg = 0
    for i in range(total_batch):
        batch_x, batch_y = np.array(arg_im[i * batch_size:(i + 1) * batch_size]), \
                           np.array(arg_lab[i * batch_size:(i + 1) * batch_size])
        # Run optimization op (backprop) and cost op (to get loss value)
        _, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y_rawlabels: batch_y, learning_rate: lr, keep_prob_conv: 0.8, keep_prob_fc: 0.9})
        # Compute average loss
        avg_cost += c / total_batch
        batch_avg += c
        if((i+1) % 100 == 0):
            #print("  Loss: ", batch_avg/100)
            batch_avg=0
        
    # Display logs
    if epoch % display_step == 0:
        current_time = time.time()
        print("  Time: ", current_time - init_time)
        # Compare predicted class ids with the integer labels fed via y_rawlabels
        correct_prediction = tf.equal(tf.cast(tf.argmax(pred, 1), dtype=tf.int32), y_rawlabels)
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        with sess.as_default():
            epoch_accuracy = accuracy.eval({x: X_valid, y_rawlabels: y_valid, keep_prob_conv: 1, keep_prob_fc: 1})
            print("  Accuracy (validation):", epoch_accuracy)
            epoch_accuracies.append(epoch_accuracy)
            learning_rates.append(lr)
            current_acc = np.mean(epoch_accuracies[-5:])
            print("    5 Epoch Moving Average:", current_acc)
            
    if epoch >= anneal_steps[pos]:
        lr*=anneal_rate
        print("New learning rate: ", lr)
        pos+=1
        
print("Optimization Finished!")

curr_time = time.time()
training_time = curr_time - init_time

# Test model
y_rawpreds = tf.cast(tf.argmax(pred, 1), dtype=tf.int32)

correct_prediction = tf.equal(y_rawpreds, y_rawlabels)
# Calculate accuracy
# accuracy_train = tf.reduce_mean(tf.cast(correct_prediction, "float"))
# print("Accuracy (train):", accuracy_train.eval({x_unflattened: X_train, y_rawlabels: y_train}))
train_predict_time = time.time()
# print("Time to calculate accuracy on training set: ", train_predict_time - epoch_time)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))



In [None]:
from sklearn import metrics
import itertools

def plot_confusion_matrix(conf_arr, normed=True):
    if normed:
        norm_conf = conf_arr.astype('float') / conf_arr.sum(axis=1)[:, np.newaxis]
    else:
        norm_conf = conf_arr
    fig = plt.figure()
    plt.clf()
    ax = fig.add_subplot(111)
    ax.set_aspect(1)
    res = ax.imshow(np.array(norm_conf), cmap=plt.cm.jet, 
                interpolation='nearest')

    width, height = conf_arr.shape

    for x in range(width):
        for y in range(height):
            ax.annotate(str(conf_arr[x][y]), xy=(y, x), 
                    horizontalalignment='center',
                    verticalalignment='center')

    cb = fig.colorbar(res)
    alphabet = np.arange(width)
    plt.xticks(range(width), alphabet[:width])
    plt.yticks(range(height), alphabet[:height])
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.savefig('ConfusionMatrices/RES_LONG', format='png')

def plot_accuracy_learningrate(epoch_accuracies, learning_rates):
    plt.figure()
    f, axarr = plt.subplots(2, sharex=True)
    axarr[0].plot(range(len(epoch_accuracies)), epoch_accuracies, 'b-')
    axarr[0].set_title('accuracy')
    axarr[1].plot(range(len(learning_rates)), learning_rates, 'g-')
    axarr[1].set_title('learning rate')
    plt.show()
    
# Line below needed only when not using `with tf.Session() as sess`
with sess.as_default():
    train_predict_time = time.time()
    [final_accuracy, y_pred, correct] = sess.run([accuracy, y_rawpreds, correct_prediction], 
                                                 feed_dict = {x: X_test, y_rawlabels: y_test, keep_prob_conv: 1, keep_prob_fc: 1})
    current_time = time.time()
    #print("Accuracy (test):", accuracy.eval({x: X_test, y_rawlabels: y_test}))
    print("Final Accuracy:", final_accuracy)
    print("Time to evaluate test set: ", current_time - train_predict_time)
    print("Training Time: ", training_time)
    print("Precision:", metrics.precision_score(y_test, y_pred, average='macro'))
    print("Recall:", metrics.recall_score(y_test, y_pred, average='macro'))
    print("F1-score:", metrics.f1_score(y_test, y_pred, average='macro'))
    conf_m = metrics.confusion_matrix(y_test, y_pred)
    print("Confusion_matrix:")
    plot_confusion_matrix(conf_m, normed=True)
    plot_accuracy_learningrate(epoch_accuracies, learning_rates)
    #fpr, tpr, tresholds = metrics.roc_curve(y_test, y_pred)



### Question 4

_How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)_
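
The learning-rate annealing used in the training loop above can be sketched in isolation. The function below is a hypothetical helper, not part of the notebook; it reproduces the loop's logic with the same values (initial rate 0.001, halved once the epoch index reaches 6, 15 and 30) and returns the rate used in each epoch.

```python
def annealed_learning_rates(initial_lr, anneal_rate, anneal_steps, epochs):
    """Return the learning rate used in each epoch of a step-annealed schedule."""
    rates = []
    lr = initial_lr
    pos = 0
    for epoch in range(epochs):
        rates.append(lr)
        # After finishing an epoch listed in anneal_steps, shrink the rate
        if pos < len(anneal_steps) and epoch >= anneal_steps[pos]:
            lr *= anneal_rate
            pos += 1
    return rates


rates = annealed_learning_rates(0.001, 0.5, [6, 15, 30, 50], 50)
for epoch in (0, 7, 16, 31):
    print(epoch, rates[epoch])
```

The extra `pos < len(anneal_steps)` guard is an addition for safety; with the values above the original loop never runs past the end of the list.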


### Question 5


_What approach did you take in coming up with a solution to this problem?_

---
## Step 3: Test a Model on New Images

Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.

You may find `signnames.csv` useful as it contains mappings from the class id (integer) to the actual sign name.
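
A minimal sketch of that mapping, using only the standard library: the three sample rows stand in for the real file, and the `ClassId` column name is an assumption (the `SignName` column is the one read later in this notebook). `load_sign_names` is a hypothetical helper.

```python
import csv
import io

# Three sample rows standing in for signnames.csv; the real file has one row per class.
sample_csv = io.StringIO(
    "ClassId,SignName\n"
    "0,Speed limit (20km/h)\n"
    "12,Priority road\n"
    "40,Roundabout mandatory\n"
)


def load_sign_names(handle):
    """Build a {class id: sign name} dictionary from an open CSV file."""
    return {int(row["ClassId"]): row["SignName"] for row in csv.DictReader(handle)}


names = load_sign_names(sample_csv)
print(names[40])
```

With the real file, the same function could be called as `load_sign_names(open("signnames.csv"))`, provided the column names match.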

### Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

### 3.1 New Images

### Question 6

_Choose five candidate images of traffic signs and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult? It would be helpful to plot the images in the notebook._

**Answer:**

(Special characteristics of images are noted in the comments)

In [None]:
# Helper function to read image copied from lane lines project
def read_image_and_print_dims(image_path):
    """Reads and returns image.
    Helper function to examine how an image is represented.
    """
    #reading in an image
    image = mpimg.imread(image_path)
    #printing out some stats and plotting
    print('This image is:', type(image), 'with dimensions:', image.shape)
    plt.imshow(image)  #call as plt.imshow(gray, cmap='gray') to show a grayscaled image
    return image

In [None]:
# This sign is not in English. It is a stop sign.
# There are multiple signs in the picture. 
# What will the model attempt to recognise?
japanese_sign = read_image_and_print_dims('traffic-sign-data/japanese-sign.jpg')

In [None]:
# There is other intervening text in the image.
# This sign is shown at an angle.
german_sign = read_image_and_print_dims('traffic-sign-data/german-sign.jpg')

In [None]:
# This sign is quite clear.
two_way_sign = read_image_and_print_dims('traffic-sign-data/two-way-sign.jpg')

In [None]:
# Can the model recognise the 20 km/h sign as a speed limit sign
# even though it has different background colour, different shape
# and additional 'km/h' text?
speed_limit_stop = read_image_and_print_dims('traffic-sign-data/speed-limit-stop.JPG')

In [None]:
# What will the model think this is?
shark_sign = read_image_and_print_dims('traffic-sign-data/shark-sign.jpg')

### 3.2 The Model's Predictions on the New Images

In [None]:
### Run the predictions here.
### Feel free to use as many code cells as needed.
def predict(img):
    """Print model's prediction of which traffic sign this image is."""
    # Feed the same placeholders the network was built with (x, keep_prob_conv, keep_prob_fc)
    classification = sess.run(tf.argmax(pred, 1), feed_dict={x: [img], keep_prob_conv: 1, keep_prob_fc: 1})
    print(classification)
    print('NN predicted', classification[0])

In [None]:
def show_and_pred_X_train(index):
    """Show image from training set and print model's prediction 
    (of which traffic sign this image is).
    """
    plt.imshow(X_train[index])
    predict(X_train[index])

In [None]:
def show_and_pred_image(image):
    """Show image and print model's prediction (of which traffic 
    sign this image is).
    """
    plt.imshow(image)
    predict(image)

In [None]:
def read_show_and_pred_image(image_path):
    """Read image, show image and print model's prediction (of 
    which traffic sign this image is).
    """
    # Read in image from file
    image = mpimg.imread(image_path)
    # Show image
    # Call as plt.imshow(gray, cmap='gray') to show a grayscaled image
    plt.imshow(image) 
    predict(image)
    return image

In [None]:
show_and_pred_X_train(40)

In [None]:
def read_show_and_pred_image_tsdata(image_name):
    """Read image from dir `traffic-sign-data`, show image and print model's prediction (of 
    which traffic sign this image is).
    """
    return read_show_and_pred_image('traffic-sign-data/' + image_name) 

In [None]:
japanese_sign = read_show_and_pred_image_tsdata("japanese_sign_resized.png")

This is a Japanese stop sign, though it looks like a Yield sign.
The network predicts this is a Roundabout Mandatory sign, which is completely different.

In [None]:
german_sign = read_show_and_pred_image_tsdata("german_sign_resized.png")

This is a no parking zone sign. The network predicts this is a 'Right-of-way at the next intersection' sign. They are not similar.

In [None]:
two_way_sign = read_show_and_pred_image_tsdata("two_way_sign_resized.png")

The network predicts this is a Go straight or left sign. The two are similar in that both signs contain exactly two arrows.

In [None]:
speed_limit_stop = read_show_and_pred_image_tsdata("speed_limit_stop_resized.png")

The network predicts this is a roundabout mandatory sign (40). This is wrong: it should be the 20 km/h speed limit (0). The network may have been confused by the many curves that make up the sign.

In [None]:
shark_sign = read_show_and_pred_image_tsdata("shark_sign_resized.png")

The network predicts this is a roundabout mandatory sign. This is wrong, but then there is no correct class within the 43 for this sign. 
* It is unclear why the network chose roundabout mandatory of all classes. 
    * There are not many curved arrows: the black portion of the sign is small and is close to a short horizontal line segment in the middle of the sign.
    * The diamond-shaped sign could have indicated Priority Road (12).

### Question 7

_Is your model able to perform equally well on captured pictures or a live camera stream when compared to testing on the dataset?_


**Answer:**

(Answer applies to captured pictures)

No, it does not perform equally well on captured images. It achieves 0% accuracy on the captured images, as opposed to 79% on the test set.

* The images not included in the dataset are not exactly the same road signs, so there is additional difficulty because the model needs to generalise well to classify these new signs correctly.
* Some road signs such as the shark sign may not even be included in the 43 categories.
* The images are also processed (e.g. cropped) differently.

It seems that the model is classifying 'unknown signs' as Roundabout Mandatory signs.
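
One processing difference that may matter: `matplotlib.image.imread` returns PNG files as float arrays in the range 0 to 1, sometimes with a fourth alpha channel, while other formats come back as integer arrays. Whether this explains the predictions above is not established. The sketch below is a hypothetical helper (`to_model_input`) that converts an arbitrary image to a 32x32x3 `uint8` array with nearest-neighbour sampling. It assumes the network was trained on 0-255 RGB values; if the notebook normalises the training data earlier, the same normalisation would have to be applied here too.

```python
import numpy as np


def to_model_input(image, size=32):
    """Convert an image array to size x size x 3 uint8 (assumed training format)."""
    image = np.asarray(image)
    if image.ndim == 2:
        # Grayscale: repeat the single channel three times
        image = np.stack([image] * 3, axis=-1)
    # Drop an alpha channel if present
    image = image[..., :3]
    if np.issubdtype(image.dtype, np.floating):
        # Assumes float images are in [0, 1], as returned for PNG files
        image = np.clip(image, 0.0, 1.0) * 255.0
    # Nearest-neighbour resize by index sampling
    rows = np.arange(size) * image.shape[0] // size
    cols = np.arange(size) * image.shape[1] // size
    resized = image[rows][:, cols]
    return resized.astype(np.uint8)


# Example with a synthetic RGBA float image, similar to what a PNG read would return
rgba = np.random.rand(64, 48, 4).astype(np.float32)
prepared = to_model_input(rgba)
print(prepared.shape, prepared.dtype)
```

Nearest-neighbour sampling keeps the example dependency-free; a proper image library would give smoother results when shrinking large photographs.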

Reference for images of correct German signage: http://www.gettingaroundgermany.info/zeichen.shtml

### 3.3 Visualising the certainty of the model's predictions

In [None]:
### Visualize the softmax probabilities here.
### Feel free to use as many code cells as needed.

def certainty_of_predictions(img):
    """Return model's top five choices for what traffic sign 
    this image is and its confidence in its predictions.
    """
    # Feed the same placeholders the network was built with (x, keep_prob_conv, keep_prob_fc)
    top_five = sess.run(tf.nn.top_k(tf.nn.softmax(pred), k=5), feed_dict={x: [img], keep_prob_conv: 1, keep_prob_fc: 1})
    print("Top five: ", top_five)
    return top_five

In [None]:
def show_and_pred_certainty_image(image):
    plt.imshow(image)
    return certainty_of_predictions(image)

In [None]:
def show_and_pred_certainty_X_train(index):
    """Show image from training set and print model's certainty of its 
    prediction (of which traffic sign this image is).
    """
    plt.imshow(X_train[index])
    return certainty_of_predictions(X_train[index])

In [None]:
sign_names = pd.read_csv("signnames.csv")
sign_names.head()

In [None]:
def plot_certainty_arrays(probabilities, labels):
    """Plot model's probabilities (y) and traffic sign labels (x) 
    in a bar chart.
    """
    y_pos = np.arange(len(labels))

    plt.bar(y_pos, probabilities, align='center', alpha=0.5)
    plt.xticks(y_pos, labels)
    plt.ylabel('Probability')
    plt.xlabel('Traffic sign')
    plt.title('Model\'s certainty of its predictions')

    plt.show()
    print("Traffic Sign Key")
    for label in labels:
        print(label, ": ", sign_names.loc[label]['SignName'])
    

In [None]:
show_and_pred_certainty_X_train(40)

In [None]:
plot_certainty_arrays([ 1.,  0.,  0.,  0.,  0.], [0, 1, 2, 3, 4])

In [None]:
japanese_sign_certainties = show_and_pred_certainty_image(japanese_sign)

In [None]:
japanese_sign_certainties[1][0]


In [None]:
plot_certainty_arrays(japanese_sign_certainties[0][0],
                      japanese_sign_certainties[1][0])

In [None]:
german_sign_certainties = show_and_pred_certainty_image(german_sign)

In [None]:
plot_certainty_arrays(german_sign_certainties[0][0], 
                      german_sign_certainties[1][0])

In [None]:
two_way_sign_certainties = show_and_pred_certainty_image(two_way_sign)

In [None]:
plot_certainty_arrays(two_way_sign_certainties[0][0], two_way_sign_certainties[1][0])

In [None]:
speed_limit_stop_certainties = show_and_pred_certainty_image(speed_limit_stop)

In [None]:
plot_certainty_arrays(speed_limit_stop_certainties[0][0],
                      speed_limit_stop_certainties[1][0])

In [None]:
shark_sign_certainties = show_and_pred_certainty_image(shark_sign)

In [None]:
plot_certainty_arrays(shark_sign_certainties[0][0], shark_sign_certainties[1][0])

### Question 8

*Use the model's softmax probabilities to visualize the **certainty** of its predictions, [`tf.nn.top_k`](https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#top_k) could prove helpful here. Which predictions is the model certain of? Uncertain? If the model was incorrect in its initial prediction, does the correct prediction appear in the top k? (k should be 5 at most)*


**Answer:**

(see code above)

* The model is certain of all of its predictions even though some are wrong. 
* The model also confidently predicts different outcomes across the two runs I made on each sign. 

These are both strange outcomes.
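
To make "certain" measurable, the following NumPy sketch computes softmax probabilities, the top five classes and the entropy of the distribution for two made-up logit vectors. The functions and logits are illustrative assumptions, not output from the model. The example shows how logits of large magnitude push the softmax towards a one-hot distribution, which reads as high certainty whether or not the prediction is right.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def top_k(probs, k=5):
    """Return the k largest probabilities and their class indices."""
    order = np.argsort(probs)[::-1][:k]
    return probs[order], order


def entropy(probs):
    """Shannon entropy in nats; 0 means all mass is on one class."""
    nonzero = probs[probs > 0]
    return float(-(nonzero * np.log(nonzero)).sum())


# Made-up logits for 43 classes: class 40 slightly ahead, then the same pattern scaled up
logits_small = np.zeros(43)
logits_small[40] = 2.0
logits_large = logits_small * 20

for logits in (logits_small, logits_large):
    probs = softmax(logits)
    values, indices = top_k(probs)
    print("top class:", indices[0], "probability:", values[0], "entropy:", entropy(probs))
```

Both vectors rank class 40 first, but only the scaled one gives it nearly all of the probability mass, so an entropy close to zero is a sign of saturation rather than of correctness.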


### Question 9
_If necessary, provide documentation for how an interface was built for your model to load and classify newly-acquired images._


**Answer:**
Not applicable at the moment.

> **Note**: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the iPython Notebook as an HTML document. You can do this by using the menu above and navigating to **File -> Download as -> HTML (.html)**. Include the finished document along with this notebook as your submission.

In [None]:
# Close the current session.
sess.close()