# Self-Driving Car Engineer Nanodegree

## Deep Learning

## Project: Build a Traffic Sign Recognition Classifier

In this notebook, a template is provided for you to implement your functionality in stages, which is required to successfully complete this project. If additional code is required that cannot be included in the notebook, be sure that the Python code is successfully imported and included in your submission. Sections that begin with **'Implementation'** in the header indicate where you should begin your implementation for your project. Note that some sections of implementation are optional and will be marked with **'Optional'** in the header.

In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a **'Question'** header. Carefully read each question and provide thorough answers in the following text boxes that begin with **'Answer:'**. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.

>**Note:** Code and Markdown cells can be executed using the **Shift + Enter** keyboard shortcut. In addition, Markdown cells can typically be edited by double-clicking the cell to enter edit mode.

---
## Step 0: Load The Data

In [None]:
# Load pickled data
import pickle

training_file = ".\\data\\train.p"
testing_file = ".\\data\\test.p"

with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
    
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']

---

## Step 1: Dataset Summary & Exploration

The pickled data is a dictionary with 4 key/value pairs:

- `'features'` is a 4D array containing raw pixel data of the traffic sign images, (num examples, width, height, channels).
- `'labels'` is a 1D array containing the label/class id of the traffic sign. The file `signnames.csv` contains id -> name mappings for each id.
- `'sizes'` is a list containing tuples, (width, height) representing the original width and height of the image.
- `'coords'` is a list containing tuples, (x1, y1, x2, y2) representing coordinates of a bounding box around the sign in the image. **THESE COORDINATES ASSUME THE ORIGINAL IMAGE. THE PICKLED DATA CONTAINS RESIZED VERSIONS (32 by 32) OF THESE IMAGES**
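
Because the bounding boxes refer to the original images, they need rescaling before they can be used with the 32 by 32 data. The following is a minimal sketch of that rescaling. It assumes the images were resized by plain proportional scaling with no cropping or padding, which the dataset description does not state; `scale_bbox` is a hypothetical helper and is not part of the project template.

```python
def scale_bbox(coords, original_size, target_size=(32, 32)):
    """Map a bounding box from the original image to the resized image.

    coords: (x1, y1, x2, y2) in original-image pixels.
    original_size: (width, height) of the original image.
    target_size: (width, height) of the resized image (32 by 32 here).
    Assumes a plain proportional resize without cropping or padding.
    """
    x1, y1, x2, y2 = coords
    orig_w, orig_h = original_size
    target_w, target_h = target_size
    sx = target_w / orig_w  # horizontal scale factor
    sy = target_h / orig_h  # vertical scale factor
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


# Hypothetical example: a 64x128 image with a box from (8, 16) to (56, 112).
print(scale_bbox((8, 16, 56, 112), (64, 128)))
```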

Complete the basic data summary below.

In [None]:
### Replace each question mark with the appropriate value.

import numpy as np

n_train = X_train.shape[0]

n_test = X_test.shape[0]

image_shape = (X_train.shape[1], X_train.shape[2])

# Count classes from the training labels rather than the test labels.
n_classes = np.unique(y_train).size

print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
print("Label shape =", y_train.shape)

Visualize the German Traffic Signs Dataset using the pickled file(s). This is open ended; suggestions include plotting traffic sign images, plotting the count of each sign, etc.

The [Matplotlib](http://matplotlib.org/) [examples](http://matplotlib.org/examples/index.html) and [gallery](http://matplotlib.org/gallery.html) pages are a great resource for doing visualizations in Python.

**NOTE:** It's recommended you start with something simple first. If you wish to do more, come back to it after you've completed the rest of the sections.

In [None]:
### Source for genfromtxt: https://carnd-forums.udacity.com/questions/18451906/answers/18451962, Author: Bharat Ramanathan.

import numpy as np
signname_bytes = np.genfromtxt('signnames.csv', delimiter=',', skip_header=1, usecols=(1,), unpack=True, dtype=None)

signname_map = {}
for i in range(len(signname_bytes)):
    # Each entry is a bytes value; decode it directly (tostring() is deprecated in NumPy).
    s = signname_bytes[i].decode("utf-8")
    signname_map[i] = s

In [None]:
### Data exploration visualization goes here.
### Feel free to use as many code cells as needed.
import matplotlib.pyplot as plt
# Visualizations will be shown in the notebook.
%matplotlib inline

# Histogram of training labels.
def plot_histogram(y, bins, titleprefix):
    plt.hist(y, bins)
    plt.xlabel(titleprefix + ' Label')
    plt.ylabel('Count')
    plt.title('Histogram of ' + titleprefix + ' Labels')
    
plot_histogram(y_train, n_classes, 'Training')

[hist, bin_edges] = np.histogram(y_train, n_classes)

for i in range(len(hist)):
    print('{}, {}, {}'.format(i, signname_map[i], hist[i]))

In [None]:
# Draw some training examples.
def draw_image(image, title, fignum):
    plt.figure(fignum)
    plt.imshow(image)
    plt.title(title)

def draw_sign(label_to_draw, fignum, desired_count):
    count = 0    
    for i in range(len(X_train)):
        if y_train[i] == label_to_draw:
            count += 1
            draw_image(X_train[i], 'Label={},Name={},Index={}'.format(str(label_to_draw), signname_map[label_to_draw], str(i)), fignum + count)
            if count >= desired_count:
                return

# draw_sign(2, 1, 3)
        
for label in range(7):
    draw_sign(label, fignum=label, desired_count=1)

----

## Step 2: Design and Test a Model Architecture

Design and implement a deep learning model that learns to recognize traffic signs. Train and test your model on the [German Traffic Sign Dataset](http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset).

There are various aspects to consider when thinking about this problem:

- Neural network architecture
- Play around with preprocessing techniques (normalization, RGB to grayscale, etc.)
- Number of examples per label (some have more than others).
- Generate fake data.

Here is an example of a [published baseline model on this problem](http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf). It's not required to be familiar with the approach used in the paper, but it's good practice to try to read papers like these.

**NOTE:** The LeNet-5 implementation shown in the [classroom](https://classroom.udacity.com/nanodegrees/nd013/parts/fbf77062-5703-404e-b60c-95b78b2f3f9e/modules/6df7ae49-c61c-4bb2-a23e-6527e69209ec/lessons/601ae704-1035-4287-8b11-e2c2716217ad/concepts/d4aca031-508f-4e0b-b493-e7b706120f81) at the end of the CNN lesson is a solid starting point. You'll have to change the number of classes and possibly the preprocessing, but aside from that it's plug and play!

### Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [None]:
### Preprocess the data here.
### Feel free to use as many code cells as needed.

### Question 1 

_Describe how you preprocessed the data. Why did you choose that technique?_

### Answer:

I normalized the image data with Min-Max scaling to a range of [0.1, 0.9].
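
As a small worked example of this scaling, the sketch below maps individual 8-bit pixel values into [0.1, 0.9]. It uses plain Python so it runs without NumPy; `min_max_scale` is an illustrative name, and the input range of 0 to 255 is assumed from the 8-bit image data.

```python
def min_max_scale(value, a=0.1, b=0.9, v_min=0, v_max=255):
    """Linearly map value from [v_min, v_max] to [a, b]."""
    return a + (value - v_min) * (b - a) / (v_max - v_min)


# The darkest pixel maps to the lower bound, the brightest to the upper bound.
for pixel in (0, 128, 255):
    print(pixel, round(min_max_scale(pixel), 4))
```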

I wanted to let the neural network learn whatever processing was appropriate for classifying the images. I considered performing YUV-conversion and Y-channel global and local contrast normalization as described in [1], but I decided to simply add a color-space transformation layer as described in [2]. This let the neural network figure out the best color-space transformation for this classification problem.
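
To illustrate what such a layer computes, here is a minimal sketch of a 1x1 convolution applied to a single RGB pixel: a learned 3x3 linear map plus bias, followed by ReLU. The weights below are illustrative assumptions only (an identity map, and a map whose first output channel uses the standard BT.601 luma coefficients); the weights the network actually learned are not recorded in this notebook.

```python
def pointwise_color_transform(pixel, weights, bias):
    """Apply a 1x1 convolution to one pixel: ReLU(pixel . W + bias).

    weights[i][j] connects input channel i to output channel j,
    mirroring the [in_depth, out_depth] part of a TensorFlow filter.
    """
    out = []
    for j in range(len(bias)):
        total = bias[j]
        for i, channel in enumerate(pixel):
            total += channel * weights[i][j]
        out.append(max(0.0, total))  # ReLU activation
    return out


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Illustrative only: output channel 0 becomes a luma-like weighted sum of R, G, B.
luma_first = [[0.299, 0.0, 0.0], [0.587, 1.0, 0.0], [0.114, 0.0, 1.0]]
zero_bias = [0.0, 0.0, 0.0]

pixel = [0.9, 0.5, 0.1]  # one normalized RGB pixel
print(pointwise_color_transform(pixel, identity, zero_bias))
print(pointwise_color_transform(pixel, luma_first, zero_bias))
```

With identity weights the layer passes colors through unchanged, so the network can fall back to plain RGB if no other transformation helps.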

[1] Pierre Sermanet and Yann LeCun, Traffic Sign Recognition with Multi-Scale Convolutional Networks, URL: http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf. Retrieved 1/19/2017.

[2] Alexandros Karargyris, Color Space Transformation Network, URL: https://arxiv.org/ftp/arxiv/papers/1511/1511.01064.pdf. Retrieved 1/19/2017.

In [None]:
### Author: Vivek Yadav
### https://github.com/vxy10/ImageAugmentation

import cv2
import numpy as np

def augment_brightness_camera_images(image):
    image1 = cv2.cvtColor(image,cv2.COLOR_RGB2HSV)
    random_bright = .25+np.random.uniform()
    #print(random_bright)
    # Clip after scaling so that brightened pixels do not wrap around in uint8.
    image1[:,:,2] = np.clip(image1[:,:,2]*random_bright, 0, 255)
    image1 = cv2.cvtColor(image1,cv2.COLOR_HSV2RGB)
    return image1

def transform_image(img,ang_range,shear_range,trans_range,brightness=0):
    '''
    This function transforms images to generate new images.
    The function takes in following arguments,
    1- Image
    2- ang_range: Range of angles for rotation
    3- shear_range: Range of values to apply affine transform to
    4- trans_range: Range of values to apply translations over.

    A Random uniform distribution is used to generate different parameters for transformation

    '''
    # Rotation
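    # Sample uniformly from [-ang_range/2, ang_range/2], like the translation and shear below.
    # (np.random.uniform(ang_range) would treat ang_range as the lower bound instead.)
    ang_rot = ang_range*np.random.uniform()-ang_range/2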
    rows,cols,ch = img.shape    
    Rot_M = cv2.getRotationMatrix2D((cols/2,rows/2),ang_rot,1)

    # Translation
    tr_x = trans_range*np.random.uniform()-trans_range/2
    tr_y = trans_range*np.random.uniform()-trans_range/2
    Trans_M = np.float32([[1,0,tr_x],[0,1,tr_y]])

    # Shear
    pts1 = np.float32([[5,5],[20,5],[5,20]])

    pt1 = 5+shear_range*np.random.uniform()-shear_range/2
    pt2 = 20+shear_range*np.random.uniform()-shear_range/2

    pts2 = np.float32([[pt1,5],[pt2,pt1],[5,pt2]])

    shear_M = cv2.getAffineTransform(pts1,pts2)

    img = cv2.warpAffine(img,Rot_M,(cols,rows))
    img = cv2.warpAffine(img,Trans_M,(cols,rows))
    img = cv2.warpAffine(img,shear_M,(cols,rows))

    # Brightness
    if brightness == 1:
        img = augment_brightness_camera_images(img)

    return img

In [None]:
def mutate_image(image):
    return transform_image(image, 10, 4, 2, brightness=1)
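
The translation and shear offsets in `transform_image` are drawn as `range * U - range / 2`, where `U` is uniform on [0, 1), so each offset is centered on zero. The sketch below shows that sampling pattern for the ranges passed by `mutate_image` (10, 4, 2). It uses the standard `random` module instead of NumPy, and `sample_centered` and `sample_jitter` are hypothetical helpers, not part of the augmentation code.

```python
import random


def sample_centered(value_range, rng=random):
    """Draw uniformly from [-value_range / 2, value_range / 2)."""
    return value_range * rng.random() - value_range / 2


def sample_jitter(ang_range=10, shear_range=4, trans_range=2, rng=random):
    """Sample one set of zero-centered jitter parameters."""
    return {
        "angle": sample_centered(ang_range, rng),
        "shear": sample_centered(shear_range, rng),
        "shift_x": sample_centered(trans_range, rng),
        "shift_y": sample_centered(trans_range, rng),
    }


rng = random.Random(0)  # seeded only to make the sketch repeatable
print(sample_jitter(rng=rng))
```

Narrow ranges like these keep the generated images close to the originals.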

In [None]:
[hist, bin_edges] = np.histogram(y_train, n_classes)
print(hist)

In [None]:
from sklearn.model_selection import train_test_split

# Split training data into training and validation sets.
X_train, X_validation, y_train, y_validation = train_test_split(X_train, y_train, test_size=0.1, random_state=42)

print(X_train.shape)
print(y_train.shape)
print(X_validation.shape)
print(y_validation.shape)

In [None]:
### Generate additional data.

import random

def generate_extra_data(X, y, extra):
    total_train_size = X.shape[0] * (1 + extra)

    JX = np.zeros([total_train_size, X.shape[1], X.shape[2], X.shape[3]], np.uint8)
    Jy = np.zeros([total_train_size], np.uint8)

    J_index = 0
    total_generated = 0
    
    for i in range(X.shape[0]):
        curx = X[i]
        cury = y[i]
        
        base = (extra + 1) * i
        JX[base] = curx
        Jy[base] = cury
        
        for x in range(extra):
            offset = x + 1
            JX[base + offset] = mutate_image(curx)
            Jy[base + offset] = cury
            total_generated += 1

    print('Done generating. total_generated=', total_generated)
    
    return [JX, Jy]

def generate_extra_data_min(X, y, min_train_size):
    total_train_size = 0

    [hist, bin_edges] = np.histogram(y, n_classes)

    for count in hist:
        if (count > min_train_size):
            total_train_size += count
        else:
            total_train_size += min_train_size

    JX = np.zeros([total_train_size, X.shape[1], X.shape[2], X.shape[3]], np.uint8)
    Jy = np.zeros([total_train_size], np.uint8)

    J_index = 0
    total_generated = 0
    for label in range(n_classes):
        label_examples = np.where(y == label)[0]

        # Copy over existing training examples.
        for i in range(label_examples.size):
            JX[J_index] = X[label_examples[i]]
            Jy[J_index] = y[label_examples[i]]
            J_index += 1

        # Generate mutated examples if needed.
        if label_examples.size < min_train_size:
            generate_count = min_train_size - label_examples.size

            for i in range(generate_count):
                chosen = label_examples[random.randrange(label_examples.size)]
                JX[J_index] = mutate_image(X[chosen])
                Jy[J_index] = label
                J_index += 1

            total_generated += generate_count

    print('Done generating. total_generated=', total_generated)
    
    return [JX, Jy]
    
[JX_train, Jy_train] = generate_extra_data(X_train, y_train, 10)
#[JX_train, Jy_train] = generate_extra_data_min(X_train, y_train, 10000)

In [None]:
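# Cast to a signed type first; subtracting uint8 arrays directly wraps around.
print(np.sum(np.abs(JX_train[0].astype(np.int32) - JX_train[4].astype(np.int32))))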

# Examine generated data.
#for i in range(10):
#    draw_image(JX_train[i], 'blah', i)

### Question 2

_Describe how you set up the training, validation and testing data for your model. **Optional**: If you generated additional data, how did you generate the data? Why did you generate the data? What are the differences in the new dataset (with generated data) from the original dataset?_

### Answer:

#### Splitting Training, Validation, and Testing Data
I set aside the test data (X_test, y_test) and did not touch it during my model development and neural-network tweaking.

I split the original training data (X_train, y_train) into 2 sets: a smaller training set (X_train, y_train) and a validation set (X_validation, y_validation). 10% of the original training data was randomly selected and separated as the validation data.
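
The idea behind a random 90/10 holdout can be sketched in plain Python as follows. This illustrates the concept only and is not scikit-learn's implementation; `holdout_split` is a hypothetical helper, and its rounding of the validation size may differ from `train_test_split`.

```python
import random


def holdout_split(items, labels, validation_fraction=0.1, seed=42):
    """Randomly split paired items/labels into training and validation sets."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # seeded for a repeatable split
    n_val = int(round(len(items) * validation_fraction))
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    train = ([items[i] for i in train_idx], [labels[i] for i in train_idx])
    val = ([items[i] for i in val_idx], [labels[i] for i in val_idx])
    return train, val


# Toy data: 100 placeholder "images" with labels cycling through 0..4.
toy_items = ["img{}".format(i) for i in range(100)]
toy_labels = [i % 5 for i in range(100)]
(train_x, train_y), (val_x, val_y) = holdout_split(toy_items, toy_labels)
print(len(train_x), len(val_x))
```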

In [None]:
plot_histogram(y_train, n_classes, 'Training')

### Answer (continued):
#### Generating Additional Data
A histogram of the training labels shows a non-uniform distribution (see the above cell). Some of the classes have as few as 200 examples. To give my model plenty of data to learn from and to make it more robust against pictures of traffic signs at angles and with poor lighting, I generated additional data.

I generated additional data using a helper function "transform_image" written by Vivek Yadav found at [3]. It generates a new image using OpenCV to rotate, shear, translate, and augment the brightness of an existing image. I found that too much jitter reduced my validation accuracy. This was probably because excessive jitter results in training data that are so different from the original data that they mislead the model. I experimented with different ranges of rotation, shear, and translation until validation accuracy was actually better than without the additional data.

I tried two approaches: 
1. Scale up the count of training examples by some factor, thereby keeping the above non-uniform distribution.
2. Increase the count of training examples to cause a uniform distribution of classes.

The first approach resulted in a better validation accuracy.
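
To compare how the two approaches change the dataset size, here is a small sketch using made-up class counts. The function names are hypothetical and the counts are illustrative, not the real label histogram. Classes that are absent from the labels are ignored here, whereas `generate_extra_data_min` iterates over all `n_classes`.

```python
from collections import Counter


def size_after_scaling(labels, extra):
    """Approach 1: keep every example and add `extra` mutated copies of each."""
    return len(labels) * (1 + extra)


def size_after_balancing(labels, min_per_class):
    """Approach 2: top up each class to at least `min_per_class` examples."""
    counts = Counter(labels)
    return sum(max(count, min_per_class) for count in counts.values())


# Made-up labels: class 0 has 200 examples, class 1 has 2000.
toy_labels = [0] * 200 + [1] * 2000
print(size_after_scaling(toy_labels, extra=10))
print(size_after_balancing(toy_labels, min_per_class=1000))
```

Scaling preserves the 1:10 ratio between the two toy classes, while balancing shrinks it to 1:2.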

[3] Vivek Yadav, Image Augmentation, URL: https://github.com/vxy10/ImageAugmentation, Retrieved: 1/20/2017.

In [None]:
# Use jittered training data as training data.
[X_train, y_train] = [JX_train, Jy_train]

In [None]:
import tensorflow as tf

MAX_EPOCHS = 200
BATCH_SIZE = 128

In [None]:
### Define your architecture here.
### Feel free to use as many code cells as needed.

from tensorflow.contrib.layers import flatten

def conv_activation(x, filter_height, filter_width, in_depth, out_depth, mean, stddev):
    h = (tf.nn.conv2d(x, 
                      tf.Variable(tf.truncated_normal([filter_height, filter_width, in_depth, out_depth], mean = mean, stddev = stddev)), 
                      [1, 1, 1, 1], 
                      'VALID') 
              + tf.Variable(tf.zeros(out_depth)))
    
    # Activation.
    return tf.nn.relu(h)

def conv5x5_activation(x, in_depth, out_depth, mean, stddev):
    return conv_activation(x, 5, 5, in_depth, out_depth, mean, stddev)

def conv3x3_activation(x, in_depth, out_depth, mean, stddev):
    return conv_activation(x, 3, 3, in_depth, out_depth, mean, stddev)
    
def maxpool(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')

def dropout(x, keep_prob):
    return tf.nn.dropout(x, keep_prob)

def fully_connected(x, in_count, out_count, mean, stddev):
    return (tf.matmul(x, 
                      tf.Variable(tf.truncated_normal((in_count, out_count), mean = mean, stddev = stddev)))
            + tf.Variable(tf.zeros(out_count)))

def make_branch(h, depth, mean, stddev, keep_prob):
    # Layer 1: Convolutional. Input = 32x32x3. Output = 28x28xdepth.
    h = conv5x5_activation(h, 3, depth, mean, stddev)
    
    # Convolutional. Output = 24x24x(depth*1.5).
    h = conv5x5_activation(h, depth, int(depth * 3 / 2), mean, stddev)
    
    # Convolutional. Output = 20x20x(depth*2).
    h = conv5x5_activation(h, int(depth * 3 / 2), depth * 2, mean, stddev)

    # Output = 10x10x(depth*2)
    h = maxpool(h)
    
    # Convolutional. Output = 8x8x(depth*2.5).
    h = conv3x3_activation(h, depth * 2, int(depth * 5 / 2), mean, stddev)

    # Dropout.
    h = dropout(h, keep_prob)
    
    return h

def define_nn_architecture(x, keep_prob):    
    # Hyperparameters
    mu = 0
    sigma = 0.1
    
    # Let this layer modify color. Input = 32x32x3. Output = 32x32x3. 
    # This layer is inspired by "Color Space Transformation Network" at https://arxiv.org/ftp/arxiv/papers/1511/1511.01064.pdf.
    h = conv_activation(x, 1, 1, 3, 3, mu, sigma)
    
    # This branch-then-merge architecture is inspired by Vivek Yadav's post.
    # https://chatbotslife.com/german-sign-classification-using-deep-learning-neural-networks-98-8-solution-d05656bf51ad
    b1 = make_branch(h, 32, mu, sigma, keep_prob)
    b2 = make_branch(h, 64, mu, sigma, keep_prob)
    b3 = make_branch(h, 128, mu, sigma, keep_prob)
    
    # Merge branches along the channel axis. Output = 8x8x560 (80 + 160 + 320 channels).
    h = tf.concat_v2([b1, b2, b3], axis=3)
    
    # Flatten. Output = concat_dim.
    h = flatten(h)
    
    concat_dim = h.get_shape().dims[1].value
    
    print('concat_dim= ', concat_dim)
    
    # Fully connected. Output = 800.
    h = fully_connected(h, concat_dim, 800, mu, sigma)
    
    # Dropout.
    h = dropout(h, keep_prob)

    # Fully connected. Output = 400.
    h = fully_connected(h, 800, 400, mu, sigma)
    
    # Dropout.
    h = dropout(h, keep_prob)

    # Fully connected. Output = n_classes.
    h = fully_connected(h, 400, n_classes, mu, sigma)
    
    # Dropout is also applied to the output logits.
    h = dropout(h, keep_prob)
    
    return h

### Question 3

_What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.)  For reference on how to build a deep neural network using TensorFlow, see [Deep Neural Network in TensorFlow](https://classroom.udacity.com/nanodegrees/nd013/parts/fbf77062-5703-404e-b60c-95b78b2f3f9e/modules/6df7ae49-c61c-4bb2-a23e-6527e69209ec/lessons/b516a270-8600-4f93-a0a3-20dfeabe5da6/concepts/83a3a2a2-a9bd-4b7b-95b0-eb924ab14432) from the classroom._


**Answer:**

My final architecture is a convolutional neural network. I augmented the LeNet architecture with more layers, deeper convolutional layers, and wider fully connected layers. Inspired by Vivek Yadav's post [4], I added a branch-then-merge section at the start. Each branch is a series of convolutional layers, a maxpool, and a dropout layer. Each branch has a different convolutional filter depth.
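
The layer sizes can be checked with a little shape arithmetic. The sketch below follows the sequence in `make_branch` (three 5x5 'VALID' convolutions, a 2x2 max pool, one 3x3 'VALID' convolution) for branch depths 32, 64, and 128. The helper names are illustrative, and the code only computes shapes; it does not build any TensorFlow graph.

```python
def valid_conv(size, kernel):
    """Spatial size after a stride-1 'VALID' convolution."""
    return size - kernel + 1


def branch_output_shape(depth, input_size=32):
    """Output (height, width, channels) of one branch for a square input."""
    size = valid_conv(input_size, 5)  # first 5x5 conv
    size = valid_conv(size, 5)        # second 5x5 conv
    size = valid_conv(size, 5)        # third 5x5 conv
    size = size // 2                  # 2x2 max pool with stride 2
    size = valid_conv(size, 3)        # 3x3 conv
    channels = int(depth * 5 / 2)     # depth of the last conv in the branch
    return (size, size, channels)


shapes = [branch_output_shape(d) for d in (32, 64, 128)]
merged_channels = sum(c for _, _, c in shapes)
flattened = shapes[0][0] * shapes[0][1] * merged_channels
print(shapes, merged_channels, flattened)
```

The flattened size is the input width of the first fully connected layer, which then narrows to 800, 400, and finally `n_classes` outputs.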

[4] Vivek Yadav, (98.8% solution) German sign classification using deep learning neural networks, URL:  https://chatbotslife.com/german-sign-classification-using-deep-learning-neural-networks-98-8-solution-d05656bf51ad. Retrieved 1/27/2017.

In [None]:
### Train your model here.
### Feel free to use as many code cells as needed.

### One-hot encode labels.

def safe_one_hot(y, num_labels):
    # From https://github.com/tensorflow/tensorflow/issues/6509
    sparse_labels = tf.reshape(y, [-1, 1])
    derived_size = tf.shape(sparse_labels)[0]
    indices = tf.reshape(tf.range(0, derived_size, 1), [-1, 1])
    concated = tf.concat(1, [indices, sparse_labels])
    outshape = tf.concat(0, [tf.reshape(derived_size, [1]), tf.reshape(num_labels, [1])])
    one_hot_y = tf.sparse_to_dense(concated, outshape, 1.0, 0.0)
    return one_hot_y

x = tf.placeholder(tf.float32, (None, 32, 32, 3))
y = tf.placeholder(tf.int32, (None))
# one_hot_y = tf.one_hot(y, n_classes)
one_hot_y = safe_one_hot(y, n_classes)

### Calculate logits, cross-entropy, loss, and setup optimizer.

rate = 0.0001

kp = tf.placeholder(tf.float32)

logits = define_nn_architecture(x, kp)
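# Pass logits and labels by keyword so the two arguments cannot be swapped by accident.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=one_hot_y)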
loss_operation = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate = rate)
training_operation = optimizer.minimize(loss_operation)

In [None]:
### Evaluate accuracy.

correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
saver = tf.train.Saver()

def evaluate(X_data, y_data):
    num_examples = len(X_data)
    total_accuracy = 0
    sess = tf.get_default_session()
    for offset in range(0, num_examples, BATCH_SIZE):
        batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
        accuracy = sess.run(accuracy_operation, feed_dict={x: batch_x, y: batch_y, kp: 1.0})
        total_accuracy += (accuracy * len(batch_x))
    return total_accuracy / num_examples
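
`evaluate` weights each batch accuracy by its batch size because the last batch is usually smaller than `BATCH_SIZE`. The following sketch, with made-up batch accuracies, shows how that differs from a plain mean of the per-batch values; `dataset_accuracy` is an illustrative helper.

```python
def dataset_accuracy(batch_accuracies, batch_sizes):
    """Combine per-batch accuracies into one accuracy over all examples."""
    total_correct = sum(acc * n for acc, n in zip(batch_accuracies, batch_sizes))
    return total_correct / sum(batch_sizes)


# Made-up example: 300 examples split into batches of 128, 128, and 44.
accuracies = [1.0, 0.5, 0.25]
sizes = [128, 128, 44]
weighted = dataset_accuracy(accuracies, sizes)
unweighted = sum(accuracies) / len(accuracies)
print(round(weighted, 4), round(unweighted, 4))
```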

In [None]:
import datetime
def format_seconds(secs):
    delta = datetime.timedelta(seconds=int(secs))
    return str(delta)

def format_now():
    return datetime.datetime.now().strftime('%Y-%m-%d_%Hh%Mm%S')

import time
start = time.monotonic()
end = time.monotonic()
print(format_seconds(end - start))
print(format_now())

In [None]:
import time

class Stopwatch:
    def __init__(self, autostart = True):
        self._duration = 0
        self._started = False
        self._start = 0
        if (autostart):
            self.start()
        
    def start(self):
        if not self._started:
            self._started = True
            self._start = time.monotonic()
        return self
    
    def stop(self):
        if not self._started:
            raise RuntimeError('Cannot stop stopwatch that has not been started.')
        end = time.monotonic()
        self._duration += (end - self._start)
        self._started = False
        self._start = 0
        
    def reset(self):
        self._duration = 0
        self._started = False
        self._start = 0
        
    def format_duration(self):
        return format_seconds(self._duration)
    
sw = Stopwatch().start()
sw.stop()
print(sw.format_duration())

In [None]:
def normalize(image_data):
    """
    Normalize the image data with Min-Max scaling to a range of [0.1, 0.9]
    :param image_data: The image data to be normalized
    :return: Normalized image data
    """
    a = 0.1
    b = 0.9
    rgb_min = 0
    rgb_max = 255
    return a + ( ( (image_data - rgb_min)*(b - a) )/( rgb_max - rgb_min ) )

X_train = normalize(X_train)
X_validation = normalize(X_validation)
X_test = normalize(X_test)

In [None]:
def predict(sess, X_pred):
    softmax = tf.nn.softmax(logits)
    output = sess.run(softmax, feed_dict={x: X_pred, kp: 1.0})
    predictions = np.argmax(output, axis = 1)
    return predictions

test_files = ['bend_right.png',
              'do_not_enter.png',
              'german_do_not_enter.png',
              'german_priority.png',
              'keep_right.png',
              'no_passing.png',
              'speed_limit.png',
              'stop_sign.png',
              'uk_speed_limit_50.png',
              'yield.png']

import matplotlib.image as mpimg

def read_test_images():
    X = []
    for file in test_files:
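        # mpimg.imread returns PNG pixels as floats in [0, 1]; rescale to [0, 255]
        # so that normalize() sees the same range as the pickled uint8 training data.
        img = mpimg.imread('.\\test_images\\' + file)
        if img.dtype != np.uint8:
            img = img * 255.0
        X.append(img)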
    X = np.array(X)
    return normalize(X)
    
def predict_test_images(sess):
    X = read_test_images()
    predictions = predict(sess, X)
    
    for i in range(len(test_files)):
        id_pred = predictions[i]
        print('{}: Prediction: {}, Name: {}'.format(test_files[i], id_pred, signname_map[id_pred]))
        print()
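
`predict` turns logits into class ids by taking the argmax of the softmax output. As a sketch of that step in plain Python (no TensorFlow; the logits below are made up), note that softmax preserves ordering, so the argmax of the probabilities matches the argmax of the raw logits.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # subtract the max to avoid overflow
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


example_logits = [1.5, -0.3, 4.2, 0.0]  # made-up logits for four classes
probabilities = softmax(example_logits)
print(argmax(probabilities), argmax(example_logits))
```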

In [None]:
# Measure accuracy of model with test set.

def evaluate_test_accuracy(sess):
    test_stopwatch = Stopwatch()
    
    test_accuracy = evaluate(X_test, y_test)
    
    test_stopwatch.stop()

    print("Test Set Size: ", len(y_test))
    print("Test Accuracy: {:.4f}".format(test_accuracy))
    print("Duration: {}".format(test_stopwatch.format_duration()))

In [None]:
### Train model.

from sklearn.utils import shuffle
import time

train_model = True
epochs_since_improvement_limit = 2
min_accuracy = 0.99
keep_prob_during_training = 0.4

if (train_model):
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        num_examples = len(X_train)

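        best_validation_accuracy = 0.0
        best_train_accuracy = 0.0
        # Initialize the counter so it is defined even if the first epoch shows no improvement.
        epochs_since_improvement = 0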

        print("Training... num_examples: ", num_examples)
        print()
        print("EPOCH, ValAcc, TrainSubsetAcc, Duration")
        print()

        train_stopwatch = Stopwatch()

        for i in range(MAX_EPOCHS):
            epoch_stopwatch = Stopwatch()

            X_train, y_train = shuffle(X_train, y_train)
            for offset in range(0, num_examples, BATCH_SIZE):
                end = offset + BATCH_SIZE
                batch_x, batch_y = X_train[offset:end], y_train[offset:end]
                sess.run(training_operation, feed_dict={x: batch_x, y: batch_y, kp: keep_prob_during_training})

            validation_accuracy = evaluate(X_validation, y_validation)

            val_size = len(y_validation)
            train_accuracy = evaluate(X_train[:val_size], y_train[:val_size])
            
            if validation_accuracy > best_validation_accuracy:
                best_validation_accuracy = validation_accuracy
                epochs_since_improvement = 0
            #elif train_accuracy > best_train_accuracy:
                #best_train_accuracy = train_accuracy
                #epochs_since_improvement = 0
            else:
                epochs_since_improvement += 1            

            epoch_stopwatch.stop()

            print("{}, {:.3f}, {:.3f}, {}".format(i+1, validation_accuracy, train_accuracy, epoch_stopwatch.format_duration()))
            print()

            if (epochs_since_improvement >= epochs_since_improvement_limit 
                and best_validation_accuracy >= min_accuracy):
                #and best_train_accuracy >= min_accuracy):
                print("Stopping early. Validation accuracy has not improved in {} epochs.".format(epochs_since_improvement_limit))
                print()
                break

        train_stopwatch.stop()

        print("Training Duration: ", train_stopwatch.format_duration())
        print()    
        print("Best Validation Accuracy: ", best_validation_accuracy)
        print()

        model_filename = '.\\cnn_branch_pyramid_{:.1f}_{}'.format(best_validation_accuracy * 100, format_now())
        saver.save(sess, model_filename)
        print("Model saved: ", model_filename)
        
        evaluate_test_accuracy(sess)
        predict_test_images(sess)
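
The early-stopping rule in the training loop can be isolated as a small function. The sketch below is a simplified restatement, not the training code itself; `early_stop_epoch` is a hypothetical helper. The sample input is the rounded validation accuracies from the first logged run in this notebook, so results on unrounded values could differ.

```python
def early_stop_epoch(val_accuracies, patience=2, min_accuracy=0.99):
    """Return the 1-based epoch at which training would stop, or None."""
    best = 0.0
    since_improvement = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best:
            best = acc
            since_improvement = 0
        else:
            since_improvement += 1
        # Stop only once the best accuracy has reached the required minimum.
        if since_improvement >= patience and best >= min_accuracy:
            return epoch
    return None


# Rounded validation accuracies, as printed in the first training log.
logged = [0.455, 0.559, 0.737, 0.885, 0.946, 0.966, 0.983, 0.990, 0.993,
          0.995, 0.996, 0.996, 0.997, 0.996, 0.998, 0.998, 0.997]
print(early_stop_epoch(logged))
```

The `min_accuracy` condition keeps training from stopping on an early plateau before the model is good enough.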

### 4 conv layers in 3 branches, dropouts between fully connected layers, learning rate=0.0001, 10 generated examples per original example, keep_prob=0.4

Training... num_examples:  388168

EPOCH, ValAcc, TrainSubsetAcc, Duration

1, 0.455, 0.397, 0:06:46

2, 0.559, 0.512, 0:06:43

3, 0.737, 0.654, 0:06:39

4, 0.885, 0.812, 0:06:39

5, 0.946, 0.886, 0:06:39

6, 0.966, 0.936, 0:06:40

7, 0.983, 0.952, 0:06:46

8, 0.990, 0.956, 0:06:44

9, 0.993, 0.973, 0:06:44

10, 0.995, 0.979, 0:06:45

11, 0.996, 0.984, 0:06:39

12, 0.996, 0.979, 0:06:38

13, 0.997, 0.989, 0:06:43

14, 0.996, 0.987, 0:06:43

15, 0.998, 0.988, 0:06:42

16, 0.998, 0.992, 0:06:44

17, 0.997, 0.993, 0:06:41

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  1:54:03

Best Validation Accuracy:  0.997704667177

Model saved:  .\cnn_branch_pyramid_99.8_2017-01-26_16h58m44
Test Set Size:  12630
Test Accuracy: 0.9827
Duration: 0:00:04
bend_right.png: Prediction: 5, Name: Speed limit (80km/h)

do_not_enter.png: Prediction: 5, Name: Speed limit (80km/h)

german_do_not_enter.png: Prediction: 5, Name: Speed limit (80km/h)

german_priority.png: Prediction: 10, Name: No passing for vehicles over 3.5 metric tons

keep_right.png: Prediction: 5, Name: Speed limit (80km/h)

no_passing.png: Prediction: 5, Name: Speed limit (80km/h)

speed_limit.png: Prediction: 5, Name: Speed limit (80km/h)

stop_sign.png: Prediction: 8, Name: Speed limit (120km/h)

uk_speed_limit_50.png: Prediction: 5, Name: Speed limit (80km/h)

yield.png: Prediction: 5, Name: Speed limit (80km/h)

### 4 conv layers in 3 branches, dropouts between fully connected layers, learning rate=0.0001, 10000 min examples per class, keep_prob=0.4

Training... num_examples:  430000

EPOCH, ValAcc, TrainSubsetAcc, Duration

1, 0.309, 0.214, 0:07:22

2, 0.541, 0.431, 0:07:19

3, 0.776, 0.700, 0:07:18

4, 0.882, 0.815, 0:07:18

5, 0.945, 0.893, 0:07:18

6, 0.961, 0.938, 0:07:18

7, 0.973, 0.957, 0:07:25

8, 0.981, 0.967, 0:07:25

9, 0.986, 0.972, 0:07:27

10, 0.990, 0.977, 0:07:29

11, 0.991, 0.980, 0:07:30

12, 0.993, 0.983, 0:07:24

13, 0.994, 0.982, 0:07:29

14, 0.995, 0.986, 0:07:31

15, 0.996, 0.990, 0:07:30

16, 0.995, 0.989, 0:07:28

17, 0.995, 0.990, 0:07:26

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  2:06:04

Best Validation Accuracy:  0.995919408314

Model saved:  .\cnn_branch_pyramid_99.6_2017-01-26_14h53m51
Test Set Size:  12630
Test Accuracy: 0.9805
Duration: 0:00:04
bend_right.png: Prediction: 18, Name: General caution

do_not_enter.png: Prediction: 18, Name: General caution

german_do_not_enter.png: Prediction: 18, Name: General caution

german_priority.png: Prediction: 18, Name: General caution

keep_right.png: Prediction: 18, Name: General caution

no_passing.png: Prediction: 18, Name: General caution

speed_limit.png: Prediction: 18, Name: General caution

stop_sign.png: Prediction: 18, Name: General caution

uk_speed_limit_50.png: Prediction: 18, Name: General caution

yield.png: Prediction: 18, Name: General caution

### 4 conv layers in 3 branches, dropouts between fully connected layers, learning rate=0.0001, 5000 min examples per class, keep_prob=0.5

EPOCH, ValAcc, TrainSubsetAcc, Duration

1, 0.640, 0.493, 0:03:47

2, 0.824, 0.710, 0:03:48

3, 0.920, 0.834, 0:03:48

4, 0.956, 0.893, 0:03:48

5, 0.966, 0.923, 0:03:48

6, 0.983, 0.955, 0:03:48

7, 0.986, 0.965, 0:03:48

8, 0.989, 0.973, 0:03:48

9, 0.992, 0.982, 0:03:48

10, 0.992, 0.980, 0:03:48

11, 0.994, 0.984, 0:03:48

12, 0.994, 0.988, 0:03:48

13, 0.994, 0.992, 0:03:48

14, 0.996, 0.993, 0:03:48

15, 0.996, 0.992, 0:03:48

16, 0.996, 0.991, 0:03:48

17, 0.996, 0.993, 0:03:48

18, 0.997, 0.995, 0:03:48

19, 0.997, 0.995, 0:03:47

20, 0.997, 0.995, 0:03:48

21, 0.997, 0.993, 0:03:47

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  1:19:53

Best Validation Accuracy:  0.997194593216

Model saved:  .\cnn_branch_pyramid_99.7_2017-01-26_02h35m23
bend_right.png: Prediction: 12, Name: Priority road

do_not_enter.png: Prediction: 12, Name: Priority road

german_do_not_enter.png: Prediction: 12, Name: Priority road

german_priority.png: Prediction: 12, Name: Priority road

keep_right.png: Prediction: 12, Name: Priority road

no_passing.png: Prediction: 12, Name: Priority road

speed_limit.png: Prediction: 12, Name: Priority road

stop_sign.png: Prediction: 12, Name: Priority road

uk_speed_limit_50.png: Prediction: 12, Name: Priority road

yield.png: Prediction: 12, Name: Priority road

### 4 conv layers in 3 branches, dropouts between fully connected layers

Training... num_examples:  215000

EPOCH, ValAcc, TrainSubsetAcc, Duration

1, 0.946, 0.878, 0:03:52

2, 0.988, 0.957, 0:03:49

3, 0.990, 0.975, 0:03:48

4, 0.988, 0.983, 0:03:49

5, 0.994, 0.984, 0:03:48

6, 0.995, 0.981, 0:03:48

7, 0.992, 0.986, 0:03:49

8, 0.993, 0.987, 0:03:48

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  0:30:34

Best Validation Accuracy:  0.994899260393

Model saved:  .\cnn_branch_pyramid_99.5_2017-01-26_00h57m57
bend_right.png: Prediction: 13, Name: Yield

do_not_enter.png: Prediction: 13, Name: Yield

german_do_not_enter.png: Prediction: 15, Name: No vehicles

german_priority.png: Prediction: 13, Name: Yield

keep_right.png: Prediction: 13, Name: Yield

no_passing.png: Prediction: 13, Name: Yield

speed_limit.png: Prediction: 13, Name: Yield

stop_sign.png: Prediction: 13, Name: Yield

uk_speed_limit_50.png: Prediction: 29, Name: Bicycles crossing

yield.png: Prediction: 13, Name: Yield

### Proper dropout, 3 conv layers in 3 branches, normalization

Training... num_examples:  215000

EPOCH, ValAcc, TrainSubsetAcc, Duration

1, 0.993, 0.971, 0:01:52

2, 0.993, 0.988, 0:01:50

3, 0.993, 0.992, 0:01:50

4, 0.994, 0.993, 0:01:50

5, 0.995, 0.995, 0:01:50

6, 0.996, 0.996, 0:01:50

7, 0.997, 0.998, 0:01:50

8, 0.996, 0.995, 0:01:49

9, 0.996, 0.998, 0:01:49

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  0:16:35

Best Validation Accuracy:  0.996684519255

Model saved:  .\cnn_branch_pyramid_99.7_2017-01-25_23h20m01

bend_right.png: Prediction: 3, Name: Speed limit (60km/h)

do_not_enter.png: Prediction: 3, Name: Speed limit (60km/h)

german_do_not_enter.png: Prediction: 3, Name: Speed limit (60km/h)

german_priority.png: Prediction: 3, Name: Speed limit (60km/h)

keep_right.png: Prediction: 3, Name: Speed limit (60km/h)

no_passing.png: Prediction: 3, Name: Speed limit (60km/h)

speed_limit.png: Prediction: 3, Name: Speed limit (60km/h)

stop_sign.png: Prediction: 3, Name: Speed limit (60km/h)

uk_speed_limit_50.png: Prediction: 3, Name: Speed limit (60km/h)

yield.png: Prediction: 3, Name: Speed limit (60km/h)

### Proper dropout, dropout at end of each branch, 2 conv layers in 3 branches, 1 deeply-layered branch, min_accuracy=0.995

Training... num_examples:  211728

EPOCH, ValAcc, TrainSubsetAcc, Duration

1, 0.907, 0.850, 0:02:10

2, 0.962, 0.936, 0:02:09

3, 0.975, 0.952, 0:02:08

4, 0.986, 0.964, 0:02:08

5, 0.992, 0.975, 0:02:08

6, 0.994, 0.981, 0:02:08

7, 0.995, 0.982, 0:02:08

8, 0.775, 0.710, 0:02:08

9, 0.932, 0.888, 0:02:08

10, 0.975, 0.955, 0:02:07

11, 0.993, 0.980, 0:02:07

12, 0.995, 0.986, 0:02:08

13, 0.995, 0.994, 0:02:07

14, 0.995, 0.989, 0:02:07

15, 0.995, 0.993, 0:02:08

16, 0.996, 0.996, 0:02:08

17, 0.998, 0.996, 0:02:08

18, 0.997, 0.996, 0:02:08

19, 0.997, 0.996, 0:02:09

20, 0.997, 0.998, 0:02:08

21, 0.997, 1.000, 0:02:09

22, 0.996, 0.996, 0:02:09

23, 0.998, 0.997, 0:02:09

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  0:49:19

Best Validation Accuracy:  0.998214741137

Model saved:  .\cnn_branch_pyramid_99.8_2017-01-25_18h31m54

bend_right.png: Prediction: 34, Name: Turn left ahead

do_not_enter.png: Prediction: 38, Name: Keep right

german_do_not_enter.png: Prediction: 2, Name: Speed limit (50km/h)

german_priority.png: Prediction: 38, Name: Keep right

keep_right.png: Prediction: 34, Name: Turn left ahead

no_passing.png: Prediction: 21, Name: Double curve

speed_limit.png: Prediction: 38, Name: Keep right

stop_sign.png: Prediction: 2, Name: Speed limit (50km/h)

uk_speed_limit_50.png: Prediction: 34, Name: Turn left ahead

yield.png: Prediction: 38, Name: Keep right

### Dropout at end of each branch, 3 conv layers in each branch, wider fully-connected layers, min_accuracy=0.99

38, 0.997, 0.991, 0:01:21

39, 0.992, 0.985, 0:01:21

40, 0.992, 0.983, 0:01:21

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  0:55:00

Best Validation Accuracy:  0.996939556236

Model saved:  .\cnn_branch_pyramid_99.7_2017-01-25_12h55m34

bend_right.png: Prediction: 30, Name: Beware of ice/snow

do_not_enter.png: Prediction: 38, Name: Keep right

german_do_not_enter.png: Prediction: 38, Name: Keep right

german_priority.png: Prediction: 29, Name: Bicycles crossing

keep_right.png: Prediction: 11, Name: Right-of-way at the next intersection

no_passing.png: Prediction: 30, Name: Beware of ice/snow

speed_limit.png: Prediction: 30, Name: Beware of ice/snow

stop_sign.png: Prediction: 12, Name: Priority road

uk_speed_limit_50.png: Prediction: 11, Name: Right-of-way at the next intersection

yield.png: Prediction: 40, Name: Roundabout mandatory

### Dropout at end of each branch, wider fully-connected layers, min_accuracy=0.97

Training... num_examples:  211728

EPOCH, ValAcc, TrainSubsetAcc, Duration

1, 0.755, 0.675, 0:01:25

2, 0.885, 0.821, 0:01:25

3, 0.946, 0.895, 0:01:25

4, 0.970, 0.932, 0:01:25

5, 0.972, 0.941, 0:01:25

6, 0.983, 0.955, 0:01:25

7, 0.981, 0.958, 0:01:25

8, 0.984, 0.958, 0:01:25

9, 0.985, 0.968, 0:01:25

10, 0.986, 0.976, 0:01:25

11, 0.988, 0.964, 0:01:25

12, 0.991, 0.979, 0:01:25

13, 0.989, 0.980, 0:01:25

14, 0.991, 0.981, 0:01:25

15, 0.990, 0.981, 0:01:25

16, 0.991, 0.979, 0:01:25

17, 0.990, 0.986, 0:01:25

18, 0.995, 0.988, 0:01:25

19, 0.994, 0.988, 0:01:25

20, 0.991, 0.985, 0:01:25

21, 0.992, 0.981, 0:01:25

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  0:29:52

Best Validation Accuracy:  0.994899260545

Model saved:  .\cnn_branch_pyramid_99.5_2017-01-25_02h17m53

bend_right.png: Prediction: 9, Name: No passing

do_not_enter.png: Prediction: 3, Name: Speed limit (60km/h)

german_do_not_enter.png: Prediction: 17, Name: No entry

german_priority.png: Prediction: 12, Name: Priority road

keep_right.png: Prediction: 25, Name: Road work

no_passing.png: Prediction: 18, Name: General caution

speed_limit.png: Prediction: 25, Name: Road work

stop_sign.png: Prediction: 17, Name: No entry

uk_speed_limit_50.png: Prediction: 4, Name: Speed limit (70km/h)

yield.png: Prediction: 9, Name: No passing

### Uniform distribution of training examples (4000 per class)

199, 0.948, 0.998, 0:00:47

200, 0.947, 0.997, 0:00:47

Training Duration:  2:37:51

Best Validation Accuracy:  0.957153787755

Model saved:  .\cnn_branch_pyramid_95.7_2017-01-24_21h30m46

bend_right.png: Prediction: 41, Name: End of no passing

do_not_enter.png: Prediction: 34, Name: Turn left ahead

german_do_not_enter.png: Prediction: 41, Name: End of no passing

german_priority.png: Prediction: 39, Name: Keep left

keep_right.png: Prediction: 41, Name: End of no passing

no_passing.png: Prediction: 41, Name: End of no passing

speed_limit.png: Prediction: 19, Name: Dangerous curve to the left

stop_sign.png: Prediction: 41, Name: End of no passing

uk_speed_limit_50.png: Prediction: 41, Name: End of no passing

yield.png: Prediction: 39, Name: Keep left

### 32, 64, 128 depth conv filter branches

Training... num_examples:  211728

EPOCH, ValAcc, TrainSubsetAcc, Duration

1, 0.951, 0.885, 0:01:19

2, 0.970, 0.949, 0:01:18

3, 0.985, 0.969, 0:01:18

4, 0.994, 0.980, 0:01:17

5, 0.994, 0.983, 0:01:17

6, 0.996, 0.989, 0:01:18

7, 0.994, 0.990, 0:01:17

8, 0.995, 0.991, 0:01:17

9, 0.988, 0.982, 0:01:17

10, 0.996, 0.995, 0:01:24

11, 0.996, 0.994, 0:01:17

12, 0.993, 0.993, 0:01:17

13, 0.994, 0.993, 0:01:17

14, 0.991, 0.985, 0:01:17

15, 0.990, 0.992, 0:01:17

16, 0.996, 0.995, 0:01:17

17, 0.996, 0.995, 0:01:17

18, 0.996, 0.996, 0:01:17

19, 0.996, 0.996, 0:01:17

20, 0.995, 0.998, 0:01:17

21, 0.997, 0.993, 0:01:17

22, 0.996, 0.996, 0:01:17

23, 0.996, 0.997, 0:01:17

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  0:29:56

Best Validation Accuracy:  0.997194593216

Model saved:  .\cnn_branch_pyramid_99.7_2017-01-24_15h30m50

bend_right.png: Prediction: 5, Name: Speed limit (80km/h)

do_not_enter.png: Prediction: 5, Name: Speed limit (80km/h)

german_do_not_enter.png: Prediction: 5, Name: Speed limit (80km/h)

german_priority.png: Prediction: 4, Name: Speed limit (70km/h)

keep_right.png: Prediction: 5, Name: Speed limit (80km/h)

no_passing.png: Prediction: 5, Name: Speed limit (80km/h)

speed_limit.png: Prediction: 5, Name: Speed limit (80km/h)

stop_sign.png: Prediction: 5, Name: Speed limit (80km/h)

uk_speed_limit_50.png: Prediction: 5, Name: Speed limit (80km/h)

yield.png: Prediction: 5, Name: Speed limit (80km/h)

### AdamOptimizer and Predictions

Training... num_examples:  211728

EPOCH, ValAcc, TrainSubsetAcc, Duration

1, 0.955, 0.908, 0:00:58

2, 0.985, 0.966, 0:00:57

3, 0.990, 0.974, 0:00:57

4, 0.988, 0.981, 0:00:57

5, 0.994, 0.988, 0:00:57

6, 0.992, 0.986, 0:00:57

7, 0.995, 0.994, 0:00:57

8, 0.996, 0.995, 0:00:57

9, 0.996, 0.996, 0:00:57

10, 0.992, 0.993, 0:00:57

11, 0.996, 0.996, 0:00:57

12, 0.994, 0.995, 0:00:57

13, 0.998, 0.997, 0:00:57

14, 0.996, 0.995, 0:00:57

15, 0.997, 0.997, 0:00:57

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  0:14:24

Best Validation Accuracy:  0.997959704157

Model saved:  .\cnn_branch_pyramid_99.8_2017-01-23_13h38m23

bend_right.png: Prediction: 17, Name: b'Vehicles over 3.5 metric tons prohibited'

do_not_enter.png: Prediction: 17, Name: b'Vehicles over 3.5 metric tons prohibited'

speed_limit.png: Prediction: 3, Name: b'Speed limit (50km/h)'

stop_sign.png: Prediction: 17, Name: b'Vehicles over 3.5 metric tons prohibited'

yield.png: Prediction: 17, Name: b'Vehicles over 3.5 metric tons prohibited'

### AdamOptimizer and Final Model

Training... num_examples:  211728

EPOCH, ValAcc, TrainSubsetAcc, Duration

1, 0.956, 0.908, 0:00:56

2, 0.987, 0.961, 0:00:57

3, 0.985, 0.965, 0:01:00

4, 0.992, 0.978, 0:01:00

5, 0.993, 0.982, 0:00:59

6, 0.994, 0.991, 0:00:59

7, 0.993, 0.990, 0:00:59

8, 0.993, 0.991, 0:01:00

9, 0.995, 0.994, 0:01:00

10, 0.995, 0.989, 0:00:59

11, 0.997, 0.996, 0:00:59

12, 0.995, 0.993, 0:00:59

13, 0.995, 0.992, 0:00:59

Stopping early. Validation accuracy has not improved in 2 epochs.

Training Duration:  0:12:53

Best Validation Accuracy:  0.997194593216

Model saved:  .\cnn_branch_pyramid_99.7

### AdagradOptimizer

191, 0.970, 0.923, 0:00:57

192, 0.968, 0.925, 0:00:57

193, 0.968, 0.923, 0:00:57

194, 0.968, 0.928, 0:00:57

195, 0.969, 0.921, 0:00:57

196, 0.969, 0.928, 0:00:57

197, 0.968, 0.935, 0:00:57

198, 0.970, 0.920, 0:00:57

199, 0.969, 0.920, 0:00:57

200, 0.963, 0.924, 0:00:57

Training Duration:  3:12:22

Best Validation Accuracy:  0.970160673906

Model saved:  .\cnn_branch_pyramid_97.0

### GradientDescentOptimizer

Training... num_examples:  211728

EPOCH, ValAcc, TrainSubsetAcc

1, 0.052, 0.057

2, 0.052, 0.063

3, 0.052, 0.059

4, 0.052, 0.066

5, 0.052, 0.060

6, 0.052, 0.061

7, 0.052, 0.059

8, 0.052, 0.058

Stopping early. Validation accuracy has not improved in 7 epochs.

### (All runs below use AdamOptimizer).

### 20 epochs and early-stopping

Training... num_examples:  211728

EPOCH, ValAcc, TrainSubsetAcc

1, 0.963, 0.902

2, 0.989, 0.963

3, 0.990, 0.972

4, 0.989, 0.979

5, 0.995, 0.983

6, 0.992, 0.983

7, 0.995, 0.991

8, 0.995, 0.994

9, 0.995, 0.990

10, 0.994, 0.992

11, 0.997, 0.998

12, 0.996, 0.995

13, 0.993, 0.994

14, 0.997, 0.996

15, 0.996, 0.994

16, 0.995, 0.995

17, 0.997, 0.996

18, 0.996, 0.997

Stopping early. Validation accuracy has not improved in 7 epochs.

### 20 Epochs

num_examples:  211728

Training...

EPOCH, ValAcc, ValLoss, TrainSAcc, TrainSLoss

1, 0.887, 0.358, 0.830, 0.537

2, 0.969, 0.118, 0.947, 0.176

3, 0.983, 0.058, 0.975, 0.080

4, 0.986, 0.070, 0.978, 0.066

5, 0.994, 0.036, 0.986, 0.049

6, 0.986, 0.079, 0.978, 0.086

7, 0.993, 0.045, 0.992, 0.032

8, 0.995, 0.023, 0.992, 0.025

9, 0.992, 0.038, 0.993, 0.025

10, 0.994, 0.035, 0.991, 0.033

11, 0.995, 0.029, 0.996, 0.026

12, 0.992, 0.030, 0.988, 0.044

13, 0.993, 0.032, 0.995, 0.019

14, 0.997, 0.016, 0.995, 0.024

15, 0.997, 0.017, 0.994, 0.028

16, 0.998, 0.016, 0.996, 0.012

17, 0.996, 0.039, 0.994, 0.020

18, 0.996, 0.027, 0.998, 0.005

19, 0.998, 0.014, 0.997, 0.010

20, 0.998, 0.016, 0.997, 0.013

### Split train and validation after generating extra data

num_examples:  211728

Training...

EPOCH 1 ...
Validation Accuracy = 0.878

EPOCH 2 ...
Validation Accuracy = 0.942

EPOCH 3 ...
Validation Accuracy = 0.954

EPOCH 4 ...
Validation Accuracy = 0.956

EPOCH 5 ...
Validation Accuracy = 0.975

EPOCH 6 ...
Validation Accuracy = 0.978

EPOCH 7 ...
Validation Accuracy = 0.970

EPOCH 8 ...
Validation Accuracy = 0.975

EPOCH 9 ...
Validation Accuracy = 0.981

EPOCH 10 ...
Validation Accuracy = 0.978

### Split train and validation before generating extra data

num_examples:  211728

Training...

EPOCH 1 ...
Validation Accuracy = 0.978

EPOCH 2 ...
Validation Accuracy = 0.986

EPOCH 3 ...
Validation Accuracy = 0.995

EPOCH 4 ...
Validation Accuracy = 0.993

EPOCH 5 ...
Validation Accuracy = 0.995

EPOCH 6 ...
Validation Accuracy = 0.994

EPOCH 7 ...
Validation Accuracy = 0.998

EPOCH 8 ...
Validation Accuracy = 0.993

EPOCH 9 ...
Validation Accuracy = 0.995

EPOCH 10 ...
Validation Accuracy = 0.993

In [None]:
with tf.Session() as sess:
    if train_model:
        model = model_filename
    else:
        #model = '.\\cnn_branch_pyramid_99.8_2017-01-23_13h38m23'
        #model = '.\\cnn_branch_pyramid_99.7_2017-01-24_15h30m50'
        model = '.\\cnn_branch_pyramid_99.7_2017-01-25_23h20m01'
    
    # http://stackoverflow.com/questions/33759623/tensorflow-how-to-restore-a-previously-saved-model-python
    saver = tf.train.import_meta_graph(model + '.meta')
    saver.restore(sess, model)
    
    print('Restored model: ', model)
    
    evaluate_test_accuracy(sess)

Restored model:  .\cnn_branch_pyramid_99.8_2017-01-26_16h58m44
Test Set Size:  12630
Test Accuracy: 0.9827
Duration: 0:00:03

Restored model:  .\cnn_branch_pyramid_99.6_2017-01-26_14h53m51
Test Set Size:  12630
Test Accuracy: 0.9805
Duration: 0:00:03

Restored model:  .\cnn_branch_pyramid_99.7_2017-01-26_02h35m23
Test Set Size:  12630
Test Accuracy: 0.9787
Duration: 0:00:04

Restored model:  .\cnn_branch_pyramid_99.5_2017-01-26_00h57m57
Test Set Size:  12630
Test Accuracy: 0.9761
Duration: 0:00:04

Restored model:  .\cnn_branch_pyramid_99.7_2017-01-25_23h20m01
Test Set Size:  12630
Test Accuracy: 0.9669
Duration: 0:00:01
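
`evaluate_test_accuracy` is defined earlier in the notebook and is not reproduced here. As a rough illustration only, the sketch below shows one common way to aggregate accuracy over mini-batches: count the correct predictions in each batch and divide by the total at the end, so that a smaller final batch (12630 is not a multiple of 128) is weighted correctly. The name `batched_accuracy` and the `predict_fn` callback are hypothetical stand-ins, not the notebook's own code.

```python
def batched_accuracy(features, labels, predict_fn, batch_size=128):
    """Return the fraction of examples whose predicted class equals the label.

    predict_fn is a hypothetical callback that maps a list of features to a
    list of predicted class ids (in the notebook this would wrap sess.run).
    """
    if len(features) != len(labels):
        raise ValueError('features and labels must have the same length')
    if not labels:
        raise ValueError('need at least one example')
    total_correct = 0
    for start in range(0, len(features), batch_size):
        batch_x = features[start:start + batch_size]
        batch_y = labels[start:start + batch_size]
        predictions = predict_fn(batch_x)
        # Count matches instead of averaging per-batch accuracies, so the
        # last (possibly smaller) batch does not get too much weight.
        total_correct += sum(int(p == y) for p, y in zip(predictions, batch_y))
    return total_correct / float(len(labels))


# Toy usage: a stand-in "model" that always predicts class 1.
toy_features = list(range(10))
toy_labels = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
toy_accuracy = batched_accuracy(toy_features, toy_labels,
                                lambda batch: [1] * len(batch), batch_size=4)
print('Toy accuracy: {:.3f}'.format(toy_accuracy))
```

With a batch size of 4 and 10 toy examples, the last batch holds only 2 examples, which is the situation the running count is meant to handle.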

### Question 4

_How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)_


**Answer:**

I trained my model with AdamOptimizer and these hyperparameters:
- Batch size: 128
- Epochs (with early stopping): 200
- Dropout: 0.4
- Learning rate: 0.0001
- Number of jittered images per training example: 10
- (mu, sigma) for weight initialization: (0, 0.1)
- Epochs since improvement for early stop: 2

I tried other optimizers, but compared to AdamOptimizer, some were slower to reach their highest accuracy and others plateaued at a lower validation accuracy. In the runs logged above, AdagradOptimizer reached a best validation accuracy of only 0.970 after 200 epochs, and GradientDescentOptimizer stayed at 0.052.

Early stopping usually kicked in during training. In my final training run, it stopped after 17 epochs.

I tried a lower dropout value of 0.2, but training then seemed to get stuck at a low validation accuracy (around 5%).

I tried learning rates of 0.001 and 0.0001 and achieved a slightly higher validation accuracy with the smaller learning rate.

I tried two approaches for generating jittered images:
1. Scale up the count of training examples by a constant factor, thereby keeping the non-uniform class distribution shown above.
2. Increase the count of training examples per class until the classes are uniformly distributed.

The first approach resulted in better validation accuracy (the uniform run logged above peaked at 0.957), likely because the label distribution of the training set then stayed closer to that of the validation set.
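
The two strategies can be compared by how many jittered copies each class receives. The sketch below is an illustration under stated assumptions: the labels are made-up toy data, and `copies_for_scaling` and `copies_for_uniform` are hypothetical helper names, not functions from this notebook.

```python
from collections import Counter


def copies_for_scaling(labels, factor):
    """Strategy 1: add `factor` jittered copies per original example.

    Every class grows by the same ratio, so the augmented set keeps the
    shape of the original class distribution.
    """
    counts = Counter(labels)
    return {cls: n * factor for cls, n in counts.items()}


def copies_for_uniform(labels, target_per_class):
    """Strategy 2: add just enough jittered copies to reach a fixed count.

    Classes that already meet the target get no extra copies.
    """
    counts = Counter(labels)
    return {cls: max(target_per_class - n, 0) for cls, n in counts.items()}


# Toy labels (hypothetical): class 0 is common, class 2 is rare.
toy_labels = [0] * 6 + [1] * 3 + [2] * 1

scaled = copies_for_scaling(toy_labels, factor=10)
uniform = copies_for_uniform(toy_labels, target_per_class=8)

for cls in sorted(scaled):
    print('class {}: +{} copies when scaling, +{} copies for uniform'.format(
        cls, scaled[cls], uniform[cls]))
```

Under the second strategy the rare class consists mostly of jittered copies of very few originals, which may be one reason its label distribution drifts away from an unaugmented validation set.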

I did not try different values of (mu, sigma) for weight initialization.

I tried larger values for "epochs since improvement for early stop", but I realized that training longer could risk overfitting, so I lowered it to 2.
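
A patience-based stopping rule of this kind can be sketched as follows. This is a minimal illustration, not the notebook's training loop: `epochs_until_early_stop` is a hypothetical name, the accuracies are toy values, and the notebook's own rule may differ (some run headings above mention a `min_accuracy` setting that is not modeled here).

```python
def epochs_until_early_stop(val_accuracies, patience=2):
    """Return (epochs_run, best_accuracy) for per-epoch validation accuracies.

    Training stops once `patience` consecutive epochs fail to beat the best
    validation accuracy seen so far.
    """
    best_accuracy = float('-inf')
    epochs_since_improvement = 0
    epochs_run = 0
    for accuracy in val_accuracies:
        epochs_run += 1
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            # A real training loop would save a model checkpoint here.
            epochs_since_improvement = 0
        else:
            epochs_since_improvement += 1
            if epochs_since_improvement >= patience:
                break
    return epochs_run, best_accuracy


# Toy validation accuracies (hypothetical), one per epoch.
toy_history = [0.90, 0.95, 0.94, 0.96, 0.96, 0.95, 0.97]
epochs_run, best_accuracy = epochs_until_early_stop(toy_history, patience=2)
print('Stopped after {} epochs, best accuracy {:.2f}'.format(epochs_run, best_accuracy))
```

Note the trade-off in the toy history: with a patience of 2 the loop stops before the later 0.97 epoch is reached, while a larger patience would keep training through the plateau.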

### Question 5


_What approach did you take in coming up with a solution to this problem? It may have been a process of trial and error, in which case, outline the steps you took to get to the final solution and why you chose those steps. Perhaps your solution involved an already well known implementation or architecture. In this case, discuss why you think this is suitable for the current problem._

**Answer:**

---

## Step 3: Test a Model on New Images

Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.

You may find `signnames.csv` useful as it contains mappings from the class id (integer) to the actual sign name.
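
A minimal sketch of building the id-to-name lookup with Python's `csv` module. It assumes the file has a header row with columns named `ClassId` and `SignName`; check the actual header before relying on this. To stay self-contained, the example parses a three-row in-memory sample rather than the real file, and `load_signnames` is a hypothetical helper name.

```python
import csv
import io


def load_signnames(csv_file):
    """Build a {class_id: sign_name} dict from an open CSV file object.

    Assumes a header row with 'ClassId' and 'SignName' columns.
    """
    reader = csv.DictReader(csv_file)
    return {int(row['ClassId']): row['SignName'] for row in reader}


# In-memory sample standing in for signnames.csv (three rows only).
sample_csv = io.StringIO(
    'ClassId,SignName\n'
    '12,Priority road\n'
    '13,Yield\n'
    '17,No entry\n'
)
signnames = load_signnames(sample_csv)
print(signnames[17])

# With the real file, the call would look like:
# with open('signnames.csv', newline='') as f:
#     signnames = load_signnames(f)
```

Reading the file in text mode yields `str` values, so the names print without a `b'...'` prefix.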

### Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [None]:
### Load the images and plot them here.
### Feel free to use as many code cells as needed.

import matplotlib.image as mpimg
import matplotlib.pyplot as plt

# Number the figures from 1, one per test image.
for fignum, file in enumerate(test_files, start=1):
    img = mpimg.imread('.\\test_images\\' + file)
    draw_image(img, file, fignum)

### Question 6

_Choose five candidate images of traffic signs and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult? It could be helpful to plot the images in the notebook._



**Answer:**

In [None]:
### Run the predictions here.
### Feel free to use as many code cells as needed.

with tf.Session() as sess:
    if train_model:
        model = model_filename
    else:
        model = '.\\cnn_branch_pyramid_99.8_2017-01-23_13h38m23'
    
    # http://stackoverflow.com/questions/33759623/tensorflow-how-to-restore-a-previously-saved-model-python
    saver = tf.train.import_meta_graph(model + '.meta')
    saver.restore(sess, model)
    predict_test_images(sess)

### Predictions

bend_right.png: Prediction: 17, Name: No entry

do_not_enter.png: Prediction: 17, Name: No entry

german_do_not_enter.png: Prediction: 17, Name: No entry

german_priority.png: Prediction: 17, Name: No entry

keep_right.png: Prediction: 3, Name: Speed limit (60km/h)

no_passing.png: Prediction: 10, Name: No passing for vehicles over 3.5 metric tons

speed_limit.png: Prediction: 3, Name: Speed limit (60km/h)

stop_sign.png: Prediction: 17, Name: No entry

uk_speed_limit_50.png: Prediction: 5, Name: Speed limit (80km/h)

yield.png: Prediction: 17, Name: No entry

### Question 7

_Is your model able to perform equally well on captured pictures when compared to testing on the dataset? The simplest way to do this is to check the accuracy of the predictions. For example, if the model predicted 1 out of 5 signs correctly, it's 20% accurate._

_**NOTE:** You could check the accuracy manually by using `signnames.csv` (same directory). This file has a mapping from the class id (0-42) to the corresponding sign name. So, you could take the class id the model outputs, lookup the name in `signnames.csv` and see if it matches the sign from the image._
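
The manual check described in the note can be sketched as a small function. The ids below are toy values chosen to reproduce the "1 out of 5" example from the question, not results from this notebook, and `new_image_accuracy` is a hypothetical helper name.

```python
def new_image_accuracy(predicted_ids, expected_ids):
    """Return the fraction of predictions that match the expected class ids."""
    if len(predicted_ids) != len(expected_ids):
        raise ValueError('need exactly one expected id per prediction')
    if not predicted_ids:
        raise ValueError('need at least one prediction')
    correct = sum(1 for p, e in zip(predicted_ids, expected_ids) if p == e)
    return correct / float(len(predicted_ids))


# Toy values (hypothetical): only the first of five predictions is right.
predicted = [17, 17, 3, 17, 5]
expected = [17, 12, 38, 13, 2]
accuracy = new_image_accuracy(predicted, expected)
print('{:.0%} accurate'.format(accuracy))
```

In practice, the expected ids would come from looking up each captured sign by name in `signnames.csv`.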


**Answer:**

In [None]:
### Visualize the softmax probabilities here.
### Feel free to use as many code cells as needed.

def certainty(sess, X_pred):
    # Softmax turns the logits into class probabilities;
    # top_k keeps the 5 largest probabilities (and their class ids) per image.
    top_k = tf.nn.top_k(tf.nn.softmax(logits), k=5)
    # Feeding kp = 1.0 evaluates the network with dropout switched off.
    return sess.run(top_k, feed_dict={x: X_pred, kp: 1.0})

import matplotlib.image as mpimg

def eval_certainty(sess):
    X = read_test_images()
    top_k = certainty(sess, X)

    # Print the top-5 rows (probability, class id, sign name) for each test image.
    for filename, probs, ids in zip(test_files, top_k.values, top_k.indices):
        print('### ' + filename)
        print()
        for prob, id_pred in zip(probs, ids):
            print('{:.3f}, {}, {}'.format(prob, id_pred, signname_map[id_pred]))
            print()
        
with tf.Session() as sess:
    if train_model:
        model = model_filename
    else:
        model = '.\\cnn_branch_pyramid_99.8_2017-01-23_13h38m23'
    
    # http://stackoverflow.com/questions/33759623/tensorflow-how-to-restore-a-previously-saved-model-python
    saver = tf.train.import_meta_graph(model + '.meta')
    saver.restore(sess, model)
    eval_certainty(sess)

## Test Accuracy 98.27%

model = .\cnn_branch_pyramid_99.8_2017-01-26_16h58m44

### bend_right.png

0.075, 5, Speed limit (80km/h)

0.073, 8, Speed limit (120km/h)

0.056, 31, Wild animals crossing

0.051, 7, Speed limit (100km/h)

0.050, 10, No passing for vehicles over 3.5 metric tons

### do_not_enter.png

0.080, 5, Speed limit (80km/h)

0.075, 8, Speed limit (120km/h)

0.071, 10, No passing for vehicles over 3.5 metric tons

0.058, 31, Wild animals crossing

0.057, 7, Speed limit (100km/h)

### german_do_not_enter.png

0.084, 5, Speed limit (80km/h)

0.074, 8, Speed limit (120km/h)

0.069, 10, No passing for vehicles over 3.5 metric tons

0.059, 7, Speed limit (100km/h)

0.053, 31, Wild animals crossing

### german_priority.png

0.077, 10, No passing for vehicles over 3.5 metric tons

0.067, 5, Speed limit (80km/h)

0.062, 8, Speed limit (120km/h)

0.057, 31, Wild animals crossing

0.053, 7, Speed limit (100km/h)

### keep_right.png

0.065, 5, Speed limit (80km/h)

0.047, 10, No passing for vehicles over 3.5 metric tons

0.046, 8, Speed limit (120km/h)

0.044, 38, Keep right

0.044, 4, Speed limit (70km/h)

### no_passing.png

0.082, 5, Speed limit (80km/h)

0.073, 8, Speed limit (120km/h)

0.069, 10, No passing for vehicles over 3.5 metric tons

0.053, 7, Speed limit (100km/h)

0.046, 9, No passing

### speed_limit.png

0.082, 5, Speed limit (80km/h)

0.067, 8, Speed limit (120km/h)

0.057, 10, No passing for vehicles over 3.5 metric tons

0.053, 7, Speed limit (100km/h)

0.046, 4, Speed limit (70km/h)

### stop_sign.png

0.080, 8, Speed limit (120km/h)

0.076, 5, Speed limit (80km/h)

0.069, 10, No passing for vehicles over 3.5 metric tons

0.058, 7, Speed limit (100km/h)

0.057, 31, Wild animals crossing

### uk_speed_limit_50.png

0.083, 5, Speed limit (80km/h)

0.055, 8, Speed limit (120km/h)

0.052, 10, No passing for vehicles over 3.5 metric tons

0.046, 7, Speed limit (100km/h)

0.045, 4, Speed limit (70km/h)

### yield.png

0.084, 5, Speed limit (80km/h)

0.067, 8, Speed limit (120km/h)

0.059, 7, Speed limit (100km/h)

0.057, 10, No passing for vehicles over 3.5 metric tons

0.046, 4, Speed limit (70km/h)

### bend_right.png

0.583, 17, No entry

0.149, 3, Speed limit (60km/h)

0.094, 5, Speed limit (80km/h)

0.031, 9, No passing

0.027, 30, Beware of ice/snow

### do_not_enter.png

0.572, 17, No entry

0.112, 3, Speed limit (60km/h)

0.077, 7, Speed limit (100km/h)

0.048, 5, Speed limit (80km/h)

0.041, 30, Beware of ice/snow

### german_do_not_enter.png

0.617, 17, No entry

0.090, 30, Beware of ice/snow

0.074, 3, Speed limit (60km/h)

0.035, 7, Speed limit (100km/h)

0.028, 9, No passing

### german_priority.png

0.342, 17, No entry

0.152, 30, Beware of ice/snow

0.113, 5, Speed limit (80km/h)

0.100, 3, Speed limit (60km/h)

0.085, 10, No passing for vehicles over 3.5 metric tons

### keep_right.png

0.273, 3, Speed limit (60km/h)

0.201, 5, Speed limit (80km/h)

0.172, 30, Beware of ice/snow

0.081, 38, Keep right

0.051, 17, No entry

### no_passing.png

0.238, 10, No passing for vehicles over 3.5 metric tons

0.192, 5, Speed limit (80km/h)

0.107, 3, Speed limit (60km/h)

0.086, 9, No passing

0.085, 17, No entry

### speed_limit.png

0.275, 3, Speed limit (60km/h)

0.220, 5, Speed limit (80km/h)

0.132, 17, No entry

0.130, 30, Beware of ice/snow

0.050, 7, Speed limit (100km/h)

### stop_sign.png

0.431, 17, No entry

0.106, 3, Speed limit (60km/h)

0.078, 30, Beware of ice/snow

0.059, 5, Speed limit (80km/h)

0.036, 10, No passing for vehicles over 3.5 metric tons

### uk_speed_limit_50.png

0.194, 5, Speed limit (80km/h)

0.190, 3, Speed limit (60km/h)

0.172, 17, No entry

0.085, 30, Beware of ice/snow

0.062, 10, No passing for vehicles over 3.5 metric tons

### yield.png

0.630, 17, No entry

0.087, 5, Speed limit (80km/h)

0.069, 3, Speed limit (60km/h)

0.049, 10, No passing for vehicles over 3.5 metric tons

0.030, 30, Beware of ice/snow

### Question 8

*Use the model's softmax probabilities to visualize the **certainty** of its predictions. [`tf.nn.top_k`](https://www.tensorflow.org/versions/r0.12/api_docs/python/nn.html#top_k) could prove helpful here. Which predictions is the model certain of? Uncertain? If the model was incorrect in its initial prediction, does the correct prediction appear in the top k? (k should be 5 at most)*

`tf.nn.top_k` will return the values and indices (class ids) of the top k predictions. So if k=3, for each sign, it'll return the 3 largest probabilities (out of a possible 43) and the corresponding class ids.

Take this numpy array as an example:

```
# (5, 6) array
a = np.array([[ 0.24879643,  0.07032244,  0.12641572,  0.34763842,  0.07893497,
         0.12789202],
       [ 0.28086119,  0.27569815,  0.08594638,  0.0178669 ,  0.18063401,
         0.15899337],
       [ 0.26076848,  0.23664738,  0.08020603,  0.07001922,  0.1134371 ,
         0.23892179],
       [ 0.11943333,  0.29198961,  0.02605103,  0.26234032,  0.1351348 ,
         0.16505091],
       [ 0.09561176,  0.34396535,  0.0643941 ,  0.16240774,  0.24206137,
         0.09155967]])
```

Running it through `sess.run(tf.nn.top_k(tf.constant(a), k=3))` produces:

```
TopKV2(values=array([[ 0.34763842,  0.24879643,  0.12789202],
       [ 0.28086119,  0.27569815,  0.18063401],
       [ 0.26076848,  0.23892179,  0.23664738],
       [ 0.29198961,  0.26234032,  0.16505091],
       [ 0.34396535,  0.24206137,  0.16240774]]), indices=array([[3, 0, 5],
       [0, 1, 4],
       [0, 5, 1],
       [1, 3, 5],
       [1, 4, 3]], dtype=int32))
```

Looking just at the first row, we get `[ 0.34763842,  0.24879643,  0.12789202]`; you can confirm these are the 3 largest probabilities in the first row of `a`. You'll also notice `[3, 0, 5]` are the corresponding indices.
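
The same selection can be reproduced without a TensorFlow session, which is handy for checking the output by hand. The sketch below uses only built-in Python on the first two rows of `a`; `top_k_rows` is a hypothetical helper name, and breaking ties in favor of the lower index is an assumption that does not matter for this data.

```python
def top_k_rows(rows, k):
    """Return (values, indices) of the k largest entries in each row,
    ordered from largest to smallest.
    """
    all_values, all_indices = [], []
    for row in rows:
        # Sort column indices by descending probability; sorted() is stable,
        # so equal values keep their original (lower-index-first) order.
        order = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        all_indices.append(order)
        all_values.append([row[i] for i in order])
    return all_values, all_indices


# First two rows of the example array `a`, as plain lists.
a_rows = [
    [0.24879643, 0.07032244, 0.12641572, 0.34763842, 0.07893497, 0.12789202],
    [0.28086119, 0.27569815, 0.08594638, 0.0178669, 0.18063401, 0.15899337],
]
values, indices = top_k_rows(a_rows, k=3)
print(indices)
print(values)
```

The printed indices for these two rows agree with the first two rows of the `indices` array in the `TopKV2` output above.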

**Answer:**

> **Note**: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the iPython Notebook as an HTML document. You can do this by using the menu above and navigating to **File -> Download as -> HTML (.html)**. Include the finished document along with this notebook as your submission.