# Self-Driving Car Engineer Nanodegree

## Deep Learning

## Project: Build a Traffic Sign Recognition Classifier

In this notebook, a template is provided for you to implement, in stages, the functionality required to successfully complete this project. If additional code is required that cannot be included in the notebook, be sure that the Python code is successfully imported and included in your submission. Sections that begin with **'Implementation'** in the header indicate where you should begin your implementation for your project. Note that some sections of implementation are optional, and will be marked with **'Optional'** in the header.

In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a **'Question'** header. Carefully read each question and provide thorough answers in the following text boxes that begin with **'Answer:'**. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.

>**Note:** Code and Markdown cells can be executed using the **Shift + Enter** keyboard shortcut. In addition, Markdown cells can typically be edited by double-clicking the cell to enter edit mode.

In [1]:
### Import the libraries to be used for the project
import tensorflow as tf
from tensorflow.python.ops.variables import Variable
import cv2
import numpy as np
import scipy.ndimage
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import confusion_matrix
from sklearn.utils import shuffle
import pickle
import os
import math
import random
from tqdm import tqdm
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from mpl_toolkits.mplot3d import Axes3D
from IPython.display import Image
%matplotlib inline

print('Libraries are imported')

Libraries are imported


In [2]:
### The following are wrapper or helper functions
def get_data():
    '''Load data from a pickle file'''
    
    training_file = "traffic-signs-data/train.p" # The training and testing file paths
    testing_file = "traffic-signs-data/test.p"

    with open(training_file, mode='rb') as f:
        train = pickle.load(f)
    with open(testing_file, mode='rb') as f:
        test = pickle.load(f)

    X_train, y_train = train['features'], train['labels']
    X_test, y_test = test['features'], test['labels']

    return X_train, y_train, X_test, y_test


def make_gray(features):
    '''Create an array to hold a grayscale image vector for each image sample in the dataset'''
    
    # One single-channel float32 image per input sample
    X_gray = np.empty([*features.shape[0:3], 1], np.float32)

    for i in range(len(features)):
        # The images here are in RGB channel order (matplotlib convention), not OpenCV's BGR
        X_gray[i, :, :, 0] = cv2.cvtColor(features[i], cv2.COLOR_RGB2GRAY)

    return X_gray


def get_summary(X_train, y_train, X_test, y_test):
    '''Get some basic information about the data'''
    
    # TODO: number of training examples
    n_train = X_train.shape[0]

    # TODO: number of testing examples
    n_test = X_test.shape[0]

    # TODO: what's the shape of an image?
    image_shape = X_train.shape[1:4]

    # TODO: how many classes are in the dataset
    n_classes = max(y_train) + 1
    
    return n_train, n_test, image_shape, n_classes


def print_images(data, indices=[]):
    '''Show each image provided in data'''
    
    # For a single image
    if len(data.shape) <= 3:
        fig = plt.figure()
        ax = fig.add_subplot(111)
        # For grayscale images
        if (data.shape[-1] == 1):
            data = data[:, :, 0]
            ax.imshow(data, cmap='gray')
        # For color images
        else:
            ax.imshow(data)
        if len(indices) > 0:
            print(indices[0])
        plt.show()
    # For multiple images    
    elif len(data.shape) >= 4:
        for d in range(len(data)):
            fig = plt.figure()
            ax = fig.add_subplot(111)
            # For grayscale images
            if (data[d].shape[-1] == 1):
                ax.imshow(data[d, :, :, 0], cmap='gray')
            # For color images
            else:
                ax.imshow(data[d])
            if len(indices) > 0:
                print(indices[d])
            plt.show()

            
def print_class(features, labels, class_no, image_idx):
    '''Print images by the class provided'''
    
    # For one image label
    if len(labels) == 1:
        indices_by_class = np.where(labels == class_no)
    else:
        indices_by_class = np.where(labels == class_no)[0]
    features_by_class = features[indices_by_class]
    images = features_by_class[image_idx]
    print_images(images, image_idx)
    

def count_classes(data, n_classes):
    '''Return a count of the number of images per class, and the starting index of each class'''
    
    class_count = np.zeros((n_classes), np.int32)
    class_idx = np.zeros((n_classes), np.int32)

    for d in data:
        class_count[d] += 1
    
    for c in range(1, len(class_count)):
        class_idx[c] = class_idx[c-1] + class_count[c-1]
        
    return class_count, class_idx


def make_subset(features, labels, n_classes=43, track_no=30, threshold=600, reduction=4):
    '''
        This function reduces the dataset size by randomly selecting whole tracks to remove from classes with
        a larger number of image samples. The track structure of the images of the subset is kept intact.
        
        Arguments:
        track_no - the number of samples in a single track of traffic sign images
        threshold - classes whose number of samples exceeds this number will be reduced
        reduction - 1 over this number is the proportion of the dataset we are including in the subset
    '''
    # Get the original count of the samples per class in the dataset
    class_count, class_idx = count_classes(labels, n_classes)
    # Get the number of tracks per class
    num_tracks = class_count/track_no
    num_tracks = num_tracks.astype(np.int32)
    
    x_subset = np.copy(features)
    y_subset = np.copy(labels)
    delete_range = np.array([], dtype=np.int64) # Indices of the samples we are removing; np.delete needs integer indices
    del_perct = 1.0-1.0/reduction # The proportion of the dataset we are removing
    
    # For each class, remove del_perct proportion of tracks from the dataset copy if the number of samples
    # in the class exceeds the threshold
    for n in range(len(num_tracks)):
        if num_tracks[n]*track_no > threshold:
            # Randomly select the tracks to delete from the copy of the dataset
            a = np.arange(num_tracks[n])
            a = np.random.permutation(a)
            end = int(del_perct*len(a)) 
            a = a[0:end] # We only want to remove del_perct proportion of the number of tracks
            a = np.sort(a).astype(np.int32)
            for i in a:
                # This is the indices range in the dataset of the track we are removing.
                delete_range = np.append(delete_range, 
                                         range(i*track_no + class_idx[n], (i+1)*track_no + class_idx[n]))
                
    x_subset = np.delete(x_subset, delete_range, axis=0)
    y_subset = np.delete(y_subset, delete_range, axis=0)     
    
    return x_subset, y_subset


def normalize_data(data):
    '''Normalize the data'''
    
    data_norm = data/255.0
    
    return data_norm
            

def add_synthetic_data(features, labels, tfms=['fliplr', 'rotate', 'blur'], max_angle=20, max_sigma=2):
    '''
        Add Synthetic Data to the dataset. This function groups the data together by class.
        
        Arguments:
        tfms - a list of the operations desired for adding synthetic data. Options include:
                'fliplr' - flip left/right
                'flipud' - flip up/down
                'rotate' - rotate the image cw or ccw
                'blur' - add Gaussian blur
        max_angle - the maximum angle in degrees by which an image will be rotated cw or ccw
        max_sigma - the maximum sigma value for Gaussian blurring
    '''
    
    # Create the arrays to store the new dataset with synthetic data added
    class_count, class_idx = count_classes(labels, max(labels)+1)
    new_len = (len(tfms) + 1)*len(features)
    X_mod = np.empty((new_len, *features.shape[1:]), features.dtype)
    y_mod = np.empty((new_len, *labels.shape[1:]), labels.dtype)
    index = 0
    
    for c in range(len(class_count)):
        # Copy the class' original unmodified dataset
        X_mod[index:index + class_count[c]] = np.copy(features[class_idx[c]:class_idx[c] + class_count[c]])
        y_mod[index:index + class_count[c]] = np.copy(labels[class_idx[c]:class_idx[c] + class_count[c]])
        index += class_count[c]
        
        # Add the transformations to each image sample in the class, keeping the class samples together
        for op in tfms:
            for i in range(class_idx[c], class_idx[c] + class_count[c]):
                if op == 'fliplr':
                    X_mod[index] = np.fliplr(features[i])
                    y_mod[index] = labels[i]
                if op == 'flipud':
                    X_mod[index] = np.flipud(features[i])
                    y_mod[index] = labels[i]
                if op == 'rotate':
                    angle = random.uniform(-max_angle, max_angle) # Random angle in [-max_angle, max_angle]
                    X_mod[index] = scipy.ndimage.rotate(features[i], angle, reshape=False)
                    y_mod[index] = labels[i]
                if op == 'blur':
                    sigma = random.uniform(0., max_sigma) # Random sigma in [0, max_sigma]
                    # Blur only the two spatial axes; a scalar sigma would also blur across the color channels
                    X_mod[index] = scipy.ndimage.gaussian_filter(features[i], (sigma, sigma, 0))
                    y_mod[index] = labels[i]
                index += 1   

    return X_mod, y_mod


def make_encoder(labels):
    '''Creates a binarizer encoder based on the labels'''
    
    encoder = LabelBinarizer()
    encoder.fit(labels)
    
    return encoder


def one_hot_transform(labels, encoder):
    '''Applies the one-hot transform to the input labels'''
    
    train_labels = encoder.transform(labels)
    # Change to float32, so it can be multiplied against the features in TensorFlow, which are float32
    train_labels = train_labels.astype(np.float32)

    return train_labels

def select_data(features, labels, subset_len=1290):
    '''Randomly select a class-stratified subset of subset_len samples'''
    
    _, train_features, _, train_labels = train_test_split(
        features,
        labels,
        test_size=subset_len,
        random_state=832289,
        stratify = labels)
    
    return train_features, train_labels


def make_train_test(features, labels, op='random_split', n_classes=43, split_size=0.20, track_size=30):
    '''
        Make a training and validation dataset
        
        Arguments:
        op - How to split the data. Options include:
                random_split - Randomly split into training and validation sets using train_test_split
                track_split - Randomly select one track per class for validation
    '''
    
    # Get randomized datasets for training and validation using sklearn's train_test_split. 
    # This does not keep the track structure intact.
    if op == 'random_split':
        # Get randomized datasets for training and validation
        train_features, valid_features, train_labels, valid_labels = train_test_split(
            features,
            labels,
            test_size=split_size,
            random_state=832289,
            stratify = labels)
        
    elif op == 'track_split':
        class_count, class_idx = count_classes(labels, n_classes)
        
        # Copy the features and labels because we're going to delete the validation set from them
        train_features = np.copy(features)
        train_labels = np.copy(labels)
        valid_features = np.empty((len(class_count)*track_size, *train_features[0].shape), train_features.dtype)
        if len(labels.shape) == 1:
            valid_labels = np.empty((len(class_count)*track_size), train_labels.dtype)
        else:
            valid_labels = np.empty((len(class_count)*track_size, labels.shape[-1]), train_labels.dtype)
        delete_range = np.empty((len(class_count), track_size), np.int32)
        num_tracks = class_count/track_size
        num_tracks = num_tracks.astype(np.int32)
        
        # Randomly select a track from each class and add to the validation set
        for t in range(len(num_tracks)):        
            track_num = np.random.randint(num_tracks[t])
            track_range = range(track_num*track_size + class_idx[t], (track_num+1)*track_size + class_idx[t])
            valid_features[t*track_size:(t+1)*track_size] = features[track_range]
            valid_labels[t*track_size:(t+1)*track_size] = labels[track_range]
            delete_range[t] = track_range
        # Remove the samples that were added to the validation set
        train_features = np.delete(train_features, delete_range, axis=0) 
        train_labels = np.delete(train_labels, delete_range, axis=0) 
        
        # Shuffle the validation and training sets
        valid_features, valid_labels = shuffle(valid_features, valid_labels)
        train_features, train_labels = shuffle(train_features, train_labels)
        
    else:
        # Fail loudly; returning None would break the caller's four-way unpacking
        raise ValueError('The split operation provided is not valid: ' + str(op))
    
    return train_features, valid_features, train_labels, valid_labels


def save_pickle(data_dict, filename):
    '''Save the data for easy access'''
    
    pickle_file = filename + '.pickle'
    # if not os.path.isfile(pickle_file):
    print('Saving data to ' + pickle_file)
    try:
        with open(pickle_file, 'wb') as pfile:
            pickle.dump(data_dict, pfile, pickle.HIGHEST_PROTOCOL)
    except Exception as e:
        print('Unable to save data to', pickle_file, ':', e)
        raise
    print('Data saved to cache')
    
print('Helper Functions are Created')

Helper Functions are Created


---

## Step 1: Dataset Exploration

Visualize the German Traffic Signs Dataset. This is open ended; some suggestions include plotting traffic sign images, plotting the count of each sign, etc. Be creative!


The pickled data is a dictionary with 4 key/value pairs:

- features -> the images' pixel values, (width, height, channels)
- labels -> the label of the traffic sign
- sizes -> the original width and height of the image, (width, height)
- coords -> coordinates of a bounding box around the sign in the image, (x1, y1, x2, y2). Based on the original image (not the resized version).
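
A minimal sketch of inspecting such a dictionary after loading it. The small in-memory `fake` dictionary below is a hypothetical stand-in for `train.p` (its values are made up), used only so the example runs without the dataset:

```python
import io
import pickle

# Hypothetical stand-in for train.p: two 2x2 RGB "images" stored as nested lists.
fake = {
    'features': [[[[0, 0, 0]] * 2] * 2, [[[255, 255, 255]] * 2] * 2],
    'labels': [0, 1],
    'sizes': [(30, 29), (45, 44)],
    'coords': [(5, 5, 25, 24), (6, 6, 39, 38)],
}

# Round-trip through pickle, the same way a real file would be written and read.
buffer = io.BytesIO()
pickle.dump(fake, buffer)
buffer.seek(0)
data = pickle.load(buffer)

# Every key should hold one entry per sample, so all lengths should agree.
lengths = {key: len(value) for key, value in data.items()}
print(sorted(data.keys()))
print(lengths)
```

Checking that all four entries have the same length is a cheap way to catch a truncated or mismatched file before training.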

In [13]:
### Load the data and get a summary
X_train, y_train, X_test, y_test = get_data()
n_train, n_test, image_shape, n_classes = get_summary(X_train, y_train, X_test, y_test)
class_count, class_idx = count_classes(y_train, n_classes)

print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)

# Make copies of the original data to preserve the originals in case we need them later
train_features = np.copy(X_train)
train_labels = np.copy(y_train)
test_features = np.copy(X_test)
test_labels = np.copy(y_test)


Number of training examples = 39209
Number of testing examples = 12630
Image data shape = (32, 32, 3)
Number of classes = 43


In [None]:
### Provide some data visualizations
fig = plt.figure()
ax = fig.add_subplot(111)
ax.hist(y_train, n_classes)
plt.title('Count of Sign Classes')
plt.show()

# Pixel coordinates in row-major order, matching the order of img_grayscale.flatten()
y, x = np.indices(image_shape[:2])
x = x.flatten()
y = y.flatten()

class_no = -1
for i in range(n_train):
    if y_train[i] != class_no:
        class_no = y_train[i]
        img = X_train[i]
        img_grayscale = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)  # images are in RGB order, not BGR
        z = img_grayscale.flatten()
        fig = plt.figure()
        ax = fig.add_subplot(111, projection='3d')
        ax.scatter(x, y, z, zdir='z')
        plt.title('Scatter Plot of Grayscale Vers. of Class ' + str(class_no) + ', image no. ' + str(i))
        plt.show()
        
y_by_class = np.where(y_train == 14)
y_by_class = y_by_class[0]
X_by_class = X_train[y_by_class]
counter = 0
batch = max(1, X_by_class.shape[0]//10)  # Integer step so that roughly ten images are shown
for img in X_by_class:
    if counter%batch == 0:
        fig = plt.figure()
        ax = fig.add_subplot(111)
        ax.imshow(img)
        plt.show()
    counter += 1

----

## Step 2: Design and Test a Model Architecture

Design and implement a deep learning model that learns to recognize traffic signs. Train and test your model on the [German Traffic Sign Dataset](http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset).

There are various aspects to consider when thinking about this problem:

- Your model can be derived from a deep feedforward net or a deep convolutional network.
- Play around with preprocessing techniques (normalization, RGB to grayscale, etc.)
- Number of examples per label (some have more than others).
- Generate fake data.

Here is an example of a [published baseline model on this problem](http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf). It's not required to be familiar with the approach used in the paper, but it's good practice to try to read papers like these.

### Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

### Question 1 

_Describe the techniques used to preprocess the data._

**Answer:** For the dataset, I kept the images as RGB. I created a subset of the full dataset primarily to shorten the processing time of the model. In order to ensure that the model would have enough training samples per class, I only reduced the number of image samples from those classes that contained a number of samples above a certain threshold. In this case, that number was 600. If any class had more than 600 images, I would remove 3/4 of the tracks from that class. I made sure that each reduced class still had full tracks. 

I added synthetic data in order to help the model avoid overfitting on the existing training set. The two transforms I included are a rotation cw or ccw, and a Gaussian blur. The amount to rotate is randomly selected per image. Likewise, the Gaussian blur has a sigma value that is randomly selected per image. I applied the transforms to each class and made sure that the newly added synthetic images were grouped together with the images that they originated from.
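
A minimal sketch of the two transforms on a single image, assuming NumPy and SciPy are available. The helper name `augment`, the `mode='nearest'` corner fill, and the zero sigma on the channel axis are choices made for this sketch, not necessarily what the notebook's helper does:

```python
import random

import numpy as np
import scipy.ndimage


def augment(image, max_angle=20, max_sigma=2, rng=random):
    """Return a randomly rotated copy and a randomly blurred copy of one HxWxC image."""
    angle = rng.uniform(-max_angle, max_angle)
    # reshape=False keeps the original frame; mode='nearest' fills corners with edge pixels.
    rotated = scipy.ndimage.rotate(image, angle, reshape=False, mode='nearest')
    sigma = rng.uniform(0.0, max_sigma)
    # A sigma of 0 on the last axis blurs spatially without mixing the color channels.
    blurred = scipy.ndimage.gaussian_filter(image, sigma=(sigma, sigma, 0))
    return rotated, blurred


# A random 32x32 RGB image stands in for a traffic sign sample.
image = np.random.RandomState(0).randint(0, 256, size=(32, 32, 3)).astype(np.uint8)
rotated, blurred = augment(image, rng=random.Random(0))
print(rotated.shape, blurred.shape)
```

Both outputs keep the input's shape and dtype, so they can be written straight into a preallocated array next to the originals.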

I normalized the image values by dividing each pixel value by 255, which puts every input value in the range [0, 1]. Keeping the inputs on a common, small scale allows the optimization algorithm to move towards a minimum of the loss more stably and quickly.
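
As a small illustration of this scaling, assuming the images are stored as `uint8` arrays (the helper name `scale_to_unit_range` is hypothetical):

```python
import numpy as np


def scale_to_unit_range(images):
    """Map uint8 pixel values in [0, 255] to float32 values in [0, 1]."""
    return np.asarray(images, dtype=np.float32) / 255.0


batch = np.array([[[[0, 128, 255]]]], dtype=np.uint8)  # one 1x1 RGB image
scaled = scale_to_unit_range(batch)
print(scaled.min(), scaled.max(), scaled.dtype)
```

Casting to `float32` before dividing is a design choice of this sketch: dividing a `uint8` array directly by `255.0` produces `float64`, which doubles the memory needed for a large image set.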

For validation, I followed a suggestion from a paper by Pierre Sermanet and Yann LeCun ([link to their paper](http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf)), in which they selected a track from each class to include in their validation set. I removed the validation tracks from the training set. Note that `make_train_test` shuffles both the training and validation sets after the split.

In [4]:
### Generate additional data (if you want to!)
### and split the data into training/validation/testing sets here.
### Feel free to use as many code cells as needed.

# Some booleans for activating data modification functions
grayscale = False
yuv = False
subset = False
synth = True
tfms = ['rotate', 'blur']
norm = True
ops = 'track_split'    # Extract a track per class to include in the validation set.

## Modify the data set 
if grayscale:
    # Make grayscale test and training images
    train_features = make_gray(train_features)
    test_features  = make_gray(test_features)

if subset:
    # Create a subset of the data
    train_features, train_labels = make_subset(train_features, train_labels)

if synth:
    # Add synthetic data
    train_features, train_labels = add_synthetic_data(train_features, train_labels, tfms=tfms)

if norm:
    # Normalize the test and training images
    train_features = normalize_data(train_features)
    test_features = normalize_data(test_features)

# Split the training data into a training and validation set
train_features, valid_features, train_labels, valid_labels = make_train_test(train_features, train_labels, 
                                                                             op=ops)

# Select some data from the training set to run accuracy computation against
train_dict_features, train_dict_labels = select_data(train_features, train_labels, 
                                                     subset_len=len(valid_features))
    
# Apply one-hot-encoding transform to the labels
encoder = make_encoder(train_labels)
train_labels = one_hot_transform(train_labels, encoder)
valid_labels = one_hot_transform(valid_labels, encoder)                                                                      
test_labels = one_hot_transform(test_labels, encoder)
train_dict_labels = one_hot_transform(train_dict_labels, encoder)

print(train_features.shape, valid_features.shape, test_features.shape,
      train_dict_features.shape, train_labels.shape, valid_labels.shape, test_labels.shape, train_dict_labels.shape)

(116337, 32, 32, 3) (1290, 32, 32, 3) (12630, 32, 32, 3) (1290, 32, 32, 3) (116337, 43) (1290, 43) (12630, 43) (1290, 43)


### Question 2

_Describe how you set up the training, validation and testing data for your model. If you generated additional data, why?_

**Answer:** As I mentioned above, I added synthetic data in order to help the model avoid overfitting on the existing training set. The two transforms I included are a rotation cw or ccw, and a Gaussian blur. The amount to rotate is randomly selected per image. Likewise, the Gaussian blur has a sigma value that is randomly selected per image. I applied the transforms to each class and made sure that the newly added synthetic images were grouped together with the images that they originated from.

For validation, I followed a suggestion from a paper by Pierre Sermanet and Yann LeCun in which they selected a track from each class to include in their validation set. The paper stated that they were able to achieve better accuracy when they used this method to build their validation set.
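
The track-based hold-out can be sketched with plain index arithmetic. This is an illustration only: it assumes samples are grouped by class and that each class is a run of whole tracks of `track_size` images. The function name and the per-class counts are hypothetical:

```python
import random


def hold_out_one_track(class_counts, track_size=30, rng=random):
    """Return validation indices: one randomly chosen whole track per class."""
    valid_idx = []
    class_start = 0
    for count in class_counts:
        n_tracks = count // track_size
        track = rng.randrange(n_tracks)              # which track of this class to hold out
        first = class_start + track * track_size     # index of the track's first sample
        valid_idx.extend(range(first, first + track_size))
        class_start += count
    return valid_idx


class_counts = [210, 2220, 2250]  # hypothetical per-class sample counts
valid_idx = hold_out_one_track(class_counts, rng=random.Random(0))
held_out = set(valid_idx)
train_idx = [i for i in range(sum(class_counts)) if i not in held_out]
print(len(valid_idx), len(train_idx))
```

Holding out whole tracks keeps near-duplicate frames of the same physical sign from appearing in both sets, which a purely random split would not guarantee.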

### Question 3

_What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.)  For reference on how to build a deep neural network using TensorFlow, see [Deep Neural Network in TensorFlow](https://classroom.udacity.com/nanodegrees/nd013/parts/fbf77062-5703-404e-b60c-95b78b2f3f9e/modules/6df7ae49-c61c-4bb2-a23e-6527e69209ec/lessons/b516a270-8600-4f93-a0a3-20dfeabe5da6/concepts/83a3a2a2-a9bd-4b7b-95b0-eb924ab14432) from the classroom._


**Answer:** I used a convolutional neural network (ConvNet) that includes 3 convolutional layers, followed by two fully-connected layers and a final classification layer. Compared to the LeNet architecture, mine has one more convolutional layer because I felt the extra layer could allow the network to detect more complicated structures within each image. I used pooling only in the last convolutional layer because the lectures mentioned that pooling may remove too much information and that using dropout for regularization also prevents overfitting on the training data. I exclusively used valid padding because it reduces the spatial size of each layer's output, which shortens computation time while still not removing as much information as pooling does. The detailed architecture is as follows:

1st Convolutional Layer: 
Input is the data features, which is multiple 32x32x3 images.
The filter is a 5x5 window, whose initial weights are a truncated normal with mean 0 and standard dev 0.05.
The bias is originally all 0.
I used a VALID padding.
I used the ReLU activation.
The output size is 28x28x6.

2nd Convolutional Layer: 
The output of the 1st conv layer is the input here. Each input sample is 28x28x6.
The filter is a 5x5 window, whose initial weights are a truncated normal with mean 0 and standard dev 0.05.
The bias is originally all 0.
I used a VALID padding.
I used the ReLU activation.
The output size is 24x24x16.

3rd Convolutional Layer: 
The output of the 2nd conv layer is the input here. Each input sample is 24x24x16.
The filter is a 5x5 window, whose initial weights are a truncated normal with mean 0 and standard dev 0.05.
The bias is originally all 0.
I used a VALID padding.
I used the ReLU activation with 0.5 probability dropout.
I use a 2x2 max pooling operation with a stride of 2.
The output size is 10x10x32.

Flattened Layer:
The output of the 3rd conv layer is the input here. Each input sample is 10x10x32.
This is converted into a flat 1D layer of size 3200.

Fully Connected Layer 1:
The flattened input is of size 3200.
The weight matrix is a truncated normal with mean 0, standard dev 0.05, and shape 3200x120.
The bias is originally all 0.
I used the ReLU activation.
The output size is 120.

Fully Connected Layer 2:
The output of the 1st FC layer is the input here.
The weight matrix is a truncated normal with mean 0, standard dev 0.05, and shape 120x84.
The bias is originally all 0.
I used the ReLU activation.
The output size is 84.

Classification Layer:
The output of the 2nd FC layer is the input here.
The weight matrix is a truncated normal with mean 0, standard dev 0.05, and shape 84x43, where 43 is the number of classes in the dataset.
The bias is originally all 0.
No activation is applied here; the layer outputs raw logits.
The output size is 43.
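
The layer sizes listed above can be checked with a few lines of arithmetic. This sketch assumes the standard output-size formula for VALID padding, `(size - window) // stride + 1`; the helper name is hypothetical:

```python
def valid_output_size(size, window, stride=1):
    """Spatial output size of a VALID convolution or pooling window."""
    return (size - window) // stride + 1


size = 32                                     # input images are 32x32
shapes = []
for depth in (6, 16, 32):                     # three 5x5 VALID convolutions, stride 1
    size = valid_output_size(size, 5)
    shapes.append((size, size, depth))

size = valid_output_size(size, 2, stride=2)   # 2x2 max pooling after the third convolution
shapes[-1] = (size, size, 32)
flat = size * size * 32                       # length of the flattened layer
print(shapes, flat)
```

Each 5x5 VALID convolution trims 4 pixels from the width and height, and the final pooling step halves what remains.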

The output of the classification layer is run through a softmax operation to produce a vector of the probabilities of the input belonging to each class.
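
For reference, a minimal pure-Python sketch of the softmax operation on one vector of raw scores. The input values are made up, and subtracting the maximum is a common stability trick rather than something the text above specifies:

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]   # shift by the max to avoid overflow
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

Softmax preserves the ordering of the scores, so the most probable class is the same as the class with the largest raw output.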

In [5]:
### The following are helper functions that are used to build the convolution neural network

def conv_layer(input_layer, filter_size, num_input_channels, num_filters, stride=1, padding='SAME',
               relu=False, pooling=False, pool_size=2, drop=False, keep_prob=0.5):
    '''
        Create a convolution layer with filter size, output channels, stride, and other parameters set
        by the user.
        
        Arguments:
        num_filters - number of output channels
        relu - True if using relu, false otherwise
        pooling - 'max' or 'average' to apply that kind of pooling, False for no pooling
        pool_size - the H and W of the pooling filter
        drop - True if using dropout, false otherwise
        keep_prob - The probability that each activation is kept by dropout
    '''
    
    shape = [filter_size, filter_size, num_input_channels, num_filters]

    # Create filter weights with the given shape.
    weights = tf.Variable(tf.truncated_normal(shape, stddev=0.05))
    # print(weights.get_shape())

    # Create new biases, one for each filter.
    biases = tf.Variable(tf.zeros(num_filters))

    # Create the convolution operation
    conv_layer = tf.nn.conv2d(input=input_layer,
                             filter=weights,
                             strides=[1, stride, stride, 1],
                             padding=padding)

    # Add the biases to the convolution output
    conv_layer = tf.nn.bias_add(conv_layer, biases)
    
    if relu:
        # Create the ReLu activation function
        conv_layer = tf.nn.relu(conv_layer)
        
    if drop:
        # Use Dropout to prevent overfitting
        conv_layer = tf.nn.dropout(conv_layer, keep_prob)
        
    if pooling == 'max':
        # Create max-pooling
        conv_layer = tf.nn.max_pool(value=conv_layer,
                               ksize=[1, pool_size, pool_size, 1],
                               strides=[1, pool_size, pool_size, 1],
                               padding=padding)
    if pooling == 'average':
        # Create average-pooling
        conv_layer = tf.nn.avg_pool(value=conv_layer,
                               ksize=[1, pool_size, pool_size, 1],
                               strides=[1, pool_size, pool_size, 1],
                               padding=padding)

    print(conv_layer.get_shape())
    
    return conv_layer

    
def fc_layer(input_layer, num_features, fc_size, relu=False, drop=False, keep_prob=0.5):
    '''
        Create a fully connected layer
            
        Arguments:
        fc_size - Number of output channels
        relu - True if using relu, false otherwise
        drop - True if using dropout, false otherwise
        keep_prob - The probability that each activation is kept by dropout
    '''

    shape=[num_features, fc_size]

    weights = tf.Variable(tf.truncated_normal(shape, stddev=0.05))
    biases = tf.Variable(tf.zeros(fc_size))

    # Calculate the layer as the matrix multiplication of
    # the input and weights, and then add the bias-values.
    fc_layer = tf.nn.bias_add(tf.matmul(input_layer, weights), biases)
    
    if relu:
        fc_layer = tf.nn.relu(fc_layer)
    
    if drop:
        fc_layer = tf.nn.dropout(fc_layer, keep_prob)
        
    print(fc_layer.get_shape())
    
    return fc_layer

print('Network Helper Functions Created')

Network Helper Functions Created


In [6]:
### Create a ConvNet
from tensorflow.contrib.layers import flatten

# Create the input data placeholders
features = tf.placeholder(tf.float32, shape=[None, *train_features.shape[1:]])
labels = tf.placeholder(tf.float32, shape=[None, train_labels.shape[-1]])

filter_size = 5
output_size = 6

# Conv Layer 1
conv_layer_1 = conv_layer(features, filter_size, train_features.shape[-1], output_size, 
                          padding='VALID', relu=True)

# Conv Layer 2
conv_layer_2 = conv_layer(conv_layer_1, filter_size, output_size, 16, 
                          padding='VALID', relu=True)

# Conv Layer 3
conv_layer_3 = conv_layer(conv_layer_2, filter_size, 16, 32, 
                          padding='VALID', pooling='max', relu=True, drop=True)

# Flatten layer
flat_layer = flatten(conv_layer_3)
print(flat_layer.get_shape())

# Fully Connected Layer 1
num_features = int(flat_layer.get_shape()[-1])
fc1_size = 120
fc_layer_1 = fc_layer(flat_layer, num_features, fc1_size, relu=True)

# Fully Connected Layer 2
fc2_size = 84
fc_layer_2 = fc_layer(fc_layer_1, fc1_size, fc2_size, relu=True)

# Output Layer
output_layer = fc_layer(fc_layer_2, fc2_size, train_labels.shape[-1])

print('Architecture Created')

(?, 28, 28, 6)
(?, 24, 24, 16)
(?, 10, 10, 32)
(?, 3200)
(?, 120)
(?, 84)
(?, 43)
Architecture Created


In [7]:
## Creating the Softmax predictions, loss, and optimizer functions
prediction = tf.nn.softmax(output_layer)
y_pred_cls = tf.argmax(prediction, 1)
y_true_cls = tf.argmax(labels, 1)
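# Pass logits and labels by keyword so that the two arguments cannot be swapped by position
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=output_layer, labels=labels))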

learning_rate = 0.001

# Construct the optimizer
# optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss)

# Calculate accuracy
correct_prediction = tf.equal(y_pred_cls, y_true_cls)
accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

def evaluate(X_data, y_data, batch_size):
    '''Calculate the accuracy of the model by classifying a dataset'''
    
    num_examples = len(X_data)
    total_accuracy = 0
    sess = tf.get_default_session()
    for offset in range(0, num_examples, batch_size):
        batch_x, batch_y = X_data[offset:offset+batch_size], y_data[offset:offset+batch_size]
        accuracy = sess.run(accuracy_operation, feed_dict={features: batch_x, labels: batch_y})
        total_accuracy += (accuracy * len(batch_x))
    return total_accuracy / num_examples

# Get the top k predicted classes
top_5 = tf.nn.top_k(input=prediction, k=5)

def plot_confusion_matrix(cm, n_classes=None):
    '''Plot the confusion matrix of the true and predicted labels as an image.'''

    plt.matshow(cm)  
    if n_classes is None:
        n_classes = cm.shape[0]
        
    # Make various adjustments to the plot.
    plt.colorbar()
    tick_marks = np.arange(n_classes)
    plt.xticks(tick_marks, range(n_classes))
    plt.yticks(tick_marks, range(n_classes))
    plt.xlabel('Predicted')
    plt.ylabel('True')

    plt.show()
    
print('Optimization and Evaluation Functions Created')

Optimization and Evaluation Functions Created
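
`evaluate` weights each batch's accuracy by the batch length because the last batch is usually smaller than the others. A minimal pure-Python sketch of the same arithmetic follows. The function name `weighted_accuracy` and the sample flags are illustrative and not part of the notebook.

```python
def weighted_accuracy(correct_flags, batch_size):
    """Average per-batch accuracy, weighting each batch by its length."""
    num_examples = len(correct_flags)
    total = 0.0
    for offset in range(0, num_examples, batch_size):
        batch = correct_flags[offset:offset + batch_size]
        batch_accuracy = sum(batch) / len(batch)
        total += batch_accuracy * len(batch)
    return total / num_examples


# Illustrative data: 5 of 7 predictions are correct
flags = [1, 1, 0, 1, 0, 1, 1]
overall = weighted_accuracy(flags, batch_size=3)

# Unweighted mean of the three batch accuracies, for comparison
naive = (2 / 3 + 2 / 3 + 1 / 1) / 3

print(overall, naive)
```

The weighted result equals the plain fraction of correct predictions (5/7), while the unweighted mean overstates it because the single-item final batch counts as much as a full batch.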


In [None]:
## Read in new images taken from online or from a camera

# Sort the list of the filenames of the new images
sorted_files = sorted(os.listdir('cropped_images/'))
print(sorted_files)

# Load the new images into an array
new_images = np.array([mpimg.imread('cropped_images/' + name) for name in sorted_files])

# Make any modifications to the new images that were done to the training set
if grayscale:
    new_images = make_gray(new_images)
    
if norm:
    new_images = normalize_data(new_images)
    
print(new_images.shape)

# Associate labels with the new images
new_images_labels = np.tile(np.array([0]), 10)    
new_images_labels = np.append(new_images_labels, np.tile(np.array([1]), 8))
new_images_labels = np.append(new_images_labels, np.tile(np.array([2]), 9))
new_images_labels = np.append(new_images_labels, np.tile(np.array([3]), 9))
new_images_labels = np.append(new_images_labels, np.tile(np.array([4]), 9))
new_images_labels = np.append(new_images_labels, np.tile(np.array([5]), 8))

print(new_images_labels)
num_classes = max(new_images_labels) + 1

new_images_labels = one_hot_transform(new_images_labels, encoder)
print(new_images_labels.shape)

In [None]:
# Import shuffle from sklearn to shuffle training data after every epoch
from sklearn.utils import shuffle

# Class used to save and/or restore Tensor Variables
saver = tf.train.Saver()
init = tf.initialize_all_variables()

# The file path to save the data
save_file = 'model_rgb.ckpt'

valid_feed_dict = {features: valid_features, labels: valid_labels}
test_feed_dict = {features: test_features, labels: test_labels}
train_feed_dict = {features: train_dict_features, labels: train_dict_labels}

def run_model(train_features, train_labels, run_count, epochs, batch_size, save_file):
    # Launch the graph
    with tf.Session() as sess:
        if run_count == 0:
            print('Initiating the Model')
            sess.run(init)
        else:
            print('Restoring a Saved Model')
            saver.restore(sess, save_file)
        
        batch_count = int(math.ceil(len(train_features)/batch_size))
        
        # Training cycle
        for epoch_i in range(epochs):
            # Shuffle the training data
            train_features, train_labels = shuffle(train_features, train_labels)
            # Progress bar
            batches_pbar = tqdm(range(batch_count), desc='Epoch {:>2}/{}'.format(epoch_i+1, epochs), unit='batches')

            # The training cycle
            for batch_i in batches_pbar:
                # Get a batch of training features and labels
                batch_start = batch_i*batch_size
                if batch_start >= len(train_features):
                    break
                batch_end = min(batch_start + batch_size, len(train_features))
                batch_features = train_features[batch_start:batch_end]
                batch_labels = train_labels[batch_start:batch_end]

                # Run optimizer and get loss
                _, l = sess.run(
                    [optimizer, loss],
                    feed_dict={features: batch_features, labels: batch_labels})

            # Check accuracy against Validation data
            validation_accuracy = evaluate(valid_features, valid_labels, batch_size)
            training_accuracy = evaluate(train_dict_features, train_dict_labels, batch_size)
                  
            # Display logs per epoch step
            print("Epoch:", '%04d' % (epoch_i+1), "loss=", "{:.9f}".format(l)) 
            print("Validation Accuracy=", "{:.9f}".format(validation_accuracy), 
                  "Training Accuracy=", "{:.9f}".format(training_accuracy))

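        print("Optimization Finished!")

        # Save the trained variables so a later call can restore them
        saver.save(sess, save_file)
        print('Model Saved to {}'.format(save_file))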

        test_accuracy = evaluate(test_features, test_labels, batch_size)
        print('Test accuracy at {}'.format(test_accuracy))

        # Check the accuracy of new images
        new_data_accuracy, cls_true, cls_pred = sess.run([accuracy_operation, y_true_cls, y_pred_cls],
                                                          feed_dict={features: new_images, labels: new_images_labels})
        
        print('New data accuracy at {}'.format(new_data_accuracy))
        
        # Check the confusion matrix of the predicted versus actual labels
        cm = confusion_matrix(cls_true, cls_pred, labels=range(43))
        plot_confusion_matrix(cm, 43)
    
        # Get the probabilities and labels of the top 5 predicted classifications
        top_5_prob = sess.run(top_5, feed_dict={features: new_images, labels: new_images_labels})
        
        run_count += epochs
        # No explicit sess.close() is needed: the `with` block closes the session on exit
        
        return run_count, confusion_matrix(cls_true, cls_pred), cls_true, cls_pred, top_5_prob

print('Model Created')

In [9]:
run_count = 0
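# run_model returns five values; unpack them so run_count stays an integer
run_count, cm, cls_true, cls_pred, top_5_prob = run_model(train_features, train_labels, run_count, 10, 20, save_file)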

Initiating the Model


Epoch  1/10: 100%|██████████| 5817/5817 [04:54<00:00, 19.73batches/s]


Epoch: 0001 loss= 0.037677098
Validation Accuracy= 0.901550378 Training Accuracy= 0.956589140


Epoch  2/10: 100%|██████████| 5817/5817 [04:30<00:00, 23.12batches/s]


Epoch: 0002 loss= 0.061136287
Validation Accuracy= 0.951937979 Training Accuracy= 0.979844956


Epoch  3/10: 100%|██████████| 5817/5817 [04:22<00:00, 22.15batches/s]


Epoch: 0003 loss= 0.016174113
Validation Accuracy= 0.953488366 Training Accuracy= 0.979069762


Epoch  4/10: 100%|██████████| 5817/5817 [04:25<00:00, 21.87batches/s]


Epoch: 0004 loss= 0.339343786
Validation Accuracy= 0.952713171 Training Accuracy= 0.983720926


Epoch  5/10: 100%|██████████| 5817/5817 [04:36<00:00, 21.06batches/s]


Epoch: 0005 loss= 0.250322938
Validation Accuracy= 0.944186038 Training Accuracy= 0.987596896


Epoch  6/10: 100%|██████████| 5817/5817 [04:43<00:00, 19.50batches/s]


Epoch: 0006 loss= 0.004106718
Validation Accuracy= 0.943410846 Training Accuracy= 0.985271315


Epoch  7/10: 100%|██████████| 5817/5817 [04:36<00:00, 24.03batches/s]


Epoch: 0007 loss= 0.003670453
Validation Accuracy= 0.967441854 Training Accuracy= 0.986821703


Epoch  8/10: 100%|██████████| 5817/5817 [04:30<00:00, 23.04batches/s]


Epoch: 0008 loss= 0.019639447
Validation Accuracy= 0.962015496 Training Accuracy= 0.985271314


Epoch  9/10: 100%|██████████| 5817/5817 [04:24<00:00, 21.98batches/s]


Epoch: 0009 loss= 0.035604242
Validation Accuracy= 0.952713171 Training Accuracy= 0.993798448


Epoch 10/10: 100%|██████████| 5817/5817 [04:24<00:00, 21.98batches/s]


Epoch: 0010 loss= 0.004415096
Validation Accuracy= 0.955038752 Training Accuracy= 0.989922478
Optimization Finished!
Model Saved to model_rgb.ckpt
Validation accuracy at 0.9542635679244995
Test accuracy at 0.9463974833488464
[16  1 38 33 11 38 18 12 25 35 12  7 23  7  4  9 21 20 27 38  4 33  9  3  1
 11 13 10  9 11  5 17 34 23  2 17  3 12 16  8  7 30 18 12 24 25  3 10 18  8
 25 13 15  9 13 35  5 26  9 16 38 10  4  9 15  9 26  2  5 28 11 25 30 34  5
 12  1 10 25 25 21 33 25  7 10 35  3  7 22 13  3  1  2 14 12 32  3 38  9 33
  1 10  5 11 33  4 35 25 33  4  1 14 16 10 30  3 27 29  1 17 13  7  1  8  2
 10 10 30  1  6 36  3 14 13 11 10 18 40  2 38 41  4  6 18 17 25  2 41 11 21
  7 24 11 25 17  3  6  9  7  4 13 16  4 25 18  9 13 14 29 17 13 38 26 25 33
  1  3 40 13  2  8  4 36 25 20 25 18  1 10  8 10 29 12 38 31  2  8 38 18 28
 17  9  4  1 17  9  2 31 13 15 15 38 25  5 25 13 10  5  4 10  2  4  5  1 14
 12 12  5  8 36 25 13 33 18 33 19 12 30  4 18 12 13 20  0 10 40  5  8 12 38
 20 14  0 36 34

In [None]:
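# Continue training from the saved checkpoint (run_count is now greater than 0)
run_count, cm, cls_true, cls_pred, top_5_prob = run_model(train_features, train_labels, run_count, 10, 20, save_file)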

Restoring a Saved Model


### Question 4

_How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)_


**Answer:**

### Question 5


_What approach did you take in coming up with a solution to this problem?_

**Answer:**

---

## Step 3: Test a Model on New Images

Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.

You may find `signnames.csv` useful as it contains mappings from the class id (integer) to the actual sign name.
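
One way to read that mapping is with the standard `csv` module. The sketch below assumes the file has a header row with the columns `ClassId` and `SignName`. Those column names, the helper `load_sign_names`, and the two sample rows are assumptions for illustration, so check them against your copy of the file.

```python
import csv
import io


def load_sign_names(csv_file):
    """Return a dict mapping integer class id to sign name."""
    reader = csv.DictReader(csv_file)
    # Column names are assumed; adjust them to match the actual header row
    return {int(row['ClassId']): row['SignName'] for row in reader}


# Illustrative stand-in for the real file so the example runs on its own
sample = io.StringIO(
    "ClassId,SignName\n"
    "0,Speed limit (20km/h)\n"
    "1,Speed limit (30km/h)\n"
)
sign_names = load_sign_names(sample)
print(sign_names[1])
```

With the real file, replace the `io.StringIO` stand-in with `open('signnames.csv', newline='')` inside a `with` block.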

### Implementation

Use the code cell (or multiple code cells, if necessary) to implement this step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [3]:
### Load the images and plot them here.
### Feel free to use as many code cells as needed.

### Question 6

_Choose five candidate images of traffic signs and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult? It would be helpful to plot the images in the notebook._



**Answer:**

In [4]:
### Run the predictions here.
### Feel free to use as many code cells as needed.

### Question 7

_Is your model able to perform equally well on captured pictures or a live camera stream when compared to testing on the dataset?_


**Answer:**

In [None]:
### Visualize the softmax probabilities here.
### Feel free to use as many code cells as needed.

### Question 8

*Use the model's softmax probabilities to visualize the **certainty** of its predictions; [`tf.nn.top_k`](https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#top_k) could prove helpful here. Which predictions is the model certain of? Uncertain? If the model was incorrect in its initial prediction, does the correct prediction appear in the top k? (k should be 5 at most)*
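
To make the idea concrete, here is a pure-Python sketch of what a top-k lookup over softmax probabilities computes for a single image. It is a stand-in for `tf.nn.top_k`, not the TensorFlow implementation, and the helper names and logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return the k largest probabilities and their class indices."""
    indices = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [probs[i] for i in indices], indices


# Illustrative logits for one image over six hypothetical classes
logits = [2.0, 1.0, 0.1, 3.0, -1.0, 0.5]
probs = softmax(logits)
values, indices = top_k(probs, k=5)

for p, i in zip(values, indices):
    print('class {}: {:.3f}'.format(i, p))
```

A confident prediction shows one probability close to 1, while an uncertain one spreads the mass across several classes. If the true class is missing from `indices`, the model did not rank it among its five best guesses.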


**Answer:**

### Question 9
_If necessary, provide documentation for how an interface was built for your model to load and classify newly-acquired images._


**Answer:**

> **Note**: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the iPython Notebook as an HTML document. You can do this by using the menu above and navigating to **File -> Download as -> HTML (.html)**. Include the finished document along with this notebook as your submission.