# Self-Driving Car Engineer Nanodegree


## Project: **Traffic Sign Classifier** 
***
This is an ML-based traffic sign classifier implemented with several CNN architectures using the Keras framework. 

The project implementation is structured as follows. 

---
>**Traffic Sign Classification Project**

>1. Data Process
>    1. Data is analyzed to highlight low-frequency categories and suitably augmented. 
>    2. The augmented data is then normalized before pickling.
>2. ML network classifiers
>    1. Three different architectures are used to compare classifier performance. The three network architectures chosen are LeNet, AlexNet and GoogLeNet. 
***


## 1. Data Processor

**Test data source & references**

1. Data-set from the GTSRB web-site as per project requirements -- [German Traffic Sign Repo](http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset#Imageformat)

2. Network model examples were derived from GitHub repos, especially the variant of AlexNet adapted to fit the data-set constraints -- [liferlisiqi GitHub](https://github.com/liferlisiqi/Traffic-Sign-Classifier/blob/master/README.md) 

This program reads from the data-source and performs basic data analysis and summarization. It then normalizes the data and caches the normalized data sets in a pickle file for further network processing. 
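
A minimal sketch of the normalize-and-cache step described above. The `(pixel - 128) / 128` scaling, the function names and the dictionary keys are assumptions made for illustration; the project's actual normalization and pickle layout may differ.

```python
import pickle

import numpy as np


def normalize_images(images):
    """Map uint8 pixel values in [0, 255] to floats in [-1, 1)."""
    return (images.astype(np.float32) - 128.0) / 128.0


def cache_dataset(path, features, labels):
    """Pickle the normalized features and their labels to `path`."""
    with open(path, "wb") as f:
        pickle.dump({"features": features, "labels": labels}, f)


def load_dataset(path):
    """Read back a data set written by `cache_dataset`."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    return data["features"], data["labels"]
```

The network notebooks would then call something like `load_dataset` instead of re-reading and re-normalizing the raw images.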

>**Note**: Data normalization is applied to the given data-set as well as the downloaded German Traffic Sign data-base. See sections towards the end of this notebook for the GTSRB data-processing. 

Sample training data set visualization. 
<img src="./TrafficSigns.png" alt="Sample Training Data Set" width=450/>

- - - - 

### 1.1 Data Augmentation
Data augmentation is used to enlarge the training set, giving a more generalized & robust network. 

Simple augmentation techniques are employed here which include
 
 1. Small random image dithers: samples randomly perturbed in 
     - Position ([-2,2] pixels)
     - Scale ([.9,1.1] ratio) and 
     - Rotation ([-15,+15] degrees)
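
The dither ranges listed above can be sketched as a single random affine warp. This is a NumPy-only illustration with nearest-neighbour sampling and edge replication at the borders; the function name, the border handling and the interpolation are assumptions, and the project may well rely on an image library instead.

```python
import numpy as np


def random_dither(image, rng, max_shift=2.0, scale_range=(0.9, 1.1),
                  max_angle_deg=15.0):
    """Randomly shift, scale and rotate `image` about its centre.

    The output keeps the input height, width and dtype.
    """
    h, w = image.shape[:2]
    shift = rng.uniform(-max_shift, max_shift, size=2)  # (dy, dx) in pixels
    scale = rng.uniform(*scale_range)
    angle = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))

    # Forward model: out = scale * R @ (src - centre) + centre + shift.
    # Sample with the inverse: src = R.T @ (out - centre - shift) / scale + centre.
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    rot = np.array([[cos_a, -sin_a], [sin_a, cos_a]])
    inv = rot.T / scale
    centre = np.array([(h - 1) / 2.0, (w - 1) / 2.0])

    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    out_coords = np.stack([ys, xs], axis=-1).astype(float)
    src = (out_coords - centre - shift) @ inv.T + centre

    # Nearest-neighbour lookup; out-of-range samples reuse the edge pixels.
    src = np.rint(src).astype(int)
    src_y = np.clip(src[..., 0], 0, h - 1)
    src_x = np.clip(src[..., 1], 0, w - 1)
    return image[src_y, src_x]
```

Passing a seeded generator, for example `np.random.default_rng(0)`, makes the augmented set reproducible.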
     
Images are selected at random, with equal sampling from each traffic sign class ID, and the dithers above are applied to 1/3rd of the samples. 

A simpler technique is to shuffle the selected training set and apply the dithering to each 1/3rd partition in turn. This is the approach used here, as it simplifies the data augmentation process.

>**Note**: For augmentation, only 1/3 of the entire training set is chosen, so each of the 3 augmentations is applied to only 1/9th of the original training images. This is done primarily to save disk space, as the combined data set consumes only 1.33x the original space.  
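
The 1/3 and 1/9 bookkeeping can be sketched as follows. The function and dither names are hypothetical, and the per-class balancing mentioned earlier is left out to keep the example short.

```python
import random

DITHERS = ("position", "scale", "rotation")


def plan_augmentation(n_train, seed=0):
    """Return {dither_name: [indices]} covering 1/3 of the training set.

    The shuffled 1/3 subset is cut into three equal partitions, one per
    dither, so each dither touches about 1/9 of the original images.
    """
    rng = random.Random(seed)
    indices = list(range(n_train))
    rng.shuffle(indices)
    subset = indices[: n_train // 3]
    part = len(subset) // len(DITHERS)
    return {
        name: subset[i * part:(i + 1) * part]
        for i, name in enumerate(DITHERS)
    }
```

With 900 training images, each dither receives 100 images, and the combined set holds 1200 images, i.e. about 1.33x the original.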

>**Note**: The training image size & sign RoI locations are not modified after augmentation and are replicated in the augmented training set as-is. The argument is that the dithered/scaled/rotated images are warped/scaled back to the original size, so the RoI for the labelled traffic sign should not vary by much, if at all. 

The following plot shows the training data set before & after augmentation, with a "more uniform" distribution of categories after augmentation. 
<img src="./TrafficSigns_DataClassDistributions_BeforeAfter.png" alt="Augmented Training Data Distribution" width=450/>
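
The class distribution behind a plot like this can be tallied with a few lines of Python. The helper below is a hypothetical sketch: it flags classes whose sample count falls below a chosen fraction of the mean count, and the `0.5` default is an arbitrary illustration rather than a value taken from the project.

```python
from collections import Counter


def low_frequency_classes(labels, fraction_of_mean=0.5):
    """Return the class IDs whose count is below fraction_of_mean * mean count."""
    counts = Counter(labels)
    mean_count = sum(counts.values()) / len(counts)
    threshold = fraction_of_mean * mean_count
    return sorted(c for c, n in counts.items() if n < threshold)
```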



## 3. CNN Classifiers

After data augmentation & processing, 3 different CNN classifier architectures are tested. 

### 3.1 LeNet 
This program reads pre-processed data from pickled data-source and applies the LeNet model for traffic sign classification.
The TensorFlow model is then pickled in the `./models` sub-directory for the final evaluation phase.  

>**Strategy**: 
> 1. Training hyper-parameters were selected after a fair number of parameter sweeps, but the final choice is essentially empirical. The values set in this notebook give good accuracy for the particular data-sets chosen for validation & testing. 
> 2. A drop-out scheme was chosen as a simpler mechanism to reduce weight variance rather than L2-regularization or other more elaborate methods. 
> 3. An early training termination criterion was chosen: training stops when the per-epoch training accuracy (measured on the validation set) is found to decrease by $\le \epsilon=10^{-3}$. This was done primarily to shorten training wall-clock times for a model with *sufficient* accuracy for the particular classifier task.  
> 4. Note that the early termination kicks in only after _a minimum number of epochs_ (chosen here to be $E = 15$) have been run. 
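
A sketch of the early-termination rule in items 3 and 4. It reads the criterion as "stop once the epoch-to-epoch change in validation accuracy is at most $\epsilon$", which is an assumption about the intended meaning; the function name is hypothetical.

```python
def should_stop_early(val_accuracy, min_epochs=15, epsilon=1e-3):
    """Decide whether to stop training after the latest epoch.

    val_accuracy: list of per-epoch validation accuracies recorded so far.
    """
    if len(val_accuracy) < max(min_epochs, 2):
        return False  # still inside the mandatory minimum number of epochs
    gain = val_accuracy[-1] - val_accuracy[-2]
    return gain <= epsilon  # plateau or drop in validation accuracy
```

The training loop would append the validation accuracy after each epoch and break out as soon as the function returns `True`.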

#### 3.1.1 LeNet Model Architecture

Design and implement a deep learning model that learns to recognize traffic signs. Train and test your model on the [German Traffic Sign Dataset](http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset).

The LeNet-5 implementation shown in the [classroom](https://classroom.udacity.com/nanodegrees/nd013/parts/fbf77062-5703-404e-b60c-95b78b2f3f9e/modules/6df7ae49-c61c-4bb2-a23e-6527e69209ec/lessons/601ae704-1035-4287-8b11-e2c2716217ad/concepts/d4aca031-508f-4e0b-b493-e7b706120f81) at the end of the CNN lesson is a solid starting point. You'll have to change the number of classes and possibly the preprocessing, but aside from that it's plug and play! 

With the LeNet-5 solution from the lecture, you should expect a validation set accuracy of about 0.89. To meet specifications, the validation set accuracy will need to be at least 0.93. It is possible to get an even higher accuracy, but 0.93 is the minimum for a successful project submission. 

There are various aspects to consider when thinking about this problem:

- Neural network architecture (is the network over or underfitting?)
- Play around with preprocessing techniques (normalization, rgb to grayscale, etc)
- Number of examples per label (some have more than others).
- Generate fake data.

Here is an example of a [published baseline model on this problem](http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf). It's not required to be familiar with the approach used in the paper, but it's good practice to try to read papers like these.

>**Note**: No padding is required for this implementation as the normalized data-set has been re-sized to 32 x 32.

>**Note**: RGB images are used here; set `USE_GRAYSCALE = True` to use gray-scaled versions. 

#### 3.1.2 TensorFlow Environment Hyper-Parameters

Re-shuffle the data-set to remove hidden biases in the labelled sequences.
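
One way to re-shuffle features and labels together is to apply a single random permutation to both arrays. The helper name is hypothetical and the sketch assumes NumPy arrays of equal length.

```python
import numpy as np


def shuffle_in_unison(features, labels, seed=None):
    """Shuffle features and labels with the same random permutation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(labels))
    return features[order], labels[order]
```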

**Hyperparameter Record** 

> *LeNet*: `USE_GRAYSCALE = False`, `EPOCHS = 50`, `BATCH_SIZE = 128`, `KEEP_PROB = .85`, `LEARN_RATE = 1e-3`, `USE_SGD = 'Adam'` 

#### 3.1.3 LeNet5 Network

![LeNet5 Architecture](./lenet_archdiagram.png)

**Model Layer Architecture**

Input data is a normalized RGB frame of size `32x32` pixels. The LeNet network base-layers can be used almost without modification. The only change is to modify the final fully-connected layer to detect `n_classes` categories instead of the standard 10 categories from LeNet's original implementation.  

As implemented, the network below has the following layer organization: (derived from a Keras summary implementation of the network) 
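
As a rough illustration of such a layer organization, the sketch below walks a `32x32` RGB input through the classic LeNet-5 layer sizes (5x5 unpadded convolutions with 6 and 16 filters, 2x2 pooling, dense layers of 120 and 84 units). These sizes are assumed from the standard LeNet-5 design and are not read from this project's code; `n_classes = 43` is the number of GTSRB classes.

```python
def conv_out(size, kernel, stride=1):
    """Output width of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1


def lenet5_shapes(size=32, channels=3, n_classes=43):
    """List (layer name, output shape) pairs for a classic LeNet-5 stack."""
    shapes = [("input", (size, size, channels))]
    filters = channels
    for name, filters in (("conv1", 6), ("conv2", 16)):
        size = conv_out(size, kernel=5)            # 5x5 valid convolution
        shapes.append((name, (size, size, filters)))
        size = conv_out(size, kernel=2, stride=2)  # 2x2 pooling, stride 2
        shapes.append((name.replace("conv", "pool"), (size, size, filters)))
    shapes.append(("flatten", (size * size * filters,)))
    for name, units in (("fc1", 120), ("fc2", 84), ("logits", n_classes)):
        shapes.append((name, (units,)))
    return shapes


for layer_name, layer_shape in lenet5_shapes():
    print(f"{layer_name:8s} {layer_shape}")
```

The arithmetic also shows why no padding is needed: a `32x32` input shrinks to `5x5x16 = 400` features before the dense layers.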

#### 3.1.4 Model Performance on Test Images

Apply the validated model to the test set and predict the traffic-sign class ID. Analyze performance to meet the minimum accuracy of `> 95%`. 

There are two test modes: using pickled test data & using traffic sign images from the German traffic sign data-base. 
Obtain the top-5 detected class IDs along with the detection probabilities (`softmax` values).
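
The top-5 step can be sketched in NumPy as below. It assumes the model exposes raw logits with one row per image; the function name is hypothetical, and a framework-provided top-k operation would serve equally well.

```python
import numpy as np


def top_k_softmax(logits, k=5):
    """Return (class IDs, probabilities) of the k most likely classes per row."""
    logits = np.asarray(logits, dtype=float)
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    top_ids = np.argsort(-probs, axis=-1)[..., :k]
    top_probs = np.take_along_axis(probs, top_ids, axis=-1)
    return top_ids, top_probs
```

Each returned row is sorted from the most to the least probable class.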

LeNet Test Accuracy: 
<img src="./TrafficSigns_LeNet_AccuracyCompare.png" alt="LeNet Test Accuracy" width=450/>

LeNet Top-5 Probability Performance: 
<img src="./TrafficSigns_LeNet_TestImages_Top5Probs.png" alt="LeNet Top-5 Probability Performance" width=450/>

GTSRB Test Evaluation:
<img src="./TrafficSigns_LeNet_GTSRBTestImages_Top5Probs.png" alt="LeNet GTSRB Test Evaluation" width=450/>


### 3.2 AlexNet

This program reads pre-processed data from pickled data-source and applies the AlexNet model for traffic sign classification.
The TensorFlow model is then pickled in the `./models` sub-directory for the final evaluation phase.  

>**Strategy**: 
> 1. Training hyper-parameters were selected after a fair number of parameter sweeps, but the final choice is essentially empirical. The values set in this notebook give good accuracy for the particular data-sets chosen for validation & testing. 
> 2. An L2 regularization scheme is chosen as in the standard AlexNet architecture. 
> 3. An early training termination criterion was chosen: training stops when the per-epoch training accuracy (measured on the validation set) is found to decrease by $\le \epsilon=10^{-3}$. This was done primarily to shorten training wall-clock times for a model with *sufficient* accuracy for the particular classifier task.  
> 4. Note that the early termination kicks in only after _a minimum number of epochs_ (chosen here to be $E = 15$) have been run. 
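
A sketch of how an L2 penalty enters the training loss. It assumes the `BETA` value in the hyperparameter record is the L2 coefficient and that the penalty is the plain sum of squared weights; some frameworks use half that sum, so the effective strength may differ by a factor of two.

```python
import numpy as np


def l2_penalty(weights, beta=1e-5):
    """Return beta times the sum of squared entries over all weight arrays."""
    return beta * sum(float(np.sum(np.square(w))) for w in weights)


def regularized_loss(data_loss, weights, beta=1e-5):
    """Add the L2 penalty to the data term (e.g. mean cross-entropy)."""
    return data_loss + l2_penalty(weights, beta)
```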

#### 3.2.1 AlexNet Model Architecture

>**Note**: No padding is required for this implementation as the normalized data-set has been re-sized to 32 x 32.

>**Note**: RGB images are used here; set `USE_GRAYSCALE = True` to use gray-scaled versions. 

#### 3.2.2 TensorFlow Environment Hyper-Parameters

Re-shuffle the data-set to remove hidden biases in the labelled sequences.

**Hyperparameter Record** 

> *AlexNet*: `USE_GRAYSCALE = False`, `EPOCHS = 30`, `BATCH_SIZE = 128`, `KEEP_PROB = .5`, `LEARN_RATE = 5e-4`, `USE_SGD = 'Adam'`, `BETA=1e-5` 

#### 3.2.3 AlexNet Network

![AlexNet Architecture](./alexnet2012_archdiagram.png)

---
**Model Layer Architecture**

Input data is a normalized RGB frame of size `32x32` pixels. The AlexNet network base-layers can be used almost without modification. The required changes are to the final fully-connected layer, which must detect `n_classes` categories instead of the 1000 ImageNet categories of the original AlexNet implementation, plus some modifications to suit the data-set constraints.  

As implemented, the network below has the following layer organization: (derived from a Keras summary implementation of the network) 

#### 3.2.3 Train & Validate the Model

Run the training data through the training pipeline to train the model.

Before each epoch, shuffle the training set.

After each epoch, measure the loss and accuracy of the validation set.

A validation set can be used to assess how well the model is performing. Low accuracy on both the training and validation
sets implies underfitting. High accuracy on the training set but low accuracy on the validation set implies overfitting.

Save the model after training.

AlexNet Accuracy Curve: 
<img src="./TrafficSigns_AlexNet_AccuracyCurve.png" alt="AlexNet Accuracy Curve" width=450/>

#### 3.2.4 Model Performance on Test Images

Apply the validated model to the test set and predict the traffic-sign class ID. Analyze performance to meet the minimum accuracy of `> 95%`. 

There are two test modes: using pickled test data & using traffic sign images from the German traffic sign data-base. 
Obtain the top-5 detected class IDs along with the detection probabilities (`softmax` values).

AlexNet Test Accuracy Comparison: 
<img src="./TrafficSigns_AlexNet_AccuracyCompare.png" alt="AlexNet Test Accuracy Comparison" width=450/>

AlexNet Sample Test Output: 
<img src="./TrafficSigns_AlexNet_TestImages.png" alt="AlexNet Sample Test Output" width=450/>

AlexNet Top-5 Detection Probabilities:
<img src="./TrafficSigns_AlexNet_TestImages_Top5Probs.png" alt="AlexNet Top-5 Detection Probabilities" width=450/>

GTSRB Classification Results: 
<img src="./TrafficSigns_AlexNet_GTSRBTestImages_Top5Probs.png" alt="AlexNet GTSRB Classification Results" width=450/>


### 3.3 GoogLeNet
This program reads pre-processed data from pickled data-source and applies the GoogLeNet Inceptionv3 model for traffic sign classification.
The TensorFlow model is then pickled in the `./models` sub-directory for the final evaluation phase.  

>**Strategy**: 
> 1. Training hyper-parameters were selected after a fair number of parameter sweeps, but the final choice is essentially empirical. The values set in this notebook give good accuracy for the particular data-sets chosen for validation & testing. 
> 2. A drop-out regularization scheme is chosen as in the standard Inceptionv3-based GoogLeNet architecture. 
> 3. An early training termination criterion was chosen: training stops when the per-epoch training accuracy (measured on the validation set) is found to decrease by $\le \epsilon=10^{-3}$. This was done primarily to shorten training wall-clock times for a model with *sufficient* accuracy for the particular classifier task.  
> 4. Note that the early termination kicks in only after _a minimum number of epochs_ (chosen here to be $E = 15$) have been run. 
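
For reference, drop-out with a keep probability can be sketched as follows. This is the common "inverted" formulation written in NumPy for illustration; the function name is hypothetical and the project itself presumably relies on the framework's built-in drop-out layer.

```python
import numpy as np


def dropout(activations, keep_prob, rng):
    """Inverted drop-out for training time.

    Each unit is kept with probability keep_prob and the survivors are
    scaled by 1 / keep_prob so the expected activation is unchanged.
    """
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob
```

At evaluation time drop-out is disabled, which corresponds to `keep_prob = 1.0` here.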

#### 3.3.1 GoogLeNet Model Architecture

>**Note**: No padding is required for this implementation as the normalized data-set has been re-sized to 32 x 32.

>**Note**: RGB images are used here; set `USE_GRAYSCALE = True` to use gray-scaled versions. 

#### 3.3.2 TensorFlow Environment Hyper-Parameters

Re-shuffle the data-set to remove hidden biases in the labelled sequences.

**Hyperparameter Record** 

> *GoogLeNet*: `USE_GRAYSCALE = False`, `EPOCHS = 35`, `BATCH_SIZE = 128`, `KEEP_PROB = .5`, `LEARN_RATE = 4e-4`, `USE_SGD = 'Adam'` 
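
The three hyperparameter records in this write-up can be gathered into one structure so the networks are easy to compare. The dataclass and its field names are hypothetical; the values are copied from the records for LeNet, AlexNet and GoogLeNet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HyperParams:
    use_grayscale: bool
    epochs: int
    batch_size: int
    keep_prob: float
    learn_rate: float
    optimizer: str                 # recorded as USE_SGD in the notebooks
    beta: Optional[float] = None   # only recorded for AlexNet


HYPERPARAMS = {
    "LeNet": HyperParams(False, 50, 128, 0.85, 1e-3, "Adam"),
    "AlexNet": HyperParams(False, 30, 128, 0.5, 5e-4, "Adam", beta=1e-5),
    "GoogLeNet": HyperParams(False, 35, 128, 0.5, 4e-4, "Adam"),
}
```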

#### 3.3.3 GoogLeNet Network (& Inception v3 Model)

![GoogLeNet Architecture](./googlenet_archdiagram.png)


----

**Model Layer Architecture**

Input data is a normalized RGB frame of size `32x32` pixels. The Inception v3 based GoogLeNet network base-layers can be used almost without modification. The required changes are to the final fully-connected layer, which must detect `n_classes` categories instead of the 1000 ImageNet categories of the original implementation, plus some modifications to suit the data-set constraints.  

As implemented, the network below has the following layer organization: (derived from a Keras summary implementation of the network)  
>**Note**: The following code is taken directly from the Keras documentation: [Keras Application Inception v3](https://keras.io/applications/#inceptionv3)

GoogLeNet Learning Curve: 
<img src="./TrafficSigns_GoogLeNet_LearningCurve.png" alt="GoogLeNet Learning Curve" width=450/>

GoogLeNet Accuracy Curve: 
<img src="./TrafficSigns_GoogLeNet_AccuracyCurve.png" alt="GoogLeNet Accuracy Curve" width=450/>


#### 3.3.4 Model Performance on Test Images

Apply the validated model to the test set and predict the traffic-sign class ID. Analyze performance to meet the minimum accuracy of `> 95%`. 

There are two test modes: using pickled test data & using traffic sign images from the German traffic sign data-base. 
Obtain the top-5 detected class IDs along with the detection probabilities (`softmax` values).

GoogLeNet Test Accuracy: 
<img src="./TrafficSigns_GoogLeNet_AccuracyCompare.png" alt="GoogLeNet Test Accuracy" width=450/>

GoogLeNet Top-5 Probability Performance: 
<img src="./TrafficSigns_GoogLeNet_TestImages.png" alt="GoogLeNet Top-5 Probability Performance" width=450/>

GTSRB Test Evaluation:
<img src="./TrafficSigns_GoogLeNet_GTSRBTestImages_Top5Probs.png" alt="GoogLeNet GTSRB Test Evaluation" width=450/>

---

## 4. Discussion

A few enhancements that can make the implementation more robust:

1. Data-augmentation to account for poor lighting conditions may improve outlier performance for some hazy traffic sign images. 