# **Writeup | Self-Driving Car project - Deep Learning**

**Abstract — This notebook is the writeup of the Traffic Sign Recognition project.** We apply Deep Learning and Convolutional Networks (ConvNets) to the task of traffic sign classification as part of the SELF-DRIVING CAR nanodegree program. The project is broken down into three steps, which are:   
- Step 1: Data Set Summary and Exploration
- Step 2: Design and Test a Model Architecture
- Step 3: Test a Model on New Images

The model yielded an accuracy of **9x.xx%** with a loss of **xx.xx**, above the human performance of 98.81%, using 32x32 preprocessed input images.

Beyond the initial requirements, I also implemented several TensorBoard features (embedding visualizer; summaries of images, loss, accuracy, weights and biases), compared different architectures, and compared the model's results with a Google Images search.

Here is the link to the [PROJECT SPECIFICATION](https://review.udacity.com/#!/rubrics/481/view) and here to my [PROJECT CODE](https://github.com/chatmoon/Traffic-Sign-Classifier-Project/blob/master/_2_WIP/_JNBK_/_TSC-step2.1_170309-1557_WIP.ipynb).

---
### Step 1: Data Set Summary & Exploration

The goals here are the following:
* Load the [data set](http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset) and summarize it
* Explore and visualize the data set

#### 1.1. Load and summarize the data set  

*The code that loads the data set is in code cell [2] of the IPython notebook,   
and the code for the summary is in code cell [4].*   

I used the NumPy library to calculate summary statistics of the traffic sign data set:   
* The size of training set is 34799
* The size of validation set is 4410
* The size of test set is 12630
* The shape of a traffic sign image is (32, 32, 3)
* The number of unique classes/labels in the data set is 43
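
The statistics above can be reproduced with a few NumPy calls. The sketch below is only an illustration: the function name `summarize` and the small random stand-in arrays are assumptions, and the real arrays are the ones loaded in the notebook.

```python
import numpy as np

def summarize(X_train, y_train, X_valid, X_test):
    """Return basic statistics for image arrays shaped (N, H, W, C)."""
    return {
        "n_train": X_train.shape[0],
        "n_valid": X_valid.shape[0],
        "n_test": X_test.shape[0],
        "image_shape": X_train.shape[1:],
        "n_classes": int(np.unique(y_train).size),
    }

# Small stand-in data with the same layout as the real set (hypothetical sizes).
rng = np.random.default_rng(0)
X_train = rng.integers(0, 256, size=(86, 32, 32, 3), dtype=np.uint8)
y_train = np.arange(86) % 43
X_valid = np.zeros((10, 32, 32, 3), dtype=np.uint8)
X_test = np.zeros((20, 32, 32, 3), dtype=np.uint8)

stats = summarize(X_train, y_train, X_valid, X_test)
print(stats)
```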

#### 1.2. An exploratory visualization of the dataset
##### 1.2.1. Explore the data set

*The code for this step is contained in the code cells from [6] to [8] of the IPython notebook.*

In this part, we have three representations of the data set:   
- fig.1: a list showing the number of occurrences per traffic sign name
- fig.2: a bar chart showing the number of occurrences per class id
- fig.3: a bar chart showing the distribution of these traffic sign images across the data set

>![fig.1](https://raw.githubusercontent.com/chatmoon/Traffic_Sign_Classifier/master/images/_dataExplo_1_showList.PNG)
>![fig.2](https://raw.githubusercontent.com/chatmoon/Traffic_Sign_Classifier/master/images/_dataExplo_2_showChart.png)
>![fig.3](https://raw.githubusercontent.com/chatmoon/Traffic_Sign_Classifier/master/images/_dataExplo_3_showDist.png)

**Observations:**    
- Fig.1 and fig.2 show a large disparity between the numbers of occurrences of the traffic signs.   
Additional data should be created in order to rebalance the under-represented classes.
- Fig.3 shows that the classes are stacked one after the other in the data set.   
The data set should be shuffled before training the model.
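
Both observations can be checked and acted on in a few lines. This is a minimal sketch on toy labels, not the notebook code; the helper names `class_counts` and `shuffle_together` are hypothetical.

```python
import numpy as np

def class_counts(y, n_classes):
    """Number of samples per class id (classes with no sample count as 0)."""
    return np.bincount(y, minlength=n_classes)

def shuffle_together(X, y, seed=0):
    """Apply one random permutation to images and labels so pairs stay aligned."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(y))
    return X[perm], y[perm]

# Toy labels stored class after class, as observed in fig.3.
y = np.repeat(np.arange(4), [50, 5, 20, 2])
X = np.arange(len(y)).reshape(-1, 1)  # stand-in "images": one id per sample

counts = class_counts(y, 4)
imbalance = counts.max() / counts.min()  # ratio between largest and smallest class
X_s, y_s = shuffle_together(X, y)
print(counts, imbalance)
```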

##### 1.2.2. Visualize the data set
*The code for this step is contained in the code cell [9] of the IPython notebook.*

In this part, we have four visualizations of the data set:   
- fig.4: 43 random images, one per class and with their black-box
- fig.5: 5 x 10, 5 random traffic signs, 10 images each 
- fig.6: a sprite image showing all the traffic signs in a single image
- fig.7: a sampling of challenging images to classify

> *fig.4: 43 random images, one per class and with their black-box*
![fig.4](https://raw.githubusercontent.com/chatmoon/Traffic_Sign_Classifier/master/images/_dataVisu_4_show43TS.png)
> *fig.5: 5 x 10, 5 random traffic signs, 10 images each*
![fig.5](https://raw.githubusercontent.com/chatmoon/Traffic_Sign_Classifier/master/images/_dataVisu_5_show5TS.png)
> *fig.6: a sprite image showing all the traffic signs in a single image*
![fig.6](https://raw.githubusercontent.com/chatmoon/Traffic_Sign_Classifier/master/images/_sp_xx_5984x5984.png)   
> *fig.7: a sampling of challenging images to classify*
![fig.7](https://raw.githubusercontent.com/chatmoon/Traffic_Sign_Classifier/master/images/_dataExplo_4_challenges.PNG)

**Observations:**   
Fig.4 to fig.7 show that the images vary widely in quality. Many images are shaky, too dark or not at the same scale. The variabilities include viewpoint variations, lighting conditions (saturation, low contrast), motion blur, occlusions, sun glare, physical damage and color fading. Classifying the traffic signs can be challenging and complex in this context. We should preprocess the images to reduce the impact of their mixed quality.

**Note:** the sprite image will be useful for the embedding visualization in TensorBoard.
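
A sprite of this kind can be built by tiling the images row by row into a square grid. The sketch below is an assumption about one way to do it, not the notebook code; `make_sprite` is a hypothetical helper. With 34799 images of 32x32 pixels, this layout gives a 187 x 187 grid, i.e. 5984 x 5984 pixels, which appears consistent with the size in the fig.6 file name.

```python
import math
import numpy as np

def make_sprite(images):
    """Tile N images of shape (H, W, C) into one square grid, row by row.

    Unused cells at the end of the grid are left black (zeros).
    """
    n, h, w, c = images.shape
    grid = math.ceil(math.sqrt(n))
    padded = np.zeros((grid * grid, h, w, c), dtype=images.dtype)
    padded[:n] = images
    # (grid, grid, h, w, c) -> (grid, h, grid, w, c) -> (grid*h, grid*w, c)
    tiled = padded.reshape(grid, grid, h, w, c).transpose(0, 2, 1, 3, 4)
    return tiled.reshape(grid * h, grid * w, c)

# Five tiny 2x2 images, image i filled with the value i + 1.
images = np.stack([np.full((2, 2, 1), i + 1, dtype=np.uint8) for i in range(5)])
sprite = make_sprite(images)
print(sprite.shape)
```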

---
### Step 2: Design and Test a Model Architecture

#### 2.1. Preprocess the data set
*The code for this step is contained in the code cell **[XX]** of the IPython notebook.* 

> **Definition:** Pre-processing refers to techniques such as converting to grayscale, normalization, etc.

In this part, I describe how I preprocessed the image data, what techniques were chosen and why I chose these techniques. You will also find below an overview of the preprocessing workflow in figure *fig.8* with images showing the output of each preprocessing technique.

In summary, the preprocessing workflow generates five different types of images: 0RGB, 1GRAY, 2SHP, 3HST, 4CLAHE.

> *fig.8: the preprocessing workflow*
![fig.8](https://raw.githubusercontent.com/chatmoon/Traffic_Sign_Classifier/master/images/_dataPPro%20flow_170610-1221.png)


First of all, **all preprocessed images are centered and normalized**. During training, the network multiplies the inputs by the weights and adds the biases to produce activations, then backpropagates the gradients to update the model. We do not want the gradients to get out of control in this process, so all preprocessed images are centered around zero by subtracting the mean and normalized by dividing by the standard deviation. This technique does not change the content of the image. It keeps the weights and biases from becoming too large or too small, and it mitigates the numerical stability issue that arises when small values are added to large ones (which introduces significant rounding errors) during the optimization of the loss function.
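
As a minimal sketch of this centering and normalization step: the writeup does not say whether the statistics are computed per image or over the whole set, so the example below assumes a single mean and standard deviation computed on the training set and reused for the other sets. The name `standardize` and the `eps` guard are illustrative choices.

```python
import numpy as np

def standardize(images, mean=None, std=None, eps=1e-8):
    """Center on zero and scale to unit variance.

    If mean/std are not given they are computed from `images`
    (assumption: one global value each, taken from the training set).
    """
    images = images.astype(np.float32)
    if mean is None:
        mean = images.mean()
    if std is None:
        std = images.std()
    return (images - mean) / (std + eps), mean, std

rng = np.random.default_rng(0)
X_train = rng.integers(0, 256, size=(20, 32, 32, 1), dtype=np.uint8)
X_valid = rng.integers(0, 256, size=(5, 32, 32, 1), dtype=np.uint8)

X_train_n, mean, std = standardize(X_train)
X_valid_n, _, _ = standardize(X_valid, mean, std)  # reuse training statistics
print(float(X_train_n.mean()), float(X_train_n.std()))
```

Reusing the training statistics keeps the validation and test images on the same scale as the data the network was trained on.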

The two starting points of this approach are:
- the following comment: "the ConvNet was trained with full supervision on the color images of the GTSRB dataset and reached 98.97% accuracy on the phase 1 test set. After the end of phase 1, additional experiments with grayscale images established a new record accuracy of 99.17%", Traffic Sign Recognition with Multi-Scale Convolutional Networks, Pierre Sermanet and Yann LeCun
- the observation of the training set samples, which shows variabilities such as color fading, lighting conditions (saturation, low contrast), sun glare and motion blur. To overcome part of these variabilities and make the classification easier, I converted the RGB images to grayscale, sharpened the grayscale images, equalized the histograms of the sharpened images, and then applied contrast-limited adaptive histogram equalization (CLAHE) to the equalized images

The following functions from OpenCV have been used:
- 1GRAY: [cvtColor](http://docs.opencv.org/2.4/modules/imgproc/doc/miscellaneous_transformations.html#cvtcolor)   
- 2SHP: [filter2D](http://docs.opencv.org/2.4/modules/imgproc/doc/filtering.html#filter2d)   
- 3HST: [equalizeHist](http://docs.opencv.org/2.4/modules/imgproc/doc/histograms.html#equalizehist) to improve the contrast   
- 4CLAHE: [createCLAHE](http://docs.opencv.org/3.1.0/d5/daf/tutorial_py_histogram_equalization.html)   
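
To make two of these steps concrete, here is a NumPy-only sketch of what grayscale conversion and global histogram equalization compute. It is an illustration and not the notebook code, which calls the OpenCV functions listed above. The sharpening kernel and the CLAHE parameters are not given in this writeup, so those steps are left out; the function names `to_gray` and `equalize_hist` are hypothetical.

```python
import numpy as np

def to_gray(rgb):
    """Luma-weighted grayscale (0.299 R + 0.587 G + 0.114 B), returned as uint8."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

def equalize_hist(gray):
    """Global histogram equalization of a uint8 image via its cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]   # cumulative count at the darkest value present
    total = cdf[-1]
    if total == cdf_min:        # constant image: nothing to stretch
        return gray.copy()
    lut = np.rint((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]            # remap every pixel through the lookup table

# Low-contrast stand-in image: all values between 100 and 120.
rng = np.random.default_rng(0)
rgb = rng.integers(100, 121, size=(32, 32, 3), dtype=np.uint8)
gray = to_gray(rgb)
eq = equalize_hist(gray)
print(gray.min(), gray.max(), eq.min(), eq.max())
```

After equalization the darkest value present maps to 0 and the brightest to 255, which is how this step improves the contrast of dull images.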

**Note:**
- In addition, I also tried blurring the images, but I got better accuracy without this last technique   
- other realistic perturbations would probably also increase the robustness of the model, such as other affine transformations, brightness changes, blur or artificial occlusions. They could be implemented in a future sprint

#### 2.2. Generate additional jittered data
*The code for this step is contained in the code cell **[XX]** of the IPython notebook.* 

In this part, I describe how and why I generated additional data, what techniques were chosen and why I chose these techniques. You will also find below a workflow of the generation of additional data in figure fig.9 with a visualization of the jittered images and the new histograms of the number of occurrences per class.

I decided to generate additional data for two reasons:
- first, fig.2 shows that the data set is imbalanced, i.e. the classes are not represented equally. In this case, the accuracy may look excellent on paper while only reflecting the underlying class distribution. For example, if 90% of the instances belonged to Class-3 (Speed limit 50km/h), a model could reach 90% accuracy simply by always predicting “Class-3”
- secondly, the amount of data might not be sufficient for the model to generalise well in production with new data   
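
The first point can be illustrated with a toy label set: a "model" that always predicts the most frequent class gets a high accuracy while completely missing every other class. The label distribution and the helper names below are made up for the illustration.

```python
import numpy as np

def majority_baseline_accuracy(y):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = np.bincount(y)
    majority = int(counts.argmax())
    predictions = np.full_like(y, majority)
    return majority, float((predictions == y).mean())

def per_class_recall(y_true, y_pred, n_classes):
    """Fraction of each class that is correctly predicted (NaN if class absent)."""
    recall = np.full(n_classes, np.nan)
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():
            recall[c] = (y_pred[mask] == c).mean()
    return recall

# Hypothetical imbalanced labels: 90% of the samples belong to class 3.
y = np.array([3] * 90 + [0] * 5 + [1] * 5)
majority, acc = majority_baseline_accuracy(y)
recall = per_class_recall(y, np.full_like(y, majority), 4)
print(majority, acc, recall)
```

Looking at the per-class recall rather than the overall accuracy exposes the problem: it is 1.0 for the majority class and 0.0 for the others.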


To add more data to the data set, I combined the following two techniques:
- images are randomly picked and perturbed in position ([-2,2] pixels)
- then they are perturbed in rotation ([-15,+15] degrees)   
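
The two perturbations can be sketched with NumPy alone, using inverse mapping and nearest-neighbour sampling. This is an illustrative stand-in and not the notebook code: the function names `warp` and `jitter`, the zero fill for pixels that fall outside the image, and the interpolation method are all assumptions.

```python
import numpy as np

def warp(image, dx, dy, theta):
    """Shift by (dx, dy) pixels, then rotate by theta radians about the centre.

    Each output pixel looks up its source pixel (inverse mapping) with
    nearest-neighbour sampling; pixels with no source are left at zero.
    """
    h, w = image.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    x0, y0 = xs - cx, ys - cy
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # Undo the rotation, then undo the shift.
    src_x = np.rint(cos_t * x0 + sin_t * y0 + cx - dx).astype(int)
    src_y = np.rint(-sin_t * x0 + cos_t * y0 + cy - dy).astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(image)
    out[inside] = image[src_y[inside], src_x[inside]]
    return out

def jitter(image, rng, max_shift=2, max_angle=15.0):
    """Random shift in [-max_shift, max_shift] px, then rotation in [-max_angle, max_angle] deg."""
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    theta = np.deg2rad(rng.uniform(-max_angle, max_angle))
    return warp(image, dx, dy, theta)

rng = np.random.default_rng(0)
image = rng.integers(1, 256, size=(32, 32, 3), dtype=np.uint8)
jittered = jitter(image, rng)
print(jittered.shape, jittered.dtype)
```

A library routine with bilinear interpolation would give smoother results; nearest-neighbour sampling is used here only to keep the sketch dependency-free.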


**Notes:**
- in addition, I also tried using the bounding box to crop the images and then perturbing them in scale ([.9,1.1] ratio). I got an accuracy of around 93%, far below the human performance of 98.81%. I removed this step and got a better result in the end      
- the training set in fig.6 appears to be a stack of several series of 30 similar images **with usually increasing scale**. For that reason I did not implement the simple perturbation in scale
- there are 21 traffic signs that have a horizontal or a vertical axis of symmetry. Consequently, they are invariant to horizontal or vertical flipping. This technique could be implemented in a future sprint to add more data to the data set
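
The flipping idea from the last note could look like the sketch below. It is not implemented in the notebook; the helper `add_flipped` is hypothetical, and the class ids passed to it are placeholders, since the writeup does not list which classes are symmetric.

```python
import numpy as np

def add_flipped(X, y, lr_flippable=(), ud_flippable=()):
    """Append mirrored copies of images whose class is unchanged by the flip.

    X has shape (N, H, W, C). lr_flippable: class ids that look the same
    mirrored left-right; ud_flippable: class ids that look the same
    mirrored up-down. Both lists are supplied by the caller.
    """
    extra_X, extra_y = [X], [y]
    for classes, axis in ((lr_flippable, 2), (ud_flippable, 1)):
        mask = np.isin(y, list(classes))
        if mask.any():
            extra_X.append(np.flip(X[mask], axis=axis))
            extra_y.append(y[mask])
    return np.concatenate(extra_X), np.concatenate(extra_y)

# Three tiny stand-in images with placeholder class ids 0, 1 and 2.
X = np.arange(3 * 2 * 2 * 1, dtype=np.uint8).reshape(3, 2, 2, 1)
y = np.array([0, 1, 2])
X_aug, y_aug = add_flipped(X, y, lr_flippable=[0], ud_flippable=[2])
print(X_aug.shape, y_aug)
```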

I did not know how much data would have to be added to improve the accuracy or to overcome the overfitting problem. So I created several sets of augmented data: 500, 1000, 1500, 2000, 2500 and 3000 for each preprocessed type of image (see fig.9).

Here is an example of an original image and an augmented image:



# ANNEX

[writeup_template.md](https://github.com/udacity/CarND-Traffic-Sign-Classifier-Project/blob/master/writeup_template.md)