# **Traffic Sign Recognition** 

###  Udacity Student: Ren Silva

## Writeup


---

**Build a Traffic Sign Recognition Project**

The goals / steps of this project are the following:
* Load the data set (see below for links to the project data set)
* Explore, summarize and visualize the data set
* Design, train and test a model architecture
* Use the model to make predictions on new images
* Analyze the softmax probabilities of the new images
* Summarize the results with a written report


[//]: # (Image References)

[image1]: ./examples/visualization.jpg "Visualization"

## Rubric Points
### Here I will consider the [rubric points](https://review.udacity.com/#!/rubrics/481/view) individually and describe how I addressed each point in my implementation.  



---
### Writeup / README

#### 1. Provide a Writeup / README that includes all the rubric points and how you addressed each one. You can submit your writeup as markdown or pdf. You can use this template as a guide for writing the report. The submission includes the project code.

You're reading it! Here is a link to my [project code](https://github.com/udacity/CarND-Traffic-Sign-Classifier-Project/blob/master/Traffic_Sign_Classifier.ipynb).

### Data Set Summary & Exploration

#### 1. Provide a basic summary of the data set. In the code, the analysis should be done using python, numpy and/or pandas methods rather than hardcoding results manually.

I used the pandas library to calculate summary statistics of the traffic
signs data set:

* The size of the training set is **34799**
* The size of the validation set is **4410**
* The size of the test set is **12630**
* The shape of a traffic sign image is **(32, 32, 3)**
* The number of unique classes/labels in the data set is **43**
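
These figures can be derived from the loaded arrays rather than typed in by hand. Below is a minimal sketch, assuming the data is held in NumPy arrays. The `summarize` helper and the tiny random arrays are hypothetical stand-ins, not the notebook's actual code. NumPy is used here; pandas' `Series.nunique` would work equally well for the class count.

```python
import numpy as np

def summarize(X_train, y_train, X_valid, X_test):
    """Return basic data set statistics computed from the arrays themselves."""
    return {
        "n_train": X_train.shape[0],
        "n_valid": X_valid.shape[0],
        "n_test": X_test.shape[0],
        "image_shape": X_train.shape[1:],
        "n_classes": np.unique(y_train).size,
    }

# Tiny synthetic stand-in for the real data (hypothetical sizes).
rng = np.random.default_rng(0)
X_train = rng.integers(0, 256, size=(20, 32, 32, 3), dtype=np.uint8)
y_train = np.arange(20) % 4          # four classes, five samples each
X_valid = X_train[:5]
X_test = X_train[:8]

summary = summarize(X_train, y_train, X_valid, X_test)
print(summary)
```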


#### 2. Include an exploratory visualization of the dataset.

Here is an exploratory visualization of the data set. It is a bar chart showing how the data ...

[//]: # (Image References)

[image2]: ./Traffic_Sign_Classifier/output_20_0.png "Grayscaling"
[image3]: ./Traffic_Sign_Classifier/output_21_0.png "Random Noise"

![png](./Traffic_Sign_Classifier/output_15_1.png)
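
A chart of this kind is typically built from per-class sample counts. The sketch below shows one way to compute them, assuming integer labels in the range `0 .. n_classes-1`. The `samples_per_class` helper and the toy labels are hypothetical.

```python
import numpy as np

def samples_per_class(labels, n_classes):
    """Count how many samples carry each class label (0 .. n_classes-1)."""
    return np.bincount(labels, minlength=n_classes)

labels = np.array([0, 0, 0, 1, 2, 2])    # hypothetical labels
counts = samples_per_class(labels, n_classes=4)

# Ratio between the largest class and the smallest non-empty class.
imbalance = counts.max() / counts[counts > 0].min()
print(counts, imbalance)
```

The resulting `counts` array can be passed straight to a bar plot, for example `plt.bar(range(len(counts)), counts)` with matplotlib.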


### Design and Test a Model Architecture

#### 1. Describe how you preprocessed the image data. What techniques were chosen and why did you choose these techniques? Consider including images showing the output of each preprocessing technique. Pre-processing refers to techniques such as converting to grayscale, normalization, etc. (OPTIONAL: As described in the "Stand Out Suggestions" part of the rubric, if you generated additional data for training, describe why you decided to generate additional data, how you generated the data, and provide example images of the additional data. Then describe the characteristics of the augmented training set like number of images in the set, number of images for each class, etc.)

As a first step, I decided to convert the images to grayscale because, in my judgement, colour is not essential for this type of classification and would add input complexity without much benefit. A Stop sign has the same shape and lettering whether it is red or grayscale.

Here is an example of a traffic sign image before and after grayscaling.

![alt text][image2]

As a last step, I normalised the image data (by subtracting the mean and dividing by the standard deviation). This helps the model converge faster. It would also help if we had more than one feature with different scales (although this was not the case here). In practice, I have found that models fed with normalised data produce much better results, faster.
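
The two preprocessing steps can be sketched as follows. This is an illustration under assumptions, not the notebook's code: the luminance weights (0.299, 0.587, 0.114) are one common choice for grayscale conversion, and the helper names `to_grayscale` and `normalise` are hypothetical.

```python
import numpy as np

def to_grayscale(images):
    """Convert (N, H, W, 3) RGB images to (N, H, W, 1) using luminance weights."""
    weights = np.array([0.299, 0.587, 0.114])   # assumed weighting
    gray = images.astype(np.float32) @ weights
    return gray[..., np.newaxis]

def normalise(images, mean=None, std=None):
    """Subtract the mean and divide by the standard deviation."""
    mean = images.mean() if mean is None else mean
    std = images.std() if std is None else std
    return (images - mean) / std, mean, std

rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(4, 32, 32, 3))   # synthetic stand-in images
gray = to_grayscale(X)
norm, mean, std = normalise(gray)
print(gray.shape, round(float(norm.mean()), 6), round(float(norm.std()), 6))
```

One design option is to compute `mean` and `std` on the training set only and pass them in when normalising the validation and test sets, so that all three sets share the same scaling.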

I made a few attempts to generate additional data because I found a big imbalance in the number of samples per class, as shown below:

![alt text][image1]

To add more data to the data set, I used the following techniques:

* random zoom/translation - I wrote a custom function for this, which gave me more control over the amount of translation and zoom

* random noise (from scikit learn)

* random rotation (from scikit learn)

Whilst translation was always applied, noise and rotation would only be applied if a random coin-toss function (also custom) returned positive.
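
The "always translate, sometimes add noise" logic can be sketched as below. This is a NumPy-only illustration under assumptions: the function names, the shift range and the noise level are hypothetical, images are assumed to be `(H, W, C)` arrays with values in `[0, 1]`, and zoom and rotation are left out for brevity.

```python
import numpy as np

rng = np.random.default_rng(42)

def coin_toss(p=0.5):
    """Return True with probability p."""
    return rng.random() < p

def random_translate(image, max_shift=2):
    """Shift an (H, W, C) image by up to max_shift pixels, padding with edge values."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    pad = ((max_shift, max_shift), (max_shift, max_shift), (0, 0))
    padded = np.pad(image, pad, mode="edge")
    h, w = image.shape[:2]
    top, left = max_shift + dy, max_shift + dx
    return padded[top:top + h, left:left + w]

def add_noise(image, sigma=0.05):
    """Add Gaussian noise and clip back to the [0, 1] range."""
    return np.clip(image + rng.normal(0.0, sigma, image.shape), 0.0, 1.0)

def augment(image):
    """Always translate; add noise only when the coin toss succeeds."""
    out = random_translate(image)
    if coin_toss():
        out = add_noise(out)
    return out

img = rng.random((32, 32, 1))   # synthetic stand-in image
out = augment(img)
print(out.shape)
```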

Here is an example of an original image and three augmentation techniques:

![alt text][image3]

The difference between the original data set and the augmented data set is the following:

* The augmented data set was better balanced, as shown below (I limited the amount of augmented data to a maximum of twice the number of original pictures):

![png](./Traffic_Sign_Classifier/output_23_1.png)
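
One way to read the "maximum of twice the number of original pictures" cap is per class: each class is topped up towards the size of the largest class, but never receives more than twice its original count in augmented samples. The sketch below follows that interpretation, which is an assumption. The `augmentation_budget` helper and the example counts are hypothetical.

```python
def augmentation_budget(counts, max_factor=2):
    """For each class, return how many augmented samples to generate.

    Classes are topped up towards the largest class, capped at
    max_factor times their original size.
    """
    target = max(counts)
    return [min(target - n, max_factor * n) for n in counts]

counts = [2000, 900, 200]                 # hypothetical samples per class
extra = augmentation_budget(counts)
balanced = [n + e for n, e in zip(counts, extra)]
print(extra, balanced)
```

Under this rule the smallest classes still end up below the largest one, which matches a "better balanced" rather than a perfectly balanced data set.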

**NB ==>** The data augmentation had an unintended effect: validation accuracy was much higher than training accuracy. This occurred many times, and the validation accuracy stayed below the 93% target, so I ended up abandoning the idea and ran the model for submission without data augmentation.




#### 2. Describe what your final model architecture looks like, including model type, layers, layer sizes, connectivity, etc. Consider including a diagram and/or table describing the final model.

My final model consisted of the following layers:

| Layer           | Description                                   |
|:---------------:|:---------------------------------------------:|
| Input           | 32x32x1 Grayscale image                       |
| Convolution 5x5 | 1x1 stride, valid padding, outputs 28x28x16   |
| RELU            | Activation                                    |
| Max pooling     | 2x2 stride, outputs 14x14x16                  |
| Convolution 5x5 | 1x1 stride, valid padding, outputs 10x10x32   |
| RELU            | Activation                                    |
| Max pooling     | 2x2 stride, outputs 5x5x32                    |
| Flatten         | input 5x5x32, outputs 800                     |
| Fully connected | input 800, outputs 1024                       |
| Fully connected | input 1024, outputs 256                       |
| Fully connected | input 256, outputs 128                        |
| Fully connected | input 128, outputs "no of classes" (43)       |
| Softmax         | Activation                                    |
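
The spatial sizes in the table follow from the standard output-size formula for unpadded ("valid") convolutions and pooling, `(size - window) // stride + 1`. A short check of that arithmetic (the helper names are hypothetical):

```python
def conv_valid(size, kernel, stride=1):
    """Output width/height of a convolution with valid padding."""
    return (size - kernel) // stride + 1

def pool(size, window=2, stride=2):
    """Output width/height of a pooling layer without padding."""
    return (size - window) // stride + 1

size = 32
size = conv_valid(size, 5)   # first 5x5 convolution
after_conv1 = size
size = pool(size)            # first 2x2 max pooling
after_pool1 = size
size = conv_valid(size, 5)   # second 5x5 convolution
after_conv2 = size
size = pool(size)            # second 2x2 max pooling
flat = size * size * 32      # length of the flattened vector (32 feature maps)
print(after_conv1, after_pool1, after_conv2, size, flat)
```

The flattened vector therefore has 5 * 5 * 32 = 800 elements, which is the input size of the first fully connected layer.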




#### 3. Describe how you trained your model. The discussion can include the type of optimizer, the batch size, number of epochs and any hyperparameters such as learning rate.

To train the model, I used an AdamOptimizer with a learning rate of 0.003 (higher than the optimizer's default of 0.001, I know) and a batch size of 128 (I experimented with larger ones but settled on that). I started with 10 epochs and gradually increased them up to 40. I also added an early stop that triggers once the model reaches the target validation accuracy of 0.931 or higher, and saved the model every time the validation accuracy increased.
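
The checkpoint-on-improvement and early-stop logic can be sketched independently of TensorFlow. In the sketch below, `run_epoch` and `save` are hypothetical callbacks standing in for "train one epoch and return validation accuracy" and "save the model", and the accuracy values are made up for illustration.

```python
def train_with_early_stop(run_epoch, save, max_epochs=40, target=0.931):
    """Train until max_epochs or until validation accuracy reaches target.

    The model is saved every time validation accuracy improves.
    """
    best = 0.0
    for epoch in range(1, max_epochs + 1):
        acc = run_epoch(epoch)
        if acc > best:
            best = acc
            save(epoch, acc)
        if acc >= target:
            break
    return best, epoch

# Made-up validation accuracies, one per epoch, for demonstration only.
fake_acc = [0.80, 0.88, 0.86, 0.92, 0.935, 0.95]
saved = []
best, last_epoch = train_with_early_stop(
    run_epoch=lambda e: fake_acc[e - 1],
    save=lambda e, a: saved.append((e, a)),
)
print(best, last_epoch, saved)
```

Saving only on improvement means the checkpoint on disk is always the best model seen so far, even if accuracy dips in later epochs.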

#### 4. Describe the approach taken for finding a solution and getting the validation set accuracy to be at least 0.93. Include in the discussion the results on the training, validation and test sets and where in the code these were calculated. Your approach may have been an iterative process, in which case, outline the steps you took to get to the final solution and why you chose those steps. Perhaps your solution involved an already well known implementation or architecture. In this case, discuss why you think the architecture is suitable for the current problem.

My final model results were:
* training set accuracy of **0.722**
* validation set accuracy of **0.937**
* test set accuracy of **0.906**

If an iterative approach was chosen:
* What was the first architecture that was tried and why was it chosen
The LeNet architecture from the previous lab was the starting point. It represents a typical convolutional network architecture: a few convolution/max-pooling layer pairs, followed by a flatten step (the transition to fully connected layers), then a series of fully connected (also called dense) layers, and eventually a softmax activation on a dense layer that has one output per target class.

* What were some problems with the initial architecture?
It seemed to work OK, but both training and validation accuracy were low. It turned out I was using random_normal initialisation for the weights instead of the better-suited truncated_normal initialisation.

As I adjusted the initialisation and the normalisation, the training accuracy improved, but not the validation accuracy, so the model was overfitting.

* How was the architecture adjusted and why was it adjusted? Typical adjustments could include choosing a different model architecture, adding or taking away layers (pooling, dropout, convolution, etc), using an activation function or changing the activation function. One common justification for adjusting an architecture would be due to overfitting or underfitting. A high accuracy on the training set but low accuracy on the validation set indicates over fitting; a low accuracy on both sets indicates under fitting.

At that point I was ready to tackle overfitting, so I added dropout after the dense layers (I already had max pooling after each convolutional layer). This brought the validation accuracy closer to the training accuracy, but still short of the 93% target.

I then looked at data augmentation and got an interesting result: training accuracy was much lower than validation accuracy, but validation accuracy still did not reach 93%.

I stopped worrying about data augmentation and went back to my network. I was now working with more convolutional layers and more dense layers. I also added dropout after the dense layers as a means to reduce overfitting.

* Which parameters were tuned? How were they adjusted and why?

The number and types of layers were adjusted several times. In addition, I changed:

- batch size: I tried values such as 128, 256, etc.
- number of epochs: I started with 10 and gradually went up to 200. I noticed that the model would briefly reach 93% validation accuracy and then fall back to a lower value (so I could not conclude the assignment). Eventually I changed the training function so that it would save the model with the highest validation accuracy, and stop once it achieved a target validation accuracy (which I set at 93.1%)

* What are some of the important design choices and why were they chosen? For example, why might a convolution layer work well with this problem? How might a dropout layer help with creating a successful model?

Convolutional layers are critical for image problems: they are how we identify patterns regardless of their position in the image. A pooling layer reduces the dimensionality and acts as a regulariser between layers.
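
The "regardless of position" property can be illustrated with a toy example: the same kernel slid over two images that contain the same pattern at different locations gives the same peak response, only at a different place. This is a NumPy-only sketch with a hypothetical 2x2 kernel, not the network's learned filters.

```python
import numpy as np

def correlate_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

kernel = np.array([[1.0, -1.0], [-1.0, 1.0]])   # toy pattern detector

def image_with_pattern(row, col, size=9):
    """Blank image with the kernel's own pattern placed at (row, col)."""
    img = np.zeros((size, size))
    img[row:row + 2, col:col + 2] = kernel
    return img

resp_a = correlate_valid(image_with_pattern(1, 1), kernel)
resp_b = correlate_valid(image_with_pattern(5, 4), kernel)
pos_a = np.unravel_index(resp_a.argmax(), resp_a.shape)
pos_b = np.unravel_index(resp_b.argmax(), resp_b.shape)
pooled = max_pool_2x2(resp_a)
print(resp_a.max(), pos_a, resp_b.max(), pos_b, pooled.shape)
```

Pooling halves each spatial dimension of the response map while keeping the peak value, which is the dimensionality reduction mentioned above.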


 

### Test a Model on New Images

#### 1. Choose five German traffic signs found on the web and provide them in the report. For each image, discuss what quality or qualities might be difficult to classify.


Here are 5 German traffic signs that I found on the web:



![png](./Traffic_Sign_Classifier/output_45_0.png)




#### 2. Discuss the model's predictions on these new traffic signs and compare the results to predicting on the test set. At a minimum, discuss what the predictions were, the accuracy on these new predictions, and compare the accuracy to the accuracy on the test set (OPTIONAL: Discuss the results in more detail as described in the "Stand Out Suggestions" part of the rubric).

Here are the results of the prediction:


<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Image</th>
      <th>Prediction</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>Speed limit (30km/h)</td>
      <td>Speed limit (30km/h)</td>
    </tr>
    <tr>
      <th>1</th>
      <td>Speed limit (70km/h)</td>
      <td>Speed limit (70km/h)</td>
    </tr>
    <tr>
      <th>2</th>
      <td>Road work</td>
      <td>Road work</td>
    </tr>
    <tr>
      <th>3</th>
      <td>Beware of ice/snow</td>
      <td>Beware of ice/snow</td>
    </tr>
    <tr>
      <th>4</th>
      <td>Wild animals crossing</td>
      <td>Wild animals crossing</td>
    </tr>
  </tbody>
</table>
</div>



The model correctly classified all 5 of the 5 traffic signs, which gives an accuracy of 100%. This compares favorably to the test set accuracy of 90.6%, although five images are far too small a sample to draw firm conclusions from.
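
Accuracy on the new images is simply the fraction of rows in the table above where the prediction matches the actual sign. A short check using the rows of that table:

```python
# Rows copied from the results table: (actual sign, predicted sign).
rows = [
    ("Speed limit (30km/h)", "Speed limit (30km/h)"),
    ("Speed limit (70km/h)", "Speed limit (70km/h)"),
    ("Road work", "Road work"),
    ("Beware of ice/snow", "Beware of ice/snow"),
    ("Wild animals crossing", "Wild animals crossing"),
]

correct = sum(actual == predicted for actual, predicted in rows)
accuracy = correct / len(rows)
print(f"{correct} of {len(rows)} correct, accuracy {accuracy:.0%}")
```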



#### 3. Describe how certain the model is when predicting on each of the five new images by looking at the softmax probabilities for each prediction. Provide the top 5 softmax probabilities for each image along with the sign type of each probability. (OPTIONAL: as described in the "Stand Out Suggestions" part of the rubric, visualizations can also be provided such as bar charts)

The code for making predictions on my final model (extracted from 3 cells in the iPython notebook) is shown below:


---

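```python
import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf  # TensorFlow 1.x API (tf.Session)
from IPython.display import display

# `saver`, `logits`, `x` and `sign_text` are defined in earlier notebook cells.

def predict(X, k=1):
    """Return the top-k softmax probabilities and class indices for each image in X."""
    # Build the ops once, then evaluate them in a single session run
    top_k = tf.nn.top_k(tf.nn.softmax(logits), k=k)

    with tf.Session() as sess:
        # Restore the weights from the most recent checkpoint
        saver.restore(sess, tf.train.latest_checkpoint('.'))

        # Run the prediction: result[0] holds probabilities, result[1] class indices
        result = sess.run(top_k, feed_dict={x: X})

    return result

result = predict(X_test_images, 5)

for p, i, y, img in zip(result[0], result[1], y_test_labels, X_untouched_images):

    # Actual label id and its description
    print('\n\n\n', y, sign_text(y))

    # Show the original (unprocessed) image
    plt.figure(figsize=(3, 3))
    plt.imshow(img)
    plt.show()

    # Table of the top-5 probabilities and the corresponding sign names
    show_list = [[prob, sign_text(pred)] for prob, pred in zip(p, i)]
    df_show = pd.DataFrame(show_list, columns=["Probability", "Prediction"])
    display(df_show.head())
```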
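
To make the two operations in `predict` concrete, here is a NumPy-only sketch of softmax followed by top-k selection. It is an illustration of the idea, not the TensorFlow implementation. The `softmax` and `top_k` helpers and the example logits are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum keeps exp() stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def top_k(probs, k):
    """Return the k largest probabilities per row and their class indices."""
    idx = np.argsort(-probs, axis=1)[:, :k]
    return np.take_along_axis(probs, idx, axis=1), idx

# Made-up logits for two images and four classes.
logits = np.array([[2.0, 1.0, 0.1, -1.0],
                   [0.0, 3.0, 1.0, -2.0]])
probs = softmax(logits)
values, indices = top_k(probs, k=2)
print(indices)
```

Each row of `values` is sorted from most to least likely, in the same order as the tables that follow.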

---

For the first image, the model is very sure that this is a **Speed Limit of 30 km/h** sign (probability of 1.0), and the image does contain that. The top five softmax probabilities were:

    
    
    



![png](./Traffic_Sign_Classifier/output_56_1.png)



<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Probability</th>
      <th>Prediction</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1.000000e+00</td>
      <td>Speed limit (30km/h)</td>
    </tr>
    <tr>
      <th>1</th>
      <td>3.615916e-11</td>
      <td>Speed limit (20km/h)</td>
    </tr>
    <tr>
      <th>2</th>
      <td>8.952134e-20</td>
      <td>Speed limit (50km/h)</td>
    </tr>
    <tr>
      <th>3</th>
      <td>2.404318e-22</td>
      <td>Speed limit (80km/h)</td>
    </tr>
    <tr>
      <th>4</th>
      <td>1.039073e-24</td>
      <td>Speed limit (60km/h)</td>
    </tr>
  </tbody>
</table>
</div>


    
    
For the second image, the model is very sure that this is a **Speed Limit of 70 km/h** sign (probability of 1.0), and the image does contain that. The top five softmax probabilities were:



![png](./Traffic_Sign_Classifier/output_56_4.png)



<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Probability</th>
      <th>Prediction</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1.000000e+00</td>
      <td>Speed limit (70km/h)</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2.896572e-34</td>
      <td>Speed limit (30km/h)</td>
    </tr>
    <tr>
      <th>2</th>
      <td>0.000000e+00</td>
      <td>Speed limit (20km/h)</td>
    </tr>
    <tr>
      <th>3</th>
      <td>0.000000e+00</td>
      <td>Speed limit (50km/h)</td>
    </tr>
    <tr>
      <th>4</th>
      <td>0.000000e+00</td>
      <td>Speed limit (60km/h)</td>
    </tr>
  </tbody>
</table>
</div>


    
    
For the third image, the model is very sure that this is a **Road work** sign (probability of 1.0), and the image does contain that. The top five softmax probabilities were:



![png](./Traffic_Sign_Classifier/output_56_7.png)



<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Probability</th>
      <th>Prediction</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1.0</td>
      <td>Road work</td>
    </tr>
    <tr>
      <th>1</th>
      <td>0.0</td>
      <td>Speed limit (20km/h)</td>
    </tr>
    <tr>
      <th>2</th>
      <td>0.0</td>
      <td>Speed limit (30km/h)</td>
    </tr>
    <tr>
      <th>3</th>
      <td>0.0</td>
      <td>Speed limit (50km/h)</td>
    </tr>
    <tr>
      <th>4</th>
      <td>0.0</td>
      <td>Speed limit (60km/h)</td>
    </tr>
  </tbody>
</table>
</div>


    
    
     
For the fourth image, the model is very sure that this is a **Beware of ice/snow** sign (probability of 1.0), and the image does contain that. The top five softmax probabilities were:



![png](./Traffic_Sign_Classifier/output_56_10.png)



<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Probability</th>
      <th>Prediction</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1.000000e+00</td>
      <td>Beware of ice/snow</td>
    </tr>
    <tr>
      <th>1</th>
      <td>3.464607e-08</td>
      <td>Right-of-way at the next intersection</td>
    </tr>
    <tr>
      <th>2</th>
      <td>2.728659e-09</td>
      <td>Slippery road</td>
    </tr>
    <tr>
      <th>3</th>
      <td>2.239767e-11</td>
      <td>Dangerous curve to the right</td>
    </tr>
    <tr>
      <th>4</th>
      <td>9.215399e-13</td>
      <td>Road narrows on the right</td>
    </tr>
  </tbody>
</table>
</div>


    
    
    
For the fifth image, the model is fairly sure that this is a **Wild animals crossing** sign (probability of 0.806), and the image does contain that. The runner-up, Dangerous curve to the right, received a probability of 0.190. The top five softmax probabilities were:



![png](./Traffic_Sign_Classifier/output_56_13.png)



<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Probability</th>
      <th>Prediction</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>0.806312</td>
      <td>Wild animals crossing</td>
    </tr>
    <tr>
      <th>1</th>
      <td>0.190004</td>
      <td>Dangerous curve to the right</td>
    </tr>
    <tr>
      <th>2</th>
      <td>0.001888</td>
      <td>Dangerous curve to the left</td>
    </tr>
    <tr>
      <th>3</th>
      <td>0.000832</td>
      <td>Bicycles crossing</td>
    </tr>
    <tr>
      <th>4</th>
      <td>0.000478</td>
      <td>Slippery road</td>
    </tr>
  </tbody>
</table>
</div>



