# Traffic Sign Recognition Report

---

### Build a Traffic Sign Recognition Project

###### The goals / steps of this project are:

- Load the data set (see below for links to the project data set).

- Explore, summarize and visualize the data set.

- Design, train and test a model architecture.

- Use the model to make predictions on new images.

- Analyze the softmax probabilities of the new images.

- Summarize the results with a written report.


## 1) Dataset Summary:
Using NumPy I calculated the following:
- Number of training examples.
- Number of testing examples.
- Number of validation examples.
- Image data shape.
- Number of classes.
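
A minimal sketch of how these numbers could be computed, not the notebook's actual code. The arrays `X_train`, `X_valid`, `X_test` and `y_train` are hypothetical stand-ins filled with random data; the real project would load them from the data set files.

```python
import numpy as np

# Hypothetical stand-in data: random arrays shaped like 32x32 RGB images.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 256, size=(100, 32, 32, 3), dtype=np.uint8)
X_valid = rng.integers(0, 256, size=(20, 32, 32, 3), dtype=np.uint8)
X_test = rng.integers(0, 256, size=(30, 32, 32, 3), dtype=np.uint8)
y_train = rng.integers(0, 43, size=100)
y_train[:43] = np.arange(43)  # make sure every class appears at least once

n_train = X_train.shape[0]             # number of training examples
n_valid = X_valid.shape[0]             # number of validation examples
n_test = X_test.shape[0]               # number of testing examples
image_shape = X_train.shape[1:]        # (height, width, channels)
n_classes = np.unique(y_train).size    # number of distinct labels

print("Training examples:  ", n_train)
print("Validation examples:", n_valid)
print("Testing examples:   ", n_test)
print("Image data shape:   ", image_shape)
print("Number of classes:  ", n_classes)
```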


## 2) Exploratory Visualization:
To visualize the data:
- A random image is plotted.
- The label is printed.
- The type of the image is printed.
- The maximum and minimum pixel values of the image are printed.
- A chart of the number of images each class has is shown below.


 <img src="Report_pics/capture.JPG" width="361" alt="Combined Image" /> 
 <p style="text-align: center;"> A chart showing the number of pictures per class (above)</p> 
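
The inspection steps and the per-class counts behind the chart can be sketched as follows. This is an illustration on random stand-in data (`X` and `y` are hypothetical), and it stops at the counts; passing `class_counts` to a bar-chart function would produce a plot like the one above.

```python
import numpy as np

# Hypothetical stand-in data with 43 possible labels.
rng = np.random.default_rng(1)
X = rng.integers(0, 256, size=(200, 32, 32, 3), dtype=np.uint8)
y = rng.integers(0, 43, size=200)

# Pick one random image and report its label, type and pixel range.
idx = rng.integers(0, len(X))
image = X[idx]
print("label:", y[idx])
print("dtype:", image.dtype)
print("min / max pixel value:", image.min(), "/", image.max())

# Count how many images each class has; index i holds the count for class i.
class_counts = np.bincount(y, minlength=43)
print(class_counts)
```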

## 3) Preprocessing: 

I normalized the images: pixel values originally in the range [0, 255] were scaled to [0, 1]. Augmenting the data set so that each class has roughly the same number of images would likely help as well.

**Note:** I tried grayscaling, but the returned images were displayed with odd colors, so I removed it. I also spent days looking for other ways to preprocess the data, but after I increased the batch size and the number of epochs the accuracy reached 0.93, so I did not add them.
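
The normalization step amounts to a single division. A small sketch, where the function name `normalize` is only illustrative:

```python
import numpy as np

def normalize(images):
    """Scale uint8 pixel values from [0, 255] to floats in [0, 1]."""
    return images.astype(np.float32) / 255.0

# A tiny 1x1 "image" batch with the darkest, a middle and the brightest value.
batch = np.array([[[[0, 128, 255]]]], dtype=np.uint8)
scaled = normalize(batch)
print(scaled.min(), scaled.max())
```

Converting to a float type before dividing avoids integer arithmetic on the `uint8` data.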

## 4) Model Architecture:

I used a 5-layer LeNet architecture. I tried tuning the parameters and eventually kept the values mu = 0 and sigma = 0.1. The layers of the LeNet architecture are:
- **Layer 1:** Convolution with valid padding. Input = 32x32x3. Output = 28x28x6.
- ReLU activation.
- Max pooling. Input = 28x28x6. Output = 14x14x6.
- **Layer 2:** Convolution with valid padding. Input = 14x14x6. Output = 10x10x16.
- ReLU activation.
- Max pooling. Input = 10x10x16. Output = 5x5x16.
- Flatten. Input = 5x5x16. Output = 400.
- **Layer 3:** Fully Connected. Input = 400. Output = 120.
- ReLU activation.
- **Layer 4:** Fully Connected. Input = 120. Output = 84.
- ReLU activation.
- **Layer 5:** Fully Connected. Input = 84. Output = 43.
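
The layer sizes listed above can be checked with a little arithmetic. The sketch below assumes 5x5 convolution kernels with stride 1 and 2x2 pooling with stride 2; these values are not stated in this report, but they are the ones that reproduce the listed shapes (32 to 28 to 14 to 10 to 5). The helper names are illustrative.

```python
def conv_valid(size, kernel, stride=1):
    """Output width/height of a convolution with valid padding."""
    return (size - kernel) // stride + 1

def pool(size, window=2, stride=2):
    """Output width/height of a pooling layer without padding."""
    return (size - window) // stride + 1

# Spatial sizes through the two convolution/pooling stages (assumed 5x5 kernels).
c1 = conv_valid(32, 5)   # Layer 1 convolution
p1 = pool(c1)            # first max pooling
c2 = conv_valid(p1, 5)   # Layer 2 convolution
p2 = pool(c2)            # second max pooling
flat = p2 * p2 * 16      # flatten with 16 feature maps
print(c1, p1, c2, p2, flat)

# Trainable parameters per layer: weights plus one bias per output unit.
params = {
    "conv1": 5 * 5 * 3 * 6 + 6,
    "conv2": 5 * 5 * 6 * 16 + 16,
    "fc1": flat * 120 + 120,
    "fc2": 120 * 84 + 84,
    "fc3": 84 * 43 + 43,
}
total_params = sum(params.values())
print(params, total_params)
```

Under these assumptions most of the parameters sit in the first fully connected layer, which is typical for LeNet-style networks.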


## 5) Model Training: 

- The labels are one-hot encoded.
- I used the Adam optimizer with `optimizer.minimize`.
- Number of epochs = 35.
- Batch size = 256.
- Learning rate = 0.001.
- A plot of the loss and accuracy is below.

 <img src="Report_pics/capture2.png" width="361" alt="Combined Image" />
 <p style="text-align: center;"> Loss and accuracy (above)</p>
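
Two of the training steps, one-hot encoding and splitting the data into batches of 256, can be sketched in plain NumPy. This is an illustration rather than the training code itself: the data is random, the helper names are hypothetical, and shuffling once per epoch is an assumption.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Turn integer labels into one-hot rows of length n_classes."""
    return np.eye(n_classes, dtype=np.float32)[labels]

def iterate_batches(X, y, batch_size, rng):
    """Yield shuffled mini-batches; the last batch may be smaller."""
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

# Hypothetical stand-in data: 1000 normalized images, 43 classes.
rng = np.random.default_rng(2)
X = rng.random((1000, 32, 32, 3), dtype=np.float32)
labels = rng.integers(0, 43, size=1000)
y = one_hot(labels, 43)

batches = list(iterate_batches(X, y, batch_size=256, rng=rng))
print(len(batches), [len(xb) for xb, _ in batches])
```

One epoch corresponds to one pass over all batches; the report's setting of 35 epochs would repeat this loop 35 times.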

## 6) Solution Approach:

My final model results were:
- Training set accuracy of 0.99908.
- Validation set accuracy of 0.93878, with a peak of 0.94014.
- Test set accuracy of 0.92629.
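
For reference, accuracy here means the fraction of images whose predicted class matches the label. A minimal sketch on made-up logits (the function name and the numbers are illustrative only):

```python
import numpy as np

def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    predictions = np.argmax(logits, axis=1)
    return float(np.mean(predictions == labels))

# Four made-up examples with three classes; the last one is misclassified.
logits = np.array([
    [2.0, 0.1, 0.3],
    [0.2, 1.5, 0.1],
    [0.3, 0.2, 0.9],
    [1.2, 0.4, 0.1],
])
labels = np.array([0, 1, 2, 1])
acc = accuracy(logits, labels)
print(acc)
```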

I chose the LeNet architecture. Adrian Rosebrock (August 1, 2016) describes it as "straightforward and small, (in terms of memory footprint), making it perfect for teaching the basics of CNNs" and adds that "it can even run on the CPU (if your system does not have a suitable GPU), making it a great 'first CNN'." LeNet reached good accuracy after enough epochs, which let me get by with fewer preprocessing techniques. Data augmentation would likely have pushed the accuracy above 0.95. I almost added it but ran out of time, because researching and implementing many seemingly easy tasks took longer than expected, so I decided to do the bare minimum first and come back later to improve the model and attempt the optional challenges.

I adjusted my hyperparameters by trial and error. At first I used only 15 epochs to keep the processing time down, since I was training on my CPU. After tuning I used 35 epochs and got the required accuracy. Processing does take a long time, though; this would not be the case if more preprocessing were applied. The training, validation and test accuracies were satisfactory.

## 7) Acquiring New Images:

- The 20 images I found on the web were in PNG format with dimensions (800, 704), so I had to resize them.
- Each pixel had 4 channels (RGBA), so I had to slice off the last (alpha) channel.
- Luckily, the pixel values already ranged from 0.0 to 1.0.
- They shouldn't be difficult to classify, as they are raw sign images (i.e. not pictures taken from the street).
- Reading the images with `cv2.imread` changed the colors, because OpenCV returns the channels in BGR order; this did not happen with `mpimg.imread`.
- I had to manually color some gaps white, as they were read as black instead.

 <img src="Report_pics/all.png" width="361" alt="Combined Image" /> 
 <p style="text-align: center;"> The 20 images I found on the web (above)</p>
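
The preparation steps listed above can be sketched on a synthetic array. This is not the code used for the report: the (800, 704, 4) array layout is an assumption, nearest-neighbour sampling is a simple stand-in for whatever resize routine was actually used, and `bgr_to_rgb` only illustrates how a BGR image could be put back into RGB order.

```python
import numpy as np

def drop_alpha(image):
    """Keep the first three channels of an RGBA image."""
    return image[..., :3]

def bgr_to_rgb(image):
    """Reverse the channel order of a BGR image to get RGB."""
    return image[..., ::-1]

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize: pick one source pixel per output pixel."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols]

# Hypothetical RGBA image with float values in [0, 1).
rng = np.random.default_rng(3)
rgba = rng.random((800, 704, 4))

rgb = drop_alpha(rgba)
small = resize_nearest(rgb, 32, 32)   # match the 32x32x3 network input
print(rgba.shape, "->", rgb.shape, "->", small.shape)
```

Nearest-neighbour sampling discards most source pixels, which illustrates how fine details of a sign can disappear at 32x32; an averaging or interpolating resize would usually preserve more of them.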

## 8) Performance on New Images:

The accuracy on the images I found on the web was between 0.7 and 0.85. The model does better on the test data set (0.92629), probably because reducing the size of the web images discarded some information.

## 9) Model Certainty - Softmax Probabilities:

The model was very certain about some pictures, assigning a probability of 1.0 to the correct class. For a few pictures it was not. The misclassified pictures were mostly the ones that lost some of their features when they were downsized, such as the slippery road sign shown below.

 <img src="Report_pics/capture3.png" width="90" alt="Combined Image" />

 <p style="text-align: center;"> The resized version of the slippery road sign (above)</p> 

 <img src="Report_pics/capture4.png" width="300" alt="Combined Image" />
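
As an illustration of how such certainty values are obtained, the sketch below applies softmax to made-up logits and picks the most probable classes. It is a NumPy stand-in rather than the code used for this report, and the function names and numbers are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum keeps exp() stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def top_k(probs, k):
    """Return the k largest probabilities per row and their class indices."""
    idx = np.argsort(probs, axis=1)[:, ::-1][:, :k]
    vals = np.take_along_axis(probs, idx, axis=1)
    return vals, idx

# Two made-up examples with six classes: one confident, one uncertain.
logits = np.array([
    [12.0, 0.5, -1.0, 0.0, 0.2, 0.1],
    [1.0, 1.2, 0.8, 0.9, 1.1, 0.7],
])
probs = softmax(logits)
vals, idx = top_k(probs, k=5)
print(np.round(vals, 3))
print(idx)
```

In the first row nearly all probability mass lands on one class, similar to the predictions reported with probability 1.0; in the second row the mass is spread across several classes, which is what an uncertain prediction looks like.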
