
Build a Traffic Sign Recognition Project - Luca Fiaschi

The goals / steps of this project are the following:

  • Load the data set (see below for links to the project data set)
  • Explore, summarize and visualize the data set
  • Design, train and test a model architecture
  • Use the model to make predictions on new images
  • Analyze the softmax probabilities of the new images
  • Summarize the results with a written report

Rubric Points

Here I will consider the rubric points individually and describe how I addressed each point in my implementation. The implementation and the project write-up can be found here, in my project code.


Data Set Summary & Exploration

I used the pandas library to calculate summary statistics of the traffic signs data set:

  • The size of training set is 34799
  • The size of the validation set is 4410
  • The size of test set is 12630
  • The shape of a traffic sign image is (32, 32, 3)
  • The number of unique classes/labels in the data set is 43
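
A minimal sketch of how these statistics can be computed, assuming the pickled data files from the standard project layout (train.p, valid.p, test.p with 'features' and 'labels' keys):

```python
import pickle
import numpy as np
import pandas as pd

# Load the pickled data sets provided with the project
# (file names assumed from the standard project layout).
with open('train.p', 'rb') as f:
    train = pickle.load(f)
with open('valid.p', 'rb') as f:
    valid = pickle.load(f)
with open('test.p', 'rb') as f:
    test = pickle.load(f)

X_train, y_train = train['features'], train['labels']

print("Training set size:  ", len(train['features']))    # 34799
print("Validation set size:", len(valid['features']))    # 4410
print("Test set size:      ", len(test['features']))     # 12630
print("Image shape:        ", X_train[0].shape)           # (32, 32, 3)
print("Number of classes:  ", len(np.unique(y_train)))    # 43

# Per-class counts, used for the bar chart and the min/mean/max figures
counts = pd.Series(y_train).value_counts()
print(counts.describe())
```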

Here is an exploratory visualization of the data set. It is a bar chart showing how the data is distributed across the different labels.

alt text

The average number of training examples per class is 809, the minimum is 180, and the maximum is 2010, so some labels are an order of magnitude more abundant than others.

Most common signs:

  • Speed limit (50km/h): 2010 training samples
  • Speed limit (30km/h): 1980 training samples
  • Yield: 1920 training samples
  • Priority road: 1890 training samples
  • Keep right: 1860 training samples

Rarest signs:

  • Speed limit (20km/h): 180 training samples
  • Dangerous curve to the left: 180 training samples
  • Go straight or left: 180 training samples
  • Pedestrians: 210 training samples
  • End of all speed and passing limits: 210 training samples

Here is a visualization of 10 randomly picked training examples for each class. As we can see, within each class there is high variability in appearance due to different weather conditions, times of day, and viewing angles.

alt text


Design and Test a Model Architecture

Question 1:

Describe how you preprocessed the data. Why did you choose that technique?

Answer:

Following a published baseline model for this problem, I applied similar normalization and image enhancements. Images were transformed into the YUV color space and adjusted by histogram stretching and by increasing sharpness. Finally, only the Y channel was kept, since in some preliminary experiments full-color images seemed to confuse the classifier (as also reported in the published baseline). This effect may, however, depend on the network architecture; in the long run we would intuitively expect networks trained on full-color images to perform better.
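
A minimal sketch of such a transform_img preprocessing step, assuming OpenCV and Pillow are available (the enhancement parameters are illustrative, not the exact values used):

```python
import cv2
import numpy as np
from PIL import Image, ImageEnhance

def transform_img(img):
    """Convert an RGB image to an enhanced single-channel Y image.

    Sketch of the preprocessing described above: YUV conversion,
    histogram stretching on the Y channel, and a sharpness boost.
    The sharpness factor is illustrative, not the exact value used.
    """
    # RGB -> YUV, keep only the luma (Y) channel
    y = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)[:, :, 0]

    # Histogram stretching: rescale intensities to the full [0, 255] range
    y = cv2.normalize(y, None, 0, 255, cv2.NORM_MINMAX)

    # Increase sharpness (factor > 1 sharpens)
    y = ImageEnhance.Sharpness(Image.fromarray(y)).enhance(2.0)

    # Return as a 32x32x1 float array scaled to [0, 1]
    return np.asarray(y, dtype=np.float32)[..., np.newaxis] / 255.0
```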

Here is an example of an original image and the transformed image.

alt text

Hence the preprocessed data set differs from the original in its reduced noise level and its number of channels.

Question 2:

Describe how you set up the training, validation and testing data for your model. Optional: If you generated additional data, how did you generate the data? Why did you generate the data? What are the differences in the new dataset (with generated data) from the original dataset?

Answer:

All images were processed by the transform_img function as described in Question 1. The training, validation, and test sets were provided in the exercise. The training set was further augmented by generating 5 additional images from every given image with the augment_img function. The augmentation consists of a random rotation around the image center (a random value between -15 and 15 degrees) and a random vertical stretch (the simplest way to simulate a different viewing angle) by a random value of up to 40%.
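
A minimal sketch of such an augment_img step, assuming OpenCV (the exact implementation may differ in how it handles borders and cropping):

```python
import cv2
import numpy as np

def augment_img(img):
    """Randomly rotate and vertically stretch an image, as described above.

    Rotation: uniform in [-15, 15] degrees around the image center.
    Vertical stretch: up to 40%, center-cropped back to the original height.
    """
    h, w = img.shape[:2]

    # Random rotation around the image center
    angle = np.random.uniform(-15, 15)
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    out = cv2.warpAffine(img, M, (w, h), borderMode=cv2.BORDER_REPLICATE)

    # Random vertical stretch (simulates a different viewing angle),
    # then center-crop back to the original size
    stretch = np.random.uniform(1.0, 1.4)
    out = cv2.resize(out, (w, int(h * stretch)))
    top = (out.shape[0] - h) // 2
    return out[top:top + h, :]
```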

An example of an image after augmentation is shown below:

alt text

Question 3:

Describe what your final model architecture looks like.

Answer:

My final model consisted of the following layers:

| Layer | Description |
|-------|-------------|
| Input | 32x32x1 Y-channel image |
| Convolution 5x5 | 1x1 stride, valid padding, outputs 28x28x6 |
| ReLU | |
| Max pooling | 2x2 stride, outputs 14x14x6 |
| Convolution 5x5 | 1x1 stride, valid padding, outputs 10x10x16 |
| ReLU | |
| Max pooling | 2x2 stride, outputs 5x5x16 |
| Fully connected | input 5x5x16 = 400, output 120 |
| ReLU | |
| Dropout | |
| Fully connected | input 120, output 84 |
| ReLU | |
| Dropout | |
| Fully connected | input 84, output 43 |
| Softmax | |
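
A minimal sketch of this architecture in Keras (the original project uses a TensorFlow 1.x LeNet implementation; layer sizes follow the table above, and the dropout rate is the one reported in Question 4):

```python
import tensorflow as tf

def build_model(dropout_rate=0.3, num_classes=43):
    """LeNet-style architecture from the table above, sketched in Keras."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 1)),                                   # Y-channel image
        tf.keras.layers.Conv2D(6, 5, padding='valid', activation='relu'),    # 28x28x6
        tf.keras.layers.MaxPooling2D(2),                                     # 14x14x6
        tf.keras.layers.Conv2D(16, 5, padding='valid', activation='relu'),   # 10x10x16
        tf.keras.layers.MaxPooling2D(2),                                     # 5x5x16
        tf.keras.layers.Flatten(),                                           # 400
        tf.keras.layers.Dense(120, activation='relu'),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(84, activation='relu'),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(num_classes, activation='softmax'),
    ])
```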

Question 4:

How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)

Answer:

I trained the model using the Adam optimizer, a learning rate of 1e-4, a dropout rate of 0.3, and a batch size of 128.
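
A minimal sketch of this training setup, reusing the build_model sketch above (X_train, y_train, X_valid, and y_valid are assumed to hold the preprocessed arrays; the epoch count follows the convergence noted in Question 5):

```python
model = build_model(dropout_rate=0.3)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss='sparse_categorical_crossentropy',   # integer class labels
    metrics=['accuracy'],
)
model.fit(X_train, y_train,
          batch_size=128,
          epochs=100,                          # training converges around 100 epochs
          validation_data=(X_valid, y_valid))
```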

Question 5:

What approach did you take in coming up with a solution to this problem? It may have been a process of trial and error, in which case, outline the steps you took to get to the final solution and why you chose those steps. Perhaps your solution involved an already well known implementation or architecture. In this case, discuss why you think this is suitable for the current problem.

Answer:

To train the model, I started from a well-known architecture (LeNet) because of its simplicity of implementation and because it performs well on recognition tasks with tens of classes (such as character recognition). After a few runs with this architecture I noticed that the model tended to overfit the original training set: the learning curve showed that the training accuracy converged to 99% while the validation accuracy remained unsatisfactory. For this reason, I tested two regularization techniques to improve the results:

  • Data augmentation
  • Dropout

I started by trying a high dropout rate of 50%, which seemed to slow down overfitting: the model was slower to train but also achieved slightly higher accuracy in the end. However, only when I added the augmented dataset did performance increase strongly, as the model was then able to learn within a few epochs while still generalizing well to the validation set.

A dropout rate of 30% and a learning rate of 1e-4 were selected after some trial and error. Training the model takes around 6 hours overall.

Training curves can be seen below; both training and validation error converge after around a hundred epochs.

alt text

My final model results were:

  • training set accuracy of 97%
  • validation set accuracy of 95%
  • test set accuracy of 93%

Test a Model on New Images

Question 1:

Choose five German traffic signs found on the web and provide them in the report. For each image, discuss what quality or qualities might be difficult to classify.

Answer:

Here are six German traffic signs that I found on the web:

alt text

All these images may be challenging to classify because:

  • they include much more background than the training images
  • the background is very different from the one in the training images
  • they contain image artifacts such as JPEG compression problems and copyright watermarks

Since these images do not have the input shape expected by the classifier, they were downsampled and smoothed before applying the transform_img function.
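
A minimal sketch of this resizing step and of how the top-5 probabilities below can be produced (web_img, model, and class_names are assumed placeholders):

```python
import cv2
import numpy as np

# Downsample with smoothing (INTER_AREA averages source pixels, which
# avoids aliasing when shrinking), then apply the same preprocessing
img = cv2.resize(web_img, (32, 32), interpolation=cv2.INTER_AREA)
x = transform_img(img)[np.newaxis, ...]        # add a batch dimension

# Top 5 softmax probabilities for the image
probs = model.predict(x)[0]
for label in np.argsort(probs)[::-1][:5]:
    print(f"{class_names[label]} with prob = {probs[label]:.2f}")
```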

Here are the results of the prediction:

Top 5 Labels for image Double curve:

  • Speed limit (30km/h) with prob = 0.76
  • End of speed limit (80km/h) with prob = 0.11
  • End of no passing with prob = 0.02
  • Speed limit (20km/h) with prob = 0.02
  • Children crossing with prob = 0.02

Top 5 Labels for image Children crossing:

  • Children crossing with prob = 0.71
  • Right-of-way at the next intersection with prob = 0.17
  • Go straight or right with prob = 0.04
  • Dangerous curve to the right with prob = 0.04
  • Slippery road with prob = 0.02

Top 5 Labels for image Speed limit (50km/h):

  • Speed limit (80km/h) with prob = 0.68
  • Speed limit (50km/h) with prob = 0.31
  • Speed limit (100km/h) with prob = 0.01
  • Speed limit (60km/h) with prob = 0.00
  • Speed limit (30km/h) with prob = 0.00

Top 5 Labels for image Stop:

  • Dangerous curve to the right with prob = 0.95
  • Keep right with prob = 0.04
  • Turn left ahead with prob = 0.01
  • Go straight or right with prob = 0.00
  • Speed limit (80km/h) with prob = 0.00

Top 5 Labels for image Go straight or left:

  • Turn left ahead with prob = 0.98
  • Priority road with prob = 0.01
  • Ahead only with prob = 0.01
  • Keep right with prob = 0.00
  • Roundabout mandatory with prob = 0.00

Top 5 Labels for image Speed limit (80km/h):

  • Speed limit (30km/h) with prob = 0.74
  • Speed limit (50km/h) with prob = 0.14
  • Speed limit (120km/h) with prob = 0.02
  • Speed limit (70km/h) with prob = 0.02
  • Speed limit (60km/h) with prob = 0.02

The model was able to correctly guess 1 of the 6 traffic signs, which gives an accuracy of ~17%. This is very different from the accuracy on the test set, but it is also understandable given the different conditions in which these images were taken.

For the first and fourth images, the model is relatively sure of the predicted label (a peaked probability distribution) without, however, getting close to the right answer. Note that these two images are the ones most affected by compression and watermark artifacts.

The prediction for image 2 is correct, with very high confidence.

While the predictions for images 3, 5, and 6 are wrong, the model was able to recognize the general type of sign (a speed limit sign for images 3 and 6, and a mandatory direction sign for image 5).
