# Behavioral Cloning 


---

**Behavioral Cloning Project**

The goals / steps of this project are the following:
* Use the simulator to collect data of good driving behavior
* Build a convolutional neural network in Keras that predicts steering angles from images
* Train and validate the model with a training and validation set
* Test that the model successfully drives around track one without leaving the road
* Summarize the results with a written report



## Rubric Points
### Here I will consider the [rubric points](https://review.udacity.com/#!/rubrics/432/view) individually and describe how I addressed each point in my implementation.

---

### Model Architecture and Training Strategy

#### 1. An appropriate model architecture has been employed

To create this model, I used Keras, which in this case relies on the TensorFlow library as its backend. Coding in Keras is more concise and intuitive than coding directly in TensorFlow.

I wanted to see whether a simple convolutional neural network would be sufficient to drive the car in the simulator. For this I relied on the LeNet-5 architecture, which had successfully classified 43 categories of traffic signs in my previous project.

To recognize track features, my model uses two convolutional layers:
* the first uses 5x5 filters with a stride of 1 and a depth of 6
* the second also uses 5x5 filters with a stride of 1, but a depth of 16

I also use two fully connected layers, one with 120 neurons and the other with 84 neurons.

At its input, the model also preprocesses each picture by normalizing it (using a Keras Lambda layer) and cropping the top and bottom parts (using a Keras Cropping2D layer).

To introduce nonlinearity, ReLU activations are used throughout the model.

Numerous models were tested by running them through the simulator and ensuring that the vehicle could stay on the track. Every time a model failed, additional changes were made to the model and the data set.

#### 2. Attempts to reduce overfitting in the model

The model contains two max pooling layers (one after each convolutional layer) to help reduce overfitting.

I also limited training to 3 epochs.

Using a generator let me train on a large data set without loading it all into memory, and a larger data set makes it harder for the model to memorize individual images.

#### 3. Model parameter tuning

The model uses the Adam optimizer, so the learning rate was not tuned manually. Adam tends to converge faster than plain stochastic gradient descent because it combines momentum with a per-parameter step size: each update is scaled by running estimates of the gradient's mean and squared magnitude, so the effective step adapts to each parameter instead of following a single fixed learning rate.
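As a rough illustration of why no manual learning-rate tuning was needed, a single Adam update for one scalar parameter can be sketched as below. The function name `adam_step` and the toy objective are made up for this example; the default hyperparameters are the ones suggested in the original Adam paper, and Keras performs the equivalent update internally for every weight.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter and return the new state."""
    # Running average of the gradient (the momentum-like first moment).
    m = beta1 * m + (1 - beta1) * grad
    # Running average of the squared gradient (the second moment).
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # The step is scaled per parameter by the second-moment estimate.
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy objective f(theta) = theta**2, whose gradient is 2 * theta.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly `lr` regardless of how large the raw gradient is.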

#### 4. Appropriate training data

Proper training data was critical to train the model to accurately predict the steering angle of the car. I initially relied on the Udacity set, because it features proper driving. I later added a set with my own driving data around the first track. That set includes clockwise and counter-clockwise driving (driving the first track in one direction produces mostly left turns, so recording laps in both directions helps keep the model from being biased towards turning left). I also added examples of recovery driving at different stages of the track (e.g. from a yellow line, from a dirt shoulder, from the side of the bridge, etc.).


### Model Architecture and Training Strategy

#### 1. Solution Design Approach

When looking at the different features of the first track in the simulator, I realized that the model would need to recognize only a few of them in order to drive successfully. The most important are the features that mark the edges of the track:
* yellow lines
* red-and-white stripes
* black walls of the bridge
* dirt areas on the side of the road

Of course, the model would need to recognize not only a feature, but also its angle relative to the track and the car's position. However, since LeNet-5 had proven successful at classifying 43 different traffic signs in my previous project, I thought a model architecture based on it might succeed at driving the car in the simulator.

If the track had shown a greater number of features (e.g. more driving surfaces, more edge types), then a small model like LeNet might not have been able to properly recognize each feature, but in this case it proved successful, provided it was given enough good training data.

I also had to think about the pictures fed into the model. Unlike my previous project, which used 32x32-pixel images, each picture here had an initial size of 160x320 pixels. The data set from Udacity is almost 400 MB and my own data set is 250 MB. Since loading a data set of that size all at once might have caused memory issues during training, I used a generator that reads 32 images at a time.
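A minimal sketch of such a generator is shown below. It is not the code from model.py: the names `batch_generator`, `samples`, and `load_image` are hypothetical, and the image loader is passed in as a callable so that the example stays independent of any particular image library.

```python
import random


def batch_generator(samples, load_image, batch_size=32):
    """Yield (images, angles) batches forever, loading images lazily.

    samples    -- list of (image_path, steering_angle) pairs
    load_image -- hypothetical callable that turns a path into an image
    """
    samples = list(samples)  # copy so the caller's list is not reordered
    while True:  # Keras-style generators loop indefinitely
        random.shuffle(samples)  # new order on every pass over the data
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            images = [load_image(path) for path, _ in batch]
            angles = [angle for _, angle in batch]
            yield images, angles
```

Only one batch of images is held in memory at a time. In a real training script the two lists would typically be converted to NumPy arrays before being yielded to Keras.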

Also, not all the pixels in a picture are relevant: the top pixels show only the sky and horizon, whereas the bottom pixels show mostly the hood of the car. These rows are cropped before the rest of the picture is analyzed by the model.

I also had to think about the tool used to read images in the first place. Initially I used OpenCV's cv2 library. However, I later realized that OpenCV reads images as BGR by default, whereas drive.py, the file used to drive the car in the simulator, reads images as RGB. This created issues because a model trained on BGR data to recognize, for example, yellow markings would not recognize those markings when fed RGB data. So I changed the code to read the training pictures with matplotlib.image, which uses RGB, instead.
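The mismatch is easy to see on a single pixel. The sketch below uses only NumPy and a made-up one-pixel image; `bgr_to_rgb` is a hypothetical helper name, not something taken from model.py or drive.py.

```python
import numpy as np

# A single "yellow" pixel in RGB order: strong red and green, no blue.
yellow_rgb = np.array([[[255, 255, 0]]], dtype=np.uint8)

# The same pixel as a BGR reader such as cv2.imread would store it.
yellow_bgr = yellow_rgb[..., ::-1]


def bgr_to_rgb(image):
    """Reverse the channel axis of an H x W x 3 image (BGR <-> RGB)."""
    return image[..., ::-1]


print(yellow_bgr[0, 0].tolist())              # channel values in BGR order
print(bgr_to_rgb(yellow_bgr)[0, 0].tolist())  # back in RGB order
```

If the BGR values were interpreted as RGB, the yellow pixel would appear cyan, which is why the training and inference code need to agree on channel order. Converting after reading (for example with `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)`) would have been an alternative to switching readers.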

I also thought about resizing/downsizing pictures. This would have further reduced the size of the output at the end of each convolutional layer, thereby speeding up the training process. A popular network used to train a self-driving car is one by NVIDIA, which uses 66x200 images, so resizing could also have been used to match those dimensions. In the end though, the model I chose proved sufficient to drive the car on the first track.

Confident that the generator would help prevent memory issues, I used a combined data set (Udacity and my own) that comprised about 40,000 pictures. 

In order to gauge how well the model was working, I split my image and steering angle data into a training and validation set. Because my data set was fairly large, I decided to allocate a significant portion (30%) to the validation set. When running the model, the Keras progress output showed the loss decreasing with each batch processed. After only the first epoch, loss for the validation set was around 0.15 and diminished slightly over the next two epochs.

To combat overfitting, I limited the number of epochs used to train my model. I also used a large data set so that it would be harder for the model to "learn" the right prediction for just a specific set of images. Finally, I used max pooling layers to generalize the results of each convolutional layer.
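A 70/30 split of this kind can be sketched with the standard library alone. The helper name `split_samples` and the fixed seed are assumptions for illustration; the write-up does not say how the split was implemented, and scikit-learn's `train_test_split` with `test_size=0.3` would serve the same purpose.

```python
import random


def split_samples(samples, validation_fraction=0.3, seed=0):
    """Shuffle samples and split them into (training, validation) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Stand-in for roughly 40,000 (image, angle) samples.
train, valid = split_samples(range(40000))
print(len(train), len(valid))
```

Shuffling before splitting matters here because consecutive frames from one recording are nearly identical; without it the validation set would come from a single stretch of track.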

At the end of the process, the vehicle is able to drive autonomously around the track without leaving the road.

#### 2. Final Model Architecture

The final model architecture (model.py lines 18-24) consists of image preprocessing steps followed by a convolutional neural network.

During the preprocessing steps, each image is first normalized using a Keras Lambda layer, then the top 70 rows and bottom 20 rows of pixels are cropped out using a Cropping2D layer.
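The effect of these two preprocessing steps can be reproduced outside Keras with NumPy, as in the sketch below. The normalization formula (scaling to the range -0.5 to 0.5) is an assumption, since the write-up does not state which formula the Lambda layer applies; the crop sizes are the ones given above, and `preprocess` is a hypothetical name.

```python
import numpy as np


def preprocess(image, top=70, bottom=20):
    """Normalize pixel values and crop rows, mirroring the two Keras layers."""
    # Assumed normalization: map [0, 255] to [-0.5, 0.5].
    normalized = image.astype(np.float32) / 255.0 - 0.5
    # Drop the top rows (sky) and the bottom rows (hood).
    return normalized[top:normalized.shape[0] - bottom, :, :]


frame = np.zeros((160, 320, 3), dtype=np.uint8)  # blank simulator-sized frame
cropped = preprocess(frame)
print(cropped.shape)
```

Doing this inside the model, rather than in the data pipeline, means drive.py can feed raw simulator frames to the network without repeating the preprocessing.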

The resulting image is then fed to a convolutional layer with 5x5 filters, a stride of 1, and a depth of 6.

Following this layer is a ReLU activation, which introduces nonlinearity.

The output is then reduced in size using a 2x2 max pooling layer.

Next comes another convolutional layer, with 5x5 filters, a stride of 1, and a depth of 16.

Like the first convolutional layer, this one is followed by a ReLU activation and a 2x2 max pooling layer.

The result is flattened, then fed into a 120-neuron fully connected layer, followed by an 84-neuron fully connected layer.

The output layer contains only one neuron, which outputs a number that is used to determine the steering angle for the car.
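To make the layer sizes concrete, the short calculation below walks a 160x320 RGB frame through the layers described above. It assumes unpadded ("valid") convolutions and pooling with a stride equal to the pool size, which are the Keras defaults but are not stated in this write-up; the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output length of an unpadded ("valid") convolution along one axis."""
    return (size - kernel) // stride + 1


def pool_out(size, pool=2):
    """Output length of non-overlapping pooling that drops any remainder."""
    return size // pool


height, width = 160 - 70 - 20, 320  # rows and columns left after cropping
shapes = [("cropped input", (height, width, 3))]
for name, depth in (("conv1", 6), ("conv2", 16)):
    height, width = conv_out(height, 5), conv_out(width, 5)
    shapes.append((name, (height, width, depth)))
    height, width = pool_out(height), pool_out(width)
    shapes.append((name + " + pool", (height, width, depth)))
flattened = height * width * 16  # inputs to the 120-neuron layer

for name, shape in shapes:
    print(name, shape)
print("flattened", flattened)
```

Under these assumptions most of the model's weights sit between the flattened output and the first fully connected layer, which is one reason resizing the input images would have shortened training.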


#### 3. Creation of the Training Set & Training Process

Capturing good driving behavior was difficult at first, as using the keyboard keys to drive the car resulted in jittery moves and made it difficult to keep the car in the center of the road, which is essential if we want the model to learn proper driving. I thus decided to use the Udacity data set because it contains examples of good driving.

I later learned to use the mouse to steer the car. This resulted in much smoother driving, as the mouse enabled me to keep steering angles constant over multiple frames. I then decided to record multiple examples of recovery driving, where I started the recording from a side of the road and steered the car back to the middle.

Finally, I decided to use the pictures from the left and right "cameras". Indeed, when recording a run, three pictures were recorded at a time:
* one from a center camera
* one from a left camera
* one from a right camera

While it was easy to write the code to use all three sets of pictures to train the model, a more difficult part was deciding what angle to use for the left and right pictures. Indeed, the only steering angle provided was for the center picture, so a correction had to be applied to that angle to obtain suitable labels for the left and right pictures.

Analyzing some images, I estimated that the left and right cameras are at an angle of roughly 20 degrees from the center view. Since the maximum steering angle of the car is +/- 25 degrees, 20 degrees would correspond to a steering coefficient of +/- 0.8 (20/25). However, I assumed that such a large correction coefficient would result in jittery driving, so I instead tried correction coefficients between 0.1 and 0.3, and settled on 0.2.
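The correction can be sketched as a small helper that turns one recorded frame into three training samples. The function name and file names are hypothetical, and the sign convention (positive steering means turning right, so the left camera gets a positive correction) is an assumption about the simulator's output rather than something stated in this write-up.

```python
CORRECTION = 0.2  # correction coefficient the write-up settled on


def expand_cameras(center_path, left_path, right_path, steering,
                   correction=CORRECTION):
    """Turn one log row into three (image_path, steering) training samples."""
    return [
        (center_path, steering),
        # The left camera sees the road as if the car sat further left,
        # so its label steers more to the right (assumed positive).
        (left_path, steering + correction),
        # The right camera is the mirror case: steer more to the left.
        (right_path, steering - correction),
    ]


samples = expand_cameras("center.jpg", "left.jpg", "right.jpg", 0.1)
for path, angle in samples:
    print(path, round(angle, 2))
```

Using all three cameras triples the number of samples per recorded frame and gives the model examples of steering back towards the center without any extra recovery driving.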

Once I had enough training data covering various driving directions (clockwise and counter-clockwise) and recovery situations, I shuffled the data before feeding it to my model.

Because of the size of my data set, I limited the number of epochs to 3, both to prevent overfitting and because training the model took time (about 40 minutes on a g2.2xlarge instance on Amazon Web Services). I also decided to proceed cautiously when making changes to my model, choosing to update only one parameter at a time (e.g. changing the left/right correction coefficient from 0.1 to 0.2), because I wanted to better understand the impact of each parameter on the model. In the end, I found a relatively simple model that was still able to drive around the first track.
