## Behavioral Cloning Project

The goals / steps of this project are the following:

+ Use the simulator to collect data of good driving behavior
+ Build a convolutional neural network in Keras that predicts steering angles from images
+ Train and validate the model with a training and validation set
+ Test that the model successfully drives around track one without leaving the road
+ Summarize the results with a written report

### Video recording of a test drive

[Link to video](https://youtu.be/2bbZJGtuk_I)

```python
from IPython.display import HTML
HTML("""
<video width="320" height="240" controls>
  <source src="./final_model.mp4" type="video/mp4">
</video>
""")
```

## Model Architecture and Training Strategy
### 1) An appropriate model architecture has been employed

My model borrows its architecture from [a paper from Nvidia](https://images.nvidia.com/content/tegra/automotive/images/2016/solutions/pdf/end-to-end-dl-using-px.pdf) rather than being designed from scratch. The model includes ReLU layers to introduce nonlinearity, and the data is normalized inside the model using a Keras Lambda layer.

**Layers:**

+ **Lambda** - normalization layer
+ **Convolution2D** - convolution with 5x5 and 3x3 kernels, valid padding, and ReLU activation
+ **MaxPooling2D** - reduces the spatial dimensions
+ **Dropout** - helps prevent overfitting
+ **Cropping2D** - removes irrelevant parts of the image
+ **Flatten** - converts the output of the convolutional part of the CNN into a one-dimensional feature vector
+ **Dense** - regression output (steering angle)

```
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
lambda_1 (Lambda)                (None, 160, 320, 3)   0           lambda_input_1[0][0]
____________________________________________________________________________________________________
convolution2d_1 (Convolution2D)  (None, 80, 160, 24)   1824        lambda_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation)        (None, 80, 160, 24)   0           convolution2d_1[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 79, 159, 24)   0           activation_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 40, 80, 36)    21636       maxpooling2d_1[0][0]
____________________________________________________________________________________________________
activation_2 (Activation)        (None, 40, 80, 36)    0           convolution2d_2[0][0]
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D)    (None, 39, 79, 36)    0           activation_2[0][0]
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D)  (None, 20, 40, 48)    43248       maxpooling2d_2[0][0]
____________________________________________________________________________________________________
activation_3 (Activation)        (None, 20, 40, 48)    0           convolution2d_3[0][0]
____________________________________________________________________________________________________
maxpooling2d_3 (MaxPooling2D)    (None, 19, 39, 48)    0           activation_3[0][0]
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D)  (None, 19, 39, 64)    27712       maxpooling2d_3[0][0]
____________________________________________________________________________________________________
activation_4 (Activation)        (None, 19, 39, 64)    0           convolution2d_4[0][0]
____________________________________________________________________________________________________
maxpooling2d_4 (MaxPooling2D)    (None, 18, 38, 64)    0           activation_4[0][0]
____________________________________________________________________________________________________
convolution2d_5 (Convolution2D)  (None, 18, 38, 64)    36928       maxpooling2d_4[0][0]
____________________________________________________________________________________________________
activation_5 (Activation)        (None, 18, 38, 64)    0           convolution2d_5[0][0]
____________________________________________________________________________________________________
maxpooling2d_5 (MaxPooling2D)    (None, 17, 37, 64)    0           activation_5[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 40256)         0           maxpooling2d_5[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 1164)          46859148    flatten_1[0][0]
____________________________________________________________________________________________________
activation_6 (Activation)        (None, 1164)          0           dense_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 100)           116500      activation_6[0][0]
____________________________________________________________________________________________________
activation_7 (Activation)        (None, 100)           0           dense_2[0][0]
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 50)            5050        activation_7[0][0]
____________________________________________________________________________________________________
activation_8 (Activation)        (None, 50)            0           dense_3[0][0]
____________________________________________________________________________________________________
dense_4 (Dense)                  (None, 10)            510         activation_8[0][0]
____________________________________________________________________________________________________
activation_9 (Activation)        (None, 10)            0           dense_4[0][0]
____________________________________________________________________________________________________
dense_5 (Dense)                  (None, 1)             11          activation_9[0][0]
====================================================================================================
Total params: 47,112,567
Trainable params: 47,112,567
Non-trainable params: 0
```
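The shapes and parameter counts in the summary can be checked by hand. The sketch below redoes that arithmetic in plain Python. The strides, pooling settings, and padding modes are not stated in this write-up; they are inferred from the output shapes, which match `same` padding on the convolutions (stride 2 for the 5x5 layers, stride 1 for the 3x3 layers) and 2x2 pooling with stride 1. This differs from the valid padding named in the layer list, so treat these settings as assumptions.

```python
import math


def conv_same(size, stride):
    """Output length of a convolution with 'same' padding (inferred setting)."""
    return math.ceil(size / stride)


def pool_valid(size, pool=2, stride=1):
    """Output length of a pooling window with 'valid' padding (inferred setting)."""
    return (size - pool) // stride + 1


def conv_params(kernel, in_channels, out_channels):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out


# (kernel size, filters, stride) per convolution, inferred from the summary.
conv_layers = [(5, 24, 2), (5, 36, 2), (5, 48, 2), (3, 64, 1), (3, 64, 1)]
dense_units = [1164, 100, 50, 10, 1]

height, width, channels = 160, 320, 3
total = 0
for kernel, filters, stride in conv_layers:
    total += conv_params(kernel, channels, filters)
    height, width = conv_same(height, stride), conv_same(width, stride)
    height, width = pool_valid(height), pool_valid(width)
    channels = filters

flat = height * width * channels
n_in = flat
for units in dense_units:
    total += dense_params(n_in, units)
    n_in = units

print(flat, total)  # 40256 47112567
```

Almost all of the parameters (46,859,148 of 47,112,567) sit in the first dense layer, because the flattened feature vector is still 40,256 values long.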

### 2) Attempts to reduce overfitting in the model
The model contains dropout layers in order to reduce overfitting.

The model was trained and validated on different data sets to ensure that the model was not overfitting. The model was tested by running it through the simulator and ensuring that the vehicle could stay on the track.

### 3) Model parameter tuning
The model used an Adam optimizer. I experimented with learning rates of .1, .01, .001, and .0001 and settled on .0001 because it resulted in the lowest validation loss.

The following hyperparameters minimized validation loss:
+ learning rate: .0001
+ number of epochs: 12
+ batch size: 32

```
Epoch 1/12
6432/6428 [==============================] - 285s - loss: 0.0423 - val_loss: 0.0121
Epoch 2/12
6432/6428 [==============================] - 283s - loss: 0.0130 - val_loss: 0.0100
Epoch 3/12
6516/6428 [==============================] - 286s - loss: 0.0104 - val_loss: 0.0099
Epoch 4/12
6432/6428 [==============================] - 282s - loss: 0.0106 - val_loss: 0.0087
Epoch 5/12
6432/6428 [==============================] - 282s - loss: 0.0099 - val_loss: 0.0102
Epoch 6/12
6516/6428 [==============================] - 286s - loss: 0.0090 - val_loss: 0.0082
Epoch 7/12
6432/6428 [==============================] - 282s - loss: 0.0095 - val_loss: 0.0104
Epoch 8/12
6432/6428 [==============================] - 283s - loss: 0.0089 - val_loss: 0.0086
Epoch 9/12
6516/6428 [==============================] - 287s - loss: 0.0083 - val_loss: 0.0111
Epoch 10/12
6432/6428 [==============================] - 291s - loss: 0.0090 - val_loss: 0.0081
Epoch 11/12
6432/6428 [==============================] - 295s - loss: 0.0080 - val_loss: 0.0092
Epoch 12/12
6516/6428 [==============================] - 300s - loss: 0.0078 - val_loss: 0.0081
```
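The validation loss in this log does not fall monotonically, so it can help to read off which epoch scored best. A minimal sketch, using only the `val_loss` values copied from the log above (the variable names are illustrative):

```python
# Validation loss per epoch, copied from the training log above.
val_loss = [0.0121, 0.0100, 0.0099, 0.0087, 0.0102, 0.0082,
            0.0104, 0.0086, 0.0111, 0.0081, 0.0092, 0.0081]

best = min(val_loss)
# Epochs are numbered from 1 in the log.
best_epochs = [epoch for epoch, loss in enumerate(val_loss, start=1) if loss == best]

print(best, best_epochs)  # 0.0081 [10, 12]
```

The last epoch ties for the lowest validation loss, which is consistent with stopping at 12 epochs.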

### 4) Appropriate training data
The training data is [provided](https://d17h27t6h515a5.cloudfront.net/topher/2016/December/584f6edd_data/data.zip) by Udacity. I made several attempts to collect and augment my own training data but it didn't yield better results.

## Model Architecture and Training Strategy

### 1) Solution Design Approach

The overall strategy for deriving a model architecture was iterative. My first step after downloading the simulator was to drive the vehicle manually for a few laps to get a feel for the controls and get better at track 1, since the quality of the training samples would be proportional to my ability to drive the vehicle manually. I recorded one lap for an initial end-to-end test scenario prior to implementing the full CNN architecture.

The second step was to use a very simple sequential model with one flatten layer and one dense layer. I thought this model was appropriate since I was primarily interested in an end-to-end test scenario to define a workflow for training and to get familiar with the tools provided.

After training this initial model, the vehicle performed poorly, as expected. The next step was to gather more training data and perform image processing. In addition, I downloaded the training dataset available from Udacity for augmentation. Here are the processing steps taken:

+ cropping images to remove the top part of the image, which isn't necessary
+ using multiple cameras in addition to the center image
+ flipping images horizontally
+ adding an angle offset of +/- 0.4 to the steering angle (recovery driving)
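The processing steps above can be sketched with NumPy as follows. This is an illustration, not the project's actual code: the number of cropped rows (`CROP_TOP`) is a hypothetical value, and the sign convention for the side cameras (left camera adds the offset, right camera subtracts it) is an assumption. Only the 0.4 offset comes from this write-up.

```python
import numpy as np

CROP_TOP = 60      # hypothetical number of rows to drop; not given in the write-up
SIDE_OFFSET = 0.4  # steering offset from the write-up


def crop_top(image, rows=CROP_TOP):
    """Drop the top rows of an (H, W, C) image."""
    return image[rows:, :, :]


def flip_horizontal(image, angle):
    """Mirror the image left to right and negate the steering angle."""
    return image[:, ::-1, :], -angle


def camera_angle(center_angle, camera, offset=SIDE_OFFSET):
    """Adjust the recorded angle for a side camera (assumed sign convention)."""
    if camera == "left":
        return center_angle + offset
    if camera == "right":
        return center_angle - offset
    return center_angle


# Usage with a dummy 160x320 RGB frame.
frame = np.arange(160 * 320 * 3, dtype=np.float32).reshape(160, 320, 3)
cropped = crop_top(frame)
flipped, flipped_angle = flip_horizontal(frame, 0.25)
print(cropped.shape, flipped_angle, camera_angle(0.0, "left"))
```

Flipping doubles the data set and balances left and right turns, since every flipped frame carries the opposite steering angle.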

In order to gauge how well the model was working, I split my image and steering angle data into a training and validation set. I found that my first model had a low mean squared error on the training set but a high mean squared error on the validation set. This implied that the model was overfitting.

To combat the overfitting, I modified the model to include dropout.

Then I looked at the Nvidia paper (mentioned above) for a reference architecture and integrated it into this project.

I ran into a memory error once the number of training samples grew large enough, so I refactored my model training to use generators and train on batches of images instead of loading the entire data set into memory.

The final step was to run the simulator to see how well the car was driving around track one. There were a few spots where the vehicle fell off the track. In order to improve the driving behavior in these cases, I recorded some recovery driving images and retrained the model.

At the end of the process, the vehicle is able to drive autonomously around the track without leaving the road.

### 2) Final Model Architecture

![CNN Architecture](./assets/cnn_architecture.png)

### 3) Creation of the Training Set & Training Process
To capture good driving behavior, I first recorded a few laps on track one using center lane driving. Here is an example image of center lane driving:

![Center lane driving](./assets/center_2017_03_15_07_36_35_267.jpg)

I then recorded the vehicle recovering from the left and right sides of the road back to center so that the vehicle would learn to stay in the lane and generalize better. These images show what a recovery looks like:

Steering angle: 0
![Recovery driving, center camera](./assets/center_2016_12_01_13_30_48_287.jpg)

Steering angle: 0
![Recovery driving, center camera](./assets/center_2016_12_01_13_31_13_381.jpg)

Steering angle: 0.5784606
![Recovery driving, left camera](./assets/left_2016_12_01_13_32_43_963.jpg)

Steering angle: 0.0904655
![Recovery driving, right camera](./assets/right_2016_12_01_13_32_45_477.jpg)


I used this training data for training the model. The validation set helped determine whether the model was overfitting or underfitting. Here are the model hyperparameter values:

+ learning rate: .0001
+ number of epochs: 12
+ batch size: 32

After the collection process, I had 12K data points. Finally, I randomly shuffled the data set and put Y% of the data into a validation set.
```
Number of negative steering angles: 1775
Number of positive steering angles: 1900
Number of zero steering angles: 4361
```
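A shuffle-and-split step and the sign counts above could be produced along these lines. This is a minimal sketch under stated assumptions: the sample layout `(filename, angle)`, the function names, and the 0.2 validation fraction in the usage example are all hypothetical, since the write-up does not give the split percentage.

```python
import random


def split_samples(samples, validation_fraction, seed=0):
    """Shuffle a copy of the samples and split off a validation set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def count_by_sign(angles):
    """Count negative, positive, and zero steering angles."""
    return {
        "negative": sum(a < 0 for a in angles),
        "positive": sum(a > 0 for a in angles),
        "zero": sum(a == 0 for a in angles),
    }


# Usage with made-up samples; 0.2 is a placeholder fraction.
samples = [("img_%d.jpg" % i, angle)
           for i, angle in enumerate([-0.2, 0.0, 0.0, 0.3, 0.0, -0.1, 0.4, 0.0, 0.0, 0.1])]
train, validation = split_samples(samples, validation_fraction=0.2)
counts = count_by_sign([angle for _, angle in samples])
print(len(train), len(validation), counts)
```

Counts like these make the imbalance visible: in the numbers above, zero-angle frames outnumber the left and right turns combined.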

Steering angle distribution plot:
![Steering angle distribution](./assets/angles_distribution_plot.png)

A separate test set wasn't necessary. The purpose of a test set is to measure the model's ability to generalize, and that purpose is served by running the final model on the test track in the simulator.

Since it isn't possible to store all images in memory, I used a Python generator to produce batches of data. Only the list of filenames for the training and validation sets was kept in memory, and the images themselves were read from disk only when a new batch was requested.
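A generator of this kind might look like the sketch below. It is not the project's actual code: `batch_generator` and the `load_image` callable are hypothetical names, and the loader is passed in so the example runs without any image files. The generator loops forever, as Keras generator-based training typically expects.

```python
import random


def batch_generator(samples, load_image, batch_size=32, shuffle=True):
    """Yield (images, angles) batches forever, loading images lazily.

    samples: list of (filename, steering_angle) pairs kept in memory.
    load_image: hypothetical callable that reads one image from disk.
    """
    samples = list(samples)
    while True:
        if shuffle:
            random.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            # Images are only read here, one batch at a time.
            images = [load_image(path) for path, _ in batch]
            angles = [angle for _, angle in batch]
            yield images, angles


# Usage with a stand-in loader that returns the path instead of pixel data.
samples = [("img_%d.jpg" % i, 0.01 * i) for i in range(70)]
gen = batch_generator(samples, load_image=lambda path: path, batch_size=32, shuffle=False)
sizes = [len(next(gen)[0]) for _ in range(4)]
print(sizes)  # [32, 32, 6, 32]
```

In a real training loop the two lists would usually be converted to NumPy arrays before being yielded.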

The [Adam optimizer](https://arxiv.org/abs/1412.6980v8) was used to minimize the mean squared error (MSE). MSE serves as both the loss function and the evaluation metric, since predicting steering angles is a regression problem.
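For reference, MSE is the average of the squared differences between the recorded and predicted steering angles. A minimal plain-Python sketch (the function name and the sample values are illustrative only):

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Two made-up steering angles and predictions.
mse = mean_squared_error([0.0, 0.5], [0.0, 0.0])
print(mse)  # 0.125

# Taking the square root puts the error back in steering-angle units.
rmse = math.sqrt(0.0081)
print(round(rmse, 2))  # 0.09
```

Read this way, the final validation loss of 0.0081 in the training log corresponds to a root-mean-square error of about 0.09 in steering-angle units.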