In this project, we will train a deep neural network to simulate human driving behavior.
Behavioral cloning is a method by which human sub-cognitive skills can be captured and reproduced in a computer program. As the human subject performs the skill, his or her actions are recorded along with the situation that gave rise to the action. A log of these records is used as input to a learning program. The learning program outputs a set of rules that reproduce the skilled behavior. This method can be used to construct automatic control systems for complex tasks for which classical control theory is inadequate.
The goal of this project is to teach a Convolutional Neural Network (CNN) to drive a car in a Udacity simulator.
- OpenCV: `pip install opencv-python`
- pandas: `pip install pandas`
- TensorFlow (GPU): `conda install tensorflow-gpu`
- matplotlib: `pip install matplotlib`
- moviepy: `pip install moviepy==1.0.0`
- imgaug: `conda install -c conda-forge imgaug`
- scikit-learn: `conda install -c anaconda scikit-learn`
- Keras: `conda install -c conda-forge keras`
- Flask-SocketIO: `conda install -c conda-forge flask-socketio`
- Pillow: `conda install -c anaconda pillow`
- python-socketio: `conda install -c conda-forge python-socketio`
- h5py: `conda install -c anaconda h5py`
The car is equipped with three cameras that provide video streams, and the simulator records the values of the steering angle, speed, throttle, and brake. Only the steering angle needs to be predicted, though more advanced models might also predict throttle and brake. This makes it a regression task: we will use a CNN for feature extraction and turn it into a regression model.
GitHub repo: https://github.com/udacity/self-driving-car-sim
The model design follows the DAVE-2 system from NVIDIA's paper 'End to End Learning for Self-Driving Cars', published on 25 Apr 2016.
Here is a visualization of the architecture (note: visualizing the architecture is optional according to the project rubric)
This model is appropriate because it was designed around the idea that, with minimal training data, the system can learn to drive in traffic on local roads.
The main takeaways from this paper, and its recommended strategy, are:
- Use YUV color space
- Use images from both left and right camera along with center camera
- Use random shift and random rotation for image augmentation
The model consists of 9 layers: 1 Lambda layer (normalization), 5 convolutional layers, and 3 fully connected layers. The convolutional layers use 5x5 and 3x3 filters with depths between 24 and 64 (code cell 43 of Behavior Cloning Project v2.0.ipynb). The model uses the RELU activation function to introduce nonlinearity (code cell 43), and the data is normalized in the model using a Keras Lambda layer (code cell 31).
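As a minimal sketch, the nine-layer DAVE-2-style architecture described above can be written in Keras as follows. The filter sizes and depths follow the NVIDIA paper; the ReLU activations on the fully connected layers are an assumption, and the project's exact code lives in the notebook cells cited above.

```python
# Sketch of the DAVE-2-style model: 1 Lambda + 5 Conv + 3 FC layers,
# plus a single output neuron for the steering angle.
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Lambda, Conv2D, Flatten, Dense

model = Sequential([
    Input(shape=(66, 200, 3)),                 # 200x66x3 YUV input
    Lambda(lambda x: x / 127.5 - 1.0),          # normalize pixels to [-1, 1]
    # Five convolutional layers, depths 24 -> 64
    Conv2D(24, (5, 5), strides=(2, 2), activation='relu'),
    Conv2D(36, (5, 5), strides=(2, 2), activation='relu'),
    Conv2D(48, (5, 5), strides=(2, 2), activation='relu'),
    Conv2D(64, (3, 3), activation='relu'),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    # Three fully connected layers (ReLU here is an assumption)
    Dense(100, activation='relu'),
    Dense(50, activation='relu'),
    Dense(10, activation='relu'),
    Dense(1),                                   # predicted steering angle
])
model.summary()
```

With a 66x200x3 input, this stack flattens to 1152 features before the fully connected layers, matching the paper's layout.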
To capture good driving behavior, I first recorded 3 laps on track-1 using center lane driving. Here is an example image of center lane driving:
I then recorded 3 laps driving in the opposite direction. This not only increases the amount of training data, but also reduces bias, since track-1 consists mostly of left turns with only a few right turns.
I also used images from the left and right cameras, with a tweaked steering angle.
This has 2 advantages: a. the model learns how to recover from the side of the track back to the center lane, and b. it triples the amount of data.
Here are example images from the left, center, and right cameras:
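A minimal sketch of how the side-camera images can be used: a fixed correction is added to the steering angle for the left image and subtracted for the right one, so the model learns to steer back toward the center. The value 0.2 and the function name are assumptions for illustration; the actual offset is tuned per project.

```python
# CORRECTION = 0.2 is an assumed value, not the project's tuned offset.
CORRECTION = 0.2

def three_camera_samples(center_path, left_path, right_path, steering):
    """Return one (image_path, steering) sample per camera, with tweaked angles."""
    return [
        (center_path, steering),               # center: angle unchanged
        (left_path, steering + CORRECTION),    # left: steer right to recover
        (right_path, steering - CORRECTION),   # right: steer left to recover
    ]

samples = three_camera_samples('c.jpg', 'l.jpg', 'r.jpg', 0.1)
```

Each log entry thus yields three training samples instead of one, which is where the "3 times the data" comes from.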
Then I repeated this process on track-2 in order to get more data points.
I preprocessed the data by applying the following techniques:
- Cropping the less useful regions, i.e. the upper portion with trees and forest, and the lower portion with the car's hood.
- Converting the image from RGB color space to YUV color space.
- Resizing the image to 200x66x3.
Note: the YUV color space and the 200x66x3 image size are recommended in the paper; they gave great results and helped the model generalize.
Then I augmented the data. After trying different image augmentations, I finally settled on these four:
a. Random image flip (the steering angle is negated)
b. Random image shift (the image is translated left/right or up/down)
c. Random zoom (factor 1.x to 1.3x)
d. Random image rotation (-25 deg to +25 deg)
These augmentations are recommended in NVIDIA's paper 'End to End Learning for Self-Driving Cars' (section 5.2).
I used this training data for training the model, and a validation set to check whether the model was over- or under-fitting. The ideal number of epochs was 30. I used the Adam optimizer with learning_rate = 0.0001.
python drive_updated.py model_v2.0.h5
and run the simulator in autonomous mode; choose either track.
Track 1:
output of track 1 is here
Track 2:
output of track 2 is here