ibiscp/Behavioral-Cloning
# Behavioral Cloning

The idea of this project is to clone human behavior to learn how to steer a car on a simulated track. The only information given to the model is the front view from the vehicle, and the expected output is the steering angle of the wheel.

In order to meet this goal, it is necessary to train a Convolutional Neural Network so that it learns what to do in each scenario, given a dataset with a set of images and the steering angle related to each one.

The figure below presents an image of both tracks of the simulator:

## Resources

There are a few files needed to run the Behavioral Cloning project.

The simulator contains two tracks. Sample driving data for the first track is included below, which can optionally be used to help train the network. It is also possible to collect data using the record button in the simulator.

### Simulator Download

### Beta Simulators

## Dataset

The dataset, provided by Udacity and found in this link, contains the following data:

  • Folder with 8,036 simulation frames, each showing the center, left and right camera view of the road, totaling 24,108 images
  • File driving_log.csv containing one row per frame with the following information:
    • Center image path
    • Left image path
    • Right image path
    • Steering angle
    • Throttle
    • Brake
    • Speed
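As a sketch, a row of driving_log.csv can be parsed with the standard library like this (the sample row below is made up for illustration):

```python
import csv
import io

# A made-up driving_log.csv row: center, left, right image paths,
# then steering angle, throttle, brake and speed
sample_row = "IMG/center_1.jpg,IMG/left_1.jpg,IMG/right_1.jpg,0.05,0.8,0.0,25.1\n"

reader = csv.reader(io.StringIO(sample_row))
center, left, right, steering, throttle, brake, speed = next(reader)
steering = float(steering)  # the label the network is trained to predict
```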

Below is an example of the images used to train the CNN; it also shows how the steering angle is adjusted based on the camera position.
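The usual way to use the side cameras is to add a fixed correction to the recorded center steering angle, so the model learns to steer back toward the lane center. A minimal sketch, where the correction constant is a hypothetical value (the repo's exact offset may differ):

```python
# Hypothetical steering offset for the side cameras
CORRECTION = 0.25

def camera_angle(center_angle, camera):
    """Adjust the recorded steering angle for the left/right camera views."""
    if camera == "left":
        return center_angle + CORRECTION   # steer back to the right
    if camera == "right":
        return center_angle - CORRECTION   # steer back to the left
    return center_angle
```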

## Histogram

The image below presents the histogram of the given dataset, where it is possible to notice that images with a steering angle equal to zero are heavily over-represented.

In order to have a more balanced dataset, it is necessary to eliminate a good part of the zero-angle examples. It was decided to keep only 15% of them, and the result is presented below.
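The downsampling step can be sketched as follows, here on a toy array of angles (the real code operates on the driving log rows):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy steering angles: most are exactly zero, as in the real histogram
angles = np.array([0.0] * 100 + [0.1, -0.2, 0.05, -0.1] * 5)

zero_idx = np.flatnonzero(angles == 0.0)
nonzero_idx = np.flatnonzero(angles != 0.0)

# Keep only 15% of the zero-angle samples, all of the rest
keep_zero = rng.choice(zero_idx, size=int(0.15 * len(zero_idx)), replace=False)
balanced = angles[np.concatenate([nonzero_idx, keep_zero])]
```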

## Image Augmentation

In order to improve the learning task and make the model more robust, the dataset is augmented, so more data is artificially generated based only on the given images.

The following augmentations are used in this project:

  • Flip
  • Change image brightness
  • Rotate
  • Translate
  • Shadow
  • Shear
  • Crop

Examples of each transformation are presented below.

### Flip

In order to have a balanced dataset, it is useful to flip each image randomly, also inverting the sign of the steering angle.
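The flip itself is one line with numpy; during training it would be applied to each sample with 50% probability:

```python
import numpy as np

def flip(image, angle):
    """Mirror the image horizontally and invert the steering angle."""
    return np.fliplr(image), -angle
```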

### Change image brightness

It is useful to change the image brightness in order to make the model generalize from a sunny day to a rainy day or night, for example. This can be achieved by changing the V channel of the image converted to HSV.
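The project converts to HSV to do this; as a dependency-free sketch, scaling all RGB channels by the same factor scales the V channel (V = max(R, G, B)) identically, as long as no values clip:

```python
import numpy as np

def adjust_brightness(image, factor):
    """Scale overall brightness; equivalent to scaling the HSV V channel
    by `factor` when no pixel saturates. In training, `factor` would be
    drawn at random, e.g. from [0.3, 1.3]."""
    out = image.astype(np.float32) * factor
    return np.clip(out, 0, 255).astype(np.uint8)
```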

### Rotate

Rotating the image generates sloping road angles, so the model learns how to generalize to these cases.

### Translate

Translating the image randomly makes it possible to generate more data at different positions on the road, adding a factor proportional to the translation to the steering angle.
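A numpy-only sketch of the horizontal shift; the per-pixel steering sensitivity is a hypothetical value, not necessarily the one used in the repo:

```python
import numpy as np

# Hypothetical sensitivity: steering change per pixel of horizontal shift
ANGLE_PER_PIXEL = 0.004

def translate(image, tx, angle):
    """Shift the image horizontally by tx pixels (positive = right),
    filling the exposed border with zeros, and adjust the angle."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    if tx >= 0:
        out[:, tx:] = image[:, :w - tx]
    else:
        out[:, :w + tx] = image[:, -tx:]
    return out, angle + tx * ANGLE_PER_PIXEL
```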

### Shadow

Randomly shading an image makes the model more robust to shadows on the track, such as those cast by trees, wires or poles.
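One common way to do this is to darken everything on one side of a random line across the image. A sketch with fixed line endpoints (in training they would be drawn at random):

```python
import numpy as np

def add_shadow(image, x_top, x_bottom, strength=0.5):
    """Darken pixels left of a line running from (x_top, first row)
    to (x_bottom, last row), simulating a shadow across the road."""
    h = image.shape[0]
    xs = np.linspace(x_top, x_bottom, h)
    out = image.astype(np.float32)
    for row, x in enumerate(xs):
        out[row, :int(x)] *= strength
    return out.astype(np.uint8)
```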

### Shear

Shearing the image is also useful, since it generates more data from the images we already have by bending the road sideways; the steering angle is adjusted accordingly.
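A minimal sketch of a horizontal shear, shifting rows progressively more toward the top of the image; both the shift profile and the steering sensitivity are assumptions, not the repo's exact implementation:

```python
import numpy as np

# Hypothetical sensitivity, as for translation
ANGLE_PER_PIXEL = 0.004

def shear(image, shear_px, angle):
    """Shift each row horizontally, from 0 pixels at the bottom row up to
    `shear_px` pixels at the top row, and adjust the steering angle."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    for row in range(h):
        dx = int(round(shear_px * (h - 1 - row) / max(h - 1, 1)))
        if dx >= 0:
            out[row, dx:] = image[row, :w - dx]
        else:
            out[row, :w + dx] = image[row, -dx:]
    return out, angle + shear_px * ANGLE_PER_PIXEL
```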

### Crop

In order to minimize the number of parameters of the CNN, it is possible to crop unnecessary parts of the image: the bottom (car hood), the top (sky) and a few pixels on the sides.
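Cropping is plain array slicing; the margins below are illustrative defaults for the simulator's 160x320 frames, not necessarily the repo's exact values:

```python
import numpy as np

def crop(image, top=60, bottom=25, sides=10):
    """Remove the sky (top), the hood (bottom) and a few side pixels."""
    h, w = image.shape[:2]
    return image[top:h - bottom, sides:w - sides]
```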

## Composed result

The image below shows an example of an image after composing several of these transformations.

## Neural Network Architecture

This project was tested using two different architectures, CommaAI and NVIDIA. Both were trained using the same configuration (learning rate, optimizer, number of epochs, samples per epoch and augmentation); the only thing that changed was the model.

### Configuration

  • Learning rate: 1e-3
  • Optimizer: Adam
  • Number of epochs: 20
  • Samples per epoch: 20000
  • Batch size: 50
  • Validation split: 30%
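For reference, the configuration above translates into the following training setup (a sketch; the variable names are illustrative):

```python
config = {
    "learning_rate": 1e-3,
    "optimizer": "Adam",
    "epochs": 20,
    "samples_per_epoch": 20000,
    "batch_size": 50,
    "validation_split": 0.30,
}

# With a batched generator, each epoch covers this many batches:
steps_per_epoch = config["samples_per_epoch"] // config["batch_size"]
```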

### NVIDIA Architecture

Total number of trainable parameters: 2,116,983
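As a sanity check, layer parameter counts can be computed by hand. The convolutional stack below follows the original NVIDIA paper (24/36/48 filters of 5x5, then two 64-filter 3x3 layers); the repo's 2,116,983 total also depends on its specific input size, which fixes the size of the first fully connected layer:

```python
def conv_params(in_ch, out_ch, k):
    """Weights plus biases of a k x k convolution layer."""
    return (k * k * in_ch + 1) * out_ch

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return (n_in + 1) * n_out

# NVIDIA conv stack: 24@5x5, 36@5x5, 48@5x5, 64@3x3, 64@3x3
conv_total = (conv_params(3, 24, 5) + conv_params(24, 36, 5)
              + conv_params(36, 48, 5) + conv_params(48, 64, 3)
              + conv_params(64, 64, 3))
```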

### CommaAI Architecture

Total number of trainable parameters: 592,497

## Results

Below is a video of the result on the same track where the CNN was trained (Track 1). The model was also tested on a track never seen before (Track 2) in order to show that it generalizes to different tracks and conditions.

Behavioral Cloning

## Conclusion and next steps

The task of adjusting the parameters in order to get a satisfactory result is really difficult. Besides the architecture parameters, various other factors influence the result, such as augmentation and dataset balance.

For this task it is important to have a good computer in order to train the model faster. On my computer, with an NVIDIA GeForce GT 730M, it takes about 20 minutes to train, which is a little bit frustrating.
