# Behavioral Cloning

This document describes how to clone driving behavior using deep learning. The code lives in `model.py` in this repository and uses several packages to process training data gathered from Udacity's simulator. The gathered images are augmented to improve training. Images were saved to an S3 bucket so they could be accessed from multiple computers, which allowed the model to be developed separately from the training instance.

The goals / steps of this project are the following:
* Use the simulator to collect data of good driving behavior
* Build a convolutional neural network in Keras that predicts steering angles from images
* Train and validate the model with a training and validation set
* Test that the model successfully drives around track one without leaving the road
* Summarize the results with a written report

# Required Files

## Are all required files submitted?

> The submission includes a model.py file, drive.py, model.h5 and a writeup report.

This project includes:

  - [**model.py**](model.py) Contains the script to create and train the model
  - [**drive.py**](drive.py) For driving the car in autonomous mode
  - [**model.h5**](model.h5) Contains a trained convolutional neural network
  - [**writeup_report.md**](writeup_report.md) Summary of the results
  - **data_processing/** Module where all data processing code was placed
    - **data_preprocessing.py** Utilities to merge all CSV files into one
    - **data_processing.py** Utilities to navigate the CSV through a single index structure, which makes it simple to:
        - Shuffle the data 
        - Apply multiple transforms 
        - Extend the data
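
One way to picture such an index structure is sketched below. This is only an illustration, not the code in `data_processing.py`: the transform names and the helpers `build_index` and `shuffled` are hypothetical. The idea is that each entry pairs a CSV row number with the name of a transform, so shuffling and extending the dataset become operations on a small list rather than on the images themselves.

```python
import random

# Hypothetical transform names; the real ones live in data_processing.py.
TRANSFORMS = ("identity", "flip")


def build_index(num_rows, transforms=TRANSFORMS):
    """Return one (row, transform) pair per sample in the extended dataset."""
    return [(row, name) for row in range(num_rows) for name in transforms]


def shuffled(index, seed=None):
    """Return a shuffled copy of the index, leaving the original untouched."""
    rng = random.Random(seed)
    result = list(index)
    rng.shuffle(result)
    return result


index = build_index(num_rows=3)
print(len(index))  # 3 rows x 2 transforms = 6 entries
print(shuffled(index, seed=0))
```

Under this assumption, adding a new transform only grows the index; the images on disk stay unchanged.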
    


# Quality of code

## Is the code functional?

> The model provided can be used to successfully operate the simulation.

Using the Udacity-provided simulator and the `drive.py` file, the car can be driven autonomously around the track by executing:
```sh
python drive.py model.h5
```

To extend the data, I created a standardized process:

  1. Gather new data using Udacity's simulator and save it to a specific folder, e.g. *new_data*. That folder will contain an **IMG** directory and a **driving_log.csv** file.
  2. Place the folder inside the corresponding track folder in the data directory.
  3. If a **data/driving_log_compiled.csv** file already exists, delete it and rerun *model.py* to regenerate it. The process should pick up the new images, and all transformations already in place will be applied to them.


## Is the code usable and readable?

> The code in `model.py` uses a Python generator, if needed, to generate data for training rather than storing the training data in memory. The `model.py` code is clearly organized and comments are included where needed.

The code is split across three files.

### `model.py`

Contains all the code related to the CNN and the generators involved to train the model, as well as the pipeline for training, validating, and evaluating the model.
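
As a rough illustration of the generator approach, the sketch below yields shuffled batches of samples without holding the whole dataset in memory. It is a minimal stand-in, not the generator in `model.py`: the name `batch_generator`, the `(image_path, angle)` sample layout, and the default batch size are assumptions, and image loading is left out so the example runs with the standard library alone.

```python
import random


def batch_generator(samples, batch_size=32, seed=None):
    """Yield (image_paths, angles) batches forever, reshuffling every epoch."""
    samples = list(samples)
    if not samples:
        raise ValueError("samples must not be empty")
    rng = random.Random(seed)
    while True:
        rng.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            paths = [path for path, _ in batch]
            angles = [angle for _, angle in batch]
            # Real training code would load and transform the images here
            # and return arrays instead of plain lists.
            yield paths, angles


# Hypothetical samples: (image path, steering angle) pairs.
samples = [("IMG/a.jpg", 0.0), ("IMG/b.jpg", 0.1), ("IMG/c.jpg", -0.2)]
generator = batch_generator(samples, batch_size=2, seed=0)
first_paths, first_angles = next(generator)
second_paths, second_angles = next(generator)
print(len(first_paths), len(second_paths))  # 2 1
```

Because the generator never terminates, the training loop has to be told how many batches make up one epoch.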

### `data/data_processing.py`

Several transformations are applied to the images to extend the dataset; more information on this is found below. All the code that implements these transformations is contained in this file.

### `data/data_preprocessing.py`

Images are kept in multiple folders so that new data from the simulator can be added without touching existing recordings. These folders and their CSV files are traversed and merged into one.
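
The traversal and merge could look roughly like the following. This is a sketch under stated assumptions rather than the contents of `data_preprocessing.py`: the function name `merge_driving_logs` is hypothetical, each log is assumed to have no header row, and image paths are copied as-is without being rewritten. The demo builds a throwaway directory tree so the example is runnable on its own.

```python
import csv
import os
import tempfile


def merge_driving_logs(data_dir, output_name="driving_log_compiled.csv"):
    """Concatenate every driving_log.csv found under data_dir into one file."""
    output_path = os.path.join(data_dir, output_name)
    merged_rows = 0
    with open(output_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for root, dirs, files in os.walk(data_dir):
            dirs.sort()  # visit folders in a stable order
            if "driving_log.csv" not in files:
                continue
            log_path = os.path.join(root, "driving_log.csv")
            with open(log_path, newline="") as log_file:
                for row in csv.reader(log_file):
                    if row:  # skip blank lines
                        writer.writerow(row)
                        merged_rows += 1
    return output_path, merged_rows


# Demo on a temporary tree that mimics data/track1 and data/track2.
with tempfile.TemporaryDirectory() as data_dir:
    for track, angle in (("track1", "0.0"), ("track2", "0.25")):
        run_dir = os.path.join(data_dir, track, "new_data")
        os.makedirs(run_dir)
        with open(os.path.join(run_dir, "driving_log.csv"), "w", newline="") as f:
            csv.writer(f).writerow(["IMG/center.jpg", angle])
    compiled_path, row_count = merge_driving_logs(data_dir)
    with open(compiled_path, newline="") as f:
        compiled_rows = list(csv.reader(f))
    print(row_count)  # 2
```

Writing the result under a different file name than the per-folder logs keeps the compiled file from being merged into itself on the next run.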

# Model Architecture and Training Strategy

## Has an appropriate model architecture been employed for the task?

> The neural network uses convolution layers with appropriate filter sizes. Layers exist to introduce nonlinearity into the model. The data is normalized in the model.



## Has an attempt been made to reduce overfitting of the model?

> Train/validation/test splits have been used, and the model uses dropout layers or other methods to reduce overfitting.



## Have the model parameters been tuned appropriately?

> Learning rate parameters are chosen with explanation, or an Adam optimizer is used.



## Is the training data chosen appropriately?

> Training data has been chosen to induce the desired behavior in the simulation (i.e. keeping the car on the track).



# Architecture and Training Documentation

## Is the solution design documented?

> The README thoroughly discusses the approach taken for deriving and designing a model architecture fit for solving the given problem.

## Is the model architecture documented?

> The README provides sufficient details of the characteristics and qualities of the architecture, such as the type of model used, the number of layers, the size of each layer. Visualizations emphasizing particular qualities of the architecture are encouraged.

## Is the creation of the training dataset and training process documented?

> The README describes how the model was trained and what the characteristics of the dataset are. Information such as how the dataset was generated and examples of images from the dataset should be included.

The data is assembled from multiple use cases. Because the dataset needs to be easy to extend, we created scripts that rebuild the labels.

```python
import os

print(os.listdir('./data'))
```

```
['track1', 'track2']
```


# Simulation

## Is the car able to navigate correctly on test data?

> No tire may leave the drivable portion of the track surface. The car may not pop up onto ledges or roll over any surfaces that would otherwise be considered unsafe (if humans were in the vehicle).