## Behavioral Cloning Project

This project clones human driving behavior in the simulator provided by Udacity. The goal is to drive one lap around the track autonomously, without leaving the track at any point.

The user first generates training data by driving around the track manually, and that data is then used to train a neural network to imitate the driving patterns. The network takes the images from the simulator's cameras as input and produces steering angles as output.

The model used in this project is based on the NVIDIA model from [here](http://images.nvidia.com/content/tegra/automotive/images/2016/solutions/pdf/end-to-end-dl-using-px.pdf), with some modifications.

The model is structured as follows. The model structure and the preprocessing choices are explained in the relevant sections below.

| Layer | Description | Output Size         
| :-: |:-------------: | :-:
| Input | The input data | (none, 64, 64, 3)
| Normalization (lambda) | Normalize data to -1 to 1 | (none, 64, 64, 3)
| Conv2D (5,5) | Conv #1  with "elu" activation | (none, 60, 60, 24)
| Average pooling (2,2) | Pooling layer | (none, 30, 30, 24)
| Dropout | 50% | (none, 30, 30, 24)
| Conv2D (5,5) | Conv #2 with "elu" activation | (none, 26, 26, 36)
| Average pooling (2,2) | Pooling layer | (none, 13, 13, 36)
| Dropout | 50% | (none, 13, 13, 36)
| Conv2D (3,3) | Conv #3 with "elu" activation | (none, 11, 11, 48)
| Average pooling (2,2) | Pooling layer | (none, 5, 5, 48)
| Dropout | 50% | (none, 5, 5, 48)
| Conv2D (3,3) | Conv #4 with "elu" activation | (none, 3, 3, 64)
| Average pooling (2,2) | Pooling layer | (none, 1, 1, 64)
| Dropout | 50% | (none, 1, 1, 64)
| Flatten | Make fully connected | (none, 64)
| Dense | Fully connected | (none, 50)
| Dropout | 50% | (none, 50)
| Dense | Fully connected | (none, 10)
| Dropout | 50% | (none, 10)
| Dense | Output (steering) | (none, 1)


Instead of the (2,2) strides in the original NVIDIA model, I used (2,2) average pooling layers, and added a dropout layer after each pooling layer and after each hidden dense layer to prevent overfitting. The layer sizes differ from the original model because the input images are 64 by 64 pixels instead of the 66 by 200 used in the paper.
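
The output sizes in the table can be checked with a little arithmetic. The sketch below is not part of the project code; the helper names (`conv_valid`, `pool`, `trace_shapes`) are made up for illustration. It assumes stride-1 "valid" convolutions and non-overlapping 2x2 pooling that drops any leftover row or column.

```python
def conv_valid(size, kernel):
    """Width/height after a stride-1 'valid' convolution."""
    return size - kernel + 1


def pool(size, window=2):
    """Width/height after non-overlapping pooling (partial windows are dropped)."""
    return size // window


def trace_shapes(size=64, kernels=(5, 5, 3, 3), filters=(24, 36, 48, 64)):
    """Return (layer, height, width, depth) for each conv/pool pair in the table."""
    shapes = []
    for kernel, depth in zip(kernels, filters):
        size = conv_valid(size, kernel)
        shapes.append(("conv", size, size, depth))
        size = pool(size)
        shapes.append(("pool", size, size, depth))
    return shapes


for layer, height, width, depth in trace_shapes():
    print(layer, (height, width, depth))
```

The last pooling layer leaves a 1 by 1 by 64 volume, which is why the flatten layer produces only 64 values.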

The same structure can be seen in the plot(model) output from Keras:

![CNN model used for behavioral cloning](model.png "CNN model used for behavioral cloning")

In [1]:
import csv
import cv2
import ntpath
import numpy as np

## First step: import and process the data from the driver in the simulator

For this project, I used the data provided by Udacity and supplemented it with additional data recorded on curves and on the bridge, plus some data showing how to recover from a bad initial position (close to the edge of the road).

First we need to pre-process the data. Ideally this would be done inside the model itself, but due to some difficulties and bugs, in-model preprocessing affected the training data without being applied at run time in the drive.py file. Instead, the pre-processing is done outside the main model and was added to the drive.py function as well.

In the pre-processing phase, three steps are performed. This phase proved to be crucial: changing the model structure had less impact than changing the preprocessing did.

* Convert the image to YUV

    This comes from the NVIDIA model and helps to de-correlate the image's three color channels to make the most of the data we have. The code converts from BGR because that is the channel order `cv2.imread` returns.
    

* Crop image

    This is done to remove the unneeded sky and trees on the horizon, so the model does not spend capacity on them.
    

* Resize image

    This is done to reduce the memory burden of the data and the model.
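
To put a rough number on the memory saving from resizing, the sketch below compares the size of the whole data set before and after the resize step. It is illustrative only: `dataset_bytes` is a hypothetical helper, the sample count is the one reported by the data-loading cell further down, and the 90 by 320 cropped size assumes the simulator frames are 160 by 320 pixels, which is not stated in this write-up.

```python
def dataset_bytes(num_images, height, width, channels=3, bytes_per_value=1):
    """Bytes needed to hold the images as one uint8 array."""
    return num_images * height * width * channels * bytes_per_value


NUM_IMAGES = 47250  # sample count printed after loading all data sets

# Assumption: 160x320 frames, so cropping rows 60:150 leaves 90x320 pixels.
cropped = dataset_bytes(NUM_IMAGES, 90, 320)
resized = dataset_bytes(NUM_IMAGES, 64, 64)

print("cropped only: %.0f MiB" % (cropped / 2**20))
print("cropped and resized: %.0f MiB" % (resized / 2**20))
print("reduction factor: %.2f" % (cropped / resized))
```

Under these assumptions the resize shrinks the in-memory data set by roughly a factor of seven, which matters because all images are loaded into a single NumPy array before training.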


In [2]:
def process_image(img):
	yuvImg = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
	croppedImg = yuvImg[60:150, 0:360]
	return cv2.resize(croppedImg, (64,64))

Since the data originates from both Linux and Windows machines, and the path format differs between these two environments, the process_files function takes the data path and an IsWindows flag as arguments. It also takes a flip argument that determines whether the images should be flipped horizontally. Flipping reduces the dependency on the current track and its bias towards left curves.
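
The path handling can be tried in isolation. The sketch below mirrors the branch used in process_files; the function name `image_filename` and both example paths are hypothetical and only serve as illustration.

```python
import ntpath


def image_filename(source_path, is_windows=False):
    """Extract the image file name from a driving_log.csv path entry."""
    if is_windows:
        # Drop the drive letter, then split on Windows separators.
        _, path_and_file = ntpath.splitdrive(source_path)
        _, filename = ntpath.split(path_and_file)
    else:
        filename = source_path.split('/')[-1]
    return filename


# Hypothetical log entries, one per platform.
windows_path = r"C:\Users\me\sim\IMG\center_2017_01_01_12_00_00_000.jpg"
linux_path = "/home/me/sim/IMG/center_2017_01_01_12_00_00_000.jpg"

print(image_filename(windows_path, is_windows=True))
print(image_filename(linux_path))
```

Since `ntpath` treats both `\` and `/` as separators, `ntpath.basename(path)` alone would likely handle both cases and make the flag unnecessary; the two-branch version is kept here to match the function below.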

Data from all three cameras on the car is used, and a steering correction of 0.25 is applied to the steering angle to compensate for the offset of the left and right cameras.
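
The label arithmetic is easy to get wrong, so here is a minimal sketch of it on its own. `steering_labels` is a hypothetical helper, not part of the project code; it assumes, as the code below does, that the correction is added for the left camera and subtracted for the right camera, and that a horizontal flip negates every angle.

```python
STEERING_CORRECTION = 0.25


def steering_labels(center_angle, flip=False, correction=STEERING_CORRECTION):
    """Return the (center, left, right) steering labels for one log row."""
    left = center_angle + correction    # left camera: nudge back towards the center
    right = center_angle - correction   # right camera: nudge the other way
    labels = (center_angle, left, right)
    if flip:
        # Mirroring the image mirrors the required steering direction.
        labels = tuple(-angle for angle in labels)
    return labels


print(steering_labels(0.5))
print(steering_labels(0.5, flip=True))
```

Note that after flipping, the image from the left camera looks like a right-camera view of the mirrored road, and its label `-(angle + correction)` equals `(-angle) - correction`, which is the right-camera correction applied to the mirrored angle. The two operations are therefore consistent with each other.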

In [3]:
def process_files(IsWindows=False, originalpath='data/', flip=False):
	lines = []
	with open(originalpath + 'driving_log.csv') as csvfile:
		reader = csv.reader(csvfile)
		next(reader)
		for line in reader:
			lines.append(line)
        
	car_camera_images = []
	steering_angles = []

	steering_correction = 0.25

	for line in lines:
		source_path = line[0]
		if IsWindows:
			drive, path_and_file = ntpath.splitdrive(source_path)
			path, filename = ntpath.split(path_and_file)
		else:
			filename = source_path.split('/')[-1]
			
		left_filename = filename.replace('center', 'left')
		right_filename = filename.replace('center', 'right')
		
		center_path = originalpath + 'IMG/' + filename
		left_path = originalpath + 'IMG/' + left_filename
		right_path = originalpath + 'IMG/' + right_filename
		
		image_center = cv2.imread(center_path)
		image_left = cv2.imread(left_path)
		image_right = cv2.imread(right_path)

		if flip:
			# flipCode=1 mirrors left/right (horizontal flip); 0 would turn the image upside down
			image_center = cv2.flip(image_center, 1)
			image_left = cv2.flip(image_left, 1)
			image_right = cv2.flip(image_right, 1)
		
		steering_angle = float(line[3])
		steering_left = steering_angle + steering_correction
		steering_right = steering_angle - steering_correction
		
		if flip:
			steering_angle = -steering_angle
			steering_left = -steering_left
			steering_right = -steering_right

		image_center = process_image(image_center)
		image_left = process_image(image_left)
		image_right = process_image(image_right)
		
		car_camera_images.extend([image_center, image_left, image_right])
		steering_angles.extend([steering_angle, steering_left, steering_right])
		
	return car_camera_images, steering_angles

Now the data is loaded and processed. There are four data sets: the original data from Udacity; MyOwnData3 and MyOwnData4, which focus on curves, the bridge, and recovering from a bad position on the road; and MyOwnData2, which is a normal run as close to the center as possible, loaded flipped.

In [4]:
car_camera_images, steering_angles = process_files(IsWindows=False, originalpath='data/', flip=False)

car_camera_images_windows, steering_angles_windows = process_files(IsWindows=True, originalpath='MyOwnData3/', flip=False)
car_camera_images = car_camera_images + car_camera_images_windows
steering_angles = steering_angles + steering_angles_windows

car_camera_images_windows, steering_angles_windows = process_files(IsWindows=True, originalpath='MyOwnData4/', flip=False)
car_camera_images = car_camera_images + car_camera_images_windows
steering_angles = steering_angles + steering_angles_windows

car_camera_images_windows, steering_angles_windows = process_files(IsWindows=True, originalpath='MyOwnData2/', flip=True)
car_camera_images = car_camera_images + car_camera_images_windows
steering_angles = steering_angles + steering_angles_windows

X_train = np.array(car_camera_images)
y_train = np.array(steering_angles)

print('The shape of the image data is', X_train.shape)

The shape of the image data is (47250, 64, 64, 3)


## Neural Net model 
The model follows the structure presented above and uses "elu" activations instead of "relu", which was chosen to help with small steering values.
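
As a rough illustration of the difference between the two activations, the sketch below implements both as plain Python functions (with `alpha=1.0`, the usual ELU default). It is not the Keras implementation, only the textbook formulas. ReLU maps every negative input to zero, while ELU keeps a small negative output, so units with negative pre-activations still pass some signal and gradient.

```python
import math


def relu(x):
    """Rectified linear unit: zero for negative inputs."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """Exponential linear unit: smooth, bounded below by -alpha."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


for value in (-2.0, -0.5, -0.05, 0.05, 0.5):
    print("x=%5.2f  relu=%6.3f  elu=%6.3f" % (value, relu(value), elu(value)))
```

Whether this is what makes small steering angles easier to learn is not verified here; the comparison only shows where the two functions differ.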

In [None]:
from keras.models import Sequential
from keras.layers import Flatten, Dense, Lambda, Conv2D, Dropout, Cropping2D
from keras.layers.pooling import MaxPooling2D, AveragePooling2D

## Inspired from the NVIDIA model
model = Sequential()
model.add(Lambda(lambda x: (x/127.5) - 1, input_shape = (64,64,3)))
#model.add(Cropping2D(cropping=((60,10),(0,0))))
model.add(Conv2D(24, 5, 5, activation='elu', border_mode='valid'))
model.add(AveragePooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

model.add(Conv2D(36, 5, 5, activation='elu', border_mode='valid'))
model.add(AveragePooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

model.add(Conv2D(48, 3, 3, activation='elu', border_mode='valid'))
model.add(AveragePooling2D(pool_size=(2,2)))
model.add(Dropout(0.5))

model.add(Conv2D(64, 3, 3, activation='elu', border_mode='valid'))
model.add(AveragePooling2D(pool_size=(2,2)))
model.add(Dropout(0.5))

model.add(Flatten())
model.add(Dense(50, activation='elu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='elu'))
model.add(Dropout(0.5))
model.add(Dense(1))


### Compile and train the model
For this model, I chose the Nadam optimizer instead of Adam, as it incorporates Nesterov momentum into the Adam optimizer and is reported to converge faster [[1]](http://cs229.stanford.edu/proj2015/054_report.pdf).

In [None]:
model.compile(loss='mse', optimizer='nadam')
model.fit(X_train, y_train, validation_split=0.2, shuffle=True, nb_epoch=5)
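
One detail worth keeping in mind: according to the Keras documentation, `validation_split` takes the validation samples from the end of the arrays, before any shuffling. Because the data sets above are concatenated in loading order, the validation set would consist of whichever recordings were appended last. A possible remedy, sketched below on small stand-in arrays, is to shuffle images and angles together before calling `fit`. `shuffle_together` is a hypothetical helper and was not part of the original training run.

```python
import numpy as np


def shuffle_together(images, angles, seed=0):
    """Apply the same random permutation to images and steering angles."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(len(images))
    return images[order], angles[order]


# Stand-in data: sample i has "image" [i] and angle i * 0.1.
X_demo = np.arange(10).reshape(10, 1)
y_demo = np.arange(10) * 0.1

X_shuffled, y_shuffled = shuffle_together(X_demo, y_demo)

# Every image still carries its own angle after shuffling.
print(np.allclose(y_shuffled, X_shuffled[:, 0] * 0.1))
```

With the real data this would be `X_train, y_train = shuffle_together(X_train, y_train)` right before `model.fit`.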

### Save and plot the model

In [None]:
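# Keras 1.x location of plot; in Keras 2 the equivalent is keras.utils.plot_model
from keras.utils.visualize_util import plot

model.save('model.h5')
plot(model, to_file='model.png', show_shapes=True, show_layer_names=True)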