# Introduction

In this notebook, I continue working with the model from experiment 1. The model was able to learn the steering angles for the three hand-picked images, but the question is whether it can learn to actually steer the car in the simulator's autonomous mode. Given the discussion about recovery in the project material, the provided sample training data is unlikely to be enough to teach the model to drive, but a test with that data gives at least a baseline to work from.

Here is the overall plan:
1. Recreate the model from experiment 1
1. Create training data using the provided sample data
1. Train the model using the whole training data and see if any learning takes place
1. If needed, tweak the model to get better training performance
1. Test the model with the simulator to see how it performs

Here are some utility functions.

In [17]:
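import os
from PIL import Image

def get_record_and_image(index):
    """Return row `index` of the driving log and its center-camera image.

    Relies on the global DataFrame `df`, which is loaded in Step 2.
    """
    record = df.iloc[index]
    path = os.path.join('data', record.center)
    return record, Image.open(path)

def layer_info(model):
    """Print the name and the input/output shape of every layer in `model`."""
    for n, layer in enumerate(model.layers, 1):
        print('Layer {:2} {:16} input shape {} output shape {}'.format(
            n, layer.name, layer.input_shape, layer.output_shape))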

## Step 1: Recreate the model from experiment 1

This is an exact copy of the model from experiment 1 with one difference: the input image size is halved, because the images will be downscaled this time. The reason for the downscaling is explained in Step 2.

In [25]:
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D

model = Sequential()
model.add(Convolution2D(6, 5, 5, border_mode='valid', subsample=(5, 5), input_shape=(80, 160, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(16, 5, 5, border_mode='valid', subsample=(2, 2)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(120))
model.add(Activation('relu'))
model.add(Dense(84))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('tanh'))

layer_info(model)

Layer  1 convolution2d_15 input shape (None, 80, 160, 3) output shape (None, 16, 32, 6)
Layer  2 activation_36    input shape (None, 16, 32, 6) output shape (None, 16, 32, 6)
Layer  3 maxpooling2d_13  input shape (None, 16, 32, 6) output shape (None, 8, 16, 6)
Layer  4 convolution2d_16 input shape (None, 8, 16, 6) output shape (None, 2, 6, 16)
Layer  5 activation_37    input shape (None, 2, 6, 16) output shape (None, 2, 6, 16)
Layer  6 maxpooling2d_14  input shape (None, 2, 6, 16) output shape (None, 1, 3, 16)
Layer  7 flatten_8        input shape (None, 1, 3, 16) output shape (None, 48)
Layer  8 dense_22         input shape (None, 48) output shape (None, 120)
Layer  9 activation_38    input shape (None, 120) output shape (None, 120)
Layer 10 dense_23         input shape (None, 120) output shape (None, 84)
Layer 11 activation_39    input shape (None, 84) output shape (None, 84)
Layer 12 dense_24         input shape (None, 84) output shape (None, 1)
Layer 13 activation_40    input shape
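
The output shapes in the listing can be checked by hand. With 'valid' borders, the output size along one axis is floor((input - kernel) / stride) + 1, and this applies to both the convolution and the max-pooling layers (assuming the pooling stride equals the pool size, which is the Keras default as far as I know). The helper names below are made up for this check and are not part of Keras.

```python
def valid_output_size(input_size, kernel_size, stride):
    """Output length along one axis when no padding is added ('valid')."""
    return (input_size - kernel_size) // stride + 1

def trace_shapes(height, width):
    """Follow an image through the conv/pool stack of the model above."""
    shapes = []
    # (kernel, stride) per layer:
    # conv 5x5 stride 5, pool 2x2, conv 5x5 stride 2, pool 2x2
    for kernel, stride in [(5, 5), (2, 2), (5, 2), (2, 2)]:
        height = valid_output_size(height, kernel, stride)
        width = valid_output_size(width, kernel, stride)
        shapes.append((height, width))
    return shapes

shapes = trace_shapes(80, 160)
print(shapes)
```

The last pooling layer leaves a 1x3 feature map with 16 channels, which is where the 48 inputs of the first dense layer come from.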

## Step 2: Create training set

In [21]:
import numpy as np
import pandas as pd

df = pd.read_csv('data/driving_log.csv')

Now I need to create the actual training data, X_train and y_train. I simply read all the images and store them as NumPy arrays in X_train. Similarly, I read the corresponding steering angles and store them in y_train.

Note: I ended up scaling the images down to half size to conserve memory and speed up training. This was also mentioned in the project cheat sheet (https://carnd-forums.udacity.com/questions/26214464/behavioral-cloning-cheatsheet).

In [22]:
from tqdm import tqdm

X_train = []
y_train = []
for i in tqdm(range(len(df))):
    record, image = get_record_and_image(i)
    image = image.resize((image.width // 2, image.height // 2))
    X_train.append(np.array(image))
    image.close()
    y_train.append(record['steering'])
    

100%|██████████| 8036/8036 [00:40<00:00, 200.22it/s]
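
To get a rough idea of how much memory the downscaling saves, the sizes can be estimated with a few lines of arithmetic. This is only a sketch: the full image size of 160x320 is inferred from the halved model input of 80x160, and the float64 figure assumes that the normalization step produces NumPy's default float type.

```python
num_images = 8036            # number of images, from the progress bar above
full_shape = (160, 320, 3)   # assumed original size: twice the model input
half_shape = (80, 160, 3)    # size after downscaling

def array_megabytes(count, shape, bytes_per_value):
    """Size in MiB of `count` images of `shape` at the given value width."""
    values = count
    for dim in shape:
        values *= dim
    return values * bytes_per_value / 2**20

for label, shape in [('full size', full_shape), ('half size', half_shape)]:
    as_uint8 = array_megabytes(num_images, shape, 1)    # raw pixel bytes
    as_float64 = array_megabytes(num_images, shape, 8)  # after normalization
    print('{}: {:.0f} MiB as uint8, {:.0f} MiB as float64'.format(
        label, as_uint8, as_float64))
```

Halving both dimensions cuts the pixel count, and therefore the memory use, to a quarter.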


Some preprocessing: normalize the images to the range [-0.5, 0.5] and convert y_train to a NumPy array, because that is what the Keras fit() seems to want. This step takes some time and also consumes a lot of memory; downscaling the images above helps.

In [23]:
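X_train = np.array(X_train)  # stack the list of images into one array
X_min = np.min(X_train)
X_max = np.max(X_train)
X_normalized = (X_train - X_min) / (X_max - X_min) - 0.5  # scale to [-0.5, 0.5]
y_train = np.array(y_train)  # the steering angles collected in the loop above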
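
One possible way to reduce the memory cost of this step is to do the normalization in float32 instead of the default float64. The sketch below uses a small synthetic array in place of the real images, and `normalize_images` is a hypothetical helper, not something used elsewhere in this notebook.

```python
import numpy as np

def normalize_images(images, dtype=np.float32):
    """Scale pixel values linearly to [-0.5, 0.5] using the array's min and max."""
    images = np.asarray(images, dtype=dtype)
    lowest, highest = images.min(), images.max()
    return (images - lowest) / (highest - lowest) - 0.5

# Synthetic stand-in for X_train: four random 80x160 RGB images.
rng = np.random.RandomState(0)
fake_images = rng.randint(0, 256, size=(4, 80, 160, 3)).astype(np.uint8)
fake_normalized = normalize_images(fake_images)
print(fake_normalized.dtype, fake_normalized.min(), fake_normalized.max())
```

When the model is later driven by the simulator, the incoming images would presumably need the same resizing and scaling before they are passed to the model.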

## Step 3: Train the model

Here I use all of the sample training data: 8036 images and their steering angles. Instead of using a training data generator as in experiment 1, I give the whole training set to model.fit and let it split the data into training and validation sets. After training, I save the model so that it can be loaded into the simulator for testing if the training seems to go well.
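
As far as I understand, Keras cuts the validation set from the end of the arrays before any shuffling, so with validation_split=0.2 the last 20% of the driving log ends up as validation data. The sketch below reproduces the resulting split sizes and shows one way the data could be shuffled beforehand. Both helper names are made up for this illustration.

```python
import numpy as np

def tail_split(num_samples, validation_split):
    """Train/validation sizes when the validation set is cut from the end."""
    split_at = int(num_samples * (1.0 - validation_split))
    return split_at, num_samples - split_at

def shuffled_indices(num_samples, seed=0):
    """A random permutation of indices, to shuffle X and y in unison."""
    return np.random.RandomState(seed).permutation(num_samples)

train_size, val_size = tail_split(8036, 0.2)
print(train_size, val_size)

order = shuffled_indices(8036)
# X_normalized[order] and y_train[order] could then be passed to model.fit
# so that the validation set is not just the final stretch of the recording.
```

The sizes agree with the "Train on ... validate on ..." line in the training log of this step.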

In [26]:
model.compile('adam', 'mse')
history = model.fit(X_normalized, y_train, validation_split=0.2, nb_epoch=10, verbose=2)
model.save('model.h5')

Train on 6428 samples, validate on 1608 samples
Epoch 1/10
15s - loss: 0.0129 - val_loss: 0.0123
Epoch 2/10
11s - loss: 0.0102 - val_loss: 0.0115
Epoch 3/10
11s - loss: 0.0092 - val_loss: 0.0116
Epoch 4/10
11s - loss: 0.0088 - val_loss: 0.0105
Epoch 5/10
11s - loss: 0.0082 - val_loss: 0.0106
Epoch 6/10
11s - loss: 0.0077 - val_loss: 0.0100
Epoch 7/10
11s - loss: 0.0074 - val_loss: 0.0112
Epoch 8/10
11s - loss: 0.0071 - val_loss: 0.0109
Epoch 9/10
11s - loss: 0.0069 - val_loss: 0.0111
Epoch 10/10
11s - loss: 0.0064 - val_loss: 0.0109


The validation error does not get much lower after epoch 4, whereas the training error keeps falling. This indicates that the model is overfitting, and its generalization performance is likely poor.
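
The same observation can be read off the numbers directly. The sketch below copies the losses from the training log above, finds the epoch with the lowest validation loss and prints the gap between validation and training loss for each epoch.

```python
# Losses copied from the training log above, epochs 1 to 10.
train_losses = [0.0129, 0.0102, 0.0092, 0.0088, 0.0082,
                0.0077, 0.0074, 0.0071, 0.0069, 0.0064]
val_losses = [0.0123, 0.0115, 0.0116, 0.0105, 0.0106,
              0.0100, 0.0112, 0.0109, 0.0111, 0.0109]

# Epoch numbers are 1-based, list indices are 0-based.
best_epoch = min(range(len(val_losses)), key=lambda i: val_losses[i]) + 1
gaps = [round(v - t, 4) for t, v in zip(train_losses, val_losses)]

print('Lowest validation loss {} at epoch {}'.format(min(val_losses), best_epoch))
print('Validation minus training loss per epoch:', gaps)
```

In the notebook itself, the same lists should be available as history.history['loss'] and history.history['val_loss'], so the values would not need to be copied by hand.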

Let's do a bit of random sampling of the predicted steering angles to get a feeling for how well they match the actual angles.

In [28]:
from random import randrange

for i in range(10):
    index = randrange(len(df))
    X = np.expand_dims(X_normalized[index], axis=0)
    y = y_train[index]
    print('Actual steering angle {} model prediction {}'.format(y, model.predict(X)[0][0]))

Actual steering angle 0.07132844 model prediction 0.07661744207143784
Actual steering angle 0.1765823 model prediction 0.11375436931848526
Actual steering angle -0.11672329999999999 model prediction 0.004589824937283993
Actual steering angle 0.0 model prediction -0.005036316346377134
Actual steering angle 0.3583844 model prediction 0.11509314179420471
Actual steering angle 0.07132844 model prediction 0.07235338538885117
Actual steering angle 0.0 model prediction -0.013521449640393257
Actual steering angle -0.05975719 model prediction -0.08059648424386978
Actual steering angle -0.08824026 model prediction -0.04176517203450203
Actual steering angle -0.002791043 model prediction 0.015210249461233616
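
To put a number on how well these ten samples match, the mean absolute error can be computed from the printed pairs. The values below are copied from the output above and rounded to four decimals. Ten samples are far too few for a reliable estimate, so this is only a quick sanity check.

```python
# (actual, predicted) pairs from the output above, rounded to four decimals.
samples = [
    (0.0713, 0.0766), (0.1766, 0.1138), (-0.1167, 0.0046), (0.0, -0.0050),
    (0.3584, 0.1151), (0.0713, 0.0724), (0.0, -0.0135), (-0.0598, -0.0806),
    (-0.0882, -0.0418), (-0.0028, 0.0152),
]

errors = [abs(actual - predicted) for actual, predicted in samples]
mean_abs_error = sum(errors) / len(errors)
worst_actual, worst_predicted = samples[errors.index(max(errors))]
# Count predictions that steer in the opposite direction to the actual angle.
sign_flips = sum(1 for actual, predicted in samples if actual * predicted < 0)

print('Mean absolute error: {:.4f}'.format(mean_abs_error))
print('Worst sample: actual {} predicted {}'.format(worst_actual, worst_predicted))
print('Predictions with the wrong sign: {}'.format(sign_flips))
```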


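The samples do not look very good: the larger angles in particular are underestimated (0.358 is predicted as 0.115), and -0.117 is predicted as roughly zero. The cell below defines a variant of the model that adds dropout after each convolution and drops the max-pooling layers and the stride of the second convolution.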

In [19]:
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten, Dropout
from keras.layers import Convolution2D, MaxPooling2D

model = Sequential()
model.add(Convolution2D(6, 5, 5, border_mode='valid', subsample=(5, 5), input_shape=(80, 160, 3)))
model.add(Dropout(0.5))
model.add(Activation('relu'))
#model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(16, 5, 5, border_mode='valid'))
model.add(Dropout(0.5))
model.add(Activation('relu'))
#model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(120))
model.add(Activation('relu'))
model.add(Dense(84))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('tanh'))

layer_info(model)

Layer  1 convolution2d_11 input shape (None, 80, 160, 3) output shape (None, 16, 32, 6)
Layer  2 dropout_1        input shape (None, 16, 32, 6) output shape (None, 16, 32, 6)
Layer  3 activation_26    input shape (None, 16, 32, 6) output shape (None, 16, 32, 6)
Layer  4 convolution2d_12 input shape (None, 16, 32, 6) output shape (None, 12, 28, 16)
Layer  5 dropout_2        input shape (None, 12, 28, 16) output shape (None, 12, 28, 16)
Layer  6 activation_27    input shape (None, 12, 28, 16) output shape (None, 12, 28, 16)
Layer  7 flatten_6        input shape (None, 12, 28, 16) output shape (None, 5376)
Layer  8 dense_16         input shape (None, 5376) output shape (None, 120)
Layer  9 activation_28    input shape (None, 120) output shape (None, 120)
Layer 10 dense_17         input shape (None, 120) output shape (None, 84)
Layer 11 activation_29    input shape (None, 84) output shape (None, 84)
Layer 12 dense_18         input shape (None, 84) output shape (None, 1)
Layer 13 activation
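
Removing the pooling layers and the stride of the second convolution makes the flatten output grow from 48 to 5376 values, which has a large effect on the number of trainable parameters. The count below is a back-of-the-envelope sketch based on the layer sizes in the two listings; the helper functions are written for this illustration only.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

def conv_params(in_channels, filters, kernel=5):
    """Weights plus biases of a square-kernel convolution layer."""
    return kernel * kernel * in_channels * filters + filters

def total_params(flatten_size):
    """Parameter count of the conv + dense stack for a given flatten size."""
    return (conv_params(3, 6) + conv_params(6, 16)
            + dense_params(flatten_size, 120)
            + dense_params(120, 84)
            + dense_params(84, 1))

original = total_params(48)    # flatten size in the first model
variant = total_params(5376)   # flatten size in the dropout variant
print(original, variant)
```

Almost all of the additional parameters sit in the first dense layer, which is worth keeping in mind given that there are only 6428 training samples.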