# How to  

1. Create the model in Keras
2. Save it to the "model.h5" file
3. Run "python drive.py model.h5"
4. Run the simulator in autonomous mode
5. Record the run with "python drive.py model.h5 run1"
6. Create the video with "python video.py run1"

### Validating Your Network  

In order to validate your network, you'll want to compare model performance on the training set and a validation set. The validation set should contain image and steering data that was not used for training. A rule of thumb could be to use 80% of your data for training and 20% for validation or 70% and 30%. Be sure to randomly shuffle the data before splitting into training and validation sets.  
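
A minimal sketch of the shuffle-then-split step, using only NumPy. The helper name `shuffle_split`, the `val_fraction` default of 0.2 (the 80/20 rule of thumb), and the fixed seed are assumptions made for illustration; scikit-learn's `train_test_split` does the same job.

```python
import numpy as np


def shuffle_split(images, angles, val_fraction=0.2, seed=0):
    """Shuffle images and steering angles together, then split them.

    Hypothetical helper for illustration; val_fraction=0.2 gives an 80/20 split.
    Returns (train_images, val_images, train_angles, val_angles).
    """
    images = np.asarray(images)
    angles = np.asarray(angles)
    if len(images) != len(angles):
        raise ValueError("images and angles must have the same length")

    # One permutation applied to both arrays keeps each image paired with its angle.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_val = int(round(len(images) * val_fraction))

    val_idx, train_idx = order[:n_val], order[n_val:]
    return images[train_idx], images[val_idx], angles[train_idx], angles[val_idx]
```

Shuffling before the split matters here because consecutive frames from a driving log are nearly identical, so an unshuffled tail of the log would cover only one stretch of the track.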

If model predictions are poor on both the training and validation set (for example, mean squared error is high on both), then this is evidence of underfitting. Possible solutions could be to  

- increase the number of epochs
- add more convolutions to the network

When the model predicts well on the training set but poorly on the validation set (for example, low mean squared error for training set, high mean squared error for validation set), this is evidence of overfitting. If the model is overfitting, a few ideas could be to  

- use dropout or pooling layers
- use fewer convolution or fewer fully connected layers
- collect more data or further augment the data set

Ideally, the model will make good predictions on both the training and validation sets. This means that when the network sees an image, it can predict the steering angle that was applied at that moment.  
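
The underfitting and overfitting checks described above can be sketched as a small helper that compares the final training and validation loss. It takes a dictionary shaped like the `history` attribute of the object returned by Keras `model.fit` (keys `loss` and `val_loss`). The name `diagnose_fit` and the `high_mse` threshold are assumptions chosen for illustration; what counts as a "high" mean squared error depends on your data.

```python
def diagnose_fit(history, high_mse=0.05):
    """Classify a training run from its last-epoch losses.

    `history` is a dict with "loss" and "val_loss" lists, one value per epoch.
    `high_mse` is a hypothetical cutoff for "high" error, not a recommended value.
    """
    train_mse = history["loss"][-1]
    val_mse = history["val_loss"][-1]

    train_high = train_mse > high_mse
    val_high = val_mse > high_mse

    if train_high and val_high:
        return "underfitting"   # poor on both sets
    if not train_high and val_high:
        return "overfitting"    # good on training, poor on validation
    if not train_high and not val_high:
        return "ok"             # good on both sets
    return "inconclusive"       # high training error with low validation error
```

With a trained model this could be called as `diagnose_fit(model.fit(...).history)`, provided a validation set or `validation_split` was passed to `fit` so that `val_loss` is recorded.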

If your model has low mean squared error on the training and validation sets but is driving off the track, this could be because of the data collection process. It's important to feed the network examples of good driving behavior so that the vehicle stays in the center and recovers when getting too close to the sides of the road.  



### Collecting Enough Data  
  
How do you know when you have collected enough data?  
Machine learning involves trying out ideas and testing them to see if they work.   
If the model is overfitting or underfitting, then try to figure out why and adjust accordingly.  

Since this model outputs a single continuous numeric value, one appropriate error metric would be mean squared error.  
If the mean squared error is:  
- **high on both** the training and validation sets: underfitting (add layers or parameters)  
- **low on the training set** but **high on the validation set**: overfitting (collect more data)  
- **low on both** the training and validation sets, but the car still drives off the track: the training data is the likely cause  

Try to figure out the cases where the vehicle is falling off the track. Does it occur only on turns? Then maybe it's important to collect more turning data. The vehicle's driving behavior is only as good as the behavior of the driver who provided the data.  
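
One way to check whether the recorded data contains enough turning examples is to look at how the steering angles are distributed. The sketch below reports the fraction of samples whose absolute steering angle exceeds a cutoff. The helper name `turning_fraction` and the 0.1 cutoff are assumptions for illustration, not values prescribed by the project.

```python
import numpy as np


def turning_fraction(angles, cutoff=0.1):
    """Return the share of samples with |steering angle| above `cutoff`.

    Hypothetical helper; `cutoff` is an illustrative boundary between
    "driving straight" and "turning".
    """
    angles = np.asarray(angles, dtype=float)
    if angles.size == 0:
        return 0.0
    return float(np.mean(np.abs(angles) > cutoff))
```

A very small fraction suggests the log is dominated by straight driving, which would be consistent with a car that only leaves the track on turns.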

Here are some general guidelines for data collection:  

- two or three laps of center lane driving
- one lap of recovery driving from the sides
- one lap focusing on driving smoothly around curves




In [1]:
import cv2
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import csv
%matplotlib inline

# Load Training Data of Normal Driving
lines = []
with open('./DrivingData/driving_log.csv') as csvfile:
    reader = csv.reader(csvfile)
    for line in reader:
        lines.append(line)
        
images = []
measurements = []
for line in lines:
    source_path = line[0]
    filename = source_path.split('/')[-1]
    current_path = './DrivingData/IMG/' + filename
    image = plt.imread(current_path)
    images.append(image)
    
    measurement = float(line[3])
    measurements.append(measurement)

# Load Training Data of Zigzag Recovery Driving
lines_aug = []
with open('./DrivingData_Aug/driving_log.csv') as csvfile:
    reader = csv.reader(csvfile)
    for line in reader:
        lines_aug.append(line)

for line in lines_aug:
    source_path = line[0]
    filename = source_path.split('/')[-1]
    current_path = './DrivingData_Aug/IMG/' + filename
    image = plt.imread(current_path)
    images.append(image)
    
    measurement = float(line[3])
    measurements.append(measurement)

"""
# Augment Curving Data
images_add_curv, measurements_add_curv = [], []
for image, measurement in zip(images, measurements):
    images_add_curv.append(image)
    measurements_add_curv.append(measurement)    
    if (measurement < -0.02 and measurement > -0.1) or (measurement > 0.02 and measurement < 0.1):
        images_add_curv.append(image)
        measurements_add_curv.append(measurement)
"""

# Augment Lateral Data by Horizontal Flip
augmented_images, augmented_measurements = [], []
#for image, measurement in zip(images_add_curv, measurements_add_curv):
for image, measurement in zip(images, measurements):
    augmented_images.append(image)
    augmented_measurements.append(measurement)
    augmented_images.append(cv2.flip(image, 1))    # flipCode=1 mirrors the image left-right (flip around the vertical axis)
    augmented_measurements.append(measurement * -1.0)


augmented_images_ = np.array(augmented_images)
augmented_measurements_ = np.array(augmented_measurements)

# Hold out 10% as a test set (the validation split is taken from the training set in model.fit)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(augmented_images_, augmented_measurements_, test_size=0.10)


print('original           images: ',len(measurements))
#print('original + curving images: ',len(measurements_add_curv))
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)


original           images:  8908
(16034, 160, 320, 3) (16034,)
(1782, 160, 320, 3) (1782,)


In [2]:
from keras.models import Sequential
from keras.layers import Cropping2D, Dense, Dropout, Flatten, Activation, Conv2D, MaxPooling2D, Lambda

model = Sequential()
model.add(Lambda(lambda x: x / 255.0 - 0.5, input_shape=(160, 320, 3)))
model.add(Cropping2D(cropping=((50,20),(0,0))))   # crop 50 rows from the top and 20 from the bottom; keep all columns

model.add(Conv2D(8, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(8, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

"""
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))

model.add(Conv2D(8, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
"""
model.add(Flatten())
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))


model.compile(loss='mse', optimizer='adam')
model.fit(X_train, y_train, validation_split=0.1, shuffle=True, epochs=5, verbose=1)

score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss: {:.4f}'.format(score))

model.save('model.h5')

Using TensorFlow backend.


Train on 14430 samples, validate on 1604 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Test loss: 0.0670
