# Model Training

## Detection Model

The classifier is a support vector machine (SVM). It is trained on the HOG descriptors of each image rather than on the raw pixels, which helps it pick out the distinct shapes of traffic signs.

In [19]:
from sklearn.model_selection import train_test_split,GridSearchCV,StratifiedKFold
from sklearn import svm
from sklearn.decomposition import PCA
from skimage.feature import hog
import numpy as np
import cv2
from sklearn.externals import joblib
import random
from sklearn.metrics import precision_score,recall_score,f1_score,classification_report

### Helper functions for preprocessing data

**extract_hog(image)** extracts HOG features from an image, using 8x8-pixel cells, 1 cell per block, and 8 gradient directions (the `orientations` parameter). For a 32x32 image this gives 4 x 4 cells x 8 orientations = 128 features.

**convert_grayscale(image)** converts an RGB image to grayscale. Images are converted to grayscale before their HOG features are extracted.

In [2]:
def extract_hog(image):
    return hog(image, orientations=8, pixels_per_cell=(8, 8),
                    cells_per_block=(1, 1),visualise=False)

def convert_grayscale(image):
    return cv2.cvtColor(image,cv2.COLOR_RGB2GRAY)
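
The descriptor length implied by these HOG parameters can be checked with a little arithmetic. The sketch below is not part of the original notebook: `hog_feature_length` is a hypothetical helper, and it assumes that blocks slide one cell at a time and that the image size is a multiple of the cell size.

```python
def hog_feature_length(image_shape, orientations=8,
                       pixels_per_cell=(8, 8), cells_per_block=(1, 1)):
    """Length of a HOG descriptor for a 2-D image (blocks step by one cell)."""
    rows, cols = image_shape
    # Number of whole cells along each axis.
    cells_r = rows // pixels_per_cell[0]
    cells_c = cols // pixels_per_cell[1]
    # Number of block positions along each axis.
    blocks_r = cells_r - cells_per_block[0] + 1
    blocks_c = cells_c - cells_per_block[1] + 1
    return (blocks_r * blocks_c
            * cells_per_block[0] * cells_per_block[1] * orientations)


print(hog_feature_length((32, 32)))  # 128
```

With 1 cell per block there is no overlap between blocks, so the descriptor is simply one 8-bin histogram per cell.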

#### Data Preprocessing

The images are loaded from the data files, resized to a uniform 32x32, and converted to grayscale. The first 6000 detection samples are used for training and the rest are held out for testing. HOG features are extracted from all the images and stored in **X_hog**. Because the data is imbalanced, 2000 additional positive samples are drawn from the recognition dataset.

In [3]:
X = np.load('storage/detect_data_file.npy')
y = np.load('storage/detect_labels_file.npy')
X,test_x = X[:6000],X[6000:]
y,test_y = y[:6000],y[6000:]

X_extra = np.load('storage/data_file.npy')
X_extra_format = []

for i in xrange(2000):
    index = random.randint(0,X_extra.shape[0]-1)  # randint includes both endpoints
    image = cv2.cvtColor(X_extra[index],cv2.COLOR_RGB2GRAY)
    image = cv2.resize(image,(32,32))
    X_extra_format.append(image)
    y = np.append(y,1)


X = [convert_grayscale(i) for i in X]
X = X+X_extra_format
X = [extract_hog(i) for i in X]
y = np.array(y)
test_x = [convert_grayscale(i) for i in test_x]
test_x = np.array([extract_hog(i) for i in test_x])
X_hog = np.array(X)
print X_hog.shape

(8000, 128)
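
The extra positives are picked by drawing random indices into the recognition dataset. One detail to keep in mind is that `random.randint(a, b)` includes both endpoints, whereas `random.randrange(n)` excludes `n`, so the latter maps directly onto valid array indices. A minimal sketch, where `sample_indices` is a hypothetical helper and the sizes are made up for illustration:

```python
import random


def sample_indices(n_items, n_samples, seed=0):
    """Draw n_samples indices in [0, n_items), with replacement."""
    rng = random.Random(seed)  # seeded so the draw is repeatable
    return [rng.randrange(n_items) for _ in range(n_samples)]


indices = sample_indices(n_items=100, n_samples=2000)
print(min(indices) >= 0 and max(indices) < 100)  # True
```

Because the draw is with replacement, the same recognition image can appear more than once among the added positives.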


#### Training the classifier
An SVM is trained using grid search (`GridSearchCV`) to find the best parameters for the classifier.
Cross-validation uses stratified k-fold because of the imbalance between positive and negative examples. The SVM is trained with **probability** set to true, because the prediction probability is needed when searching for the best detection in an image.

In [4]:

clf = svm.SVC(random_state=42,probability=True)
cv = StratifiedKFold(n_splits=5)
param_grid = {'kernel':['rbf','linear'],'C':[0.1,0.5,0.001,1]}
grid = GridSearchCV(clf,param_grid,cv=cv)
grid.fit(X_hog,y)
best_clf = grid.best_estimator_
print 'Detection Training Score: '
print grid.best_score_

Detection Training Score: 
0.990125
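
To illustrate why stratification matters with imbalanced classes, the sketch below deals each class's samples round-robin across the folds so that every fold keeps roughly the same class ratio. It is a simplified illustration in plain Python, not scikit-learn's `StratifiedKFold` implementation; `stratified_folds` and the toy label counts are assumptions.

```python
from collections import defaultdict


def stratified_folds(labels, n_splits=5):
    """Assign sample indices to folds so class ratios stay roughly equal."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        # Deal this class's samples round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % n_splits].append(index)
    return folds


# Toy imbalanced labels: 90 negatives and 10 positives (made-up numbers).
labels = [0] * 90 + [1] * 10
folds = stratified_folds(labels)
for fold in folds:
    positives = sum(labels[i] for i in fold)
    print(len(fold), positives)  # each fold: 20 samples, 2 positives
```

A plain unstratified split of the same data could leave some folds with very few positives, which would make the per-fold scores noisy.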


#### Saving the trained SVM classifier in a .sav file

In [6]:
clf_filename = 'storage/clf.sav'
joblib.dump(best_clf, clf_filename)
print 'Classifier Saved Successfully'

Classifier Saved Successfully


## Detection Testing

In [25]:
predictions = best_clf.predict(test_x)

print '\nRecall Score: ' + str(recall_score(test_y,predictions))

print '\nPrecision Score: '+ str(precision_score(test_y,predictions))

print '\nF1 Score: '+ str(f1_score(test_y,predictions))

print '\nClassification Report:\n'
print classification_report(test_y,predictions)



Recall Score: 0.985875706215

Precision Score: 0.961432506887

F1 Score: 0.97350069735

Classification Report:

             precision    recall  f1-score   support

          0       1.00      0.99      1.00      2130
          1       0.96      0.99      0.97       354

avg / total       0.99      0.99      0.99      2484
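
The three scores above are related through the confusion counts for the positive class. The sketch below recomputes them from counts that are inferred from the reported values and the support of 354 positives; the notebook does not print these counts, so treat them as an assumption. `precision_recall_f1` is a hypothetical helper.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 for one class from its confusion counts."""
    precision = tp / float(tp + fp)
    recall = tp / float(tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Inferred (not printed by the notebook): 349 true positives,
# 14 false positives, 5 false negatives; 349 + 5 = 354 positives.
precision, recall, f1 = precision_recall_f1(tp=349, fp=14, fn=5)
print(precision, recall, f1)
```

Under these counts the detector misses 5 of the 354 signs and raises 14 false alarms among the 2130 negatives.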



## Recognition Model

A multi-scale convolutional neural network is used for recognition of the traffic signs.

The CNN has two convolutional stages, each with two convolutional layers. The layers in the first stage have 32 filters each, and those in the second stage have 64. Unlike a traditional CNN, the output of the 1st stage is branched out and fed to the classifier in addition to being the input to the 2nd stage. The branched 1st-stage output is subsampled once more so that it undergoes the same amount of subsampling (4x4) as the 2nd-stage output.

The final classifier comprises three layers:
1. A hidden fully connected layer with 256 neurons.
2. A dropout layer to reduce overfitting.
3. A fully connected output layer with 43 neurons (one for each of the 43 traffic sign categories).
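
To see how the two branches line up, the feature-map sizes can be traced by hand. The sketch below assumes 3x3 kernels with stride 1 and no padding, and non-overlapping 2x2 max pooling, which matches the layer definitions in the model cell of this notebook; `conv_out` and `pool_out` are hypothetical helpers, not Keras functions.

```python
def conv_out(size, kernel=3):
    """Output size of a stride-1 convolution with no padding."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling."""
    return size // pool


size = 32
# Stage 1: two 3x3 convolutions (32 filters), then 2x2 pooling.
stage1 = pool_out(conv_out(conv_out(size)))       # 32 -> 30 -> 28 -> 14
# Branch: the stage-1 output is pooled once more, then flattened.
branch = pool_out(stage1)                         # 14 -> 7
branch_features = 32 * branch * branch            # 1568
# Stage 2: two 3x3 convolutions (64 filters), then 2x2 pooling.
stage2 = pool_out(conv_out(conv_out(stage1)))     # 14 -> 12 -> 10 -> 5
stage2_features = 64 * stage2 * stage2            # 1600
classifier_inputs = branch_features + stage2_features
print(stage1, branch, stage2, classifier_inputs)  # 14 7 5 3168
```

Both branches pass through two 2x2 pooling steps, which is the 4x4 subsampling mentioned above, and under these assumptions they contribute a similar number of features to the classifier.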

In [26]:
import pickle
import numpy

from keras.layers import Input, Dense, Dropout, Flatten, merge
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras.models import Model
from keras.optimizers import SGD
from keras.models import model_from_json

import time


Using Theano backend.


In [101]:
testing_labels = 'datasets/carnd_dataset/test.p'

with open(testing_labels, mode='rb') as f:
    test = pickle.load(f)

X_train = numpy.load('storage/data_file.npy')
y_train = numpy.load('storage/labels_file.npy')
X_test = numpy.load('storage/test_data_file.npy')
y_test = test['labels']

X_train = numpy.array([convert_grayscale(i) for i in X_train])
X_test = numpy.array([convert_grayscale(i) for i in X_test])


print 'Training Data Shape: '
print X_train.shape
print 'Testing Data Shape: '
print X_test.shape
X_test = X_test.reshape(X_test.shape[0],1,32,32)
Y_test = np_utils.to_categorical(y_test,43)



Training Data Shape: 
(39209, 32, 32)
Testing Data Shape: 
(12630, 32, 32)


In [30]:
X_train = X_train.reshape(X_train.shape[0],1,32,32)
X_test = X_test.reshape(X_test.shape[0],1,32,32)
Y_train = np_utils.to_categorical(y_train,43)
Y_test = np_utils.to_categorical(y_test,43)

shuffle_data = [(i,j) for i,j in zip(X_train,Y_train)]

random.shuffle(shuffle_data)
X = []
y = []
shuffle_data = numpy.array(shuffle_data)
print shuffle_data.shape
for i in xrange(len(shuffle_data)):
    X.append(shuffle_data[i][0])
    y.append(shuffle_data[i][1])

X_train = numpy.array(X)
Y_train = numpy.array(y)

print 'Shuffling of data Done'

(39209, 2)
Shuffling of data Done
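
The cell above shuffles images and labels together so that each image keeps its label. Shuffling before training matters here because Keras's `validation_split` takes the validation samples from the tail of the arrays, so an unshuffled, class-ordered dataset would give a validation set containing only a few classes. The sketch below shows the same idea on toy data in plain Python; `shuffle_in_unison` is a hypothetical helper, and the split arithmetic is an illustration rather than Keras's own code.

```python
import random


def shuffle_in_unison(samples, labels, seed=0):
    """Shuffle two equal-length sequences with the same permutation."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    return [samples[i] for i in order], [labels[i] for i in order]


# Toy data: each label is derived from its sample, so alignment is checkable.
samples = list(range(10))
labels = [s * s for s in samples]
shuffled_samples, shuffled_labels = shuffle_in_unison(samples, labels)
print(all(l == s * s for s, l in zip(shuffled_samples, shuffled_labels)))  # True

# Tail split of 39209 samples with a 0.2 validation fraction.
n_total = 39209
n_train = int(n_total * (1 - 0.2))
print(n_train, n_total - n_train)  # 31367 7842
```

The split sizes agree with the "Train on 31367 samples, validate on 7842 samples" line in the training log.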


In [64]:
inputs = Input(shape=(1,32,32))

first_layer = Convolution2D(32,3,3,activation='relu')(inputs)
first_layer = Convolution2D(32,3,3,activation='relu')(first_layer)

first_p_layer = MaxPooling2D(pool_size=(2, 2))(first_layer)
drop_1 = Dropout(0.2)(first_p_layer)

second_p_layer = MaxPooling2D(pool_size=(2, 2))(drop_1)

first_input_layer = Flatten()(second_p_layer)

second_layer = Convolution2D(64,3,3,activation='relu')(drop_1)
second_layer = Convolution2D(64,3,3,activation='relu')(second_layer)

third_p_layer = MaxPooling2D(pool_size=(2, 2))(second_layer)
drop_2 = Dropout(0.2)(third_p_layer)

second_input_layer = Flatten()(drop_2)

input_layer = merge([first_input_layer,second_input_layer],mode='concat',concat_axis=1)
hidden_layer = Dense(256,activation='sigmoid')(input_layer)
drop = Dropout(0.5)(hidden_layer)
predictions = Dense(43,activation='softmax')(drop)

model = Model(input=inputs,output=predictions)

sgd = SGD(lr=0.01,decay=1e-6)

model.compile(optimizer=sgd,
                metrics=['categorical_accuracy'],
                loss='categorical_crossentropy')



model.fit(X_train,Y_train,nb_epoch=30,batch_size=32,verbose=2,validation_split=0.2)
time.sleep(0.1)

Train on 31367 samples, validate on 7842 samples
Epoch 1/30
158s - loss: 2.6437 - categorical_accuracy: 0.3334 - val_loss: 1.2158 - val_categorical_accuracy: 0.7335
Epoch 2/30
156s - loss: 0.9633 - categorical_accuracy: 0.7697 - val_loss: 0.5565 - val_categorical_accuracy: 0.8870
Epoch 3/30
158s - loss: 0.5547 - categorical_accuracy: 0.8765 - val_loss: 0.3287 - val_categorical_accuracy: 0.9480
Epoch 4/30
159s - loss: 0.3764 - categorical_accuracy: 0.9228 - val_loss: 0.2476 - val_categorical_accuracy: 0.9651
Epoch 5/30
159s - loss: 0.2864 - categorical_accuracy: 0.9447 - val_loss: 0.1732 - val_categorical_accuracy: 0.9737
Epoch 6/30
558s - loss: 0.2298 - categorical_accuracy: 0.9546 - val_loss: 0.1444 - val_categorical_accuracy: 0.9769
Epoch 7/30
160s - loss: 0.1936 - categorical_accuracy: 0.9633 - val_loss: 0.1226 - val_categorical_accuracy: 0.9784
Epoch 8/30
161s - loss: 0.1639 - categorical_accuracy: 0.9700 - val_loss: 0.0987 - val_categorical_accuracy: 0.9839
Epoch 9/30
161s - loss:

In [103]:
print "Saving..."
model_json = model.to_json()
with open("storage/model.json", "w") as json_file:
    json_file.write(model_json)
model.save_weights("storage/model.h5")
print("Saved model to disk")

Saving...
Saved model to disk


## Recognition Testing

In [107]:
with open('storage/model.json', 'r') as json_file:
    loaded_model_json = json_file.read()
loaded_model = model_from_json(loaded_model_json)
loaded_model.load_weights("storage/model.h5")
print("Loaded model from disk\n")
print ("Testing: ")
sgd = SGD(lr=0.01,decay=1e-6)

loaded_model.compile(optimizer=sgd,
                metrics=['categorical_accuracy','precision','recall','fbeta_score'],
                loss='categorical_crossentropy')

score = loaded_model.evaluate(X_test,Y_test,verbose=1)
print '\nCategorical Accuracy Score : '+str(score[1])
print 'Precision Score : '+str(score[2])
print 'Recall Score : '+str(score[3])
print 'F1 Score : '+str(score[4])


Loaded model from disk

Testing: 

Categorical Accuracy Score : 0.971654790154
Precision Score : 0.984920771275
Recall Score : 0.962549485371
F1 Score : 0.973406097706
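
As a sanity check, the harmonic mean of the reported precision and recall can be compared with the reported F1 score. The two differ slightly. One plausible explanation (an assumption, not verified against this run) is that these Keras metrics are computed per batch and then averaged, so the reported F1 is a mean of per-batch scores rather than a score computed from the overall precision and recall.

```python
# Values copied from the evaluation output above.
precision = 0.984920771275
recall = 0.962549485371
reported_f1 = 0.973406097706

# F1 as the harmonic mean of the two aggregate scores.
harmonic_mean = 2 * precision * recall / (precision + recall)
print(round(harmonic_mean, 4), round(reported_f1, 4))  # 0.9736 0.9734
```

If exact dataset-level scores are needed, they are better computed from the full set of predictions than from batch averages.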
