# Detect

This notebook contains the functions needed to locate a traffic sign in an image. It slides a window over the image and keeps the window with the highest predicted probability of containing a sign.

In [2]:
import numpy as np
from skimage.feature import hog
from skimage import io
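try:
    import joblib  # standalone package, required by recent scikit-learn releases
except ImportError:
    from sklearn.externals import joblib  # bundled copy in older scikit-learn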
import cv2
from skimage.transform import pyramid_gaussian
import time
from keras.models import model_from_json
import matplotlib.patches as patches
import matplotlib.pyplot as plt

Using Theano backend.


### Helper Functions

#### This function implements the sliding window

The window has the shape given by **windowSize** and moves with a stride of **stepSize**. The top 3/16 and the bottom 1/4 of the image are not searched, as it has been found that 98% of the traffic signs do not lie in those regions. Skipping them speeds up the search.

In [3]:
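def sliding_window(img,windowSize,stepSize):
    # Search only the band between 3/16 and 3/4 of the image height.
    # Integer division keeps the bounds usable as range() arguments.
    y_low = 3*img.shape[0]//16
    y_high = 3*img.shape[0]//4
    for y in range(y_low,y_high,stepSize):
        for x in range(0,img.shape[1],stepSize):
            # windowSize is (width, height); windows at the border may be smaller
            yield (x,y,img[y:y + windowSize[1],x:x + windowSize[0]])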
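
To get a feel for how much work the restricted band saves, the sketch below reproduces only the coordinate logic of `sliding_window` in plain Python and counts the window positions. The helper name `window_positions` and the 1360x800 frame size are assumptions made for illustration; they are not taken from this notebook.

```python
def window_positions(height, width, window_size, step_size, restrict_band=True):
    """Yield the (x, y) top-left corner of every full-size window.

    Mirrors the coordinate logic of sliding_window, but works on sizes only.
    """
    win_w, win_h = window_size
    # Same band as sliding_window: rows from 3/16 to 3/4 of the image height
    y_low = 3 * height // 16 if restrict_band else 0
    y_high = 3 * height // 4 if restrict_band else height
    for y in range(y_low, y_high, step_size):
        for x in range(0, width, step_size):
            # Skip windows that would run past the image border
            if x + win_w <= width and y + win_h <= height:
                yield (x, y)


# Hypothetical frame size, chosen only for illustration
banded = list(window_positions(800, 1360, (32, 32), 10))
full = list(window_positions(800, 1360, (32, 32), 10, restrict_band=False))
print(len(banded), len(full))
```

Under these assumed sizes the banded search visits a little over 40% fewer full-size windows than a search over the whole frame.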


#### This function extracts HOG features

HOG features are extracted using 8x8-pixel cells and one cell per block. Gradients are binned into 8 directions, as set by the **orientations** parameter.

In [4]:
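def hog_extract(img):
    # The visualisation flag is left at its default (off), so only the
    # feature vector is returned; its keyword name differs between
    # scikit-image releases, so it is not passed explicitly.
    return hog(img, orientations=8, pixels_per_cell=(8, 8),
                    cells_per_block=(1, 1))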
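
The length of the resulting feature vector follows from these parameters. The sketch below works it out for a 32x32 window, assuming the usual HOG layout in which blocks slide one cell at a time; `hog_feature_length` is a hypothetical helper, not part of scikit-image.

```python
def hog_feature_length(image_shape, orientations=8,
                       pixels_per_cell=(8, 8), cells_per_block=(1, 1)):
    """Number of values in a flattened HOG descriptor (hypothetical helper)."""
    rows, cols = image_shape
    # Whole cells that fit along each axis
    cells_r = rows // pixels_per_cell[0]
    cells_c = cols // pixels_per_cell[1]
    # Blocks slide one cell at a time, so this many fit along each axis
    blocks_r = cells_r - cells_per_block[0] + 1
    blocks_c = cells_c - cells_per_block[1] + 1
    return (blocks_r * blocks_c
            * cells_per_block[0] * cells_per_block[1] * orientations)


# 4 x 4 blocks, 1 cell each, 8 orientations
print(hog_feature_length((32, 32)))
```

With the settings used here, each 32x32 window should therefore yield 128 values, which is the row length produced by `reshape(1, -1)` before the vector is passed to the classifier.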

#### Function for predicting sign in given window

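The CNN architecture is loaded from 'storage/model.json' and its weights from 'storage/model.h5', where they have already been saved. The model takes **img**, reshaped to a single 1x32x32 input, and outputs the predicted class probabilities. The index of the most probable class is returned together with its label.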

In [5]:
def predict_sign(img):

    img = img.reshape(1,1,32,32)

    # Read the saved architecture; the file is closed automatically
    with open('storage/model.json', 'r') as json_file:
        loaded_model_json = json_file.read()
    loaded_model = model_from_json(loaded_model_json)
    # load weights into new model
    loaded_model.load_weights("storage/model.h5")
    print("Loaded model from disk")

    loaded_model.compile(optimizer='sgd',
                    metrics=['categorical_accuracy'],
                    loss='categorical_crossentropy')
    ans = np.argmax(loaded_model.predict(img))
    labels = np.load('storage/classes_file.npy')
    return (ans,labels[ans])
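
The last two lines of `predict_sign` amount to an arg-max followed by a label lookup. A dependency-free sketch of that step is shown below; the probabilities and class names are made up for illustration and do not come from the saved model or from 'storage/classes_file.npy'.

```python
def most_probable(probabilities, labels):
    """Return (index, label) of the highest-probability class."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return best, labels[best]


# Hypothetical model output for one window and hypothetical class names
probs = [0.05, 0.80, 0.15]
names = ['stop', 'yield', 'speed limit']
print(most_probable(probs, names))
```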

#### This function moves the sliding window and finds the best window

A Gaussian image pyramid is created for each image. Each layer is downscaled from the previous one by the factor **scale**, and **max_layer** sets how many downscaled layers are generated in addition to the original image. A sliding window of shape **winW** x **winH** is used to find the best prediction.
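
To make the layer sizes concrete, the sketch below lists the shapes such a pyramid would produce. It assumes each side is divided by the scale and rounded up, and it uses a hypothetical 1360x800 input frame; `pyramid_shapes` is an illustrative helper, not a scikit-image function.

```python
import math


def pyramid_shapes(height, width, scale=1.2, max_layer=2):
    """Return the (height, width) of each pyramid layer, original first."""
    shapes = [(height, width)]
    for _ in range(max_layer):
        prev_h, prev_w = shapes[-1]
        # Assumption: each side is divided by the scale and rounded up
        shapes.append((int(math.ceil(prev_h / scale)),
                       int(math.ceil(prev_w / scale))))
    return shapes


# Hypothetical 1360x800 input frame
for layer, shape in enumerate(pyramid_shapes(800, 1360)):
    print(layer, shape)
```

Because the window stays at 32x32 while the image shrinks, each later layer effectively searches for larger signs in the original frame.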

In [6]:
def prediction(img,clf):

    best_window = None
    scale = 1.2
    (winW, winH) = (32, 32)
    best_pred = 0
    best_x,best_y = 0,0

    # Create a three-layer image pyramid (the original plus two layers), each downscaled by a factor of 1.2
    for (i, resized) in enumerate(pyramid_gaussian(img,max_layer=2, downscale=scale)):

        # Break if scaled image is too small
        if resized.shape[0] < 30 or resized.shape[1] < 30:
            break


        for (x, y, window) in sliding_window(resized, stepSize=10, windowSize=(winW, winH)):

            if window.shape[0] != winH or window.shape[1] != winW:
                continue

            # A prediction is made for every window generated by the sliding window with confidence of prediction
            window = window.astype('float32')
            window_hog = hog_extract(cv2.cvtColor(window,cv2.COLOR_RGB2GRAY))
            window_hog = window_hog.reshape(1, -1)
            pred = clf.predict_proba(window_hog)
            if pred[0][1] > best_pred:
                clone = resized.copy()
                best_x = x
                best_y = y
                best_pred = pred[0][1]
                best_window = cv2.cvtColor(window,cv2.COLOR_RGB2GRAY)
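    # Nothing to draw or classify if no window scored above zero
    if best_window is None:
        print('No candidate window found')
        return
    cv2.startWindowThread()
    cv2.rectangle(clone, (best_x, best_y), (best_x + winW, best_y + winH), (0, 255, 0), 2)
    cv2.imshow('Window',clone)
    cv2.waitKey(15)
    cv2.destroyWindow('Window')
    print('The prediction is : ')
    print(predict_sign(best_window))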
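
Note that `best_x` and `best_y` are expressed in the coordinates of whichever pyramid layer produced the best window, and the rectangle is drawn on that layer. If the box is needed on the original image, one possible approach is to scale it back by the layer's cumulative downscale factor. The sketch below assumes the layer index is tracked alongside the best window, which the function above does not do, and the result is approximate because layer sizes are rounded. The helper name `to_original_coords` is hypothetical.

```python
def to_original_coords(x, y, win_w, win_h, layer, scale=1.2):
    """Approximate a layer-space box in original-image pixels."""
    # Layer i has been shrunk i times, so undo that with scale ** i
    factor = scale ** layer
    return (int(round(x * factor)), int(round(y * factor)),
            int(round(win_w * factor)), int(round(win_h * factor)))


# A 32x32 window found at (100, 50) on the first downscaled layer
print(to_original_coords(100, 50, 32, 32, layer=1))
```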


The image is loaded from **file_path**, and detection and recognition are run on it.

In [7]:
file_path = 'datasets/FullIJCNN2013/00699.ppm'
img = cv2.imread(file_path)
clf = joblib.load('storage/clf.sav')
prediction(img,clf)