### Image Classification using SVM
Use the SVM algorithm to classify images as cars or non-cars.

![title](svm_class.jpg)

In [51]:
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import cv2
import glob
import time
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

### Load Data
Load the images from a pre-defined dataset in which the images with a car are already separated from those without.

In [52]:
# Read in car and non-car images
cars = glob.glob('./vehicles/cars*/*.jpeg')
notcars = glob.glob('./non-vehicles/notcars*/*.jpeg')

print('Car Images: {}'.format(len(cars)))
print('Non-Car Images: {}'.format(len(notcars)))

Car Images: 1196
Non-Car Images: 1125


### Extract Features
Extract a color histogram for each image:

* Convert the image from RGB to HSV
* Extract a color histogram for each color channel
* Combine the histograms of the different color channels
* Normalize the combined histogram


In [53]:
# Define a function to compute color histogram features  
def color_hist(img_rgb, nbins=32, bins_range=(0, 256)):
    # Convert from RGB to HSV using cv2.cvtColor()
    img_hsv = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2HSV)

    # Take a histogram of each of the three HSV channels
    ch_1 = np.histogram(img_hsv[:,:,0], bins=nbins, range=bins_range)
    ch_2 = np.histogram(img_hsv[:,:,1], bins=nbins, range=bins_range)
    ch_3 = np.histogram(img_hsv[:,:,2], bins=nbins, range=bins_range)

    # Generate bin centers (not used in the feature vector, but handy for plotting)
    bin_edges = ch_1[1]
    bin_centers = (bin_edges[1:] + bin_edges[:-1]) / 2

    # Concatenate the histograms into a single feature vector
    hist_features = np.concatenate((ch_1[0], ch_2[0], ch_3[0])).astype(np.float64)
    # Normalize the result so that the entries sum to 1
    norm_features = hist_features / np.sum(hist_features)
    
    # Return the feature vector
    return norm_features
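
As a quick sanity check of the histogram steps, here is a minimal NumPy-only sketch. It assumes the image is already in HSV, so it skips the `cv2.cvtColor()` call; the helper name `hsv_hist_features` and the synthetic image are hypothetical and only serve as an illustration. It also shows a side effect of using `bins_range=(0, 256)` on every channel: for 8-bit images OpenCV stores hue in the range 0 to 179, so the upper hue bins stay empty.

```python
import numpy as np

def hsv_hist_features(img_hsv, nbins=32, bins_range=(0, 256)):
    """Hypothetical helper: per-channel histograms, concatenated and normalized."""
    counts = [
        np.histogram(img_hsv[:, :, c], bins=nbins, range=bins_range)[0]
        for c in range(3)
    ]
    feats = np.concatenate(counts).astype(np.float64)
    return feats / feats.sum()

rng = np.random.default_rng(0)
# Synthetic 64x64 "HSV" image: hue in [0, 180), saturation and value in [0, 256)
img = np.dstack([
    rng.integers(0, 180, size=(64, 64)),
    rng.integers(0, 256, size=(64, 64)),
    rng.integers(0, 256, size=(64, 64)),
]).astype(np.uint8)

feats = hsv_hist_features(img, nbins=12)
print(len(feats))     # 3 channels * 12 bins = 36
print(feats.sum())    # approximately 1.0 after normalization
print(feats[9:12])    # hue bins covering values of 192 and above are all zero
```

If the empty hue bins matter for a given application, one option is to histogram the hue channel over its own range instead of sharing one range across all three channels.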

In [54]:
# Define a function to extract features from a list of images
# Have this function call color_hist()
def extract_features(imgs, hist_bins=32, hist_range=(0, 256)):
    # Create a list to append feature vectors to
    features = []
    # Iterate through the list of images
    for file in imgs:
        # Read in each one by one
        image = mpimg.imread(file)
        # Apply color_hist() 
        hist_features = color_hist(image, nbins=hist_bins, bins_range=hist_range)
        # Append the new feature vector to the features list
        features.append(hist_features)
    # Return list of feature vectors
    return features

### Hyperparameters of the SVM and Test Train Set
Shuffle the images so that runs of contiguous, similar images do not bias the result, then split the dataset into a training set and a test set.

The training set is used to train the model; the test set is used to evaluate the model on data it has not seen before.

In [58]:
# Number of histogram bins per channel; vary this to see how the classifier
# performs under different binning scenarios
histbin = 12  # value used for the output shown below

car_features = extract_features(cars, hist_bins=histbin, hist_range=(0, 256))
notcar_features = extract_features(notcars, hist_bins=histbin, hist_range=(0, 256))

# Create an array stack of feature vectors
X = np.vstack((car_features, notcar_features)).astype(np.float64)                        
# Fit a per-column scaler
X_scaler = StandardScaler().fit(X)
# Apply the scaler to X
scaled_X = X_scaler.transform(X)

# Define the labels vector
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))


# Pick a random seed; train_test_split shuffles the samples before splitting
rand_state = np.random.randint(0, 100)
# Split up data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(scaled_X, y, test_size=0.2, random_state=rand_state)

print('Dataset includes', len(cars), 'cars and', len(notcars), 'not-cars')
print('Using', histbin,'histogram bins')
print('Feature vector length:', len(X_train[0]))

Dataset includes 1196 cars and 1125 not-cars
Using 12 histogram bins
Feature vector length: 36
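
The cell above fits `StandardScaler` on the full feature matrix before splitting, so statistics from the test images influence the scaling. A common alternative is to split first and fit the scaler on the training portion only. The following is a minimal sketch of that ordering, assuming synthetic data in place of the real features; the `*_demo` and `*_tr`/`*_te` names are illustrative, and the fixed `random_state` and `stratify` argument are choices made for this sketch, not part of the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for the feature matrix: 200 samples, 36 features
X_demo = rng.random((200, 36))
y_demo = np.hstack((np.ones(100), np.zeros(100)))

# Split first; stratify keeps the class ratio equal in both subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Fit the scaler on the training data only, then apply it to both subsets
scaler = StandardScaler().fit(X_tr)
X_tr_s = scaler.transform(X_tr)
X_te_s = scaler.transform(X_te)

print(X_tr_s.shape, X_te_s.shape)
```

With this ordering the test set stays unseen until evaluation, at the cost of the test features no longer having exactly zero mean and unit variance.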


### Train the SVM Model
Using the training set extracted in the step above, train an SVM model.

In [59]:
# Use an SVC with an RBF kernel
svc = SVC(kernel='rbf')

# Check the training time for the SVC
t = time.time()
svc.fit(X_train, y_train)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to train SVC...')

0.03 Seconds to train SVC...
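
The model above uses the default `C` and `gamma` of the RBF kernel. One way to explore these hyperparameters is a cross-validated grid search. The sketch below assumes synthetic data from `make_classification` instead of the car features, and the grid values are arbitrary examples rather than recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic two-class problem with the same feature count as above (assumption)
X_demo, y_demo = make_classification(n_samples=200, n_features=36, random_state=0)

# Candidate values for the regularization strength and the RBF kernel width
param_grid = {'C': [0.1, 1, 10], 'gamma': ['scale', 0.01]}

# 3-fold cross-validation over every combination in the grid
search = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=3)
search.fit(X_demo, y_demo)

print(search.best_params_)
print(round(search.best_score_, 3))
```

On the real data, the search would be run on `X_train` and `y_train` only, so that the test set remains untouched until the final evaluation.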


### Evaluate the Model
Evaluate the model by comparing its predictions on the test set with the correct labels.

In [60]:
# Check the score of the SVC
print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4))
# Check the prediction time for a small batch of samples
t=time.time()
n_predict = 10
print('My SVC predicts: ', svc.predict(X_test[0:n_predict]))
print('For these',n_predict, 'labels: ', y_test[0:n_predict])
t2 = time.time()
print(round(t2-t, 5), 'Seconds to predict', n_predict,'labels with SVC')

Test Accuracy of SVC =  0.9978
My SVC predicts:  [1. 0. 1. 0. 0. 0. 1. 1. 0. 1.]
For these 10 labels:  [1. 0. 1. 0. 0. 0. 1. 1. 0. 1.]
0.00299 Seconds to predict 10 labels with SVC
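
For a classifier, `svc.score()` returns the mean accuracy, which is the fraction of predictions that match the true labels. As a small worked example, the sketch below recomputes that fraction by hand for the ten predictions printed above; the array names are illustrative.

```python
import numpy as np

# The ten predictions and true labels printed in the output above
y_pred = np.array([1., 0., 1., 0., 0., 0., 1., 1., 0., 1.])
y_true = np.array([1., 0., 1., 0., 0., 0., 1., 1., 0., 1.])

# Accuracy is the fraction of positions where prediction and label agree
accuracy = np.mean(y_pred == y_true)
print(accuracy)  # 1.0 for these ten samples

# Count the errors of each kind (car = 1, non-car = 0)
false_positives = int(np.sum((y_pred == 1) & (y_true == 0)))
false_negatives = int(np.sum((y_pred == 0) & (y_true == 1)))
print(false_positives, false_negatives)  # 0 0
```

Splitting the errors into false positives and false negatives gives a more detailed picture than a single accuracy number when the two kinds of mistake have different costs.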
