## Color Classify

Now we'll try training a classifier on our dataset. First, we'll see how it does using only spatially binned color and color histogram features.

To do this, we'll use the functions you defined in previous exercises, namely, `bin_spatial()`, `color_hist()`, and `extract_features()`. We'll then read in our car and non-car images, extract the color features for each, and scale the feature vectors to zero mean and unit variance.

**All that remains is to define a labels vector, shuffle and split the data into training and testing sets, and finally, define a classifier and train it!**

In [3]:
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import cv2
import glob
import time
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler

# NOTE: the next import is valid
# for scikit-learn version >= 0.18
# if you are using scikit-learn <= 0.17 then use this instead:
# from sklearn.cross_validation import train_test_split
from sklearn.model_selection import train_test_split

In [8]:
# Define a function to compute binned color features
def bin_spatial(img, size=(32,32)):
    # Use cv2.resize().ravel() to create the feature vector
    features = cv2.resize(img, size).ravel()
    return features

# Define a function to compute color histogram features
def color_hist(img, nbins=32, bins_range=(0, 256)):
    # Compute the histogram of the color channels separately
    channel1_hist = np.histogram(img[:,:,0], bins=nbins, range=bins_range)
    channel2_hist = np.histogram(img[:,:,1], bins=nbins, range=bins_range)
    channel3_hist = np.histogram(img[:,:,2], bins=nbins, range=bins_range)
    
    # Concatenate the histograms into a single feature vector
    hist_features = np.concatenate((channel1_hist[0], channel2_hist[0], channel3_hist[0]))
    
    return hist_features

# Define a function to extract features from a list of images
def extract_features(imgs, cspace='RGB', spatial_size=(32, 32),
                        hist_bins=32, hist_range=(0, 256)):
    # Create a list to append feature vectors to
    features = []
    # Iterate through the list of images
    for file in imgs:
        # Read in each one by one
        img = mpimg.imread(file)
        
        # Apply color conversion if other than 'RGB'
        if cspace != 'RGB':
            if cspace == 'HSV':
                feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
            elif cspace == 'LUV':
                feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2LUV)
            elif cspace == 'HLS':
                feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2HLS)
            elif cspace == 'YUV':
                feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)
            elif cspace == 'YCrCb':
                feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2YCrCb)
            else:
                # Fail loudly instead of silently reusing a stale feature_image
                raise ValueError('Unsupported cspace: ' + str(cspace))
        else:
            feature_image = np.copy(img)
            
        # Apply bin_spatial() to get spatial color features
        spatial_features = bin_spatial(feature_image, size=spatial_size)
        # print('Spatial Features: ', spatial_features)

        # Apply color_hist() to get color histogram features
        hist_features = color_hist(feature_image, nbins=hist_bins, bins_range=hist_range)
        # print('Histogram Features:', hist_features)
        
        # Append the new feature vector to the features list
        features.append(np.concatenate((spatial_features,hist_features)))
    # Return list of feature vectors
    return features
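
A note on why `color_hist()` indexes each result with `[0]`: `np.histogram` returns a tuple of (counts, bin edges), and only the counts belong in the feature vector. The following is a minimal sketch on a synthetic 4x4 image; the array `demo_img` is made up for illustration and is not part of the dataset.

```python
import numpy as np

# Hypothetical 4x4 RGB image: red channel saturated, green and blue empty
demo_img = np.zeros((4, 4, 3), dtype=np.uint8)
demo_img[:, :, 0] = 255

# np.histogram returns (counts, bin_edges); there is one more edge than bins
counts, edges = np.histogram(demo_img[:, :, 0], bins=32, range=(0, 256))
print(len(counts), len(edges))

# All 16 red pixels have value 255, so they fall into the last bin
print(counts[-1])

# Concatenating the counts of the three channels gives 3 * nbins features
hist_features = np.concatenate(
    [np.histogram(demo_img[:, :, c], bins=32, range=(0, 256))[0] for c in range(3)]
)
print(hist_features.shape)
```

Note that the `(0, 256)` range assumes pixel values on a 0 to 255 scale, which is what `mpimg.imread` returns for JPEG files.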

In [9]:
cars = glob.glob('vehicles_smallset/**/*.jpeg')
notcars = glob.glob('non-vehicles_smallset/**/*.jpeg')
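
One detail worth checking with these patterns: in Python's `glob`, `**` only descends through nested folders when `recursive=True` is passed; otherwise it behaves like a single `*` and matches exactly one directory level. The sketch below shows the difference on a throwaway directory tree. The folder and file names are invented for the demonstration and say nothing about how the dataset is actually laid out.

```python
import glob
import os
import tempfile

# Build a temporary tree (hypothetical layout, for illustration only):
#   root/a/one.jpeg
#   root/a/b/two.jpeg
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'a', 'b'))
for rel in ('a/one.jpeg', 'a/b/two.jpeg'):
    open(os.path.join(root, *rel.split('/')), 'w').close()

pattern = os.path.join(root, '**', '*.jpeg')

# Without recursive=True, '**' acts like '*': exactly one directory level
shallow = glob.glob(pattern)

# With recursive=True, '**' matches zero or more directory levels
deep = glob.glob(pattern, recursive=True)

print(len(shallow), len(deep))
```

If the image folders are nested more than one level deep, passing `recursive=True` may be needed to pick up every file.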

In [56]:
spatial = 20
histbin = 32

car_features = extract_features(cars, cspace='RGB', spatial_size=(spatial, spatial),
                        hist_bins=histbin, hist_range=(0, 256))
notcar_features = extract_features(notcars, cspace='RGB', spatial_size=(spatial, spatial),
                        hist_bins=histbin, hist_range=(0, 256))

In [57]:
# Create an Array Stack of Feature Vectors
X = np.vstack((car_features, notcar_features)).astype(np.float64)

# Fit a per-column scaler
X_scaler = StandardScaler().fit(X)

# Apply the scaler to X
scaled_X = X_scaler.transform(X)

# Define the Labels Vector
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))

In [58]:
# Split the data into randomized training and test sets
rand_state = np.random.randint(0, 100)
X_train, X_test, y_train, y_test = train_test_split(scaled_X, y, test_size=0.2, random_state=rand_state)
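
The cells above fit the scaler on the full dataset before splitting, so statistics from the test samples influence the scaling. A common alternative is to split first and fit `StandardScaler` on the training portion only. The sketch below shows that ordering on synthetic data rather than the car images; `X_demo`, `y_demo`, and the other names are placeholders introduced for this illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature matrix: 200 samples, 12 features
rng = np.random.RandomState(0)
X_demo = rng.uniform(0, 255, size=(200, 12))
y_demo = np.hstack((np.ones(100), np.zeros(100)))

# Split the raw (unscaled) features first
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler().fit(X_tr)
X_tr_scaled = scaler.transform(X_tr)
X_te_scaled = scaler.transform(X_te)

# Training columns are now zero mean and unit variance (up to rounding);
# the test columns are only approximately so, which is expected
print(np.allclose(X_tr_scaled.mean(axis=0), 0.0))
print(np.allclose(X_tr_scaled.std(axis=0), 1.0))
```

Whether this changes the accuracy noticeably on this dataset is not something the notebook measures; the point is only that the test set stays untouched until evaluation.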

In [59]:
print('Using spatial binning of:', spatial, 'and', histbin, 'histogram bins.')

Using spatial binning of: 20 and 32 histogram bins.


In [60]:
print('Feature Vector Length:', len(X_train[0]))

Feature Vector Length: 1296
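
The reported length follows from the feature layout: the spatial part contributes `spatial * spatial * 3` values and the histogram part contributes `histbin * 3`. A quick check of that arithmetic, assuming 3-channel images (the helper `feature_vector_length` is hypothetical and exists only for this illustration):

```python
def feature_vector_length(spatial, histbin, channels=3):
    """Length of the concatenated spatial + histogram feature vector."""
    spatial_len = spatial * spatial * channels   # resized image, flattened
    hist_len = histbin * channels                # one histogram per channel
    return spatial_len + hist_len

# (spatial, histbin) settings tried in this notebook
for spatial, histbin in [(32, 32), (20, 20), (10, 10), (20, 32)]:
    print(spatial, histbin, feature_vector_length(spatial, histbin))
```

For `spatial = 20` and `histbin = 32` this gives 1200 + 96 = 1296, matching the output above.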


### Support Vector Machines:

In [36]:
# Use a Linear SVC
svc = LinearSVC()

# Check the training time for the SVC
t = time.time()
svc.fit(X_train, y_train)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to train SVC...')

# Check the score of the SVC
print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4))

# Check the prediction time for n_predict samples
t=time.time()
n_predict = 10
print('My SVC predicts: ', svc.predict(X_test[0:n_predict]))
print('For these',n_predict, 'labels: ', y_test[0:n_predict])
t2 = time.time()
print(round(t2-t, 5), 'Seconds to predict', n_predict,'labels with SVC')

0.41 Seconds to train SVC...
Test Accuracy of SVC =  0.9871
My SVC predicts:  [ 1.  1.  1.  1.  1.  0.  0.  0.  0.  1.]
For these 10 labels:  [ 1.  1.  1.  1.  1.  0.  0.  0.  0.  1.]
0.00118 Seconds to predict 10 labels with SVC


#### Test Results:

| Spatial        | Histbin           | Feature Vector Length | Test Accuracy  |
| -------------- |:-----------------:|:-----------------:| -----:|
| 32  | 32 | 3168 | 0.9742 |
| 20  | 20 | 1260 | 0.9914 |
| 10  | 10 | 330 | 0.9505 |
| 20  | 32 | 1296 | 0.9871 |

### Decision Trees:

In [61]:
from sklearn import tree

clf = tree.DecisionTreeClassifier()

# Check the training time for the Decision Tree
t = time.time()
clf.fit(X_train, y_train)
t2 = time.time()
print(round(t2-t, 2), 'Seconds to train Decision Tree...')

# Check the score of the Decision Tree
print('Test Accuracy of Decision Tree = ', round(clf.score(X_test, y_test), 4))

# Check the prediction time for n_predict samples
t=time.time()
n_predict = 10
print('My Decision Tree predicts: ', clf.predict(X_test[0:n_predict]))
print('For these',n_predict, 'labels: ', y_test[0:n_predict])
t2 = time.time()
print(round(t2-t, 5), 'Seconds to predict', n_predict,'labels with Decision Tree')

1.04 Seconds to train Decision Tree...
Test Accuracy of Decision Tree =  0.972
My Decision Tree predicts:  [ 1.  0.  1.  0.  0.  0.  1.  1.  0.  1.]
For these 10 labels:  [ 1.  0.  1.  0.  0.  0.  1.  1.  0.  0.]
0.00086 Seconds to predict 10 labels with Decision Tree


#### Test Results:

| Spatial        | Histbin           | Feature Vector Length | Test Accuracy  |
| -------------- |:-----------------:|:-----------------:| -----:|
| 32  | 32 | 3168 | 0.9677 |
| 20  | 20 | 1260 | 0.9677 |
| 10  | 10 | 330 | 0.957 |
| 20  | 32 | 1296 | 0.972 |

---

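##### Note: The linear SVM achieves higher test accuracy than the Decision Tree in three of the four configurations, most clearly when `spatial` and `histbin` are both 20 (0.9914 versus 0.9677). The exception is `spatial` = `histbin` = 10, where the Decision Tree scores slightly higher (0.957 versus 0.9505).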
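
Because `rand_state` is drawn at random on every run, each accuracy in the tables comes from a single, different train/test split, so small gaps between the two classifiers may partly be noise. One way to get a steadier comparison is k-fold cross-validation. The sketch below runs it on synthetic data from `make_classification`, not on the car images, so its numbers say nothing about this dataset; all variable names are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class problem standing in for the color features
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, random_state=0)

models = {
    # Scaling sits inside the pipeline so it is refit on each training fold
    'LinearSVC': make_pipeline(
        StandardScaler(), LinearSVC(max_iter=10000, random_state=0)),
    'DecisionTree': DecisionTreeClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validation: five accuracy scores per model
    scores = cross_val_score(model, X_demo, y_demo, cv=5)
    results[name] = scores
    print(name, 'mean accuracy:', round(scores.mean(), 4),
          '+/-', round(scores.std(), 4))
```

Applied to the real features, the mean and spread across folds would give a better sense of whether the SVM's lead is consistent.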