# Image features exercise
*Complete and hand in this completed worksheet (including its outputs and any supporting code outside of the worksheet) with your assignment submission. For more details see the [assignments page](http://vision.stanford.edu/teaching/cs231n/assignments.html) on the course website.*

We have seen that we can achieve reasonable performance on an image classification task by training a linear classifier on the pixels of the input image. In this exercise we will show that we can improve our classification performance by training linear classifiers not on raw pixels but on features that are computed from the raw pixels.

All of your work for this exercise will be done in this notebook.

In [1]:
import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt

from __future__ import print_function

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

## Load MNIST data

In [2]:
import numpy as np

from sklearn.utils import shuffle
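# NOTE: fetch_mldata was deprecated in scikit-learn 0.20 and removed in 0.22;
# newer versions provide fetch_openml for downloading MNIST instead.
from sklearn.datasets import fetch_mldata

mnist = fetch_mldata('MNIST original', data_home='.')
# Rescale pixel values from [0, 255] to [-1, 1]
mnist.data = mnist.data/255.0*2 - 1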

X_train = mnist.data[:60000]
y_train = mnist.target[:60000].astype(np.uint8)

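# NOTE: the validation and test splits are the same 10,000 images
# (mnist.data[60000:]), so the test accuracy reported later is not a
# truly held-out estimate.
X_val  = mnist.data[60000:]
y_val  = mnist.target[60000:].astype(np.uint8)

X_test  = mnist.data[60000:]
y_test  = mnist.target[60000:].astype(np.uint8)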

X_train = X_train.reshape(X_train.shape[0], 28, 28)
X_val = X_val.reshape(X_val.shape[0], 28, 28) 
X_test = X_test.reshape(X_test.shape[0], 28, 28) 


print('Train data shape: ', X_train.shape)
print('Train labels shape: ', y_train.shape)
print('Validation data shape: ', X_val.shape)
print('Validation labels shape: ', y_val.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)

Train data shape:  (60000, 28, 28)
Train labels shape:  (60000,)
Validation data shape:  (10000, 28, 28)
Validation labels shape:  (10000,)
Test data shape:  (10000, 28, 28)
Test labels shape:  (10000,)


## Extract Features
For each image we will compute a Histogram of Oriented
Gradients (HOG). The exercise also describes a color histogram using the hue
channel in HSV color space, concatenated with the HOG vector to form the final
feature vector. MNIST images are grayscale, so the color histogram is commented
out in the cell below and only the HOG features are used.

Roughly speaking, HOG should capture the texture of the image while ignoring
color information, and the color histogram represents the color of the input
image while ignoring texture. As a result, we expect that using both together
ought to work better than using either alone. Verifying this assumption would
be a good thing to try for the bonus section.
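
To make the idea behind HOG concrete, here is a minimal sketch of a gradient-orientation histogram computed over a whole image. It is a simplification and not the `hog_feature` implementation from `cs231n.features`: there are no cells, no blocks, and no normalization. The function name `orientation_histogram` and the choice of 9 bins are assumptions made for illustration.

```python
import numpy as np

def orientation_histogram(img, num_bins=9):
    """Histogram of unsigned gradient orientations, weighted by gradient magnitude.

    Simplified illustration only: real HOG computes such histograms per cell
    and normalizes them; this sketch uses a single histogram for the image.
    """
    gy, gx = np.gradient(img.astype(np.float64))      # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)                      # gradient strength per pixel
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0    # unsigned orientation in [0, 180)
    hist, _ = np.histogram(angle, bins=num_bins, range=(0.0, 180.0),
                           weights=magnitude)
    return hist

# An image whose intensity increases left to right has a purely horizontal gradient,
# so all of the gradient magnitude falls into the first orientation bin.
ramp = np.tile(np.arange(4, dtype=np.float64), (4, 1))
hist = orientation_histogram(ramp)
print(hist)
```

The histogram depends only on intensity differences, which is why a descriptor of this kind ignores color and absolute brightness.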

The `hog_feature` and `color_histogram_hsv` functions both operate on a single
image and return a feature vector for that image. The `extract_features`
function takes a set of images and a list of feature functions and evaluates
each feature function on each image, storing the results in a matrix where
each row is the concatenation of all feature vectors for a single image.
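
The behavior described above can be sketched as follows. This is a hypothetical stand-in, not the actual `extract_features` from `cs231n.features`; the name `extract_features_sketch` and the two toy feature functions are invented for illustration.

```python
import numpy as np

def extract_features_sketch(imgs, feature_fns):
    """Apply every feature function to every image and stack the results.

    Returns an array of shape (num_images, total_feature_dim): one row per image.
    """
    rows = []
    for img in imgs:
        # Each feature function returns a 1-D vector; concatenate them for this image.
        parts = [np.asarray(fn(img), dtype=np.float64).ravel() for fn in feature_fns]
        rows.append(np.concatenate(parts))
    return np.stack(rows)

# Two toy feature functions (illustrative only, not HOG or a color histogram).
def mean_intensity(img):
    return np.array([img.mean()])

def row_sums(img):
    return img.sum(axis=1)

imgs = np.arange(2 * 4 * 4, dtype=np.float64).reshape(2, 4, 4)
feats = extract_features_sketch(imgs, [mean_intensity, row_sums])
print(feats.shape)  # (2, 5): 2 images, 1 + 4 features each
```

Keeping one row per image matches the `(60000, 82)` shape printed later in the notebook, where the first axis indexes images.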

In [3]:
from cs231n.features import *

num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [hog_feature] #, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_val_feats = extract_features(X_val, feature_fns)
X_test_feats = extract_features(X_test, feature_fns)

# Preprocessing: Subtract the mean feature
mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)
X_train_feats -= mean_feat
X_val_feats -= mean_feat
X_test_feats -= mean_feat

# Preprocessing: Divide by standard deviation. This ensures that each feature
# has roughly the same scale.
std_feat = np.std(X_train_feats, axis=0, keepdims=True)
X_train_feats /= std_feat
X_val_feats /= std_feat
X_test_feats /= std_feat

# Preprocessing: Add a bias dimension
X_train_feats = np.hstack([X_train_feats, np.ones((X_train_feats.shape[0], 1))])
X_val_feats = np.hstack([X_val_feats, np.ones((X_val_feats.shape[0], 1))])
X_test_feats = np.hstack([X_test_feats, np.ones((X_test_feats.shape[0], 1))])

Done extracting features for 1000 / 60000 images
Done extracting features for 2000 / 60000 images
Done extracting features for 3000 / 60000 images
Done extracting features for 4000 / 60000 images
Done extracting features for 5000 / 60000 images
Done extracting features for 6000 / 60000 images
Done extracting features for 7000 / 60000 images
Done extracting features for 8000 / 60000 images
Done extracting features for 9000 / 60000 images
Done extracting features for 10000 / 60000 images
Done extracting features for 11000 / 60000 images
Done extracting features for 12000 / 60000 images
Done extracting features for 13000 / 60000 images
Done extracting features for 14000 / 60000 images
Done extracting features for 15000 / 60000 images
Done extracting features for 16000 / 60000 images
Done extracting features for 17000 / 60000 images
Done extracting features for 18000 / 60000 images
Done extracting features for 19000 / 60000 images
Done extracting features for 20000 / 60000 images
Done extr
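
The preprocessing in the cell above (mean subtraction, scaling by the standard deviation, and appending a bias column) can be sketched as a single helper. The helper name `standardize_and_add_bias` and the `eps` guard are assumptions added here: the guard avoids a division by zero if some feature happens to be constant over the training set, which the original cell does not check for.

```python
import numpy as np

def standardize_and_add_bias(train, others, eps=1e-8):
    """Standardize with training-set statistics, then append a column of ones."""
    mean = train.mean(axis=0, keepdims=True)
    std = train.std(axis=0, keepdims=True)
    std = np.where(std < eps, 1.0, std)  # leave constant features unscaled

    def transform(x):
        x = (x - mean) / std
        return np.hstack([x, np.ones((x.shape[0], 1))])

    return transform(train), [transform(x) for x in others]

rng = np.random.default_rng(0)
train = rng.normal(loc=3.0, scale=2.0, size=(100, 3))
train[:, 2] = 5.0                      # a constant feature, to exercise the guard
val = rng.normal(loc=3.0, scale=2.0, size=(10, 3))

train_s, (val_s,) = standardize_and_add_bias(train, [val])
print(train_s.shape, val_s.shape)  # (100, 4) (10, 4)
```

The statistics are computed on the training features only and then reused for the validation and test features, as in the original cell.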

## Neural Network on image features
Earlier in this assignment we saw that training a two-layer neural network on raw pixels achieved better classification performance than linear classifiers on raw pixels. In this notebook we have seen that linear classifiers on image features outperform linear classifiers on raw pixels.

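For completeness, we should also try training a neural network on image features. This approach should outperform all previous approaches. The exercise text states that you should easily be able to achieve over 55% classification accuracy on the test set and that the best model achieves about 60%; those targets appear to carry over from the CIFAR-10 version of the exercise. On MNIST with HOG features, the network trained below reaches about 97% test accuracy.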
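
As a sanity check on such a network, here is a minimal sketch of the loss of a two-layer classifier. It assumes an affine, ReLU, affine, softmax layout and small random initial weights; the source does not show the internals of `TwoLayerNet`, so both are assumptions, and `two_layer_softmax_loss` is a name invented for this example. With near-zero weights the class scores are almost uniform, so the loss should be close to log(10) ≈ 2.3026, which is consistent with the iteration-0 loss of 2.302585 in the training log below.

```python
import numpy as np

def two_layer_softmax_loss(X, y, W1, b1, W2, b2):
    """Mean softmax cross-entropy of an affine-ReLU-affine network (assumed layout)."""
    hidden = np.maximum(0.0, X @ W1 + b1)                  # first affine layer + ReLU
    scores = hidden @ W2 + b2                              # class scores, shape (N, C)
    scores = scores - scores.max(axis=1, keepdims=True)    # stabilize the exponentials
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(X.shape[0]), y].mean()

rng = np.random.default_rng(0)
N, D, H, C = 5, 82, 500, 10   # 82 features, 500 hidden units, 10 classes, as in the notebook
X = rng.normal(size=(N, D))
y = rng.integers(0, C, size=N)

# Small random weights and zero biases (an assumed initialization scheme).
W1 = 1e-4 * rng.normal(size=(D, H))
b1 = np.zeros(H)
W2 = 1e-4 * rng.normal(size=(H, C))
b2 = np.zeros(C)

loss = two_layer_softmax_loss(X, y, W1, b1, W2, b2)
print(loss, np.log(C))  # the two values should agree to several decimal places
```

An initial loss far from log(10) would usually point to a bug in the loss computation or the initialization.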

In [4]:
print(X_train_feats.shape)

(60000, 82)


In [5]:
from cs231n.classifiers.neural_net import TwoLayerNet

input_dim = X_train_feats.shape[1]
hidden_dim = 500
num_classes = 10

net = TwoLayerNet(input_dim, hidden_dim, num_classes)
best_net = None

################################################################################
# TODO: Train a two-layer neural network on image features. You may want to    #
# cross-validate various parameters as in previous sections. Store your best   #
# model in the best_net variable.                                              #
################################################################################
stats = net.train(X_train_feats, y_train, X_val_feats, y_val,
            num_iters=3000, batch_size=200,
            learning_rate=1, learning_rate_decay=0.95,
            reg=0.0, verbose=True)


# Keep the trained model in best_net, as the TODO above asks.
best_net = net

val_acc = (net.predict(X_val_feats) == y_val).mean()
print('Validation accuracy: ', val_acc)
################################################################################
#                              END OF YOUR CODE                                #
################################################################################

iteration 0 / 3000: loss 2.302585
iteration 100 / 3000: loss 0.210294
iteration 200 / 3000: loss 0.136146
iteration 300 / 3000: loss 0.258663
iteration 400 / 3000: loss 0.168198
iteration 500 / 3000: loss 0.157242
iteration 600 / 3000: loss 0.089668
iteration 700 / 3000: loss 0.118887
iteration 800 / 3000: loss 0.097347
iteration 900 / 3000: loss 0.063540
iteration 1000 / 3000: loss 0.074662
iteration 1100 / 3000: loss 0.064815
iteration 1200 / 3000: loss 0.062773
iteration 1300 / 3000: loss 0.111336
iteration 1400 / 3000: loss 0.100987
iteration 1500 / 3000: loss 0.062335
iteration 1600 / 3000: loss 0.056612
iteration 1700 / 3000: loss 0.079014
iteration 1800 / 3000: loss 0.028705
iteration 1900 / 3000: loss 0.014633
iteration 2000 / 3000: loss 0.096229
iteration 2100 / 3000: loss 0.038555
iteration 2200 / 3000: loss 0.040437
iteration 2300 / 3000: loss 0.060819
iteration 2400 / 3000: loss 0.035447
iteration 2500 / 3000: loss 0.062738
iteration 2600 / 3000: loss 0.059936
iteration 270
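
The training call above combines `learning_rate=1` with `learning_rate_decay=0.95`. The short calculation below works out the resulting schedule, under the assumption that the decay is applied once per epoch and that an epoch is `num_train / batch_size` iterations. The source does not show how `TwoLayerNet.train` applies the decay, so treat this as a sketch of one common convention.

```python
num_train = 60000        # training images, from the shapes printed earlier
batch_size = 200
num_iters = 3000
learning_rate = 1.0
decay = 0.95

iters_per_epoch = max(num_train // batch_size, 1)   # iterations needed to see 60000 samples
num_epochs = num_iters // iters_per_epoch           # number of times the decay is applied
final_lr = learning_rate * decay ** num_epochs

print(iters_per_epoch, num_epochs, round(final_lr, 4))  # 300 10 0.5987
```

Under this assumption the learning rate falls from 1.0 to roughly 0.6 over the run, so the step size stays large throughout training.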

In [6]:
# Run your neural net classifier on the test set. You should be able to
# get more than 55% accuracy.

test_acc = (net.predict(X_test_feats) == y_test).mean()
print(test_acc)

0.9688


## Result

With HOG features, the two-layer network reaches 96.88% accuracy on the MNIST test set, so hand-crafted features work well on MNIST too.