# Summary
This notebook summarizes assignment 1. It covers the kNN, softmax, SVM, and two-layer network classifiers and reports the best result each one achieved on the CIFAR-10 classification problem.

In [1]:
from __future__ import print_function  # must come before any other statement

import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

## Load data
As in the previous exercises, we load the CIFAR-10 data from disk.

In [2]:
from cs231n.features import color_histogram_hsv, hog_feature

def get_CIFAR10_data():
    """Load the raw CIFAR-10 training and test sets from disk (no validation split)."""
    cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
    X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
    return X_train, y_train, X_test, y_test

# Clean up variables to avoid loading the data multiple times (which may cause memory issues)
try:
    del X_train, y_train
    del X_test, y_test
    print('Clear previously loaded data.')
except NameError:
    # Nothing was loaded yet, so there is nothing to clear
    pass

X_train, y_train, X_test, y_test = get_CIFAR10_data()

# As a sanity check, we print out the size of the training and test data.
print('Training data type: ', type(X_train))
print('Training data shape: ', X_train.shape)
print('Training labels shape: ', y_train.shape)
print('Test data type: ', type(X_test))
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)

Training data type:  <type 'numpy.ndarray'>
Training data shape:  (50000, 32, 32, 3)
Training labels shape:  (50000,)
Test data type:  <type 'numpy.ndarray'>
Test data shape:  (10000, 32, 32, 3)
Test labels shape:  (10000,)


## Raw pixels as image features

In [3]:
# Preprocessing: reshape the image data into rows
X_train_raw_pixels = np.reshape(X_train, (X_train.shape[0], -1))
X_test_raw_pixels = np.reshape(X_test, (X_test.shape[0], -1))

# Preprocessing: subtract the mean image
# first: compute the image mean based on the training data
mean_image = np.mean(X_train_raw_pixels, axis=0)

# second: subtract the mean image from train and test data
X_train_raw_pixels -= mean_image
X_test_raw_pixels -= mean_image

# third: append the bias dimension of ones (i.e. bias trick) so that the linear
# classifiers (softmax, SVM) only have to optimize a single weight matrix W.
X_train_raw_pixels = np.hstack([X_train_raw_pixels, np.ones((X_train_raw_pixels.shape[0], 1))])
X_test_raw_pixels = np.hstack([X_test_raw_pixels, np.ones((X_test_raw_pixels.shape[0], 1))])

# As a sanity check, print out the shapes of the data
print('X_train_raw_pixels shape: ', X_train_raw_pixels.shape)
print('X_test_raw_pixels shape: ', X_test_raw_pixels.shape)

X_train_raw_pixels shape:  (50000, 3073)
X_test_raw_pixels shape:  (10000, 3073)
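
The bias trick used above can be checked on a toy example. The following is a minimal sketch, not part of the assignment code: the array names (`X`, `W`, `b`, `X_aug`, `W_aug`) and their sizes are made up for illustration. It shows that appending a column of ones to the data, and the bias as an extra row of the weight matrix, gives the same scores as the explicit affine form.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(4, 6)   # 4 toy "images" with 6 features each (hypothetical sizes)
W = rng.randn(6, 3)   # weights for 3 classes
b = rng.randn(3)      # one bias per class

# Explicit affine scores: X W + b
scores_affine = X.dot(W) + b

# Bias trick: append a column of ones to X and the bias as the last row of W
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
W_aug = np.vstack([W, b])

scores_trick = X_aug.dot(W_aug)
bias_trick_matches = np.allclose(scores_affine, scores_trick)
print(bias_trick_matches)
```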


## Extract image features (HOG + hue)

In [4]:
from cs231n.features import *

num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_test_feats = extract_features(X_test, feature_fns)

# Preprocessing: Subtract the mean feature
mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)
X_train_feats -= mean_feat
X_test_feats -= mean_feat

# Preprocessing: Divide by standard deviation. This ensures that each feature
# has roughly the same scale.
std_feat = np.std(X_train_feats, axis=0, keepdims=True)
X_train_feats /= std_feat
X_test_feats /= std_feat

# Preprocessing: Add a bias dimension
X_train_feats = np.hstack([X_train_feats, np.ones((X_train_feats.shape[0], 1))])
X_test_feats = np.hstack([X_test_feats, np.ones((X_test_feats.shape[0], 1))])

Done extracting features for 1000 / 50000 images
Done extracting features for 2000 / 50000 images
Done extracting features for 3000 / 50000 images
Done extracting features for 4000 / 50000 images
Done extracting features for 5000 / 50000 images
Done extracting features for 6000 / 50000 images
Done extracting features for 7000 / 50000 images
Done extracting features for 8000 / 50000 images
Done extracting features for 9000 / 50000 images
Done extracting features for 10000 / 50000 images
Done extracting features for 11000 / 50000 images
Done extracting features for 12000 / 50000 images
Done extracting features for 13000 / 50000 images
Done extracting features for 14000 / 50000 images
Done extracting features for 15000 / 50000 images
Done extracting features for 16000 / 50000 images
Done extracting features for 17000 / 50000 images
Done extracting features for 18000 / 50000 images
Done extracting features for 19000 / 50000 images
Done extracting features for 20000 / 50000 images
...
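
To illustrate the normalization step above, here is a small sketch on synthetic data. The arrays `train` and `test` and the zero-variance guard are assumptions for illustration and do not appear in the notebook. The statistics are computed on the training set only and then reused for the test set, mirroring how `mean_feat` and `std_feat` are used.

```python
import numpy as np

rng = np.random.RandomState(0)
scales = np.array([1.0, 10.0, 100.0, 0.1, 5.0])  # features on very different scales
train = rng.rand(100, 5) * scales
test = rng.rand(20, 5) * scales

# Statistics come from the training set only
mean = train.mean(axis=0, keepdims=True)
std = train.std(axis=0, keepdims=True)
std[std == 0] = 1.0  # assumption: avoid dividing by zero for constant features

train_norm = (train - mean) / std
test_norm = (test - mean) / std  # the test set reuses the training statistics

print(train_norm.mean(axis=0).round(6))
print(train_norm.std(axis=0).round(6))
```

After this step every training feature has zero mean and unit standard deviation, while the test features are only approximately standardized because they are scaled with the training statistics.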

## kNN classifier

In [5]:
from cs231n.classifiers import KNearestNeighbor

In [6]:
# kNN classifier
# Image features: raw pixels
best_k = 10

knn = KNearestNeighbor()
knn.train(X_train_raw_pixels, y_train)
test_acc = np.mean(knn.predict(X_test_raw_pixels, k=best_k) == y_test)

print('kNN classifier (Image features: raw pixels)')
print('best_k: %d' % (best_k))
print('test accuracy: %f' % (test_acc))

kNN classifier (Image features: raw pixels)
best_k: 10
test accuracy: 0.338600


In [7]:
# kNN classifier
# Image features: HOG + hue
best_k = 13

knn = KNearestNeighbor()
knn.train(X_train_feats, y_train)
test_acc = np.mean(knn.predict(X_test_feats, k=best_k) == y_test)

print('kNN classifier (Image features: HOG + hue)')
print('best_k: %d' % (best_k))
print('test accuracy: %f' % (test_acc))

kNN classifier (Image features: HOG + hue)
best_k: 13
test accuracy: 0.443100
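
`KNearestNeighbor` lives in the `cs231n` package and its code is not shown here. As a rough sketch of what a k-nearest-neighbor prediction involves, assuming L2 distance and a majority vote with ties going to the smaller label, a standalone version might look like this (the function `knn_predict` and the toy data are hypothetical):

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=1):
    """Label each query row by majority vote among its k nearest training rows."""
    # Squared L2 distances, shape (num_query, num_train), using
    # ||q - t||^2 = ||q||^2 - 2 q.t + ||t||^2
    d2 = (np.sum(X_query ** 2, axis=1, keepdims=True)
          - 2.0 * X_query.dot(X_train.T)
          + np.sum(X_train ** 2, axis=1))
    nearest = np.argsort(d2, axis=1)[:, :k]  # indices of the k closest training rows
    votes = y_train[nearest]                 # their labels, shape (num_query, k)
    # bincount + argmax picks the most frequent label (smallest label on ties)
    return np.array([np.bincount(row).argmax() for row in votes])

# Toy data: two well-separated clusters
X_toy = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
y_toy = np.array([0, 0, 0, 1, 1, 1])
queries = np.array([[0.2, 0.1], [5.5, 5.2]])

toy_pred = knn_predict(X_toy, y_toy, queries, k=3)
print(toy_pred)  # [0 1]
```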


## Softmax classifier

In [8]:
from cs231n.classifiers import Softmax

In [9]:
# Softmax
# Image features: raw pixels
np.random.seed(0)

lr = 5e-07
reg = 3.214286e+04
batch_size = 200
num_iters = 1500

softmax = Softmax()
softmax.train(X_train_raw_pixels, y_train,
              learning_rate=lr, reg=reg,
              batch_size=batch_size,
              num_iters=num_iters)
        
train_acc = np.mean(softmax.predict(X_train_raw_pixels) == y_train)
test_acc = np.mean(softmax.predict(X_test_raw_pixels) == y_test)

print('Softmax classifier (Image features: raw pixels)')
print('lr: %e' % (lr))
print('reg: %e' % (reg))
print('batch_size: %d' % (batch_size))
print('num_iters: %d' % (num_iters))
print('training accuracy: %f' % (train_acc))
print('test accuracy: %f' % (test_acc))

Softmax classifier (Image features: raw pixels)
lr: 5.000000e-07
reg: 3.214286e+04
batch_size: 200
num_iters: 1500
training accuracy: 0.315240
test accuracy: 0.320800


In [10]:
# Softmax
# Image features: HOG + hue
np.random.seed(0)

lr = 5e-07
reg = 1e+04
batch_size = 200
num_iters = 1500

softmax = Softmax()
softmax.train(X_train_feats, y_train,
              learning_rate=lr, reg=reg,
              batch_size=batch_size,
              num_iters=num_iters)

train_acc = np.mean(softmax.predict(X_train_feats) == y_train)
test_acc = np.mean(softmax.predict(X_test_feats) == y_test)

print('Softmax classifier (Image features: HOG + hue)')
print('lr: %e' % (lr))
print('reg: %e' % (reg))
print('batch_size: %d' % (batch_size))
print('num_iters: %d' % (num_iters))
print('training accuracy: %f' % (train_acc))
print('test accuracy: %f' % (test_acc))

Softmax classifier (Image features: HOG + hue)
lr: 5.000000e-07
reg: 1.000000e+04
batch_size: 200
num_iters: 1500
training accuracy: 0.415340
test accuracy: 0.409300
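
For reference, the objective a softmax classifier minimizes can be sketched as follows. This is an illustrative standalone version, not the `cs231n` `Softmax` class: the function name `softmax_loss` and the regularization convention `reg * sum(W * W)` are assumptions.

```python
import numpy as np

def softmax_loss(W, X, y, reg):
    """Mean cross-entropy loss with L2 regularization, and its gradient w.r.t. W."""
    n = X.shape[0]
    scores = X.dot(W)
    scores = scores - scores.max(axis=1, keepdims=True)  # shift for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)

    loss = -np.log(probs[np.arange(n), y]).mean() + reg * np.sum(W * W)

    dscores = probs.copy()
    dscores[np.arange(n), y] -= 1.0  # d(loss)/d(scores) = probs - one_hot(y)
    dW = X.T.dot(dscores) / n + 2.0 * reg * W
    return loss, dW

rng = np.random.RandomState(0)
X_toy = rng.randn(8, 5)
y_toy = rng.randint(0, 10, size=8)
W_zero = np.zeros((5, 10))

# With all-zero weights every class gets probability 1/10, so the loss is log(10)
loss_zero, _ = softmax_loss(W_zero, X_toy, y_toy, reg=0.0)
print(loss_zero, np.log(10))
```

Checking that the initial loss is close to log(10) for ten classes is a quick way to catch implementation errors before tuning `lr` and `reg`.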


## SVM

In [11]:
from cs231n.classifiers.linear_classifier import LinearSVM

In [12]:
# SVM
# Image features: raw pixels
np.random.seed(0)

lr = 1.258925e-07
reg = 1.467799e+04
batch_size = 2000
num_iters = 1500

svm = LinearSVM()
svm.train(X_train_raw_pixels, y_train, 
          learning_rate=lr, reg=reg, 
          batch_size=batch_size,
          num_iters=num_iters)

train_acc = np.mean(svm.predict(X_train_raw_pixels) == y_train)
test_acc = np.mean(svm.predict(X_test_raw_pixels) == y_test)

print('SVM (Image features: raw pixels)')
print('lr: %e' % (lr))
print('reg: %e' % (reg))
print('batch_size: %d' % (batch_size))
print('num_iters: %d' % (num_iters))
print('training accuracy: %f' % (train_acc))
print('test accuracy: %f' % (test_acc))

SVM (Image features: raw pixels)
lr: 1.258925e-07
reg: 1.467799e+04
batch_size: 2000
num_iters: 1500
training accuracy: 0.384380
test accuracy: 0.376200


In [13]:
# SVM
# Image features: HOG + hue
np.random.seed(0)

lr = 1e-08
reg = 1e+06
batch_size = 200
num_iters = 3000

svm = LinearSVM()
svm.train(X_train_feats, y_train, 
          learning_rate=lr, reg=reg, 
          batch_size=batch_size,
          num_iters=num_iters)

train_acc = np.mean(svm.predict(X_train_feats) == y_train)
test_acc = np.mean(svm.predict(X_test_feats) == y_test)

print('SVM (Image features: HOG + hue)')
print('lr: %e' % (lr))
print('reg: %e' % (reg))
print('batch_size: %d' % (batch_size))
print('num_iters: %d' % (num_iters))
print('training accuracy: %f' % (train_acc))
print('test accuracy: %f' % (test_acc))

SVM (Image features: HOG + hue)
lr: 1.000000e-08
reg: 1.000000e+06
batch_size: 200
num_iters: 3000
training accuracy: 0.415040
test accuracy: 0.411100
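
The multiclass SVM objective can be sketched in the same spirit. This is a standalone illustration under assumptions (margin `delta = 1`, regularization `reg * sum(W * W)`, hypothetical function name `svm_loss`), not the `LinearSVM` code itself.

```python
import numpy as np

def svm_loss(W, X, y, reg, delta=1.0):
    """Mean multiclass hinge loss with L2 regularization."""
    n = X.shape[0]
    scores = X.dot(W)
    correct = scores[np.arange(n), y][:, np.newaxis]  # true-class score, shape (n, 1)
    margins = np.maximum(0.0, scores - correct + delta)
    margins[np.arange(n), y] = 0.0                    # the true class adds no margin
    return margins.sum() / n + reg * np.sum(W * W)

rng = np.random.RandomState(0)
X_toy = rng.randn(8, 5)
y_toy = rng.randint(0, 10, size=8)

svm_loss_zero = svm_loss(np.zeros((5, 10)), X_toy, y_toy, reg=0.0)
print(svm_loss_zero)  # 9.0: each of the 9 wrong classes violates the margin by delta
```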


## Neural Network

In [14]:
from cs231n.classifiers.neural_net import TwoLayerNet

In [15]:
# Preprocessing: Remove the bias dimension
# Make sure to run this cell only ONCE
print(X_train_raw_pixels.shape)
X_train_raw_pixels = X_train_raw_pixels[:, :-1]
X_test_raw_pixels = X_test_raw_pixels[:, :-1]

print(X_train_raw_pixels.shape)

(50000, 3073)
(50000, 3072)


In [16]:
# Preprocessing: Remove the bias dimension
# Make sure to run this cell only ONCE
print(X_train_feats.shape)
X_train_feats = X_train_feats[:, :-1]
X_test_feats = X_test_feats[:, :-1]

print(X_train_feats.shape)

(50000, 155)
(50000, 154)


In [17]:
# Two layer net
# Image features: raw pixels
np.random.seed(0)

input_dim = X_train_raw_pixels.shape[1]
num_classes = 10
hidden_dim = 100
lr_decay = 0.95
lr = 1e-03
reg = 1
batch_size = 200
num_iters = 50 * 1000

net = TwoLayerNet(input_dim, hidden_dim, num_classes)
net.train(X_train_raw_pixels, y_train, 
          X_test_raw_pixels[:50], y_test[:50],  # first 50 test images passed as validation data
          num_iters=num_iters,
          batch_size=batch_size,
          learning_rate=lr, 
          learning_rate_decay=lr_decay,
          reg=reg, verbose=False)
            
train_acc = np.mean(net.predict(X_train_raw_pixels) == y_train)
test_acc = np.mean(net.predict(X_test_raw_pixels) == y_test)
            
print('Two layer net (Image features: raw pixels)')
print('hidden_dim: %d' % (hidden_dim))
print('lr_decay: %f' % (lr_decay))
print('lr: %e' % (lr))
print('reg: %e' % (reg))
print('batch_size: %d' % (batch_size))
print('num_iters: %d' % (num_iters))
print('training accuracy: %f' % (train_acc))
print('test accuracy: %f' % (test_acc))

Two layer net (Image features: raw pixels)
hidden_dim: 100
lr_decay: 0.950000
lr: 1.000000e-03
reg: 1.000000e+00
batch_size: 200
num_iters: 50000
training accuracy: 0.554100
test accuracy: 0.517100
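
To get a feel for what `lr_decay = 0.95` means over such a long run, here is a back-of-the-envelope sketch. It assumes the learning rate is multiplied by `lr_decay` once per epoch, where an epoch is `num_train / batch_size` iterations. That is an assumption about `TwoLayerNet.train`, whose code is not shown in this notebook, and the helper name `decayed_learning_rate` is hypothetical.

```python
def decayed_learning_rate(lr, lr_decay, num_train, batch_size, num_iters):
    """Return (final learning rate, number of epochs) under per-epoch decay."""
    iterations_per_epoch = max(num_train // batch_size, 1)
    num_epochs = num_iters // iterations_per_epoch
    return lr * lr_decay ** num_epochs, num_epochs

# Settings of the raw-pixel run above
final_lr, num_epochs = decayed_learning_rate(
    lr=1e-3, lr_decay=0.95, num_train=50000, batch_size=200, num_iters=50 * 1000)
print('epochs: %d, final lr: %e' % (num_epochs, final_lr))
```

Under that assumption the run covers 200 epochs, so by the end the learning rate has shrunk by several orders of magnitude.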


In [18]:
# Two layer net
# Image features: HOG + hue
np.random.seed(0)

input_dim = X_train_feats.shape[1]
num_classes = 10
hidden_dim = 100
lr_decay = 0.95
lr = 0.1
reg = 1.58489319246e-11
batch_size = 200
num_iters = 15 * 1000

net = TwoLayerNet(input_dim, hidden_dim, num_classes)
net.train(X_train_feats, y_train, 
          X_test_feats[:50], y_test[:50],  # first 50 test images passed as validation data
          num_iters=num_iters,
          batch_size=batch_size,
          learning_rate=lr, 
          learning_rate_decay=lr_decay,
          reg=reg, verbose=False)
            
train_acc = np.mean(net.predict(X_train_feats) == y_train)
test_acc = np.mean(net.predict(X_test_feats) == y_test)
            
print('Two layer net (Image features: HOG + hue)')
print('hidden_dim: %d' % (hidden_dim))
print('lr_decay: %f' % (lr_decay))
print('lr: %e' % (lr))
print('reg: %e' % (reg))
print('batch_size: %d' % (batch_size))
print('num_iters: %d' % (num_iters))
print('training accuracy: %f' % (train_acc))
print('test accuracy: %f' % (test_acc))

Two layer net (Image features: HOG + hue)
hidden_dim: 100
lr_decay: 0.950000
lr: 1.000000e-01
reg: 1.584893e-11
batch_size: 200
num_iters: 15000
training accuracy: 0.648560
test accuracy: 0.578300
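
As a reminder of what the model computes, a two-layer network's forward pass can be sketched as below. The architecture (affine, ReLU, affine) and the names `two_layer_scores`, `W1`, `b1`, `W2`, `b2` are assumptions for illustration; the actual `TwoLayerNet` implementation is not shown here. The shapes match the HOG + hue setup above (154 inputs, 100 hidden units, 10 classes).

```python
import numpy as np

def two_layer_scores(X, W1, b1, W2, b2):
    """Class scores of an affine -> ReLU -> affine network."""
    hidden = np.maximum(0.0, X.dot(W1) + b1)  # ReLU non-linearity
    return hidden.dot(W2) + b2

rng = np.random.RandomState(0)
input_dim, hidden_dim, num_classes = 154, 100, 10
W1 = 1e-4 * rng.randn(input_dim, hidden_dim)  # small random initialization (assumed)
b1 = np.zeros(hidden_dim)
W2 = 1e-4 * rng.randn(hidden_dim, num_classes)
b2 = np.zeros(num_classes)

X_toy = rng.randn(5, input_dim)               # 5 random feature vectors
scores = two_layer_scores(X_toy, W1, b1, W2, b2)
pred = scores.argmax(axis=1)                  # predicted class = highest score
print(scores.shape, pred.shape)  # (5, 10) (5,)
```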


## Final results
| Classifier | Features | Learning rate | Regularization strength | Batch size | Number of iterations | Other hyperparameters | Training accuracy | Test accuracy |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| kNN | raw pixels | - | - | - | - | k = 10 | - | 33.86% |
| kNN | HOG + hue | - | - | - | - | k = 13 | - | 44.31% |
| Softmax | raw pixels | 5.000000e-07 | 3.214286e+04 | 200 | 1500 | - | 31.52% | 32.08% |
| Softmax | HOG + hue | 5.000000e-07 | 1.000000e+04 | 200 | 1500 | - | 41.53% | 40.93% |
| SVM | raw pixels | 1.258925e-07 | 1.467799e+04 | 2000 | 1500 | - | 38.44% | 37.62% |
| SVM | HOG + hue | 1.000000e-08 | 1.000000e+06 | 200 | 3000 | - | 41.50% | 41.11% |
| Two layer net | raw pixels | 1.000000e-03 | 1.000000e+00 | 200 | 50000 | lr_decay: 0.95, hidden_dim: 100 | 55.41% | 51.71% |
| Two layer net | HOG + hue | 1.000000e-01 | 1.584893e-11 | 200 | 15000 | lr_decay: 0.95, hidden_dim: 100 | 64.86% | **57.83%** |
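
The test-accuracy column can be derived directly from the accuracies printed in the cells above. A small helper along these lines keeps the table in sync with the outputs and picks out the best result. It is only a sketch: the `results` list and the `as_percent` function are hypothetical names, with the values copied from the cell outputs.

```python
# (classifier, features, test accuracy), copied from the outputs above
results = [
    ('kNN', 'raw pixels', 0.338600),
    ('kNN', 'HOG + hue', 0.443100),
    ('Softmax', 'raw pixels', 0.320800),
    ('Softmax', 'HOG + hue', 0.409300),
    ('SVM', 'raw pixels', 0.376200),
    ('SVM', 'HOG + hue', 0.411100),
    ('Two layer net', 'raw pixels', 0.517100),
    ('Two layer net', 'HOG + hue', 0.578300),
]

def as_percent(acc):
    """Format an accuracy in [0, 1] as a percentage with two decimals."""
    return '%.2f%%' % (100.0 * acc)

for name, feats, acc in results:
    print('| %s | %s | %s |' % (name, feats, as_percent(acc)))

best = max(results, key=lambda r: r[2])
print('best: %s (%s), %s' % (best[0], best[1], as_percent(best[2])))
```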