# Cambridge ML Commando Course

## Image Recognition with Support Vector Machines

_In this notebook, we show how to perform face recognition using Support Vector Machines. We will use the Olivetti faces dataset, included in Scikit-learn. More info at:_ http://scikit-learn.org/stable/datasets/olivetti_faces.html

Start by importing numpy, scikit-learn, and pyplot, the Python libraries we will be using in this chapter. Show the versions we will be using (in case you have problems running the notebooks).

In [None]:
%pylab inline
import IPython
import sklearn as sk
import numpy as np
import matplotlib  # imported explicitly so matplotlib.__version__ below does not rely on %pylab
import matplotlib.pyplot as plt

print ('IPython version:', IPython.__version__)
print ('numpy version:', np.__version__)
print ('scikit-learn version:', sk.__version__)
print ('matplotlib version:', matplotlib.__version__)

Import the Olivetti faces dataset.

In [None]:
from sklearn.datasets import fetch_olivetti_faces
import bz2
import pickle

# fetch the faces data ... annoyingly this seems to not work in later (>0.20) versions of sklearn
# so as a backup we have a local version of the dataset
# faces = fetch_olivetti_faces()

# with bz2.BZ2File("faces.pbz2", "w") as f: 
#     pickle.dump(faces, f)

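# load the local, bz2-compressed pickle of the dataset; the with block closes the file for us
with bz2.BZ2File("./data/faces.pbz2", "rb") as f:
    faces = pickle.load(f)
print (faces.DESCR)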


Let's look at the data. faces.images holds 400 images of faces, each one stored as a matrix of 64x64 pixels.
faces.data holds the same data, but flattened into rows of 4096 attributes instead of matrices (4096 = 64x64).
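
The relationship between the two layouts can be sketched with plain numpy. This is only an illustration: the array below is random stand-in data (not the real faces), and the names images, data and restored are made up for the example.

```python
import numpy as np

# Stand-in for faces.images: 3 fake "images" of 64x64 pixels.
# The values are random, used only to show the reshaping.
rng = np.random.default_rng(0)
images = rng.random((3, 64, 64))

# Flatten each image into one row of 4096 values, the layout of faces.data.
data = images.reshape(len(images), -1)
print(data.shape)

# Reshaping a row back to 64x64 recovers the original image exactly.
restored = data[0].reshape(64, 64)
print(np.array_equal(restored, images[0]))
```

Classifiers in scikit-learn expect the flattened, one-row-per-sample layout, while the plotting code below wants the square one.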

In [None]:
print (faces.keys())
print (faces.images.shape)
print (faces.data.shape)
print (faces.target.shape)

import sys
# rjm49: sys.maxsize turns off summarization, which lets us print a face array out in full
np.set_printoptions(precision=2, threshold=sys.maxsize)
print("What are we actually dealing with here?...")
eg = faces.data[0]
print("initial shape is", eg.shape)

# each row of faces.data is a flattened square image, so its side is sqrt(4096) = 64
sidelen = int(np.sqrt(4096))
eg = eg.reshape((sidelen, sidelen))
print("have made the example square of sidelen", sidelen)
plt.imshow(eg)
plt.show()

We don't have to scale the attributes, because the data is already normalized: the pixel values lie between 0 and 1, as the next cell confirms.

In [None]:
print (np.max(faces.data))
print (np.min(faces.data))
print (np.mean(faces.data))
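
For comparison, here is a minimal sketch of what the scaling step could look like if the pixels had arrived as raw 8-bit grey levels instead. The raw array is hypothetical data invented for the example; it is not part of the Olivetti dataset.

```python
import numpy as np

# Hypothetical raw 8-bit grey levels (0..255) for two tiny "images".
raw = np.array([[0, 64, 128],
                [255, 32, 16]], dtype=np.uint8)

# Convert to float and divide by the largest possible grey level,
# so every value ends up in the interval [0, 1].
scaled = raw.astype(np.float32) / 255.0

print(scaled.min(), scaled.max())
```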

In [None]:
def print_faces(images, target, min_n=0, top_n=20, faces_across=20):
    # never try to plot more images than we have
    if top_n > len(images):
        top_n = len(images)
    # set up the figure size in inches
    fig = plt.figure(figsize=(12, 12))
    fig.subplots_adjust(left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)
    for i in range(min_n, top_n):
        # plot the images in a grid of faces_across x faces_across (20x20 by default)
        p = fig.add_subplot(faces_across, faces_across, i + 1, xticks=[], yticks=[])
        p.imshow(images[i], cmap=plt.cm.bone)
        
        # label the image with the target value
        p.text(0, 14, str(target[i]), color='red') # this is the Person ID at the top
        p.text(0, 60, str(i), color='red') # this is the Image ID at the bottom
    plt.show()
        
def faces_by_index(images, select):
    # plot the images at the given indices in a 5x5 grid (at most 25 images)
    fig = plt.figure(figsize=(12,12))
    fig.subplots_adjust(left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)
    for i,s in enumerate(select):
        p = fig.add_subplot(5,5, i+1, xticks=[], yticks=[])
        p.imshow(images[s], cmap=plt.cm.bone)
    plt.show()

In [None]:
# print_faces(faces.images, faces.target, top_n=20, faces_across=10)

Plot all the faces in a 20x20 grid. For each one, we put its target value in the top-left corner and its index in the bottom-left corner.
This may take a few seconds.

In [None]:
print_faces(faces.images, faces.target, top_n=400)

We will try to build a classifier whose model is a hyperplane that separates instances (points) of one class from the rest. Support Vector Machines (SVM) are supervised learning methods that look for these hyperplanes in an optimal way, choosing the ones that pass through the widest possible gap between instances of different classes. A new instance is then assigned to a category according to which side of the hyperplane it falls on. Let's import the SVC class from the sklearn.svm module. SVC stands for Support Vector Classifier: we will use SVM for classification.

Let's use a linear kernel: https://en.wikipedia.org/wiki/Kernel_method

In [None]:
from sklearn.svm import SVC
svc_1 = SVC(kernel='linear') # default kernel="rbf"
print (svc_1)
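
To make the "widest possible gap" idea concrete, here is a small sketch on four hand-made 2-D points (toy data invented for this example, not the faces). A very large C is used as an assumption to approximate a hard margin; for a linear kernel the fitted hyperplane is w.x + b = 0 and the width of the gap is 2 / ||w||.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data: two points per class, separable by a vertical line.
X_toy = np.array([[-1.0, 0.0], [-2.0, 1.0], [1.0, 0.0], [2.0, -1.0]])
y_toy = np.array([0, 0, 1, 1])

# A very large C leaves almost no room for margin violations (assumption:
# this approximates the hard-margin SVM described in the text).
clf = SVC(kernel='linear', C=1e6)
clf.fit(X_toy, y_toy)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane
margin = 2.0 / np.linalg.norm(w)

print("w =", w, "b =", b)
print("margin width:", margin)
print("support vectors:", clf.support_vectors_)

# New points are classified by the side of the hyperplane they fall on.
pred = clf.predict(np.array([[-3.0, 0.0], [3.0, 0.0]]))
print(pred)
```

Only the points closest to the boundary (the support vectors) determine the hyperplane; moving the other points further away would leave the model unchanged.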

Build training and testing sets

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
        faces.data, faces.target, test_size=0.25, random_state=0)

Perform 5-fold cross-validation

In [None]:
from sklearn.model_selection import cross_val_score, KFold
from scipy.stats import sem

def evaluate_cross_validation(clf, X, y, K):
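    # Passing an integer as cv requests K-fold cross-validation (stratified for classifiers).
    # To shuffle explicitly, build an iterator and pass it as cv instead:
    #     cv = KFold(n_splits=K, shuffle=True, random_state=0)
    # by default the score used is the one returned by score method of the estimator (accuracy)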
    scores = cross_val_score(clf, X, y, cv=K)
    print (scores)
    print (("Mean score: {0:.3f} (+/-{1:.3f})").format(np.mean(scores), sem(scores)))

In [None]:
evaluate_cross_validation(svc_1, X_train, y_train, 5)
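
If you want control over shuffling, you can pass an explicit KFold iterator rather than an integer. The sketch below shows this on synthetic data from make_classification, which stands in for the face features so that the example runs on its own; the names X_demo, y_demo and cv are made up for the illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

# Synthetic two-class data standing in for the face features.
X_demo, y_demo = make_classification(n_samples=100, n_features=5, random_state=0)

# Explicit 5-fold iterator: rows are shuffled once, with a fixed seed,
# before being cut into folds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(SVC(kernel='linear'), X_demo, y_demo, cv=cv)

print(scores)
print("Mean score: {0:.3f}".format(np.mean(scores)))
```

Each of the five scores is the accuracy on one held-out fold after training on the other four.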

In [None]:
from sklearn import metrics

def train_and_evaluate(clf, X_train, X_test, y_train, y_test):
    
    clf.fit(X_train, y_train)
    
    print ("Accuracy on training set:")
    print (clf.score(X_train, y_train))
    print ("Accuracy on testing set:")
    print (clf.score(X_test, y_test))
    
    y_pred = clf.predict(X_test)
    
    print ("Classification Report:")
    print (metrics.classification_report(y_test, y_pred))
    print ("Confusion Matrix:")
    print (metrics.confusion_matrix(y_test, y_pred))

Let's measure precision and recall on the evaluation set, for _each class_. 

In [None]:
train_and_evaluate(svc_1, X_train, X_test, y_train, y_test)
# train_and_evaluate(linclf, X_train, X_test, y_train, y_test)
# train_and_evaluate(knn, X_train, X_test, y_train, y_test)
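
As a reminder of what the report contains, per-class precision and recall can be read straight off the confusion matrix. The sketch below does this by hand with numpy on a tiny set of made-up labels (three classes, eight samples); it is an illustration, not the code scikit-learn runs internally.

```python
import numpy as np

# Made-up true and predicted labels for three classes.
y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 2, 0, 2])
n_classes = 3

# Confusion matrix: rows are true classes, columns are predicted classes.
cm = np.zeros((n_classes, n_classes), dtype=int)
for t, p in zip(y_true, y_pred):
    cm[t, p] += 1

true_pos = np.diag(cm)
precision = true_pos / cm.sum(axis=0)  # correct / everything predicted as the class
recall = true_pos / cm.sum(axis=1)     # correct / everything truly in the class

print(cm)
print("precision per class:", precision)
print("recall per class:   ", recall)
```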

### Glasses or Not-Glasses?
Performance on face recognition is very good. Now, another problem: let's try to classify images of people with and without glasses. We have marked by hand the images of people wearing glasses.

In [None]:
# the index ranges of images of people with glasses
glasses = [
    (10, 19), (30, 32), (37, 38), (50, 59), (63, 64),
    (69, 69), (120, 121), (124, 129), (130, 139), (160, 161),
    (164, 169), (180, 182), (185, 185), (189, 189), (190, 192),
    (194, 194), (196, 199), (260, 269), (270, 279), (300, 309),
    (330, 339), (358, 359), (360, 369)
]

def tag_positive_examples(segments):
    # create a new y array of target size initialized with zeros
    y = np.zeros(faces.target.shape[0])
    # put 1 in the specified segments
    for (start, end) in segments:
        y[start:end + 1] = 1
    return y

have_glasses = tag_positive_examples(glasses)

X_train, X_test, y_train, y_test = train_test_split(
        faces.data, have_glasses, test_size=0.25, random_state=0)
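
Note that the index ranges are inclusive at both ends, which is why the slice uses end + 1. A small self-contained sketch of the same idea follows; tag_ranges is a hypothetical stand-alone variant that takes the array length as an argument instead of reading it from faces.

```python
import numpy as np

def tag_ranges(n, segments):
    """Return an array of n zeros with 1s in each inclusive (start, end) range."""
    y = np.zeros(n)
    for start, end in segments:
        y[start:end + 1] = 1  # + 1 because Python slices exclude the end index
    return y

# Indices 2..4 and the single index 7 are marked as positive.
labels = tag_ranges(10, [(2, 4), (7, 7)])
print(labels)  # [0. 0. 1. 1. 1. 0. 0. 1. 0. 0.]
```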

In [None]:
svc_2 = SVC(kernel='linear')
evaluate_cross_validation(svc_2, X_train, y_train, 5)
train_and_evaluate(svc_2, X_train, X_test, y_train, y_test)

Almost perfect! Now, let's hold out 10 images (all from the same person, sometimes with glasses and sometimes without glasses).  We'll separate the subject with indexes from 30 to 39.

The aim is to see if the SVM can pick up glasses-related features (and not be dependent on having seen a particular face with or without glasses in the training set). We'll train and evaluate in the rest of the 390 instances. After that, we'll evaluate again over the separated 10 instances.


In [None]:
X_test = faces.data[30:40]
y_test = have_glasses[30:40]

print(y_test.shape[0])

select = np.ones(have_glasses.shape[0]) # <-- an array with every entry set to 1
select[30:40] = 0 # <-- zero out the indices already assigned to testing
X_train = faces.data[select == 1] # <-- use the array as a mask to pull the same elements out of each of the following
X_images = faces.images[select == 1]
y_train = have_glasses[select == 1]

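# rjm49 - train and evaluate an SVM classifier with each kernel
svcs = []
for k in ("linear", "poly", "rbf"):
    this_svc = SVC(kernel=k)
    print("Kernel is", k)
    train_and_evaluate(this_svc, X_train, X_test, y_train, y_test)
    # y_pred is overwritten on every pass, so after the loop it holds
    # the predictions of the last kernel tried ("rbf")
    y_pred = this_svc.predict(X_test)
    svcs.append((k, this_svc))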
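
The hold-out used above relies on knowing that one subject occupies indices 30 to 39. A more general way to hold out every image of one person is to build the mask from the subject ids. The sketch below uses a small made-up array of ids and features; with the real data, faces.target would play the role of subject_ids (an assumption based on the target being the person ID).

```python
import numpy as np

# Hypothetical data: 4 subjects with 3 images each, 2 features per image.
subject_ids = np.repeat(np.arange(4), 3)
features = np.arange(12 * 2).reshape(12, 2)

held_out = 1  # the subject we keep aside for testing

# Boolean mask that is True for every image of the held-out subject.
test_mask = subject_ids == held_out

X_holdout = features[test_mask]   # all images of that subject
X_rest = features[~test_mask]     # everything else, used for training

print(X_holdout.shape, X_rest.shape)
```

Because the split is made per subject, the classifier never sees the held-out face during training, which is what the experiment above is testing.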

Show our test-set faces, and their predicted category. Picture number eight is incorrectly classified as no-glasses (probably because his eyes are closed!).

In [None]:
eval_faces = [np.reshape(a, (64, 64)) for a in X_test]
print("True labels")
print_faces(eval_faces, y_test, top_n=10)
print("Pred labels")
print_faces(eval_faces, y_pred, top_n=10)

### Using Kernel functions to improve performance

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import make_moons, make_circles, make_classification
from sklearn.metrics import f1_score
from matplotlib.colors import ListedColormap
X, y = make_classification(n_features=2, n_redundant=0, n_informative=2,
                           random_state=1, n_clusters_per_class=1)

Xhard, yhard = make_classification(n_features=2, n_redundant=0, n_informative=2,
                           random_state=1, n_clusters_per_class=2, class_sep=0.5)

datasets = [(X,y),
            (Xhard,yhard),
            make_moons(noise=0.3, random_state=0),
            make_circles(noise=0.2, factor=0.5, random_state=1),
            ]
classifiers = svcs

h = .02
i = 1
plt.figure(figsize=(12,3*len(datasets)))
for ds_cnt, ds in enumerate(datasets):
    # preprocess dataset, split into training and test part
    X, y = ds
    X = StandardScaler().fit_transform(X)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.4, random_state=42)

    x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
    y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))

    # just plot the dataset first
    cm = plt.cm.RdBu
    cm_bright = ListedColormap(['#FF0000', '#0000FF'])
    ax = plt.subplot(len(datasets), len(classifiers) + 1, i)
#     ax = plt.gca()
    if ds_cnt == 0:
        ax.set_title("Input data")
    # Plot the training points
    ax.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap=cm_bright,
               edgecolors='k')
    # Plot the testing points
    ax.scatter(X_test[:, 0], X_test[:, 1], c=y_test, cmap=cm_bright, alpha=0.6,
               edgecolors='k')
    ax.set_xlim(xx.min(), xx.max())
    ax.set_ylim(yy.min(), yy.max())
    ax.set_xticks(())
    ax.set_yticks(())
    i += 1
    
    # iterate over classifiers
    for name, clf in classifiers:
        ax = plt.subplot(len(datasets), len(classifiers) + 1, i)
#         ax = plt.gca()
        clf.fit(X_train, y_train)
#         score = clf.score(X_test, y_test)
        y_hats = clf.predict(X_test)
        score = f1_score(y_test, y_hats, average="weighted")

        # Plot the decision boundary. For that, we will assign a color to each
        # point in the mesh [x_min, x_max]x[y_min, y_max].
        if hasattr(clf, "decision_function"):
            Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
        else:
            Z = clf.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]

        # Put the result into a color plot
        Z = Z.reshape(xx.shape)
        ax.contourf(xx, yy, Z, cmap=cm, alpha=.8)

        # Plot the training points
        ax.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap=cm_bright,
                   edgecolors='k')
        # Plot the testing points
        ax.scatter(X_test[:, 0], X_test[:, 1], c=y_test, cmap=cm_bright,
                   edgecolors='k', alpha=0.6)

        ax.set_xlim(xx.min(), xx.max())
        ax.set_ylim(yy.min(), yy.max())
        ax.set_xticks(())
        ax.set_yticks(())
        if ds_cnt == 0:
            ax.set_title(name +" "+ type(clf).__name__)
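        # the score shown is the weighted F1 computed above, not accuracy;
        # lstrip('0') is applied to the number only, so 0.93 is shown as .93
        ax.text(xx.max() - .3, yy.min() + .3, 'f1=' + '{:.2f}'.format(score).lstrip('0'),
                size=15, horizontalalignment='right')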
        i += 1
    
plt.show()
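
The three kernels compared in the plots differ only in how they measure similarity between two points. The sketch below writes the three formulas out in plain numpy, following the definitions in the scikit-learn documentation: linear is the dot product, polynomial is (gamma * dot + coef0) ** degree, and RBF is exp(-gamma * squared distance). The function names are made up for the example, and gamma=1.0 is an assumption chosen for readability (SVC's own default is gamma='scale').

```python
import numpy as np

def linear_kernel(a, b):
    # plain dot product between every row of a and every row of b
    return a @ b.T

def poly_kernel(a, b, gamma=1.0, coef0=0.0, degree=3):
    # polynomial of the dot product; degree=3 and coef0=0.0 match SVC's defaults
    return (gamma * (a @ b.T) + coef0) ** degree

def rbf_kernel(a, b, gamma=1.0):
    # squared Euclidean distance between every pair of rows, via broadcasting
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists)

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])

K_lin = linear_kernel(pts, pts)
K_poly = poly_kernel(pts, pts)
K_rbf = rbf_kernel(pts, pts)

print(K_lin)
print(K_poly)
print(K_rbf)
```

The RBF similarity depends only on the distance between points, so it can carve out closed regions such as those needed for the circles dataset, whereas the linear kernel can only produce a straight boundary.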