<a href="https://colab.research.google.com/github/tonyscan6003/IntroToVision/blob/main/Example_2_2_eigenfaces.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Simple Facial Recognition with Eigenfaces

![Eigenfaces](https://github.com/tonyscan6003/CE6003/blob/master/images/eigenfaces.jpg?raw=true)

In this example we will obtain eigenfaces from a small dataset of faces.
We will demonstrate reconstruction of faces with the eigenfaces.
We will train a Support Vector Machine classifier and demonstrate facial recognition. (An SVM is appropriate here because the ["Olivetti Faces"](https://scikit-learn.org/stable/datasets/index.html#real-world-datasets) dataset contains 40 people with 10 example images of each.)

This example is based on the [scikit-learn tutorial](https://scipy-lectures.org/packages/scikit-learn/auto_examples/plot_eigenfaces.html).

**Housekeeping**: Import packages, load the sklearn dataset, and divide it into train and test splits.

In [None]:
from sklearn import datasets
from skimage import feature
from skimage import exposure
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn import decomposition
import numpy as np
import urllib
import cv2
import matplotlib.pyplot as plt

Load and display the Olivetti Faces. The dataset consists of 400 face images of size 64 x 64 = 4096 pixels. (It is built into sklearn and is very simple to load.) As the number of pixels per image is very large, we use eigenfaces to produce a more compact representation that we can use for matching/classification.

In [None]:

faces = datasets.fetch_olivetti_faces()
faces.data.shape
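
The relationship between the 64 x 64 images and the 4096-element rows can be sketched with a synthetic array. This is a minimal illustration, assuming each row of `faces.data` is simply the corresponding image in `faces.images` flattened row by row; the names `images`, `data` and `restored` are hypothetical.

```python
import numpy as np

# Synthetic stand-in for faces.images: 6 "images" of 64 x 64 pixels.
rng = np.random.default_rng(0)
images = rng.random((6, 64, 64))

# Flatten each image row by row into a 4096-element vector, as in faces.data.
data = images.reshape(len(images), -1)
print(data.shape)  # (6, 4096)

# Reshaping a row back to 64 x 64 recovers the original image exactly.
restored = data[0].reshape(64, 64)
print(np.array_equal(restored, images[0]))  # True
```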

In [None]:
fig = plt.figure(figsize=(8, 6))
# plot several images
for i in range(15):
    ax = fig.add_subplot(3, 5, i + 1, xticks=[], yticks=[])
    ax.imshow(faces.images[i], cmap=plt.cm.bone)

We will divide the dataset into train and test splits. (Note that the labels are also supplied in the sklearn dataset that we loaded.)

In [None]:
X_train, X_test, y_train, y_test = train_test_split(faces.data,
        faces.target, random_state=0)

print(X_train.shape, X_test.shape)
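
With only 10 images per person, a purely random split can leave some people with few training images. As an optional variation (not what the cell above does), the `stratify` argument of `train_test_split` keeps the per-person proportions equal in both splits. The sketch below uses synthetic labels; `X`, `y` and `train_counts` are hypothetical names.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dataset: 40 "people" with 10 samples each.
rng = np.random.default_rng(0)
X = rng.random((400, 16))
y = np.repeat(np.arange(40), 10)

# stratify=y keeps the per-person proportions the same in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Number of training samples available for each person.
train_counts = np.bincount(y_tr, minlength=40)
print(X_tr.shape, X_te.shape)
print(train_counts.min(), train_counts.max())
```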

## Creating the Eigenfaces
As we saw in the video lesson on eigenfaces, we use Principal Component Analysis (PCA) to obtain eigenvectors. The sklearn Python package has a built-in PCA class that lets us quickly determine the eigenfaces from the database of faces. The parameter `n_eigenfaces` sets the number of PCA components to use as eigenfaces; using fewer eigenfaces gives a more compact representation of the faces. However, as we will see in the next section, this representation is lossy and faces cannot be completely reconstructed.
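
To make the idea concrete, here is a minimal NumPy sketch of what PCA computes, using the singular value decomposition of the centred data. It runs on synthetic data rather than the faces, and it is an illustration of the technique, not necessarily how sklearn implements it internally; `n_keep` and the other names are hypothetical.

```python
import numpy as np

# Synthetic stand-in for X_train: 50 samples with 64 features each.
rng = np.random.default_rng(0)
X = rng.random((50, 64))

# Centre the data by subtracting the mean sample (the "mean face").
mean = X.mean(axis=0)
Xc = X - mean

# The rows of Vt are the principal directions, ordered by decreasing variance.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
n_keep = 5  # hypothetical number of components to retain
components = Vt[:n_keep]

# Variance captured along each retained direction.
explained_variance = S[:n_keep] ** 2 / (len(X) - 1)

print(components.shape)  # (5, 64)
print(np.allclose(components @ components.T, np.eye(n_keep)))  # True
```

Each row of `components` plays the role of one eigenface: a unit-length direction in pixel space, orthogonal to the others.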

In [None]:
n_eigenfaces = 5  # Number of eigenfaces to return (max 299)

In [None]:

pca = decomposition.PCA(n_components=n_eigenfaces, whiten=True)
pca.fit(X_train)


We will display the first 30 eigenfaces obtained from the PCA (or all of them if fewer were retained). The ith eigenvector is accessed with `pca.components_[i]`.

In [None]:
fig = plt.figure(figsize=(16, 6))
for i in range(min(30,n_eigenfaces)):
    ax = fig.add_subplot(3, 10, i + 1, xticks=[], yticks=[])
    ax.imshow(pca.components_[i].reshape(faces.images[0].shape),
              cmap=plt.cm.bone)

## Transforming the Database & Reconstructing faces
In this step we will transform the database of faces into its eigenface representation. This will give us a vector for each face whose length equals the number of eigenfaces we choose to retain (maximum number of eigenfaces = number of faces in the database - 1).

In the previous step we set the parameter `n_eigenfaces`, which controls how many eigenfaces to retain. We will investigate the quality of the face reconstruction by varying this number from a low value of 5 to the maximum value of 299, trying some values in between. The aim is to find a good minimum number of eigenfaces needed to properly represent the faces. (Make sure you are using enough eigenfaces before moving on to the Recognition step below.)
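
Besides inspecting reconstructions by eye, one common heuristic is to look at the cumulative explained variance. The sketch below is an optional aid, not part of the original workflow: it runs on synthetic data, and the 95% `threshold` is an arbitrary assumption.

```python
import numpy as np
from sklearn import decomposition

# Synthetic stand-in for X_train: low-rank structure (8 factors) plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 8))
mixing = rng.normal(size=(8, 100))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 100))

# Fit with all components, then accumulate the explained variance ratios.
pca_full = decomposition.PCA(svd_solver="full").fit(X)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components reaching a (hypothetical) 95% threshold.
threshold = 0.95
n_needed = int(np.searchsorted(cumulative, threshold) + 1)
print(n_needed)
```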

In [None]:
# Apply the transformation to the training and test sets.
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)
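
For reference, the projection performed by `transform` can be written out by hand. The sketch below uses synthetic data and a separately fitted model named `pca_demo` (a hypothetical name): each sample is centred with the learned mean, projected onto the components, and, because `whiten=True`, divided by the square root of each component's explained variance.

```python
import numpy as np
from sklearn import decomposition

# Synthetic stand-in for the face data: 60 samples with 30 features each.
rng = np.random.default_rng(0)
X = rng.random((60, 30))

pca_demo = decomposition.PCA(n_components=5, whiten=True, svd_solver="full").fit(X)

# Centre with the mean learned during fit, then project onto each component.
projected = (X - pca_demo.mean_) @ pca_demo.components_.T

# whiten=True additionally rescales each coordinate to unit variance.
manual = projected / np.sqrt(pca_demo.explained_variance_)

print(np.allclose(manual, pca_demo.transform(X)))  # True
```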

We will then apply the inverse PCA transform, which gives us back the reconstructed faces. (See the Stack Overflow post https://stackoverflow.com/questions/55533116/pca-inverse-transform-in-sklearn, which explains these two steps for 2D data.)

In [None]:
# Apply the inverse transform to training set
X_train_recon = pca.inverse_transform(X_train_pca)
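
The loss introduced by keeping only a few components can also be measured numerically. The following sketch, on synthetic data with hypothetical names, computes the mean squared reconstruction error for several component counts; the error should shrink as more components are kept.

```python
import numpy as np
from sklearn import decomposition

# Synthetic stand-in for X_train: 10 underlying factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(120, 10))
X = latent @ rng.normal(size=(10, 50)) + 0.1 * rng.normal(size=(120, 50))

errors = []
for k in (1, 5, 10, 20):
    pca_k = decomposition.PCA(n_components=k, whiten=True, svd_solver="full").fit(X)
    # Project to k coefficients, then map back to the original feature space.
    X_recon = pca_k.inverse_transform(pca_k.transform(X))
    errors.append(float(np.mean((X - X_recon) ** 2)))
    print(k, errors[-1])
```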

We will then plot some pairs of original faces from the dataset and the faces reconstructed from their eigenface representation. Change the number of eigenfaces (`n_eigenfaces`) above and re-run all the code cells from "Creating the Eigenfaces" to the cell below to see the quality of the reconstructed faces.

In [None]:
fig = plt.figure(figsize=(2, 20))
# Plot several pairs of images: original on the left, reconstruction on the right
for i in range(16):
    ax = fig.add_subplot(14, 2, i + 1, xticks=[], yticks=[])
    if i % 2 == 0:
        # Plot original face
        ax.imshow(X_train[i].reshape(64, 64), cmap=plt.cm.bone)
    else:
        # Plot reconstruction of the face shown in the previous subplot
        ax.imshow(X_train_recon[i - 1].reshape(64, 64), cmap=plt.cm.bone)
fig.suptitle('Pairs of original database faces & reconstructed faces, number of eigenfaces = ' + str(n_eigenfaces))

## Recognition
Now that we have selected an appropriate number of eigenfaces to represent our database, we can perform recognition between the test set and the training set. The test set has already been converted to its eigenface representation. In the code cell below we will train a Support Vector Machine classifier to perform recognition. We use an SVM because we have 10 images for each person in this dataset. (If we had just one image per person we would use k-nearest neighbours (KNN); KNN could also be used in this case.)
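
As a rough sketch of the nearest-neighbour alternative mentioned above, the snippet below runs sklearn's `KNeighborsClassifier` with one neighbour on synthetic vectors standing in for eigenface coefficients. The data, the cluster layout and all variable names are assumptions made for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for eigenface coefficients: 40 classes, 10 samples each.
rng = np.random.default_rng(0)
centres = rng.normal(scale=10.0, size=(40, 10))
y = np.repeat(np.arange(40), 10)
X = centres[y] + rng.normal(scale=0.5, size=(400, 10))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# 1-nearest-neighbour: each test vector takes the label of its closest training vector.
knn = KNeighborsClassifier(n_neighbors=1).fit(X_tr, y_tr)
accuracy = knn.score(X_te, y_te)
print(accuracy)
```

On real faces the classes are far less cleanly separated than in this toy data, so the accuracy printed here says nothing about performance on the Olivetti set.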

Initialise and train the Support Vector Machine.

In [None]:
clf = svm.SVC(C=5., gamma=0.001)
clf.fit(X_train_pca, y_train)
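
The PCA and SVM steps can also be chained into a single estimator, which guarantees that the PCA is fitted only on the data passed to `fit`. This is an optional variation, sketched here on synthetic data with sklearn's `make_pipeline`; the names `model` and `y_hat` and the toy train/test split are hypothetical.

```python
import numpy as np
from sklearn import decomposition, svm
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the faces: 5 classes with 20 samples each.
rng = np.random.default_rng(0)
centres = rng.normal(scale=3.0, size=(5, 50))
y = np.repeat(np.arange(5), 20)
X = centres[y] + rng.normal(size=(100, 50))

# Chain PCA and the SVM so both are fitted on the same training data in one call.
model = make_pipeline(
    decomposition.PCA(n_components=10, whiten=True),
    svm.SVC(C=5.0, gamma=0.001),
)
model.fit(X[::2], y[::2])        # even rows as a toy training set
y_hat = model.predict(X[1::2])   # odd rows as a toy test set
print(y_hat.shape)  # (50,)
```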

Apply the classifier to the test set and get a classification report.

In [None]:
y_pred = clf.predict(X_test_pca)
print(classification_report(y_test, y_pred))
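
Alongside the report, it can be useful to know exactly which test samples were misclassified. A minimal NumPy sketch, using small hypothetical label arrays in place of `y_test` and `y_pred`:

```python
import numpy as np

# Hypothetical true and predicted labels standing in for y_test and y_pred.
y_true = np.array([0, 1, 2, 2, 3, 3])
y_hat = np.array([0, 1, 2, 1, 3, 0])

accuracy = np.mean(y_true == y_hat)      # fraction of correct predictions
wrong = np.flatnonzero(y_true != y_hat)  # positions of the misclassified samples

print(accuracy)  # 4 of 6 correct
print(wrong)     # [3 5]
```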

Plot some examples from the test set. (Any errors in subject labelling are highlighted in red.)

In [None]:

fig = plt.figure(figsize=(20, 15))
for i in range(100):
    ax = fig.add_subplot(10, 20, i + 1, xticks=[], yticks=[])
    ax.imshow(X_test[i].reshape(faces.images[0].shape),
              cmap=plt.cm.bone)
    # Predict a single sample; use a separate name so the y_pred array is not overwritten
    pred = clf.predict(X_test_pca[i, np.newaxis])[0]
    color = ('black' if pred == y_test[i] else 'red')
    ax.set_title(y_test[i],
                 fontsize='small', color=color)