# Face Recognition

Face recognition algorithms depend on comparing a single sampled image to a large database to find the closest match. With a large database, it is not practical to compare each image one-on-one. One solution is to consider each image as a vector in a very high-dimensional vector space. The database is used to generate a basis for this vector space of all faces in such a way that the dimensions of the vector space may be compressed or flattened.  

As an example, think of a dataset consisting of many points with coordinates in three dimensions, but which all fall nearly, but not quite, on a single plane in space. These points are in three dimensions, but we can reasonably approximate them using only two dimensions. To do this, we think of the points as vectors and find two perpendicular unit (orthonormal) vectors on that plane. These vectors are a basis for this subspace (i.e., the plane within 3-D space). Any point can be expressed as just two numbers, its coordinates on that plane. A point's coordinates are found by projecting the vector representing it onto the basis vectors.
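The plane example can be sketched in a few lines of NumPy. This is only an illustration: the plane, the basis vectors `e1` and `e2`, and the point `p` are made-up values, not part of the face dataset.

```python
import numpy as np

# Two orthonormal vectors spanning a (hypothetical) plane in 3-D space
e1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
e2 = np.array([0.0, 0.0, 1.0])

# A point that lies close to, but not exactly on, that plane
p = np.array([2.0, 2.1, 3.0])

# Coordinates on the plane are the projections (dot products) onto e1 and e2
c1 = np.dot(p, e1)
c2 = np.dot(p, e2)

# Rebuild a 3-D point from just the two coordinates
p_approx = c1 * e1 + c2 * e2

# The leftover part is perpendicular to the plane and small
residual = p - p_approx
print("plane coordinates:", c1, c2)
print("approximation:", p_approx)
print("distance from plane:", np.linalg.norm(residual))
```

The two numbers `c1` and `c2` stand in for the three original coordinates, and the approximation misses the point only by its small distance from the plane.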

In this exercise, we will use this idea to project an image onto basis vectors to get its coordinates in the image database. We will find that the basis vectors are a much smaller set than the full dataset, allowing a very efficient means of processing images.

For face recognition and image processing, if you have a database of thousands of images, there aren't really thousands of dimensions of variability among the faces, and the database can be reduced to a small set of basis images that explain the important differences among the images. If we treat the images as vectors, they are actually members of a small subspace spanned by our basis images. Here, we will construct these basis images as the eigenvectors of the covariance matrix computed from the images. These details are not important here; what matters is that we have a set of basis images that describe the dataset.
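For the curious, here is a minimal sketch of how a basis could be obtained from the eigenvectors of a covariance matrix. It uses small random data rather than faces, and it assumes the mean image is subtracted first; how the notebook's EigenFaces.mat file was actually computed is not stated here. The names `A`, `C`, and `u_demo` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

npixel, nimg = 50, 8            # tiny stand-in for 6144 pixels and 39 faces
A = rng.random((npixel, nimg))  # each column plays the role of one image vector

# Remove the mean image, then form the pixel-by-pixel covariance matrix
A0 = A - A.mean(axis=1, keepdims=True)
C = A0 @ A0.T / (nimg - 1)

# eigh handles symmetric matrices and returns eigenvalues in ascending order
evals, evecs = np.linalg.eigh(C)
order = np.argsort(evals)[::-1]          # largest variance first
evals, evecs = evals[order], evecs[:, order]

# After removing the mean, at most nimg - 1 eigenvalues are non-zero
nbasis = nimg - 1
u_demo = evecs[:, :nbasis]               # columns are the basis vectors

print("leading eigenvalues:", evals[:nbasis])
print("orthonormal basis:", np.allclose(u_demo.T @ u_demo, np.eye(nbasis)))
```

Even though each vector has 50 entries, only a handful of eigenvectors carry any variance, which is the sense in which the dataset lives in a small subspace.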

The dataset consists of 40 Faces (fig below) and 39 "EigenFaces" (fig below). The eigenfaces were computed using only 39 faces, leaving one out for validation. You can see from the EigenFaces  that the first few contain a lot more information and we can likely ignore about half of the EigenFaces, which compresses the vector space to something like 10-20 dimensions.

We can consider each face as a vector with dimension equal to the number of pixels in the image, 96x64=6144. However, in practice, the faces all live in a subspace spanned by the 39 eigenvectors. This 39-dimensional space, in turn, is very "flat" (like a pancake in 3-D) and can be approximated by something like 20 dimensions, using the first 20 eigenvectors.

Read the code so you understand how it works. Change some of the options. The script will plot the coordinates of the selected faces in the basis of EigenFaces; you can see how the coordinates become quite small for higher numbered EigenFaces, showing that the vector space is flattened.

The script does 3 tasks:

1. Reconstruct an image in the training set (used to compute the EigenFace basis) using a small number of EigenFaces (try 10 or 20).
2. Reconstruct a messed-up image, like one that might be captured with a webcam.
3. Reconstruct an unknown image (not used to compute the EigenFace basis) using a small number of EigenFaces (try 10 or 20, or even all 39).

In [None]:
# Face recognition by Santiago Serrano
# http://www.pages.drexel.edu/~sis26/Eigencode.htm
# Modified by Eric Salathe 
from numpy import (
    linspace, array, zeros, log, exp, sin, cos, sqrt, pi, e,
    ones, arange, real, imag, sign, shape, dot, size,
    mean
    )

from numpy.random import rand

from matplotlib.pyplot import (
    plot,xlabel,ylabel,legend,show, figure, subplot, title, tight_layout, stem, pcolormesh,
    get_cmap
    )   
from scipy.io import loadmat

## Read in the image database
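This fills a 96x64x40 array called pics, where each 96x64 layer is a
picture and there are 40 pictures (layers).
The first picture is pics[:,:,0].
To make picture number i into a single column vector, use
X=pics[:,:,i].reshape((row*col,),order='F')
(order='F' stacks the columns of the image, matching how the EigenFaces are reshaped in the cells below).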
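The code cells in this notebook flatten images with `order='F'` (column-major), and the same order must be used when turning a vector back into an image. A tiny sketch with a made-up 3x2 array shows the difference from NumPy's default row-major order:

```python
import numpy as np

row, col = 3, 2                               # tiny stand-in for 96 x 64
img = np.arange(row * col).reshape(row, col)  # [[0, 1], [2, 3], [4, 5]]

# Column-major ('F') flattening stacks the columns of the image
x_f = img.reshape((row * col,), order='F')    # [0, 2, 4, 1, 3, 5]

# Default row-major ('C') flattening walks along the rows instead
x_c = img.reshape((row * col,))               # [0, 1, 2, 3, 4, 5]

# Reshaping back with the same order recovers the image exactly
img_back = x_f.reshape((row, col), order='F')
print(x_f, x_c, np.array_equal(img, img_back))
```

Mixing the two orders scrambles the pixels, so the image vector and the eigenface vectors need to share one convention.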

In [None]:
# read in images

infile=loadmat('Faces.mat')
pics = infile['pics']

row, col, mpictot = shape(pics) # image size
npixel = row*col # total pixels in each image

print('Read in ',mpictot,' images')
print('Each image contains ',npixel,' pixels')

## Read in the EigenFaces

The set of eigenfaces $\vec u_i$ forms a basis for our
vector space of faces. Only the top few eigenfaces are necessary
to represent the dataset, and we can flatten the vector space, that is, reduce its dimensions.
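Finding coordinates with plain dot products relies on the basis being orthonormal. The sketch below checks that property and recovers known coordinates, using a randomly generated orthonormal basis `q` as a hypothetical stand-in for the eigenfaces; it assumes nothing about the contents of EigenFaces.mat.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for u: 5 orthonormal columns in a 20-dimensional space
q, _ = np.linalg.qr(rng.standard_normal((20, 5)))

# For an orthonormal basis, q.T @ q is the identity matrix
gram = q.T @ q
print("orthonormal:", np.allclose(gram, np.eye(5)))

# Build a vector from known coordinates, then recover them with dot products
coords = np.array([4.0, -2.0, 1.0, 0.5, 0.1])
x = q @ coords
recovered = q.T @ x
print("recovered coordinates:", np.round(recovered, 3))
```

The same check, `u.T @ u` compared against the identity, could be run on the loaded `u` to confirm that the eigenfaces behave this way.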

In [None]:
infile=loadmat('EigenFaces.mat')
u=infile['u']
nn, meig = shape(u) # nn is the number of pixels; meig is the number of eigenvectors in our basis

## Image Atlas

### Image database

In [None]:
# plot all the faces, 5 to a row

fig=figure(1, figsize=[8,16])
cmap=get_cmap('gray')
figcol=5
figrow=int(mpictot/figcol)
for i in range(mpictot):
    subplot(figrow, figcol, i+1)
    pcolormesh(pics[:,:,i],cmap=cmap)
fig.suptitle('Face Database')

tight_layout()

### Eigen Faces

In [None]:
# plot all the eigen faces, 5 to a row

fig=figure(2, figsize=[8,16])
for i in range(meig):
    subplot(figrow, figcol, i+1)
    pcolormesh(u[:,i].reshape((row,col),order='F'),cmap=cmap)
fig.suptitle('Eigen Faces')
tight_layout()


## 1. Reconstruct an image in the training set using a reduced basis

In this part, select an image $\vec x$ from the training data and see how many eigenfaces are necessary to reproduce the image.

To reconstruct the image, we sum over the eigenvectors (i.e., the columns u[:,i]) after multiplying each by the appropriate coordinate value (sig[i]). The coordinates are found by taking the dot product between the face ($\vec x$) and the eigenfaces:

$$
\vec{X}_{rec} = \sum_{i=1}^{n} \sigma_i\vec{u}_i
$$

$$
\sigma_i = \vec{x} \cdot \vec{u}_i
$$

Xrec = sig(1)*u(1) + sig(2)*u(2) + ... + sig(n)*u(n)

where sig(i) are the coordinates, u(i) are the eigenvectors, and n is the number of eigenfaces kept (m_rec in the code).
In Python, we use a for loop to compute this sum.

You can change the selected image (i_img) and number of eigenfaces for reconstruction (m_rec).
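As an aside, the two loops can also be written as matrix products. The sketch below compares both forms on random stand-in data (`u_demo` and `x_demo` are hypothetical, not the face data), assuming an orthonormal basis stored column by column as in `u`.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stand-ins: 6 orthonormal basis vectors and one "image" vector
u_demo, _ = np.linalg.qr(rng.standard_normal((30, 6)))
x_demo = rng.standard_normal(30)
m_rec = 4

# Loop form, following the formulas above
sig_loop = np.zeros(6)
for i in range(6):
    sig_loop[i] = np.dot(x_demo, u_demo[:, i])
xrec_loop = np.zeros(30)
for i in range(m_rec):
    xrec_loop = xrec_loop + sig_loop[i] * u_demo[:, i]

# Matrix form: all coordinates at once, then a truncated weighted sum
sig_mat = u_demo.T @ x_demo
xrec_mat = u_demo[:, :m_rec] @ sig_mat[:m_rec]

print("same coordinates:", np.allclose(sig_loop, sig_mat))
print("same reconstruction:", np.allclose(xrec_loop, xrec_mat))
```

The loop version mirrors the sum in the formula term by term; the matrix version is shorter and usually faster for large bases.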

In [None]:
# Select an image from the training set and project it onto the eigenvectors
i_img = 2 # select an arbitrary image
m_rec =  15 # number of eigenfaces to use for reconstruction (<= meig); try 10 <= m_rec <= 20


######### Don't change below here #############

picture = pics[:,:,i_img]

# put the face into a vector
X = picture.reshape((row*col,),order='F')

# get the coordinates of this face in our eigenvector space using the dot product (dot(a,b) in numpy)
#         sig_i = X•u_i

sig=zeros(meig)
### Create a for loop and compute each sig[i]
for i in range(meig):
    sig[i] = dot(X, u[:,i])


# Loop and sum over the eigenvectors, weighted by their coordinates
Xrec = zeros(size(X)) # all zeros and size of the original vector
### Create a for loop and sum to get Xrec
for i in range(m_rec):
    Xrec = Xrec + sig[i]*u[:,i]




# Note that if we were to take the above sum for m_rec=meig, we'd be
# exactly inverting the computation in the earlier loop that computes the
# coordinates (for an image that lies in the span of the eigenvectors).
# By truncating the sum to fewer terms, we get an approximation.


# plot the image coordinates
figure(1)

ll = arange(meig)
stem(ll,sig) # this makes a "stem plot"
title('Weight of Input Face')

# draw the face
figure(2)
subplot(1,2,1)
cmap=get_cmap('gray')
pcolormesh(picture,cmap=cmap) 

# draw the reconstructed image.
subplot(1,2,2)
pcolormesh(Xrec.reshape((row,col),order='F'),cmap=cmap) 


## 2. Reconstruct a messed-up image

Now we will add noise to the image and see if we can still reconstruct it. This is like recognizing an image in our training data given a poor-quality or slightly different image.

You can adjust the noise level and the number of eigenfaces for reconstruction.
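One way to see why reconstruction can clean up a noisy image: projecting onto the basis discards the part of the noise that lies outside the subspace. The sketch below measures this on synthetic data; `u_demo`, `x_clean`, and the noise amplitude are hypothetical choices, with uniform noise used to mirror the `rand` call in the cell below.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical stand-ins: 10 orthonormal basis vectors in 500 dimensions
u_demo, _ = np.linalg.qr(rng.standard_normal((500, 10)))
x_clean = u_demo @ rng.standard_normal(10)   # lies exactly in the subspace

# Add uniform random speckle to every component
noise = 0.5 * rng.random(500)
x_noisy = x_clean + noise

# Project the noisy vector onto the basis and rebuild it
x_rec = u_demo @ (u_demo.T @ x_noisy)

# Compare the distance to the clean vector before and after projection
err_noisy = np.linalg.norm(x_noisy - x_clean)
err_rec = np.linalg.norm(x_rec - x_clean)
print("error of noisy vector:   ", err_noisy)
print("error after projection:  ", err_rec)
```

Only the component of the noise that happens to lie along the basis vectors survives, so the reconstruction ends up closer to the clean vector than the noisy input was.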

In [None]:
# Select an image and add some random speckle of specified amplitude

i_img = 2 # select an arbitrary image
m_rec =  15 # number of eigenfaces to use for reconstruction (<= meig); try 10 <= m_rec <= 20
noiselevel=2 # the magnitude of the noise relative to average; try different values

######### Don't change below here #############

picture = pics[:,:,i_img] + rand(row,col) * noiselevel*mean(pics[:,:,i_img]) 

# put the face into a vector
X = picture.reshape((row*col,),order='F')

# get the coordinates of this face in our eigenvector space using the dot product (dot(a,b) in numpy)
#         sig_i = X•u_i

sig=zeros(meig)
### Create a for loop and compute each sig[i]
for i in range(meig):
    sig[i] = dot(X, u[:,i])


# Loop and sum over the eigenvectors, weighted by their coordinates
Xrec = zeros(size(X)) # all zeros and size of the original vector
### Create a for loop and sum to get Xrec
for i in range(m_rec):
    Xrec = Xrec + sig[i]*u[:,i]




# Note that if we were to take the above sum for m_rec=meig, we'd be
# exactly inverting the computation in the earlier loop that computes the
# coordinates (for an image that lies in the span of the eigenvectors).
# By truncating the sum to fewer terms, we get an approximation.


# plot the image coordinates
figure(1)

ll = arange(meig)
stem(ll,sig) # this makes a "stem plot"
title('Weight of Input Face')

# draw the face
figure(2)
subplot(1,2,1)
cmap=get_cmap('gray')
pcolormesh(picture,cmap=cmap) 

# draw the reconstructed image.
subplot(1,2,2)
pcolormesh(Xrec.reshape((row,col),order='F'),cmap=cmap) 

## 3. Reconstruct an image not included in the training data
The image i_img=39 was not used for computing the eigenfaces u.
See what happens if you use this image. Will any number of eigenfaces reconstruct the image? What does that say about trying to identify a face when it is not in the database?
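A synthetic experiment can help frame these questions before trying the real image. The sketch below compares a vector built from the basis with a generic vector that was not; the basis `u_demo`, both test vectors, and the helper `rel_error` are hypothetical and unrelated to the face files.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical stand-in for u: 8 orthonormal columns in 100 dimensions
u_demo, _ = np.linalg.qr(rng.standard_normal((100, 8)))

def rel_error(x, basis, m):
    """Relative error after reconstructing x from the first m basis vectors."""
    x_rec = basis[:, :m] @ (basis[:, :m].T @ x)
    return np.linalg.norm(x - x_rec) / np.linalg.norm(x)

x_in = u_demo @ rng.standard_normal(8)   # built from the basis ("training" vector)
x_out = rng.standard_normal(100)         # generic vector, mostly outside the subspace

# Error for each vector as more basis vectors are included
for m in (2, 4, 8):
    print(m, rel_error(x_in, u_demo, m), rel_error(x_out, u_demo, m))
```

For the vector built from the basis, the error drops to round-off once all basis vectors are used; for the generic vector a large error remains no matter how many are included. A real face left out of the training set typically falls somewhere between these two extremes, since it still resembles the other faces.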

In [None]:
# Select the image left out of the training set and project it onto the eigenvectors
i_img = 39 # the image not used to compute the eigenfaces
m_rec =  15 # number of eigenfaces to use for reconstruction (<= meig); try 10 <= m_rec <= 20


######### Don't change below here #############

picture = pics[:,:,i_img]

# put the face into a vector
X = picture.reshape((row*col,),order='F')

# get the coordinates of this face in our eigenvector space using the dot product (dot(a,b) in numpy)
#         sig_i = X•u_i

sig=zeros(meig)
### Create a for loop and compute each sig[i]
for i in range(meig):
    sig[i] = dot(X, u[:,i])


# Loop and sum over the eigenvectors, weighted by their coordinates
Xrec = zeros(size(X)) # all zeros and size of the original vector
### Create a for loop and sum to get Xrec
for i in range(m_rec):
    Xrec = Xrec + sig[i]*u[:,i]




# Note that taking the above sum for m_rec=meig exactly inverts the
# computation in the earlier loop that computes the coordinates only if the
# image lies in the span of the eigenvectors. This image was left out of the
# training set, so some error may remain even with all the eigenvectors.


# plot the image coordinates
figure(1)

ll = arange(meig)
stem(ll,sig) # this makes a "stem plot"
title('Weight of Input Face')

# draw the face
figure(2)
subplot(1,2,1)
cmap=get_cmap('gray')
pcolormesh(picture,cmap=cmap) 

# draw the reconstructed image.
subplot(1,2,2)
pcolormesh(Xrec.reshape((row,col),order='F'),cmap=cmap) 
