<img src="./img/vs265header.svg"/>

<h1 align="center">Lab 3 - Unsupervised Learning </h1>

<h2 align="center"> Part 2 - Faces <font color="red"> [OPTIONAL] </font> </h2>

In [None]:
%pylab inline

import utils.lab3utils as util

faces = np.load('./data/yalefaces.npz')['faces'].T

## Eigenfaces

The file `data/yalefaces.npz` is used in this problem. It contains the Yale Faces Database, a collection of close-cropped pictures of faces. There are fifteen individuals in the database, each making eight facial expressions.

The data comes in as a 3965x120 array. Each column contains an image of a face. You must reshape the 3965x1 column vector into a 61x65 matrix to make it an image. For example, to view the first 32 faces, run the code in the cell below.
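
The reshape step described above can be sketched on a stand-in vector. This is only an illustration: `column` is a hypothetical placeholder for `faces[:, i]`, and the row-major layout used here is an assumption, so `util.faceReshape` may order or orient the pixels differently.

```python
import numpy as np

num_rows, num_cols = 61, 65  # image size stated in the text

# Hypothetical stand-in for one column of the faces array.
column = np.arange(num_rows * num_cols, dtype=float)

# Assumption: row-major pixel order; util.faceReshape may use another layout.
image = column.reshape(num_rows, num_cols)

# Flattening with the same order recovers the original column.
restored = image.reshape(-1)
```

The only point is that 61 x 65 = 3965, so each column holds exactly one image's worth of pixels.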

**Handy Jupyter Tip**: You can single-click the area to the left of the output to switch between a fixed-size output and a scrolling output. You can also double-click the same area to collapse the output entirely. This is useful in situations like this one, where the output takes up a lot of screen real estate.

In [None]:
plt.figure(figsize=(12,20))

N = 32 # Replace with 'faces.shape[1]' to view the whole dataset.

numCols = 4
numRows = N//numCols + ((N%numCols) > 0) # compute the grid shape from N, rounding up

for i in range(N):
    faceColumn = faces[:,i]
    faceImage = util.faceReshape(faceColumn)
    
    plt.subplot(numRows,numCols,i+1)
    
    util.facePlot(faceImage)

### The Average Face

Compute the average face and plot it with `util.facePlot`. The function `util.faceInitialize` below subtracts this face from the data before `sangerLearn` is run. Why do we need to do this? (Hint: see p. 201 of Hertz, Krogh, and Palmer.)
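
One way to explore the question above is with a toy dataset rather than the faces. The sketch below is not part of the lab utilities; every name in it is hypothetical. It assumes samples are stored in columns, as in `faces`, and compares the leading eigenvector of the second-moment matrix before and after the mean is subtracted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2 features x 500 samples, stored in columns.
# Most of the spread is along the second axis, but the mean sits far out on the first.
offset = np.array([[10.0], [0.0]])
spread = np.array([[0.5], [2.0]])
data = offset + spread * rng.standard_normal((2, 500))

mean_vector = data.mean(axis=1)          # average over samples (columns)
centered = data - mean_vector[:, None]   # subtract the mean from every column


def leading_direction(x):
    """Top eigenvector of the second-moment matrix x x^T / N."""
    second_moment = x @ x.T / x.shape[1]
    _, eigenvectors = np.linalg.eigh(second_moment)
    return eigenvectors[:, -1]  # eigh returns eigenvalues in ascending order


raw_direction = leading_direction(data)
centered_direction = leading_direction(centered)
```

Comparing `raw_direction` with `centered_direction` shows which one follows the offset and which one follows the spread of the data.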

In [None]:
averageFace =  # YOUR CODE HERE
util.facePlot(util.faceReshape(averageFace));

### Sanger's Rule for Faces

Use Sanger's Rule to learn the first four (or more if you like) principal components of the data (the so-called "eigenfaces"). Show what these look like (perhaps by using `util.facePlot`). You should be able to use the exact same `sangerLearn` code from the first half of the problem set.
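
As a reminder of the update being applied, here is a minimal sketch of a batch form of Sanger's rule on toy data. It is not the lab's `sangerLearn`. The function name `sanger_step`, the averaging over samples, and the toy learning rate are all assumptions made for this illustration. Data are stored in columns and each column of `weights` is one component, matching the shapes used in this notebook.

```python
import numpy as np


def sanger_step(data, weights, learning_rate):
    """One batch update of Sanger's rule (hypothetical helper).

    data:    (D, N) zero-mean samples stored in columns
    weights: (D, K) one weight vector per column
    """
    num_samples = data.shape[1]
    outputs = weights.T @ data                        # (K, N) output activities
    hebbian = data @ outputs.T / num_samples          # Hebbian term, (D, K)
    # Upper triangle: output i is corrected only by outputs j <= i.
    decay = weights @ np.triu(outputs @ outputs.T / num_samples)
    return weights + learning_rate * (hebbian - decay)


rng = np.random.default_rng(0)
stds = np.array([[3.0], [2.0], [1.0]])
data = stds * rng.standard_normal((3, 2000))          # toy data, variances ~9, 4, 1
data -= data.mean(axis=1, keepdims=True)

weights = 0.1 * rng.standard_normal((3, 2))
for _ in range(3000):
    weights = sanger_step(data, weights, learning_rate=0.01)

# Compare with the eigenvectors of the sample covariance (largest eigenvalue first).
_, eigenvectors = np.linalg.eigh(data @ data.T / data.shape[1])
top_two = eigenvectors[:, ::-1][:, :2]
alignment = np.abs(np.sum(weights * top_two, axis=0))
```

On this toy problem the learned columns line up with the top two eigenvectors up to sign. The face data have a very different scale, which is why the notebook uses a much smaller learning rate.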

In [None]:
numOutputs = 6 # how many PCs should we find?
learningRate = 1e-8 # decrease this value if you see "Warning: converting a masked element to nan"
faces, weights = util.faceInitialize(faces, numOutputs=numOutputs)

In [None]:
def sangerLearn():
    return   # COPY YOUR SOLUTION FROM lab3_1

In [None]:
numSteps = 2500 # increase if you want to find more PCs (takes a while!)

for i in range(numSteps):
    weights = sangerLearn(faces,weights,learningRate)

In [None]:
plt.figure(figsize=(20,6))

for i in range(numOutputs):
    plt.subplot(2,numOutputs,i+1)
    util.facePlot(util.faceReshape(weights[:,i]))
    plt.subplot(2,numOutputs,i+1+numOutputs)
    util.facePlot(-1*util.faceReshape(weights[:,i]))

### Dimensionality Reduction

Plot each face as a point in the two-dimensional space spanned by the first two PCs.
The coordinates of each point are obtained by projecting the corresponding face into that space.

Use the provided color list to plot a differently colored marker for each individual. Note that there are eight poses per individual, there are fifteen individuals, and they're arranged in order.

What would this projection look like for Gaussian data?
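
The projection described above amounts to a matrix product. The sketch below shows the shapes involved on toy data. It uses an SVD to get an orthonormal stand-in for the learned weight vectors, which is an assumption for illustration only; in the lab the projection vectors come from `weights`.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 5 features x 12 samples, stored in columns and mean-subtracted.
data = rng.standard_normal((5, 12))
data -= data.mean(axis=1, keepdims=True)

# Stand-in for two learned principal components (columns of `basis`).
left_vectors, _, _ = np.linalg.svd(data, full_matrices=False)
basis = left_vectors[:, :2]                 # (5, 2)

# Each column of `coordinates` holds one sample's (PC1, PC2) coordinates.
coordinates = basis.T @ data                # (2, 12)
```

Row 0 of `coordinates` is what would go on the x-axis of the scatter plot and row 1 on the y-axis.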

In [None]:
colorList = ['red','orange','yellow','blue','black','brown','gray',
             'skyblue','tomato','mediumspringgreen',
            'plum','darkcyan','indigo','darkolivegreen','hotpink']

numPoses = 8
numFaces = faces.shape[1]
numIndividuals = numFaces//numPoses #floor division

In [None]:
plt.figure(figsize=(12, 8))

projectionVectors = # YOUR CODE HERE - two vectors onto which we project

for i in range(numIndividuals):
    startIndex = i*numPoses; endIndex = startIndex+numPoses
    faceColumns = faces[:, startIndex:endIndex]
    projection = # YOUR CODE HERE - compute dot product of each face with each projectionVector
    plt.scatter(projection[0, :], projection[1, :], color=colorList[i])

plt.title('Projection of Faces onto the First Two Principal Components')
plt.xlabel('PC1 Projection'); plt.ylabel('PC2 Projection');

### Reconstruction

Pick a face and show what the reconstructions look like as you reconstruct with progressively more principal components. Remember to add back in the mean face before you reconstruct. If the first face you pick doesn't work, try several different faces. If your reconstructions are bad, you might also want to go back and learn more PCs (say, 8).
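
To see what to expect from the reconstructions, the sketch below repeats the procedure on toy data and tracks the reconstruction error as components are added. The components here come from an SVD of the centered data, which is an assumption for illustration; with learned weights that are only approximately orthonormal, the errors will not behave quite as cleanly.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: 6 features x 40 samples, with decreasing spread per feature.
raw = rng.standard_normal((6, 40)) * np.arange(6, 0, -1)[:, None]
mean_vector = raw.mean(axis=1, keepdims=True)
centered = raw - mean_vector

# Orthonormal stand-in for the learned components, strongest first.
components, _, _ = np.linalg.svd(centered, full_matrices=False)   # (6, 6)

sample = centered[:, [0]]          # keep column shape, as in the notebook
errors = []
for k in range(1, components.shape[1] + 1):
    basis = components[:, :k]
    # Project down to k coordinates, project back up, then add the mean back in.
    reconstruction = basis @ (basis.T @ sample) + mean_vector
    errors.append(np.linalg.norm(reconstruction - raw[:, [0]]))
```

With an orthonormal basis the error never increases as `k` grows, and it reaches zero once all components are used.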

In [None]:
faceIndex = 102

faceColumn = faces[:, faceIndex, None] #keeps column shape

maxComponents = weights.shape[1]

plt.figure(figsize=(20,4))

for i in range(maxComponents):
    projectionVectors = weights[:, 0:i+1] #weight vectors are projection vectors
    projection = projectionVectors.T @ faceColumn #get projection coordinates as before
    
    #now, return to the coordinates in the full face-space by "up-projecting"
    reconstruction = (projectionVectors @ projection) + averageFace[:, None] 

    if maxComponents > 4:
        plt.subplot(2, maxComponents//2 + maxComponents%2, i+1) # round up so odd counts fit
    else:
        plt.subplot(1, maxComponents, i+1)
    util.facePlot(util.faceReshape(reconstruction))

### Winner-Take-All Learning

Now train a WTA network on the faces data. You should experiment
with different numbers of units. Do the learned weight vectors appear any more
meaningful than those learned by PCA?

A side note for those familiar with the K-means clustering algorithm: the WTA network learning
rule basically performs stochastic gradient descent on the same objective
function as K-means. See the discussion on page 222 of Hertz, Krogh, and Palmer.
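
The connection to K-means can be sketched with an online winner-take-all update on toy data. This is not the lab's `WTALearn`. The helper `wta_step`, the use of Euclidean distance to pick the winner, and the initialization from one data point per cluster are all assumptions made for this illustration.

```python
import numpy as np


def wta_step(sample, prototypes, learning_rate):
    """Move only the prototype closest to `sample` toward it (hypothetical helper)."""
    distances = np.linalg.norm(prototypes - sample[:, None], axis=0)
    winner = np.argmin(distances)
    updated = prototypes.copy()
    updated[:, winner] += learning_rate * (sample - updated[:, winner])
    return updated


rng = np.random.default_rng(3)

# Two toy clusters in 2D; columns of `centers` are the cluster centers (0, 0) and (5, 5).
centers = np.array([[0.0, 5.0], [0.0, 5.0]])
cluster_a = centers[:, [0]] + 0.3 * rng.standard_normal((2, 100))
cluster_b = centers[:, [1]] + 0.3 * rng.standard_normal((2, 100))
data = np.hstack([cluster_a, cluster_b])

# Assumption: start each prototype at a data point, one from each cluster.
prototypes = data[:, [0, -1]].copy()
for _ in range(5):                                   # a few passes over the data
    for index in rng.permutation(data.shape[1]):     # one sample at a time
        prototypes = wta_step(data[:, index], prototypes, learning_rate=0.05)

distance_to_centers = np.linalg.norm(prototypes - centers, axis=0)
```

Each prototype ends up as a running average of the samples it wins, which is the stochastic counterpart of the K-means centroid update.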

In [None]:
def WTALearn():
    return  # COPY YOUR WTALearn() from lab3_1 here

In [None]:
# run this block once each time you want to train a new WTA network

numOutputs = 15; learningRate = 1

faces, weights = util.faceInitialize(faces, numOutputs=numOutputs)

In [None]:
# you can run this block more than once to keep training the same network

numSteps = 1000

for i in range(numSteps):
    weights = WTALearn(faces, weights, learningRate)

In [None]:
plt.figure(figsize=(20,8))
normalizingConstant = np.sqrt(np.sum(np.square(averageFace)))

for i in range(numOutputs):
    if numOutputs >= 8:
        plt.subplot(2, numOutputs//2+numOutputs%2, i+1)
    else:
        plt.subplot(1, numOutputs, i+1)
    util.facePlot(util.faceReshape(normalizingConstant * weights[:,i] + averageFace))