Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE". As a reminder, there is **NO COLLABORATION** whatsoever on the final.

---

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from ipywidgets import interact

# special imports for plotting faces in the widgets
from pca import plot_face, reconstruct

---
## Introduction

Principal components analysis (PCA) is a standard technique for dimensionality
reduction. It finds a lower-dimensional representation of a dataset that
preserves as much of the variance of the original data as possible. PCA is
often used to represent sets of images; in this problem we will use it to
represent faces. When applied to face images, the resulting principal
components are often called "eigenfaces."
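
To make "preserves as much of the variance as possible" concrete, here is a minimal sketch on synthetic data (not the faces dataset). The array `X`, the feature scales, and the name `pca_demo` are assumptions chosen for illustration; `explained_variance_ratio_` is the scikit-learn attribute that reports the fraction of total variance captured by each component.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data (NOT the faces dataset): 200 samples, 5 features,
# with most of the variance concentrated in the first two feature directions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([10.0, 5.0, 1.0, 0.5, 0.1])

pca_demo = PCA(n_components=5).fit(X)

# Fraction of the total variance captured by each component, largest first.
ratios = pca_demo.explained_variance_ratio_

# Running total: variance preserved when keeping only the first k components.
cumulative = np.cumsum(ratios)

print(np.round(ratios, 3))
print(np.round(cumulative, 3))
```

In this toy setup the first two components account for nearly all of the variance, so a two-dimensional representation loses very little information.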

The data file `faces.npy` contains a single array, `faces`. Each **row** in the array corresponds to an image of a face, and each **column** corresponds to a pixel. The entries in the array are pixel intensities. The provided function `plot_face` automatically reshapes a 361-dimensional row vector of pixel intensities back into a 19 x 19 image and plots it.

The code below loads the data and plots 25 random faces:

In [None]:
# Load in the face data and plot some random faces
faces = np.load("data/faces.npy")

fig, axes = plt.subplots(5, 5)
ix = np.random.randint(0, faces.shape[0], 25)
for i in range(25):
    plot_face(axes.flat[i], faces[ix[i]])

We can also plot the "average face" for the dataset. This is the "face" produced by averaging over all of the faces (rows) in the `faces` array.

In [None]:
# Plot average face
fig, axis = plt.subplots()
plot_face(axis, np.mean(faces, axis=0))

---
## Part A (2.5 points)

Fit the `PCA` class (imported from scikit-learn above) to the images (rows) in `faces`:

In [None]:
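# Run PCA on the face data, keeping all 361 components
pca = PCA(n_components=361)
pca.fit(faces)
# Stack the mean face (row 0) on top of the principal components (rows 1-361)
v = np.concatenate([pca.mean_[None], pca.components_])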
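
The layout of `v` can be illustrated with a small sketch on synthetic data (not the faces dataset). The names `X`, `pca_demo`, and `v_demo` are assumptions for illustration only; the attributes `mean_` and `components_` are the same scikit-learn attributes used in the cell above.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data (NOT the faces dataset): 30 samples, 4 features.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))

pca_demo = PCA(n_components=4).fit(X)

# pca_demo.mean_ has shape (4,); indexing with [None] adds a leading axis,
# giving shape (1, 4) so it can be stacked on top of the components.
v_demo = np.concatenate([pca_demo.mean_[None], pca_demo.components_])

print(v_demo.shape)  # one mean row followed by 4 component rows
print(np.allclose(v_demo[0], X.mean(axis=0)))           # row 0 is the mean
print(np.allclose(v_demo[1], pca_demo.components_[0]))  # row 1 is component 1
```

By the same logic, in this notebook row 0 of `v` holds the mean face, so indices 1 through 361 select the principal components.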

The widget below allows you to inspect the principal components computed from the `faces` array. Each component corresponds to a separate dimension along which the images in the faces dataset vary. One way to think of this is that each face in the dataset can be written as the average face plus a particular weighted combination of these principal components.

In [None]:
# Interactive visualization of all the different components
@interact
def show_component(component=(1, 361)):
    plt.close('all')
    fig, ax = plt.subplots()
    plot_face(ax, v[component])
    fig.set_figwidth(2)
    fig.set_figheight(2)
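
The "weighted combination" view described above can be checked numerically. The following is a minimal sketch on synthetic data (not the faces dataset); `X`, `pca_demo`, `weights`, and `rebuilt` are assumed names for illustration. It relies on the default scikit-learn behavior (no whitening), in which `transform` returns the weight of each sample on each component.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data (NOT the faces dataset): 50 samples, 6 features.
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 6))

# Keep every component so that no information is discarded.
pca_demo = PCA(n_components=6).fit(X)

# Weight of every sample on every component, shape (50, 6).
weights = pca_demo.transform(X)

# Each sample = mean + sum over j of (weight_j * component_j).
rebuilt = pca_demo.mean_ + weights @ pca_demo.components_

# With all components kept, the rebuilt data matches the original
# up to floating-point error.
print(np.allclose(rebuilt, X))
```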

<div class="alert alert-success">Describe what the first three principal components capture
about the images in the dataset. What properties of the images do they seem to correspond to? </div>

**Hint**: Take a look at the random faces we plotted in the introduction. What are a few of the most glaring differences across images? How might the first couple of principal components capture these differences?

YOUR ANSWER HERE

<div class="alert alert-success">
Compare these to the last three principal components. What properties do the last three principal components correspond to?
</div>

YOUR ANSWER HERE

---
## Part B (1.5 points)

We can also visualize how the reconstruction of a face changes as principal components are added to, or removed from, the average face. As you move the slider to the right, another principal component is added to the average face (components are added in the order they appear in Part A). As you move the slider to the left, the most recently added principal component is removed.

In [None]:
# Interactive visualization of the reconstruction of a face

@interact
def plot_reconstruction(face_index=(0, faces.shape[0]-1), components=(1, 200)):
    plt.close('all')
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3)

    plot_face(ax1, pca.mean_)
    plot_face(ax2, reconstruct(faces[face_index], v, components, pca))
    plot_face(ax3, faces[face_index])
    
    ax1.set_title('Average Face')
    ax2.set_title('Reconstruction')
    ax3.set_title('True Face')
    
    plt.draw()
    fig.set_figwidth(6)
    fig.set_figheight(2)
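
For reference, a reconstruction from only the first few components can be sketched as follows. This is not the provided `reconstruct` function (its implementation is not shown here); `reconstruct_with_k` is a hypothetical helper, and `X` is synthetic data rather than the faces dataset.

```python
import numpy as np
from sklearn.decomposition import PCA


def reconstruct_with_k(x, pca_model, k):
    """Hypothetical helper: rebuild sample x from the mean plus the first k components."""
    top_k = pca_model.components_[:k]          # shape (k, n_features)
    weights = (x - pca_model.mean_) @ top_k.T  # projection of x onto each component
    return pca_model.mean_ + weights @ top_k


# Synthetic stand-in data (NOT the faces dataset): 100 samples, 8 features
# whose scales shrink from 4.0 down to 0.5.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 8)) * np.linspace(4.0, 0.5, 8)
pca_demo = PCA(n_components=8).fit(X)

# Reconstruction error for one sample using 0, 1, ..., 8 components.
x = X[0]
errors = [np.linalg.norm(x - reconstruct_with_k(x, pca_demo, k)) for k in range(9)]
print(np.round(errors, 3))
```

With `k = 0` the helper returns the mean itself, which corresponds to the "Average Face" panel in the widget.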

<div class="alert alert-success">In general, what happens to the reconstruction as you (1) add and (2) remove principal components?</div>

YOUR ANSWER HERE

---
## Part C (3 points)

<div class="alert alert-success"> Does it seem like you need all 361 principal components to get a good
reconstruction of the original faces? In other words, after how many principal components are most faces clearly recognizable? At approximately how many principal components do you stop seeing a significant difference between the reconstruction and the true image? Comment on what this suggests about the dimensionality of our mental representation of faces.</div>

YOUR ANSWER HERE

---
## Part D (3 points)

<div class="alert alert-success"> Compare the reconstructions of the different faces. Can you identify any features of a face which predict whether its reconstruction will require more/fewer components to recognize? Explain why this might occur (your explanation should reference properties of the faces and of the PCA reconstructions). </div>

YOUR ANSWER HERE

---

Before turning this problem in, remember to do the following steps:

1. **Restart the kernel** (Kernel$\rightarrow$Restart)
2. **Run all cells** (Cell$\rightarrow$Run All)
3. **Save** (File$\rightarrow$Save and Checkpoint)

<div class="alert alert-danger">After you have completed these three steps, ensure that the following cell has printed "No errors!". If it has <b>not</b> printed "No errors!", then your code has a bug in it and has thrown an error! Make sure you fix this error before turning in your exam.</div>

In [None]:
print("No errors!")