# Advanced Python: Principal Component Analysis (PCA)

<center>
<img src="../pictures/escher_pattern.avif" style="width:676px;height:392px;">
<br>
<i>Day and Night (1938, M.C. Escher)</i>
</center>

Principal Component Analysis (PCA) is a powerful technique for dimensionality reduction and data visualization, and it is usually a good first step in data analysis in neuroscience, psychology, and beyond. In this week's sessions, we will explore PCA with two of scikit-learn's built-in datasets.

A secondary theme is to practice showing and manipulating *images* in Python. We will start with handwritten digits and progress to work with images of faces.

References:
1. [Importance of Feature Scaling](https://scikit-learn.org/stable/auto_examples/preprocessing/plot_scaling_importance.html)

## PCA with digits dataset

In [1]:
# import packages
import numpy as np
import matplotlib.pyplot as plt

In [2]:
# load dataset
from sklearn.datasets import load_digits
digits = load_digits()
digits_data = digits.data
digits_target = digits.target

### Warm-up exercises:
1. `digits_data` contains all the samples of handwritten digits and is of shape (n_samples, n_features); `digits_target` contains the corresponding labels.
   1. What is the shape of `digits_data` and `digits_target`?
   2. How many *unique* labels are there in `digits_target`?
2. Turns out, each sample in `digits_data` is a (square) image.
   1. Use `plt.imshow` to visualize the first 4 samples.
   > *Hint:* You should use `reshape` to convert the 1D array into a 2D array.
3. (Grayscale) images and 2D arrays of pixel values are two sides of the same coin. (Feel free to print out the values for some example images.) How can you flip an image horizontally or vertically using slicing? On your own, add or subtract images to create new images. What effects do you see?

In [3]:
# put your code here
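
If you get stuck on the reshaping or flipping steps, the sketch below shows the idea on a small toy array rather than on the digits themselves. The names `flat`, `image`, `flipped_lr` and `flipped_ud` are made up for this illustration; the same slicing should carry over to a reshaped row of `digits_data`.

```python
import numpy as np

# A toy "image": a flat vector of 16 pixel values,
# standing in for one row of digits_data.
flat = np.arange(16)

# Reshape the 1D vector into a square 2D array (4 x 4 here).
side = int(np.sqrt(flat.size))
image = flat.reshape(side, side)

# Reversing the column order mirrors the image left to right.
flipped_lr = image[:, ::-1]

# Reversing the row order turns the image upside down.
flipped_ud = image[::-1, :]

print(image)
print(flipped_lr)
print(flipped_ud)
```

Passing any of these 2D arrays to `plt.imshow` displays it as an image.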

### Preprocess the data
Prior to applying PCA, we will standardize the data by scaling each feature to zero mean and unit variance. (`sklearn.decomposition.PCA` centers the data internally, but it does not scale the features to unit variance.)

In [4]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
digits_data_scaled = scaler.fit_transform(digits_data)
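
As a quick sanity check, one could verify that the scaled features really have zero mean and unit variance. The sketch below reloads the data so that it runs on its own; the variable names (`X`, `X_scaled`, `constant`) are assumptions made for this example. One caveat it illustrates: a pixel that has the same value in every image has zero variance, and `StandardScaler` leaves such a feature at 0 instead of dividing by zero.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.preprocessing import StandardScaler

X = load_digits().data
X_scaled = StandardScaler().fit_transform(X)

# Per-feature (per-pixel) mean and standard deviation after scaling.
feature_means = X_scaled.mean(axis=0)
feature_stds = X_scaled.std(axis=0)

# Pixels that never vary across images cannot be scaled to unit variance.
constant = X.std(axis=0) == 0

print("largest |mean|:", np.abs(feature_means).max())
print("number of constant pixels:", constant.sum())
print("other pixels have std close to 1:",
      np.allclose(feature_stds[~constant], 1.0))
```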

### Run PCA
We can use the built-in PCA implementation from `sklearn.decomposition`. As with linear regression, we first need to create a PCA object and then fit it to the data.
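
A minimal sketch of these two steps might look as follows. It repeats the loading and scaling so that it runs on its own, and the name `digits_pca` is an assumption made for this example, not something defined elsewhere in this notebook.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

digits_data_scaled = StandardScaler().fit_transform(load_digits().data)

# Step 1: create the PCA object. With no arguments, all components are kept.
pca = PCA()

# Step 2: fit it to the data. fit_transform also returns each sample's
# coordinates in the basis of principal components.
digits_pca = pca.fit_transform(digits_data_scaled)

print(digits_pca.shape)
```

Passing `n_components=k` to `PCA` keeps only the first `k` components.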

### Exercises
1. Go to the documentation of `sklearn.decomposition.PCA` and find out what the *attributes* are for the PCA object.
2. Plot the *cumulative* explained variance ratio for our PCA object. What trend do you expect to see (if any)? How many components do you need to capture 60% of the variance? How many for 80%? How many for 90%?
3. How much variance is explained by the first 2 components? How about the first 3 components?
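
To avoid giving away the answers for the digits, here is a sketch of the same computation on synthetic data (an assumption for illustration only). It shows one way to turn `explained_variance_ratio_` into a cumulative curve and to find how many components reach a given threshold; `threshold` and `n_needed` are names made up for this example.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data (not the digits): 200 samples,
# 10 features with decreasing spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)) * np.arange(10, 0, -1)

pca = PCA().fit(X)

# Running total of the per-component variance ratios.
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio reaches the threshold.
threshold = 0.8
n_needed = int(np.argmax(cumulative >= threshold)) + 1

print(cumulative.round(3))
print(n_needed)
```

Calling `plt.plot(cumulative)` would draw the curve.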

### Exercises
1. Check the components of the PCA object. What are they?
   > *Hint:* You can use `pca.components_` to get the components.
2. Visualize the first 12 components as images.
3. Check sample 0's PCA projection. What are the most important components for this sample?
4. Visualize sample 0 now compressed into the first 12 components. What does it look like? How does it compare to the original image?
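
For the last exercise, one possible approach (a sketch, not the only way) is to fit a PCA with `n_components=12`, project the data, and map the projection back to pixel space with `inverse_transform`. The names `pca_k`, `scores`, `X_approx` and `sample0_approx` are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(load_digits().data)

k = 12
pca_k = PCA(n_components=k)
scores = pca_k.fit_transform(X_scaled)  # shape (n_samples, k)

# Map the k scores back into the full (scaled) pixel space.
X_approx = pca_k.inverse_transform(scores)

# With the default settings this is the same as a weighted sum of the
# components plus the feature means.
manual = scores @ pca_k.components_ + pca_k.mean_

# Undo the standardization and reshape sample 0 into a square image.
side = int(np.sqrt(X_scaled.shape[1]))
sample0_approx = scaler.inverse_transform(X_approx)[0].reshape(side, side)

print(np.allclose(X_approx, manual))
```

Showing `sample0_approx` next to the original sample with `plt.imshow` makes the effect of the compression visible.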