# Exercise 9: Principal and Independent Component Analysis (PCA & ICA)

## Exercise 9.1: Principal Component Analysis (PCA)
Principal component analysis (PCA) is a procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.

Let $X \in \mathbb{R}^{L \times d}$ be a given data matrix that contains $L$ `samples` $\vec{x}_i \in \mathbb{R}^d$, $i = 1,...,L$.
The principal components of $X$ can be derived by computing the `eigenvectors` $W = (\vec{w}_1, ..., \vec{w}_d)$ of the sample covariance matrix

$$
\begin{equation}
    C_X = \frac{1}{L-1} \sum_{i=1}^{L} \left(\vec{x}_{i} - \vec{\mu}_{X}\right) \left( \vec{x}_{i} - \vec{\mu}_{X} \right)^{T}
\end{equation},
$$
where
$$
\begin{equation}
    \vec{\mu}_{X} = \frac{1}{L}\sum_{i=1}^{L} \vec{x}_{i}
\end{equation}.
$$
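As a quick sanity check of the two formulas above, the following minimal sketch computes $C_X$ by hand and compares it with `np.cov`. The random data and the names `X`, `C_manual` and `C_numpy` are illustrative assumptions and are not part of the exercise files.

```python
import numpy as np

# Synthetic stand-in data (assumption): L = 500 samples, d = 3 dimensions
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))

# Sample mean and sample covariance, written out as in the formulas above
mu = X.mean(axis=0)
centered = X - mu
C_manual = centered.T @ centered / (X.shape[0] - 1)

# np.cov expects variables in rows by default, so rowvar=False is needed
# when every row of X is one sample
C_numpy = np.cov(X, rowvar=False)

print(np.allclose(C_manual, C_numpy))
```

Note that `np.cov` already uses the $\frac{1}{L-1}$ normalization by default, so no extra scaling is required.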

1. Implement the function `pca(samples)`, which computes the `eigenvectors` $W$ and the `eigenvalues` of $C_X$. Sort the `eigenvalues` and `eigenvectors` such that the $i$-th column of $W$ corresponds to the $i$-th largest `eigenvalue`.
2. Apply your function to the `samples` in file `data_9_1.npz`.
3. Use the function `utils.plot_principal_components` to visualize the principal components on the original data cloud.
4. Perform a _change of basis_, i.e. represent the `samples` $X$ within the obtained (`eigenvectors`) basis $W$ (take care to subtract the sample mean $\vec{\mu}_{X}$ from each sample $\vec{x}_{i}$ beforehand) and plot the result using `utils.plot_data`.
5. Do you observe a connection between the `eigenvalues` and the marginal sample variances after the transformation?
6. What does the covariance matrix look like before and after the transformation?

__Hint__: The NumPy functions `np.cov`, `np.linalg.eig` and `np.argsort` might be helpful.
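The steps listed above can be sketched as follows. This is one possible approach on synthetic data, not the reference solution: the function name `pca_sketch`, the `mixing` matrix and the generated samples are assumptions made for illustration, and `np.linalg.eigh` is swapped in for `np.linalg.eig` because the covariance matrix is symmetric.

```python
import numpy as np

def pca_sketch(samples):
    """Return eigenvalues (descending) and matching eigenvectors of C_X."""
    # Rows are samples, columns are dimensions
    covariance = np.cov(samples, rowvar=False)
    # eigh is intended for symmetric matrices and returns real eigenvalues
    # in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    # Reorder so that column i belongs to the i-th largest eigenvalue
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]

# Synthetic correlated 2-D data (assumption, stands in for data_9_1.npz)
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.0], [1.5, 0.5]])
samples = rng.normal(size=(2000, 2)) @ mixing.T + np.array([3.0, -1.0])

eigenvalues, W = pca_sketch(samples)

# Change of basis: subtract the sample mean, then project onto the columns of W
samples_pca = (samples - samples.mean(axis=0)) @ W

# Covariance in the new basis, to compare against the eigenvalues
cov_after = np.cov(samples_pca, rowvar=False)
print(eigenvalues)
print(cov_after)
```

Comparing the two printed results is one way to approach questions 5 and 6.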

In [None]:
import numpy as np
from matplotlib import pyplot as plt
from utils import utils_9 as utils
%matplotlib inline

In [None]:
def pca(samples):
    # TODO: calculate the covariance matrix
    covariance = 
    
    # TODO: calculate the eigenvalues and eigenvectors
    eigenvalues, eigenvectors = 
    
    # TODO: sort the eigenvectors and eigenvalues in descending order
    
    return (eigenvalues, eigenvectors)

samples = utils.load_data('data/data_9_1.npz')
eigenvalues, eigenvectors = pca(samples)
utils.plot_principal_components(samples, eigenvectors)

# TODO: perform a change of basis
samples_pca = 
utils.plot_data(samples_pca)

## Exercise 9.2: Independent Component Analysis (ICA)
This exercise considers Independent Component Analysis (ICA). On the website, you will find the file `data_9_2.npz`, which contains 10,000 `samples` of a two-dimensional data distribution. You will also find the Python function `utils.plot_joint(samples)`, which can be used to plot the marginal distributions of the data (i.e., the distributions of the first and second components of the data). Here, we want to perform ICA manually by implementing the following steps:

1. Load the `samples` and plot the marginal distributions using `utils.plot_joint`.
2. Compute and subtract the mean of the `samples`.
3. Remove correlations from the data by performing a PCA with your `pca` function. What does the data look like after this step has been applied?
4. Now, correlations have been removed from the data, but the dimensions do not have the same variance. In order to align the variances of the dimensions, each dimension is divided by its standard deviation. This step is also called _whitening_. Apply this whitening step to the data and visualize the data after this step has been performed. __Note__: The `pca` function also provides the eigenvalues of the covariance matrix of the data, which correspond to the variances of the data dimensions after the PCA has been applied.
5. The last step tries to maximize the statistical independence of the data dimensions by rotating the whitened data as follows: $Y = X A_{\theta}^{T}$, where

 $$
\begin{equation}
    A_{\theta} = \left(
        \begin{matrix}
            \cos(\theta) & -\sin(\theta) \\
            \sin(\theta)  & \cos(\theta)
        \end{matrix}
    \right)
\end{equation}
$$

 is the rotation matrix for an angle $\theta$. The angle is chosen such that the non-Gaussianity of the marginal data distributions is maximal, i.e., such that the marginal distributions are as dissimilar to a Gaussian distribution as possible. Use the kurtosis to measure the non-Gaussianity (the kurtosis of a Gaussian distribution is 3), i.e., (iteratively) find the rotation

 $$
\begin{equation}
    \theta^{*} = \underset{\theta}{\operatorname{argmax}}\left(
        \left\vert 
            \frac{1}{2} \left(\text{kurtosis}(y_1) + \text{kurtosis}(y_2) \right) - 3 
        \right\vert
    \right)
\end{equation}.
$$
 Again, visualize your result.
6. Why do we look for the rotation that leads to maximal non-Gaussianity in the resulting data dimensions?

__Hint__: The Python functions `utils.rotation_matrix` and `utils.kurtosis` might be helpful.
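The steps above can be sketched on synthetic data as follows. This is an illustrative sketch under stated assumptions, not the intended solution: the helpers `kurtosis` and `rotation_matrix` are local stand-ins (the signatures of `utils.kurtosis` and `utils.rotation_matrix` are not given here), the uniform `sources` and the `mixing` matrix are made up, and the optimal angle is found by a simple grid search.

```python
import numpy as np

def kurtosis(y):
    # Stand-in helper: fourth standardized moment (equals 3 for a Gaussian)
    z = (y - y.mean()) / y.std()
    return np.mean(z ** 4)

def rotation_matrix(theta):
    # Stand-in helper: 2-D rotation matrix A_theta as defined above
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Synthetic data (assumption): two independent uniform sources, linearly mixed
rng = np.random.default_rng(0)
sources = rng.uniform(-1.0, 1.0, size=(10000, 2))
mixing = np.array([[1.0, 0.5], [0.3, 1.0]])
samples = sources @ mixing.T

# Steps 2-4: center, decorrelate via the eigenvectors, divide by the std
centered = samples - samples.mean(axis=0)
eigenvalues, W = np.linalg.eigh(np.cov(centered, rowvar=False))
whitened = (centered @ W) / np.sqrt(eigenvalues)

def objective(theta):
    # |mean kurtosis of the rotated marginals - 3|, as in the formula above
    y = whitened @ rotation_matrix(theta).T
    return abs(0.5 * (kurtosis(y[:, 0]) + kurtosis(y[:, 1])) - 3.0)

# Step 5: grid search; a rotation by pi/2 only swaps the axes up to sign,
# so the interval [0, pi/2] covers all distinct cases
thetas = np.linspace(0.0, np.pi / 2, 181)
theta_star = thetas[np.argmax([objective(t) for t in thetas])]
samples_ica = whitened @ rotation_matrix(theta_star).T
print(theta_star, objective(theta_star))
```

A grid search is used here only because it is easy to follow; any one-dimensional optimizer over $\theta$ would serve the same purpose.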

In [None]:
# Load and plot the original data and its marginal distributions
samples = utils.load_data('data/data_9_2.npz')
utils.plot_joint(samples, 
                 title='Original Data')

# TODO: Perform a PCA and plot the data
samples_pca = 
utils.plot_joint(samples_pca, 
                 title='After PCA')

# TODO: Whiten and plot the data
samples_whitened = 
utils.plot_joint(samples_whitened, 
                 title='After Whitening')

# TODO: Perform a manual ICA (find the optimal rotation angle) and plot the data
samples_ica = 
utils.plot_joint(samples_ica, 
                 title='After ICA')