# Dimensionality Reduction

These notes cover several of the main dimensionality reduction techniques used in machine learning.

### Table of Contents
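1. Preliminaries
2. Principal Component Analysis
3. Linear Discriminant Analysis
4. Factor Analysis
5. Canonical Correlation Analysis
6. t-SNE
7. Resources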

## Preliminaries

Design / Data Matrix:
$$
\begin{aligned}
\underset{n\times m}{\mathbf{X}} &= \underset{n\,\text{samples}\,\times \,m\,\text{features}}{\begin{bmatrix} x_{11} & x_{12} & \ldots & x_{1m} \\ x_{21} & x_{22} & \ldots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \ldots & x_{nm} \\ \end{bmatrix}}
\end{aligned}
$$

Matrix of Ones (where $\mathbf{e}$ is the $n \times 1$ column vector of all ones):
$$
\begin{aligned}
\underset{n\times n}{\mathbf{e}\mathbf{e}^\top} &= \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 1 & 1 & \ldots & 1 \end{bmatrix} \\
&= \underset{n\,\times \,n}{\begin{bmatrix} 1 & 1 & \ldots & 1 \\ 1 & 1 & \ldots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \ldots & 1 \\ \end{bmatrix}}
\end{aligned}
$$

Matrix of Feature / Covariate Means:
$$
\begin{aligned}
\underset{n\times m}{\bar{\mathbf{X}}} &= \frac{1}{n}\cdot\mathbf{e}\mathbf{e}^\top\cdot \underset{n\times m}{\mathbf{X}} \\
&= \underset{n\,\text{duplicates of feature means}\,\times \,m\,\text{features}}{\begin{bmatrix} \bar{x}_{1} & \bar{x}_{2} & \ldots & \bar{x}_{m} \\ \bar{x}_{1} & \bar{x}_{2} & \ldots & \bar{x}_{m} \\ \vdots & \vdots & \ddots & \vdots \\ \bar{x}_{1} & \bar{x}_{2} & \ldots & \bar{x}_{m} \\ \end{bmatrix}}
\end{aligned}
$$
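
As a quick numerical check of the formula above, here is a minimal NumPy sketch. The toy matrix `X` and the variable names are assumptions made for illustration only; they are not part of the notes.

```python
import numpy as np

# Hypothetical toy data: n = 4 samples, m = 2 features.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0]])
n, m = X.shape

e = np.ones((n, 1))            # n x 1 column vector of ones
ones_matrix = e @ e.T          # n x n matrix of ones
X_bar = (ones_matrix @ X) / n  # n x m, every row holds the feature means

# Same result via broadcasting, without forming the n x n matrix.
X_bar_fast = np.broadcast_to(X.mean(axis=0), X.shape)

print(X_bar)
```

In practice the broadcasting form is usually preferable, since $\mathbf{e}\mathbf{e}^\top$ needs $n^2$ entries of memory.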

Sample Covariance Matrix (normalized by $\frac{1}{n - 1}$):
$$
\begin{aligned}
\underset{m\times m}{\Sigma} &= \frac{1}{n - 1} {(\underset{n\times m}{\mathbf{X}} - \underset{n\times m}{\bar{\mathbf{X}}})}^\top \cdot {(\underset{n\times m}{\mathbf{X}} - \underset{n\times m}{\bar{\mathbf{X}}})} \\
&= \frac{1}{n - 1} \begin{bmatrix} x_{11} - \bar{x}_{1} & x_{21} - \bar{x}_{1} & \ldots & x_{n1} - \bar{x}_{1} \\ x_{12} - \bar{x}_{2} & x_{22} - \bar{x}_{2} & \ldots & x_{n2} - \bar{x}_{2} \\ \vdots & \vdots & \ddots & \vdots \\ x_{1m} - \bar{x}_{m} & x_{2m} - \bar{x}_{m} & \ldots & x_{nm} - \bar{x}_{m} \\ \end{bmatrix} \cdot \begin{bmatrix} x_{11} - \bar{x}_{1} & x_{12} - \bar{x}_{2} & \ldots & x_{1m} - \bar{x}_{m} \\ x_{21} - \bar{x}_{1} & x_{22} - \bar{x}_{2} & \ldots & x_{2m} - \bar{x}_{m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} - \bar{x}_{1} & x_{n2} - \bar{x}_{2} & \ldots & x_{nm} - \bar{x}_{m} \\ \end{bmatrix} \\
&= \frac{1}{n - 1} \begin{bmatrix} 
\sum^{n}_{i = 1} {(x_{i1} - \bar{x}_{1})}{(x_{i1} - \bar{x}_{1})} & \sum^{n}_{i = 1} {(x_{i1} - \bar{x}_{1})}{(x_{i2} - \bar{x}_{2})} & \ldots & \sum^{n}_{i = 1} {(x_{i1} - \bar{x}_{1})}{(x_{im} - \bar{x}_{m})} \\ 
\sum^{n}_{i = 1} {(x_{i2} - \bar{x}_{2})}{(x_{i1} - \bar{x}_{1})} & \sum^{n}_{i = 1} {(x_{i2} - \bar{x}_{2})}{(x_{i2} - \bar{x}_{2})} & \ldots & \sum^{n}_{i = 1} {(x_{i2} - \bar{x}_{2})}{(x_{im} - \bar{x}_{m})} \\ 
\vdots & \vdots & \ddots & \vdots \\ 
\sum^{n}_{i = 1} {(x_{im} - \bar{x}_{m})}{(x_{i1} - \bar{x}_{1})} & \sum^{n}_{i = 1} {(x_{im} - \bar{x}_{m})}{(x_{i2} - \bar{x}_{2})} & \ldots & \sum^{n}_{i = 1} {(x_{im} - \bar{x}_{m})}{(x_{im} - \bar{x}_{m})} \\ \end{bmatrix} \\
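&= \begin{bmatrix} 
\operatorname{Var}(x_1) & \operatorname{Cov}(x_1, x_2) & \ldots & \operatorname{Cov}(x_1, x_m) \\ 
\operatorname{Cov}(x_2, x_1) & \operatorname{Var}(x_2) & \ldots & \operatorname{Cov}(x_2, x_m) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{Cov}(x_m, x_1) & \operatorname{Cov}(x_m, x_2) & \ldots & \operatorname{Var}(x_m) \\ 
\end{bmatrix}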
\end{aligned}
$$
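
The covariance formula above can be checked numerically. The following is a small sketch under assumed toy data (the matrix `X` is hypothetical); it compares the matrix-product form with `np.cov`, which also uses the $\frac{1}{n - 1}$ normalization by default.

```python
import numpy as np

# Hypothetical toy data: n = 3 samples, m = 2 features.
X = np.array([[2.0, 0.0],
              [4.0, 1.0],
              [6.0, 5.0]])
n = X.shape[0]

# Subtract the feature means from every row (X - X_bar in the notation above).
X_centered = X - X.mean(axis=0)

# Sample covariance: (X - X_bar)^T (X - X_bar) / (n - 1), an m x m matrix.
Sigma = (X_centered.T @ X_centered) / (n - 1)

# Cross-check; rowvar=False tells NumPy that columns are the features.
Sigma_np = np.cov(X, rowvar=False)

print(Sigma)
```

The diagonal of `Sigma` holds the feature variances and the off-diagonal entries hold the pairwise covariances, so the matrix is symmetric.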

---
# Principal Component Analysis

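1. Compute the covariance matrix $\Sigma$ from the mean-centered data.
2. Compute the singular value decomposition $\Sigma = U S V^\top$. Because $\Sigma$ is symmetric and positive semi-definite, the columns of $U$ are its eigenvectors and the singular values are its eigenvalues.
3. Take the first $k$ columns of $U$ (the eigenvectors with the $k$ largest eigenvalues) to form the $m \times k$ matrix $U_k$.
4. Project each mean-centered sample onto these directions: $\mathbf{z} = U_k^\top \cdot (\mathbf{x} - \bar{\mathbf{x}})$, where $\mathbf{x}$ is a sample written as an $m \times 1$ column vector, $\bar{\mathbf{x}}$ is the vector of feature means, and $\mathbf{z}$ is its $k$-dimensional representation.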
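
The steps above can be sketched in NumPy as follows. This is a minimal illustration rather than a reference implementation: the function name `pca_project` and the random toy data are assumptions, and the sketch subtracts the feature means before projecting.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X (n samples x m features) onto k principal directions.

    Hypothetical helper written for these notes.
    """
    mean = X.mean(axis=0)
    X_centered = X - mean

    # Step 1: sample covariance matrix (m x m).
    Sigma = (X_centered.T @ X_centered) / (X.shape[0] - 1)

    # Step 2: SVD of Sigma; NumPy returns singular values in descending order.
    U, S, _ = np.linalg.svd(Sigma)

    # Step 3: keep the first k columns of U.
    U_k = U[:, :k]

    # Step 4: row i of Z is U_k^T (x_i - mean), computed for all samples at once.
    Z = X_centered @ U_k
    return Z, U_k, S

# Assumed toy data: 200 samples, 3 features with very different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.1])

Z, U_k, S = pca_project(X, k=2)
print(Z.shape)                # (200, 2)
print(S[:2].sum() / S.sum())  # fraction of total variance kept by 2 components
```

The ratio of the retained singular values to their total is a common way to choose $k$, since each singular value of $\Sigma$ equals the variance of the data along the corresponding direction.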

---
# Linear Discriminant Analysis

---
# Factor Analysis

---
# Canonical Correlation Analysis

---
# t-SNE (t-distributed Stochastic Neighbor Embedding)

---
## Resources:

General:
- [Penn State's STAT505 Lesson Notes](https://newonlinecourses.science.psu.edu/stat505/)

PCA:
- [StatQuest: Principal Component Analysis (PCA), Step-by-Step](https://www.youtube.com/watch?v=FgakZw6K1QQ&t=16s)
- [Very intuitive and detailed explanation of PCA](https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues)
- [Andrew Ng on PCA](https://www.youtube.com/watch?v=rng04VJxUt4)