#### Main Approaches for Dimensionality Reduction
- Projection
![projection](./pics/projection.png)
- Manifold Learning
![SwissRoll](./pics/SwissRoll.png)

![unrolling](./pics/unrolling.png)

![boundary](./pics/boundary.png)

#### Principal Component Analysis (PCA)
> First it identifies the hyperplane that lies closest to the data, and then it projects the data onto it.

### PCA
#### Principal Components
- Singular Value Decomposition (SVD)
Singular Value Decomposition (SVD) is a standard matrix factorization technique that decomposes the training set matrix $X$ into the product of three matrices $U \cdot \Sigma \cdot V^T$, where the rows of $V^T$ (the columns of $V$) are the principal components.

```python
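import numpy as np

X_centered = X - X.mean(axis=0)  # PCA assumes centered data
# np.linalg.svd returns V already transposed, so V.T holds the PCs as columns
U, s, V = np.linalg.svd(X_centered)
c1 = V.T[:, 0]  # first principal component
c2 = V.T[:, 1]  # second principal component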
```
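
As a sanity check on the decomposition above, the following minimal sketch verifies three properties on toy data: the principal components are orthonormal, the three factors multiply back to the centered matrix, and the squared singular values give the share of variance along each component. The toy dataset and the name `Vt` (used here for what the snippet above calls `V`) are assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy data: 200 samples in 3-D, with most variance along one axis.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3)) @ np.diag([3.0, 1.0, 0.1])

X_centered = X - X.mean(axis=0)
# full_matrices=False returns the compact SVD; s holds the singular values.
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

# The rows of Vt (the principal components) are orthonormal unit vectors.
orthonormal = np.allclose(Vt @ Vt.T, np.eye(3))

# Multiplying the three factors back together recovers the centered data.
reconstructed = np.allclose(U @ np.diag(s) @ Vt, X_centered)

# Squared singular values are proportional to the variance along each component.
explained_variance_ratio = s**2 / np.sum(s**2)
```

Because NumPy returns the singular values in descending order, the components come out already sorted from most to least variance.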
#### Projecting Down to $d$ Dimensions
- Projecting the training set down to d dimensions
$$X_{d-proj} = X \cdot W_d$$

```python
W2 = V.T[:, :2]            # first two principal components as columns
X2D = X_centered.dot(W2)   # project onto the plane they define
```
- PCA inverse transformation, back to the original number of dimensions
$$X_{recovered}=X_{d-proj} \cdot W_{d}^{T}$$
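
The two formulas above can be sketched end to end with NumPy. This is an illustrative example on made-up data (3-D points lying close to a 2-D plane), not the only way to do it; the dataset, the variable names, and the step that adds the mean back after the inverse transform are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 3-D points lying close to a 2-D plane.
latent = rng.normal(size=(100, 2))
mixing = np.array([[1.0, 0.5, 0.2], [0.3, 1.0, -0.4]])
X = latent @ mixing + 0.01 * rng.normal(size=(100, 3))

mean = X.mean(axis=0)
X_centered = X - mean
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)

W2 = Vt.T[:, :2]                   # first two principal components as columns
X2D = X_centered @ W2              # projection: shape (100, 2)
X_recovered = X2D @ W2.T + mean    # inverse transform, then undo the centering

# Mean squared distance between the original and the recovered points.
reconstruction_error = np.mean(np.sum((X - X_recovered) ** 2, axis=1))
```

The recovered points are not identical to the originals: whatever variance lay along the dropped third component is lost, and `reconstruction_error` measures exactly that loss.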
#### Using Scikit-Learn
```python
from sklearn.decomposition import PCA

pca = PCA(n_components=2)
X2D = pca.fit_transform(X)  # PCA centers the data automatically
```
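
To connect the Scikit-Learn class with the manual SVD approach, here is a small sketch that checks both give the same projection on synthetic data. The dataset is made up for illustration. Since each principal component is only defined up to its sign, the comparison uses absolute values.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# Hypothetical data: 150 samples, 4 features with decreasing spread.
X = rng.normal(size=(150, 4)) @ np.diag([4.0, 2.0, 1.0, 0.5])

pca = PCA(n_components=2)
X2D = pca.fit_transform(X)          # PCA centers the data internally

# Manual projection via SVD for comparison.
X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
X2D_manual = X_centered @ Vt.T[:, :2]

# A component may come out with the opposite sign, so compare absolute values.
same_projection = np.allclose(np.abs(X2D), np.abs(X2D_manual), atol=1e-6)

# Map the 2-D points back to the original 4 dimensions (lossy).
X_recovered = pca.inverse_transform(X2D)
```

After fitting, `pca.components_` holds the principal components as rows, which is why the notes elsewhere transpose it before selecting a column.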

```python
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
```

```python
from sklearn.decomposition import PCA

pca = PCA()
X2D = pca.fit_transform(X)
```

```python
pca.components_.T[:, 0]
# array([-0.94556058,  0.32544614])
```

```python
pca.explained_variance_ratio_
# array([0.76412601, 0.23587399])
```

```python
import numpy as np

cumsum = np.cumsum(pca.explained_variance_ratio_)
np.argmax(cumsum >= 0.95) + 1
# 2
```

```python
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)
```

```python
pca.components_
# array([[-0.94556058,  0.32544614],
#        [-0.32544614, -0.94556058]])
```

![explained-variance](./pics/explained-variance.png)
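
The moons dataset has only two features, so choosing the number of dimensions is not very informative there. The sketch below repeats both approaches (cumulative sum and `n_components=0.95`) on a made-up 10-feature dataset whose variance is concentrated in the first few directions. The data and the `scales` vector are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
# Hypothetical data: 10 features whose standard deviations decay quickly.
scales = np.array([10, 6, 3, 1, 0.5, 0.3, 0.2, 0.1, 0.05, 0.01])
X = rng.normal(size=(1000, 10)) * scales

# Approach 1: fit a full PCA and find the first index that reaches 95%.
cumsum = np.cumsum(PCA().fit(X).explained_variance_ratio_)
d = int(np.argmax(cumsum >= 0.95)) + 1

# Approach 2: pass the target ratio as a float between 0 and 1.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)
```

Both routes should agree: `pca.n_components_` reports how many components the second approach kept, and it matches `d` from the first.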

#### Incremental PCA (IPCA)
```python
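import numpy as np
from sklearn.decomposition import IncrementalPCA

# X_mnist is assumed to hold the MNIST training images, one flattened image per row
n_batches = 100
inc_pca = IncrementalPCA(n_components=154)
# Feed the training set one mini-batch at a time
for X_batch in np.array_split(X_mnist, n_batches):
    inc_pca.partial_fit(X_batch)
X_mnist_reduced = inc_pca.transform(X_mnist)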
```

```python
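# filename, m (rows) and n (columns) must describe an existing binary file
X_mm = np.memmap(filename, dtype="float32", mode="readonly", shape=(m, n))
batch_size = m // n_batches
inc_pca = IncrementalPCA(n_components=154, batch_size=batch_size)
inc_pca.fit(X_mm)  # fit() processes the data in chunks of batch_size rows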
```
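
The memmap snippet above assumes the binary file already exists. Here is a self-contained sketch of the full round trip on a small random dataset: write the array to disk, reopen it read-only, and fit `IncrementalPCA` on it. The file location, dataset size, and number of components are hypothetical values chosen to keep the example fast.

```python
import os
import tempfile

import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(3)
m, n = 2000, 20                      # hypothetical dataset size
X = rng.normal(size=(m, n)).astype("float32")

# Write the data to a binary file on disk through a writable memmap.
filename = os.path.join(tempfile.mkdtemp(), "data.mmap")
X_out = np.memmap(filename, dtype="float32", mode="w+", shape=(m, n))
X_out[:] = X
X_out.flush()                        # make sure everything is written to disk
del X_out

# Reopen the file read-only, as in the snippet above.
X_mm = np.memmap(filename, dtype="float32", mode="readonly", shape=(m, n))

n_batches = 10
inc_pca = IncrementalPCA(n_components=5, batch_size=m // n_batches)
inc_pca.fit(X_mm)
X_reduced = inc_pca.transform(X_mm)
```

The `dtype` and `shape` passed when reopening must match the ones used when writing, because the raw file stores no metadata about its layout.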

#### Randomized PCA
```python
from sklearn.decomposition import PCA

rnd_pca = PCA(n_components=154, svd_solver="randomized")
X_reduced = rnd_pca.fit_transform(X_mnist)
```
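
Randomized PCA computes an approximation of the leading components. As a rough illustration of how close that approximation can be, the sketch below compares the explained variance ratios from the `"full"` and `"randomized"` solvers on made-up low-rank data. The dataset, the `random_state`, and the choice of 10 components are assumptions of this example, and results on real data may differ.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
# Hypothetical data: 500 samples, 100 features, rank-10 signal plus small noise.
X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 100))
X += 0.01 * rng.normal(size=X.shape)

full_pca = PCA(n_components=10, svd_solver="full").fit(X)
rnd_pca = PCA(n_components=10, svd_solver="randomized", random_state=42).fit(X)

# Largest difference between the two sets of explained variance ratios.
max_diff = np.max(
    np.abs(full_pca.explained_variance_ratio_ - rnd_pca.explained_variance_ratio_)
)
```

On data like this, where a few components dominate, `max_diff` should be very small; the randomized solver pays off mainly when the number of requested components is much smaller than the number of features.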

![Figure 8-10](./pics/Figure8-10.png)