# 📉 Chapter 8: Dimensionality Reduction — Hands-On Guide

Reducing the number of features (dimensions) speeds up computation, makes visualization possible, and mitigates the "curse of dimensionality."
Let's explore the key techniques with practical examples!

---
## 1. 🧠 The Curse of Dimensionality

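- In high dimensions:
  - Data becomes sparse: a fixed number of samples covers less and less of the space
  - Distance metrics lose meaning: the nearest and farthest neighbors of a point end up almost equally far away
  - Models tend to overfit, since more features give more ways to fit noise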

**Solution:** apply dimensionality reduction to simplify data while retaining important structure.
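
The claim that distances lose meaning can be checked numerically. The sketch below is only an illustration: it assumes points drawn uniformly from the unit hypercube, and the helper name `relative_contrast` and the sample size of 500 are hypothetical choices. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance.

```python
import numpy as np

rng = np.random.default_rng(42)

def relative_contrast(n_dims, n_points=500):
    """(max - min) / min of the distances from one random query point
    to n_points points drawn uniformly from the unit hypercube."""
    points = rng.random((n_points, n_dims))
    query = rng.random(n_dims)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

# Compare low- and high-dimensional spaces
contrasts = {d: relative_contrast(d) for d in (2, 10, 100, 1000)}
for d, c in contrasts.items():
    print(f"{d:>5} dims: relative contrast = {c:.2f}")
```

The contrast should shrink as the dimension grows, which is why nearest-neighbor style reasoning becomes less reliable on raw high-dimensional data.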

---
## 2. 🔍 Main Approaches to Dimensionality Reduction

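- **Projection methods:** Project the data onto new axes (e.g., PCA, or Kernel PCA in a kernel-induced feature space)
- **Manifold learning:** Unroll a lower-dimensional manifold while preserving local relationships (e.g., LLE, Isomap)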

---
## 3. 🧮 Principal Component Analysis (PCA)

- PCA finds orthogonal axes (principal components), ordered so that each one captures as much of the remaining variance as possible.
- Useful for visualization, compression, and noise reduction.
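
To make "orthogonal axes that capture the most variance" concrete, here is a minimal sketch that derives the first two components from an SVD of the centered data and compares them with scikit-learn's `PCA`. The variable names (`X_demo`, `components_svd`, and so on) are illustrative, and the comparison ignores sign because each axis is only defined up to a flip.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X_demo = load_digits().data

# PCA centers the data, then uses the right singular vectors as principal axes
X_centered = X_demo - X_demo.mean(axis=0)
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
components_svd = Vt[:2]                    # first two principal axes
X2D_svd = X_centered @ components_svd.T    # projection onto those axes

pca_demo = PCA(n_components=2, svd_solver="full").fit(X_demo)

# The principal axes are orthonormal: their Gram matrix is the identity
gram = pca_demo.components_ @ pca_demo.components_.T
print("Orthonormal axes:", np.allclose(gram, np.eye(2)))

# Manual and scikit-learn components agree up to sign
agree = np.allclose(np.abs(components_svd), np.abs(pca_demo.components_), atol=1e-6)
print("Components match up to sign:", agree)

# Variance along each axis is the squared singular value over (n_samples - 1)
variances_svd = s[:2] ** 2 / (len(X_demo) - 1)
print("Variances match:", np.allclose(variances_svd, pca_demo.explained_variance_))
```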

### A. Basic PCA projection to 2D

In [None]:
from sklearn.decomposition import PCA
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt

# Load the digits dataset
digits = load_digits()
X = digits.data
y = digits.target

# Apply PCA to reduce to 2 dimensions for visualization
pca = PCA(n_components=2)
X2D = pca.fit_transform(X)

# Plot the projection
plt.figure(figsize=(8,6))
scatter = plt.scatter(X2D[:, 0], X2D[:, 1], c=y, cmap='tab10', s=15)
plt.colorbar(scatter, ticks=range(10))
plt.title("Digits dataset projected via PCA")
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.show()

---
### B. Variance Explained & Choosing Number of Components

In [None]:
import numpy as np

# Fit PCA with all components to examine explained variance
pca_full = PCA()
pca_full.fit(X)
explained_variance = pca_full.explained_variance_ratio_
cumulative_variance = np.cumsum(explained_variance)

# Plot cumulative explained variance against the number of components kept
plt.plot(np.arange(1, len(cumulative_variance) + 1), cumulative_variance, marker='o')
plt.xlabel('Number of Components')
plt.ylabel('Cumulative Explained Variance')
plt.title('Explained Variance by PCA Components')
plt.grid(True)
plt.show()

# Find the smallest number of components reaching 95% cumulative variance
# (argmax returns the index of the first True, hence the +1)
n_components_95 = (cumulative_variance >= 0.95).argmax() + 1
print(f"Number of components for 95% variance: {n_components_95}")

---
### C. PCA for Dimensionality Reduction (compression)

In [None]:
# Reduce to enough components to preserve 95% variance
pca_95 = PCA(n_components=0.95)
X_reduced = pca_95.fit_transform(X)
print("Original shape:", X.shape)
print("Reduced shape:", X_reduced.shape)
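
Compression is only useful if the data can be approximately recovered. The following sketch, with illustrative variable names, maps the reduced data back to 64 dimensions with `inverse_transform` and measures what was lost. For PCA, the fraction of variance lost in reconstruction should match one minus the preserved variance ratio.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X_demo = load_digits().data

# Compress, then map back to the original 64-dimensional space
pca_demo = PCA(n_components=0.95)
X_reduced_demo = pca_demo.fit_transform(X_demo)
X_recovered = pca_demo.inverse_transform(X_reduced_demo)

# Mean squared reconstruction error per pixel
mse = np.mean((X_demo - X_recovered) ** 2)

# Share of the total variance lost by the compression
residual = np.sum((X_demo - X_recovered) ** 2)
total = np.sum((X_demo - X_demo.mean(axis=0)) ** 2)
lost_fraction = residual / total

print(f"Reconstruction MSE: {mse:.4f}")
print(f"Variance lost: {lost_fraction:.4f}")
print(f"Variance kept: {pca_demo.explained_variance_ratio_.sum():.4f}")
```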

---
### D. Fast PCA with Randomized SVD (for large datasets)

In [None]:
pca_rand = PCA(n_components=50, svd_solver='randomized', random_state=42)
X_rand = pca_rand.fit_transform(X)
print("Randomized PCA shape:", X_rand.shape)
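
The randomized solver computes an approximation of the leading components, so it is worth checking how close it gets. The sketch below (illustrative names, same 50 components as above) compares the preserved variance under the exact and randomized solvers. Digits has only 64 features, so no speed-up should be expected here; the check only concerns accuracy.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X_demo = load_digits().data

# Exact SVD versus randomized approximation, same number of components
pca_exact = PCA(n_components=50, svd_solver="full").fit(X_demo)
pca_approx = PCA(n_components=50, svd_solver="randomized", random_state=42).fit(X_demo)

var_exact = pca_exact.explained_variance_ratio_.sum()
var_approx = pca_approx.explained_variance_ratio_.sum()
gap = abs(var_exact - var_approx)

print(f"Full SVD:       {var_exact:.4f}")
print(f"Randomized SVD: {var_approx:.4f}")
print(f"Absolute gap:   {gap:.6f}")
```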

---
### E. Incremental PCA (for large datasets or streaming data)

In [None]:
import numpy as np
from sklearn.decomposition import IncrementalPCA

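# Simulate batching on the digits data.
# np.array_split takes the number of batches, so this yields 17 batches
# of 105-106 samples each (close to, but not exactly, batch_size).
batch_size = 100
n_batches = len(X) // batch_size
ipca = IncrementalPCA(n_components=50)
for X_batch in np.array_split(X, n_batches):
    ipca.partial_fit(X_batch)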

X_ipca = ipca.transform(X)
print("Incremental PCA shape:", X_ipca.shape)

---
## 4. 🧪 Nonlinear Techniques

- **Kernel PCA:** captures nonlinear structures via kernels.
- **Locally Linear Embedding (LLE):** preserves local relationships.
- **t-SNE:** excellent for visualization of high-dimensional data.
- **Isomap:** preserves geodesic distances on manifolds.
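
Isomap and t-SNE from the list above can be tried with the same `fit_transform` pattern as the other techniques. This is a minimal sketch under a few assumptions: it uses only the first 500 digits to keep the run short, and the `n_neighbors` and `perplexity` values are arbitrary starting points rather than tuned settings. `trustworthiness` scores how well local neighborhoods survive the embedding (values closer to 1 are better).

```python
from sklearn.datasets import load_digits
from sklearn.manifold import Isomap, TSNE, trustworthiness

digits_demo = load_digits()
X_small = digits_demo.data[:500]      # subset chosen only for speed
y_small = digits_demo.target[:500]

# Isomap: preserves geodesic distances along the neighborhood graph
X_iso = Isomap(n_components=2, n_neighbors=15).fit_transform(X_small)

# t-SNE: keeps similar points close together, mainly for visualization
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=42)
X_tsne = tsne.fit_transform(X_small)

# Score how well each 2D embedding preserves local neighborhoods
trust_iso = trustworthiness(X_small, X_iso, n_neighbors=5)
trust_tsne = trustworthiness(X_small, X_tsne, n_neighbors=5)
print(f"Isomap trustworthiness: {trust_iso:.3f}")
print(f"t-SNE trustworthiness:  {trust_tsne:.3f}")
```

Either embedding can be plotted with `plt.scatter(X_tsne[:, 0], X_tsne[:, 1], c=y_small, cmap='tab10', s=15)`, in the same way as the PCA projection.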

---
### A. Kernel PCA

In [None]:
from sklearn.decomposition import KernelPCA

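# Scale pixels from [0, 16] to [0, 1]: the RBF kernel is sensitive to feature scale,
# and on raw pixel values gamma=0.04 drives almost all kernel entries to zero
X_scaled = X / 16.0

# Apply Kernel PCA with RBF kernel
kpca = KernelPCA(n_components=2, kernel='rbf', gamma=0.04)
X_kpca = kpca.fit_transform(X_scaled)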

# Visualize
plt.scatter(X_kpca[:, 0], X_kpca[:, 1], c=y, cmap='tab10', s=15)
plt.title("Digits via Kernel PCA (RBF kernel)")
plt.xlabel("Kernel PC 1")
plt.ylabel("Kernel PC 2")
plt.show()

---
### B. Locally Linear Embedding (LLE)

In [None]:
from sklearn.manifold import LocallyLinearEmbedding

lle = LocallyLinearEmbedding(n_components=2, n_neighbors=30, method='standard')
X_lle = lle.fit_transform(X)

plt.scatter(X_lle[:, 0], X_lle[:, 1], c=y, cmap='tab10', s=15)
plt.title("Digits via LLE")
plt.xlabel("LLE Dim 1")
plt.ylabel("LLE Dim 2")
plt.show()

---
## 5. Summary & Use Cases

| Technique | Type | Use Case |
| --- | --- | --- |
| PCA | Linear projection | Linear reduction, compression, noise reduction |
| Randomized PCA | Linear projection | Large datasets (fast approximate SVD) |
| Incremental PCA | Linear projection | Large datasets processed in batches, streaming data |
| Kernel PCA | Nonlinear projection | Nonlinear structures |
| LLE | Manifold learning | Preserving local structure |
| t-SNE | Manifold learning | Visualizing clusters in high-dimensional data |
| Isomap | Manifold learning | Preserving geodesic distances |

Choose based on dataset size, linearity, and visualization needs.

---
## 6. Practice Exercises

1. Apply PCA to a 5D dataset, reconstruct the original data, and compute the reconstruction error.
2. Compare Kernel PCA with RBF vs polynomial kernels on labeled data.
3. Use LLE and Isomap on a Swiss roll dataset.
4. Visualize MNIST digits with t-SNE to identify clusters.