# Real-World Use Case: Image Compression (Eigenfaces)

## 1. The Problem
We need to store thousands of face images in a database. Storing raw pixels is expensive. Can we compress the faces while keeping the recognizable features?

## 2. Why PCA?
Faces share a lot of structure (eyes above the nose, nose above the mouth, a roughly oval outline). PCA finds these common patterns ("Eigenfaces"). We can approximately reconstruct any face as the mean face plus a weighted sum of a small number of these patterns.
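
To make the "weighted sum of patterns" idea concrete, here is a minimal NumPy sketch. It uses synthetic data rather than real faces: the 200 samples, 64 "pixels", and 5 hidden patterns are assumptions chosen only for illustration, and PCA is done directly through an SVD of the mean-centered data.

```python
import numpy as np

# Synthetic stand-in for face data (an assumption for illustration):
# 200 "images" of 64 pixels, each a mix of 5 hidden patterns plus slight noise.
rng = np.random.default_rng(0)
patterns = rng.normal(size=(5, 64))
weights = rng.normal(size=(200, 5))
X = weights @ patterns + 0.01 * rng.normal(size=(200, 64))

# PCA via SVD of the mean-centered data.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)

k = 5
components = Vt[:k]                # the k shared patterns, shape (k, 64)
codes = (X - mean) @ components.T  # k numbers per image, shape (200, k)

# Reconstruction: mean image plus a weighted sum of the k patterns.
X_hat = mean + codes @ components

rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(f"Relative reconstruction error with k={k}: {rel_error:.4f}")
```

Each image is stored as only `k` numbers (`codes`), while the patterns and the mean are shared across all images. On real faces the error would not be this small, because faces are only approximately low-rank.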

## 3. Data (Labeled Faces in the Wild)

from sklearn.datasets import fetch_lfw_people
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
import numpy as np

# 1. Load faces (downloaded on first use). Restricted for speed:
#    only people with at least 20 images, each image downscaled to 40%.
faces = fetch_lfw_people(min_faces_per_person=20, resize=0.4)
X = faces.data
h, w = faces.images.shape[1], faces.images.shape[2]
print(f"Original Data Shape: {X.shape} (Features = {X.shape[1]} pixels)")

# 2. Visualize Original
def plot_gallery(images, titles, h, w, n_row=2, n_col=4):
    plt.figure(figsize=(1.8 * n_col, 2.4 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[i].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[i], size=12)
        plt.xticks(())
        plt.yticks(())
    plt.show()

# Pass the titles once; supplying them both positionally and as a keyword raises a TypeError.
plot_gallery(X, [faces.target_names[t] for t in faces.target[:8]], h, w)

# 3. PCA Compression (Keep only 100 components out of 1850!)
n_components = 100
pca = PCA(n_components=n_components, whiten=True).fit(X)

# Transform (Compress) and Inverse Transform (Reconstruct)
X_pca = pca.transform(X)
X_recovered = pca.inverse_transform(X_pca)

# 4. Compare Original vs Compressed
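# Per-image ratio only: the shared eigenfaces and the mean face must be stored once as well.
print(f"Per-image compressed size ratio: {n_components / X.shape[1]:.2%}")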
plot_gallery(X_recovered, ["Reconstructed"] * 8, h, w)
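
The ratio printed above counts only the per-image codes. As a rough sketch of the full storage cost, the hypothetical helper below (`pca_storage_ratio` is not part of scikit-learn) also counts the shared eigenfaces and the mean face, assuming every value is stored at the same numeric precision. The sample counts in the loop are illustrative, not the size of the LFW subset loaded here.

```python
def pca_storage_ratio(n_samples, n_features, n_components):
    """Fraction of raw storage needed when the PCA basis is stored too."""
    raw = n_samples * n_features
    compressed = (
        n_samples * n_components      # one code vector per image
        + n_components * n_features   # the shared eigenfaces
        + n_features                  # the mean face
    )
    return compressed / raw

# Illustrative database sizes (assumed values).
for n in (200, 1_000, 10_000):
    print(n, f"{pca_storage_ratio(n, 1850, 100):.2%}")
```

The fixed cost of the basis is spread over all images, so the overall ratio approaches `n_components / n_features` only as the database grows. For very small collections, storing the basis can cost more than storing the raw pixels.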