# Linear Algebra for Developers: An Image Processing Approach

Welcome to this hands-on session! We will explore core Linear Algebra concepts using something we all understand: **Images**.

## Goals
1.  Understand images as matrices.
2.  Perform basic matrix operations (scalar multiplication, addition, dot product) and see their visual effects.
3.  Grasp the concept of "Dimensionality" in data.
4.  Demystify Principal Component Analysis (PCA) and Eigenfaces.

Let's get started!

In [None]:
# Basic Imports
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_lfw_people
from sklearn.decomposition import PCA

%matplotlib inline
print("Libraries imported successfully!")

## Part 1: Images are just Matrices

A grayscale image is nothing but a grid of numbers. Each number is the intensity of one pixel: 0 is black, and white is 255 for 8-bit integer images or 1.0 for floating-point images.
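
To make this concrete, here is a minimal sketch with a hand-made 3x4 "image". The array `tiny` and its values are invented for illustration and are not part of the LFW dataset.

```python
import numpy as np

# A hypothetical 3x4 grayscale image: 0.0 is black, 1.0 is white.
tiny = np.array([
    [0.0, 0.0, 0.5, 1.0],
    [0.0, 0.5, 1.0, 1.0],
    [0.5, 1.0, 1.0, 1.0],
])

print(tiny.shape)   # (rows, columns) = (height, width)
print(tiny[1, 2])   # pixel at row 1, column 2

# Flattening turns the matrix into one long vector, row by row;
# reshaping with the original shape recovers the image exactly.
vec = tiny.flatten()
restored = vec.reshape(tiny.shape)
print(vec.shape, np.array_equal(tiny, restored))
```

The same flatten/reshape round trip is what links a 2D face image to its 1D vector form.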

We will use the **Labeled Faces in the Wild (LFW)** dataset.

In [None]:
# Load the LFW people dataset
print("Loading dataset... (this might take a moment)")
lfw_people = fetch_lfw_people(min_faces_per_person=50, resize=0.4)

# X contains the flattened image vectors (we'll talk about this later)
X = lfw_people.data

# images contains the 2D matrices (height x width)
images = lfw_people.images

h, w = images.shape[1], images.shape[2]
print(f"Dataset loaded. Total images: {images.shape[0]}")
print(f"Image dimensions: {h}x{w} pixels")

In [None]:
# Helper function to plot images
def plot_gallery(images, titles, h, w, n_row=3, n_col=4):
    plt.figure(figsize=(1.8 * n_col, 2.4 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[i].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[i], size=12)
        plt.xticks(())
        plt.yticks(())
    plt.show()

# Let's look at the first few faces
titles = [f"Person {i}" for i in range(12)]
plot_gallery(images, titles, h, w, 1, 4)

### Exercise 1: The Matrix
Let's look at the raw numbers of a single face and manipulate them.
**Task**: Create a copy of the first face and put a black box (pixels = 0) in the center.

In [None]:
# Select the first face
face = images[0].copy()

# --- YOUR CODE HERE ---
# Hint: Slicing in numpy works like [row_start:row_end, col_start:col_end]
# Set a centered region to 0.0 (the image has h rows and w columns)
face[h // 2 - 10:h // 2 + 10, w // 2 - 7:w // 2 + 8] = 0.0
# ----------------------

plt.figure(figsize=(4, 3))
plt.imshow(face, cmap=plt.cm.gray)
plt.title("Face with a black box")
plt.axis('off')
plt.show()

## Part 2: Basic Linear Algebra Operations

### 2.1 Scalar Multiplication
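Multiplying a matrix by a single number (a scalar) scales every entry by that number. In image terms, this affects **brightness/contrast**: a factor above 1 brightens the image, and a factor between 0 and 1 darkens it.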

$$ B = c \cdot A $$

In [None]:
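original_face = images[0]

# Make it brighter (multiply by 1.5)
bright_face = original_face * 1.5

# Make it darker (multiply by 0.5)
dark_face = original_face * 0.5

# imshow stretches each image to its own min/max by default, which would make
# all three look identical, so pin the display range to the dataset's range.
faces = [original_face, bright_face, dark_face]
labels = ["Original", "Bright (x1.5)", "Dark (x0.5)"]
fig, axes = plt.subplots(1, 3, figsize=(5.4, 2.4))
for ax, img, label in zip(axes, faces, labels):
    ax.imshow(img, cmap=plt.cm.gray, vmin=0, vmax=images.max())
    ax.set_title(label, size=12)
    ax.axis('off')
plt.show()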
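
Scaling can push pixel values outside the valid intensity range. Below is a minimal sketch of how one might clamp them back, assuming intensities in [0, 1]. The array `patch` is made up for illustration.

```python
import numpy as np

# Hypothetical 2x3 patch with intensities in [0, 1].
patch = np.array([[0.2, 0.5, 0.8],
                  [0.4, 0.6, 1.0]])

brighter = 1.5 * patch                   # some values now exceed 1.0
clipped = np.clip(brighter, 0.0, 1.0)    # clamp back into the valid range

print(brighter.max())   # above 1.0 before clipping
print(clipped.max())    # capped at 1.0 after clipping
```

Note that clipping is not a linear operation: once a pixel saturates at white, dividing by 1.5 no longer recovers its original value.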

### 2.2 Matrix Addition
Adding two matrices of the same shape combines them element-wise. In images, this overlays one picture on the other, and averaging the two gives a simple **blend**.

$$ C = A + B $$

In [None]:
face1 = images[0]
face2 = images[1]

# Simple addition (average to keep intensity in range)
blended_face = (face1 + face2) / 2

plot_gallery([face1, face2, blended_face], ["Face 1", "Face 2", "Blended (Average)"], h, w, 1, 3)
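
The average above is a special case of a weighted blend, $C = \alpha A + (1 - \alpha) B$, with $\alpha = 0.5$. Here is a small sketch under that assumption. The helper `blend` and the two toy images are hypothetical and not part of the notebook.

```python
import numpy as np

def blend(a, b, alpha):
    """Return alpha * a + (1 - alpha) * b for two same-shaped images."""
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return alpha * a + (1.0 - alpha) * b

# Two hypothetical 2x2 images: one black, one white.
black = np.zeros((2, 2))
white = np.ones((2, 2))

# Sliding alpha from 0 to 1 fades from the second image to the first.
for alpha in (0.0, 0.25, 0.5, 1.0):
    print(alpha, blend(white, black, alpha)[0, 0])
```

With real faces, one could call `blend(face1, face2, 0.3)` to get an image that leans toward `face2`.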

### 2.3 Dot Product (Similarity)
The dot product of two vectors is a measure of how much they point in the same direction. For normalized vectors, it's the cosine of the angle between them.

In face recognition, this is often used as a similarity score!

$$ \mathbf{a} \cdot \mathbf{b} = \sum a_i b_i $$

In [None]:
# We need to flatten the 2D images into 1D vectors first
vec1 = face1.flatten()
vec2 = face2.flatten()

# Calculate dot product
similarity = np.dot(vec1, vec2)
print(f"Raw Dot Product: {similarity:.2f}")

# For a better metric, we usually normalize vectors first (Cosine Similarity).
# Pixel values are non-negative, so the result here lies between 0 and 1.
norm1 = np.linalg.norm(vec1)
norm2 = np.linalg.norm(vec2)
cosine_sim = np.dot(vec1, vec2) / (norm1 * norm2)

print(f"Cosine Similarity (0 to 1): {cosine_sim:.4f}")
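
Why normalize? The raw dot product grows with overall brightness, while the cosine depends only on direction. The sketch below checks this on toy vectors. The helper name `cosine_similarity` is our own, not a library function.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two 1D vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 3.0])

# Doubling the "brightness" doubles the raw dot product...
print(np.dot(a, a), np.dot(a, 2 * a))

# ...but leaves the cosine unchanged.
print(cosine_similarity(a, a), cosine_similarity(a, 2 * a))

# Opposite and perpendicular directions for comparison.
print(cosine_similarity(a, -a))
print(cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 1.0])))
```

In general the cosine ranges from -1 to 1. Negative values require some negative entries, which raw pixel vectors do not have.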

## Part 3: The Curse of Dimensionality

Our images are $50 \times 37$ pixels.
That means each image is a vector in a **1850-dimensional space**!

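Working directly in 1850 dimensions is slow, and distance-based comparisons become less reliable as the number of dimensions grows. Do we really need all 1850 pixels to recognize a face? Probably not: neighboring pixels are strongly correlated, and regions like the background carry little information about identity.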
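
One way to picture this redundancy is with synthetic data. In the sketch below we assume, purely for illustration, that every 1850-pixel "image" is a mix of only 5 hidden patterns. Real faces are not this clean, but the idea is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_pixels, n_factors = 100, 1850, 5

# Assumption for illustration: every "image" is a mix of just 5 hidden patterns.
patterns = rng.normal(size=(n_factors, n_pixels))
mix = rng.normal(size=(n_samples, n_factors))
data = mix @ patterns                      # shape (100, 1850)

print(data.shape)
print(np.linalg.matrix_rank(data))         # number of independent directions
```

Although each sample has 1850 coordinates, the rank shows that the data only occupies a 5-dimensional subspace, so 5 numbers per sample would be enough to describe it.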

## Part 4: Eigenfaces (PCA)

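**Principal Component Analysis (PCA)** finds the directions along which the data varies the most. Mathematically, these directions are eigenvectors of the data's covariance matrix.
In our case, each direction is an 1850-dimensional vector that can be reshaped into a $50 \times 37$ image, which is why they are called **Eigenfaces**.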

Think of them as the "ingredients" of a face.
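
Where does the "eigen" come from? The following from-scratch sketch computes principal directions with NumPy's SVD on small synthetic data and checks that they behave as eigenvectors of the covariance matrix. It is only an illustration of the idea, and the data and variable names are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples with 6 features (stand-ins for pixels).
data = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))

# 1. Center the data by subtracting the mean sample.
mean = data.mean(axis=0)
centered = data - mean

# 2. SVD of the centered data; rows of vt are the principal directions.
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
variances = singular_values**2 / (len(data) - 1)

# 3. Check the eigenvector property: cov @ v equals variance * v.
cov = np.cov(centered, rowvar=False)
v, lam = vt[0], variances[0]
print(np.allclose(cov @ v, lam * v))

# The directions are also mutually perpendicular unit vectors.
print(np.allclose(vt @ vt.T, np.eye(6)))
```

Each row of `vt` plays the role of one Eigenface, and the matching entry of `variances` says how much the data varies along it.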

In [None]:
# Compute PCA (Eigenfaces)
n_components = 150

print(f"Extracting the top {n_components} eigenfaces from {X.shape[0]} faces...")
pca = PCA(n_components=n_components, whiten=True).fit(X)

eigenfaces = pca.components_.reshape((n_components, h, w))

print("Done!")

# Visualize the top Eigenfaces (The 'Ingredients')
eigenface_titles = [f"Eigenface {i}" for i in range(eigenfaces.shape[0])]
plot_gallery(eigenfaces, eigenface_titles, h, w, 2, 5)
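
How many components are enough? A common heuristic is to keep the smallest number that captures a chosen share of the total variance. The sketch below uses synthetic data and a 95% threshold, both of which are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 300 samples, 20 features, with variance concentrated
# in the first few directions (scales 10, 5, 2, then 0.1 for the rest).
scales = np.array([10.0, 5.0, 2.0] + [0.1] * 17)
data = rng.normal(size=(300, 20)) * scales

centered = data - data.mean(axis=0)
singular_values = np.linalg.svd(centered, compute_uv=False)

# Fraction of total variance captured by each component, then the running total.
ratio = singular_values**2 / np.sum(singular_values**2)
cumulative = np.cumsum(ratio)

# Smallest number of components whose cumulative share reaches 95%.
k = int(np.searchsorted(cumulative, 0.95) + 1)
print(k, cumulative[:4].round(3))
```

For the fitted `pca` object above, `np.cumsum(pca.explained_variance_ratio_)` gives the corresponding running total for the face data.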

## Part 5: Reconstruction

Any face in our dataset can be approximated as the mean face plus a weighted sum of these Eigenfaces. The more Eigenfaces we use, the closer the approximation.

$$ \text{Face} \approx \text{Mean Face} + w_1 \cdot E_1 + w_2 \cdot E_2 + \dots + w_n \cdot E_n $$

Let's see what happens when we use only a few Eigenfaces to rebuild a face.

In [None]:
def reconstruct_face(face_idx, n_components_to_use):
    # Get the original face vector
    original_vec = X[face_idx]
    
    # Project it onto the PCA space (get the weights);
    # transform expects a 2D array of shape (n_samples, n_features)
    weights = pca.transform(original_vec.reshape(1, -1))
    
    # Zero out the weights we don't want to use (keep only top N)
    # Note: This is a hacky way to visualize it using the fitted PCA object
    weights_truncated = weights.copy()
    weights_truncated[:, n_components_to_use:] = 0
    
    # Reconstruct
    reconstruction = pca.inverse_transform(weights_truncated)
    
    return reconstruction[0]

# Visualize reconstruction with increasing components
test_idx = 10
components_list = [5, 15, 50, 150]
reconstructions = [reconstruct_face(test_idx, n) for n in components_list]

titles = [f"{n} Components" for n in components_list]
plot_gallery(reconstructions, titles, h, w, 1, 4)

plt.figure(figsize=(3, 4))
plt.imshow(images[test_idx], cmap=plt.cm.gray)
plt.title("Original")
plt.axis('off')
plt.show()
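
The reconstruction formula can also be written out by hand. The sketch below does so with NumPy on synthetic data and measures how the error shrinks as more components are used. The helper `reconstruct` is hypothetical, and unlike the whitened `pca` object above, the weights here are plain projections with no rescaling.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 100 samples with 30 features (stand-ins for pixels).
data = rng.normal(size=(100, 30)) @ rng.normal(size=(30, 30))
mean = data.mean(axis=0)
_, _, components = np.linalg.svd(data - mean, full_matrices=False)

def reconstruct(sample, k):
    """Approximate `sample` as mean + sum of its top-k weighted components."""
    top = components[:k]                 # shape (k, n_features)
    weights = top @ (sample - mean)      # one weight per component
    return mean + weights @ top

sample = data[0]
errors = [np.linalg.norm(sample - reconstruct(sample, k)) for k in (1, 5, 15, 30)]
print(np.round(errors, 3))
```

Each weight is a dot product between the centered sample and one component, and the error drops to essentially zero once all 30 components are used.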

## Summary
1.  **Images are Matrices**: We can manipulate them with math.
2.  **Linear Algebra Operations**: Have direct visual interpretations (brightness, blending, similarity).
3.  **Dimensionality Reduction**: Using PCA, we can compress each face from 1850 pixel values into 150 weights (about 12 times fewer numbers) and still rebuild a close approximation of it.

**Next Steps**: Try using your own photo!
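
Before a photo can be projected onto the Eigenfaces, it has to match the dataset's format: grayscale, $50 \times 37$ pixels, flattened to 1850 values. The sketch below shows one possible way to do this with NumPy alone. Loading the file from disk is left out (it would need an image library), so a random array stands in for the photo, and the helpers `to_grayscale` and `resize_nearest` are our own illustrative names.

```python
import numpy as np

def to_grayscale(rgb):
    """Weighted sum of the R, G, B channels (ITU-R BT.601 luma weights)."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def resize_nearest(gray, out_h, out_w):
    """Nearest-neighbor resize: pick one source pixel per output pixel."""
    rows = np.arange(out_h) * gray.shape[0] // out_h
    cols = np.arange(out_w) * gray.shape[1] // out_w
    return gray[np.ix_(rows, cols)]

# Stand-in for a real photo: a random 200x150 RGB image with values in [0, 1].
rng = np.random.default_rng(3)
photo = rng.random((200, 150, 3))

small = resize_nearest(to_grayscale(photo), 50, 37)
vector = small.flatten()
print(small.shape, vector.shape)
```

Note that the grayscale conversion is itself a dot product per pixel. With a real photo, `pca.transform(vector.reshape(1, -1))` would then give its weights, provided the face is cropped roughly like the LFW images and the pixel scale matches that of `X`.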