# Linear Algebra Foundations

## Key Concepts
- Vectors, matrices, linear maps
- Eigenvalues/eigenvectors
- SVD (Singular Value Decomposition)
- PCA (Principal Component Analysis)

## References
- Bishop PRML: Section 1.2 (probability distributions build on linear algebra)
- Matrix Cookbook: Sections 1-5
- MML Book: Chapters 2-4

In [None]:
import numpy as np
import matplotlib.pyplot as plt

## 1. Matrix-Vector Multiplication

Given matrix $A \in \mathbb{R}^{m \times n}$ and vector $x \in \mathbb{R}^n$:

$$y = Ax \quad \text{where} \quad y_i = \sum_{j=1}^{n} A_{ij} x_j$$

In [None]:
# Implement matrix-vector multiplication from scratch
def matrix_vector_mult(A, x):
    """Matrix-vector multiplication without using np.dot"""
    m, n = A.shape
    assert x.shape[0] == n, "Dimensions must match"
    
    y = np.zeros(m)
    for i in range(m):
        for j in range(n):
            y[i] += A[i, j] * x[j]
    return y

# Test
A = np.array([[1, 2], [3, 4], [5, 6]])
x = np.array([1, 2])
print("Manual:", matrix_vector_mult(A, x))
print("NumPy:", A @ x)
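
The same sum can be read column by column: $y = \sum_{j=1}^{n} x_j A_{:,j}$, so $Ax$ is a linear combination of the columns of $A$ weighted by the entries of $x$. A minimal sketch of that view follows; the helper name `matrix_vector_mult_columns` is hypothetical and introduced only for this illustration.

```python
import numpy as np

def matrix_vector_mult_columns(A, x):
    """Compute y = A x as a weighted sum of the columns of A."""
    m, n = A.shape
    assert x.shape[0] == n, "Dimensions must match"

    y = np.zeros(m)
    for j in range(n):
        # Add x_j times column j of A
        y += x[j] * A[:, j]
    return y

# Same test data as the row-wise version
A = np.array([[1, 2], [3, 4], [5, 6]])
x = np.array([1, 2])
y_cols = matrix_vector_mult_columns(A, x)
print("Column view:", y_cols)
print("Matches NumPy:", np.allclose(y_cols, A @ x))
```

Both loops compute the same products $A_{ij} x_j$; only the order of accumulation differs.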

## 2. Eigenvalue Decomposition

For a square matrix $A$, an eigenvalue $\lambda$ and a corresponding nonzero eigenvector $v$ satisfy:

$$Av = \lambda v$$

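Decomposition: if $A$ is diagonalizable (it has $n$ linearly independent eigenvectors), then $A = V \Lambda V^{-1}$, where the columns of $V$ are the eigenvectors and $\Lambda$ is the diagonal matrix of the corresponding eigenvalues.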

In [None]:
# Eigenvalue decomposition example
A = np.array([[4, 2], [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)

# Verify: A @ v = lambda * v
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    print(f"\nVerify λ_{i}: A@v = {A @ v}, λ*v = {lam * v}")
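
The factorization $A = V \Lambda V^{-1}$ can also be checked numerically by rebuilding $A$ from its factors. This is a sketch that assumes $A$ is diagonalizable, which holds for this example because its two eigenvalues are distinct; the names `Lambda` and `A_rebuilt` are introduced here for illustration.

```python
import numpy as np

A = np.array([[4, 2], [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# Columns of V are the eigenvectors; Lambda holds the eigenvalues on its diagonal
V = eigenvectors
Lambda = np.diag(eigenvalues)

# Rebuild A from its factors: A = V @ Lambda @ V^{-1}
A_rebuilt = V @ Lambda @ np.linalg.inv(V)
print("Rebuilt A:\n", A_rebuilt)
print("Reconstruction error:", np.linalg.norm(A - A_rebuilt))
```

The error should be at the level of floating-point round-off. For a matrix without a full set of independent eigenvectors, `V` would be singular and the inverse would not exist.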

## 3. SVD (Singular Value Decomposition)

For any matrix $A \in \mathbb{R}^{m \times n}$:

$$A = U \Sigma V^T$$

Where:
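- $U \in \mathbb{R}^{m \times m}$ is orthogonal; its columns are the left singular vectors
- $\Sigma \in \mathbb{R}^{m \times n}$ is rectangular diagonal, with the non-negative singular values on its diagonal in descending order
- $V \in \mathbb{R}^{n \times n}$ is orthogonal; its columns are the right singular vectors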

In [None]:
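# SVD example (reduced form: with full_matrices=False, U is m x k and
# Vt is k x n, where k = min(m, n); S is a 1-D array of singular values)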
A = np.array([[1, 2, 3], [4, 5, 6]])
U, S, Vt = np.linalg.svd(A, full_matrices=False)

print("U shape:", U.shape)
print("S (singular values):", S)
print("V^T shape:", Vt.shape)

# Reconstruct A
A_reconstructed = U @ np.diag(S) @ Vt
print("\nReconstruction error:", np.linalg.norm(A - A_reconstructed))
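
To see the full shapes listed in the definition ($m \times m$, $m \times n$, $n \times n$), the SVD can be requested with `full_matrices=True`. NumPy still returns the singular values as a 1-D array, so this sketch builds the rectangular $\Sigma$ by hand; the names `U_full`, `Vt_full`, and `Sigma` are introduced here for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
m, n = A.shape
k = min(m, n)

# Full SVD: U is (m, m), Vt is (n, n), S has k entries
U_full, S, Vt_full = np.linalg.svd(A, full_matrices=True)

# Place the singular values on the diagonal of an (m, n) matrix
Sigma = np.zeros((m, n))
Sigma[:k, :k] = np.diag(S)

print("U shape:", U_full.shape)
print("Sigma shape:", Sigma.shape)
print("V^T shape:", Vt_full.shape)
print("Reconstruction error:", np.linalg.norm(A - U_full @ Sigma @ Vt_full))

# U and V are orthogonal, so U^T U and V^T V are identity matrices
print("U orthogonal:", np.allclose(U_full.T @ U_full, np.eye(m)))
print("V orthogonal:", np.allclose(Vt_full @ Vt_full.T, np.eye(n)))
```

The extra rows of `Vt_full` beyond the first `k` are multiplied by zero columns of `Sigma`, which is why the reduced form reconstructs `A` equally well.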

## 4. PCA via SVD

PCA finds the orthogonal directions along which the data varies most. For centered data $X$ (rows are samples, and each column has zero mean):

1. Compute SVD: $X = U \Sigma V^T$
2. Principal directions (often called the principal components) are the columns of $V$, ordered by decreasing singular value
3. Projected data: $Z = XV = U\Sigma$
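
One way to see why the columns of $V$ are the maximum-variance directions, and where an explained-variance ratio comes from, is to substitute the SVD into the sample covariance matrix. This sketch assumes $X$ has $n$ rows (samples), is already centered, and uses the $1/(n-1)$ normalization; $n$, $C$, $v_i$, and $\sigma_i$ are notation introduced here:

$$C = \frac{1}{n-1} X^T X = \frac{1}{n-1} V \Sigma^T U^T U \Sigma V^T = V \left( \frac{\Sigma^T \Sigma}{n-1} \right) V^T$$

So each column $v_i$ of $V$ is an eigenvector of $C$ with eigenvalue $\sigma_i^2 / (n-1)$, which is the variance of the data along $v_i$. The fraction of total variance explained by component $i$ is therefore:

$$\frac{\sigma_i^2}{\sum_j \sigma_j^2}$$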

In [None]:
# PCA from scratch using SVD
def pca_svd(X, n_components):
    """PCA via SVD
    
    Args:
        X: Data matrix (n_samples, n_features)
        n_components: Number of principal components
    
    Returns:
        Z: Projected data (n_samples, n_components)
        V: Principal component directions (n_features, n_components)
        explained_variance_ratio: Variance explained by each component
    """
    # Center the data
    X_centered = X - np.mean(X, axis=0)
    
    # SVD
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    
    # Principal components
    V = Vt.T[:, :n_components]
    
    # Project data
    Z = X_centered @ V
    
    # Explained variance
    total_var = np.sum(S**2)
    explained_variance_ratio = (S[:n_components]**2) / total_var
    
    return Z, V, explained_variance_ratio

# Test with random data
np.random.seed(42)
X = np.random.randn(100, 5)
Z, V, var_ratio = pca_svd(X, n_components=2)
print("Projected shape:", Z.shape)
print("Explained variance ratio:", var_ratio)
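
The identity $Z = XV = U\Sigma$ from step 3 can be checked numerically. The sketch below repeats the centering and SVD so that it runs on its own; the random data, the seed, and the names `Z_proj` and `Z_svd` are assumptions made for this illustration.

```python
import numpy as np

# Illustrative random data: 100 samples, 5 features
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
X_centered = X - X.mean(axis=0)

U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
k = 2

# Route 1: project the centered data onto the first k right singular vectors
Z_proj = X_centered @ Vt.T[:, :k]

# Route 2: scale the first k left singular vectors by their singular values
# (broadcasting multiplies column i of U by S[i], equivalent to U @ diag(S))
Z_svd = U[:, :k] * S[:k]

print("Max abs difference:", np.max(np.abs(Z_proj - Z_svd)))
```

The two routes agree because $V^T V = I$, so $XV = U \Sigma V^T V = U \Sigma$. Route 2 avoids a second matrix product when $U$ is already available.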

## Exercises

1. Implement matrix-matrix multiplication from scratch
2. Derive why $A^T A$ and $A A^T$ have the same non-zero eigenvalues
3. Implement PCA via eigendecomposition of covariance matrix
4. Apply PCA to MNIST and visualize first 2 components