# Lesson 09 - Kernels and Kernelized Regression


## Objectives
- Compute kernel matrices for RBF and polynomial kernels.
- Fit a kernel ridge regression model.
- Visualize nonlinear decision boundaries.


## From the notes

**Kernel trick**
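- Representer theorem: for L2-regularized objectives such as ridge regression, the optimal parameter vector lies in the span of the training examples' feature vectors, $\theta = \sum_i \alpha_i \phi(x^{(i)})$.
- Kernel: $K(x, z) = \phi(x)^T \phi(z)$, evaluated directly from $x$ and $z$ without computing $\phi$ explicitly.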
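
As a small illustration of the second bullet, the sketch below checks numerically that a kernel and an explicit feature map give the same number. It assumes the homogeneous degree-2 polynomial kernel $K(x, z) = (x^T z)^2$ on 2D inputs; the names `phi_quadratic` and `poly2_kernel` are made up for this example and do not come from the notes.

```python
import numpy as np

def phi_quadratic(x):
    """Explicit feature map for K(x, z) = (x^T z)^2 on 2D inputs (illustrative choice)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def poly2_kernel(x, z):
    """Same quantity computed directly from x and z, without forming phi."""
    return (x @ z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

explicit = phi_quadratic(x) @ phi_quadratic(z)  # dot product in feature space
implicit = poly2_kernel(x, z)                   # kernel evaluation in input space
print(explicit, implicit)
```

Both values agree up to floating-point rounding. For higher degrees or for the RBF kernel the feature space grows very large (or infinite), which is why only the kernel form is practical there.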

_TODO: Validate kernel definitions in the CS229 main notes PDF._


## Intuition
Kernels let linear models fit nonlinear functions: every dot product between inputs is replaced by a kernel $K(x, z)$, a similarity function that equals a dot product in a (possibly very high-dimensional) feature space.
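
For ridge regression this replacement can be written out explicitly. The sketch below assumes the objective $\|\Phi\theta - y\|^2 + \lambda\|\theta\|^2$, where $\Phi$ stacks the rows $\phi(x^{(i)})^T$; the scaling of $\lambda$ and the symbol $\Phi$ are conventions chosen here, not taken from the notes.

$$
\theta = \Phi^T \alpha, \qquad K = \Phi \Phi^T, \qquad K_{ij} = K(x^{(i)}, x^{(j)})
$$

$$
\alpha = (K + \lambda I)^{-1} y, \qquad f(x) = \phi(x)^T \theta = \sum_{i=1}^{n} \alpha_i K(x^{(i)}, x)
$$

Both fitting and prediction touch the data only through kernel evaluations, so $\phi$ never has to be computed.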


## Data
We use a 2D dataset of two noisy arcs (two-moons style) that no straight line separates well, to see the benefit of kernels.


In [None]:
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)

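# Two noisy half-circle arcs of 80 points each, labeled +1 and -1.
# The noiseless arcs cross near (1, 0) and (0, 1), so the classes overlap there.
angles = np.linspace(0, np.pi, 80)
X1 = np.c_[np.cos(angles), np.sin(angles)] + 0.1 * np.random.randn(len(angles), 2)
X2 = np.c_[1 - np.cos(angles), 1 - np.sin(angles)] + 0.1 * np.random.randn(len(angles), 2)
X = np.vstack([X1, X2])
y = np.hstack([np.ones(len(X1)), -np.ones(len(X2))])

def rbf_kernel(X, Z, gamma=2.0):
    """RBF kernel matrix with entries exp(-gamma * ||x_i - z_j||^2)."""
    # Expand ||x - z||^2 = ||x||^2 + ||z||^2 - 2 x^T z to avoid explicit loops.
    X_norm = (X**2).sum(axis=1)[:, None]
    Z_norm = (Z**2).sum(axis=1)[None, :]
    sq_dists = X_norm + Z_norm - 2 * X @ Z.T
    # Rounding can make tiny distances slightly negative; clip them to zero.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

# Gram matrix on the training set, shape (160, 160)
K = rbf_kernel(X, X)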


## Experiments


In [None]:
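# Kernel ridge regression on the +/-1 labels; the sign of the score is the predicted class
lam = 1e-2
# K + lam * I is symmetric positive definite, so a linear solve suffices (no pseudo-inverse needed)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_new):
    # Score for each new point: sum_i alpha_i * K(x_new, x_i)
    K_new = rbf_kernel(X_new, X)
    return K_new @ alpha

# Training accuracy (measured on the same points used to fit alpha)
preds = np.sign(predict(X))
(preds == y).mean()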


## Visualizations


In [None]:
plt.figure(figsize=(6,4))
plt.scatter(X1[:,0], X1[:,1], label="class 1")
plt.scatter(X2[:,0], X2[:,1], label="class -1")
plt.title("Nonlinear dataset")
plt.xlabel("x1")
plt.ylabel("x2")
plt.legend()
plt.show()

# Decision boundary plot
grid_x1 = np.linspace(-1.5, 2.5, 100)
grid_x2 = np.linspace(-1.0, 2.0, 100)
xx1, xx2 = np.meshgrid(grid_x1, grid_x2)
grid = np.c_[xx1.ravel(), xx2.ravel()]
scores = predict(grid).reshape(xx1.shape)
plt.figure(figsize=(6,4))
plt.contourf(xx1, xx2, scores, levels=20, cmap="coolwarm", alpha=0.6)
plt.contour(xx1, xx2, scores, levels=[0], colors="black")
plt.scatter(X1[:,0], X1[:,1], c="white", edgecolor="black")
plt.scatter(X2[:,0], X2[:,1], c="black")
plt.title("Kernel ridge regression boundary")
plt.xlabel("x1")
plt.ylabel("x2")
plt.show()


## Takeaways
- Kernel methods let linear models learn nonlinear boundaries.
- The kernel matrix encodes pairwise similarities among training examples.
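
To make the second takeaway concrete, here is a minimal sketch on three hand-picked points (the points and the name `K_small` are illustrative, and `rbf_kernel` is redefined so the snippet runs on its own). Two points sit close together and one sits far away.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=2.0):
    """RBF kernel matrix with entries exp(-gamma * ||x_i - z_j||^2)."""
    sq_dists = (X**2).sum(axis=1)[:, None] + (Z**2).sum(axis=1)[None, :] - 2 * X @ Z.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

# Points 0 and 1 are neighbors; point 2 is far from both.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0]])
K_small = rbf_kernel(pts, pts)
print(np.round(K_small, 3))

# A valid kernel matrix is symmetric with non-negative eigenvalues (up to rounding).
eigvals = np.linalg.eigvalsh(K_small)
print(eigvals)
```

The entry for the neighboring pair is close to 1, the entries involving the distant point are close to 0, and the diagonal is 1 because every point is at distance zero from itself.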


## Explain it in an interview
- Describe the representer theorem intuition.
- Explain how the RBF kernel bandwidth (controlled by `gamma` in the code above; a larger `gamma` means a narrower kernel) affects model flexibility.
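
One way to ground the bandwidth discussion is to look at how quickly similarity decays with distance for different `gamma` values. The sketch below assumes the common parameterization $\exp(-\|x - z\|^2 / (2\sigma^2))$, so that $\gamma = 1 / (2\sigma^2)$; the helper names `rbf_similarity` and `gamma_from_sigma` are hypothetical.

```python
import numpy as np

def rbf_similarity(dist, gamma):
    """RBF kernel value for two points separated by Euclidean distance `dist`."""
    return np.exp(-gamma * dist**2)

def gamma_from_sigma(sigma):
    # Assumption: the kernel is written as exp(-dist^2 / (2 * sigma^2)).
    return 1.0 / (2.0 * sigma**2)

dist = 0.5
for gamma in [0.1, 2.0, 50.0]:
    print(f"gamma={gamma:>5}: K at distance {dist} = {rbf_similarity(dist, gamma):.4f}")

print("sigma=0.5 corresponds to gamma =", gamma_from_sigma(0.5))
```

With a small `gamma`, points half a unit apart still look very similar, so predictions are smooth averages over many neighbors. With a large `gamma`, the same pair looks almost unrelated, so each training point influences only its immediate surroundings and the fit becomes more flexible.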


## Exercises
- Try a polynomial kernel and compare decision boundaries.
- Increase gamma and observe overfitting behavior.
- Implement kernelized logistic regression.
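
As a possible starting point for the first exercise, here is a minimal sketch of an inhomogeneous polynomial kernel with the same calling convention as `rbf_kernel`. The parameter names `degree` and `coef0` and their defaults are assumptions made for this example, not values from the notes.

```python
import numpy as np

def polynomial_kernel(X, Z, degree=3, coef0=1.0):
    """Polynomial kernel matrix with entries (x_i^T z_j + coef0) ** degree."""
    return (X @ Z.T + coef0) ** degree

# Tiny check on two orthogonal unit vectors.
A = np.array([[1.0, 0.0], [0.0, 1.0]])
K_poly = polynomial_kernel(A, A)
print(K_poly)
```

Swapping this function in for `rbf_kernel` when building the training kernel matrix and inside `predict` should be enough to compare the two decision boundaries.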
