# Notebook 02: Regularization (Ridge & Lasso)

## The Art of Restraint

Overfitting is the enemy of generalization. Regularization tames it by adding a penalty on coefficient size to the training loss. Ridge shrinks coefficients smoothly toward zero. Lasso can set some of them to exactly zero, performing automatic feature selection.

### Why Regularize?

- **Bias-variance tradeoff**: Regularization reduces variance (overfitting) at the cost of a small increase in bias
- **Multicollinearity**: When features are correlated, unregularized coefficients become unstable; a penalty stabilizes them
- **Feature selection**: Lasso can automatically select important features by zeroing out the rest
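
The multicollinearity point can be seen in a small experiment. The sketch below is illustrative only: the synthetic data, the noise levels, and `alpha=1.0` are assumptions chosen for the demo, not part of this notebook's dataset. It draws many samples with two nearly identical features and compares how much the fitted coefficients vary for plain least squares versus Ridge.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
n = 100  # samples per simulated dataset (arbitrary choice)

ols_coefs, ridge_coefs = [], []
for _ in range(200):
    x1 = rng.normal(size=n)
    x2 = x1 + 0.01 * rng.normal(size=n)  # x2 is almost a copy of x1
    X = np.column_stack([x1, x2])
    y = x1 + x2 + rng.normal(size=n)     # true coefficients are (1, 1)

    ols_coefs.append(LinearRegression().fit(X, y).coef_)
    ridge_coefs.append(Ridge(alpha=1.0).fit(X, y).coef_)

# Spread of each coefficient across the simulated datasets
ols_std = np.std(ols_coefs, axis=0)
ridge_std = np.std(ridge_coefs, axis=0)

print("OLS coefficient std:  ", ols_std)
print("Ridge coefficient std:", ridge_std)
```

The least-squares coefficients swing widely from sample to sample because the data cannot tell the two features apart, while the Ridge coefficients stay close together.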

### Ridge vs Lasso

- **Ridge (L2)**: Penalizes the sum of squared coefficients. Shrinks all coefficients toward zero, but rarely sets any to exactly zero.
- **Lasso (L1)**: Penalizes the sum of the absolute values of the coefficients. Can set coefficients to exactly zero, performing feature selection.
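
For reference, the two penalties can be written as optimization problems. The forms below follow the objectives that scikit-learn documents for `Ridge` and `Lasso`; the symbols ($X$ for the feature matrix, $y$ for the target, $w$ for the coefficients, $n$ for the number of samples) are notation introduced here.

$$
\text{Ridge:}\quad \min_{w}\; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2
$$

$$
\text{Lasso:}\quad \min_{w}\; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
$$

The data-fit terms are scaled differently, so an `alpha` value that works for Ridge is not directly comparable to one for Lasso.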

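We use cross-validation to pick the regularization strength (`alpha`): larger values shrink the coefficients more. Both penalties depend on the scale of the coefficients, so features are standardized before fitting.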
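
To make the selection step concrete, here is a minimal sketch of what choosing `alpha` by cross-validation involves, written as an explicit loop. It uses synthetic data from `make_regression`, a 13-point grid, and 5 folds, all of which are assumptions for illustration. It is not a solution to the exercises below, which use `RidgeCV` and `LassoCV` on the diabetes data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression problem (illustrative only)
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=10.0, random_state=0
)

# np.logspace takes exponents: this grid runs from 1e-3 to 1e3
alphas = np.logspace(-3, 3, 13)

mean_rmse = []
for alpha in alphas:
    model = Pipeline([
        ("scale", StandardScaler()),   # refit inside each fold, so no leakage
        ("ridge", Ridge(alpha=alpha)),
    ])
    scores = cross_val_score(
        model, X, y, cv=5, scoring="neg_root_mean_squared_error"
    )
    mean_rmse.append(-scores.mean())   # scores are negated RMSE values

best_alpha = alphas[int(np.argmin(mean_rmse))]
print(f"best alpha: {best_alpha:.3g}, CV RMSE: {min(mean_rmse):.2f}")
```

Putting the scaler inside the pipeline means it is fit only on each fold's training portion, which keeps the validation score honest.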

In [ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from src.utils import set_seed
set_seed(42)

## TODO: Pipeline with RidgeCV

In [ ]:
# === TODO (you code this) ===
# Pipeline: StandardScaler + RidgeCV over log-spaced alphas from 1e-3 to 1e3
# (np.logspace takes exponents, e.g. np.logspace(-3, 3, 50))
# Acceptance: Print best alpha, test RMSE, R2

## TODO: Pipeline with LassoCV

In [ ]:
# === TODO (you code this) ===
# Pipeline: StandardScaler + LassoCV. Compare metrics with Ridge.
# Acceptance: Table with RMSE, MAE, R2 for both; 2-sentence comparison

## TODO: Plot coefficient magnitudes

In [ ]:
# === TODO (you code this) ===
# Plot coefficient magnitudes side by side for Ridge vs Lasso.
# Acceptance: Figure with clear legend; note which coefficients go to zero with Lasso