# Regularization

This notebook is a **companion to `05_regularization.md`**.

Purpose:
- Visualize the effect of regularization
- Compare unregularized vs regularized models
- Build interview intuition

---

In [None]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

np.random.seed(42)

## Synthetic Dataset with Correlated Features

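We build a dataset in which feature 1 is a noisy copy of feature 0, so the two are nearly collinear. Only features 0 and 2 enter the target directly; features 3 and 4 are pure noise. Near-collinearity makes unregularized coefficients unstable, which is where regularization has the most visible effect.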

---

In [None]:
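n = 300
X = np.random.randn(n, 5)
# Make feature 1 a noisy copy of feature 0 (near-collinear pair)
X[:, 1] = X[:, 0] + np.random.normal(0, 0.1, size=n)
# Only features 0 and 2 drive the target; features 3 and 4 are noise
y = 3 * X[:, 0] - 2 * X[:, 2] + np.random.normal(0, 1, size=n)

# Split first, then fit the scaler on the training rows only,
# so test-set statistics do not leak into preprocessing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)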
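
As a quick sanity check on the construction above, the following sketch rebuilds the features and inspects their correlation matrix. It is self-contained and uses its own local generator (`rng`, an assumption made for this example), so the exact numbers will differ from the notebook's global-seed run.

```python
import numpy as np

# Local generator for this sketch only; the notebook itself uses np.random.seed(42)
rng = np.random.default_rng(42)

n = 300
X = rng.standard_normal((n, 5))
X[:, 1] = X[:, 0] + rng.normal(0, 0.1, size=n)  # noisy copy of feature 0

# Pairwise correlations between columns (features)
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 2))

# Correlation of the near-collinear pair, expected to be close to 1
print("corr(x0, x1):", round(corr[0, 1], 3))
```

The off-diagonal entry for features 0 and 1 should sit close to 1, while the remaining pairs stay near 0.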

## Compare Models

---

In [None]:
models = {
    'Linear': LinearRegression(),
    'Ridge': Ridge(alpha=1.0),
    'Lasso': Lasso(alpha=0.1)
}

results = []
for name, model in models.items():
    model.fit(X_train, y_train)
    # Compare in-sample and out-of-sample error to spot overfitting
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    # L2 norm of the coefficient vector: smaller means more shrinkage
    coef_norm = np.linalg.norm(model.coef_)
    results.append([name, train_mse, test_mse, coef_norm])

pd.DataFrame(results, columns=['Model', 'Train MSE', 'Test MSE', 'Coeff Norm'])
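
The summary table reports only the coefficient norm. To see how each model distributes weight across individual features, a per-coefficient table can help. The sketch below is self-contained and regenerates similar data with a local generator; the names `coef_table`, `l1_norms`, and `n_zero` are illustrative and not part of the notebook. For brevity it fits on the full standardized data rather than a train split.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.preprocessing import StandardScaler

# Local generator for this sketch only
rng = np.random.default_rng(0)
n = 300
X = rng.standard_normal((n, 5))
X[:, 1] = X[:, 0] + rng.normal(0, 0.1, size=n)
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(0, 1, size=n)
X = StandardScaler().fit_transform(X)

models = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
}

# One column per model, one row per feature
coef_table = pd.DataFrame(
    {name: model.fit(X, y).coef_ for name, model in models.items()},
    index=[f"x{i}" for i in range(X.shape[1])],
)

l1_norms = coef_table.abs().sum()   # total absolute weight per model
n_zero = (coef_table == 0).sum()    # coefficients set exactly to zero

print(coef_table.round(3))
print(l1_norms.round(3))
print(n_zero)
```

In a table like this, look at how the weight is split between `x0` and `x1` in each model, and whether Lasso sets any coefficient exactly to zero.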

## Interview Takeaways

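- Regularization stabilizes coefficients, especially when features are correlated
- L1 (Lasso) promotes sparsity by driving some coefficients exactly to zero; L2 (Ridge) shrinks all coefficients toward zero without eliminating them
- Scale features before fitting: the penalty acts on coefficient magnitudes, which depend on each feature's units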
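
To illustrate the scaling point, here is a minimal sketch that changes the units of one feature and checks whether predictions change. It assumes a small synthetic dataset of its own; `fit_predict` and `X_rescaled` are hypothetical names introduced for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Local generator for this sketch only
rng = np.random.default_rng(0)
n = 300
X = rng.standard_normal((n, 3))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(0, 1, size=n)

# Same information, different units: feature 0 expressed on a 100x coarser scale
X_rescaled = X.copy()
X_rescaled[:, 0] *= 0.01


def fit_predict(model, features):
    """Fit on the given features and return in-sample predictions."""
    return model.fit(features, y).predict(features)


# Unpenalized least squares: the coefficient absorbs the unit change
ols_same = np.allclose(
    fit_predict(LinearRegression(), X),
    fit_predict(LinearRegression(), X_rescaled),
)

# Ridge on raw features: the penalty now hits feature 0 much harder
ridge_same = np.allclose(
    fit_predict(Ridge(alpha=1.0), X),
    fit_predict(Ridge(alpha=1.0), X_rescaled),
)

# Ridge after standardization: the unit change is removed before fitting
scaled_ridge_same = np.allclose(
    fit_predict(make_pipeline(StandardScaler(), Ridge(alpha=1.0)), X),
    fit_predict(make_pipeline(StandardScaler(), Ridge(alpha=1.0)), X_rescaled),
)

print("OLS unchanged:", ols_same)
print("Ridge unchanged:", ridge_same)
print("Scaled Ridge unchanged:", scaled_ridge_same)
```

Putting the scaler inside a pipeline also means it is refit on each training split, which avoids leaking test-set statistics during cross-validation.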

In interviews, tie the choice of penalty and its strength to the size of the dataset and the risk of overfitting.

---