# Solution: Ridge Regression and Active Learning

Student: Utkarsh Tiwari  
Course:  Machine Learning I — Module 3  
Instructor: John Paisley

This notebook reproduces the assignment solution: closed-form ridge regression, followed by an active learning step that selects the first 10 measurement indices from the test pool. Each code cell carries student-style comments and a short explanation, and prints its final output.

In [None]:
import numpy as np
from pathlib import Path

# Load data (assumes notebook is in the assignment root)
DATA_DIR = Path('.')
X_train = np.genfromtxt(DATA_DIR / 'X_train-2.csv', delimiter=',')
y_train = np.genfromtxt(DATA_DIR / 'y_train-1.csv')
X_test = np.genfromtxt(DATA_DIR / 'X_test.csv', delimiter=',')

print('Loaded shapes:', X_train.shape, y_train.shape, X_test.shape)

## Ridge regression (closed-form)

We compute the ridge solution w = (X^T X + lambda I)^{-1} X^T y with a small helper function and display the first 10 weights.

In [None]:
def ridge_weights(X, y, lam):
    """Compute closed-form ridge regression weights.
    X: (n,d), y: (n,), lam: regularization scalar.
    Returns w shape (d,)
    """
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    b = X.T @ y
    return np.linalg.solve(A, b)

lam = 5.0
w = ridge_weights(X_train, y_train, lam)
print('Ridge weights (first 10):', np.round(w[:10], 6))
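
As a sanity check on the closed form, the sketch below compares it against two equivalent characterizations: ordinary least squares on an augmented design matrix, and a zero gradient of the ridge objective. The data here is synthetic and the names `X_demo`, `y_demo`, and `lam_demo` are illustrative assumptions, not part of the assignment files.

```python
import numpy as np

def ridge_weights(X, y, lam):
    """Closed-form ridge weights: solve (X^T X + lam * I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Synthetic data (an assumption for illustration; not the assignment CSVs)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=50)
lam_demo = 5.0

w_closed = ridge_weights(X_demo, y_demo, lam_demo)

# Equivalent view: least squares on data augmented with sqrt(lam) * I rows
# and zero targets, which adds lam * ||w||^2 to the squared error.
X_aug = np.vstack([X_demo, np.sqrt(lam_demo) * np.eye(4)])
y_aug = np.concatenate([y_demo, np.zeros(4)])
w_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

# Gradient of ||y - Xw||^2 + lam * ||w||^2 at w_closed (should be close to 0)
grad = -2 * X_demo.T @ (y_demo - X_demo @ w_closed) + 2 * lam_demo * w_closed

print('max |w_closed - w_aug|:', np.max(np.abs(w_closed - w_aug)))
print('max |gradient|:', np.max(np.abs(grad)))
```

Both printed values should be at the level of floating-point round-off if the helper is correct.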

## Active learning selection (first 10 indices)

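We keep a Gaussian posterior over w (covariance Sigma, mean mu) and, at each step, choose the pool point x that maximizes the predictive variance x^T Sigma x. The true label of a pool point is not available, so each step simulates one from the current posterior mean (y = x^T mu) and then updates the posterior (student-style simulation). Note that Sigma depends only on the inputs and not on the labels, so the simulated labels do not change which points are selected.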
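
For reference, the quantities used in the code can be written out as below. This is a sketch under the assumption of a zero-mean Gaussian prior on w with precision lambda and Gaussian noise with variance sigma^2, which is the model that the formulas in `update_posterior` correspond to; x_0 and y_0 denote a candidate pool point and its label.

```latex
\begin{align}
w &\sim \mathcal{N}(0, \lambda^{-1} I), \qquad y \mid X, w \sim \mathcal{N}(Xw, \sigma^2 I) \\
\Sigma &= \left(\lambda I + \sigma^{-2} X^\top X\right)^{-1} \\
\mu &= \left(\lambda \sigma^2 I + X^\top X\right)^{-1} X^\top y \\
y_0 \mid x_0 &\sim \mathcal{N}\!\left(x_0^\top \mu,\; \sigma^2 + x_0^\top \Sigma x_0\right) \\
X^\top X &\leftarrow X^\top X + x_0 x_0^\top, \qquad X^\top y \leftarrow X^\top y + x_0 y_0
\end{align}
```

The last line is the sufficient-statistics update applied after each selection, after which Sigma and mu are recomputed from the second and third lines.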

In [None]:
def update_posterior(lambda0, sigma2, X_batch, y_batch, old_xx, old_xy):
    """Update sufficient statistics and compute posterior cov/mean for w.
    old_xx, old_xy are cumulative X^T X and X^T y.
    Returns: new_var (posterior covariance), new_mean (posterior mean), updated old_xx, old_xy
    """
    old_xx = old_xx + X_batch.T @ X_batch
    old_xy = old_xy + X_batch.T @ y_batch
    new_var_inv = lambda0 * np.eye(old_xx.shape[0]) + (1.0 / sigma2) * old_xx
    new_var = np.linalg.inv(new_var_inv)
    sigma_temp = lambda0 * sigma2 * np.eye(old_xx.shape[0]) + old_xx
    new_mean = np.linalg.solve(sigma_temp, old_xy)
    return new_var, new_mean, old_xx, old_xy

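def active_learning(lambda0, sigma2, X_train, y_train, X_pool, k=10):
    """Greedily pick k pool points with the largest predictive variance x^T Sigma x.
    Labels for picked points are simulated from the current posterior mean.
    Returns: posterior cov, posterior mean, cumulative X^T X and X^T y, w_rr,
    and the selected pool indices (1-based, relative to the original pool).
    """
    d = X_train.shape[1]
    # Initial posterior from the training data
    old_xx = X_train.T @ X_train
    old_xy = X_train.T @ y_train
    var_inv = lambda0 * np.eye(d) + (1.0 / sigma2) * old_xx
    new_var = np.linalg.inv(var_inv)
    new_mean = np.linalg.solve(lambda0 * sigma2 * np.eye(d) + old_xx, old_xy)
    w_rr = new_mean
    pool = X_pool.copy()
    indices = list(range(pool.shape[0]))  # original row index of each remaining pool point
    selected = []
    for _ in range(k):
        # Predictive variance x_i^T Sigma x_i for every remaining pool row
        variances = np.einsum('ij,jk,ik->i', pool, new_var, pool)
        idx = int(np.argmax(variances))
        selected_idx = indices[idx]
        selected.append(selected_idx + 1)  # report 1-based indices
        x_sel = pool[idx:idx+1, :]
        # Simulated label: prediction of the current posterior mean
        y_sel = (x_sel @ w_rr).ravel()
        new_var, new_mean, old_xx, old_xy = update_posterior(lambda0, sigma2, x_sel, y_sel, old_xx, old_xy)
        w_rr = new_mean
        # Remove the chosen point so it cannot be selected again
        pool = np.delete(pool, idx, axis=0)
        indices.pop(idx)
    return new_var, new_mean, old_xx, old_xy, w_rr, selected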

# Run active learning with lambda0 = 5.0 and sigma2 = 2.0, using X_test as the candidate pool
new_var, new_mean, old_xx, old_xy, w_rr, selected = active_learning(5.0, 2.0, X_train, y_train, X_test, k=10)
print('Selected indices (1-based):', selected)
print('Posterior mean weights (first 10):', np.round(w_rr[:10], 6))
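
The loop above re-inverts a d x d matrix after every selection. Because each step adds a single point, the covariance could instead be updated with a rank-one Sherman-Morrison step. The sketch below is an alternative to the notebook's approach, not what the notebook does; it uses synthetic data, and `rank_one_cov_update`, `X_obs`, and `X_pool` are hypothetical names introduced for illustration.

```python
import numpy as np

def rank_one_cov_update(Sigma, x, sigma2):
    """Sherman-Morrison update of the posterior covariance after adding input x.

    Implements (Sigma^{-1} + x x^T / sigma2)^{-1}
             = Sigma - (Sigma x)(Sigma x)^T / (sigma2 + x^T Sigma x),
    which relies on Sigma being symmetric.
    """
    Sx = Sigma @ x
    return Sigma - np.outer(Sx, Sx) / (sigma2 + x @ Sx)

# Synthetic setup (assumed for illustration; not the assignment data)
rng = np.random.default_rng(1)
d, lambda0, sigma2 = 5, 5.0, 2.0
X_obs = rng.normal(size=(20, d))
X_pool = rng.normal(size=(30, d))

Sigma = np.linalg.inv(lambda0 * np.eye(d) + X_obs.T @ X_obs / sigma2)

# Pick the pool point with the largest predictive variance x^T Sigma x
variances = np.einsum('ij,jk,ik->i', X_pool, Sigma, X_pool)
idx = int(np.argmax(variances))
x_sel = X_pool[idx]

# Rank-one update versus a full re-inversion with the new point appended
Sigma_fast = rank_one_cov_update(Sigma, x_sel, sigma2)
X_new = np.vstack([X_obs, x_sel])
Sigma_direct = np.linalg.inv(lambda0 * np.eye(d) + X_new.T @ X_new / sigma2)

print('max |fast - direct|:', np.max(np.abs(Sigma_fast - Sigma_direct)))
print('variance at x_sel before/after:', variances[idx], x_sel @ Sigma_fast @ x_sel)
```

The two covariances should agree up to round-off, and the predictive variance at the selected point should shrink after the update. For the small d in this assignment the direct inverse is inexpensive, so the rank-one form matters mainly when d or k is large.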