## Least Angle Regression (LARS)

### Introduction

In [47]:
import numpy as np

np.sign([((-1)**x)*x for x in range(10)])

array([ 0, -1,  1, -1,  1, -1,  1, -1,  1, -1])

Input: 
    - Data matrix X (n x p), centered and normalized (columns have mean 0 and norm 1)
    - Response vector y, centered (mean 0)
Output: 
    - Path of coefficients β over steps

Initialize:
    β = 0
    r = y             # residual
    A = []            # active set (indices of predictors currently in the model)
    β_path = [β]

While |A| < p:

    # 1. Compute correlations with residual
    c = X.T @ r                     # correlations between each predictor and residual
    C = max(|c|)                    # largest absolute correlation
    j = argmax over j ∉ A of |c[j]| # most correlated predictor not yet active (active ones are all tied at C)

    # 2. Add j to active set
    A.append(j)
    X_A = X[:, A]                  # submatrix with active predictors

    # 3. Compute equiangular direction u_A
    G_A = X_A.T @ X_A              # Gram matrix
    one_vec = [sign(c[k]) for k in A]   # sign (+1 or -1) of the correlation between each active predictor and r
    w = inv(G_A) @ one_vec
    A_A = 1 / sqrt(one_vec.T @ w)       # normalizing constant
    u = X_A @ (A_A * w)                 # equiangular direction (unit norm)

    # 4. Compute step size γ (how far to go in this direction before another variable catches up)
    a = X.T @ u
    γ = smallest positive value, over j ∉ A, of:
        (C - c[j]) / (A_A - a[j])  and  (C + c[j]) / (A_A + a[j])
    # once every predictor is active, nothing is left to catch up: use γ = C / A_A (full least-squares fit)

    # 5. Update β and residual
    β[A] += γ * A_A * w
    r = y - X @ β
    β_path.append(β.copy())

Return β_path
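
The pseudocode above can be turned into a short NumPy sketch. This is an illustrative version of plain LARS (without the lasso modification), not a reference implementation: the function name `lars_sketch`, the synthetic data, and the `1e-12` positivity threshold are assumptions made for this example. It assumes the columns of `X` are centered with unit norm and `y` is centered.

```python
import numpy as np

def lars_sketch(X, y):
    """Plain LARS path; returns an array of shape (p + 1, p), one row per step."""
    n, p = X.shape
    beta = np.zeros(p)
    active = []
    path = [beta.copy()]
    for _ in range(p):
        r = y - X @ beta                      # current residual
        c = X.T @ r                           # correlations with the residual
        C = np.max(np.abs(c))
        inactive = [k for k in range(p) if k not in active]
        j = max(inactive, key=lambda k: abs(c[k]))   # best predictor not yet active
        active.append(j)

        X_A = X[:, active]
        s = np.sign(c[active])                # signs of the active correlations
        w = np.linalg.solve(X_A.T @ X_A, s)
        A_A = 1.0 / np.sqrt(s @ w)            # normalizing constant
        u = X_A @ (A_A * w)                   # equiangular direction, unit norm
        a = X.T @ u

        idx = np.array([k for k in range(p) if k not in active], dtype=int)
        if idx.size > 0:
            with np.errstate(divide="ignore", invalid="ignore"):
                cand = np.concatenate([(C - c[idx]) / (A_A - a[idx]),
                                       (C + c[idx]) / (A_A + a[idx])])
            gamma = cand[cand > 1e-12].min()  # smallest positive candidate
        else:
            gamma = C / A_A                   # last step: go to the least-squares fit

        beta[active] += gamma * A_A * w
        path.append(beta.copy())
    return np.array(path)

# Synthetic data (assumed for illustration), standardized as the algorithm requires
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, -2.0, 0.0, 1.5, 0.5]) + rng.normal(size=200)
X = X - X.mean(axis=0)
X = X / np.linalg.norm(X, axis=0)
y = y - y.mean()

path = lars_sketch(X, y)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(path[-1], beta_ols))        # the final step should reach the OLS solution
```

Each row of `path` has one more nonzero coefficient than the previous one, and the last row should coincide with ordinary least squares, which is a quick sanity check on the step-size formula.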


In [49]:
from sklearn.linear_model import Lasso, LassoLars
from sklearn.datasets import make_regression

In [56]:
X, y = make_regression(n_informative=5, n_samples=200, n_features=5)
Lasso(fit_intercept=False).fit(X,y).coef_, LassoLars(fit_intercept=False).fit(X,y).coef_

(array([39.69859513, 64.71630739, 61.73356078, 33.10463631, 20.27644721]),
 array([39.6985006 , 64.71631659, 61.73357321, 33.10462564, 20.27645807]))

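- LARS is a variable selection technique, and can be treated as an "upgraded" forward stepwise regression
    - In forward stepwise regression, we iteratively add the variable that most improves the fit to the target, then refit all coefficients by least squares
    - In LARS, we add the variable most correlated with the current residual, and increase the coefficients only gradually rather than refitting them in full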

- Intuitively, the idea of LARS is quite simple. Starting from all-zero coefficients, it repeatedly brings in the predictor, among those not yet chosen, that is most correlated with the current residual. Step by step:

- Start with all coefficients = 0. So the initial residual is just the response vector y.

- Find the predictor most correlated with the residual, that is, the one that best explains what hasn't been modeled yet.

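- Move the coefficient for that predictor toward its least-squares value, in the direction of the sign of its correlation. This steadily lowers the residual sum of squares and shrinks that predictor's correlation with the residual, until:
    - Another predictor becomes equally correlated with the current residual.

- At that point, move in the direction that is equiangular between the active predictors, so that their correlations with the residual stay tied and decrease together.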

- Continue this process, adding predictors to the active set and adjusting their coefficients accordingly.
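
The first two steps described above can be checked numerically. The sketch below uses assumed synthetic data and illustrative variable names. It takes one LARS step, confirms that a second predictor ties with the first in absolute correlation, and then confirms that the next direction makes equal angles with both (sign-adjusted) active predictors.

```python
import numpy as np

# Synthetic, standardized data (assumed for illustration)
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
X -= X.mean(axis=0)
X /= np.linalg.norm(X, axis=0)
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + 0.1 * rng.normal(size=100)
y -= y.mean()

# Step 1: all coefficients are 0, so the residual is y itself
c = X.T @ y
j = int(np.argmax(np.abs(c)))            # most correlated predictor
C = abs(c[j])
u = np.sign(c[j]) * X[:, j]              # with one active predictor, move along it
a = X.T @ u

# Move until another predictor becomes equally correlated with the residual
others = np.array([k for k in range(X.shape[1]) if k != j])
with np.errstate(divide="ignore", invalid="ignore"):
    cand = np.concatenate([(C - c[others]) / (1 - a[others]),
                           (C + c[others]) / (1 + a[others])])
gamma = cand[cand > 1e-12].min()
c_new = X.T @ (y - gamma * u)            # correlations after the step
k = int(others[np.argmax(np.abs(c_new[others]))])   # the predictor that caught up
print(abs(c_new[j]), abs(c_new[k]))      # these two values should coincide

# Step 2: equiangular direction between the two active predictors
active = [j, k]
X_A = X[:, active]
s = np.sign(c_new[active])
w = np.linalg.solve(X_A.T @ X_A, s)
A_A = 1.0 / np.sqrt(s @ w)
u2 = X_A @ (A_A * w)
print(s * (X_A.T @ u2), A_A)             # equal inner products with each signed predictor
```

Because the columns have unit norm, equal inner products with the unit vector `u2` mean equal angles, which is where the method gets its name.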

