# **Ridge Regression (nD Data) From Scratch**

## Overview

### This notebook extends Ridge Regression to a multi-dimensional dataset (the Diabetes dataset). The focus is on handling higher-dimensional data and comparing scikit-learn's implementation with a custom solution.

In [51]:
from sklearn.datasets import load_diabetes
import numpy as np

## Data Loading:

### The Diabetes dataset is loaded using `load_diabetes`, containing 10 features and a target variable.

In [52]:
X, y = load_diabetes(return_X_y=True)

In [53]:
X.shape

(442, 10)

In [54]:
y.shape

(442,)

## Train-Test Split:

### The data is split into training and testing sets.

In [55]:
from sklearn.model_selection import train_test_split

In [56]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=18)

In [57]:
X_train.shape

(353, 10)

In [58]:
X_test.shape

(89, 10)

In [59]:
y_train.shape

(353,)

In [60]:
y_test.shape

(89,)

## Ridge Regression (scikit-learn):

* #### Ridge Regression is applied using `Ridge` with `alpha=0.1` and the Cholesky solver.
* #### Coefficients, intercept, and predictions are extracted.
* #### The R² score is computed to evaluate performance.
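To make the evaluation metric concrete, here is a minimal sketch of how R² can be computed by hand as 1 - SS_res / SS_tot. The helper name `r2_manual` and the toy arrays are hypothetical and are not part of the notebook; they only illustrate the formula that `r2_score` is based on.

```python
import numpy as np

def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Toy values, made up for illustration only
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 6.5, 9.5])
print(r2_manual(y_true, y_pred))
```

A score of 1 means perfect predictions, while a score of 0 means the model does no better than always predicting the mean of the targets.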

In [61]:
from sklearn.linear_model import Ridge

In [62]:
rid1 = Ridge(alpha=0.1, solver='cholesky')
rid1.fit(X_train, y_train)

In [63]:
rid1.coef_

array([  39.14749113, -230.68227562,  431.38074999,  306.1111574 ,
        -44.62129771,   -7.54913504, -204.1057191 ,  126.03849993,
        382.30967543,  108.08233528])

In [64]:
rid1.intercept_

153.157781394127

In [65]:
y_pred1 = rid1.predict(X_test)
y_pred1

array([247.15645792, 113.01119705, 154.84722167, 218.24245931,
       167.01873229, 171.50962694, 263.5259099 , 223.76914781,
        83.08433163,  89.95974328, 159.04835816, 195.83638689,
       126.6653368 ,  77.38477676, 136.25773768, 118.67352388,
        95.71804207, 171.90562889, 132.04608097, 218.28511713,
       153.79763828,  94.08100346, 111.2594726 , 121.66098045,
       240.78571145, 175.74979694, 128.83035513, 135.21176867,
       103.99948748, 180.80756289, 174.90892545, 212.03058101,
       174.83526447, 151.88930975, 207.40871584, 251.05550673,
       104.46474097, 170.29969107, 126.52730347, 216.19302693,
       192.71784715, 141.57022837,  99.79048017, 190.64334213,
       164.02535995, 192.73869226, 243.86109784, 142.57099197,
        93.95653577, 192.69023179,  68.4938671 , 192.99044823,
       208.68042152, 161.14023922,  79.23309913, 166.82036292,
       156.53626216, 158.30731479, 219.33351423, 179.95197553,
       240.92409981, 109.50327619,  67.33848278, 132.08

In [66]:
from sklearn.metrics import r2_score

In [67]:
score1 = r2_score(y_test, y_pred1)
score1

0.47422858803577095

## Custom Ridge Regression Implementation:

* #### A custom Ridge Regression class is implemented for multi-dimensional data.
* #### The `fit` method uses matrix operations to solve the regularized least squares problem.
* #### The `predict` method applies the learned coefficients to new data.
* #### Results are compared with scikit-learn's implementation.
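The regularized least squares problem solved by `fit` can be written out as follows. The symbols are notation introduced here for illustration, not names from the notebook: $\tilde{X}$ is the training matrix with a leading column of ones, $\theta$ stacks the intercept and the coefficients, and $I'$ is the identity matrix with its first diagonal entry set to zero so the intercept is not penalized.

$$
\min_{\theta}\; \lVert y - \tilde{X}\theta \rVert_2^2 + \alpha\, \theta^\top I' \theta
\quad\Longrightarrow\quad
\hat{\theta} = \left(\tilde{X}^\top \tilde{X} + \alpha I'\right)^{-1} \tilde{X}^\top y
$$

Setting the gradient $-2\tilde{X}^\top (y - \tilde{X}\theta) + 2\alpha I' \theta$ to zero gives the normal equations $(\tilde{X}^\top \tilde{X} + \alpha I')\,\theta = \tilde{X}^\top y$, whose solution is the expression on the right.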

In [68]:
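class RidgeRegression:
    """Closed-form Ridge Regression for any number of features."""

    def __init__(self, alpha):
        self.coefficients = None
        self.intercept = None
        self.alpha = alpha  # regularization strength

    def fit(self, X_train, y_train):
        # Prepend a column of ones so the intercept is learned with the weights
        X_train = np.insert(X_train, 0, 1, axis=1)
        # Identity matrix with a zeroed first entry: the intercept is not penalized
        I = np.identity(X_train.shape[1])
        I[0, 0] = 0

        # Closed-form solution: (X^T X + alpha * I)^(-1) X^T y
        coeff_matrix = np.linalg.inv(np.dot(X_train.T, X_train) + self.alpha*I).dot(X_train.T).dot(y_train)

        self.intercept = coeff_matrix[0]
        self.coefficients = coeff_matrix[1:]

    def predict(self, X_test):
        # Linear prediction: X w + b
        return np.dot(X_test, self.coefficients) + self.intercept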

In [69]:
rid2 = RidgeRegression(0.1)
rid2.fit(X_train, y_train)

In [70]:
rid2.coefficients

array([  39.14749113, -230.68227562,  431.38074999,  306.1111574 ,
        -44.62129771,   -7.54913504, -204.1057191 ,  126.03849993,
        382.30967543,  108.08233528])

In [71]:
rid2.intercept

153.15778139412694

In [72]:
y_pred2 = rid2.predict(X_test)
y_pred2

array([247.15645792, 113.01119705, 154.84722167, 218.24245931,
       167.01873229, 171.50962694, 263.5259099 , 223.76914781,
        83.08433163,  89.95974328, 159.04835816, 195.83638689,
       126.6653368 ,  77.38477676, 136.25773768, 118.67352388,
        95.71804207, 171.90562889, 132.04608097, 218.28511713,
       153.79763828,  94.08100346, 111.2594726 , 121.66098045,
       240.78571145, 175.74979694, 128.83035513, 135.21176867,
       103.99948748, 180.80756289, 174.90892545, 212.03058101,
       174.83526447, 151.88930975, 207.40871584, 251.05550673,
       104.46474097, 170.29969107, 126.52730347, 216.19302693,
       192.71784715, 141.57022837,  99.79048017, 190.64334213,
       164.02535995, 192.73869226, 243.86109784, 142.57099197,
        93.95653577, 192.69023179,  68.4938671 , 192.99044823,
       208.68042152, 161.14023922,  79.23309913, 166.82036292,
       156.53626216, 158.30731479, 219.33351423, 179.95197553,
       240.92409981, 109.50327619,  67.33848278, 132.08

In [73]:
score2 = r2_score(y_test, y_pred2)
score2

0.4742285880357714
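Rather than comparing the printed arrays by eye, the agreement between the two approaches can be checked numerically. The sketch below is self-contained and repeats the same data split and `alpha` as the cells above. The variable names (`sk_model`, `theta`, `penalty`) are chosen here for illustration, and `np.allclose` uses NumPy's default tolerances.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=18
)

# Reference model from scikit-learn
sk_model = Ridge(alpha=0.1, solver="cholesky").fit(X_train, y_train)

# Closed-form solution with an unpenalized intercept
Xb = np.insert(X_train, 0, 1, axis=1)
penalty = np.identity(Xb.shape[1])
penalty[0, 0] = 0
theta = np.linalg.inv(Xb.T @ Xb + 0.1 * penalty) @ Xb.T @ y_train

# Compare within floating-point tolerance instead of exact equality
coef_match = np.allclose(sk_model.coef_, theta[1:])
intercept_match = np.isclose(sk_model.intercept_, theta[0])
pred_match = np.allclose(sk_model.predict(X_test), X_test @ theta[1:] + theta[0])
print(coef_match, intercept_match, pred_match)
```

A tolerance-based comparison is the appropriate test here because the two R² scores above already differ in their last digits, even though they describe the same fit.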

## Observations

* #### The custom implementation reproduces scikit-learn's coefficients and predictions up to floating-point precision (the two R² scores differ only in their last digits), validating the approach.
* #### Ridge Regression mitigates multicollinearity in higher-dimensional data by shrinking coefficients toward zero.
* #### The test R² score of about 0.47 indicates moderate predictive performance, which is expected for this dataset.
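The shrinkage effect mentioned above can be seen by fitting `Ridge` with increasing regularization strength and tracking the size of the coefficient vector. This is a small sketch: the `alphas` values are illustrative choices rather than values used in the notebook, and the model is fitted on the full dataset to keep the example short.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge

X, y = load_diabetes(return_X_y=True)

alphas = [0.01, 0.1, 1.0, 10.0]  # illustrative values, not from the notebook
norms = []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X, y)
    # Euclidean norm of the coefficient vector (intercept excluded)
    norms.append(np.linalg.norm(model.coef_))
    print(f"alpha={alpha:<5} ||coef||_2 = {norms[-1]:.2f}")
```

The norm should shrink as `alpha` grows, since a larger penalty pulls the coefficients closer to zero.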

## Key Differences from 1D Case

* #### The closed-form solution involves matrix operations (`np.linalg.inv`, `np.dot`).
* #### The identity matrix (`I`) has its first diagonal entry set to 0 so that the intercept term is not penalized.
* #### The implementation generalizes to any number of features.
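As a possible variation on the closed-form solution, the linear system can be solved directly with `np.linalg.solve` instead of forming the inverse explicitly, which is generally the more numerically stable choice. This is a sketch of an alternative, not the notebook's method. The function name `ridge_closed_form` and the synthetic data are hypothetical.

```python
import numpy as np

def ridge_closed_form(X, y, alpha):
    """Return (intercept, coefficients) for ridge with an unpenalized intercept.

    Hypothetical helper: solves (Xb^T Xb + alpha * P) theta = Xb^T y directly.
    """
    Xb = np.insert(X, 0, 1, axis=1)     # prepend a column of ones
    penalty = np.identity(Xb.shape[1])
    penalty[0, 0] = 0                   # do not penalize the intercept
    theta = np.linalg.solve(Xb.T @ Xb + alpha * penalty, Xb.T @ y)
    return theta[0], theta[1:]

# Synthetic, noise-free data made up for illustration
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5]) + 4.0

intercept_ols, coefs_ols = ridge_closed_form(X_demo, y_demo, alpha=0.0)
intercept_ridge, coefs_ridge = ridge_closed_form(X_demo, y_demo, alpha=10.0)
print(coefs_ols, coefs_ridge)
```

With `alpha=0.0` the function reduces to ordinary least squares and recovers the generating weights on this noise-free data, while `alpha=10.0` yields a coefficient vector with a smaller norm. The function works unchanged for any number of feature columns.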