## Implement Ridge Regression (Gradient Descent) for an arbitrary number of variables






Ridge regression adds a penalty on the size of the coefficients to the least-squares cost. The tuning parameter λ weights this penalty and balances the fit to the data against the magnitude of the coefficients.

The ridge regression cost is the residual sum of squares plus the penalty:

$$\text{Cost}(W) = \text{RSS}(W) + \lambda \lVert W \rVert^2$$

In matrix notation it is written as:

$$\text{Cost}(W) = (XW - y)^T (XW - y) + \lambda W^T W$$
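To make the cost concrete, here is a minimal NumPy sketch that evaluates the residual sum of squares plus the λ-weighted penalty on a tiny data set. The helper name `ridge_cost` and the numbers are assumptions chosen for illustration, not part of the notebook.

```python
import numpy as np

def ridge_cost(X, y, W, lam):
    """Ridge cost: (XW - y)^T (XW - y) + lam * W^T W."""
    residual = X @ W - y
    return residual @ residual + lam * (W @ W)

# Hypothetical data: 3 samples, 2 features.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
W = np.array([0.5, 0.0])

rss_only = ridge_cost(X, y, W, lam=0.0)      # each residual is -0.5, so RSS = 0.75
with_penalty = ridge_cost(X, y, W, lam=2.0)  # adds 2 * 0.25 = 0.5 to the RSS
print(rss_only, with_penalty)
```

With λ = 0 the cost reduces to ordinary least squares; increasing λ makes large coefficients more expensive.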

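Taking the gradient of the cost with respect to W:

$$\nabla \left[ \text{RSS}(W) + \lambda \lVert W \rVert^2 \right] = \nabla \left\{ (XW - y)^T (XW - y) \right\} + \lambda \nabla \left\{ W^T W \right\}$$

$$= 2 X^T (XW - y) + 2 \lambda W$$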
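One way to sanity-check a gradient expression is to compare it with finite differences. The sketch below does this for the cost (XW - y)ᵗ(XW - y) + λWᵗW, taking 2Xᵗ(XW - y) + 2λW as the analytic gradient. The random data, the seed, and the helper names are assumptions made for illustration.

```python
import numpy as np

def ridge_cost(X, y, W, lam):
    residual = X @ W - y
    return residual @ residual + lam * (W @ W)

def ridge_gradient(X, y, W, lam):
    """Analytic gradient: 2 X^T (XW - y) + 2 lam W."""
    return 2 * X.T @ (X @ W - y) + 2 * lam * W

def numerical_gradient(X, y, W, lam, eps=1e-6):
    """Central finite differences, one coordinate of W at a time."""
    grad = np.zeros_like(W)
    for j in range(W.size):
        step = np.zeros_like(W)
        step[j] = eps
        grad[j] = (ridge_cost(X, y, W + step, lam)
                   - ridge_cost(X, y, W - step, lam)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
W = rng.normal(size=3)
lam = 0.5

analytic = ridge_gradient(X, y, W, lam)
numeric = numerical_gradient(X, y, W, lam)
print(np.max(np.abs(analytic - numeric)))  # should be tiny
```

Because the cost is quadratic in W, central differences agree with the analytic gradient up to rounding error.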

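Setting the gradient to 0 and solving for W gives

$$W = (X^T X + \lambda I)^{-1} X^T y$$

This is the closed-form solution for the coefficients W. The class below instead estimates W iteratively by gradient descent; it averages the cost over the n samples and leaves the intercept unpenalized.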
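The closed-form expression can be evaluated directly with NumPy. This is a minimal sketch, assuming synthetic data and a hypothetical helper `ridge_closed_form`. It solves the linear system (XᵗX + λI)W = Xᵗy with `np.linalg.solve` rather than forming the inverse explicitly, then checks that the gradient is close to zero at the solution.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Solve (X^T X + lam I) W = X^T y for W."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Synthetic data (assumption): 50 samples, 3 features, small noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
true_W = np.array([2.0, -1.0, 0.5])
y = X @ true_W + 0.1 * rng.normal(size=50)

lam = 1.0
W = ridge_closed_form(X, y, lam)

# At the minimizer the gradient 2 X^T (XW - y) + 2 lam W should vanish.
gradient = 2 * X.T @ (X @ W - y) + 2 * lam * W
print(W)
print(np.max(np.abs(gradient)))
```

Note that this formula penalizes every entry of W. If X carried a column of ones for the intercept, the intercept would be shrunk as well.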


In [5]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt 
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
import sklearn.datasets

In [35]:
class RidgeRegression:
    def __init__(self, lr = 0.1, lmb = 1, max_iter = 1000, tol = 0.001):
        self.lr = lr
        self.lmb = lmb 
        self.max_iter = max_iter
        self.tol = tol
        self.w = np.array([])
    
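    def cost_function(self, X, y):
        # Mean ridge cost: (RSS + lmb * ||w[1:]||^2) / n; the intercept w[0] is not penalized.
        X = np.array(X)
        y = np.array(y)
        if X.ndim == 1:
            X = X.reshape(-1, 1)

        n = X.shape[0]
        # Prepend a column of ones so that w[0] acts as the intercept.
        X = np.concatenate((np.ones((n, 1)), X), axis=1)

        residual = X @ self.w - y
        return (residual @ residual + self.lmb * self.w[1:] @ self.w[1:]) / n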
    
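    def update_weights(self, X, y):
        # One gradient-descent step; X must already include the column of ones.
        n = X.shape[0]
        reg_term = 2 * self.lmb * self.w
        reg_term[0] = 0  # do not penalize the intercept
        # The factor 2 from differentiating the squared error is omitted here,
        # so this step descends (RSS / 2 + lmb * ||w[1:]||^2) / n rather than cost_function.
        grad = (1 / n) * (X.T @ (X @ self.w - y) + reg_term)
        self.w -= self.lr * grad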
    
    def fit(self, x_train, y_train):
        x_train = np.array(x_train)
        y_train = np.array(y_train)
        if(x_train.ndim == 1): 
            x_train = x_train.reshape(-1,1)
        
        n = x_train.shape[0]
        X = np.concatenate((np.ones((n, 1)), x_train), axis=1)
        self.w = np.zeros(X.shape[1])
        
        for _ in range(self.max_iter):
            cost_prev = self.cost_function(x_train, y_train)
            self.update_weights(X, y_train)
            
            if abs(cost_prev - self.cost_function(x_train, y_train)) < self.tol:
                break
        
    def predict(self, x_test):
        x_test = np.array(x_test)
        if(x_test.ndim == 1): 
            x_test = x_test.reshape(-1,1)
        
        x_test = np.concatenate((np.ones((x_test.shape[0], 1)), x_test), axis=1)

        return x_test @ self.w

In [42]:
df = pd.read_csv('C:/Users/anush/Downloads/Ecommerce.csv')
df = df.select_dtypes(include='number')  # keep only the numeric columns
df

| Unnamed: 0 | Avg Session Length | Time on App | Time on Website | Length of Membership | Yearly Amount Spent |
|---|---|---|---|---|---|
| 0 | 34.497268 | 12.655651 | 39.577668 | 4.082621 | 587.951054 |
| 1 | 31.926272 | 11.109461 | 37.268959 | 2.664034 | 392.204933 |
| 2 | 33.000915 | 11.330278 | 37.110597 | 4.104543 | 487.547505 |
| 3 | 34.305557 | 13.717514 | 36.721283 | 3.120179 | 581.852344 |
| 4 | 33.330673 | 12.795189 | 37.536653 | 4.446308 | 599.406092 |
| ... | ... | ... | ... | ... | ... |
| 495 | 33.237660 | 13.566160 | 36.417985 | 3.746573 | 573.847438 |
| 496 | 34.702529 | 11.695736 | 37.190268 | 3.576526 | 529.049004 |
| 497 | 32.646777 | 11.499409 | 38.332576 | 4.958264 | 551.620145 |
| 498 | 33.322501 | 12.391423 | 36.840086 | 2.336485 | 456.469510 |


In [43]:
xTrain, xTest, yTrain, yTest = train_test_split(df[df.columns.difference(['Yearly Amount Spent'])],df['Yearly Amount Spent'], test_size=0.2, random_state=10)

In [44]:
rr = RidgeRegression(lr = 0.00061, max_iter = 10000,lmb = 1.2)
rr.fit(xTrain,yTrain)
yPred = rr.predict(xTest)
rr.cost_function(xTest, yTest)

504.95142572203537

In [45]:
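# Note: sklearn's signature is r2_score(y_true, y_pred). With the arguments in this
# order yPred is treated as the reference, so the value below is not the usual test R².
r2_score(yPred, yTest)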

0.9237482980079497
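R² is not symmetric in its arguments: the total sum of squares is taken around the mean of the reference values, and scikit-learn's `r2_score` expects them first, as `r2_score(y_true, y_pred)`. The following sketch computes R² by hand to show how the two orderings differ. The helper `r_squared` and the four data points are assumptions for illustration, not values from the notebook.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical targets and predictions.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.5, 3.0])

correct_order = r_squared(y_true, y_pred)  # SS_tot from the true targets
swapped_order = r_squared(y_pred, y_true)  # SS_tot from the predictions
print(correct_order, swapped_order)
```

The residual sum of squares is the same in both calls. Only the denominator changes, which is enough to move the score substantially when the predictions have less spread than the targets.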