# **⚡ Stochastic Gradient Descent (From Scratch vs Scikit-Learn)**

In [None]:
!pip install numpy scikit-learn

In [2]:
import numpy as np
from sklearn.datasets import load_diabetes

## 🔹 Dataset

- Loaded the diabetes dataset from `sklearn.datasets`
- Extracted features `X` and target `y`

In [3]:
X,y = load_diabetes(return_X_y = True)

In [4]:
X.shape

(442, 10)

In [5]:
y.shape

(442,)

## 🔹 Data Preparation

- Split data into training (80%) and testing (20%) sets, with a fixed `random_state` so the split is reproducible

In [6]:
from sklearn.model_selection import train_test_split

In [7]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=18)

In [8]:
X_train.shape

(353, 10)

In [9]:
y_train.shape

(353,)

In [10]:
X_test.shape

(89, 10)

In [11]:
y_test.shape

(89,)

## 🔹 Scikit-Learn Linear Regression

In [12]:
from sklearn.linear_model import LinearRegression

In [13]:
lr = LinearRegression()
lr.fit(X_train, y_train)

In [14]:
y_pred_lr = lr.predict(X_test)

In [15]:
from sklearn.metrics import r2_score

In [16]:
score_lr = r2_score(y_test, y_pred_lr)
score_lr

0.49152545179667817

## 🔹 Custom Stochastic Gradient Descent

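- Implemented SGD for linear regression as a Python class
- Updated the bias and weights once per randomly sampled training point (sampling with replacement, so each epoch makes `n_samples` updates but may skip some rows)
- Printed the final model parameters (weights and bias) at the end of `fit`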
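
The per-sample update used in the class below follows from the squared error of a single training point. As a short derivation (the symbols $\eta$ for the learning rate and $i$ for the sampled row index are notation introduced here, not names from the code):

$$
\hat{y}_i = w_0 + \mathbf{w}^\top \mathbf{x}_i, \qquad L_i = (y_i - \hat{y}_i)^2
$$

$$
\frac{\partial L_i}{\partial w_0} = -2\,(y_i - \hat{y}_i), \qquad \frac{\partial L_i}{\partial \mathbf{w}} = -2\,(y_i - \hat{y}_i)\,\mathbf{x}_i
$$

$$
w_0 \leftarrow w_0 - \eta\,\frac{\partial L_i}{\partial w_0}, \qquad \mathbf{w} \leftarrow \mathbf{w} - \eta\,\frac{\partial L_i}{\partial \mathbf{w}}
$$

In the code, the two partial derivatives appear as `slope_w0` and `slope_w`, and $\eta$ is `self.lr`.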

In [17]:
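class SGDRegressor:
    """Linear regression trained with per-sample (stochastic) gradient descent."""

    def __init__(self, lr, epochs):
        self.lr = lr          # learning rate (step size)
        self.epochs = epochs  # number of passes, each making n_samples updates
        self.w0 = 0           # bias (intercept)
        self.w = None         # weight vector, created in fit()

    def fit(self, X_train, y_train):
        # Start with all weights at 1 and the bias at 0
        self.w = np.ones(X_train.shape[1])

        for i in range(self.epochs):
            for j in range(X_train.shape[0]):
                # Pick one training row at random (with replacement)
                index = np.random.randint(0, X_train.shape[0])

                # Prediction and residual for that single row
                y_hat = self.w0 + np.dot(X_train[index], self.w)
                error = y_train[index] - y_hat

                # Gradients of the squared error (y - y_hat)^2
                slope_w0 = -2 * error
                slope_w = -2 * error * X_train[index]

                # Step against the gradient
                self.w0 -= self.lr * slope_w0
                self.w -= self.lr * slope_w

        print(f"Bias is: {self.w0}\nWeight Vector is: {self.w}")

    def predict(self, X_test):
        return np.dot(X_test, self.w) + self.w0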

In [18]:
my_sgd = SGDRegressor(0.01,40)
my_sgd.fit(X_train, y_train)

Bias is: 154.79327424921456
Weight Vector is: [  70.67870293  -65.48531282  283.74558565  212.53673475   53.9752335
   26.25935802 -157.98189051  150.97307833  263.46057832  164.90467551]
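
Because the class above samples rows with an unseeded `np.random.randint`, the printed parameters differ from run to run. The following is a minimal sketch, not the notebook's own code, that applies the same update rule with a seeded generator to noise-free synthetic data, where the true parameters are known. The function name `sgd_fit`, the seeds, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def sgd_fit(X, y, lr=0.01, epochs=50, seed=0):
    """Per-sample SGD for linear regression; returns (bias, weights)."""
    rng = np.random.default_rng(seed)  # seeded so repeated runs match
    n_samples, n_features = X.shape
    w0, w = 0.0, np.ones(n_features)
    for _ in range(epochs):
        for _ in range(n_samples):
            i = rng.integers(0, n_samples)   # sample one row with replacement
            error = y[i] - (w0 + X[i] @ w)   # residual for that row
            w0 -= lr * (-2 * error)          # bias step against the gradient
            w -= lr * (-2 * error * X[i])    # weight step against the gradient
    return w0, w

# Hypothetical noise-free data generated from known parameters
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 3))
true_w0, true_w = 3.0, np.array([2.0, -1.0, 0.5])
y_demo = true_w0 + X_demo @ true_w

w0_hat, w_hat = sgd_fit(X_demo, y_demo)
print("Estimated bias:", w0_hat)
print("Estimated weights:", w_hat)
```

With noise-free targets the estimates should land very close to `true_w0` and `true_w`, which gives a quick sanity check of the update rule before applying it to real data.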


In [19]:
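# Predict with the custom SGD model
y_pred_mysgd = my_sgd.predict(X_test)

In [20]:
# The value changes between runs because fit() samples rows without a fixed seed
score_mysgd = r2_score(y_test, y_pred_mysgd)
score_mysgd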

## 🔹 Scikit-Learn SGDRegressor

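- Initialized and trained `SGDRegressor` with a constant learning rate (`eta0=0.01`) and `max_iter=100`
- Scored its test-set predictions for comparison; the fitted coefficients and intercept are available as `coef_` and `intercept_`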

In [21]:
# Note: this import rebinds the name SGDRegressor, shadowing the custom class defined above
from sklearn.linear_model import SGDRegressor
sgd = SGDRegressor(max_iter=100, learning_rate='constant', eta0=0.01)

In [22]:
sgd.fit(X_train, y_train)



In [23]:
y_pred_sgd = sgd.predict(X_test)

In [24]:
score_sgd = r2_score(y_test, y_pred_sgd)
score_sgd

0.38693069045034145
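
To compare the fitted parameters directly, the coefficients and intercept can be read from the trained estimator. The sketch below repeats the notebook's data loading and split so that it runs on its own; `random_state=0` on the estimator is an assumption added here for repeatable output and is not part of the original cell.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=18
)

# random_state=0 is an assumption added for repeatable output
sgd = SGDRegressor(max_iter=100, learning_rate='constant', eta0=0.01, random_state=0)
sgd.fit(X_train, y_train)

print("Intercept:", sgd.intercept_)   # array of shape (1,)
print("Coefficients:", sgd.coef_)     # array of shape (10,), one per feature
```

These can be placed side by side with the bias and weight vector printed by the custom class, and with `lr.intercept_` and `lr.coef_` from `LinearRegression`.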

## ✅ Conclusion

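- On this split, `LinearRegression` reaches an R2 score of about 0.49, while scikit-learn's `SGDRegressor` (constant learning rate 0.01, `max_iter=100`) reaches about 0.39, so the SGD model trails the `LinearRegression` fit here.
- The manual SGD's R2 score, computed from `my_sgd.predict(X_test)`, varies between runs because rows are sampled without a fixed seed; it should be compared against these two baselines before judging convergence.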
- This implementation builds foundational intuition behind gradient descent optimization.