### Linear Regression Implementation using Gradient Descent

This section implements linear regression with gradient descent. We choose a loss function, compute its partial derivatives with respect to the slope and the intercept, and then update both parameters in every epoch.

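1. **Write the loss function** (mean squared error, with \( \hat{y}_i = m x_i + b \))
   - \( L = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \)

2. **Initialize m and b** (the class below starts from \( b = 0 \) and \( m = 1 \))

3. **In each epoch, calculate the partial derivatives of L with respect to m and b**
   - \( \frac{\partial L}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i) \)
   - \( \frac{\partial L}{\partial m} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)\, x_i \)

4. **Update the values of m and b**
   - \( b_{\text{new}} = b_{\text{old}} - \text{learning\_rate} \times \frac{\partial L}{\partial b} \)
   - \( m_{\text{new}} = m_{\text{old}} - \text{learning\_rate} \times \frac{\partial L}{\partial m} \)

The partial derivatives are recomputed from the current m and b in every epoch, and the update is repeated for the chosen number of epochs.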
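Before the multi-feature class, the four steps can be sketched for a single feature, where m and b are plain scalars. This is a minimal illustration and not the notebook's implementation: the function name `gradient_descent_1d`, the toy data, and the learning rate are assumptions chosen for the example, and the loss is taken to be the mean of the squared errors (the form the class further down differentiates).

```python
import numpy as np

def gradient_descent_1d(x, y, learning_rate=0.05, epochs=1000):
    """Fit y ~ m*x + b by gradient descent on the mean squared error."""
    m, b = 1.0, 0.0              # step 2: arbitrary starting values
    n = len(x)
    for _ in range(epochs):
        y_hat = m * x + b        # predictions with the current m and b
        residual = y - y_hat
        # step 3: partial derivatives of L = mean((y - y_hat)^2)
        dL_db = -2.0 * residual.sum() / n
        dL_dm = -2.0 * (residual * x).sum() / n
        # step 4: move each parameter against its gradient
        b -= learning_rate * dL_db
        m -= learning_rate * dL_dm
    return m, b

# Toy data generated from y = 3x + 2 (made up for illustration)
x_toy = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_toy = 3.0 * x_toy + 2.0

m_fit, b_fit = gradient_descent_1d(x_toy, y_toy)
print(m_fit, b_fit)  # should end up close to 3 and 2
```

Because the toy data is noise-free, the recovered slope and intercept should approach the values used to generate it; on real data they approach the least-squares fit instead.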


In [1]:
import numpy as np
from sklearn.model_selection import train_test_split

In [12]:
class MyGDRegressor:
    """Linear regression fitted by batch gradient descent on the mean squared error."""

    def __init__(self, learning_rate=0.01, epochs=100):
        self.coef_ = None
        self.intercept_ = None
        self.lr = learning_rate
        self.epochs = epochs

    def fit(self, X_train, y_train):
        # Start from intercept 0 and a coefficient of 1 for every feature
        self.intercept_ = 0
        self.coef_ = np.ones(X_train.shape[1])

        for i in range(self.epochs):
            # Predictions with the current parameters
            y_hat = np.dot(X_train, self.coef_) + self.intercept_

            # Gradient of the MSE with respect to the intercept
            intercept_der = -2 * np.mean(y_train - y_hat)
            # Gradient of the MSE with respect to each coefficient
            coef_der = -2 * np.dot((y_train - y_hat), X_train) / X_train.shape[0]

            # Step against the gradient
            self.intercept_ = self.intercept_ - (self.lr * intercept_der)
            self.coef_ = self.coef_ - (self.lr * coef_der)

        print(self.intercept_, self.coef_)

    def predict(self, X_test):
        return np.dot(X_test, self.coef_) + self.intercept_
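
One way to gain confidence in the hand-derived gradients used in `fit` is to compare them with a numerical finite-difference estimate. The sketch below is an optional check, not part of the original notebook: the helper names (`mse_loss`, `analytic_gradients`, `numeric_gradients`) and the random demo data are assumptions, and the loss is taken to be the mean squared error.

```python
import numpy as np

def mse_loss(X, y, coef, intercept):
    """Mean squared error of the linear model X @ coef + intercept."""
    return np.mean((y - (X @ coef + intercept)) ** 2)

def analytic_gradients(X, y, coef, intercept):
    """Same formulas as in MyGDRegressor.fit."""
    residual = y - (X @ coef + intercept)
    intercept_der = -2 * np.mean(residual)
    coef_der = -2 * np.dot(residual, X) / X.shape[0]
    return coef_der, intercept_der

def numeric_gradients(X, y, coef, intercept, eps=1e-6):
    """Central finite differences, one parameter at a time."""
    coef_der = np.zeros_like(coef)
    for j in range(coef.size):
        step = np.zeros_like(coef)
        step[j] = eps
        up = mse_loss(X, y, coef + step, intercept)
        down = mse_loss(X, y, coef - step, intercept)
        coef_der[j] = (up - down) / (2 * eps)
    up = mse_loss(X, y, coef, intercept + eps)
    down = mse_loss(X, y, coef, intercept - eps)
    return coef_der, (up - down) / (2 * eps)

# Random demo data (made up for illustration)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = rng.normal(size=50)
coef_demo = np.ones(3)

a_coef, a_int = analytic_gradients(X_demo, y_demo, coef_demo, 0.0)
n_coef, n_int = numeric_gradients(X_demo, y_demo, coef_demo, 0.0)
print(np.allclose(a_coef, n_coef, atol=1e-5), np.isclose(a_int, n_int, atol=1e-5))
```

If the two estimates disagree, the derivative formulas (or a sign or a missing factor of 1/n) are the first place to look.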

In [13]:
# Test the class on the diabetes dataset and compare it with sklearn's LinearRegression
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

In [14]:
X,y = load_diabetes(return_X_y=True)

In [15]:
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=2)

In [16]:
reg = LinearRegression()
reg.fit(X_train,y_train)

LinearRegression()

In [17]:
print(reg.coef_)
print(reg.intercept_)

[  -9.16088483 -205.46225988  516.68462383  340.62734108 -895.54360867
  561.21453306  153.88478595  126.73431596  861.12139955   52.41982836]
151.8833452085463


In [18]:
y_pred = reg.predict(X_test)
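# r2_score's signature is r2_score(y_true, y_pred); the arguments here are in the
# reverse order, which is what produced the value shown below
r2_score(y_pred, y_test)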

-0.015305593565441589
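
The order of the arguments matters here because R² is not symmetric: sklearn documents the signature as `r2_score(y_true, y_pred)`, and the total sum of squares is taken around the mean of the first argument. The sketch below illustrates this with a hand-written `r2` helper (an assumption for this example, following the standard definition \( 1 - SS_{res}/SS_{tot} \)) and made-up numbers; it does not reproduce the diabetes scores.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot around mean(y_true)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up targets and predictions that are shrunk toward the mean
y_true_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred_demo = np.array([2.5, 2.8, 3.0, 3.2, 3.5])

r2_correct = r2(y_true_demo, y_pred_demo)   # about 0.42
r2_swapped = r2(y_pred_demo, y_true_demo)   # about -8.97
print(r2_correct, r2_swapped)
```

Predictions from a regression usually vary less than the targets, so swapping the arguments divides by a smaller total sum of squares and can push the score well below zero.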

In [23]:
# Now test our own model
gdr = MyGDRegressor(epochs=1000,learning_rate=0.5)

In [24]:
gdr.fit(X_train,y_train)

152.0135263267291 [  14.38915082 -173.72674118  491.54504015  323.91983579  -39.32680194
 -116.01099114 -194.04229501  103.38216641  451.63385893   97.57119174]


In [26]:
y_pred = gdr.predict(X_test)

In [27]:
# As above, the arguments are in the reverse of r2_score(y_true, y_pred)
r2_score(y_pred, y_test)

-0.04225644685048091

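### Comparison with sklearn

The intercepts agree closely (152.01 versus 151.88), and several coefficients are in the same range (for example 491.5 versus 516.7 and 323.9 versus 340.6). Others still differ substantially (for example -39.3 versus -895.5 and -116.0 versus 561.2), which likely means gradient descent has not fully converged after 1000 epochs. The two scores are close to each other (-0.042 versus -0.015), but both were computed as `r2_score(y_pred, y_test)`, whereas the function expects `(y_true, y_pred)`, so neither is the conventional test-set R².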
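
A plausible reason for gradient-descent coefficients to lag behind the least-squares values is strongly correlated features, which make the loss nearly flat in some directions. The sketch below uses made-up synthetic data (not the diabetes set) with two almost identical features and compares gradient descent after different numbers of epochs with the closed-form solution from `np.linalg.lstsq`. The helper `run_gd`, the data, and the learning rate are assumptions for illustration.

```python
import numpy as np

def run_gd(X, y, learning_rate, epochs):
    """Batch gradient descent on the MSE, same updates as MyGDRegressor."""
    coef = np.ones(X.shape[1])
    intercept = 0.0
    for _ in range(epochs):
        residual = y - (X @ coef + intercept)
        intercept_der = -2 * residual.mean()
        coef_der = -2 * (residual @ X) / X.shape[0]
        intercept -= learning_rate * intercept_der
        coef -= learning_rate * coef_der
    return coef, intercept

# Two highly correlated synthetic features (made up for illustration)
rng = np.random.default_rng(0)
z = rng.normal(size=200)
X_demo = np.column_stack([z, z + 0.05 * rng.normal(size=200)])
y_demo = 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 1] + 0.1 * rng.normal(size=200)

# Closed-form least-squares solution: [coef_0, coef_1, intercept]
A = np.column_stack([X_demo, np.ones(len(X_demo))])
exact, *_ = np.linalg.lstsq(A, y_demo, rcond=None)

gaps = {}
for epochs in (100, 10_000):
    coef, intercept = run_gd(X_demo, y_demo, 0.1, epochs)
    # Largest absolute difference from the least-squares parameters
    gaps[epochs] = np.abs(np.append(coef, intercept) - exact).max()
    print(epochs, gaps[epochs])
```

The gap shrinks as the number of epochs grows, but only slowly, which is consistent with the intercept converging quickly while some coefficients remain far from sklearn's values.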