# **Mathematical Formulation of Multiple Linear Regression**
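In multiple linear regression there is one coefficient per feature plus a single intercept, rather than one slope (m) and one intercept (b). Stacking the intercept and the coefficients into one vector β, the least-squares estimate can be computed in closed form with the following formula (the normal equation), where X is the feature matrix with a leading column of ones and y is the target vector: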

<center><img src="https://i0.wp.com/cmdlinetips.com/wp-content/uploads/2020/03/Linear_Regression_Beta_Hat_Matrix_Multiplication.png?resize=561%2C136&ssl=1" style="width:30%"></center>

Further reading (blog post):<br>
https://towardsdatascience.com/building-linear-regression-least-squares-with-linear-algebra-2adf071dd5dd
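
For readers who want the intermediate steps, the formula can be derived as follows. This is a sketch under the usual assumptions: X has n rows and a leading column of ones, and XᵀX is invertible (no perfectly collinear features). The symbols ε and n are introduced here for illustration only.

$$
\begin{aligned}
y &= X\beta + \varepsilon \\
\text{minimize}\quad L(\beta) &= \lVert y - X\beta \rVert^2 = (y - X\beta)^\top (y - X\beta) \\
\frac{\partial L}{\partial \beta} &= -2\,X^\top (y - X\beta) = 0 \\
X^\top X\,\beta &= X^\top y \\
\hat{\beta} &= (X^\top X)^{-1} X^\top y
\end{aligned}
$$

The first entry of $\hat{\beta}$ is the intercept and the remaining entries are the per-feature coefficients.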

## **Import Required Libraries**

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")

## **Read the Data**

In [2]:
# Load the diabetes dataset from scikit-learn
from sklearn.datasets import load_diabetes

In [3]:
X, y = load_diabetes(return_X_y=True)

In [4]:
X.shape

(442, 10)

## **Train Test Split**

In [5]:
from sklearn.model_selection import train_test_split

In [6]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
X_train.shape, X_test.shape

((309, 10), (133, 10))
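
The shapes above follow from simple arithmetic: 30% of 442 rows is 132.6, which is rounded up to 133 test rows, leaving 309 for training. The sketch below illustrates the idea of a shuffled split with plain NumPy. `shuffled_split_indices` is a hypothetical helper, and it will not select the same rows as `train_test_split(..., random_state=0)`, because the random number generators differ.

```python
import math
import numpy as np

def shuffled_split_indices(n_samples, test_size, seed=0):
    """Return (train_idx, test_idx) for a shuffled hold-out split.

    Hypothetical helper for illustration; not scikit-learn's implementation.
    """
    # Round the test fraction up to a whole number of rows
    n_test = math.ceil(test_size * n_samples)
    # Shuffle the row indices reproducibly
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    # First n_test shuffled rows go to the test set, the rest to training
    return perm[n_test:], perm[:n_test]

train_idx, test_idx = shuffled_split_indices(442, 0.3)
print(len(train_idx), len(test_idx))  # 309 133
```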

## **Train a Linear Regression Model** 

In [7]:
from sklearn.linear_model import LinearRegression

In [8]:
# Instantiate a LinearRegression object
lr = LinearRegression()

# Fit the training data
lr.fit(X_train, y_train)

In [9]:
# Print the coefficients
lr.coef_

array([ -52.46478548, -193.50733393,  579.49108514,  272.453666  ,
       -504.64830389,  241.62372969,  -69.76596029,   86.61313961,
        721.92083806,   26.78067442])

In [10]:
# Print the intercept
lr.intercept_

153.71901624380382

## **Accuracy Assessment**

In [11]:
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

In [12]:
# Predict on the test data
y_pred = lr.predict(X_test)

In [13]:
print("Mean Absolute Error (MAE):", mean_absolute_error(y_test, y_pred))
print("Mean Squared Error (MSE):", mean_squared_error(y_test, y_pred))
print("R2 Score:", r2_score(y_test, y_pred))

Mean Absolute Error (MAE): 44.61759529676317
Mean Squared Error (MSE): 3097.119163424609
R2 Score: 0.39289927216962917
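
For reference, the three metrics reported above can be written out directly with NumPy. This is a minimal sketch on made-up toy values, not the diabetes data, and `regression_metrics` is a hypothetical helper name used only for illustration.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE and R2 from their textbook definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mae = np.mean(np.abs(residuals))   # mean absolute error
    mse = np.mean(residuals ** 2)      # mean squared error
    ss_res = np.sum(residuals ** 2)    # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot         # coefficient of determination
    return mae, mse, r2

# Toy values chosen so the arithmetic is easy to follow
mae, mse, r2 = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(mae, mse, r2)
```

Read this way, an R2 of about 0.39 on the test set means the model accounts for roughly 39% of the variance in the held-out target.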


## **Build a Custom Linear Regression Model**

In [14]:
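class CustomLR:
    """Ordinary least squares fitted with the normal equation."""

    def __init__(self):
        self.coef_ = None
        self.intercept_ = None

    def fit(self, X_train, y_train):
        # Prepend a column of ones so the first beta is the intercept
        X = np.insert(X_train, 0, 1, axis=1)
        y = y_train
        # Normal equation: betas = (X^T X)^-1 X^T y
        betas = np.linalg.inv(np.dot(X.T, X)).dot(X.T).dot(y)
        self.intercept_ = betas[0]
        self.coef_ = betas[1:]

    def predict(self, X_test):
        # Linear prediction: X . coef + intercept
        y_pred = X_test.dot(self.coef_) + self.intercept_
        return y_pred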
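
The `fit` method above inverts XᵀX explicitly, which works here but can be numerically fragile when features are nearly collinear. As a hedged alternative, the same least-squares problem can be handed to `np.linalg.lstsq`, which avoids forming the inverse. The sketch below uses synthetic, noiseless data (an assumption made so the true coefficients are known), and `fit_ols_lstsq` is a hypothetical helper name.

```python
import numpy as np

def fit_ols_lstsq(X, y):
    """Fit OLS with np.linalg.lstsq; returns (intercept, coef)."""
    # Prepend a column of ones for the intercept term
    X_design = np.column_stack([np.ones(X.shape[0]), X])
    # Solve min ||X_design @ betas - y||^2 without an explicit inverse
    betas, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return betas[0], betas[1:]

# Synthetic data with known parameters (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
true_coef = np.array([1.5, -0.5, 3.0])
y_demo = 2.0 + X_demo @ true_coef

intercept, coef = fit_ols_lstsq(X_demo, y_demo)

# Same problem solved with the explicit normal equation, for comparison
X_design = np.column_stack([np.ones(50), X_demo])
betas_inv = np.linalg.inv(X_design.T @ X_design) @ X_design.T @ y_demo

print(np.allclose(intercept, betas_inv[0]), np.allclose(coef, betas_inv[1:]))
```

On well-conditioned data such as this, both approaches should agree to within floating-point tolerance.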

In [15]:
# Instantiate a CustomLR object
lr = CustomLR()

# Fit the training data
lr.fit(X_train, y_train)

In [16]:
# Print the coefficients
lr.coef_

array([ -52.46478548, -193.50733393,  579.49108514,  272.453666  ,
       -504.64830389,  241.62372969,  -69.76596029,   86.61313961,
        721.92083806,   26.78067442])

In [17]:
# Print the intercept
lr.intercept_

153.71901624380382

In [18]:
# Predict on the test data
y_pred = lr.predict(X_test)

In [19]:
# Check the accuracy
print("Mean Absolute Error (MAE):", mean_absolute_error(y_test, y_pred))
print("Mean Squared Error (MSE):", mean_squared_error(y_test, y_pred))
print("R2 Score:", r2_score(y_test, y_pred))

Mean Absolute Error (MAE): 44.61759529676319
Mean Squared Error (MSE): 3097.1191634246143
R2 Score: 0.39289927216962817
