![formula](formula.jpg)

This formula, \( \beta = (X^T X)^{-1} X^T Y \), calculates the coefficients \( \beta \) of a linear regression model. Here is what each term means:

- \( \beta \): The coefficients of the linear regression model. These coefficients represent the relationship between the independent variables (features) and the dependent variable (target).

- \( X \): The design matrix. It contains the values of the independent variables (features) for each data point. Each row represents a data point, and each column represents a different feature.

- \( Y \): The target vector. It contains the observed values of the dependent variable for each data point.

- \( X^T \): The transpose of the design matrix. This operation flips the rows and columns of the design matrix.

- \( (X^T X)^{-1} \): The inverse of \( X^T X \), the product of the transposed design matrix (\( X^T \)) and the design matrix itself (\( X \)). \( X^T X \) is known as the Gram matrix, so this term is the inverse of the Gram matrix. It exists only when the columns of \( X \) are linearly independent, and it is what lets the formula solve for the coefficients that minimize the sum of squared residuals.

- \( (X^T Y) \): The product of the transpose of the design matrix (\( X^T \)) and the target vector (\( Y \)). The result is a vector with one entry per column of \( X \); each entry is the dot product of that column with the target.

Putting it all together, the formula calculates the coefficients (\( \beta \)) in two steps: it inverts the product of the transposed design matrix and the design matrix (\( (X^T X)^{-1} \)), and then multiplies that inverse by the product of the transposed design matrix and the target vector (\( (X^T Y) \)).
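The two steps can be tried on a tiny example. This is a minimal sketch on made-up data (not the diabetes dataset used later), where the target follows `y = 1 + 2*x` exactly. It assumes a column of ones is placed in the design matrix so that the first coefficient acts as the intercept.

```python
import numpy as np

# Toy data, made up for illustration: y = 1 + 2*x exactly.
x = np.array([0.0, 1.0, 2.0, 3.0])
Y = 1.0 + 2.0 * x

# Design matrix: a column of ones (for the intercept), then the feature.
X = np.column_stack([np.ones_like(x), x])

# Step 1: invert X^T X.
XtX_inv = np.linalg.inv(X.T @ X)

# Step 2: multiply the inverse by X^T Y.
XtY = X.T @ Y
beta = XtX_inv @ XtY

print(beta)  # approximately [1. 2.]: intercept 1, slope 2
```

Because the toy data lies exactly on a line, the recovered coefficients match the intercept and slope used to generate it, up to floating point rounding.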

In simpler terms, this formula finds the best-fitting line (or hyperplane in higher dimensions): the one that minimizes the sum of squared differences between the predicted values (based on the independent variables) and the actual observed values (the target variable). It does so by solving a system of equations derived from the least squares method.
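For readers who want to see where that system of equations comes from, the derivation can be sketched as follows. The loss symbol \( L(\beta) \) is introduced here for illustration, and the last step assumes \( X^T X \) is invertible.

\[
\begin{aligned}
L(\beta) &= \lVert Y - X\beta \rVert^2 = (Y - X\beta)^T (Y - X\beta) \\
\nabla_\beta L(\beta) &= -2\, X^T (Y - X\beta) = 0 \\
X^T X \beta &= X^T Y \\
\beta &= (X^T X)^{-1} X^T Y
\end{aligned}
\]

The third line is the system of equations (often called the normal equations), and the last line is the formula above.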

In [1]:
import numpy as np
from sklearn.datasets import load_diabetes

Dataset details: https://scikit-learn.org/stable/datasets/toy_dataset.html#diabetes-dataset

In [2]:
X,y = load_diabetes(return_X_y=True)

In [3]:
X.shape

(442, 10)

In [4]:
y.shape

(442,)

## Using Sklearn's Linear Regression

In [5]:
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=2)

In [6]:
print(X_train.shape)
print(X_test.shape)

(353, 10)
(89, 10)


In [7]:
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)

In [8]:
from sklearn.metrics import r2_score

In [9]:
r2_score(y_test,y_pred)

0.4399338661568968

In [10]:
lr.coef_

array([  -9.15865318, -205.45432163,  516.69374454,  340.61999905,
       -895.5520019 ,  561.22067904,  153.89310954,  126.73139688,
        861.12700152,   52.42112238])

In [11]:
lr.intercept_

151.88331005254167

## Making our own Linear Regression Class

In [12]:
class MeraLR:
    
    def __init__(self):
        self.coef_ = None
        self.intercept_ = None
        
    def fit(self,X_train,y_train):
        X_train = np.insert(X_train,0,1,axis=1)
        
        # calculate the coeffs
        betas = np.linalg.inv(np.dot(X_train.T,X_train)).dot(X_train.T).dot(y_train)
        self.intercept_ = betas[0]
        self.coef_ = betas[1:]
    
    def predict(self,X_test):
        y_pred = np.dot(X_test,self.coef_) + self.intercept_
        return y_pred

Let's break down the code step by step:

```python
class MeraLR:
    def __init__(self):
        self.coef_ = None
        self.intercept_ = None
```

1. The `MeraLR` class is defined, which serves as a custom linear regression model.
2. The `__init__` method initializes the class attributes `coef_` and `intercept_` to `None`. These attributes will store the coefficients and intercept of the linear regression model once it's fitted.

```python
    def fit(self, X_train, y_train):
        X_train = np.insert(X_train, 0, 1, axis=1)
```

3. The `fit` method is defined, which takes training data `X_train` and corresponding target values `y_train` as input.
4. A column of ones is inserted at the start of the feature matrix `X_train` using `np.insert()` (value 1, at index 0, along axis 1), so every data point gets a leading constant 1. This lets the intercept be learned as the first element of `betas`.
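To see what that call does, here is a small sketch on a made-up 2x2 matrix (the name `X_small` is only for illustration):

```python
import numpy as np

# A tiny 2x2 feature matrix, made up for illustration.
X_small = np.array([[2.0, 3.0],
                    [4.0, 5.0]])

# Insert the value 1 before column index 0 (axis=1 means columns).
X_aug = np.insert(X_small, 0, 1, axis=1)

print(X_aug)
# [[1. 2. 3.]
#  [1. 4. 5.]]
```

`np.insert` returns a new array, so the caller's original matrix is left unchanged.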

```python
        # calculate the coeffs
        betas = np.linalg.inv(np.dot(X_train.T, X_train)).dot(X_train.T).dot(y_train)
        self.intercept_ = betas[0]
        self.coef_ = betas[1:]
```

5. The coefficients of the linear regression model are computed using the ordinary least squares (OLS) method.
6. First, the dot product of the transpose of `X_train` and `X_train` itself is computed (`np.dot(X_train.T, X_train)`).
7. Then, the inverse of this matrix product is calculated (`np.linalg.inv()`).
8. Next, the code multiplies this inverse by the transpose of `X_train` (`.dot(X_train.T)`) and then by the target vector `y_train` (`.dot(y_train)`).
9. Because matrix multiplication is associative, this gives the same result as first computing `np.dot(X_train.T, y_train)` and then multiplying the inverse by it, which is the grouping used in the formula. The result is the vector of coefficients (`betas`).
10. The intercept is set to the first element of `betas`, and the remaining coefficients are stored in `self.coef_`.

```python
    def predict(self, X_test):
        y_pred = np.dot(X_test, self.coef_) + self.intercept_
        return y_pred
```

11. The `predict` method takes test data `X_test` as input.
12. The predicted values (`y_pred`) are computed using the dot product of `X_test` and the coefficients (`self.coef_`), plus the intercept (`self.intercept_`).
13. The predicted values are returned.

In summary, this code defines a custom linear regression model `MeraLR` with methods to fit the model to training data (`fit`) and make predictions on test data (`predict`). The model computes coefficients using the OLS method and stores them along with the intercept for later use in prediction.
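One design note: forming an explicit inverse works here, but it fails when `X_train.T @ X_train` is singular and can lose accuracy when the features are nearly collinear. A possible alternative, shown only as a sketch, is to let `np.linalg.lstsq` solve the least squares problem directly. The helper `fit_ols_lstsq` and the synthetic data below are hypothetical and not part of `MeraLR`.

```python
import numpy as np

def fit_ols_lstsq(X, y):
    """Return (intercept, coefficients) from a least squares solve.

    Hypothetical helper for illustration; not part of MeraLR.
    """
    # Prepend the column of ones, as MeraLR.fit does.
    X_aug = np.insert(X, 0, 1, axis=1)
    # lstsq minimizes ||X_aug @ betas - y||^2 without forming an inverse.
    betas, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
    return betas[0], betas[1:]

# Synthetic data, made up for illustration: y = 3 + 1.5*x1 - 2*x2 + noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = 3.0 + X_demo @ np.array([1.5, -2.0]) + 0.01 * rng.normal(size=50)

intercept, coef = fit_ols_lstsq(X_demo, y_demo)

# Compare against the explicit-inverse formula used in MeraLR.
X_aug = np.insert(X_demo, 0, 1, axis=1)
betas_inv = np.linalg.inv(X_aug.T @ X_aug) @ X_aug.T @ y_demo
same = np.allclose(betas_inv, np.concatenate(([intercept], coef)))
print(same)
```

On well-conditioned data like this, both approaches should agree to within floating point tolerance; the difference matters mainly when the features are strongly correlated.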

In [13]:
lr = MeraLR()

In [14]:
lr.fit(X_train,y_train)

In [15]:
X_train.shape

(353, 10)

In [16]:
y_pred = lr.predict(X_test)

In [17]:
lr.coef_

array([  -9.15865318, -205.45432163,  516.69374454,  340.61999905,
       -895.5520019 ,  561.22067904,  153.89310954,  126.73139688,
        861.12700152,   52.42112238])

In [18]:
lr.intercept_

151.8833100525417
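
The coefficients and intercept match sklearn's output, up to floating point rounding in the last digits of the intercept. A natural follow-up check is to compare the R² scores as well. The sketch below shows how R² can be computed with NumPy alone; `r2_manual` is a hypothetical helper and the numbers are made up, not taken from the diabetes dataset.

```python
import numpy as np

def r2_manual(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (hypothetical helper for illustration)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Made-up values for illustration.
y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_demo = np.array([2.5, 5.5, 7.0, 9.5])

score = r2_manual(y_true_demo, y_pred_demo)
print(score)  # approximately 0.9625
```

In the notebook itself, calling `r2_score(y_test, y_pred)` on the `MeraLR` predictions should give the same value as the sklearn model, since the fitted parameters agree.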