
   $\text{Numerator} = \sum_{i=1}^n \left( (X_i - \bar{X}) \cdot (Y_i - \bar{Y}) \right)$

   Where:
   - $(X_i)$ is the $(i)$-th value of $(X)$.
   - $(Y_i)$ is the $(i)$-th value of $(Y)$.
   - $(\bar{X})$ is the mean of $(X)$.
   - $(\bar{Y})$ is the mean of $(Y)$.


   $\text{Denominator} = \sum_{i=1}^n \left( X_i - \bar{X} \right)^2$

   This represents the total squared deviation of $(X)$ values from their mean.
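
As a small worked example, the two sums can be computed by hand on a tiny dataset. The values of `X` and `Y` below are made up for illustration and are not taken from the notebook's data.

```python
# Toy data chosen for illustration only (not from the notebook's dataset).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

n = len(X)
x_bar = sum(X) / n  # mean of X, here 3.0
y_bar = sum(Y) / n  # mean of Y, here 4.0

# Numerator: sum of products of the paired deviations from the means.
numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

# Denominator: sum of squared deviations of X from its mean.
denominator = sum((x - x_bar) ** 2 for x in X)

print(numerator)    # 6.0
print(denominator)  # 10.0
```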

---

### Combined Usage:
The slope $(m)$ of the regression line is calculated as:

$
m = \frac{\text{Numerator}}{\text{Denominator}} = \frac{\sum_{i=1}^n \left( (X_i - \bar{X}) \cdot (Y_i - \bar{Y}) \right)}{\sum_{i=1}^n \left( X_i - \bar{X} \right)^2}
$

This formula gives the least-squares slope: paired with the matching intercept, it yields the line that minimizes the sum of squared errors between the predicted and actual $(Y)$-values.
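
The minimization claim can be checked numerically. The sketch below uses hypothetical toy data and a helper named `sse` (both introduced here for illustration), fits the line with the closed-form formulas, and then nudges the slope and intercept to see whether the error ever drops.

```python
import numpy as np

# Hypothetical toy data, used only to illustrate the claim.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

def sse(m, b):
    """Sum of squared errors of the line y = m*x + b on the toy data."""
    return float(np.sum((y - (m * x + b)) ** 2))

# Slope and intercept from the closed-form formulas.
x_dev = x - x.mean()
m = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)
b = y.mean() - m * x.mean()

best = sse(m, b)

# Nudge the slope and intercept away from the fitted values and record the error.
offsets = (-0.1, 0.0, 0.1)
perturbed = [
    sse(m + dm, b + db)
    for dm in offsets
    for db in offsets
    if (dm, db) != (0.0, 0.0)
]

print(best < min(perturbed))  # True: every nudged line has a larger error
```

A handful of nudges is not a proof, but it matches what the least-squares derivation predicts: the error surface is a bowl with its lowest point at the fitted $(m, b)$.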

---

### Explanation in Context:
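The numerator measures how $(X)$ and $(Y)$ vary together: it equals the covariance of $(X)$ and $(Y)$ multiplied by a normalizing constant. The denominator measures how $(X)$ varies by itself: it equals the variance of $(X)$ multiplied by the same constant. The constant cancels in the ratio, so dividing the two gives the slope of the best-fit line, $m = \mathrm{Cov}(X, Y) / \mathrm{Var}(X)$.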
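
To make the covariance and variance reading concrete, here is a minimal sketch on made-up data that computes the slope both ways. It assumes NumPy's sample statistics (`np.cov` and `np.var` with `ddof=1`), so both carry the same `1/(n-1)` factor.

```python
import numpy as np

# Made-up data for illustration.
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 5.0])

# Sample covariance of x and y, and sample variance of x.
cov_xy = np.cov(x, y)[0, 1]   # off-diagonal entry of the 2x2 covariance matrix
var_x = np.var(x, ddof=1)

slope_from_ratio = cov_xy / var_x

# Same slope from the raw sums of deviations.
x_dev = x - x.mean()
slope_from_sums = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)

print(np.isclose(slope_from_ratio, slope_from_sums))  # True
```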

---

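***Formula for line***

$y=mx+b$

The least-squares line always passes through the point of means $(\bar{x}, \bar{y})$, so substituting that point into the line equation gives:

$\bar{y} = m \cdot \bar{x} + b$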

Solving for $(b)$:

$b = \bar{y} - m \cdot \bar{x}$
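
A short sketch of the intercept step, again on made-up numbers: once $m$ is known, $b$ follows from the two means, and the resulting line passes through $(\bar{x}, \bar{y})$ by construction.

```python
import numpy as np

# Made-up data for illustration.
x = np.array([1.0, 2.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 6.0, 10.0])

x_dev = x - x.mean()
m = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)  # slope
b = y.mean() - m * x.mean()                               # intercept

# The fitted line evaluated at the mean of x returns the mean of y.
passes_through_means = np.isclose(m * x.mean() + b, y.mean())

# A consequence of this choice of b: the residuals sum to (numerically) zero.
residuals = y - (m * x + b)

print(m, b)                                 # 1.5 1.5
print(passes_through_means)                 # True
print(np.isclose(residuals.sum(), 0.0))     # True
```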




In [39]:
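class My_lr:
    """Simple (one-feature) linear regression using the closed-form least-squares formulas."""

    def __init__(self):
        self.m = None  # slope
        self.b = None  # intercept

    def fit(self, X_train, y_train):
        # Compute the means once instead of on every loop iteration.
        x_mean = X_train.mean()
        y_mean = y_train.mean()

        num = 0  # running sum of (X_i - x_mean) * (y_i - y_mean)
        den = 0  # running sum of (X_i - x_mean) ** 2

        for i in range(X_train.shape[0]):
            num = num + (X_train[i] - x_mean) * (y_train[i] - y_mean)  # numerator
            den = den + (X_train[i] - x_mean) ** 2  # denominator

        self.m = num / den
        self.b = y_mean - self.m * x_mean

        print(self.m)
        print(self.b)

    def predict(self, X_test):
        print('X_test: ', X_test)
        print('predict : ', end="")
        return self.m * X_test + self.b  # y = mx + b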
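
The explicit loop mirrors the formulas term by term. For comparison, the same fit can be written with vectorized NumPy operations. This is a minimal sketch rather than a replacement for the class: `fit_line`, `X_demo`, and `y_demo` are hypothetical names introduced here, and the inputs are flattened so that the result is a pair of plain floats instead of one-element arrays.

```python
import numpy as np

def fit_line(X, y):
    """Return (slope, intercept) of the least-squares line for one feature."""
    X = np.asarray(X, dtype=float).ravel()  # accept shape (n,) or (n, 1)
    y = np.asarray(y, dtype=float).ravel()

    x_dev = X - X.mean()
    m = np.dot(x_dev, y - y.mean()) / np.dot(x_dev, x_dev)
    b = y.mean() - m * X.mean()
    return float(m), float(b)

# Demo data lying exactly on y = 2x + 1.
X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
y_demo = np.array([3.0, 5.0, 7.0, 9.0])

m, b = fit_line(X_demo, y_demo)
print(m, b)  # 2.0 1.0
```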
        

In [40]:
import numpy as np 
import pandas as pd 

In [41]:
df = pd.read_csv('/home/aromal/Documents/jupyter_Lab/Datasets_Main/Linear_regression/placement.csv')
df.head(2)

|   | cgpa | package |
|---|------|---------|
| 0 | 6.89 | 3.26    |
| 1 | 5.12 | 1.98    |


In [42]:
X = df.iloc[:, 0:1].values
y = df.iloc[:,-1].values

In [43]:
from sklearn.model_selection import train_test_split

In [44]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)

In [45]:
X_train.shape

(160, 1)

In [46]:
lr = My_lr()
lr.fit(X_train, y_train)

[0.55795197]
[-0.89611192]


In [47]:
print(lr.predict(X_test[0]))

X_test:  [8.58]
predict : [3.89111601]
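
As a quick sanity check, the prediction can be reproduced by hand from the slope and intercept printed by `lr.fit`. The values below are copied from the outputs above, so the last digits may differ slightly from the notebook's result because the printed numbers are rounded.

```python
m = 0.55795197    # slope printed by lr.fit
b = -0.89611192   # intercept printed by lr.fit
cgpa = 8.58       # the test sample passed to lr.predict

package = m * cgpa + b  # y = mx + b
print(round(package, 4))  # 3.8911
```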
