There are two ways to find the parameters of a simple linear regression model:

 - Closed-form / Analytical solution (Direct formula)
 - Optimization approach (Gradient Descent)

### Closed-form solution (Direct math)

## Formulas for Simple Linear Regression

### Slope (m)
$$
m = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}
$$

### Intercept (b)
$$
b = \bar{y} - m\bar{x}
$$


## Explanation of the Formulas

- $x_i$ and $y_i$ represent individual data points from the dataset.
- $\bar{x}$ is the mean (average) of the input variable $X$.
- $\bar{y}$ is the mean (average) of the target variable $Y$.

## Interpretation of Slope ($m$)

- The numerator  
  $$
  \sum (x_i - \bar{x})(y_i - \bar{y})
  $$
  is proportional to the **covariance between $X$ and $Y$**, which measures how $X$ and $Y$ vary together.

- The denominator  
  $$
  \sum (x_i - \bar{x})^2
  $$
  is proportional to the **variance of $X$**, which measures how $X$ varies on its own. Both sums carry the same scaling factor ($n$ or $n-1$), so it cancels and $m = \mathrm{Cov}(X, Y) / \mathrm{Var}(X)$.

- The slope $m$ indicates **how much the predicted $Y$ changes for a unit change in $X$**.
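As a quick check of this ratio, the sketch below compares the slope computed from the raw sums with the slope computed from NumPy's sample covariance and variance. The arrays `x` and `y` are a small made-up dataset, not the placement data used later.

```python
import numpy as np

# Toy data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Slope from the raw sums in the formula.
num = np.sum((x - x.mean()) * (y - y.mean()))
den = np.sum((x - x.mean()) ** 2)
m_sums = num / den

# Same slope from the sample covariance and sample variance.
# Both divide by n - 1, so the factor cancels in the ratio.
m_cov = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print(m_sums, m_cov)
```

Both values should agree up to floating-point rounding.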

## Interpretation of Intercept ($b$)

- The intercept is calculated as:
  $$
  b = \bar{y} - m\bar{x}
  $$

- It ensures that the regression line passes through the point $(\bar{x}, \bar{y})$.
- It represents the predicted value of $Y$ when $X = 0$.
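Both properties of the intercept can be checked numerically. This is a minimal sketch on made-up data (the arrays `x` and `y` are assumptions for illustration): it computes $m$ and $b$ from the formulas, then evaluates the line at $\bar{x}$ and at $X = 0$.

```python
import numpy as np

# Toy data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - m * x.mean()

pred_at_mean = m * x.mean() + b   # should equal the mean of y
pred_at_zero = m * 0.0 + b        # equals b by construction

print(pred_at_mean, y.mean())
print(pred_at_zero, b)
```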


### Total Error (E)

$$
E = \sum_{i=1}^{n} d_i^2
$$


### Error Term ($d_i$)

$$
d_i = y_i - \hat{y}_i
$$

where:
- $y_i$ is the actual value  
- $\hat{y}_i$ is the predicted value given by the model
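The two definitions above translate almost directly into NumPy. In this sketch the data and the line parameters `m` and `b` are made up for illustration.

```python
import numpy as np

# Toy data and line parameters, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
m, b = 0.6, 2.2

y_hat = m * x + b     # predicted values
d = y - y_hat         # error term d_i for each data point
E = np.sum(d ** 2)    # total squared error

print(d)
print(E)
```

Squaring the errors before summing keeps positive and negative errors from cancelling each other out.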



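Substituting the error term into the total error gives:

$$
E = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

We have to find the values of $m$ and $b$ for which $E$ is minimum. They do not appear in this formula yet, but they enter through the prediction:

$$
\hat{y}_i = m x_i + b
$$

Substituting the prediction into the total error:

$$
E(m,b) = \sum_{i=1}^{n} (y_i - (m x_i + b))^2
$$

Removing the inner parentheses gives the function we have to minimize:

$$
E(m,b) = \sum_{i=1}^{n} (y_i - m x_i - b)^2
$$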
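To make the idea of minimizing $E(m, b)$ concrete, the sketch below writes the error as a Python function and checks that nudging $m$ or $b$ away from the least-squares values does not lower it. The helper name `total_error` and the toy data are assumptions for illustration.

```python
import numpy as np

def total_error(m, b, x, y):
    """Sum of squared errors E(m, b) for the line y_hat = m * x + b."""
    return float(np.sum((y - m * x - b) ** 2))

# Toy data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Least-squares values for this toy data (from the slope and intercept formulas).
m_best, b_best = 0.6, 2.2
e_best = total_error(m_best, b_best, x, y)

# Evaluate E at the eight neighbouring points on a small grid around the optimum.
nudged = [
    total_error(m_best + dm, b_best + db, x, y)
    for dm in (-0.1, 0.0, 0.1)
    for db in (-0.1, 0.0, 0.1)
    if (dm, db) != (0.0, 0.0)
]

print(e_best, min(nudged))
```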

## Final Formula for Slope ($m$) in Simple Linear Regression

$$
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sum_{i=1}^{n} (x_i - \bar{x})^2}
$$


## Linear Regression: Finding Parameters $m$ and $b$

---

## Case 1: Linear Regression with Intercept Fixed at Zero ($b = 0$)

When the intercept is assumed to be zero, the linear regression model becomes:

$$
\hat{y} = mx
$$

---

### Error Function

The total squared error is given by:

$$
E(m) = \sum_{i=1}^{n} (y_i - m x_i)^2
$$

---

### Minimization of Error

To find the optimal value of $m$, we differentiate the error function with respect to $m$ and set it equal to zero.

#### Step 1: Differentiate with respect to $m$

$$
\frac{dE}{dm}
= \sum_{i=1}^{n} 2 (y_i - m x_i)(-x_i)
$$

---

#### Step 2: Set derivative equal to zero

$$
\sum_{i=1}^{n} x_i (y_i - m x_i) = 0
$$

---

#### Step 3: Expand the summation

$$
\sum_{i=1}^{n} x_i y_i - m \sum_{i=1}^{n} x_i^2 = 0
$$

---

#### Step 4: Solve for $m$

$$
m = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}
$$

---

### Final Result (Intercept Fixed at Zero)

$$
\boxed{m = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}}
$$
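A minimal sketch of the zero-intercept case, using three made-up points: it computes $m$ from the boxed formula and confirms that the derivative from Step 1 is approximately zero at that value.

```python
import numpy as np

# Toy data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.1, 5.9])

# Slope of the best-fit line through the origin.
m = np.sum(x * y) / np.sum(x ** 2)

# dE/dm evaluated at the fitted slope; should be zero up to rounding.
grad = np.sum(2 * (y - m * x) * (-x))

print(m, grad)
```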

---

## Case 2: Linear Regression with Intercept ($b$ not fixed)

The general linear regression model is:

$$
\hat{y} = m x + b
$$

---

### Error Function

The total squared error is defined as:

$$
E(m,b) = \sum_{i=1}^{n} (y_i - m x_i - b)^2
$$

To find the optimal values of $m$ and $b$, we minimize the error by taking **partial derivatives** with respect to both parameters and setting them equal to zero.

---

### Partial Derivative with respect to $m$

$$
\frac{\partial E}{\partial m}
= \sum_{i=1}^{n} 2 (y_i - m x_i - b)(-x_i)
$$

Setting it to zero:

$$
\sum_{i=1}^{n} x_i (y_i - m x_i - b) = 0
$$

---

### Partial Derivative with respect to $b$

$$
\frac{\partial E}{\partial b}
= \sum_{i=1}^{n} 2 (y_i - m x_i - b)(-1)
$$

Setting it to zero:

$$
\sum_{i=1}^{n} (y_i - m x_i - b) = 0
$$
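The two conditions can be verified numerically. This sketch assumes a small made-up dataset, fits $m$ and $b$ with the closed-form formulas, and evaluates the left-hand side of each condition; both should be zero up to rounding.

```python
import numpy as np

# Toy data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - m * x.mean()

r = y - m * x - b          # residuals at the fitted line
eq_m = np.sum(x * r)       # condition from the derivative with respect to m
eq_b = np.sum(r)           # condition from the derivative with respect to b

print(eq_m, eq_b)
```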

---

### Final Insight

Solving these two equations simultaneously gives the optimal values of:
- $m$ (slope)
- $b$ (intercept)

These values minimize the total squared error and define the **best-fit regression line**.
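For completeness, here is one way to carry out that simultaneous solution. Dividing the condition from $\partial E / \partial b$ by $n$ gives the intercept:

$$
\sum_{i=1}^{n} y_i - m \sum_{i=1}^{n} x_i - n b = 0
\quad\Rightarrow\quad
b = \bar{y} - m\bar{x}
$$

Substituting this $b$ into the condition from $\partial E / \partial m$:

$$
\sum_{i=1}^{n} x_i \left[ (y_i - \bar{y}) - m (x_i - \bar{x}) \right] = 0
\quad\Rightarrow\quad
m = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})}
$$

Because $\sum (y_i - \bar{y}) = 0$ and $\sum (x_i - \bar{x}) = 0$, subtracting $\bar{x}$ times these zero sums from the numerator and denominator changes nothing, which gives the familiar form:

$$
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
$$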


In [8]:
import numpy as np
import pandas as pd

In [16]:
# Closed-form (analytical) implementation of Simple Linear Regression from scratch.
class DiwLR:
    
    def __init__(self):
        self.m = None  # slope, set by fit()
        self.b = None  # intercept, set by fit()
        
    def fit(self, X_train, y_train):
        # Compute the means once instead of on every loop iteration.
        x_mean = X_train.mean()
        y_mean = y_train.mean()
        
        num = 0  # sum of (x_i - x_mean) * (y_i - y_mean)
        den = 0  # sum of (x_i - x_mean)^2
        
        for i in range(X_train.shape[0]):
            dx = X_train[i] - x_mean
            num += dx * (y_train[i] - y_mean)
            den += dx * dx
        
        self.m = num / den
        self.b = y_mean - (self.m * x_mean)
        
        print("Slope (m):", self.m)
        print("Intercept (b):", self.b)
    
    def predict(self, X_test):
        return self.m * X_test + self.b
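The explicit loop in `fit` mirrors the formula term by term. As an alternative sketch, the same sums can be written with vectorized NumPy operations and cross-checked against `np.polyfit` with degree 1. The function name `fit_closed_form` and the toy data are assumptions for illustration, not part of the class above.

```python
import numpy as np

def fit_closed_form(x, y):
    """Return (m, b) from the closed-form formulas using vectorized NumPy."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    m = np.sum(dx * (y - y.mean())) / np.sum(dx ** 2)
    b = y.mean() - m * x.mean()
    return m, b

# Toy data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

m, b = fit_closed_form(x, y)

# np.polyfit returns coefficients from the highest power down: [slope, intercept].
m_ref, b_ref = np.polyfit(x, y, deg=1)

print(m, b)
print(m_ref, b_ref)
```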

In [17]:
df = pd.read_csv("placement.csv")

In [18]:
df.head()

   cgpa  package
0  6.89     3.26
1  5.12     1.98
2  7.82     3.25
3  7.42     3.67
4  6.94     3.57


In [19]:
X = df.iloc[:,0].values
y = df.iloc[:,1].values

In [20]:
X

array([6.89, 5.12, 7.82, 7.42, 6.94, 7.89, 6.73, 6.75, 6.09, 8.31, 5.32,
       6.61, 8.94, 6.93, 7.73, 7.25, 6.84, 5.38, 6.94, 7.48, 7.28, 6.85,
       6.14, 6.19, 6.53, 7.28, 8.31, 5.42, 5.94, 7.15, 7.36, 8.1 , 6.96,
       6.35, 7.34, 6.87, 5.99, 5.9 , 8.62, 7.43, 9.38, 6.89, 5.95, 7.66,
       5.09, 7.87, 6.07, 5.84, 8.63, 8.87, 9.58, 9.26, 8.37, 6.47, 6.86,
       8.2 , 5.84, 6.6 , 6.92, 7.56, 5.61, 5.48, 6.34, 9.16, 7.36, 7.6 ,
       5.11, 6.51, 7.56, 7.3 , 5.79, 7.47, 7.78, 8.44, 6.85, 6.97, 6.94,
       8.99, 6.59, 7.18, 7.63, 6.1 , 5.58, 8.44, 4.26, 4.79, 7.61, 8.09,
       4.73, 6.42, 7.11, 6.22, 7.9 , 6.79, 5.83, 6.63, 7.11, 5.98, 7.69,
       6.61, 7.95, 6.71, 5.13, 7.05, 7.62, 6.66, 6.13, 6.33, 7.76, 7.77,
       8.18, 5.42, 8.58, 6.94, 5.84, 8.35, 9.04, 7.12, 7.4 , 7.39, 5.23,
       6.5 , 5.12, 5.1 , 6.06, 7.33, 5.91, 6.78, 7.93, 7.29, 6.68, 6.37,
       5.84, 6.05, 7.2 , 6.1 , 5.64, 7.14, 7.91, 7.19, 7.91, 6.76, 6.93,
       4.85, 6.17, 5.84, 6.07, 5.66, 7.57, 8.28, 6.

In [21]:
y

array([3.26, 1.98, 3.25, 3.67, 3.57, 2.99, 2.6 , 2.48, 2.31, 3.51, 1.86,
       2.6 , 3.65, 2.89, 3.42, 3.23, 2.35, 2.09, 2.98, 2.83, 3.16, 2.93,
       2.3 , 2.48, 2.71, 3.65, 3.42, 2.16, 2.24, 3.49, 3.26, 3.89, 3.08,
       2.73, 3.42, 2.87, 2.84, 2.43, 4.36, 3.33, 4.02, 2.7 , 2.54, 2.76,
       1.86, 3.58, 2.26, 3.26, 4.09, 4.62, 4.43, 3.79, 4.11, 2.61, 3.09,
       3.39, 2.74, 1.94, 3.09, 3.31, 2.19, 1.61, 2.09, 4.25, 2.92, 3.81,
       1.63, 2.89, 2.99, 2.94, 2.35, 3.34, 3.62, 4.03, 3.44, 3.28, 3.15,
       4.6 , 2.21, 3.  , 3.44, 2.2 , 2.17, 3.49, 1.53, 1.48, 2.77, 3.55,
       1.48, 2.72, 2.66, 2.14, 4.  , 3.08, 2.42, 2.79, 2.61, 2.84, 3.83,
       3.24, 4.14, 3.52, 1.37, 3.  , 3.74, 2.82, 2.19, 2.59, 3.54, 4.06,
       3.76, 2.25, 4.1 , 2.37, 1.87, 4.21, 3.33, 2.99, 2.88, 2.65, 1.73,
       3.02, 2.01, 2.3 , 2.31, 3.16, 2.6 , 3.11, 3.34, 3.12, 2.49, 2.01,
       2.48, 2.58, 2.83, 2.6 , 2.1 , 3.13, 3.89, 2.4 , 3.15, 3.18, 3.04,
       1.54, 2.42, 2.18, 2.46, 2.21, 3.4 , 3.67, 2.

In [23]:
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=42)

In [24]:
X_train.shape

(160,)

In [25]:
model = DiwLR()

In [26]:
model.fit(X_train, y_train)

Slope (m): 0.5742564727019197
Intercept (b): -1.0270069374542108


In [27]:
X_train.shape[0]

160

In [28]:
X_train[0]

7.18

In [30]:
X_train.mean()

7.005375000000001

In [31]:
X_test[0]

6.63

In [32]:
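# The fitted object in this notebook is `model`; `lr` is not defined in any cell shown here.
# With the slope and intercept printed above, the prediction for 6.63 is
# 0.5742564727 * 6.63 - 1.0270069375, which is about 2.78.
print(model.predict(X_test[0]))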


#### Gradient descent–based Linear Regression from scratch

In [15]:
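# Gradient descent–based Linear Regression from scratch.
class LinearRegressionScratch:    
    def __init__(self, lr=0.01, epochs=1000):
        self.lr = lr          # learning rate (step size)
        self.epochs = epochs  # number of passes over the full dataset
        self.m = 0
        self.b = 0
        
    def fit(self, X, y):
        # Accept lists as well as arrays; the updates below rely on NumPy arithmetic.
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        n = len(X)
        
        for _ in range(self.epochs):
            y_pred = self.m * X + self.b
            
            # Gradients of the mean squared error E / n with respect to m and b.
            dm = (-2/n) * np.sum(X * (y - y_pred))
            db = (-2/n) * np.sum(y - y_pred)
            
            # Step against the gradient.
            self.m -= self.lr * dm
            self.b -= self.lr * db
            
    def predict(self, X):
        return self.m * X + self.b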
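To see the update rule in action, the sketch below applies the same gradient steps to noise-free synthetic data generated from a known line, then compares the result with the closed-form solution. The function name `gradient_descent_fit`, the synthetic data, and the choice of `lr=0.1` with `epochs=5000` are assumptions for illustration; other datasets may need a different learning rate or feature scaling to converge.

```python
import numpy as np

def gradient_descent_fit(X, y, lr=0.1, epochs=5000):
    """Fit y_hat = m * X + b with the same update rule as the class above."""
    m, b = 0.0, 0.0
    n = len(X)
    for _ in range(epochs):
        y_pred = m * X + b
        dm = (-2 / n) * np.sum(X * (y - y_pred))
        db = (-2 / n) * np.sum(y - y_pred)
        m -= lr * dm
        b -= lr * db
    return m, b

# Synthetic data on the exact line y = 2x + 1 (no noise), for illustration only.
X = np.linspace(0.0, 1.0, 50)
y = 2.0 * X + 1.0

m_gd, b_gd = gradient_descent_fit(X, y)

# Closed-form solution on the same data for comparison.
m_cf = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
b_cf = y.mean() - m_cf * X.mean()

print(m_gd, b_gd)
print(m_cf, b_cf)
```

On this data both approaches should recover a slope close to 2 and an intercept close to 1. The closed-form route gets there in one step, while gradient descent approaches it iteratively.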