<font color = "red" size = 12>📘 Linear Regression: Step-by-Step Derivation Notes</font>

---

**✍️ 1. Objective of Linear Regression**

We want to find a line:

$$
y = mx + b
$$

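that **best fits** the given data points $ (x_1, y_1), (x_2, y_2), \dots, (x_n, y_n) $, where $ m $ is the slope and $ b $ is the intercept.
Our goal is to choose $ m $ and $ b $ so that the **error between the actual values and the predicted values** is as small as possible.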



**📉 2. Error or Loss Function (Sum of Squared Errors)**

We define the **error (residual)** for each data point as:

$$
e_i = y_i - \hat{y}_i = y_i - (mx_i + b)
$$

- $\hat{y}_i$ is the predicted value for input $x_i$

To penalize large errors and avoid cancellation of positive/negative errors, we square each residual and sum over all data points:

$$
E = \sum_{i=1}^{n} (y_i - (mx_i + b))^2
$$

This is called the **Loss Function** or **Cost Function**:

$$
J(m, b) = \sum_{i=1}^{n} (y_i - mx_i - b)^2
$$
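The loss above translates almost directly into code. The following is a minimal sketch, assuming the data arrive as two equal-length Python lists; the function name `sum_squared_errors` and the toy data are hypothetical and chosen only for illustration.

```python
def sum_squared_errors(xs, ys, m, b):
    """Return J(m, b) = sum_i (y_i - m*x_i - b)^2 for paired samples."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print(sum_squared_errors(xs, ys, 2.0, 1.0))  # the true line: every residual is zero
print(sum_squared_errors(xs, ys, 1.0, 1.0))  # a worse slope gives a larger loss
```

Comparing the two printed values shows how $ J $ grows as the candidate line moves away from the data.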



**🔍 3. Goal: Minimize the Loss Function**

We want to find the values of $ m $ and $ b $ such that $ J(m, b) $ is **minimum**.

To do this, we apply **differentiation** (calculus) and set partial derivatives to zero.



**🧮 4. Differentiation of Loss Function**

Recall the loss function:

$$
J(m, b) = \sum_{i=1}^{n} (y_i - mx_i - b)^2
$$

We take the **partial derivatives** of $ J $ with respect to $ m $ and $ b $, applying the chain rule to each squared term.



**🔹 Partial Derivative w.r.t. $ m $:**

$$
\frac{\partial J}{\partial m} = \sum_{i=1}^{n} 2(y_i - mx_i - b)(-x_i)
$$



**🔹 Partial Derivative w.r.t. $ b $:**

$$
\frac{\partial J}{\partial b} = \sum_{i=1}^{n} 2(y_i - mx_i - b)(-1)
$$
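One way to gain confidence in the two derivative formulas is to compare them against a numerical approximation. The sketch below is an illustration only: the helper names (`loss`, `gradient`, `numeric_gradient`), the step size `h`, and the toy data are all assumptions, not part of the derivation.

```python
def loss(xs, ys, m, b):
    """J(m, b): sum of squared residuals."""
    return sum((y - m * x - b) ** 2 for x, y in zip(xs, ys))


def gradient(xs, ys, m, b):
    """Analytic partial derivatives (dJ/dm, dJ/db) from the formulas above."""
    dj_dm = sum(2 * (y - m * x - b) * (-x) for x, y in zip(xs, ys))
    dj_db = sum(2 * (y - m * x - b) * (-1) for x, y in zip(xs, ys))
    return dj_dm, dj_db


def numeric_gradient(xs, ys, m, b, h=1e-6):
    """Central finite-difference approximation of the same two derivatives."""
    dj_dm = (loss(xs, ys, m + h, b) - loss(xs, ys, m - h, b)) / (2 * h)
    dj_db = (loss(xs, ys, m, b + h) - loss(xs, ys, m, b - h)) / (2 * h)
    return dj_dm, dj_db


# Hypothetical toy data and an arbitrary evaluation point (m, b) = (0.5, 0.5).
xs = [0.0, 1.0, 2.0]
ys = [1.0, 2.0, 4.0]
analytic = gradient(xs, ys, 0.5, 0.5)
numeric = numeric_gradient(xs, ys, 0.5, 0.5)
print("analytic:", analytic)
print("numeric: ", numeric)
```

The two printed pairs should agree up to rounding, since a central difference is exact for a quadratic function apart from floating-point error.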



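Because $ J $ is a convex quadratic in $ m $ and $ b $, its minimum lies where both partial derivatives are zero:

$$
\frac{\partial J}{\partial m} = -2\sum_{i=1}^{n} x_i (y_i - mx_i - b) = 0, \qquad \frac{\partial J}{\partial b} = -2\sum_{i=1}^{n} (y_i - mx_i - b) = 0
$$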



**🧩 5. Solving for $ m $ and $ b $ (Closed-form)**

Solving step-by-step:

**Step 1: Solve for $ b $**

From $ \frac{\partial J}{\partial b} = 0 $:

$$
\sum_{i=1}^{n} (y_i - mx_i - b) = 0
$$

Distribute the sum (note that $ \sum_{i=1}^{n} b = nb $):

$$
\sum y_i - m\sum x_i - nb = 0
$$

Rearrange:

$$
b = \frac{\sum y_i - m\sum x_i}{n}
$$



**Step 2: Solve for $ m $**

From $ \frac{\partial J}{\partial m} = 0 $:

$$
\sum (y_i - mx_i - b)x_i = 0
$$

Expanding:

$$
\sum y_i x_i - m\sum x_i^2 - b\sum x_i = 0
$$

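Substitute $ b = \bar{y} - m\bar{x} $ from Step 1 (where $ \bar{x} = \frac{1}{n}\sum x_i $ and $ \bar{y} = \frac{1}{n}\sum y_i $) and use $ \sum x_i = n\bar{x} $:

$$
\sum x_i y_i - m\sum x_i^2 - (\bar{y} - m\bar{x})\,n\bar{x} = 0
\quad\Longrightarrow\quad
m\left(\sum x_i^2 - n\bar{x}^2\right) = \sum x_i y_i - n\bar{x}\bar{y}
$$

Since $ \sum x_i y_i - n\bar{x}\bar{y} = \sum (x_i - \bar{x})(y_i - \bar{y}) $ and $ \sum x_i^2 - n\bar{x}^2 = \sum (x_i - \bar{x})^2 $, solving for $ m $ gives: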



**✅ Final Formula for $ m $ (slope):**

$$
m = \frac{\sum (x_i - \bar{x}) (y_i - \bar{y})}{\sum (x_i - \bar{x})^2}
$$

or equivalently,

$$
m = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - (\sum x_i)^2}
$$



**✅ Final Formula for $ b $ (intercept):**

$$
b = \bar{y} - m\bar{x}
$$

Where:

$$
\bar{x} = \frac{1}{n} \sum x_i, \quad \bar{y} = \frac{1}{n} \sum y_i
$$
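The closed-form formulas can be turned into a small fitting routine. This is a minimal sketch using the centered form of the slope; the function name `fit_line`, the input validation, and the toy data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept via the centered closed-form formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired samples")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the slope formula.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    m = s_xy / s_xx
    b = y_bar - m * x_bar
    return m, b


# Hypothetical toy data lying exactly on y = 2x + 1.
m, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(m, b)
```

The guard on `s_xx` reflects a real limitation of the formula: if every $ x_i $ is the same, the denominator is zero and no unique slope exists.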

---

**📊 6. Example Calculation**

Let's use the dataset:

| $ x $ | $ y $ |
| --- | --- |
| 1 | 2 |
| 2 | 2.5 |
| 3 | 3 |
| 4 | 4.5 |
| 5 | 5 |

**Step 1: Compute Summations**

**Compute Means:**
$$
\bar{x} = \frac{1+2+3+4+5}{5} = 3
$$
$$
\bar{y} = \frac{2+2.5+3+4.5+5}{5} = 3.4
$$

**Compute $ \sum x_i y_i $, $ \sum x_i^2 $:**

$$
\sum x_i y_i = (1)(2) + (2)(2.5) + (3)(3) + (4)(4.5) + (5)(5) = 2 + 5 + 9 + 18 + 25 = 59
$$

$$
\sum x_i^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 = 55
$$

**Step 2: Compute $ m $ and $ b $**

Using the formulas, with $ \sum x_i = 15 $ and $ \sum y_i = 17 $:

$$
m = \frac{5(59) - (1+2+3+4+5)(2+2.5+3+4.5+5)}{5(55) - (1+2+3+4+5)^2}
$$

$$
m = \frac{295 - 255}{275 - 225} = \frac{40}{50} = 0.8
$$

$$
b = \bar{y} - m\bar{x} = 3.4 - (0.8)(3) = 3.4 - 2.4 = 1.0
$$

**✅ Final Regression Line:**

$$
y = 0.8x + 1.0
$$
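Hand arithmetic like this is easy to get wrong, so it can help to recompute the sums by machine. The sketch below mirrors the raw-sum formula on the dataset from the table; the variable names are illustrative assumptions. It also prints two residual sums, which correspond to the conditions $ \partial J / \partial b = 0 $ and $ \partial J / \partial m = 0 $ and should both be zero up to rounding for a correct fit.

```python
# Dataset from the table above.
xs = [1, 2, 3, 4, 5]
ys = [2, 2.5, 3, 4.5, 5]
n = len(xs)

# The four summations used by the raw-sum slope formula.
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_xx = sum(x * x for x in xs)

m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
b = sum_y / n - m * sum_x / n

# Residuals of the fitted line; both sums below should vanish at the minimum.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
print("sum x, sum y, sum xy, sum x^2:", sum_x, sum_y, sum_xy, sum_xx)
print("m =", m, "b =", b)
print("sum of residuals:", sum(residuals))
print("sum of x * residual:", sum(x * r for x, r in zip(xs, residuals)))
```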



**🚀 7. Bonus (Optional for Future - Matrix Form)**

In vectorized/matrix form, the closed-form solution is:

$$
\theta = (X^TX)^{-1}X^Ty
$$

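Where $ X $ is the design matrix including the bias term (a column of 1s alongside the column of $ x_i $ values), $ y $ is the target vector, and $ \theta $ is the parameter vector holding the intercept $ b $ and the slope $ m $.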
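For a single feature, $ X^TX $ is only a 2×2 matrix, so the matrix formula can be evaluated by hand. The following is a dependency-free sketch, assuming the bias column comes first so that $ \theta = [b, m]^T $; the function name `normal_equation_fit` is hypothetical. In practice a linear algebra library would normally handle the solve step.

```python
def normal_equation_fit(xs, ys):
    """Solve (X^T X) theta = X^T y for theta = [b, m], with rows of X = [1, x]."""
    X = [[1.0, float(x)] for x in xs]  # design matrix, bias column first
    # X^T X is 2x2 and X^T y has 2 entries when there is a single feature.
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
    xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(2)]
    det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
    if det == 0:
        raise ValueError("X^T X is singular; all x values are identical")
    # Apply the explicit 2x2 inverse of X^T X to X^T y.
    b = (xtx[1][1] * xty[0] - xtx[0][1] * xty[1]) / det
    m = (xtx[0][0] * xty[1] - xtx[1][0] * xty[0]) / det
    return b, m


# Hypothetical toy data lying exactly on y = 2x + 1.
b, m = normal_equation_fit([0, 1, 2, 3], [1, 3, 5, 7])
print("b =", b, "m =", m)
```

Writing out the entries of $ X^TX $ and $ X^Ty $ shows that they are exactly the sums $ n $, $ \sum x_i $, $ \sum x_i^2 $, $ \sum y_i $, and $ \sum x_i y_i $ from the scalar derivation, so both routes give the same line.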



**📚 Prerequisites**

To follow this derivation, you should be comfortable with:

* Basic algebra
* Summation rules
* Differentiation (partial derivatives)
* Concept of minimization
* Basic statistics: mean ($\bar{x}, \bar{y}$)



**✅ Summary Box (Quick Reference)**

| Element | Formula |
| --- | --- |
| Error | $ e_i = y_i - (mx_i + b) $ |
| Loss Function | $ J(m, b) = \sum (y_i - mx_i - b)^2 $ |
| Slope (m) | $ m = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} $ |
| Intercept (b) | $ b = \bar{y} - m\bar{x} $ |
| Final Line | $ \hat{y} = mx + b $ |

