# **Linear Regression**



## 🧠 What is Linear Regression?

Linear regression is a method for finding the **best-fitting straight line** (a plane or hyperplane when there are several features) that predicts an output **$y$** from input features **$x_1, x_2, ..., x_n$**.

---

## 📏 Mathematical Equation

### 🔹 **For one feature (Simple Linear Regression)**

$$
\hat{y} = w_1 x + b
$$

* $\hat{y}$ → predicted output
* $x$ → input feature
* $w_1$ → weight (slope of the line)
* $b$ → bias (intercept — where the line cuts the y-axis)

---

### 🔹 **For multiple features (Multiple Linear Regression)**

$$
\hat{y} = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b
$$

Or using vector notation:

$$
\hat{y} = \mathbf{w}^\top \mathbf{x} + b
$$

Where:

* $\mathbf{w} = [w_1, w_2, ..., w_n]$
* $\mathbf{x} = [x_1, x_2, ..., x_n]$
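
To make the vector form concrete, here is a minimal sketch of the prediction $\hat{y} = \mathbf{w}^\top \mathbf{x} + b$ in plain Python. The function name `predict` and the numbers are made up for illustration; they are not from any dataset in these notes.

```python
def predict(w, x, b):
    """Return the linear prediction w^T x + b for a single sample."""
    # Multiply each weight by its matching feature and add them up.
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b


# Illustrative (made-up) weights, features, and bias.
w = [0.5, -1.0, 2.0]
x = [2.0, 1.0, 3.0]
b = 1.0

y_hat = predict(w, x, b)
print(y_hat)  # 0.5*2 - 1*1 + 2*3 + 1 = 7.0
```

With a single feature, `w` and `x` each hold one number and this reduces to $\hat{y} = w_1 x + b$.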

---

## 🔍 Goal of Linear Regression

Find the **best weights $w_1, w_2, ..., w_n$** and **bias $b$** such that the **difference between actual $y$** and **predicted $\hat{y}$** is minimized.

This difference is called the **error**.

---

## 🎯 Loss Function (What We Minimize)

We use **Mean Squared Error (MSE)**:

$$
\text{MSE} = \frac{1}{m} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2
$$

* $m$ = number of data points
* $y_i$ = actual value
* $\hat{y}_i$ = predicted value using weights and bias
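
As a quick sketch, the MSE formula above translates almost word for word into Python. The helper name `mean_squared_error` and the two sample points are assumptions chosen for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    m = len(y_true)  # number of data points
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / m


# Two made-up points: errors are 1 and -2, so MSE = (1 + 4) / 2.
print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))  # 2.5
```

Squaring keeps positive and negative errors from cancelling out, and it punishes large errors more than small ones.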

---

## ✏️ Summary in One Line

> Linear Regression finds the best-fitting straight line by minimizing the **average squared difference** between actual and predicted values.

---

## 🎯 Maths Behind

Problem: Predict Marks Based on Study Hours

You have the following data:

| Hours Studied (x) | Marks Scored (y) |
| ----------------- | ---------------- |
| 1                 | 2                |
| 2                 | 4                |
| 3                 | 5                |
| 4                 | 4                |
| 5                 | 5                |

We want to find a straight-line equation:

$$
\hat{y} = w x + b
$$

That predicts marks $\hat{y}$ based on hours studied $x$.

---

## 🧮 Step 1: Calculate Averages

$$
\bar{x} = \frac{1 + 2 + 3 + 4 + 5}{5} = 3
$$

$$
\bar{y} = \frac{2 + 4 + 5 + 4 + 5}{5} = 4
$$

---

## 🧮 Step 2: Calculate Slope $w$

Use the least-squares formula for the slope $w$ of the simple linear regression line:

$$
w = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}
$$

Let’s compute step-by-step:

| x | y | $x - \bar{x}$ | $y - \bar{y}$ | $(x - \bar{x})(y - \bar{y})$ | $(x - \bar{x})^2$ |
| - | - | ------------- | ------------- | ---------------------------- | ----------------- |
| 1 | 2 | -2            | -2            | 4                            | 4                 |
| 2 | 4 | -1            | 0             | 0                            | 1                 |
| 3 | 5 | 0             | 1             | 0                            | 0                 |
| 4 | 4 | 1             | 0             | 0                            | 1                 |
| 5 | 5 | 2             | 1             | 2                            | 4                 |
|   |   |               |               | **Total: 6**                 | **Total: 10**     |

$$
w = \frac{6}{10} = 0.6
$$

---

## 🧮 Step 3: Calculate Intercept $b$

$$
b = \bar{y} - w \bar{x} = 4 - 0.6 \times 3 = 2.2
$$

---

## ✅ Final Equation

$$
\hat{y} = 0.6x + 2.2
$$
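
Steps 1 to 3 can be checked with a few lines of Python. This is a minimal sketch using only the study-hours table above; the variable names are chosen for illustration.

```python
# Data from the study-hours table.
x = [1, 2, 3, 4, 5]  # hours studied
y = [2, 4, 5, 4, 5]  # marks scored

# Step 1: averages.
x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Step 2: slope = sum of (x - x_mean)(y - y_mean) / sum of (x - x_mean)^2.
numerator = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
denominator = sum((xi - x_mean) ** 2 for xi in x)
w = numerator / denominator

# Step 3: intercept.
b = y_mean - w * x_mean

print(f"x_mean = {x_mean}, y_mean = {y_mean}")  # x_mean = 3.0, y_mean = 4.0
print(f"w = {w:.1f}, b = {b:.1f}")              # w = 0.6, b = 2.2
```

The values are printed with one decimal place because floating-point arithmetic can leave tiny rounding noise in the last digits.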

---

## 🧪 Step 4: Predict

If a student studies for 4 hours:

$$
\hat{y} = 0.6 \times 4 + 2.2 = 4.6 \text{ marks}
$$
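
The same prediction as a tiny Python sketch. The function name `predict_marks` is hypothetical; the default slope and intercept are the values derived above.

```python
def predict_marks(hours, w=0.6, b=2.2):
    """Predict marks from hours studied using the fitted line y_hat = w*x + b."""
    return w * hours + b


print(f"{predict_marks(4):.1f}")  # 4.6
```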

---
Next, let's evaluate the fitted **linear regression model** step by step.

We already derived the prediction line:

$$
\hat{y} = 0.6x + 2.2
$$

Let’s now evaluate this using:

1. **Predicted values $\hat{y}$ and residuals (errors)**
2. **MSE (Mean Squared Error)**
3. **R² (R-squared score)**

---

## 🧮 Step 1: Predictions and Residuals

| x (Hours) | y (Actual Marks) | $\hat{y}$ = 0.6x + 2.2 | Residual = $y - \hat{y}$ |
| --------- | ---------------- | ---------------------- | ------------------------ |
| 1         | 2                | 0.6×1 + 2.2 = 2.8      | -0.8                     |
| 2         | 4                | 0.6×2 + 2.2 = 3.4      | +0.6                     |
| 3         | 5                | 0.6×3 + 2.2 = 4.0      | +1.0                     |
| 4         | 4                | 0.6×4 + 2.2 = 4.6      | -0.6                     |
| 5         | 5                | 0.6×5 + 2.2 = 5.2      | -0.2                     |
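
The table above can be reproduced with a short Python sketch. It assumes the fitted values $w = 0.6$ and $b = 2.2$ from the previous section.

```python
x = [1, 2, 3, 4, 5]  # hours studied
y = [2, 4, 5, 4, 5]  # actual marks
w, b = 0.6, 2.2      # fitted slope and intercept

# Prediction for every data point, then residual = actual - predicted.
y_hat = [w * xi + b for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

for xi, yi, yh, r in zip(x, y, y_hat, residuals):
    print(f"x={xi}  y={yi}  y_hat={yh:.1f}  residual={r:+.1f}")
```

For a least-squares line with an intercept, the residuals add up to zero (apart from floating-point rounding), which is a handy sanity check.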

---

## 📏 Step 2: MSE (Mean Squared Error)

$$
\text{MSE} = \frac{1}{m} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2
$$

$$
= \frac{(-0.8)^2 + (0.6)^2 + (1)^2 + (-0.6)^2 + (-0.2)^2}{5}
= \frac{0.64 + 0.36 + 1 + 0.36 + 0.04}{5} = \frac{2.4}{5} = 0.48
$$

✅ **MSE = 0.48**
(Lower is better. MSE is in squared units of $y$, so judge it relative to the scale of the marks.)
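
Here is a minimal Python check of this calculation, again assuming the fitted line $\hat{y} = 0.6x + 2.2$.

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [0.6 * xi + 2.2 for xi in x]

# Square each residual, add them up, and divide by the number of points.
squared_errors = [(yi - yh) ** 2 for yi, yh in zip(y, y_hat)]
mse = sum(squared_errors) / len(y)

print(f"MSE = {mse:.2f}")  # MSE = 0.48
```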

---

## 📈 Step 3: R² Score (Goodness of Fit)

Formula (SS = Sum of Squares):

$$
R^2 = 1 - \frac{SS_\text{res}}{SS_\text{tot}}
$$

Where:

* $SS_\text{res} = \sum (y - \hat{y})^2 = 2.4$
* $SS_\text{tot} = \sum (y - \bar{y})^2 = (2-4)^2 + (4-4)^2 + (5-4)^2 + (4-4)^2 + (5-4)^2 = 4 + 0 + 1 + 0 + 1 = 6$

$$
R^2 = 1 - \frac{2.4}{6} = 1 - 0.4 = 0.6
$$

✅ **R² = 0.6**
This means **60% of the variation in marks is explained** by hours studied.
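
The R² calculation can be sketched in Python in the same way, assuming the fitted line $\hat{y} = 0.6x + 2.2$.

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [0.6 * xi + 2.2 for xi in x]
y_mean = sum(y) / len(y)

# SS_res: squared error left over after fitting the line.
ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
# SS_tot: squared error of always predicting the mean.
ss_tot = sum((yi - y_mean) ** 2 for yi in y)

r2 = 1 - ss_res / ss_tot
print(f"R^2 = {r2:.2f}")  # R^2 = 0.60
```

If the line were no better than always predicting $\bar{y}$, `ss_res` would equal `ss_tot` and R² would be 0.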

---

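These bands are only a rough rule of thumb; what counts as a good R² depends on the problem:

| R² Score     | How Good is the Model                    |
| ------------ | ---------------------------------------- |
| 1.0          | Perfect! 🚀                              |
| 0.7 to < 1.0 | Very good 👍                             |
| 0.4 to < 0.7 | Decent 😐                                |
| 0 to < 0.4   | Weak 👎                                  |
| < 0          | Worse than always predicting the mean ❌ |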

---




## ✅ GENERALIZED PROCESS (for both Linear & Lasso Regression):

### 🔁 Step-by-Step:

1. **Initialize** weights $w_1, w_2, ..., w_n$ and intercept $b$

2. **Make Predictions**

   $$
   \hat{y} = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b
   $$

3. **Calculate Error**
   Usually **Mean Squared Error (MSE)**:

   $$
   \text{MSE} = \frac{1}{m} \sum (y - \hat{y})^2
   $$

4. **(If Lasso) Add Penalty**

   $$
   \text{Lasso Loss} = \text{MSE} + \alpha \sum |w_i|
   $$

5. **Adjust Weights and Bias**
   Change weights **to reduce the total loss** (error + penalty in case of Lasso)

6. **Repeat Steps 2–5**
   Until the **error is minimized** (or until improvement is very small)
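
The loop above can be sketched in Python. The notes do not say how the weights are adjusted, so this sketch **assumes gradient descent** (with a subgradient for the $|w|$ penalty); the function name `fit`, the learning rate, the step count, and `alpha = 0.4` are illustrative choices, not values from the notes. It uses a single feature to stay short.

```python
def fit(x, y, alpha=0.0, lr=0.05, steps=5000):
    """Gradient descent on MSE + alpha * |w| for one feature.

    alpha = 0 is plain linear regression; alpha > 0 adds the lasso penalty.
    """
    m = len(x)
    w, b = 0.0, 0.0                                   # 1. initialize
    for _ in range(steps):                            # 6. repeat
        y_hat = [w * xi + b for xi in x]              # 2. make predictions
        # 3. error, written as (y_hat - y) so the gradient signs are simple
        err = [yh - yi for yh, yi in zip(y_hat, y)]
        grad_w = (2 / m) * sum(e * xi for e, xi in zip(err, x))
        grad_b = (2 / m) * sum(err)
        # 4. lasso penalty: subgradient of alpha * |w| is alpha * sign(w)
        sign_w = (w > 0) - (w < 0)
        grad_w += alpha * sign_w
        w -= lr * grad_w                              # 5. adjust weight
        b -= lr * grad_b                              #    and bias
    return w, b


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

w_lin, b_lin = fit(x, y)                # plain linear regression
w_las, b_las = fit(x, y, alpha=0.4)     # lasso with an assumed alpha

print(f"linear: w = {w_lin:.3f}, b = {b_lin:.3f}")  # linear: w = 0.600, b = 2.200
print(f"lasso:  w = {w_las:.3f}, b = {b_las:.3f}")  # lasso:  w = 0.500, b = 2.500
```

With `alpha = 0` the loop lands on the same line as the closed-form formulas ($w = 0.6$, $b = 2.2$). With the penalty switched on, the slope is pulled toward zero and the intercept shifts to compensate.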

---

## 📊 Key Difference in Step 4:

| Type                  | What’s minimized?                         |
| --------------------- | ----------------------------------------- |
| **Linear Regression** | Just prediction error (MSE)               |
| **Lasso Regression**  | Prediction error **+** penalty on weights |
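
To see the difference in numbers, here is a small sketch that computes both losses for the study-hours line $\hat{y} = 0.6x + 2.2$. The helper names and `alpha = 0.5` are assumptions for illustration.

```python
def mse_loss(y_true, y_pred):
    """Plain linear regression loss: mean squared error."""
    return sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / len(y_true)


def lasso_loss(y_true, y_pred, weights, alpha):
    """Lasso loss: MSE plus alpha times the sum of absolute weights."""
    return mse_loss(y_true, y_pred) + alpha * sum(abs(w) for w in weights)


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
weights, b = [0.6], 2.2
y_hat = [weights[0] * xi + b for xi in x]

print(f"linear loss = {mse_loss(y, y_hat):.2f}")                        # 0.48
print(f"lasso loss  = {lasso_loss(y, y_hat, weights, alpha=0.5):.2f}")  # 0.48 + 0.5 * 0.6 = 0.78
```

The bias $b$ is left out of the penalty here, matching the formula $\text{MSE} + \alpha \sum |w_i|$.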

---

## 📌 So in Simple Words:

> In both linear and lasso regression, the model **starts with guessed weights**, **checks how large the error is**, and **keeps adjusting the weights** to make the **total loss as small as possible**.
> Lasso adds **extra pressure to keep weights small or exactly zero**.

---

