# Linear Regression using OLS (Ordinary Least Squares)

## Summary

* **Ordinary Least Squares (OLS)** estimates linear regression parameters by minimizing the **sum of squared errors** between actual and predicted values.
* Unlike **Gradient Descent**, OLS provides a **closed-form solution**, meaning coefficients are computed directly without iteration.
* The method defines a **squared error cost function** and computes **partial derivatives** with respect to the intercept ($\beta_0$) and slope ($\beta_1$).
* Because the SSE is a convex quadratic function of the coefficients, the point where both derivatives equal **zero** is the **global minimum**, and solving for it gives the optimal coefficient values.

## Exam Notes

### OLS vs Gradient Descent

**Question**: What is the primary difference between Gradient Descent and OLS?

**Answer**:  
**Gradient Descent** is an **iterative optimization algorithm** that updates parameters step-by-step.  
**OLS** is an **analytical method** that directly computes the optimal coefficients ($\beta_0$, $\beta_1$) by solving equations obtained from setting derivatives to zero.
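The contrast can be sketched in a few lines of plain Python. This is only an illustration: the data set, the learning rate, and the step count are made-up assumptions, and the Gradient Descent loop minimizes the mean squared error (SSE divided by $n$), which has the same minimizer as the SSE.

```python
# Toy data, made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)

# OLS: one direct computation, no iteration.
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b1_ols = s_xy / s_xx
b0_ols = y_bar - b1_ols * x_bar

# Gradient Descent: start from a guess and take many small steps.
b0_gd, b1_gd = 0.0, 0.0
learning_rate = 0.05  # assumed value; too large a rate would diverge on this data
n_steps = 5000        # assumed value

for _ in range(n_steps):
    residuals = [y - (b0_gd + b1_gd * x) for x, y in zip(xs, ys)]
    # Gradient of the mean squared error with respect to each parameter.
    grad_b0 = -2.0 * sum(residuals) / n
    grad_b1 = -2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
    b0_gd -= learning_rate * grad_b0
    b1_gd -= learning_rate * grad_b1

print(f"OLS: b0={b0_ols:.4f}, b1={b1_ols:.4f}")
print(f"GD : b0={b0_gd:.4f}, b1={b1_gd:.4f}")
```

Both approaches should agree to several decimal places here; the difference is that OLS gets there in one computation while Gradient Descent needs a tuned learning rate and many updates.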

### Deriving the Coefficients

**Question**: How are $\beta_0$ and $\beta_1$ derived in OLS?

**Answer**:  
They are derived by minimizing the **Sum of Squared Errors (SSE)**. This is done by:
1. Taking partial derivatives of SSE with respect to $\beta_0$ and $\beta_1$
2. Setting the derivatives equal to zero
3. Solving the resulting system of equations

---

## Derivation of Coefficients

### 1. The Cost Function

OLS minimizes the **Sum of Squared Errors (SSE)**:

$$
S(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2
$$
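As a small illustration, the cost function can be evaluated for any candidate line. The function name `sse` and the data below are assumptions made for this sketch, not part of the original notes.

```python
def sse(xs, ys, b0, b1):
    """Sum of squared errors S(b0, b1) for the line y = b0 + b1 * x."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))


# Toy data, made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

print(sse(xs, ys, 0.0, 1.0))  # candidate line y = x
print(sse(xs, ys, 2.2, 0.6))  # candidate line y = 2.2 + 0.6x
```

The second candidate gives a smaller SSE than the first, so it fits this data better in the least-squares sense.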

---

### 2. Derivation of $\beta_0$ (Intercept)

**Step 1: Partial Derivative**

$$
\frac{\partial S}{\partial \beta_0}
= -2 \sum (y_i - \beta_0 - \beta_1 x_i)
$$

Set the derivative to zero and divide both sides by $-2$:

$$
\sum (y_i - \beta_0 - \beta_1 x_i) = 0
$$

**Step 2: Simplify**

$$
\sum y_i - n\beta_0 - \beta_1 \sum x_i = 0
$$

**Step 3: Solve for $\beta_0$**

Divide by $n$, using $\bar{x} = \frac{1}{n}\sum x_i$ and $\bar{y} = \frac{1}{n}\sum y_i$:

$$
\beta_0 = \bar{y} - \beta_1 \bar{x}
$$
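One consequence of this formula is that, for any slope, choosing the intercept this way makes the residuals sum to zero and forces the line through the point $(\bar{x}, \bar{y})$. A quick check on made-up data (the helper names `intercept_for` and `residual_sum` are hypothetical):

```python
# Toy data, made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n


def intercept_for(b1):
    """Intercept from the first-order condition: b0 = y_bar - b1 * x_bar."""
    return y_bar - b1 * x_bar


def residual_sum(b1):
    """Sum of residuals when the intercept is chosen by intercept_for."""
    b0 = intercept_for(b1)
    return sum(y - b0 - b1 * x for x, y in zip(xs, ys))


for b1 in (-1.0, 0.0, 0.6, 2.0):
    b0 = intercept_for(b1)
    value_at_mean = b0 + b1 * x_bar  # should equal y_bar up to rounding
    print(f"b1={b1:+.1f}  b0={b0:+.2f}  "
          f"residual sum={residual_sum(b1):+.1e}  line at x_bar={value_at_mean:.2f}")
```

The residual sums should be zero up to floating-point rounding for every slope tried, which is exactly what the condition $\sum (y_i - \beta_0 - \beta_1 x_i) = 0$ requires.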

---

### 3. Derivation of $\beta_1$ (Slope)

**Step 1: Partial Derivative**

$$
\frac{\partial S}{\partial \beta_1}
= -2 \sum (y_i - \beta_0 - \beta_1 x_i)x_i
$$

Set the derivative to zero and divide both sides by $-2$:

$$
\sum x_i(y_i - \beta_0 - \beta_1 x_i) = 0
$$

**Step 2: Expand**

$$
\sum x_i y_i - \beta_0 \sum x_i - \beta_1 \sum x_i^2 = 0
$$
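Together with $\sum y_i - n\beta_0 - \beta_1 \sum x_i = 0$ from the intercept derivation, this equation forms a $2 \times 2$ linear system in $\beta_0$ and $\beta_1$. As a sketch, the system can also be solved directly with Cramer's rule instead of by substitution; the function name `solve_normal_equations` and the data are assumptions for illustration.

```python
def solve_normal_equations(xs, ys):
    """Solve the two first-order conditions as a 2x2 linear system.

        n * b0      + sum_x  * b1 = sum_y
        sum_x * b0  + sum_xx * b1 = sum_xy
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    det = n * sum_xx - sum_x ** 2
    if det == 0:
        # Happens when all x values are identical: the slope is undefined.
        raise ValueError("x values must not all be equal")

    b0 = (sum_y * sum_xx - sum_x * sum_xy) / det
    b1 = (n * sum_xy - sum_x * sum_y) / det
    return b0, b1


# Toy data, made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = solve_normal_equations(xs, ys)
print(f"b0={b0:.4f}, b1={b1:.4f}")
```

The substitution route in the following steps reaches the same solution; it just expresses the result in terms of deviations from the means.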

**Step 3: Substitute $\beta_0$**

Substitute $\beta_0 = \bar{y} - \beta_1 \bar{x}$ into the expanded equation from Step 2, writing everything under a single sum:

$$
\sum_{i=1}^{n} \left(
x_i y_i - (\bar{y} - \beta_1 \bar{x})x_i - \beta_1 x_i^2
\right) = 0
$$

---

**Step 4: Expand and Group Terms**

First, expand the expression:

$$
\sum_{i=1}^{n} \left(
x_i y_i - x_i \bar{y} + \beta_1 \bar{x} x_i - \beta_1 x_i^2
\right) = 0
$$

Now, group the terms and take $x_i$ as a common factor:


$$
\sum_{i=1}^{n} x_i(y_i - \bar{y})
+
\sum_{i=1}^{n} x_i(\beta_1 \bar{x} - \beta_1 x_i)
= 0
$$

---

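**Step 5: Factor Out $\beta_1$**

Factor $-\beta_1$ out of the second sum, keeping the $x_i$ weight inside both sums:

$$
\sum_{i=1}^{n} x_i (y_i - \bar{y})
-
\beta_1 \sum_{i=1}^{n} x_i (x_i - \bar{x})
= 0
$$

Rearranging to isolate $\beta_1$:

$$
\beta_1 \sum_{i=1}^{n} x_i (x_i - \bar{x})
=
\sum_{i=1}^{n} x_i (y_i - \bar{y})
$$

**Step 6: Rewrite in Centered Form**

Deviations from the mean sum to zero, so $\sum (y_i - \bar{y}) = 0$ and $\sum (x_i - \bar{x}) = 0$. Subtracting $\bar{x}$ times either of these zero sums changes nothing:

$$
\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
\sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2
$$

Dividing gives the slope, which is defined as long as the $x_i$ are not all equal:

$$
\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
$$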
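The $x_i$-weighted sums from Step 4 equal their centered counterparts, $\sum x_i (y_i - \bar{y}) = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $\sum x_i (x_i - \bar{x}) = \sum (x_i - \bar{x})^2$, because deviations from the mean sum to zero. A small numerical check on made-up data (all variable names are illustrative):

```python
# Toy data, made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Deviations from the mean always sum to zero (up to rounding).
dev_x_sum = sum(x - x_bar for x in xs)
dev_y_sum = sum(y - y_bar for y in ys)

# Weighted sums versus centered sums.
weighted_xy = sum(x * (y - y_bar) for x, y in zip(xs, ys))
centered_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
weighted_xx = sum(x * (x - x_bar) for x in xs)
centered_xx = sum((x - x_bar) ** 2 for x in xs)

print(dev_x_sum, dev_y_sum)
print(weighted_xy, centered_xy)
print(weighted_xx, centered_xx)
```

This also shows why the $x_i$ weight cannot be dropped from the sums: without it, both sides reduce to sums of deviations, which are zero for any data set and carry no information about the slope.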

---

## Final Result

$$
\boxed{
\begin{aligned}
\beta_1 &= \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \\
\beta_0 &= \bar{y} - \beta_1 \bar{x}
\end{aligned}
}
$$
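A minimal sketch of the closed-form estimates in plain Python, with the slope computed as $\sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and the intercept as $\bar{y} - \beta_1 \bar{x}$. The function name `ols_fit`, the input checks, and the data are assumptions made for illustration.

```python
def ols_fit(xs, ys):
    """Return (b0, b1) for simple linear regression fitted by OLS."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        # All x values are identical, so the slope is undefined.
        raise ValueError("x values must not all be equal")

    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Toy data, made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = ols_fit(xs, ys)
prediction = b0 + b1 * 6.0  # predicted y at x = 6
print(f"b0={b0:.4f}, b1={b1:.4f}, prediction at x=6: {prediction:.4f}")
```

For this data the centered sums are $6$ and $10$, so the slope is $0.6$ and the intercept is $4 - 0.6 \cdot 3 = 2.2$.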
