# Multiple Linear Regression

## Summary

* **Multiple Linear Regression** expands on simple linear regression by incorporating multiple **independent features** to predict an output, rather than just one.
* The **House Pricing Dataset** serves as a primary example, using features like the number of rooms, size of the house, and location to determine price.
* The regression equation includes one **intercept** ($\theta_0$) and multiple **coefficients** ($\theta_1, \theta_2, \theta_3, \dots$) corresponding to each input feature.
* **Gradient Descent** in this context updates the intercept and all coefficients together at each step to minimize the **cost function** and reach the **global minimum**.

---

## Difference Between Simple and Multiple Linear Regression

### Simple Linear Regression Recap
In **Simple Linear Regression**, the model relies on a single **input feature** (or **independent feature**) to predict an output. A classic example is using **weight** to predict **height**.

The equation for a single feature is defined as:

$$
h_\theta(x) = \theta_0 + \theta_1 x
$$

* **$\theta_0$**: The **Intercept**
* **$\theta_1$**: The **Slope**
* **$x$**: The value of the input feature for a given example from the training set

In this scenario, the algorithm only needs to update $\theta_0$ and $\theta_1$ to find the **best fit line**.
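
As a minimal sketch of the equation above, the prediction is one multiplication and one addition. The function name `predict_simple` and the parameter values are hypothetical, chosen only for illustration and not fitted to any data.

```python
def predict_simple(theta0: float, theta1: float, x: float) -> float:
    """Evaluate h_theta(x) = theta0 + theta1 * x for a single input value."""
    return theta0 + theta1 * x


# Hypothetical parameters (not learned from data): intercept 100, slope 0.5.
theta0, theta1 = 100.0, 0.5

# Predict a height from a weight of 70 using the assumed parameters.
predicted_height = predict_simple(theta0, theta1, 70.0)
print(predicted_height)
```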

### Introduction to Multiple Linear Regression
**Multiple Linear Regression** applies when a dataset contains more than one **input feature**.

#### Example: House Pricing Dataset
To illustrate this, consider a **House Pricing Dataset** with the following attributes:
* **Number of Rooms** ($x_1$)
* **Size of the House** ($x_2$)
* **Location** ($x_3$), which is categorical and must be encoded as a number before it can enter the equation
* **Price of the House** (Output Feature)

In this case, the first three attributes act as the **Independent Features**, while the price is the **Output Feature**.
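
One possible way to lay out such a dataset in code is a list of feature rows plus a separate list of prices. The values below are made up for illustration, and the integer location code is an assumption, since the notes do not say how location is represented.

```python
# Made-up rows for illustration only; these are not real house prices.
# Each row holds the independent features in the order [x1, x2, x3]:
#   x1 = number of rooms, x2 = size of the house, x3 = location code.
# Assumption: location is encoded as an integer code (0, 1, 2, ...).
X = [
    [3, 1200.0, 1],
    [4, 1800.0, 2],
    [2, 800.0, 0],
]

# Output feature: one price per row of X, in the same order.
y = [250000.0, 400000.0, 150000.0]

num_samples = len(X)      # number of training examples
num_features = len(X[0])  # number of independent features per example
print(num_samples, num_features)
```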

## The Multiple Linear Regression Equation

The hypothesis gains one term for each additional feature. For the three features of the house pricing example, it becomes:

$$
h_\theta(x) = \theta_0 + \theta_1x_1 + \theta_2x_2 + \theta_3x_3
$$
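
A small sketch of this hypothesis in Python, assuming the parameters are stored in a list with the intercept first. The name `predict` and all numeric values are hypothetical.

```python
def predict(theta: list[float], x: list[float]) -> float:
    """Evaluate h_theta(x) = theta[0] + theta[1]*x[0] + theta[2]*x[1] + ...

    theta holds the intercept followed by one coefficient per feature,
    so it must be exactly one element longer than x.
    """
    if len(theta) != len(x) + 1:
        raise ValueError("theta needs one intercept plus one coefficient per feature")
    return theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))


# Hypothetical parameters: [theta0, theta1, theta2, theta3].
theta = [50.0, 10.0, 0.5, 5.0]

# Hypothetical house: 3 rooms, size 1200, location code 2.
house = [3, 1200.0, 2]
print(predict(theta, house))
```

Keeping the intercept at index 0 means the same function works for any number of features without changes.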

### Breakdown of Parameters
* **$\theta_0$ (Intercept)**: The model has exactly one intercept, no matter how many features it uses.
* **$\theta_1, \theta_2, \theta_3$ (Coefficients/Slopes)**: Each input feature has a corresponding coefficient.
    * $\theta_1$ is the coefficient for the **Number of Rooms**
    * $\theta_2$ is the coefficient for the **Size of the House**
    * $\theta_3$ is the coefficient for the **Location**

Every additional **input feature** adds one **coefficient**, so a model with $n$ features has $n$ coefficients plus the single intercept.
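
Written for an arbitrary number of features, the three-feature equation above follows this general pattern, where $n$ (a symbol introduced here, not in the original notes) stands for the number of input features:

$$
h_\theta(x) = \theta_0 + \theta_1x_1 + \theta_2x_2 + \dots + \theta_nx_n = \theta_0 + \sum_{j=1}^{n} \theta_jx_j
$$

Setting $n = 1$ recovers simple linear regression, and $n = 3$ gives the house pricing equation.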

## Gradient Descent in Multiple Dimensions

When applying **Gradient Descent** to multiple linear regression, the cost function depends on more parameters than can be drawn. With only $\theta_0$ and $\theta_1$, the cost can be plotted as a 3D surface, but every additional coefficient adds another dimension. Although such a surface cannot be visualized, the underlying principle remains the same.

* The **Cost Function** ($J(\theta)$) forms a bowl shape (an **inverted mountain**). For the squared-error cost commonly used with linear regression, this bowl is convex, so it has no local minima other than the global one.
* The algorithm initializes a starting value for every parameter ($\theta_0, \theta_1, \theta_2, \dots$).
* The objective is to iteratively update all parameters until they converge at the **Global Minimum**, thereby minimizing the error.

At every iteration, all $\theta$ values ($\theta_0, \theta_1, \theta_2, \theta_3$) are adjusted together, each by a small step in the direction that reduces the cost function.
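
The notes do not define the cost function or the update rule, so the sketch below rests on two assumptions: a mean squared error cost and a fixed learning rate $\alpha$. Under those assumptions, with $m$ training examples and $x_0^{(i)} = 1$ for the intercept term:

$$
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
$$

$$
\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
$$

The Python below is one way this could be written in plain lists. The function names, the toy data, and the values of `alpha` and `iterations` are all hypothetical choices for illustration.

```python
def predict(theta, x):
    """h_theta(x): intercept theta[0] plus one coefficient per feature."""
    return theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))


def cost(theta, X, y):
    """Assumed cost: mean squared error with the conventional 1/(2m) factor."""
    m = len(X)
    return sum((predict(theta, xi) - yi) ** 2 for xi, yi in zip(X, y)) / (2 * m)


def gradient_descent(X, y, alpha=0.1, iterations=2000):
    """Batch gradient descent that updates every parameter simultaneously."""
    m, n = len(X), len(X[0])
    theta = [0.0] * (n + 1)  # starting point: all parameters at zero
    for _ in range(iterations):
        # Prediction errors computed from the current theta only.
        errors = [predict(theta, xi) - yi for xi, yi in zip(X, y)]
        # Gradient for the intercept, then one entry per coefficient.
        grad = [sum(errors) / m]
        for j in range(n):
            grad.append(sum(e * xi[j] for e, xi in zip(errors, X)) / m)
        # Simultaneous update: the full gradient is known before any theta changes.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Toy data generated exactly from y = 1 + 2*x1 + 3*x2 (no noise).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [1.0, 3.0, 4.0, 6.0, 8.0]

theta = gradient_descent(X, y)
print(theta)               # should approach [1, 2, 3]
print(cost(theta, X, y))   # should be close to zero
```

The learning rate and iteration count here were picked to suit this small toy dataset. Features on very different scales, such as room counts next to house sizes, would likely need scaling or a different learning rate for the same loop to converge.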
