<div style="background-color:#daee8420; line-height:1.5; text-align:center;border:2px solid black;">
    <div style="color:#7B242F; font-size:24pt; font-weight:700;">The Ultimate Machine Learning Mastery Course with Python</div>
</div>

---
### **Course**: The Ultimate Machine Learning Course with Python  
#### **Chapter**: Regression  
##### **Lesson**: Linear Regression Theory  
###### **Author:** Dr. Saad Laouadi   
###### **Copyright:** Dr. Saad Laouadi    

---

## License

**This material is intended for educational purposes only and may not be used directly in courses, video recordings, or similar without prior consent from the author. When using or referencing this material, proper credit must be attributed to the author.**

```text
#**************************************************************************
#* (C) Copyright 2024 by Dr. Saad Laouadi. All Rights Reserved.           *
#**************************************************************************                                                                    
#* DISCLAIMER: The author has used their best efforts in preparing        *
#* this content. These efforts include development, research,             *
#* and testing of the theories and programs to determine their            *
#* effectiveness. The author makes no warranty of any kind,               *
#* expressed or implied, with regard to these programs or                 *
#* to the documentation contained within. The author shall not            *
#* be liable in any event for incidental or consequential damages         *
#* in connection with, or arising out of, the furnishing,                 *
#* performance, or use of these programs.                                 *
#*                                                                        *
#* This content is intended for tutorials, online articles,               *
#* and other educational purposes.                                        *
#**************************************************************************
```

# Comprehensive Lesson on Multiple Linear Regression

### Overview

Multiple Linear Regression (MLR) is an extension of Simple Linear Regression where more than one independent variable is used to predict the dependent variable. The goal remains the same: to model the relationship between the independent variables and the dependent variable by fitting a linear equation.


### 1. **General Mathematical Formula**

In Multiple Linear Regression, the predicted value ($\hat{y}$) is a linear combination of multiple independent variables ($x_1, x_2, ..., x_p$):

$$
\hat{y} = \hat{\beta_0} + \hat{\beta_1} x_1 + \hat{\beta_2} x_2 + ... + \hat{\beta_p} x_p
$$

Where:
- $\hat{y}$ is the predicted value (dependent variable).
- $x_1, x_2, ..., x_p$ are the independent variables.
- $\hat{\beta_0}$ is the intercept (the value of $\hat{y}$ when all $x$ values are zero).
- $\hat{\beta_1}, \hat{\beta_2}, ..., \hat{\beta_p}$ are the estimated coefficients for the independent variables.
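
As a minimal sketch of the prediction formula above, the snippet below evaluates $\hat{y}$ for a single observation. The coefficient and feature values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical estimated coefficients, chosen only for illustration.
beta_0_hat = 2.0                        # intercept
beta_hat = np.array([0.5, -1.0, 3.0])   # slopes for x_1, x_2, x_3

# One observation with p = 3 independent variables (made-up values).
x = np.array([4.0, 2.0, 1.0])

# y_hat = beta_0 + beta_1 * x_1 + beta_2 * x_2 + beta_3 * x_3
y_hat = beta_0_hat + beta_hat @ x
print(y_hat)  # 2 + 0.5*4 - 1.0*2 + 3.0*1 = 5.0
```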


### 2. **Model Representation**

The underlying model for the observed values, which includes an error term, is:

$$
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p + \epsilon
$$

Where:
- $y$ is the actual observed value (dependent variable).
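- $\epsilon$ is the error term: the part of $y$ that the linear combination of the independent variables does not explain. Its observable counterpart is the residual $y - \hat{y}$, the difference between the actual and predicted values.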

The goal is to find the best-fitting coefficients $\hat{\beta_0}, \hat{\beta_1}, \hat{\beta_2}, ..., \hat{\beta_p}$ that minimize the sum of squared residuals (errors).

### Representing the Data in Matrix Notation

Multiple Linear Regression can be represented concisely in matrix form, which is particularly useful for computation and deriving the coefficients.

Let’s represent the model as:

$$
\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}
$$

Where:

- $\mathbf{y}$ is the $n \times 1$ vector of observed values (dependent variable).
  
  $$
  \mathbf{y} =
  \begin{bmatrix}
  y_1 \\
  y_2 \\
  \vdots \\
  y_n
  \end{bmatrix}
  $$

- $\mathbf{X}$ is the $n \times (p+1)$ matrix of independent variables, including a column of ones for the intercept.

  $$
  \mathbf{X} =
  \begin{bmatrix}
  1 & x_{11} & x_{12} & \dots & x_{1p} \\
  1 & x_{21} & x_{22} & \dots & x_{2p} \\
  \vdots & \vdots & \vdots & \dots & \vdots \\
  1 & x_{n1} & x_{n2} & \dots & x_{np}
  \end{bmatrix}
  $$

- $\boldsymbol{\beta}$ is the $(p+1) \times 1$ vector of coefficients (including the intercept).

  $$
  \boldsymbol{\beta} =
  \begin{bmatrix}
  \beta_0 \\
  \beta_1 \\
  \beta_2 \\
  \vdots \\
  \beta_p
  \end{bmatrix}
  $$

- $\boldsymbol{\epsilon}$ is the $n \times 1$ vector of error terms, one per observation.

  $$
  \boldsymbol{\epsilon} =
  \begin{bmatrix}
  \epsilon_1 \\
  \epsilon_2 \\
  \vdots \\
  \epsilon_n
  \end{bmatrix}
  $$
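
The matrix layout above can be sketched in NumPy as follows. The feature and target values are made up for illustration; the point is only the column of ones and the resulting shapes.

```python
import numpy as np

# Raw features: n = 4 observations, p = 2 independent variables (made-up values).
features = np.array([
    [1.0, 2.0],
    [3.0, 5.0],
    [4.0, 1.0],
    [6.0, 3.0],
])
y = np.array([3.0, 7.0, 5.0, 9.0])  # vector of n observed values

n, p = features.shape

# Prepend a column of ones so that beta_0 acts as the intercept.
X = np.column_stack([np.ones(n), features])

print(X.shape)  # (n, p + 1)
print(y.shape)  # (n,)
```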

### Coefficient Estimation in Matrix Form

The Ordinary Least Squares (OLS) method gives the estimated coefficients $\hat{\boldsymbol{\beta}}$ as:

$$
\hat{\boldsymbol{\beta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}
$$

Where:
- $\mathbf{X}^T$ is the transpose of the matrix $\mathbf{X}$.
- $(\mathbf{X}^T \mathbf{X})^{-1}$ is the inverse of the product of $\mathbf{X}^T$ and $\mathbf{X}$.
  
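This matrix formulation provides a compact way to express the regression coefficients. The inverse $(\mathbf{X}^T \mathbf{X})^{-1}$ exists only when the columns of $\mathbf{X}$ are linearly independent, which is one reason the no-multicollinearity assumption matters. In practice, numerical libraries usually solve the least-squares problem directly rather than forming the inverse explicitly, because that is more stable.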
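
A minimal sketch of the OLS formula, assuming synthetic data with coefficients chosen for illustration. It solves the system $\mathbf{X}^T \mathbf{X} \boldsymbol{\beta} = \mathbf{X}^T \mathbf{y}$ without forming the inverse, and compares the result with NumPy's least-squares solver.

```python
import numpy as np

# Synthetic data with known coefficients (assumed for illustration).
rng = np.random.default_rng(0)
n, p = 50, 3
features = rng.normal(size=(n, p))
true_beta = np.array([1.0, 2.0, -0.5, 0.3])  # [beta_0, beta_1, beta_2, beta_3]

X = np.column_stack([np.ones(n), features])
y = X @ true_beta + rng.normal(scale=0.1, size=n)

# Closed-form OLS: solve (X^T X) beta = X^T y instead of inverting X^T X.
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Dedicated least-squares solver; usually the more stable choice.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_normal)
print(np.allclose(beta_normal, beta_lstsq))
```

Both routes give the same coefficients on well-conditioned data like this; the solver-based route tends to behave better when the independent variables are strongly correlated.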


### 3. **Derivation of Parameters**

To estimate the coefficients $\hat{\beta_0}, \hat{\beta_1}, \hat{\beta_2}, ..., \hat{\beta_p}$, we again use the **Ordinary Least Squares (OLS)** method. The OLS approach aims to minimize the sum of squared errors (SSE):

$$
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \left(y_i - (\hat{\beta_0} + \hat{\beta_1} x_{i1} + \hat{\beta_2} x_{i2} + ... + \hat{\beta_p} x_{ip})\right)^2
$$

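The minimization is easiest to carry out in matrix form.

#### Matrix Form of Multiple Linear Regression

The multiple linear regression model can be written in matrix form as:

$$
\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}
$$

Where:
- $\mathbf{y}$ is an $n \times 1$ vector of the dependent variable values.
- $\mathbf{X}$ is an $n \times (p+1)$ matrix of the independent variables (including a column of 1s for the intercept).
- $\boldsymbol{\beta}$ is a $(p+1) \times 1$ vector of the coefficients ($\beta_0, \beta_1, ..., \beta_p$).
- $\boldsymbol{\epsilon}$ is an $n \times 1$ vector of the error terms.

Treating the SSE as a function of a candidate coefficient vector $\boldsymbol{\beta}$, it can be written as:

$$
SSE(\boldsymbol{\beta}) = (\mathbf{y} - \mathbf{X} \boldsymbol{\beta})^T (\mathbf{y} - \mathbf{X} \boldsymbol{\beta})
$$

Setting the gradient with respect to $\boldsymbol{\beta}$ to zero gives:

$$
\frac{\partial SSE}{\partial \boldsymbol{\beta}} = -2 \mathbf{X}^T (\mathbf{y} - \mathbf{X} \boldsymbol{\beta}) = \mathbf{0}
$$

Rearranging yields the normal equations, which the estimated coefficients must satisfy:

$$
\mathbf{X}^T \mathbf{X} \hat{\boldsymbol{\beta}} = \mathbf{X}^T \mathbf{y}
$$

When $\mathbf{X}^T \mathbf{X}$ is invertible, the estimated coefficients ($\hat{\boldsymbol{\beta}}$) are computed as:

$$
\hat{\boldsymbol{\beta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}
$$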
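
At the OLS solution the gradient of the SSE vanishes, which is equivalent to $\mathbf{X}^T (\mathbf{y} - \mathbf{X} \hat{\boldsymbol{\beta}}) = \mathbf{0}$. The sketch below checks this numerically on random data (assumed for illustration) and confirms that moving away from $\hat{\boldsymbol{\beta}}$ increases the SSE. The helper `sse` and the perturbation values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = rng.normal(size=n)


def sse(beta):
    """Sum of squared errors for a candidate coefficient vector."""
    residuals = y - X @ beta
    return residuals @ residuals


# OLS estimate from the system (X^T X) beta = X^T y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient of the SSE at beta_hat: -2 X^T (y - X beta_hat).
gradient = -2 * X.T @ (y - X @ beta_hat)
print(np.allclose(gradient, 0.0))  # gradient is numerically zero

# An arbitrary perturbation of the coefficients gives a larger SSE.
perturbed = beta_hat + np.array([0.1, -0.1, 0.05])
print(sse(perturbed) > sse(beta_hat))
```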


### 4. **Evaluating the Model**

As with simple linear regression, multiple linear regression models are evaluated using various statistical metrics to assess the fit and accuracy of the model.

#### a. **Coefficient of Determination ($R^2$)**:

The $R^2$ value measures the proportion of the variance in the dependent variable that is predictable from the independent variables.

$$
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
$$
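
The formula translates directly into code. In this sketch, `r_squared` is a hypothetical helper name and the actual and predicted values are made up for illustration.

```python
import numpy as np


def r_squared(y, y_hat):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)      # sum of (y_i - y_hat_i)^2
    ss_tot = np.sum((y - y.mean()) ** 2)   # sum of (y_i - y_bar)^2
    return 1.0 - ss_res / ss_tot


y = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.5, 6.5, 9.5]  # hypothetical predictions

# ss_res = 4 * 0.25 = 1.0 and ss_tot = 9 + 1 + 1 + 9 = 20, so R^2 = 0.95
print(r_squared(y, y_hat))
```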

#### b. **Adjusted $R^2$**:

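Adjusted $R^2$ is a modified version of $R^2$ that penalizes the number of independent variables in the model. Plain $R^2$ never decreases when a variable is added, so it cannot tell a useful variable from an irrelevant one. Adjusted $R^2$ can decrease when an added variable contributes little, which makes it better suited to comparing models with different numbers of predictors.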

$$
\text{Adjusted } R^2 = 1 - \left( \frac{(1 - R^2)(n - 1)}{n - p - 1} \right)
$$

Where:
- $n$ is the number of observations.
- $p$ is the number of independent variables.
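
A short sketch of the adjustment, assuming an $R^2$ value has already been computed. The function name `adjusted_r_squared` and the input values are hypothetical; the example shows how the same $R^2$ is penalized more heavily as $p$ grows.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p independent variables."""
    if n - p - 1 <= 0:
        raise ValueError("adjusted R^2 requires n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Same R^2 and sample size, different numbers of predictors.
few_predictors = adjusted_r_squared(0.90, n=50, p=3)
many_predictors = adjusted_r_squared(0.90, n=50, p=10)

print(few_predictors)
print(many_predictors)  # smaller than few_predictors
```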

#### c. **Mean Squared Error (MSE)**:
MSE measures the average of the squared differences between actual and predicted values.

$$
MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

#### d. **Root Mean Squared Error (RMSE)**:
RMSE is the square root of MSE. It is expressed in the same units as the dependent variable, which makes it easier to interpret as a typical size of the prediction error.

$$
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
$$
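
Both error metrics can be computed in a few lines. The actual and predicted values below are made up for illustration.

```python
import numpy as np

y = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.0, 5.0, 8.0, 11.0])  # hypothetical predictions

errors = y - y_hat            # [1, 0, -1, -2]
mse = np.mean(errors ** 2)    # (1 + 0 + 1 + 4) / 4 = 1.5
rmse = np.sqrt(mse)           # square root of MSE, same units as y

print(mse)
print(rmse)
```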


### 5. **Example of Multiple Linear Regression**

Let’s consider an example where we want to predict a house price based on two independent variables: **size (square footage)** and **number of bedrooms**.

| Size (sq ft) | Bedrooms | Price (\$1000s) |
|--------------|----------|-----------------|
| 1400         | 3        | 300             |
| 1600         | 3        | 330             |
| 1700         | 4        | 355             |
| 1875         | 3        | 375             |
| 1100         | 2        | 225             |

The multiple linear regression model is given by:

$$
\hat{y} = \hat{\beta_0} + \hat{\beta_1} \cdot \text{Size} + \hat{\beta_2} \cdot \text{Bedrooms}
$$

#### Step 1: Calculate the coefficients ($\hat{\beta_0}, \hat{\beta_1}, \hat{\beta_2}$)

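Applying OLS to the five observations in the table gives approximately $\hat{\beta_0} \approx 12.29$, $\hat{\beta_1} \approx 0.173$, and $\hat{\beta_2} \approx 13.17$, with price measured in thousands of dollars. In this fit, each additional square foot is associated with roughly 173 dollars more in predicted price, and each additional bedroom with roughly 13,170 dollars, holding the other variable fixed. With only five observations, these estimates are illustrative rather than reliable.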
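
The coefficients for the table above can be computed with a short NumPy sketch. The variable names are chosen for illustration, and `np.linalg.lstsq` is used here as one reasonable way to solve the least-squares problem.

```python
import numpy as np

# Data from the house-price table (price in $1000s).
size = np.array([1400, 1600, 1700, 1875, 1100], dtype=float)
bedrooms = np.array([3, 3, 4, 3, 2], dtype=float)
price = np.array([300, 330, 355, 375, 225], dtype=float)

# Design matrix: column of ones for the intercept, then the two features.
X = np.column_stack([np.ones(len(size)), size, bedrooms])

# Ordinary least squares fit.
beta_hat, *_ = np.linalg.lstsq(X, price, rcond=None)
beta_0_hat, beta_1_hat, beta_2_hat = beta_hat

print(f"intercept: {beta_0_hat:.4f}")
print(f"size:      {beta_1_hat:.4f}")
print(f"bedrooms:  {beta_2_hat:.4f}")
```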

#### Step 2: Prediction Formula

Once the coefficients are determined, we can predict house prices for new data points using the formula:

$$
\hat{y} = \hat{\beta_0} + \hat{\beta_1} \cdot \text{Size} + \hat{\beta_2} \cdot \text{Bedrooms}
$$
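
As a sketch of the prediction step, the snippet below refits the model on the table data and predicts the price of a hypothetical new house with 1500 sq ft and 3 bedrooms. The new data point is an assumption, not part of the original example.

```python
import numpy as np

# Data from the house-price table (price in $1000s).
size = np.array([1400, 1600, 1700, 1875, 1100], dtype=float)
bedrooms = np.array([3, 3, 4, 3, 2], dtype=float)
price = np.array([300, 330, 355, 375, 225], dtype=float)

X = np.column_stack([np.ones(len(size)), size, bedrooms])
beta_hat, *_ = np.linalg.lstsq(X, price, rcond=None)

# Hypothetical new house: [1 for the intercept, size in sq ft, bedrooms].
new_house = np.array([1.0, 1500.0, 3.0])
predicted_price = new_house @ beta_hat

print(f"predicted price: {predicted_price:.2f} (in $1000s)")
```

The new point lies within the range of sizes and bedroom counts in the table; predictions far outside that range would be extrapolation and should be treated with more caution.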

---

### 6. **Assumptions of Multiple Linear Regression**

Multiple Linear Regression relies on the same assumptions as Simple Linear Regression, plus one that concerns the relationships among the independent variables:

1. **Linearity**: The relationship between the independent variables and the dependent variable is linear.
2. **Independence**: Observations should be independent of each other.
3. **Homoscedasticity**: The residuals have constant variance at every level of the independent variables.
4. **Normality**: The residuals are normally distributed.
5. **No Multicollinearity**: The independent variables are not too highly correlated with each other.
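
One common diagnostic for the multicollinearity assumption, not covered elsewhere in this lesson, is the variance inflation factor (VIF). For each independent variable $x_j$ it is $1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing $x_j$ on the other independent variables. The sketch below is a minimal NumPy version under that definition. The function name `variance_inflation_factors` is hypothetical, and the features are the size and bedroom columns from the house-price example.

```python
import numpy as np


def variance_inflation_factors(features):
    """VIF for each column: 1 / (1 - R_j^2), regressing column j on the others."""
    features = np.asarray(features, dtype=float)
    n, p = features.shape
    vifs = []
    for j in range(p):
        target = features[:, j]
        others = np.delete(features, j, axis=1)
        X = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        fitted = X @ coef
        ss_res = np.sum((target - fitted) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        r2_j = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r2_j))
    return np.array(vifs)


# Size (sq ft) and number of bedrooms from the house-price example.
features = np.column_stack([
    [1400, 1600, 1700, 1875, 1100],
    [3, 3, 4, 3, 2],
])

vifs = variance_inflation_factors(features)
print(vifs)
```

A VIF of 1 means a variable is uncorrelated with the others. Commonly cited rules of thumb treat values above about 5 or 10 as a sign of problematic multicollinearity, though the threshold is a judgment call.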

---

### Conclusion

Multiple Linear Regression extends the power of simple linear regression by using more than one independent variable. It allows us to model more complex relationships and capture the influence of multiple factors on the dependent variable. By understanding the derivation and evaluation of the model, we can apply multiple linear regression to various real-world scenarios.

### References

1. [Newcastle University](https://www.ncl.ac.uk/webtemplate/ask-assets/external/maths-resources/statistics/regression-and-correlation/simple-linear-regression.html)