<div style="background-color:#daee8420; line-height:1.5; text-align:center;border:2px solid black;">
    <div style="color:#7B242F; font-size:24pt; font-weight:700;">The Ultimate Machine Learning Mastery Course with Python</div>
</div>

---
### **Course**: The Ultimate Machine Learning Course with Python  
#### **Chapter**: Regression  
##### **Lesson**: Linear Regression Theory  
###### **Author:** Dr. Saad Laouadi   
###### **Copyright:** Dr. Saad Laouadi    

---

## License

**This material is intended for educational purposes only and may not be used directly in courses, video recordings, or similar without prior consent from the author. When using or referencing this material, proper credit must be attributed to the author.**

```text
#**************************************************************************
#* (C) Copyright 2024 by Dr. Saad Laouadi. All Rights Reserved.           *
#**************************************************************************                                                                    
#* DISCLAIMER: The author has used their best efforts in preparing        *
#* this content. These efforts include development, research,             *
#* and testing of the theories and programs to determine their            *
#* effectiveness. The author makes no warranty of any kind,               *
#* expressed or implied, with regard to these programs or                 *
#* to the documentation contained within. The author shall not            *
#* be liable in any event for incidental or consequential damages         *
#* in connection with, or arising out of, the furnishing,                 *
#* performance, or use of these programs.                                 *
#*                                                                        *
#* This content is intended for tutorials, online articles,               *
#* and other educational purposes.                                        *
#**************************************************************************
```

## Introduction 

### Overview

Simple Linear Regression is a statistical method used to model the relationship between a dependent variable (target) and one independent variable (feature). The objective is to find a linear relationship between the two variables such that we can predict the value of the dependent variable based on the independent variable.

### 1. **General Mathematical Formula**

The general form of a simple linear regression model can be expressed as:

$$
\hat{y} = \beta_0 + \beta_1 x
$$

Where:
- $\hat{y}$ is the predicted value (dependent variable).
- $x$ is the independent variable.
- $\beta_0$ is the intercept (the value of $\hat{y}$ when $x = 0$).
- $\beta_1$ is the slope of the line, representing the change in $\hat{y}$ for a one-unit change in $x$.
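
The roles of the two coefficients can be checked with a few lines of Python. This is a minimal sketch: the function name `predict` and the coefficient values are assumptions chosen only for illustration, not values from the lesson.

```python
def predict(x, beta_0, beta_1):
    """Return the predicted value y_hat = beta_0 + beta_1 * x."""
    return beta_0 + beta_1 * x


# Hypothetical coefficients, chosen only for illustration.
beta_0, beta_1 = 2.0, 0.5

# At x = 0 the prediction equals the intercept.
intercept_value = predict(0, beta_0, beta_1)

# Moving x by one unit changes the prediction by the slope.
unit_change = predict(4, beta_0, beta_1) - predict(3, beta_0, beta_1)

print(intercept_value, unit_change)  # 2.0 0.5
```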


### 2. **Model Representation**

Simple linear regression assumes a linear relationship between the independent variable and the dependent variable, which can be represented graphically as a straight line. Mathematically, the model can be described as:

$$
y = \beta_0 + \beta_1 x + \epsilon
$$

Where:
- $y$ is the actual observed value (dependent variable).
- $\epsilon$ is the error term, which accounts for the part of $y$ that the line $\beta_0 + \beta_1 x$ does not explain.

The goal of linear regression is to find the values of $\beta_0$ and $\beta_1$ that make the predicted values $\hat{y}$ as close as possible to the actual values $y$, with closeness measured by the sum of squared differences.
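
To make the role of the error term concrete, the sketch below simulates data from the model equation with NumPy. The "true" parameters, the noise level, and the sample size are all assumptions made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

# Hypothetical "true" parameters and noise level, chosen for illustration.
beta_0, beta_1, noise_std = 2.0, 0.5, 1.0

x = np.linspace(0, 10, 50)                          # independent variable
epsilon = rng.normal(0.0, noise_std, size=x.shape)  # error term
y = beta_0 + beta_1 * x + epsilon                   # observed values

# Points on the underlying straight line, without noise.
y_line = beta_0 + beta_1 * x

# The gap between each observation and the line is exactly the error term.
print(np.allclose(y - y_line, epsilon))  # True
```

In practice only `x` and `y` are observed; the coefficients and the error term are unknown and have to be estimated from the data.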


### 3. Representing the Data in Matrix Notation (Simple Linear Regression)

In simple linear regression, the model can also be expressed in matrix notation for easier computation. 

The model equation is:

$$
y_i = \beta_0 + \beta_1 x_i + \epsilon_i
$$

This can be written in matrix form as:

$$
\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}
$$

Where:

- $\mathbf{y}$ is the $n \times 1$ vector of observed values (dependent variable):

  $$
  \mathbf{y} =
  \begin{bmatrix}
  y_1 \\
  y_2 \\
  \vdots \\
  y_n
  \end{bmatrix}
  $$

- $\mathbf{X}$ is the $n \times 2$ design matrix, with a column of ones for the intercept and a column holding the values of the independent variable:

  $$
  \mathbf{X} =
  \begin{bmatrix}
  1 & x_1 \\
  1 & x_2 \\
  \vdots & \vdots \\
  1 & x_n
  \end{bmatrix}
  $$

- $\boldsymbol{\beta}$ is the $2 \times 1$ vector of coefficients (including the intercept $\beta_0$):

  $$
  \boldsymbol{\beta} =
  \begin{bmatrix}
  \beta_0 \\
  \beta_1
  \end{bmatrix}
  $$

- $\boldsymbol{\epsilon}$ is the $n \times 1$ vector of error terms (the residuals $e_i$ are their estimates once the model is fitted):

  $$
  \boldsymbol{\epsilon} =
  \begin{bmatrix}
  \epsilon_1 \\
  \epsilon_2 \\
  \vdots \\
  \epsilon_n
  \end{bmatrix}
  $$
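
The matrix form maps directly onto NumPy arrays. The following sketch builds the four objects for a small sample; the data values and the coefficient vector are hypothetical and serve only to show the shapes involved.

```python
import numpy as np

# Hypothetical sample of n = 4 observations.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])

# Column of ones for the intercept, followed by the x values: shape (n, 2).
X = np.column_stack([np.ones_like(x), x])

# Hypothetical coefficient vector [beta_0, beta_1].
beta = np.array([1.0, 2.0])

# X @ beta evaluates beta_0 + beta_1 * x_i for every row at once.
fitted = X @ beta

# Whatever is left over for this choice of beta plays the role of epsilon.
epsilon = y - fitted

print(X.shape, beta.shape, fitted.shape, epsilon.shape)
```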

### Coefficient Estimation in Matrix Form

Using the Ordinary Least Squares (OLS) method, the estimated coefficients $\hat{\boldsymbol{\beta}}$ are given by:

$$
\hat{\boldsymbol{\beta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}
$$

Where:
- $\mathbf{X}^T$ is the transpose of the matrix $\mathbf{X}$,
- $(\mathbf{X}^T \mathbf{X})^{-1}$ is the inverse of the product of $\mathbf{X}^T$ and $\mathbf{X}$.

For simple linear regression, this equation yields the intercept ($\beta_0$) and the slope ($\beta_1$) of the regression line in a single step. It requires $\mathbf{X}^T \mathbf{X}$ to be invertible, which holds whenever the values $x_i$ are not all identical.
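
The estimator can be sketched in NumPy as follows. The helper name `ols_normal_equations` and the data are assumptions for illustration. The sketch solves the linear system $(\mathbf{X}^T \mathbf{X})\boldsymbol{\beta} = \mathbf{X}^T \mathbf{y}$ rather than forming the inverse explicitly, which gives the same estimate and is generally better behaved numerically.

```python
import numpy as np


def ols_normal_equations(x, y):
    """Estimate [beta_0, beta_1] by solving (X^T X) beta = X^T y."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    return np.linalg.solve(X.T @ X, X.T @ y)


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

beta_hat = ols_normal_equations(x, y)

# Cross-check against NumPy's built-in least-squares solver.
X = np.column_stack([np.ones_like(x), x])
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat, np.allclose(beta_hat, beta_lstsq))
```

Because the data were generated without noise, the estimate recovers the intercept 1 and slope 2 up to floating-point rounding.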


### Derivation of Parameters

To estimate the coefficients $\beta_0$ and $\beta_1$, we use the **Ordinary Least Squares (OLS)** method, which minimizes the sum of squared errors (SSE) between the predicted values $\hat{y}$ and the actual values $y$.

The error (or residual) for each observation is defined as:

$$
e_i = y_i - \hat{y}_i = y_i - (\beta_0 + \beta_1 x_i)
$$

The sum of squared errors (SSE) is then:

$$
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - (\beta_0 + \beta_1 x_i))^2
$$
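
As a quick illustration, the SSE can be evaluated for any candidate line. In this sketch the function name `sse`, the data points, and the two candidate lines are all hypothetical.

```python
def sse(x, y, beta_0, beta_1):
    """Sum of squared errors for the line beta_0 + beta_1 * x."""
    return sum((yi - (beta_0 + beta_1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data points.
x = [1, 2, 3, 4]
y = [2.0, 4.1, 5.9, 8.0]

# Two candidate lines: a flat line and one that tracks the data closely.
sse_flat = sse(x, y, beta_0=5.0, beta_1=0.0)
sse_steep = sse(x, y, beta_0=0.0, beta_1=2.0)

print(sse_flat, sse_steep)
```

The second candidate has a much smaller SSE, which is the sense in which it fits these points better.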

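We want to minimize the SSE with respect to $\beta_0$ and $\beta_1$. Setting the partial derivatives of the SSE with respect to each coefficient to zero and solving the resulting pair of equations gives the following formulas for the coefficients: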

#### a. **Slope $\beta_1$**:

$$
\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
$$

Where:
- $\bar{x}$ is the mean of the independent variable $x$.
- $\bar{y}$ is the mean of the dependent variable $y$.

#### b. **Intercept $\beta_0$**:

Once $\beta_1$ is calculated, we can find $\beta_0$ using the following equation:

$$
\beta_0 = \bar{y} - \beta_1 \bar{x}
$$

Where:
- $\bar{y}$ is the mean of the dependent variable $y$.
- $\bar{x}$ is the mean of the independent variable $x$.
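
The two formulas can be recovered with a short calculus argument. What follows is a sketch of that derivation, using only the SSE definition from this section. At the minimum, both partial derivatives vanish:

$$
\frac{\partial SSE}{\partial \beta_0} = -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0
$$

$$
\frac{\partial SSE}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0
$$

Dividing the first equation by $-2n$ gives $\bar{y} - \beta_0 - \beta_1 \bar{x} = 0$, that is, $\beta_0 = \bar{y} - \beta_1 \bar{x}$. Substituting this into the second equation gives:

$$
\sum_{i=1}^{n} x_i \left[ (y_i - \bar{y}) - \beta_1 (x_i - \bar{x}) \right] = 0
\quad \Longrightarrow \quad
\beta_1 = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})}
$$

Because $\sum_{i=1}^{n} (y_i - \bar{y}) = 0$ and $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$, replacing $x_i$ with $(x_i - \bar{x})$ in the numerator and the denominator changes neither sum, which yields the slope formula:

$$
\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
$$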



### 4. **Evaluating the Model**

After fitting the model, we need to evaluate how well it fits the data. Common metrics used in linear regression evaluation include:

#### a. **Coefficient of Determination ($R^2$)**:
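The $R^2$ value measures the proportion of the variance in the dependent variable that is predictable from the independent variable. For an OLS fit with an intercept, evaluated on the data used for fitting, it lies between 0 and 1, and values closer to 1 indicate a better fit.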

$$
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
$$

#### b. **Mean Squared Error (MSE)**:
MSE measures the average of the squared differences between actual and predicted values.

$$
MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

#### c. **Root Mean Squared Error (RMSE)**:
RMSE is the square root of MSE. It is expressed in the same units as the dependent variable, which makes the typical size of the prediction errors easier to interpret.

$$
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
$$
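
The three metrics translate directly into code. The sketch below uses only the standard library; the function name `regression_metrics` and the sample values are assumptions for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, MSE, RMSE) for paired actual and predicted values."""
    n = len(y_true)
    y_mean = sum(y_true) / n
    # Residual sum of squares: numerator of the R^2 fraction.
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: denominator (zero if all y values are equal).
    ss_tot = sum((yt - y_mean) ** 2 for yt in y_true)
    mse = ss_res / n
    return 1 - ss_res / ss_tot, mse, math.sqrt(mse)


# Hypothetical actual values and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.0]

r2, mse, rmse = regression_metrics(y_true, y_pred)
print(r2, mse, rmse)
```

Note that this sketch does not guard against a constant `y_true`, for which $R^2$ is undefined.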

---

### 5. **Example of Simple Linear Regression**

Let’s consider a simple example where we have data on a student's study time ($x$) and their corresponding test scores ($y$).

| Study Time (Hours) | Test Score |
|-------------------|------------|
| 1                 | 50         |
| 2                 | 55         |
| 3                 | 65         |
| 4                 | 70         |
| 5                 | 75         |

We want to build a simple linear regression model to predict test scores based on the amount of time a student studies.

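#### Step 1: Calculate the slope ($\beta_1$)

The means are $\bar{x} = 3$ and $\bar{y} = 63$, so:

$$
\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{26 + 8 + 0 + 7 + 24}{4 + 1 + 0 + 1 + 4} = \frac{65}{10} = 6.5
$$

#### Step 2: Calculate the intercept ($\beta_0$)

$$
\beta_0 = \bar{y} - \beta_1 \bar{x} = 63 - 6.5 \times 3 = 43.5
$$

#### Step 3: Prediction Formula

$$
\hat{y} = \beta_0 + \beta_1 x = 43.5 + 6.5x
$$

You can use this formula to predict the test score for any given study time. For example, 3.5 hours of study gives a predicted score of $43.5 + 6.5 \times 3.5 = 66.25$, and each additional hour of study is associated with a 6.5-point increase in the predicted score.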
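
The three steps can be carried out on the table above in plain Python. This is a minimal sketch; the variable names and the query value of 3.5 hours are chosen here for illustration.

```python
# Data from the study-time table.
study_hours = [1, 2, 3, 4, 5]
test_scores = [50, 55, 65, 70, 75]

n = len(study_hours)
x_mean = sum(study_hours) / n  # 3.0
y_mean = sum(test_scores) / n  # 63.0

# Step 1: slope = sum of cross-deviations / sum of squared x-deviations.
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(study_hours, test_scores))
s_xx = sum((x - x_mean) ** 2 for x in study_hours)
beta_1 = s_xy / s_xx  # 65 / 10

# Step 2: intercept from the two means and the slope.
beta_0 = y_mean - beta_1 * x_mean


# Step 3: prediction formula.
def predict(hours):
    return beta_0 + beta_1 * hours


print(beta_0, beta_1, predict(3.5))  # 43.5 6.5 66.25
```

The query value lies inside the observed range of 1 to 5 hours; predictions far outside that range rest on the assumption that the same linear trend continues.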


### 6. **Assumptions of Simple Linear Regression**

1. **Linearity**: The relationship between the independent and dependent variables should be linear.
2. **Independence**: Observations should be independent of each other.
3. **Homoscedasticity**: The variance of the residuals (errors) should be constant for all values of the independent variable.
4. **Normality**: The residuals should be approximately normally distributed.
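
These assumptions are usually examined through the residuals of the fitted model. The sketch below computes a few rough numerical indicators with NumPy on simulated data; the data, the names `durbin_watson`, `spread_ratio`, and `skewness`, and the choice of indicators are assumptions for illustration. They are crude screens, not formal tests, and residual plots are normally inspected alongside them.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical data, generated so that the assumptions hold by construction.
x = np.linspace(0, 10, 100)
y = 3.0 + 1.5 * x + rng.normal(0.0, 1.0, size=x.shape)

# Fit by least squares and compute the residuals.
X = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

# Independence (for ordered observations): the Durbin-Watson statistic is
# close to 2 when successive residuals are uncorrelated.
durbin_watson = np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)

# Homoscedasticity: ratio of residual spread in the upper vs lower half of x;
# a value far from 1 hints at non-constant variance.
half = len(x) // 2
spread_ratio = residuals[half:].std() / residuals[:half].std()

# Normality: sample skewness is close to 0 for symmetric residuals.
z = (residuals - residuals.mean()) / residuals.std()
skewness = np.mean(z ** 3)

print(durbin_watson, spread_ratio, skewness)
```

Linearity is best judged from a plot of the residuals against $x$: a curved pattern suggests that a straight line is not an adequate description of the relationship.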


### Conclusion

Simple Linear Regression is a foundational technique in machine learning and statistics for modeling relationships between two variables. It is a straightforward method but has powerful applications in prediction and interpretation. By understanding how the parameters are derived and evaluated, you can apply this technique to various real-world problems.

### References

1. [Newcastle University](https://www.ncl.ac.uk/webtemplate/ask-assets/external/maths-resources/statistics/regression-and-correlation/simple-linear-regression.html)