# Evaluation Metrics for Linear Regression

Several metrics are used to evaluate how well a linear regression model's predictions match the observed outputs. The most common are:

## 1. Mean Squared Error (MSE)
MSE calculates the average of the squared differences between the actual and predicted values.

**Formula:**
$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y_i})^2
$

Where:
- **n** = number of data points
- $ y_i $ = actual value for the $ i^{\text{th}} $ data point
- $ \hat{y_i} $ = predicted value for the $ i^{\text{th}}$ data point


MSE summarizes the prediction error in squared units of the dependent variable; a lower MSE indicates a better fit. It is sensitive to outliers because squaring the errors gives large errors a disproportionate weight in the average.
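
The MSE formula can be sketched in plain Python as follows. The function name `mean_squared_error` and the sample values are assumptions made for illustration; they are not taken from any particular library or dataset.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


# Hypothetical data, chosen only for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 10.0]
print(mean_squared_error(y_true, y_pred))  # 0.375

# One badly wrong prediction (error of 10) adds 100 to the sum of squares.
y_pred_outlier = [2.5, 5.5, 7.0, 19.0]
print(mean_squared_error(y_true, y_pred_outlier))  # 25.125
```

In this sketch a single large error raises the MSE from 0.375 to 25.125, which illustrates the sensitivity to outliers.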



## 2. Mean Absolute Error (MAE)
MAE measures the average absolute difference between predicted values and actual values.

**Formula:**
$
\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y_i}|
$

Where:
- **n** = number of observations
- $ y_i $ = actual values
- $ \hat{y_i} $ = predicted values

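MAE is less sensitive to outliers than MSE because each error contributes in proportion to its absolute size rather than its square. Outliers still affect MAE, but they do not dominate it.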
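
A minimal sketch of the MAE formula in plain Python is shown here. The function name `mean_absolute_error` and the sample values are hypothetical and serve only as an illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between actual and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    return sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / n


# Hypothetical data, chosen only for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 10.0]
print(mean_absolute_error(y_true, y_pred))  # 0.5

# The same data with one badly wrong prediction (error of 10).
y_pred_outlier = [2.5, 5.5, 7.0, 19.0]
print(mean_absolute_error(y_true, y_pred_outlier))  # 2.75
```

Here the large error moves the MAE from 0.5 to 2.75: it contributes linearly (10 to the sum) rather than quadratically (100 to the sum), as it would under MSE.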



## 3. Root Mean Squared Error (RMSE)
RMSE is the square root of the MSE. It measures the typical size of the residuals and describes how closely the predicted values match the observed data points.

**Formula:**
$
\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y_i})^2}
$

RMSE is expressed in the same units as the dependent variable, so its value depends on the scale of that variable and should not be compared across targets measured in different units. A lower RMSE indicates a better fit.
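
As a rough illustration, RMSE can be computed with the standard library's `math.sqrt`. The function name `root_mean_squared_error` and the sample values are assumptions for this sketch.

```python
import math


def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean of the squared residuals."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / n)


# Hypothetical data, chosen only for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 10.0]
print(round(root_mean_squared_error(y_true, y_pred), 4))  # 0.6124
```

Because of the square root, the result (about 0.61) is in the same units as the values in `y_true`.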

### Residual Standard Error (RSE):
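RSE adjusts the RMSE for the degrees of freedom used in fitting the model. Under the usual linear model assumptions, its square is an unbiased estimate of the error variance. The denominator $ n - 2 $ below applies to simple linear regression, where two coefficients (intercept and slope) are estimated; with $ k $ predictors it becomes $ n - k - 1 $.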

**Formula:**
$
\text{RSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y_i})^2}{n - 2}}
$
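
The RSE formula can be sketched as below. The function name `residual_standard_error` and the `n_predictors` parameter are hypothetical; the default of one predictor reproduces the $ n - 2 $ denominator above, and the generalization to $ n - k - 1 $ is an assumption of this sketch. The sample predictions are illustrative and are assumed to come from a fitted simple linear model.

```python
import math


def residual_standard_error(y_true, y_pred, n_predictors=1):
    """Square root of RSS divided by the residual degrees of freedom."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    # One degree of freedom per predictor, plus one for the intercept.
    degrees_of_freedom = len(y_true) - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("not enough observations for this many predictors")
    rss = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    return math.sqrt(rss / degrees_of_freedom)


# Hypothetical data, chosen only for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 10.0]
print(round(residual_standard_error(y_true, y_pred), 4))  # 0.866
```

With only four observations, dividing by $ n - 2 = 2 $ instead of $ n = 4 $ makes the RSE noticeably larger than the RMSE of the same residuals; the gap shrinks as $ n $ grows.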


## 4. Coefficient of Determination (R²)
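R² indicates the proportion of the variation in the dependent variable that is explained by the independent variables in the model. For a least-squares fit with an intercept, evaluated on the training data, the value ranges from 0 to 1, with higher values indicating a better fit. On new data, or for a model without an intercept, R² can be negative.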

**Formula:**
$
R^2 = 1 - \frac{\text{RSS}}{\text{TSS}}
$

Where:
- **RSS** = Residual Sum of Squares
- **TSS** = Total Sum of Squares

### Residual Sum of Squares (RSS):
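RSS is the sum of the squared differences between observed values and predicted values. The formula below is written for simple linear regression, where the predicted value is $ \hat{y_i} = b_0 + b_1 x_i $.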

**Formula:**
$
\text{RSS} = \sum_{i=1}^{n} (y_i - b_0 - b_1x_i)^2
$

Where:
- $ b_0, b_1 $ = model coefficients (intercept and slope)
- $ x_i $ = value of the predictor variable for the $ i^{\text{th}} $ data point

### Total Sum of Squares (TSS):
TSS measures the total variation of the observed values from their mean.

**Formula:**
$
\text{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2
$

Where:
- $ \bar{y} $ = mean of actual values
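
The relationship between RSS, TSS, and R² can be sketched in plain Python as follows. The helper names (`residual_sum_of_squares`, `total_sum_of_squares`, `r_squared`) and the sample values are assumptions for illustration. The RSS helper takes predicted values directly rather than the coefficients $ b_0 $ and $ b_1 $, so it is not tied to simple linear regression.

```python
def residual_sum_of_squares(y_true, y_pred):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))


def total_sum_of_squares(y_true):
    """Sum of squared deviations of the observed values from their mean."""
    y_mean = sum(y_true) / len(y_true)
    return sum((y - y_mean) ** 2 for y in y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - RSS / TSS."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    tss = total_sum_of_squares(y_true)
    if tss == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1 - residual_sum_of_squares(y_true, y_pred) / tss


# Hypothetical data, chosen only for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 10.0]
print(residual_sum_of_squares(y_true, y_pred))  # 1.5
print(total_sum_of_squares(y_true))             # 20.0
print(round(r_squared(y_true, y_pred), 4))      # 0.925
```

In this sketch the predictions leave 1.5 of the 20.0 units of total variation unexplained, giving an R² of about 0.925.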



## 5. Adjusted R²
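Plain R² never decreases when a predictor is added to a least-squares model, even if the predictor is irrelevant. Adjusted R² accounts for the number of predictors in the model and penalizes the model for including predictors that add little explanatory power.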

**Formula:**
$
\text{Adjusted } R^2 = 1 - \left( \frac{(1 - R^2) \cdot (n - 1)}{n - k - 1} \right)
$

Where:
- **n** = number of observations
- **k** = number of predictors
- $ R^2 $ = coefficient of determination

### Significance of Adjusted R²:
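- Adjusted R² discourages overfitting by penalizing predictors that do not sufficiently improve the explanation of variance in the dependent variable. It increases only when a new predictor improves the fit by more than would be expected by chance, and it can decrease otherwise.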
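
The adjusted R² formula can be sketched as a small function. The name `adjusted_r_squared` and the input values (an R² of 0.80 on 20 observations) are hypothetical and chosen only to show the effect of the penalty.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k predictors."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical values: the same R^2 of 0.80 with 3 and then 4 predictors.
print(round(adjusted_r_squared(0.80, n=20, k=3), 4))  # 0.7625
print(round(adjusted_r_squared(0.80, n=20, k=4), 4))  # 0.7467
```

In this example the fourth predictor leaves R² unchanged, so adjusted R² drops from about 0.7625 to about 0.7467, which signals that the extra predictor did not earn its place in the model.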



