## What is R-squared?

**R-squared** (also known as the **coefficient of determination**) is a statistical metric used to evaluate how well a regression model explains the variability of the target variable. It represents the proportion of the variance in the dependent variable that is predictable from the independent variables.

Mathematically, R-squared is defined as:

$$
R^2 = 1 - \frac{SS_{res}}{SS_{tot}}
$$

where:
- $SS_{res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ (the sum of squared residuals)
- $SS_{tot} = \sum_{i=1}^{n} (y_i - \bar{y})^2$ (the total sum of squares)
- $y_i$ = actual value
- $\hat{y}_i$ = predicted value
- $\bar{y}$ = mean of actual values

**Interpretation:**
- $R^2 = 1$: The model perfectly predicts all the data points.
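- $R^2 = 0$: The model explains none of the variability in the target variable; it does no better than always predicting the mean $\bar{y}$.
- $R^2 < 0$: The model performs worse than simply predicting the mean. This can happen on held-out data or when the model is fitted without an intercept.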
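As a minimal sketch, the three cases above can be reproduced directly from the definition. The helper name `r_squared` and the sample values are made up for illustration; only NumPy is assumed.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, computed directly from the definition."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # sum of squared residuals
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

y = np.array([1.0, 2.0, 3.0, 4.0])  # made-up actual values

r2_perfect = r_squared(y, y)                        # predictions equal the actuals -> 1
r2_mean = r_squared(y, np.full_like(y, y.mean()))   # always predict the mean -> 0
r2_bad = r_squared(y, y[::-1])                      # predictions in reversed order -> -3
```

Note that the helper divides by $SS_{tot}$, so it assumes the actual values are not all identical.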

### Pros and Cons of R-squared

**Pros:**
- **Intuitive Interpretation:** Indicates the proportion of variance explained by the model.
- **Widely Used:** Standard metric for evaluating regression models.

**Cons:**
- **Does Not Indicate Causation:** A high R-squared does not mean that changes in independent variables cause changes in the dependent variable.
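- **Can Be Misleading:** For a least-squares fit, R-squared on the training data never decreases when a predictor is added, so extra variables can inflate it even if they are not meaningful.
- **Not Always Suitable:** R-squared is built from squared errors, so a few outliers can change it substantially, and for non-linear models its interpretation as "proportion of variance explained" may not hold.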
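The inflation effect can be illustrated with a small experiment. This is only a sketch on synthetic data: the data-generating process, the seed, and the helper `ols_r2` are assumptions chosen for the demonstration, and the fit uses `np.linalg.lstsq` with an explicit intercept column.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)   # synthetic target: one real predictor plus noise
noise = rng.normal(size=(n, 5))    # five predictors unrelated to y

def ols_r2(X, y):
    """Fit ordinary least squares with an intercept and return the training R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

r2_base = ols_r2(x.reshape(-1, 1), y)
r2_padded = ols_r2(np.column_stack([x, noise]), y)
print(f"one predictor: {r2_base:.4f}, plus five noise columns: {r2_padded:.4f}")
```

The second value is never lower than the first, even though the extra columns carry no information about `y`.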


 ## What is Adjusted R-squared?
 
 **Adjusted R-squared** is a modified version of the R-squared metric that adjusts for the number of predictors (independent variables) in a regression model. While R-squared never decreases as more variables are added to a least-squares model, Adjusted R-squared penalizes each additional predictor, so it rises only when a new predictor improves the fit enough to offset that penalty.
 
 Mathematically, Adjusted R-squared is defined as:
 
 $$
 \text{Adjusted } R^2 = 1 - \left(1 - R^2\right) \frac{n - 1}{n - p - 1}
 $$
 
 where:
 - $R^2$ = R-squared value
 - $n$ = number of observations
 - $p$ = number of independent variables (predictors)
 
 **Interpretation:**
 - Adjusted R-squared increases only if the new predictor improves the model more than would be expected by chance.
 - It can decrease if the added variable does not improve the model sufficiently.
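A short sketch of the formula shows how the penalty works. The function name and all numbers below are hypothetical: two extra predictors lift R-squared only from 0.800 to 0.802, which is not enough to offset the penalty.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Hypothetical models fitted on the same 50 observations.
adj_small = adjusted_r_squared(0.800, n=50, p=3)   # about 0.787
adj_large = adjusted_r_squared(0.802, n=50, p=5)   # about 0.780, lower despite higher R^2

# A weak fit with many predictors relative to n can go negative.
adj_negative = adjusted_r_squared(0.05, n=20, p=5)
```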
 
 ### Pros and Cons of Adjusted R-squared
 
 **Pros:**
 - **Penalizes Irrelevant Predictors:** Helps prevent overfitting by discouraging the inclusion of unnecessary variables.
 - **Better Model Comparison:** Useful for comparing models with different numbers of predictors.
 
 **Cons:**
 - **Still Not a Guarantee of Causation:** Like R-squared, a high Adjusted R-squared does not imply causation.
 - **Can Be Negative:** In poorly fitting models, Adjusted R-squared can be negative.
 
 **Summary:**  
 Adjusted R-squared is generally preferred over R-squared when comparing models with different numbers of predictors, as it provides a more reliable measure of model quality.


## What is MAE?

**MAE** stands for **Mean Absolute Error**. It is a commonly used metric in regression analysis that measures the average absolute difference between the actual (true) values and the predicted values.

Mathematically, for a set of $n$ data points, MAE is calculated as:

$$
\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|
$$

where:
- $y_i$ = actual value
- $\hat{y}_i$ = predicted value

A lower MAE indicates that the predictions are closer to the actual values.
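As a quick illustration, the formula translates almost directly into NumPy. The function name and the sample values are made up for this sketch.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average of |y_i - y_hat_i| over all data points."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

y_true = [3.0, 5.0, 2.5, 7.0]   # made-up actual values
y_pred = [2.5, 5.0, 4.0, 8.0]   # made-up predictions

mae = mean_absolute_error(y_true, y_pred)   # (0.5 + 0 + 1.5 + 1) / 4 = 0.75
```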

### Pros and Cons of MAE

**Pros:**
- **Interpretability:** MAE is easy to interpret because it is in the same units as the target variable.
- **Robust to Outliers:** MAE is less sensitive to outliers compared to MSE, as it does not square the errors.
- **Widely Used:** MAE is a standard metric for regression tasks.

**Cons:**
- **Non-differentiable at Zero:** The absolute value function is not differentiable at zero, which can be a challenge for some optimization algorithms.
- **Equal Weight to All Errors:** MAE treats all errors equally, which may not be desirable if larger errors should be penalized more.


 ## What is MAPE?
 
 **MAPE** stands for **Mean Absolute Percentage Error**. It is a metric used to measure the accuracy of a regression model as a percentage. MAPE calculates the average absolute difference between the actual and predicted values, expressed as a percentage of the actual values.
 
 Mathematically, for a set of $n$ data points, MAPE is calculated as:
 
 $$
 \text{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|
 $$
 
 where:
 - $y_i$ = actual value
 - $\hat{y}_i$ = predicted value
 
 A lower MAPE indicates better predictive accuracy, with 0% being a perfect prediction.
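The formula can be sketched in a few lines of NumPy. The function name `mape` and the sample values are illustrative; rejecting zero actuals with an exception is one possible design choice, not the only one.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if np.any(y_true == 0):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

y_true = [100.0, 200.0, 50.0]   # made-up actual values
y_pred = [110.0, 180.0, 50.0]   # made-up predictions

mape_value = mape(y_true, y_pred)   # (10% + 10% + 0%) / 3, about 6.67%
```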
 
 ### Pros and Cons of MAPE
 
 **Pros:**
 - **Interpretability:** MAPE is easy to interpret because it expresses errors as percentages.
 - **Scale Independence:** Useful for comparing forecast accuracy across different datasets or scales.
 
 **Cons:**
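 - **Undefined for Zero Actuals:** If any $y_i = 0$, MAPE is undefined because the formula divides by zero.
 - **Biased Toward Underestimation:** For non-negative predictions, the percentage error of an under-prediction is at most 100%, while that of an over-prediction is unbounded, so MAPE tends to favor models that under-predict.
 - **Sensitive to Outliers:** When actual values are close to zero, even small absolute errors produce very large percentage errors that can dominate the metric.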
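The asymmetry behind the bias toward underestimation, and the blow-up near zero, can be seen on single predictions. The helper `ape` and the numbers are hypothetical, chosen only to make the effect visible.

```python
def ape(actual, predicted):
    """Absolute percentage error of a single prediction, in percent."""
    return 100.0 * abs(actual - predicted) / abs(actual)

under = ape(100.0, 0.0)     # worst non-negative under-prediction: 100%
over = ape(100.0, 300.0)    # over-prediction: 200%, and it can grow without bound
tiny = ape(0.01, 0.5)       # absolute error of only 0.49, but roughly 4900%
```

Because under-predictions are capped at 100% while over-predictions are not, a model tuned to minimize MAPE can be pulled toward predicting too low.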


## What is MSE?

**MSE** stands for **Mean Squared Error**. It is a common metric used to measure the average squared difference between the actual (true) values and the predicted values in regression problems.

Mathematically, for a set of $n$ data points, MSE is calculated as:

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

where:
- $y_i$ = actual value
- $\hat{y}_i$ = predicted value

A lower MSE indicates that the predictions are closer to the actual values.
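A minimal NumPy sketch of the formula follows. The function name and the sample values are made up for illustration.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of (y_i - y_hat_i)^2 over all data points."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

y_true = [3.0, 5.0, 2.5, 7.0]   # made-up actual values
y_pred = [2.5, 5.0, 4.0, 8.0]   # made-up predictions

mse = mean_squared_error(y_true, y_pred)   # (0.25 + 0 + 2.25 + 1) / 4 = 0.875
```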


 ### Pros and Cons of MSE
 
 **Pros:**
 - **Widely Used:** MSE is a standard and well-understood metric in regression analysis.
 - **Penalizes Large Errors:** By squaring the errors, MSE gives more weight to larger errors, which is useful when large mistakes are especially costly.
 - **Differentiable:** MSE is continuous and differentiable, which is useful for optimization algorithms like gradient descent.
 
 **Cons:**
 - **Sensitive to Outliers:** Because errors are squared, a few large errors can disproportionately affect the MSE, making it less robust to outliers.
 - **Units:** The value of MSE is in the squared units of the target variable, which can make interpretation less intuitive.
 - **Can Be Misleading on Skewed Errors:** Because a few large errors dominate the average, a lower MSE may reflect a better fit to a handful of extreme points rather than better typical predictions.
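The outlier sensitivity described above can be made concrete by comparing how one large miss moves MSE versus MAE. The residuals below are hypothetical.

```python
import numpy as np

errors_clean = np.array([1.0, 1.0, 1.0, 1.0])      # hypothetical residuals
errors_outlier = np.array([1.0, 1.0, 1.0, 10.0])   # same residuals, one large miss

mae_clean = np.mean(np.abs(errors_clean))       # 1.0
mae_outlier = np.mean(np.abs(errors_outlier))   # 13 / 4 = 3.25
mse_clean = np.mean(errors_clean ** 2)          # 1.0
mse_outlier = np.mean(errors_outlier ** 2)      # 103 / 4 = 25.75

print(f"MAE: {mae_clean:.2f} -> {mae_outlier:.2f}")
print(f"MSE: {mse_clean:.2f} -> {mse_outlier:.2f}")
```

A single outlier raises MAE by a factor of 3.25 but MSE by a factor of 25.75.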


## What is RMSE?
 
 **RMSE** stands for **Root Mean Squared Error**. It is the square root of the MSE and a popular measure of the typical magnitude of the errors between predicted and actual values in regression problems.
 
 Mathematically, for a set of $n$ data points, RMSE is calculated as:
 
 $$
 \text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
 $$
 
 where:
 - $y_i$ = actual value
 - $\hat{y}_i$ = predicted value
 
RMSE is always non-negative, and a lower RMSE value indicates better predictive accuracy. Unlike MSE, RMSE has the same units as the target variable, making it more interpretable.
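As a sketch, RMSE is simply the square root of the mean squared difference. The function name and sample values are illustrative; MAE is computed alongside only for comparison.

```python
import numpy as np

def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

y_true = np.array([3.0, 5.0, 2.5, 7.0])   # made-up actual values
y_pred = np.array([2.5, 5.0, 4.0, 8.0])   # made-up predictions

rmse = root_mean_squared_error(y_true, y_pred)   # sqrt(0.875), about 0.935
mae = np.mean(np.abs(y_true - y_pred))           # 0.75, for comparison
```

On the same data RMSE is never smaller than MAE, and the two are equal only when all absolute errors are the same size.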
 
### Pros and Cons of RMSE
 
**Pros:**
- **Interpretability:** RMSE is in the same units as the target variable, making it easier to interpret.
- **Penalizes Large Errors:** Like MSE, RMSE gives higher weight to larger errors due to the squaring of differences.
- **Widely Used:** RMSE is a standard metric in regression analysis and model evaluation.
 
**Cons:**
 - **Sensitive to Outliers:** RMSE is affected by outliers, as large errors have a disproportionately large effect.
 - **Not Robust:** If the data contains significant outliers, RMSE may not reflect the typical prediction error.
 - **Not Always Intuitive:** Although more interpretable than MSE, RMSE is not the average error. It is always at least as large as MAE, and the gap widens when the error distribution is highly skewed.
