# ***REGRESSION METRICS***
# ***HIMEL SARDER***

![image.png](attachment:9fec381e-cca0-4c81-a06c-24f506931e0c.png)

# ***1. Mean Absolute Error (MAE)***
The Mean Absolute Error (MAE) is a commonly used metric for evaluating the performance of regression models. It is the average of the absolute differences between the predicted and actual values.

![image.png](attachment:b8e4d03f-3ea1-4d5b-97b4-1240c072e499.png)

![image.png](attachment:2a3d62d4-616c-4d0c-8eb6-d92f51e7b3a0.png)

![image.png](attachment:96303401-d938-4e97-9402-bdf8d21a3fa6.png)
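
The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `mean_absolute_error` and the sample values are assumptions chosen for the example.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Hypothetical actual and predicted values, chosen only for illustration.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

mae = mean_absolute_error(y_true, y_pred)
print(mae)  # absolute errors are 0.5, 0.0, 1.5, 1.0, so the mean is 0.75
```

The result is in the same unit as the target variable, which is what makes MAE easy to read.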

### Advantages of **Mean Absolute Error (MAE)**:

1. **Easy to Understand**  
   - MAE directly shows the average size of the error in the same unit as the target variable (e.g., dollars, kilograms).

2. **Weights Errors Linearly**  
   - Every error contributes in proportion to its size, with no extra weight on large errors, which keeps the metric straightforward.

3. **Less Sensitive to Outliers**  
   - MAE does not square the errors, so extreme values (outliers) have less impact compared to metrics like MSE.

4. **Simple to Calculate**  
   - Just take the absolute differences between actual and predicted values and find the average.

5. **Good for Real-World Interpretations**  
   - In practical applications, understanding the average error is often more meaningful than squared errors.

6. **Robust for Non-Gaussian Data**  
   - Works well when the data distribution is not normal or has heavy tails.
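
The point about outliers can be checked with a small experiment. The sketch below uses made-up numbers and compares MAE with the mean squared error (the average of the squared errors) before and after a single prediction is replaced by an extreme value. The helper names `mae` and `mse` are assumptions for this example.

```python
import numpy as np

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical data: every prediction is off by exactly 1.
y_true = np.array([10.0, 12.0, 11.0, 13.0, 12.0])
y_pred_clean = np.array([11.0, 11.0, 12.0, 12.0, 13.0])

# Same predictions, but the last one is now off by 10 (an outlier).
y_pred_outlier = y_pred_clean.copy()
y_pred_outlier[-1] = 22.0

mae_clean, mse_clean = mae(y_true, y_pred_clean), mse(y_true, y_pred_clean)
mae_out, mse_out = mae(y_true, y_pred_outlier), mse(y_true, y_pred_outlier)

print(mae_clean, mae_out)  # 1.0 -> 2.8
print(mse_clean, mse_out)  # 1.0 -> 20.8
```

With these numbers, one outlier raises MAE by a factor of 2.8 but raises the squared-error average by a factor of 20.8.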

### Disadvantages of **Mean Absolute Error (MAE)**:

1. **Non-Differentiable at Zero**  
   - The absolute value function is not differentiable at zero, making it harder to optimize in some machine learning algorithms (e.g., gradient-based methods).

2. **Does Not Emphasize Large Errors**  
   - MAE weights each error in proportion to its size, so it does not penalize larger errors more heavily than smaller ones. This can be a problem if large errors are critical in your application.

3. **Less Popular for Complex Models**  
   - Metrics like **Mean Squared Error (MSE)** or **Root Mean Squared Error (RMSE)** are more commonly used in advanced machine learning tasks because they align better with gradient-based optimization methods.

# ***Mean Squared Error (MSE)***

![image.png](attachment:fcc80162-cb99-437a-80de-bc0be7f02151.png)

![image.png](attachment:32afc73f-c4b5-442d-9646-6cd59aefa612.png)
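
As a rough sketch, assuming the usual definition of MSE as the average of the squared differences between actual and predicted values, the computation looks like this. The function name and sample values are illustrative assumptions.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of (actual - predicted)^2 over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical actual and predicted values, chosen only for illustration.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

mse = mean_squared_error(y_true, y_pred)
print(mse)  # squared errors are 0.25, 0.0, 2.25, 1.0, so the mean is 0.875
```

Note that the result is in squared units of the target variable.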

### **Advantages of MSE**

1. **Penalizes Large Errors Heavily**  
   - Squaring the errors makes MSE more sensitive to large errors, which is useful in applications where larger deviations are critical.

2. **Smooth and Differentiable**  
   - MSE is differentiable, making it ideal for optimization techniques like gradient descent in machine learning models.

3. **Mathematically Convenient**  
   - The squaring operation simplifies mathematical derivations and is widely used in statistical and machine learning algorithms.

4. **Highlights Variance in Errors**  
   - MSE provides insights into the variability of errors, which can be useful for model improvement.

5. **Widely Accepted**  
   - MSE is a standard metric in many fields, making it easy to compare results across studies or models.

---

### **Disadvantages of MSE**

1. **Sensitive to Outliers**  
   - Squaring the errors magnifies the impact of outliers, which can distort the evaluation of model performance.

2. **Harder to Interpret**  
   - The squared error is in units of the square of the target variable, making it less interpretable compared to metrics like MAE.

3. **Overemphasizes Large Errors**  
   - While penalizing large errors can be beneficial, a model trained to minimize MSE may focus excessively on a few large errors at the expense of typical cases.

4. **Not Robust for Non-Gaussian Data**  
   - Minimizing MSE corresponds to maximum likelihood estimation when errors are normally distributed; it may not perform well with skewed or heavy-tailed error distributions.

5. **Scale Dependency**  
   - The value of MSE depends on the scale of the target variable, making it less suitable for comparisons across datasets with different scales.

# ***Root Mean Squared Error (RMSE)***

![image.png](attachment:5fc1bf4e-6d55-48b0-a7cd-824d53cc812d.png)
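
Assuming the usual definition of RMSE as the square root of the MSE, a minimal sketch is shown below. The function name and sample values are assumptions made for illustration.

```python
import math
import numpy as np

def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical actual and predicted values, chosen only for illustration.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

rmse = root_mean_squared_error(y_true, y_pred)
print(rmse)  # sqrt(0.875), roughly 0.935
```

Taking the square root brings the value back to the unit of the target variable. On the same data RMSE is never smaller than MAE (here roughly 0.935 versus 0.75), because squaring gives the larger errors more weight.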

### **Advantages of RMSE**

1. **Same Unit as Target Variable**  
   - RMSE is easy to interpret because it is in the same unit as the target variable, unlike MSE, which is in squared units.

2. **Penalizes Large Errors**  
   - Like MSE, RMSE gives higher weight to large errors, which is beneficial in applications where large deviations are critical.

3. **Widely Used and Recognized**  
   - RMSE is a standard metric in regression tasks and is commonly used in research and industry.

4. **Smooth and Differentiable**  
   - RMSE is differentiable, making it suitable for optimization in machine learning algorithms.

5. **Balances Small and Large Errors**  
   - While it penalizes large errors, it still considers smaller errors, offering a balanced evaluation of model performance.

---

### **Disadvantages of RMSE**

1. **Sensitive to Outliers**  
   - RMSE, like MSE, is highly influenced by outliers due to the squaring of errors, which can distort performance evaluation.

2. **Difficult Comparisons Across Scales**  
   - Since RMSE depends on the scale of the target variable, it cannot be used to compare models across datasets with different units or ranges.

3. **Overemphasizes Large Errors**  
   - Because errors are squared before averaging, a model tuned to minimize RMSE may concentrate on a few large errors at the expense of overall performance.

4. **Complexity in Interpretation**  
   - While it is in the same unit as the target variable, RMSE does not provide direct information about the distribution of errors or how they are spread.
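
The scale dependency mentioned above is easy to see by expressing the same data in two units. In this sketch the weights and predictions are made-up numbers, and `rmse` is a helper defined only for this example.

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical body weights and predictions, in kilograms.
weights_kg = np.array([60.0, 72.5, 81.0, 55.0])
pred_kg = np.array([62.0, 70.0, 80.0, 58.0])

rmse_kg = rmse(weights_kg, pred_kg)
# Same data and same model quality, expressed in grams.
rmse_g = rmse(weights_kg * 1000, pred_kg * 1000)

print(rmse_kg)  # 2.25
print(rmse_g)   # 2250.0
```

The model is equally good in both cases, yet the RMSE differs by a factor of 1000, so raw RMSE values are only comparable when the targets share a unit and range.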

---

### **When to Use RMSE**
- When interpretability in the same unit as the target variable is important.  
- When penalizing large errors is a priority.  
- In regression tasks where understanding the magnitude of errors is critical.

---

### **Summary**
- **RMSE** is a widely used metric that balances interpretability and sensitivity to large errors.  
- However, its sensitivity to outliers and dependency on the scale of the target variable are key limitations.

# ***R² Score (Coefficient of Determination)***

![image.png](attachment:b04b3352-1a22-4f63-aac3-167d6bae43ed.png)

![image.png](attachment:48bf9eee-ce39-405d-a95f-a59d0354d95d.png)
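
A minimal sketch of the computation, assuming the standard definition R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the actual values from their mean. The function name and data are illustrative assumptions.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return float(1.0 - ss_res / ss_tot)

# Hypothetical actual and predicted values, chosen only for illustration.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

r2 = r2_score(y_true, y_pred)
print(r2)  # 1 - 3.5 / 12.6875, roughly 0.724
```

This sketch does not handle the degenerate case where all actual values are identical, in which SS_tot is zero.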

### **Advantages of R² Score**

1. **Explains Variance**  
   - Provides a clear understanding of how much of the target variable's variance is explained by the model.

2. **Easy to Interpret**  
   - R² is a unitless proportion with a maximum of 1, often reported as a percentage of variance explained, making it simple to interpret and compare models.

3. **Useful for Model Comparison**  
   - Helps compare multiple models to see which one explains more variance.

4. **Standard Metric**  
   - Widely used in regression analysis, making it a standard for evaluating model performance.

5. **Highlights Model Fit**  
   - Shows how well the independent variables explain the dependent variable.

---

### **Disadvantages of R² Score**

1. **Does Not Penalize Overfitting**  
   - Adding more predictors can increase \( R^2 \), even if they are irrelevant, leading to overfitting.

2. **Not Always Reliable for Nonlinear Models**  
   - The "proportion of variance explained" interpretation comes from linear least-squares regression with an intercept, so R² may be less meaningful for nonlinear models.

3. **Does Not Show Error Magnitude**  
   - R² is unitless and does not indicate the size of the errors, so on its own it might not reflect the model's practical accuracy.

4. **Does Not Handle Outliers Well**  
   - Outliers can significantly distort the R² score, making it less reliable.

5. **Negative Values Are Hard to Interpret**  
   - When \( R^2 < 0 \), the model fits worse than always predicting the mean of the target, which is not intuitive for a quantity described as "variance explained".
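
To make the negative case concrete, the sketch below (assuming the standard definition R² = 1 - SS_res / SS_tot, with made-up numbers) compares a constant mean prediction with predictions that run in the opposite direction of the data.

```python
import numpy as np

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Baseline: always predict the mean of the target.
y_pred_mean = np.full_like(y_true, y_true.mean())
# A deliberately bad model: predictions in reverse order.
y_pred_bad = np.array([5.0, 4.0, 3.0, 2.0, 1.0])

r2_mean = r2_score(y_true, y_pred_mean)
r2_bad = r2_score(y_true, y_pred_bad)

print(r2_mean)  # 0.0
print(r2_bad)   # 1 - 40 / 10 = -3.0
```

The mean baseline scores exactly 0, so any negative value signals predictions that are worse than that baseline.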

---

### **When to Use R²**
- When you want to measure how much variance in the target variable is explained by the model.  
- For comparing the performance of different regression models.  
- In cases where explaining the relationship between predictors and the target variable is more important than error magnitude.

---

### **Summary**
- **R²** is a valuable metric for understanding and comparing the explanatory power of regression models.  
- However, it has limitations, especially in overfitting and nonlinear scenarios, and should often be used alongside other metrics like RMSE or MAE.

# ***Adjusted R² Score***

![image.png](attachment:be80ba5d-1196-4829-a153-7df20cf9f08e.png)
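
A minimal sketch, assuming the commonly used formula Adjusted R² = 1 - (1 - R²)(n - 1) / (n - p - 1), where n is the number of samples and p is the number of predictors. The function name, the R² value, and the sample size below are hypothetical.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

# Hypothetical values: the same R^2 and sample size, different predictor counts.
adj_p1 = adjusted_r2(0.85, n_samples=50, n_predictors=1)
adj_p3 = adjusted_r2(0.85, n_samples=50, n_predictors=3)

print(adj_p1)  # roughly 0.8469
print(adj_p3)  # roughly 0.8402
```

For a fixed R² and sample size, the adjusted value drops as the number of predictors grows, which is the penalty at work.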

### **Key Points**
1. **Purpose**: Adjusted R² helps evaluate whether adding more predictors genuinely improves the model or just inflates the R² artificially.  
2. **Behavior**:
   - If a new predictor improves the fit by more than would be expected by chance, Adjusted R² increases.  
   - If the predictor is irrelevant, Adjusted R² typically decreases.

---

### **Advantages of Adjusted R²**
1. **Discourages Overfitting**  
   - Penalizes the addition of irrelevant predictors, which discourages unnecessarily complex models.
   
2. **Better for Multiple Regression**  
   - Provides a more realistic measure of model performance when dealing with multiple predictors.

3. **Accounts for Model Complexity**  
   - Balances the trade-off between goodness of fit and the number of predictors.

---

### **Disadvantages of Adjusted R²**
1. **More Complex to Interpret**  
   - Compared to regular R², Adjusted R² requires understanding of the penalty mechanism.
   
2. **Sensitive to Sample Size**  
   - For small datasets, the penalty for adding predictors can disproportionately reduce Adjusted R².

3. **Not Always Necessary**  
   - In simple regression (single predictor), Adjusted R² is only slightly lower than R² unless the sample is small, so it adds little additional value.

---

### **Difference Between R² and Adjusted R²**
| **Aspect**          | **R²**                          | **Adjusted R²**                |
|---------------------|----------------------------------|---------------------------------|
| **Effect of Predictors** | Always increases or stays the same when predictors are added | Can decrease if irrelevant predictors are added |
| **Penalty for Predictors** | No penalty | Penalizes for adding irrelevant predictors |
| **Use Case**         | Simple regression or for rough comparison | Multiple regression or when model complexity matters |
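
The first row of the table can be checked with a small simulation. The sketch below fits ordinary least squares with `numpy.linalg.lstsq` on synthetic data, then adds a predictor that is pure noise. The data, the seed, and the helper names `fit_r2` and `adjusted_r2` are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
x = rng.normal(size=n)                        # informative predictor
noise_feature = rng.normal(size=n)            # unrelated to y by construction
y = 2.0 * x + rng.normal(scale=0.5, size=n)   # synthetic target

def fit_r2(features, y):
    """Fit OLS with an intercept and return R^2 on the training data."""
    X = np.column_stack([np.ones(len(y))] + features)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_res = np.sum((y - X @ coef) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def adjusted_r2(r2, n_samples, n_predictors):
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

r2_base = fit_r2([x], y)
r2_more = fit_r2([x, noise_feature], y)
adj_base = adjusted_r2(r2_base, n, 1)
adj_more = adjusted_r2(r2_more, n, 2)

print(f"R^2:          {r2_base:.4f} -> {r2_more:.4f}")
print(f"Adjusted R^2: {adj_base:.4f} -> {adj_more:.4f}")
```

For nested least-squares fits like these, R² on the training data cannot go down when a column is added. Adjusted R² may go down, depending on how much the extra column happens to reduce the residuals in the particular sample.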

---

### **When to Use Adjusted R²**
- When comparing models with different numbers of predictors.  
- In multiple regression, to ensure added predictors genuinely improve the model.

---

### **Example**
- **Model 1**: One predictor, \( R^2 = 0.85 \), Adjusted \( R^2 = 0.83 \).  
- **Model 2**: Three predictors, \( R^2 = 0.87 \), Adjusted \( R^2 = 0.84 \).  

Although Model 2 has a slightly higher R², the Adjusted R² shows that the improvement is marginal, indicating that the additional predictors might not be very useful.

---

### **Summary**
Adjusted R² is a more reliable metric for evaluating regression models with multiple predictors, as it accounts for both model fit and complexity. It discourages overfitting by penalizing unnecessary predictors, making it particularly valuable in multiple regression scenarios.