#### Common Evaluation Metrics for MLR

1. R² Score (Coefficient of Determination)
- Tells us what proportion of the variation in the target variable is explained by the model.
- % variance explained
- Ideal Value - Closer to 1.
- Value usually lies between 0 and 1; on test data it can be negative when the model predicts worse than always predicting the mean.
- Use Case - Model fit quality.

2. RMSE (Root Mean Squared Error)
- Lower is better.
- Square root of the average squared error, expressed in the same units as the target.
- Ideal Value - Closer to 0.
- Use Case - Penalizes large errors.

3. MAE (Mean Absolute Error)
- Average absolute error.
- Ideal Value - Closer to 0.
- Use Case - Simpler error interpretation.

4. MAPE (Mean Absolute Percentage Error)
- Average % error relative to the actual values (undefined when an actual value is 0).
- Ideal Value - Lower %.
- Use Case - Explaining % error to an audience.
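
The four metrics above can be computed directly from their definitions. The block below is a minimal sketch using NumPy only; the function name `regression_metrics` and the sample values are made up for illustration. scikit-learn offers equivalents in `sklearn.metrics` (for example `r2_score` and `mean_absolute_error`), which are not used here so that the example stays dependency-light.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R², RMSE, MAE and MAPE (in %) for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    ss_res = np.sum(residuals ** 2)                  # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    r2 = 1.0 - ss_res / ss_tot

    rmse = np.sqrt(np.mean(residuals ** 2))
    mae = np.mean(np.abs(residuals))
    # MAPE divides by the actual values, so it assumes none of them is 0.
    mape = np.mean(np.abs(residuals / y_true)) * 100.0

    return {"r2": r2, "rmse": rmse, "mae": mae, "mape": mape}

# Made-up actual and predicted values, for illustration only.
y_actual = [100.0, 200.0, 300.0, 400.0]
y_predicted = [110.0, 190.0, 320.0, 380.0]
metrics = regression_metrics(y_actual, y_predicted)
print(metrics)
```

On this toy data RMSE comes out slightly larger than MAE, because squaring gives the two errors of 20 more weight than the two errors of 10.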


Notes: 
- Always evaluate the model on test data.
- Use multiple metrics to get a complete picture.
- RMSE penalizes large errors more; MAE is easier to explain.
- R² shows explanatory power, not accuracy.

Use these metrics to select and justify your regression model before making business decisions or deploying it.

- A residual is the difference between the actual value and the predicted value.
- Residual = Y actual - Y predicted

### What does a good residual plot look like?
- Residuals should be randomly scattered around the horizontal line at 0.
- No clear pattern should be visible.
- This means: your model's errors are evenly distributed - a sign of good fit.
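
A residual plot of this kind could be produced as sketched below. The data is synthetic and the helper names (`fit_linear`, `predict_linear`, `plot_residuals`) are assumptions made for this example; the fit uses NumPy's least-squares solver, and matplotlib is imported only inside the plotting helper.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predictions for the coefficients returned by fit_linear."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def plot_residuals(y_pred, residuals):
    """Scatter residuals against predictions with a reference line at 0."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    fig, ax = plt.subplots()
    ax.scatter(y_pred, residuals, alpha=0.6)
    ax.axhline(0, color="red", linestyle="--")
    ax.set_xlabel("Predicted value")
    ax.set_ylabel("Residual (actual - predicted)")
    return ax

# Synthetic data that really is linear in two features, plus constant-variance noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 1.0, size=200)

coef = fit_linear(X, y)
y_pred = predict_linear(coef, X)
residuals = y - y_pred   # Residual = Y actual - Y predicted
```

Calling `plot_residuals(y_pred, residuals)` followed by `plt.show()` should give a structureless cloud around 0 here, since the data was generated from a linear model.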

#### If residuals do not look random - 
- You may have missed a pattern in the data.
- If residuals increase or decrease systematically, it may mean: 
    - You are missing a nonlinear relationship.
    - Your model may be underfitting or overfitting.


#### If Residuals show a systematic Pattern - 
- A clear curve, slope, or funnel shape in the residuals means your model is violating key assumptions of linear regression, especially linearity and homoscedasticity (constant variance).
- This usually points to - 
    - A nonlinear relationship that linear regression can't capture.
    - Missing variables or interactions in the model.
    - Possible underfitting.
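
The curve and funnel shapes can also be hinted at numerically. The sketch below is a rough heuristic on synthetic data, not a formal test: it fits a straight line to a curved relationship and to data whose noise grows with `x`, then measures two correlations (`curve_signal` and `funnel_signal`, names invented here) that would both be close to 0 for well-behaved residuals.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=500)

def straight_line_residuals(x, y):
    """Fit y = slope * x + intercept by least squares; return (predictions, residuals)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    y_pred = slope * x + intercept
    return y_pred, y - y_pred

# Case 1: curved relationship fitted with a straight line.
y_curved = 1.0 + 0.5 * x ** 2 + rng.normal(0, 1.0, size=x.size)
pred_c, res_c = straight_line_residuals(x, y_curved)
# A missed curvature makes residuals follow the squared, centered predictions.
curve_signal = np.corrcoef((pred_c - pred_c.mean()) ** 2, res_c)[0, 1]

# Case 2: linear relationship, but the noise grows with x (funnel shape).
y_funnel = 2.0 + 3.0 * x + rng.normal(0, 1.0, size=x.size) * x
pred_f, res_f = straight_line_residuals(x, y_funnel)
# Non-constant variance makes the size of the residuals grow with the predictions.
funnel_signal = np.corrcoef(pred_f, np.abs(res_f))[0, 1]

print(f"curve signal:  {curve_signal:.2f}")
print(f"funnel signal: {funnel_signal:.2f}")
```

These numbers are only a complement to looking at the plot; a pattern that is obvious to the eye may not show up in a single correlation.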

#### What can you do to fix it - 
- 
    1. Add Polynomial Terms - If the residuals follow a curved pattern (U-shaped or inverted U), the relationship is nonlinear.
    2. Apply Feature Transformation - 
        - If residuals increase or decrease systematically, apply transformations like:
             - log or sqrt for right-skewed variables.
             - Box-Cox for automatic transformation (`scipy.stats.boxcox`; requires strictly positive values).
    3. Add Missing Features or Interaction Terms
        - Sometimes the model misses important variables or their interactions.
    4. Use a more Flexible Model 
        - If linear regression is still struggling:
            - Try Decision Trees, Random Forests, or Gradient Boosting.
        - These handle nonlinearities and interactions automatically.
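
The first three fixes can be illustrated on synthetic data, as in the sketch below. All data-generating formulas and the helper `ols` are assumptions made for this example; Box-Cox and tree-based models are left out to keep the sketch NumPy-only.

```python
import numpy as np

def ols(A, y):
    """Least-squares fit on design matrix A; returns (coefficients, R²)."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
    return coef, r2

rng = np.random.default_rng(2)
n = 300
ones = np.ones(n)

# 1. Polynomial term for a U-shaped relationship.
x = rng.uniform(-3, 3, size=n)
y_u = 2.0 + x ** 2 + rng.normal(0, 0.5, size=n)
_, r2_line = ols(np.column_stack([ones, x]), y_u)
_, r2_quad = ols(np.column_stack([ones, x, x ** 2]), y_u)

# 2. Log transform for a right-skewed target with multiplicative noise:
#    log(y) is linear in z, so a plain linear fit recovers the slope 0.8.
z = rng.uniform(0, 5, size=n)
y_exp = np.exp(0.5 + 0.8 * z + rng.normal(0, 0.3, size=n))
coef_log, _ = ols(np.column_stack([ones, z]), np.log(y_exp))

# 3. Interaction term: the effect of x1 depends on x2.
x1 = rng.uniform(-2, 2, size=n)
x2 = rng.uniform(-2, 2, size=n)
y_int = 1.0 + x1 + x2 + 2.0 * x1 * x2 + rng.normal(0, 0.5, size=n)
_, r2_additive = ols(np.column_stack([ones, x1, x2]), y_int)
_, r2_interaction = ols(np.column_stack([ones, x1, x2, x1 * x2]), y_int)

print(f"straight line R² = {r2_line:.3f}, with x² term R² = {r2_quad:.3f}")
print(f"slope recovered on log scale = {coef_log[1]:.3f}")
print(f"additive R² = {r2_additive:.3f}, with interaction R² = {r2_interaction:.3f}")
```

In each case the model is still linear in its coefficients; only the inputs (or the target) are changed so that the linear form matches the data.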
    

#### Note - 
- If your residuals show a curved pattern or increase with predictions, your model is likely missing something.
- You can fix this by adding polynomial terms, transforming variables, or switching to a nonlinear model.
- Residual plots aren't just diagnostic - they guide how to improve the model.

#### If the Residuals suggest Underfitting or Overfitting - 
- 
    1. Underfitting - 
        - Clues: Residuals show a clear pattern (e.g. curve, slope).
        - R² is low on both training and test sets.
        - Model is too simple to capture the true data structure.
        - Common Causes: 
            - Using only linear terms when the relationship is nonlinear.
            - Leaving out important features.
        - Remedies:
            - Add polynomial features (e.g. degree 2 or 3).
            - Include interaction terms.
            - Apply log/sqrt transformations.
            - Switch to a more complex model.

    2. Overfitting - 
        - Residuals look random on training data, but prediction errors are high on test data.
        - High R² on training data, low R² on test data.
        - Too many predictors or an overly complex model.
        - Common Causes:
            - Model memorizes noise instead of learning generalizable patterns.
        - Remedies:
            - Simplify the model (remove unnecessary features).
            - Use regularization: Ridge or Lasso Regression.
            - Apply cross-validation to detect overfitting early.
            - Gather more data if possible.
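
The train/test comparison described above can be sketched as follows. Everything here is illustrative: the data is synthetic (a quadratic relationship with noise), the training set is deliberately tiny so that a degree-12 polynomial can overfit, and Ridge is implemented in closed form with an arbitrarily chosen penalty of 0.1 rather than through a library.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination computed from its definition."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def poly_features(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def fit(A, y, ridge_lambda=0.0):
    """Least squares; if ridge_lambda > 0, add an L2 penalty on every
    coefficient except the intercept (closed-form Ridge)."""
    if ridge_lambda == 0.0:
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        return coef
    penalty = ridge_lambda * np.eye(A.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)

rng = np.random.default_rng(3)

def make_data(n):
    """Synthetic quadratic relationship with constant-variance noise."""
    x = rng.uniform(-1, 1, size=n)
    y = 1.0 + 2.0 * x - 3.0 * x ** 2 + rng.normal(0, 0.5, size=n)
    return x, y

x_train, y_train = make_data(15)    # deliberately small training set
x_test, y_test = make_data(500)

configs = [("linear", 1, 0.0), ("quadratic", 2, 0.0),
           ("degree 12", 12, 0.0), ("degree 12 + ridge", 12, 0.1)]
scores = {}
for label, degree, lam in configs:
    coef = fit(poly_features(x_train, degree), y_train, ridge_lambda=lam)
    scores[label] = (
        r2_score(y_train, poly_features(x_train, degree) @ coef),
        r2_score(y_test, poly_features(x_test, degree) @ coef),
    )

for label, (train_r2, test_r2) in scores.items():
    print(f"{label:>18}: train R² = {train_r2:.3f}, test R² = {test_r2:.3f}")
```

The straight line is the underfitting case (mediocre R² on both sets), the unregularized degree-12 fit is the overfitting case (training R² near 1, much worse test R²), and the Ridge penalty is one way to rein the latter in. In practice the penalty strength would be chosen by cross-validation rather than fixed by hand.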

        


