# 🧪 Regression Metrics Analysis

In this notebook, we explore and compare various **regression evaluation metrics** using `scikit-learn`.  
Through different test cases, we highlight how metrics like **MAE, RMSE, R², EVS, MAPE, SMAPE, WAPE, RMSLE** behave under perfect predictions, outliers, constant shifts, and edge conditions.

### ⚙️ Importing Libraries & Environment Setup

In [248]:
import numpy as np
import pandas as pd
from colorama import Fore, Style
from numpy.typing import NDArray
from sklearn.metrics import (
    explained_variance_score,
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
    root_mean_squared_error,
    root_mean_squared_log_error,
)

In [249]:
%matplotlib inline

pd.set_option("display.width", 150)
pd.set_option("display.max_rows", 100)
pd.set_option("display.max_columns", None)

### 🧰 Utility Functions

In [None]:
def highlight_invalid_regression_metrics(column: pd.Series) -> list[str]:
    """Returns a list of CSS styles to highlight regression metric values.

    Highlights depend on the metric type (column name) and value ranges:
    - Green: good values,
    - Orange: warning,
    - Red: poor values.

    Args:
        column (pd.Series): Metric values.

    Returns:
        list[str]: CSS styles for each cell in the column.
    """
    metric = column.name
    highlight_styles = []
    for val in column:
        if val == "NaN" or (isinstance(val, float) and np.isnan(val)):
            highlight_styles.append("")
            continue

        try:
            val_float = float(val)
        except (ValueError, TypeError):
            highlight_styles.append("")
            continue

        # R² / EVS
        if metric in {"R²", "EVS"}:
            if val_float < 0.5:
                style = "background-color: red; color: white"
            elif val_float <= 0.8:
                style = "background-color: orange; color: white"
            else:
                style = "background-color: green; color: white"

        # MAPE / SMAPE / WAPE / RMSLE
        elif metric in {"MAPE", "SMAPE", "RMSLE", "WAPE"}:
            if val_float > 0.2:
                style = "background-color: red; color: white"
            elif val_float >= 0.1:
                style = "background-color: orange; color: white"
            else:
                style = "background-color: green; color: white"

        else:
            style = ""

        highlight_styles.append(style)

    return highlight_styles

In [251]:
def symmetric_mean_absolute_percentage_error(
    y_true: NDArray[np.float64], y_pred: NDArray[np.float64]
) -> float:
    """Compute the Symmetric Mean Absolute Percentage Error.

    Args:
        y_true (NDArray[np.float64]): Ground truth (actual) values.
        y_pred (NDArray[np.float64]): Predicted values.

    Returns:
        float: SMAPE value (between 0 and 2). Multiply by 100 to get percentage.
    """
    diff = 2 * np.abs(y_true - y_pred)

    denominator = np.abs(y_true) + np.abs(y_pred)
    denominator = np.where(denominator == 0, 1, denominator)  # Avoid division by zero

    return float(np.mean(diff / denominator))

In [252]:
def evaluate_regression_metrics(
    y_true: NDArray[np.float64], y_pred: NDArray[np.float64]
) -> pd.io.formats.style.Styler:
    """Evaluate regression metrics for a single prediction result.

    Args:
        y_true (NDArray[np.float64]): Ground truth values.
        y_pred (NDArray[np.float64]): Predicted values.

    Returns:
        pd.io.formats.style.Styler: Styler object with highlighted problematic metrics.
    """
    metrics = {
        "MAE": mean_absolute_error(y_true, y_pred),
        "MSE": mean_squared_error(y_true, y_pred),
        "RMSE": root_mean_squared_error(y_true, y_pred),
        "R²": r2_score(y_true, y_pred),
        "EVS": explained_variance_score(y_true, y_pred),
        "MAPE": mean_absolute_percentage_error(y_true, y_pred),
        "SMAPE": symmetric_mean_absolute_percentage_error(y_true, y_pred),
        "RMSLE": root_mean_squared_log_error(y_true, y_pred),
        "WAPE": np.sum(np.abs(y_true - y_pred)) / np.sum(np.abs(y_true)),
    }

    def clean_number(val: float) -> str:
        """Format a numeric metric value into a string.

        - Returns "NaN" if the value is NaN.
        - Converts to integer string if the value is integer.
        - Otherwise, rounds to 4 decimal places and converts to string.

        Args:
            val (float): The numeric value to format.

        Returns:
            str: Formatted string representation of the number.
        """
        if np.isnan(val):
            return "NaN"
        if float(val).is_integer():
            return str(int(val))
        return str(round(val, 4))

    df = pd.DataFrame([metrics], index=["Value"])
    df = df.map(clean_number)
    df = df.style.apply(highlight_invalid_regression_metrics, axis=0)
    return df

### 📊 Comparing Metrics Across Test Cases

#### Perfect Prediction

In [253]:
y_true = np.array([1, 2, 3, 4, 5])
y_pred = np.array([1, 2, 3, 4, 5])
evaluate_regression_metrics(y_true, y_pred)

| | MAE | MSE | RMSE | R² | EVS | MAPE | SMAPE | RMSLE | WAPE |
|---|---|---|---|---|---|---|---|---|---|
| Value | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |


**Conclusion:**  
All metrics return ideal values:
- MAE, MSE, RMSE = 0
- R² = EVS = 1 → perfect model, explains 100% of the variance
- MAPE, SMAPE, RMSLE, WAPE = 0

#### Small, Uniform Errors

In [254]:
y_true = np.array([100, 200, 300, 400, 500])
y_pred = np.array([110, 190, 310, 390, 510])
evaluate_regression_metrics(y_true, y_pred)

| | MAE | MSE | RMSE | R² | EVS | MAPE | SMAPE | RMSLE | WAPE |
|---|---|---|---|---|---|---|---|---|---|
| Value | 10 | 100 | 10 | 0.995 | 0.9952 | 0.0457 | 0.0449 | 0.0522 | 0.0333 |


**Conclusion:**
- MAE = RMSE = 10 → every absolute error has the same magnitude.
- R² ≈ 0.995 and EVS ≈ 0.995 → strong fit.
- MAPE ≈ 4.6% and WAPE ≈ 3.3% → both very good.
- RMSLE is also low (≈ 0.05), indicating only a small relative deviation.
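
To make the difference between MAPE and WAPE concrete, here is a minimal NumPy sketch that recomputes four of the values above by hand. It is an illustration under the definitions used in this notebook, not a replacement for the `scikit-learn` functions; the variable names are our own.

```python
import numpy as np

y_true = np.array([100, 200, 300, 400, 500], dtype=float)
y_pred = np.array([110, 190, 310, 390, 510], dtype=float)

abs_err = np.abs(y_true - y_pred)  # every entry is 10

mae = abs_err.mean()
rmse = np.sqrt((abs_err**2).mean())

# MAPE averages the per-point ratios; WAPE divides the totals.
mape = (abs_err / np.abs(y_true)).mean()
wape = abs_err.sum() / np.abs(y_true).sum()

print(f"MAE={mae:.4f} RMSE={rmse:.4f} MAPE={mape:.4f} WAPE={wape:.4f}")
```

Because all absolute errors are identical, the mean and the root mean square coincide. MAPE comes out higher than WAPE because the smallest target (100) carries the largest relative error.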

#### Outlier in One Prediction

In [255]:
y_true = np.array([100, 200, 300, 400, 500])
y_pred = np.array([110, 190, 310, 390, 1000])  # large outlier at the end
evaluate_regression_metrics(y_true, y_pred)

| | MAE | MSE | RMSE | R² | EVS | MAPE | SMAPE | RMSLE | WAPE |
|---|---|---|---|---|---|---|---|---|---|
| Value | 108 | 50080 | 223.7856 | -1.504 | -1.004 | 0.2417 | 0.1743 | 0.3138 | 0.36 |


**Conclusion:**
- RMSE increases sharply → confirms its sensitivity to outliers.
- R² and EVS become negative → the model performs worse than predicting the mean.
- MAPE (≈ 24%) and WAPE (36%) worsen significantly.
- RMSLE is heavily affected by the outlier.
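
The sensitivity of RMSE to the outlier can be quantified by asking how much of each error total comes from the last point alone. The sketch below is a hand computation with NumPy; `outlier_share_abs` and `outlier_share_sq` are names introduced here for illustration.

```python
import numpy as np

y_true = np.array([100, 200, 300, 400, 500], dtype=float)
y_pred = np.array([110, 190, 310, 390, 1000], dtype=float)

err = y_pred - y_true
sq_err = err**2

mae = np.abs(err).mean()
rmse = np.sqrt(sq_err.mean())

# Share of the total absolute / squared error caused by the last point.
outlier_share_abs = np.abs(err[-1]) / np.abs(err).sum()
outlier_share_sq = sq_err[-1] / sq_err.sum()

print(f"MAE={mae:.1f}, RMSE={rmse:.4f}")
print(f"outlier share: absolute={outlier_share_abs:.4f}, squared={outlier_share_sq:.4f}")
```

Squaring pushes the outlier's share of the total error even closer to 1, which is why RMSE ends up at roughly twice the MAE here.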

#### Division by Zero in MAPE

In [256]:
y_true = np.array([0, 1, 2, 3, 4])
y_pred = np.array([0.1, 1.2, 2.1, 2.9, 4.2])
evaluate_regression_metrics(y_true, y_pred)

| | MAE | MSE | RMSE | R² | EVS | MAPE | SMAPE | RMSLE | WAPE |
|---|---|---|---|---|---|---|---|---|---|
| Value | 0.14 | 0.022 | 0.1483 | 0.989 | 0.994 | 90071992547410 | 0.4627 | 0.0655 | 0.07 |


**Conclusion:**
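- MAPE returns a huge value (≈ 9 × 10¹³) because the first target is zero. `scikit-learn` clamps the denominator to machine epsilon instead of dividing by zero, so the result is finite but meaningless.
- SMAPE is inflated as well (≈ 0.46): the zero-target point contributes the maximum per-point value of 2.
- RMSLE handles zeros because it applies log(1 + x), but its inputs must stay above -1 (see the RMSLE test case below).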
- R², EVS, MAE, RMSE behave normally.

⚠️ MAPE is not safe when `y_true` contains zeros.
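
The huge MAPE value can be reproduced by hand. The sketch below assumes that the denominator `|y_true|` is clamped to the float64 machine epsilon, which is our reading of how `scikit-learn` avoids the division by zero. The masked variant `mape_nonzero` is a hypothetical workaround introduced here, not a library function, and it silently ignores the zero-target point.

```python
import numpy as np

y_true = np.array([0, 1, 2, 3, 4], dtype=float)
y_pred = np.array([0.1, 1.2, 2.1, 2.9, 4.2])

eps = np.finfo(np.float64).eps  # about 2.22e-16

# Assumption: zero targets are replaced by eps in the denominator.
ratios = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), eps)
mape_clamped = ratios.mean()

# Hypothetical alternative: average only over non-zero targets.
mask = y_true != 0
mape_nonzero = (np.abs(y_pred - y_true)[mask] / np.abs(y_true)[mask]).mean()

print(f"clamped MAPE:  {mape_clamped:.4e}")
print(f"non-zero MAPE: {mape_nonzero:.4f}")
```

A single zero target dominates the clamped average completely, while the remaining four points on their own give a MAPE of roughly 8%.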

#### Large Value Masks Small Errors (WAPE is Misleading)

In [257]:
y_true = np.array([1, 1, 1, 1, 1000])
y_pred = np.array([0, 0, 0, 0, 900])
evaluate_regression_metrics(y_true, y_pred)

| | MAE | MSE | RMSE | R² | EVS | MAPE | SMAPE | RMSLE | WAPE |
|---|---|---|---|---|---|---|---|---|---|
| Value | 20.8 | 2000.8 | 44.7303 | 0.9875 | 0.9902 | 0.82 | 1.6211 | 0.6218 | 0.1036 |


**Conclusion:**
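- WAPE is ~10% → looks acceptable, yet each of the four small targets has a 100% error.
- WAPE underrepresents large relative errors on small targets when the total is dominated by large values.
- R² ≈ 0.99 and EVS ≈ 0.99 look excellent too, while MAPE (82%) and SMAPE (≈ 1.62) expose the problem.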
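
One way to see why WAPE hides the small-value errors is to rewrite it as a weighted mean of the per-point percentage errors, with weights proportional to `|y_true|`. The sketch below checks that identity numerically; `weights` and `per_point` are illustrative names, not part of any library.

```python
import numpy as np

y_true = np.array([1, 1, 1, 1, 1000], dtype=float)
y_pred = np.array([0, 0, 0, 0, 900], dtype=float)

abs_err = np.abs(y_true - y_pred)
per_point = abs_err / np.abs(y_true)  # 1.0 for the small targets, 0.1 for the large one

wape = abs_err.sum() / np.abs(y_true).sum()

# WAPE equals the |y_true|-weighted average of the per-point errors.
weights = np.abs(y_true) / np.abs(y_true).sum()
wape_weighted = (weights * per_point).sum()

print("per-point errors:", per_point)
print("weights:", np.round(weights, 4))
print(f"WAPE={wape:.4f}, weighted mean={wape_weighted:.4f}")
```

The large target receives more than 99% of the weight, so its 10% error determines the result almost entirely.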

#### Predicting the Mean

In [258]:
y_true = np.array([10, 20, 30, 40, 50])
y_pred = np.array([30, 30, 30, 30, 30])
evaluate_regression_metrics(y_true, y_pred)

| | MAE | MSE | RMSE | R² | EVS | MAPE | SMAPE | RMSLE | WAPE |
|---|---|---|---|---|---|---|---|---|---|
| Value | 12 | 200 | 14.1421 | 0 | 0 | 0.63 | 0.4371 | 0.557 | 0.4 |


**Conclusion:**
- MAE is the mean absolute deviation of `y_true` from its mean, and RMSE equals its (population) standard deviation.
- R² = 0 → the model is no better than simply predicting the mean.
- EVS = 0 → the model explains none of the variance in the data.
- MAPE (63%) and WAPE (40%) are high.

⚠️ **R² is not an error — it's a measure of explained variance** and can be misleading on constant predictions.
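
The R² = 0 result follows directly from the definition R² = 1 - SS_res / SS_tot: when the prediction is the mean of `y_true`, the two sums of squares are identical. A minimal sketch, with `ss_res` and `ss_tot` as our own names for those sums:

```python
import numpy as np

y_true = np.array([10, 20, 30, 40, 50], dtype=float)
y_pred = np.full_like(y_true, y_true.mean())  # always predict the mean (30)

ss_res = ((y_true - y_pred) ** 2).sum()         # residual sum of squares
ss_tot = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares

r2 = 1 - ss_res / ss_tot
rmse = np.sqrt(ss_res / len(y_true))

print(f"SS_res={ss_res}, SS_tot={ss_tot}, R2={r2}")
print(f"RMSE={rmse:.4f}, std(y_true)={y_true.std():.4f}")
```

The same computation shows that RMSE coincides with the population standard deviation of `y_true` in this case.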

#### RMSLE Fails on Negative Values

In [259]:
y_true = np.array([10, 20, 30])
y_pred = np.array([12, -5, 25])

try:
    evaluate_regression_metrics(y_true, y_pred)
except ValueError as e:
    print(f"{Fore.RED}Error: {e}{Style.RESET_ALL}")

Error: Root Mean Squared Logarithmic Error cannot be used when targets contain values less than or equal to -1.


**Conclusion:**
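- RMSLE cannot be computed here because `y_pred` contains -5.
- `scikit-learn` raises a `ValueError`: according to the message above, no value may be less than or equal to -1, where log(1 + x) is undefined.

⚠️ Use caution when applying RMSLE: it is intended for non-negative targets and predictions.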
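
If negative predictions can occur, one option is to clip them to zero before scoring. The sketch below uses a hand-written helper, `rmsle_sketch`, which is hypothetical and only mimics the domain check described in the error message. Clipping is a design choice that changes what is being measured, so treat the resulting score as an approximation.

```python
import numpy as np


def rmsle_sketch(y_true, y_pred) -> float:
    """Hypothetical RMSLE helper: sqrt(mean((log1p(pred) - log1p(true))^2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # log1p(x) = log(1 + x) is undefined for x <= -1.
    if np.any(y_true <= -1) or np.any(y_pred <= -1):
        raise ValueError("RMSLE is undefined for values <= -1")
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))


y_true = np.array([10, 20, 30])
y_pred = np.array([12, -5, 25])

# Assumed workaround: treat negative predictions as zero.
clipped = np.clip(y_pred, 0, None)
score = rmsle_sketch(y_true, clipped)
print(f"RMSLE after clipping: {score:.4f}")
```

Even after clipping, the score stays high because predicting 0 for a target of 20 is a large error on the log scale.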

#### Constant Shift (+20)

In [260]:
y_true = np.array([10, 20, 30, 40, 50])
y_pred = np.array([30, 40, 50, 60, 70])
evaluate_regression_metrics(y_true, y_pred)

| | MAE | MSE | RMSE | R² | EVS | MAPE | SMAPE | RMSLE | WAPE |
|---|---|---|---|---|---|---|---|---|---|
| Value | 20 | 400 | 20 | -1 | 1 | 0.9133 | 0.58 | 0.6382 | 0.6667 |


**Conclusion:**
- MAE and RMSE = 20 → every prediction is off by the same amount.
- R² = -1 → the squared error is twice the variance of `y_true`, so the model performs worse than predicting the mean.
- EVS = 1 → the errors have zero variance. EVS ignores the constant bias, and the predictions reproduce the shape of the target exactly.
- MAPE ≈ 91% → significant relative error.
- RMSLE is large → strong mismatch in scale.
- WAPE ≈ 67% → total absolute error relative to the sum of `|y_true|`.
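
The gap between R² = -1 and EVS = 1 comes entirely from the constant bias. Since MSE = Var(residual) + bias², the two scores differ by bias² / Var(y_true). The sketch below verifies this by hand; `bias`, `var_res`, and `var_y` are names chosen here for illustration.

```python
import numpy as np

y_true = np.array([10, 20, 30, 40, 50], dtype=float)
y_pred = y_true + 20  # constant shift

residual = y_true - y_pred
bias = residual.mean()     # -20: the systematic offset
var_res = residual.var()   # 0: the residuals do not vary
var_y = y_true.var()       # 200 (population variance)
mse = (residual**2).mean()

evs = 1 - var_res / var_y  # ignores the bias
r2 = 1 - mse / var_y       # penalizes the bias

print(f"EVS={evs}, R2={r2}, bias^2/Var(y)={bias**2 / var_y}")
```

A large difference between EVS and R² is therefore a quick indicator of systematic over- or under-prediction.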

#### Small Values Dominate MAPE

In [261]:
y_true = np.array([1, 10, 100])
y_pred = np.array([2, 11, 101])  # constant error +1
evaluate_regression_metrics(y_true, y_pred)

| | MAE | MSE | RMSE | R² | EVS | MAPE | SMAPE | RMSLE | WAPE |
|---|---|---|---|---|---|---|---|---|---|
| Value | 1 | 1 | 1 | 0.9995 | 1 | 0.37 | 0.2573 | 0.2395 | 0.027 |


**Conclusion:**
- MAE = RMSE = 1 → errors are consistent and small in absolute terms.
- MAPE = 37% → heavily influenced by the first point (1 → 2 is 100% error).
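- RMSLE (≈ 0.24) is inflated as well, since the log transform also emphasizes relative error on small values.
- WAPE (≈ 2.7%) and R² (≈ 1) stay excellent because they are dominated by the largest target.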

⚠️ MAPE overreacts to small values — even tiny absolute errors can create large percentages.
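
To see how strongly the first point drives the result, we can break MAPE down into per-point contributions. This is a small hand computation with NumPy; `ape` and `share` are illustrative names.

```python
import numpy as np

y_true = np.array([1, 10, 100], dtype=float)
y_pred = np.array([2, 11, 101], dtype=float)  # constant error +1

ape = np.abs(y_true - y_pred) / np.abs(y_true)  # absolute percentage errors
mape = ape.mean()

# Fraction of the MAPE total contributed by each point.
share = ape / ape.sum()

print("per-point APE:", ape)
print(f"MAPE={mape:.2f}")
print("share of total:", np.round(share, 4))
```

The smallest target accounts for about 90% of the total, even though all three absolute errors are identical.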