# Error Metrics

## Regression
* Mean Absolute Percentage Error (MAPE): MAPE expresses the error as a percentage of the actual values. This makes it independent of the scale of the data, so it is easy to interpret and to compare across datasets or models with different units.
  * MAPE is unbounded: it becomes arbitrarily large when y_true is close to zero (and is undefined when y_true equals zero), or when abs(y_true - y_pred) is large. The latter is common to most regression metrics.

$$
\text{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{A_i - P_i}{A_i} \right| \times 100\%
$$

where $A_i$ is the actual value, $P_i$ is the prediction, and $n$ is the number of observations.


* Modified MAPE (MMAPE) or Adjusted MAPE: To address the issue of small actual values, a small constant $\epsilon$ is added to the denominator so that the metric stays finite when an actual value is zero. The result can still be very large when $\epsilon$ is small relative to the error, so the choice of $\epsilon$ matters.

$$
\text{MMAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{A_i - P_i}{A_i + \epsilon} \right| \times 100\%
$$

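* Symmetric MAPE (sMAPE): An alternative to MAPE that uses the average of the absolute actual and predicted values as the denominator. This keeps each term between 0% and 200% and avoids the blow-up when actual values are near zero (the term is undefined only when both values are zero). Despite its name, sMAPE is not fully symmetric: for the same absolute error, an under-forecast gives a higher value than an over-forecast.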

$$
\text{sMAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{2 \times \left| A_i - P_i \right|}{\left| A_i \right| + \left| P_i \right|} \times 100\%
$$
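
The formula can be checked on a single observation. The sketch below evaluates one sMAPE term for an over-forecast and an under-forecast of the same absolute size; the helper name `smape_single` and the numbers are illustrative assumptions, not part of the original notes.

```python
def smape_single(actual, predicted):
    """One sMAPE term (in percent) for a single observation."""
    return 2 * abs(actual - predicted) / (abs(actual) + abs(predicted)) * 100

# Same absolute error of 10 around an actual value of 100.
over = smape_single(100, 110)   # prediction too high
under = smape_single(100, 90)   # prediction too low

print(f"over-forecast:  {over:.2f}%")
print(f"under-forecast: {under:.2f}%")
```

The two terms differ because the denominator depends on the prediction as well as on the actual value.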


```python
import numpy as np
import pandas as pd

# Toy data; the first row has actual == 0 to show how each metric behaves there.
df = pd.DataFrame({
    'actual': [0, 100, 200, 300, 400, 500],
    'predicted': [100, 110, 210, 320, 380, 480]
})

# Per-row percentage errors (not yet averaged over rows).
df['MAPE'] = abs((df['actual'] - df['predicted']) / df['actual']) * 100
df['MMAPE'] = abs((df['actual'] - df['predicted']) / (df['actual'] + 1e-8)) * 100
df['SMAPE'] = 2 * abs(df['actual'] - df['predicted']) / (abs(df['actual']) + abs(df['predicted'])) * 100

df
```

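|   | actual | predicted | MAPE | MMAPE | SMAPE |
|---|--------|-----------|------|-------|-------|
| 0 | 0 | 100 | inf | 1000000000000.0 | 200.0 |
| 1 | 100 | 110 | 10.0 | 10.0 | 9.52381 |
| 2 | 200 | 210 | 5.0 | 5.0 | 4.878049 |
| 3 | 300 | 320 | 6.666667 | 6.666667 | 6.451613 |
| 4 | 400 | 380 | 5.0 | 5.0 | 5.128205 |
| 5 | 500 | 480 | 4.0 | 4.0 | 4.081633 |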
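
The values above are per-row errors, while the formulas average over all $n$ observations. A minimal sketch of the aggregated metrics follows. The function names `mape`, `mmape`, and `smape` are hypothetical helpers, the default `eps=1e-8` mirrors the constant used in the cell above, and the row with a zero actual value is left out so that plain MAPE stays finite.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs((actual - predicted) / actual)) * 100

def mmape(actual, predicted, eps=1e-8):
    """MAPE with a small constant eps added to the denominator."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs((actual - predicted) / (actual + eps))) * 100

def smape(actual, predicted):
    """Symmetric MAPE, in percent (each term lies in [0, 200])."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    denom = np.abs(actual) + np.abs(predicted)
    return np.mean(2 * np.abs(actual - predicted) / denom) * 100

# Same data as above, without the row where actual == 0.
actual = [100, 200, 300, 400, 500]
predicted = [110, 210, 320, 380, 480]

mape_value = mape(actual, predicted)
mmape_value = mmape(actual, predicted)
smape_value = smape(actual, predicted)

print(f"MAPE:  {mape_value:.4f}%")
print(f"MMAPE: {mmape_value:.4f}%")
print(f"sMAPE: {smape_value:.4f}%")
```

Each result is the mean of the corresponding column in rows 1 to 5 of the table.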




* Mean Squared Error (MSE): Useful when large errors are particularly undesirable. Because the errors are squared before averaging, larger errors have a disproportionately larger impact on the metric, and the result is expressed in squared units of the target.
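
The effect of squaring can be illustrated with two sets of predictions that have the same mean absolute error but different MSE. This is a sketch under assumed toy data; the helper names `mse` and `mae` and the arrays `pred_even` and `pred_spiky` are hypothetical.

```python
import numpy as np

def mse(actual, predicted):
    """Mean squared error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean((actual - predicted) ** 2)

def mae(actual, predicted):
    """Mean absolute error, used here only for comparison."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(actual - predicted))

actual = [100, 200, 300, 400]
pred_even = [110, 210, 310, 410]    # four errors of 10
pred_spiky = [100, 200, 300, 440]   # one error of 40, the rest exact

mae_even, mae_spiky = mae(actual, pred_even), mae(actual, pred_spiky)
mse_even, mse_spiky = mse(actual, pred_even), mse(actual, pred_spiky)

print(f"MAE: even={mae_even}, spiky={mae_spiky}")
print(f"MSE: even={mse_even}, spiky={mse_spiky}")
```

Both prediction sets have the same MAE, but the single large error makes the MSE of `pred_spiky` four times that of `pred_even`.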