# Regression Error Measures

### Scale-dependent errors:

- **Mean Absolute Error (MAE):**
  $$
  MAE = \frac{1}{N} \sum_{n=1}^{N} |y_n - \hat{y}_n|
  $$

- **Root Mean Square Error (RMSE):**
  $$
  RMSE = \sqrt{\frac{1}{N} \sum_{n=1}^{N} (y_n - \hat{y}_n)^2}
  $$

### Percentage errors are independent of scale:

- **Mean Absolute Percentage Error (MAPE):**
  $$
  MAPE = \frac{1}{N} \sum_{n=1}^{N} \left| \frac{y_n - \hat{y}_n}{y_n} \right|
  $$

---

## Fix MAPE

- Percentage errors are **undefined** when \( y_n = 0 \) and produce **extreme values** when \( y_n \approx 0 \).
- A solution is to use **scaled errors** as an alternative to percentage errors.

- **Mean Absolute Scaled Error (MASE):**
  $$
  MASE = \frac{1}{N} \sum_{n=1}^{N} \left| \frac{y_n - \hat{y}_n}{\frac{1}{N-1} \sum_{n'=2}^{N} |y_{n'} - y_{n'-1}|} \right|
  $$

- Both the numerator and denominator are on the **same scale**, but MASE **cannot be perceived percentually**.



In [76]:
import numpy as np
import matplotlib.pyplot as plt

In [77]:
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

In [78]:
X, y = make_regression()
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8)

In [79]:
model = LinearRegression()
model.fit(X_train, y_train)
y_hat = model.predict(X_test)

- **Mean Absolute Error (MAE):**
  $$
  MAE = \frac{1}{N} \sum_{n=1}^{N} |y_n - \hat{y}_n|
  $$

In [80]:
def MAE(y: np.ndarray,
        y_hat: np.ndarray) -> float:
    return np.mean(np.abs(y - y_hat))

- **Root Mean Square Error (RMSE):**
  $$
  RMSE = \sqrt{\frac{1}{N} \sum_{n=1}^{N} (y_n - \hat{y}_n)^2}
  $$


In [81]:
def RMSE(y: np.ndarray,
         y_hat: np.ndarray) -> float:
    return np.sqrt(np.mean((y - y_hat) ** 2))

- **Mean Absolute Percentage Error (MAPE):**
  $$
  MAPE = \frac{1}{N} \sum_{n=1}^{N} \left| \frac{y_n - \hat{y}_n}{y_n} \right|
  $$

In [82]:
def MAPE(y: np.ndarray,
        y_hat: np.ndarray) -> float:
    return np.mean(np.abs((y - y_hat) / y_hat))

- **Mean Absolute Scaled Error (MASE):**
  $$
  MASE = \frac{1}{N} \sum_{n=1}^{N} \left| \frac{y_n - \hat{y}_n}{\frac{1}{N-1} \sum_{n'=2}^{N} |y_{n'} - y_{n'-1}|} \right|
  $$

In [83]:
def MASE(y: np.ndarray,
        y_hat: np.ndarray) -> float:
    div = np.mean(np.abs(y[1:] - y[:-1]))
    return np.mean(np.abs((y - y_hat) / div))

In [84]:
mae = MAE(y_test, y_hat)
rmse = RMSE(y_test, y_hat)
mape = MAPE(y_test, y_hat)
mase = MASE(y_test, y_hat)

In [85]:
print(f'MAE = {mae}')
print(f'RMSE = {rmse}')
print(f'MAPE = {mape}')
print(f'MASE = {mase}')

MAE = 73.30664801427926
RMSE = 91.99270529959463
MAPE = 0.8962516177978974
MASE = 0.428911837835573
