# Evaluation Frameworks: Common Forecast Accuracy Metrics

The goal of any time series forecasting model is to produce accurate forecasts, but how do we measure and compare predictive accuracy? As a first step, we need to define suitable performance metrics that quantify it.

The commonly used accuracy metrics to judge forecasts are:

- Forecast Error
- Mean Absolute Percentage Error (MAPE)
- Mean (Forecast) Error (ME)
- Mean Absolute Error (MAE)
- Mean Percentage Error (MPE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- Lag 1 Autocorrelation of Error (ACF1)
- Correlation between the Actual and the Forecast (corr)
- Min-Max Error (minmax)

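Typically, if you are comparing forecasts of two different series, MAPE, correlation, and min-max error can be used. Why not the other metrics? Because these three are scale-free. MAPE and min-max error express the error relative to the observed values: min-max error lies between 0 and 1 for positive series, and MAPE usually does as well, although it can exceed 1 when the errors are larger than the actual values. Correlation lies between -1 and 1. That way, you can judge how good a forecast is irrespective of the scale of the series.

Most of the other error metrics (ME, MAE, MSE, RMSE) are absolute quantities expressed in the units of the series (squared units for MSE). That implies that an RMSE of 100 for a series whose mean is in the 1000s (roughly 10% of its level) is better than an RMSE of 5 for a series in the 10s (roughly 50% of its level). So you can't really use them to compare the forecasts of two time series on different scales.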
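A minimal sketch of the scale effect described above, assuming two hypothetical series that differ only by a factor of 100. The helper names `rmse` and `mape` and the data are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def rmse(forecast, actual):
    """Root mean squared error, in the units of the series."""
    return np.sqrt(np.mean((forecast - actual) ** 2))

def mape(forecast, actual):
    """Mean absolute percentage error, as a fraction of the actual values."""
    return np.mean(np.abs(forecast - actual) / np.abs(actual))

# Hypothetical data: the "large" pair is the "small" pair multiplied by 100
actual_small = np.array([10.0, 12.0, 11.0, 13.0])
forecast_small = np.array([11.0, 11.0, 12.0, 12.0])
actual_large = actual_small * 100
forecast_large = forecast_small * 100

rmse_small = rmse(forecast_small, actual_small)
rmse_large = rmse(forecast_large, actual_large)
mape_small = mape(forecast_small, actual_small)
mape_large = mape(forecast_large, actual_large)

print(rmse_small, rmse_large)  # RMSE grows with the scale of the series
print(mape_small, mape_large)  # MAPE stays the same up to rounding
```

Both forecasts are equally good in relative terms, yet only the scale-free metric reflects that.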

## Import libraries

In [1]:
import numpy as np
from statsmodels.tsa.stattools import acf


## Forecast Error

Consider a validation dataset with v periods, t = 1, …, v. The forecast error e(t) is defined as the difference between the predicted value y(t) and the observed value x(t) at time period t.

$e(t)=y(t)-x(t)$

The forecast error can be calculated for each prediction. Does this solve our evaluation problem? Not yet: we still have to summarize these errors as a single comparable number.

In [2]:
# Generate dummy data and calculate the forecast errors e(t) = y(t) - x(t)
actual = np.array([1.1, 1.5, 2.2, 2.9, 3.0, 3.6, 4.0, 4.4, 5.0, 5.5])
predicted = np.array([1.2, 1.4, 2.1, 2.6, 3.2, 3.4, 4.2, 4.5, 4.8, 5.1])
# Predicted minus actual, matching the definition of e(t) and forecast_accuracy()
forecast_errors = (predicted - actual).tolist()
print('Forecast Errors: %s' % forecast_errors)

Forecast Errors: [0.09999999999999987, -0.10000000000000009, -0.10000000000000009, -0.2999999999999998, 0.20000000000000018, -0.20000000000000018, 0.20000000000000018, 0.09999999999999964, -0.20000000000000018, -0.40000000000000036]
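As a rough illustration of why the individual errors need summarizing with care, the sketch below averages them in two ways. It assumes the same dummy data and the definition e(t) = y(t) - x(t); the variable names are illustrative. Signed errors partly cancel, so the mean error mainly indicates bias, while the mean absolute error reflects the typical size of a miss.

```python
import numpy as np

actual = np.array([1.1, 1.5, 2.2, 2.9, 3.0, 3.6, 4.0, 4.4, 5.0, 5.5])
predicted = np.array([1.2, 1.4, 2.1, 2.6, 3.2, 3.4, 4.2, 4.5, 4.8, 5.1])

# Forecast error per period: predicted minus actual
errors = predicted - actual

# Signed average: positive and negative errors offset each other
mean_error = errors.mean()

# Average magnitude: the sign is discarded before averaging
mean_abs_error = np.abs(errors).mean()

print('ME: %.2f, MAE: %.2f' % (mean_error, mean_abs_error))
```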


## Other Accuracy Metrics
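For reference, with e(t) = y(t) - x(t) over the v validation periods, the summary metrics computed in this section can be written as follows. This is one common convention, and the min-max form simply mirrors the code in this section rather than a standard textbook definition.

$$
\begin{aligned}
\text{ME} &= \frac{1}{v}\sum_{t=1}^{v} e(t) &
\text{MAE} &= \frac{1}{v}\sum_{t=1}^{v} \lvert e(t) \rvert \\
\text{MPE} &= \frac{1}{v}\sum_{t=1}^{v} \frac{e(t)}{x(t)} &
\text{MAPE} &= \frac{1}{v}\sum_{t=1}^{v} \left\lvert \frac{e(t)}{x(t)} \right\rvert \\
\text{MSE} &= \frac{1}{v}\sum_{t=1}^{v} e(t)^2 &
\text{RMSE} &= \sqrt{\text{MSE}} \\
\text{minmax} &= 1 - \frac{1}{v}\sum_{t=1}^{v} \frac{\min\big(x(t),\, y(t)\big)}{\max\big(x(t),\, y(t)\big)}
\end{aligned}
$$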

In [3]:
# Accuracy metrics
def forecast_accuracy(forecast, actual):
    """Return common accuracy metrics for a forecast as a dict."""
    # Forecast error e(t) = y(t) - x(t), computed once and reused
    error = forecast - actual

    # Mean Absolute Percentage Error (MAPE), as a fraction (x100 for percent)
    mape = np.mean(np.abs(error) / np.abs(actual))

    # Mean (Forecast) Error (ME)
    me = np.mean(error)

    # Mean Absolute Error (MAE)
    mae = np.mean(np.abs(error))

    # Mean Percentage Error (MPE), as a fraction
    mpe = np.mean(error / actual)

    # Mean Squared Error (MSE)
    mse = np.mean(error ** 2)

    # Root Mean Squared Error (RMSE)
    rmse = mse ** 0.5

    # Correlation between the Actual and the Forecast (corr)
    corr = np.corrcoef(forecast, actual)[0, 1]

    # Lag 1 Autocorrelation of Error (ACF1)
    acf1 = acf(error, nlags=1)[1]

    # Min-Max Error (minmax): 1 minus the mean ratio of the element-wise
    # minimum to the element-wise maximum of forecast and actual
    mins = np.minimum(forecast, actual)
    maxs = np.maximum(forecast, actual)
    minmax = 1 - np.mean(mins / maxs)

    # output
    return {'mape': mape,
            'me': me,
            'mae': mae,
            'mpe': mpe,
            'mse': mse,
            'rmse': rmse,
            'acf1': acf1,
            'corr': corr,
            'minmax': minmax}

In [4]:
forecast_accuracy(predicted, actual)



{'acf1': -0.15685785536159605,
 'corr': 0.9897750981698683,
 'mae': 0.19000000000000006,
 'mape': 0.061415534656913975,
 'me': -0.07000000000000009,
 'minmax': 0.05995269194407116,
 'mpe': -0.015354928596307932,
 'mse': 0.04500000000000004,
 'rmse': 0.21213203435596434}
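To make the `acf1` entry less of a black box, here is a small NumPy-only sketch of the lag-1 autocorrelation of the errors. It assumes the usual estimator: the errors are demeaned, and the sum of products of neighbouring errors is divided by the sum of squared errors. The name `acf1_manual` is illustrative. On the dummy data it should agree with the `acf1` value reported above to within floating-point rounding.

```python
import numpy as np

actual = np.array([1.1, 1.5, 2.2, 2.9, 3.0, 3.6, 4.0, 4.4, 5.0, 5.5])
predicted = np.array([1.2, 1.4, 2.1, 2.6, 3.2, 3.4, 4.2, 4.5, 4.8, 5.1])

# Forecast errors, centered on their mean
errors = predicted - actual
centered = errors - errors.mean()

# Lag-1 autocorrelation: co-movement of each error with the previous one,
# normalized by the total squared deviation of the errors
acf1_manual = np.sum(centered[1:] * centered[:-1]) / np.sum(centered ** 2)

print('ACF1: %.6f' % acf1_manual)
```

A value near zero suggests that consecutive errors carry little information about each other. A value far from zero would hint at structure the model has not captured.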