# Walk-Forward Validation

Walk-Forward Validation (WFV) tests a forecasting model's ability to predict the future by repeatedly splitting the dataset into consecutive training and test sets. In this notebook, we test how well a model forecasts the next 7 days given a fixed-size training window. The window is then moved forward by 7 days and the process repeats until the data is exhausted.

One advantage of WFV is that it measures how well the model predicts future values at many points in time rather than at one. This reduces the bias that comes from validating on a single window, where the model may happen to under- or overperform depending on how the series behaves in that period.
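The sliding-window scheme described above can be sketched as a small generator of index ranges. This is a minimal illustration, not the notebook's own code: the name `walk_forward_splits` and the series length of 141 are assumptions chosen only to show the pattern.

```python
def walk_forward_splits(n_obs, window=120, horizon=7):
    """Yield (train, test) slices for a fixed-size window that slides by `horizon`."""
    start = 0
    # Stop as soon as a full test block no longer fits in the data
    while start + window + horizon <= n_obs:
        train = slice(start, start + window)
        test = slice(start + window, start + window + horizon)
        yield train, test
        start += horizon


# Hypothetical series length, used only for illustration
splits = list(walk_forward_splits(n_obs=141, window=120, horizon=7))
for train, test in splits:
    print(f"train [{train.start}, {train.stop})  test [{test.start}, {test.stop})")
```

Each test block starts exactly where its training window ends, so the model never sees data from the period it is asked to forecast.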

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

## Validation Metric

In this notebook, we use the Median Absolute Percentage Error (MdAPE) to evaluate model performance, although other metrics such as RMSE or MAPE are just as valid; the right choice depends on the use case. Because MdAPE divides by the actual values, it is only meaningful when the series contains no zeros.
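For reference, the metric can be written as follows, where $y_t$ is the actual value and $\hat{y}_t$ the forecast at time $t$ (notation introduced here for illustration only):

$$
\mathrm{MdAPE} = 100 \times \operatorname{median}_{t} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
$$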

In [2]:
def median_absolute_percentage_error(y_true, y_pred):
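    """
    Computes the median absolute percentage error (MdAPE), in percent.
    
    Parameters
    ----------
    y_true : array-like
        Actual values; must not contain zeros
    
    y_pred : array-like
        Predicted values, same length as y_true
    """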
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    return np.median(np.abs((y_true - y_pred) / y_true)) * 100
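
To illustrate how a median-based metric differs from a mean-based one, the sketch below compares MdAPE with MAPE on a small set of made-up values containing one badly missed forecast. The numbers and the `mape` helper are assumptions for illustration; `mdape` simply mirrors the function above so that the block runs on its own.

```python
import numpy as np


def mdape(y_true, y_pred):
    """Median absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.median(np.abs((y_true - y_pred) / y_true)) * 100


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (hypothetical comparison helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100


# Made-up values: four close forecasts and one large miss (the last one)
y_true = [100, 200, 300, 400, 500]
y_pred = [110, 190, 330, 400, 1000]

print("MdAPE:", mdape(y_true, y_pred))  # about 10: the large miss barely matters
print("MAPE: ", mape(y_true, y_pred))   # about 25: the large miss dominates
```

Whether that robustness is desirable depends on the application: if occasional large misses are costly, a mean-based metric may be the more honest choice.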

## General Structure

In [7]:
def walk_forward(X, window=120, horizon=7, debug=False):
    """
    Evaluates model performance using walk-forward validation.
    
    Parameters
    ----------
    X : numpy array, pandas series
        Input array of values for splitting into train-test
        
    window : int
        Size of training window per iteration
        
    horizon : int
        Range of values to forecast
        
    debug : bool
        Set to True if you want to inspect if the data is splitting properly
    """
    
    
    # no. of loops: how many full forecast horizons fit after the first window
    loops = (len(X) - window) // horizon
    
    # position tracker
    start = 0
    end = window
    
    train_performance = []
    test_performance = []
        
    for i in range(loops):

        train = X[start:end]
        test = X[end:end+horizon]
        
        # stop if the test data is cut off
        if len(test) < horizon:
            break
            
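        # Build any forecasting model on `train` here. The lines below assume a
        # statsmodels-style interface and may need to change depending on what
        # the fit method returns.
        
        # model = <your model>(train, ...)
        results = model.fit(disp=-1)
        
        y_fitted = results.fittedvalues
        
        # Place your validation function here. train[1:] assumes the fitted
        # values start at the second observation (e.g. after first differencing);
        # use `train` if the model returns one fitted value per observation.
        mdape_train = median_absolute_percentage_error(train[1:], y_fitted)
        train_performance.append(mdape_train)
        
        y_true = np.array(test)
        # [0] assumes forecast() returns a tuple whose first element holds the
        # point forecasts; drop it if forecast() returns the values directly.
        y_pred = results.forecast(horizon)[0]

        mdape_test = median_absolute_percentage_error(y_true, y_pred)
        test_performance.append(mdape_test)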
        
        if debug:
            print('train')
            display(train)
            print('test')
            display(test)
        
        # update and repeat
        start += horizon
        end += horizon
        
    train_results_mdape = np.mean(np.array(train_performance))
    test_results_mdape = np.mean(np.array(test_performance))
        
    print("total number of forecasts:", loops)
    print("Train MdAPE:", train_results_mdape)
    print("Test MdAPE:", test_results_mdape)
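
Because the template above leaves the model unspecified, it cannot run as written. The sketch below shows one way to make the loop model-agnostic by passing the forecaster in as a function. Everything here is an assumption for illustration: `walk_forward_generic`, the `forecast_fn(train, horizon)` signature, the naive last-value forecaster, and the synthetic series are not part of the original notebook, and the function returns one test score per window instead of printing averages.

```python
import numpy as np


def mdape(y_true, y_pred):
    """Median absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.median(np.abs((y_true - y_pred) / y_true)) * 100


def naive_forecast(train, horizon):
    """Hypothetical stand-in model: repeat the last observed value."""
    return np.full(horizon, train[-1], dtype=float)


def walk_forward_generic(X, forecast_fn, window=120, horizon=7):
    """Return the test MdAPE of each window for any forecast_fn(train, horizon)."""
    X = np.asarray(X, dtype=float)
    scores = []
    # `end` marks the end of the training window; only full test blocks are used
    for end in range(window, len(X) - horizon + 1, horizon):
        train = X[end - window:end]
        test = X[end:end + horizon]
        scores.append(mdape(test, forecast_fn(train, horizon)))
    return np.array(scores)


# Synthetic weekly-patterned series kept well away from zero (illustrative only)
rng = np.random.default_rng(0)
t = np.arange(300)
series = 100 + 10 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 1, size=300)

scores = walk_forward_generic(series, naive_forecast)
print("number of forecasts:", len(scores))
print("mean test MdAPE:", scores.mean())
```

Keeping the per-window scores makes it possible to inspect how the error varies over time before collapsing it into a single average.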

# Walk-Forward: Seasonal ARIMA

Here is an example using Seasonal ARIMA as the forecasting model. 

In [18]:
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.arima.model import ARIMA  # statsmodels.tsa.arima_model.ARIMA was removed in statsmodels 0.13
from pandas.plotting import register_matplotlib_converters

In [50]:
def walk_forward_SARIMA(X, window=120, horizon=7, debug=False):
    """
    Evaluates model performance using walk-forward validation.
    
    Parameters
    ----------
    X : numpy array, pandas series
        Input array of values for splitting into train-test
        
    window : int
        Size of training window per iteration
        
    horizon : int
        Range of values to forecast
        
    debug : bool
        Set to True if you want to inspect if the data is splitting properly
    """
    
    fit_vals = []
    
    # no. of loops: how many full forecast horizons fit after the first window
    loops = (len(X) - window) // horizon
    
    # position tracker
    start = 0
    end = window
    
    train_performance = []
    test_performance = []
        
    for i in range(loops):

        train = X[start:end]
        test = X[end:end+horizon]
        
        # stop if the test data is cut off
        if len(test) < horizon:
            break
        
        model = sm.tsa.statespace.SARIMAX(train, order=(7,1,0),
                                          seasonal_order=(0,0,1,6))
        results = model.fit(disp=-1)
        
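        # in-sample fitted values, one per training observation
        y_fitted = results.fittedvalues
        fit_vals += list(y_fitted[:horizon])
        
        mdape_train = median_absolute_percentage_error(train, y_fitted)
        train_performance.append(mdape_train)
        
        # out-of-sample forecast for the next `horizon` steps
        y_true = np.array(test)
        y_pred = results.forecast(horizon)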

        mdape_test = median_absolute_percentage_error(y_true, y_pred)
        test_performance.append(mdape_test)
        
        if debug:
            print('train')
            display(train)
            print('test')
            display(test)
        
        # update and repeat
        start += horizon
        end += horizon
        
    train_results_mdape = np.mean(np.array(train_performance))
    test_results_mdape = np.mean(np.array(test_performance))
        
    print("total number of forecasts:", loops)
    print("Train MdAPE:", train_results_mdape)
    print("Test MdAPE:", test_results_mdape)
    
    return fit_vals

In [51]:
out = walk_forward_SARIMA(df_x['y'])



total number of forecasts: 41
Train MdAPE: 12.666533059005035
Test MdAPE: 17.29518600784417
