# (13) Vector Autoregressive Model with Elastic Net Estimation (VAR_Elastic_Net)

A vector autoregressive (VAR) model with $p$ lags is defined by 

$$
Y_{t} = c + \sum_{i=1}^{p} \Phi_{i}Y_{t-i} + e_{t},
$$

where $Y_{t}$ is an $8 \times 1$ vector of endogenous variables, $c$ is an $8 \times 1$ vector of equation constants, each $\Phi_{i}$ is an $8 \times 8$ matrix of coefficients estimated from the data, and $e_{t}$ is an $8 \times 1$ vector of forecast errors.
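The VAR equation above can be read as a set of regressions of $Y_{t}$ on its own stacked lags. The following is a minimal NumPy sketch of that stacking, assuming the data sit in a $T \times k$ array; the helper name `build_var_design` and the toy array are illustrative and do not appear in the notebook.

```python
import numpy as np

def build_var_design(Y, p):
    """Return (targets, lagged) for a VAR(p).

    Row r of `targets` is Y[p + r]; row r of `lagged` is
    [Y[p + r - 1], Y[p + r - 2], ..., Y[r]] stacked side by side.
    """
    T, k = Y.shape
    targets = Y[p:]
    # Lag 1 block first, then lag 2, and so on (each block has k columns).
    lagged = np.hstack([Y[p - i:T - i] for i in range(1, p + 1)])
    return targets, lagged

# Toy data (hypothetical): 6 periods, 2 variables.
Y = np.arange(12, dtype=float).reshape(6, 2)
targets, lagged = build_var_design(Y, p=2)
print(targets.shape, lagged.shape)
```

Each row of `lagged` then holds the regressors for all $k$ equations in that period, which is why a single lagged matrix can be reused across every equation of the system.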

Elastic Net estimation is applied separately to each equation of the VAR. For each equation, it minimizes the sum of squared one-step-ahead forecast errors plus a penalty on the sum of absolute coefficients (L1) and the sum of squared coefficients (L2). The L1 term pushes the coefficients of less important predictors to exactly zero, so elastic net performs variable selection. The L2 term shrinks the coefficients of correlated predictors toward one another, so elastic net works well in cases of multicollinearity. For a single equation with coefficients $a_{1},...,a_{n_{a}}$, the loss function is:

$$
L(a_{1},...,a_{n_{a}}) = \sum_{t}(Y_{t+1} - Y_{t+1|t})^{2} + \lambda_{1}\sum_{j=1}^{n_{a}}|a_{j}| + \lambda_{2}\sum_{j=1}^{n_{a}}a_{j}^{2}
$$
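As a worked illustration of the loss function above, the sketch below evaluates each of its three terms for a given set of forecasts and coefficients. The function name `elastic_net_loss` and the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def elastic_net_loss(y, y_hat, coefs, lambda_1, lambda_2):
    """Sum of squared errors plus L1 and L2 penalties on the coefficients."""
    y, y_hat, coefs = (np.asarray(v, dtype=float) for v in (y, y_hat, coefs))
    sse = np.sum((y - y_hat) ** 2)                 # squared forecast errors
    l1 = lambda_1 * np.sum(np.abs(coefs))          # lasso-type penalty
    l2 = lambda_2 * np.sum(coefs ** 2)             # ridge-type penalty
    return sse + l1 + l2

# Hypothetical values: SSE = 0.5, L1 term = 0.1 * 3, L2 term = 0.01 * 5.
loss = elastic_net_loss(y=[1.0, 2.0], y_hat=[0.5, 2.5],
                        coefs=[1.0, -2.0], lambda_1=0.1, lambda_2=0.01)
print(round(loss, 4))
```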

The lag length $p$ is set long enough to return white noise residuals. Reasonable penalty parameters ($\lambda_{1},\lambda_{2}$) are chosen by minimizing the validation set root mean squared error (RMSE). Model validation uses an 80-20 split of the in-sample data: the initial model is estimated on the first 80%, and walk-forward cross-validation is carried out on the remaining 20%. The code below holds the lag length fixed and re-estimates the VAR after each period of the validation set, so the model weights always reflect the most recent information. Each $h$-step-ahead forecast is produced by iterating the fitted linear model forward. In the code below, the word "test" refers to the validation set and not to an out-of-sample test set.
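The walk-forward scheme can be sketched in a few lines. In the example below, ordinary least squares (via `np.linalg.lstsq`) stands in for the elastic net fit so that the sketch needs only NumPy; this is a substitution for illustration, not the estimator used in this notebook. The function name `walk_forward_rmse` and the simulated data are likewise assumptions.

```python
import numpy as np

def walk_forward_rmse(X, y, train_size):
    """Expanding-window, one-step-ahead validation.

    At step t the model is refit on the first train_size + t rows and
    used to predict row train_size + t.
    """
    preds = []
    for t in range(len(y) - train_size):
        end = train_size + t
        A = np.column_stack([np.ones(end), X[:end]])      # add an intercept
        beta, *_ = np.linalg.lstsq(A, y[:end], rcond=None)
        x_next = np.concatenate([[1.0], X[end]])
        preds.append(x_next @ beta)
    preds = np.array(preds)
    rmse = np.sqrt(np.mean((y[train_size:] - preds) ** 2))
    return rmse, preds

# Simulated, noiseless linear data (hypothetical), 80-20 split.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 1.0 + X @ np.array([0.5, -0.2, 0.1])
rmse, preds = walk_forward_rmse(X, y, train_size=40)
print(len(preds), rmse < 1e-8)
```

Because the training window grows by one observation per step, every validation forecast uses all information available up to the period being predicted.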

In the Python Scikit-Learn library, the elastic net loss function is reparameterized as follows:

$$
L(a_{1},...,a_{n_{a}}) = \frac{1}{2T}\sum_{t}(y_{t+1} - f_{t+1|t})^{2} + \alpha \lambda_{1}^{Ratio} \sum_{j=1}^{n_{a}}|a_{j}| + \frac{\alpha (1-\lambda_{1}^{Ratio})}{2}\sum_{j=1}^{n_{a}}a_{j}^{2}
$$
where $y_{t+1}$ is the observed value, $f_{t+1|t}$ is the one-step-ahead forecast, and $T$ is the number of training observations. Up to the scaling constants $1/(2T)$ on the squared-error term and $1/2$ on the L2 term, $\alpha = \lambda_{1} + \lambda_{2}$ and $\lambda_{1}^{Ratio} = \lambda_{1}/(\lambda_{1} + \lambda_{2})$. Here, $\alpha$ is a homogeneous hyperparameter that controls the overall strength of the penalty: doubling $\alpha$ doubles both the L1 and the L2 penalty weights. The Elastic Net mixture is controlled by the hyperparameter $\lambda_{1}^{Ratio}$. If $\lambda_{1}^{Ratio} = 0$, then the Elastic Net loss function equals the Ridge Regression loss function. If $\lambda_{1}^{Ratio} = 1$, then the Elastic Net loss function equals the Lasso Regression loss function. To keep both penalties active, we restrict the search to $\alpha > 0$ and $0 < \lambda_{1}^{Ratio} < 1$.
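The mapping between the two parameterizations is simple arithmetic, sketched below using the definitions of $\alpha$ and $\lambda_{1}^{Ratio}$ given in the text. The helper names are hypothetical; they are not Scikit-Learn functions.

```python
def lambdas_to_alpha_ratio(lambda_1, lambda_2):
    """Map (lambda_1, lambda_2) to (alpha, l1_ratio) as defined in the text."""
    alpha = lambda_1 + lambda_2
    if alpha <= 0:
        raise ValueError("lambda_1 + lambda_2 must be positive")
    return alpha, lambda_1 / alpha

def alpha_ratio_to_lambdas(alpha, l1_ratio):
    """Inverse mapping: recover (lambda_1, lambda_2)."""
    return alpha * l1_ratio, alpha * (1.0 - l1_ratio)

# Hypothetical penalty weights.
alpha, l1_ratio = lambdas_to_alpha_ratio(0.06, 0.02)
lambda_1, lambda_2 = alpha_ratio_to_lambdas(alpha, l1_ratio)
print(alpha, l1_ratio, lambda_1, lambda_2)
```

The second helper mirrors the two lines at the top of the MODEL function that compute `lambda_1` and `lambda_2` from `alpha` and `l1_ratio`.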

The first block of code defines two functions. The DataSpace function takes the data (both the target series and the remaining predictors) and the number of lags to use during estimation (p). It returns a dataframe containing the current period's variables (current_data), a dataframe containing the lagged data (lagged_data), the number of observations in the training set (train_size), the number of observations in the test set (test_size), and the number of variables in the system (features).

The MODEL function takes eight arguments. The current data to be forecasted is passed through the current_data argument, and the dataframe containing the lagged predictors through the lagged_data argument. The number of observations in the training set is set using the train_size argument, and the number of observations in the test set using the test_size argument. The number of variables is defined by the features argument. The regularization parameters are set using the alpha (penalty strength) and l1_ratio (mixture) arguments. Lastly, the forecast horizon is defined by step_size.

During the regularization parameter grid search, the MODEL function returns the training and validation set RMSE values. Once reasonable regularization parameters have been selected, the MODEL function also returns the training and validation set predicted values. Accordingly, the first block of code grid searches a region of the hyperparameter space to identify reasonable regularization parameters, and the second block of code sets those parameters in the model and returns the forecasts.
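The iterated multi-step forecast inside MODEL can be hard to follow in its unrolled form, so here is a compact sketch of the same idea: forecast all variables one step ahead, place those forecasts at the front of the lag vector, drop the oldest lag, and repeat. The names `iterate_forecast` and `toy_predict` and the coefficient matrix `Phi` are hypothetical; the notebook performs the update with `np.insert` and slicing rather than with this helper.

```python
import numpy as np

def iterate_forecast(state, predict_all, k, steps):
    """Iterate a fitted VAR forward.

    state       : 1-D array [Y_{t-1}, Y_{t-2}, ..., Y_{t-p}] stacked (length k * p)
    predict_all : function mapping the state to the k one-step forecasts
    """
    state = np.asarray(state, dtype=float).copy()
    path = []
    for _ in range(steps):
        y_next = np.asarray(predict_all(state), dtype=float)
        path.append(y_next)
        # Newest values go first; the oldest lag block falls off the end.
        state = np.concatenate([y_next, state[:-k]])
    return np.vstack(path)

# Toy VAR(2) with k = 2 variables and made-up coefficients (no intercept).
Phi = np.array([[0.5, 0.0, 0.1, 0.0],
                [0.0, 0.4, 0.0, 0.2]])

def toy_predict(state):
    return Phi @ state

path = iterate_forecast([1.0, 2.0, 0.0, 0.0], toy_predict, k=2, steps=2)
print(path)
```

The last row of `path` corresponds to the $h$-step-ahead forecast when `steps` equals $h$, which is the value MODEL stores for the target series.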

In [None]:
# Load Library:
from pandas import read_csv
import pandas as pd
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from matplotlib import pyplot
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
# Function to Create VAR Feature Space:
def DataSpace(data, p = 36):
    # Initial Training & Test Set Sizes:
    train_size = int(data.shape[0]*0.8 - p)
    test_size = data.shape[0] - train_size - p
    features = data.shape[1]
    # Compute Lagged DataFrame:
    names = data.columns
    for i in range(1,p+1,1):
        for j in range(features):
            names = np.append(names, names[j]+'_L'+str(i))
            data[names[j]+'_L'+str(i)] = data[names[j]].shift(i)
    data = data.dropna()
    # Break in to Current and Lagged Space:
    current_data = data.iloc[:, 0:features]
    lagged_data = data.iloc[:, features:]
    return current_data, lagged_data, train_size, test_size, features
# Function to Fit Model using Walk-Forward Cross-Validation:
def MODEL(current_data, lagged_data, train_size, test_size, features, alpha = 1.0, l1_ratio = 0.5, step_size = 1):
    # Solving the Hyperparameters:
    lambda_1 = alpha*l1_ratio
    lambda_2 = alpha*(1-l1_ratio)
    # Extracting Data:
    index_values = current_data.index.values
    # Storage & Model Estimation:
    test_pred = []
    name = 'VAR-Type Elastic Net Regression'
    print('-'*len(name))
    print(name)
    print('-'*len(name))
    print('Alpha (Penalty Strength) Hyperparameter Value: ', alpha)
    print('L1 Ratio (Mixture) Hyperparameter Value: ', l1_ratio)
    for t in range(test_size - step_size + 1):
        # Tracking Convergence:
        print('Test Set Walk Forward: Iteration '+str(t+1))
        # Define Walk Forward Training Sets:
        feature_space_train = lagged_data.values[:train_size+t, :]
        RHP_train = current_data.values[:train_size+t, 0]
        DSPIC96_train = current_data.values[:train_size+t, 1]
        CPIAUCSL_train = current_data.values[:train_size+t, 2]
        REALSP500_train = current_data.values[:train_size+t, 3]
        CUSR0000SEHA_train = current_data.values[:train_size+t, 4]
        UNRATE_train = current_data.values[:train_size+t, 5]
        RMORTGAGE_train = current_data.values[:train_size+t, 6]
        TWEXAFEGSMTHx_train = current_data.values[:train_size+t, 7]
        # Define Walk Forward Test Sets:
        feature_space_test = lagged_data.values[train_size+t:, :]
        RHP_test = current_data.values[train_size+t:, 0]
        DSPIC96_test = current_data.values[train_size+t:, 1]
        CPIAUCSL_test = current_data.values[train_size+t:, 2]
        REALSP500_test = current_data.values[train_size+t:, 3]
        CUSR0000SEHA_test = current_data.values[train_size+t:, 4]
        UNRATE_test = current_data.values[train_size+t:, 5]
        RMORTGAGE_test = current_data.values[train_size+t:, 6]
        TWEXAFEGSMTHx_test = current_data.values[train_size+t:, 7]
        # Fit Model to Training Set: RHP Equation 
        RHP_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        RHP_model.fit(X = feature_space_train, y = RHP_train)
        # Fit Model to Training Set: DSPIC96 Equation
        DSPIC96_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        DSPIC96_model.fit(X = feature_space_train, y = DSPIC96_train)
        # Fit Model to Training Set: CPIAUCSL Equation
        CPIAUCSL_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        CPIAUCSL_model.fit(X = feature_space_train, y = CPIAUCSL_train)
        # Fit Model to Training Set: REALSP500 Equation
        REALSP500_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        REALSP500_model.fit(X = feature_space_train, y = REALSP500_train)
        # Fit Model to Training Set: CUSR0000SEHA Equation
        CUSR0000SEHA_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        CUSR0000SEHA_model.fit(X = feature_space_train, y = CUSR0000SEHA_train)
        # Fit Model to Training Set: UNRATE Equation
        UNRATE_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        UNRATE_model.fit(X = feature_space_train, y = UNRATE_train)
        # Fit Model to Training Set: RMORTGAGE Equation
        RMORTGAGE_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        RMORTGAGE_model.fit(X = feature_space_train, y = RMORTGAGE_train)
        # Fit Model to Training Set: TWEXAFEGSMTHx Equation
        TWEXAFEGSMTHx_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        TWEXAFEGSMTHx_model.fit(X = feature_space_train, y = TWEXAFEGSMTHx_train) 
        # Forecast Storage:
        forecast_storage = lagged_data.values[train_size+t,:]
        RHP_horizons = []
        DSPIC96_horizons = []
        CPIAUCSL_horizons = []
        REALSP500_horizons =[]
        CUSR0000SEHA_horizons = []
        UNRATE_horizons = []
        RMORTGAGE_horizons = []
        TWEXAFEGSMTHx_horizons = []
        for h in range(step_size):
            # Storing Iterative Forecasts:
            RHP_horizons = np.append(RHP_horizons, RHP_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            DSPIC96_horizons = np.append(DSPIC96_horizons, DSPIC96_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            CPIAUCSL_horizons = np.append(CPIAUCSL_horizons, CPIAUCSL_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            REALSP500_horizons = np.append(REALSP500_horizons, REALSP500_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            CUSR0000SEHA_horizons = np.append(CUSR0000SEHA_horizons, CUSR0000SEHA_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            UNRATE_horizons = np.append(UNRATE_horizons, UNRATE_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            RMORTGAGE_horizons = np.append(RMORTGAGE_horizons, RMORTGAGE_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            TWEXAFEGSMTHx_horizons = np.append(TWEXAFEGSMTHx_horizons, TWEXAFEGSMTHx_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            # Updating Predictor Space:
            forecast_storage = np.insert(forecast_storage, 0, RHP_horizons[h])
            forecast_storage = np.insert(forecast_storage, 1, DSPIC96_horizons[h])
            forecast_storage = np.insert(forecast_storage, 2, CPIAUCSL_horizons[h])
            forecast_storage = np.insert(forecast_storage, 3, REALSP500_horizons[h])
            forecast_storage = np.insert(forecast_storage, 4, CUSR0000SEHA_horizons[h])
            forecast_storage = np.insert(forecast_storage, 5, UNRATE_horizons[h])
            forecast_storage = np.insert(forecast_storage, 6, RMORTGAGE_horizons[h])
            forecast_storage = np.insert(forecast_storage, 7, TWEXAFEGSMTHx_horizons[h])
        # Store Forecasted Values:
        test_pred = np.append(test_pred, RHP_horizons[step_size - 1])
        # Store Training Predictions:
        if t == 0:
            train_pred = RHP_model.predict(X = feature_space_train)
            train_RMSE = np.sqrt(mean_squared_error(RHP_train, train_pred))
    # Model Evaluation:
    test_RMSE = np.sqrt(mean_squared_error(current_data.values[train_size + step_size - 1:, 0], test_pred))
    return train_RMSE, test_RMSE, lambda_1, lambda_2 
# Setting Seed:
np.random.seed(12345)
# Load Data:
data = read_csv('Milunovich_National.csv', header = 0, index_col = 0, parse_dates = True)
data.index = pd.DatetimeIndex(data.index.values, freq = "MS")
# Create VAR-Type Feature Space:
AR_Lags = 36
current_data, lagged_data, train_size, test_size, features = DataSpace(data, p = AR_Lags)
# Storage for Results & Hyperparameters:
Results = pd.DataFrame(columns = ['Lags', 'Alpha', 'L1_Ratio', 'Lambda_1', 'Lambda_2', 'Train_RMSE', 'Test_RMSE'])
alpha = np.arange(0.060,0.080,0.001)
l1_ratio = np.arange(0.980,0.999,0.001)
horizons = 1
for lr in l1_ratio:
    for a in alpha:
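        try:
            train_RMSE, test_RMSE, lambda_1, lambda_2 = MODEL(current_data, lagged_data, train_size, test_size, features, alpha = a, l1_ratio = lr, step_size = horizons)
            model_performance = {'Lags':AR_Lags, 'Alpha':a, 'L1_Ratio':lr, 'Lambda_1':lambda_1, 'Lambda_2':lambda_2, 'Train_RMSE':train_RMSE, 'Test_RMSE':test_RMSE}
            # DataFrame.append was removed in pandas 2.0; pd.concat is used instead:
            Results = pd.concat([Results, pd.DataFrame([model_performance])], ignore_index = True)
        except Exception as error:
            # Report the failed grid point instead of silently skipping it:
            print('Grid point skipped (alpha = %s, l1_ratio = %s): %s' % (a, lr, error))
            continue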

The second block of code re-estimates the model using the regularization parameters that produced the lowest validation set RMSE in the grid search.

In [None]:
# Load Library:
from pandas import read_csv
import pandas as pd
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from matplotlib import pyplot
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
# Function to Create VAR Feature Space:
def DataSpace(data, p = 36):
    # Initial Training & Test Set Sizes:
    train_size = int(data.shape[0]*0.8 - p)
    test_size = data.shape[0] - train_size - p
    features = data.shape[1]
    # Compute Lagged DataFrame:
    names = data.columns
    for i in range(1,p+1,1):
        for j in range(features):
            names = np.append(names, names[j]+'_L'+str(i))
            data[names[j]+'_L'+str(i)] = data[names[j]].shift(i)
    data = data.dropna()
    # Break in to Current and Lagged Space:
    current_data = data.iloc[:, 0:features]
    lagged_data = data.iloc[:, features:]
    return current_data, lagged_data, train_size, test_size, features
# Function to Fit Model using Walk-Forward Cross-Validation:
def MODEL(current_data, lagged_data, train_size, test_size, features, alpha = 1.0, l1_ratio = 0.5, step_size = 1):
    # Solving the Hyperparameters:
    lambda_1 = alpha*l1_ratio
    lambda_2 = alpha*(1-l1_ratio)
    # Extracting Data:
    index_values = current_data.index.values
    # Storage & Model Estimation:
    test_pred = []
    name = 'VAR-Type Elastic Net Regression'
    print('-'*len(name))
    print(name)
    print('-'*len(name))
    print('Alpha (Penalty Strength) Hyperparameter Value: ', alpha)
    print('L1 Ratio (Mixture) Hyperparameter Value: ', l1_ratio)
    for t in range(test_size - step_size + 1):
        # Tracking Convergence:
        print('Test Set Walk Forward: Iteration '+str(t+1))
        # Define Walk Forward Training Sets:
        feature_space_train = lagged_data.values[:train_size+t, :]
        RHP_train = current_data.values[:train_size+t, 0]
        DSPIC96_train = current_data.values[:train_size+t, 1]
        CPIAUCSL_train = current_data.values[:train_size+t, 2]
        REALSP500_train = current_data.values[:train_size+t, 3]
        CUSR0000SEHA_train = current_data.values[:train_size+t, 4]
        UNRATE_train = current_data.values[:train_size+t, 5]
        RMORTGAGE_train = current_data.values[:train_size+t, 6]
        TWEXAFEGSMTHx_train = current_data.values[:train_size+t, 7]
        # Define Walk Forward Test Sets:
        feature_space_test = lagged_data.values[train_size+t:, :]
        RHP_test = current_data.values[train_size+t:, 0]
        DSPIC96_test = current_data.values[train_size+t:, 1]
        CPIAUCSL_test = current_data.values[train_size+t:, 2]
        REALSP500_test = current_data.values[train_size+t:, 3]
        CUSR0000SEHA_test = current_data.values[train_size+t:, 4]
        UNRATE_test = current_data.values[train_size+t:, 5]
        RMORTGAGE_test = current_data.values[train_size+t:, 6]
        TWEXAFEGSMTHx_test = current_data.values[train_size+t:, 7]
        # Fit Model to Training Set: RHP Equation 
        RHP_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        RHP_model.fit(X = feature_space_train, y = RHP_train)
        # Fit Model to Training Set: DSPIC96 Equation
        DSPIC96_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        DSPIC96_model.fit(X = feature_space_train, y = DSPIC96_train)
        # Fit Model to Training Set: CPIAUCSL Equation
        CPIAUCSL_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        CPIAUCSL_model.fit(X = feature_space_train, y = CPIAUCSL_train)
        # Fit Model to Training Set: REALSP500 Equation
        REALSP500_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        REALSP500_model.fit(X = feature_space_train, y = REALSP500_train)
        # Fit Model to Training Set: CUSR0000SEHA Equation
        CUSR0000SEHA_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        CUSR0000SEHA_model.fit(X = feature_space_train, y = CUSR0000SEHA_train)
        # Fit Model to Training Set: UNRATE Equation
        UNRATE_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        UNRATE_model.fit(X = feature_space_train, y = UNRATE_train)
        # Fit Model to Training Set: RMORTGAGE Equation
        RMORTGAGE_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        RMORTGAGE_model.fit(X = feature_space_train, y = RMORTGAGE_train)
        # Fit Model to Training Set: TWEXAFEGSMTHx Equation
        TWEXAFEGSMTHx_model = ElasticNet(alpha = alpha, l1_ratio = l1_ratio, random_state = 1)
        TWEXAFEGSMTHx_model.fit(X = feature_space_train, y = TWEXAFEGSMTHx_train) 
        # Forecast Storage:
        forecast_storage = lagged_data.values[train_size+t,:]
        RHP_horizons = []
        DSPIC96_horizons = []
        CPIAUCSL_horizons = []
        REALSP500_horizons =[]
        CUSR0000SEHA_horizons = []
        UNRATE_horizons = []
        RMORTGAGE_horizons = []
        TWEXAFEGSMTHx_horizons = []
        for h in range(step_size):
            # Storing Iterative Forecasts:
            RHP_horizons = np.append(RHP_horizons, RHP_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            DSPIC96_horizons = np.append(DSPIC96_horizons, DSPIC96_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            CPIAUCSL_horizons = np.append(CPIAUCSL_horizons, CPIAUCSL_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            REALSP500_horizons = np.append(REALSP500_horizons, REALSP500_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            CUSR0000SEHA_horizons = np.append(CUSR0000SEHA_horizons, CUSR0000SEHA_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            UNRATE_horizons = np.append(UNRATE_horizons, UNRATE_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            RMORTGAGE_horizons = np.append(RMORTGAGE_horizons, RMORTGAGE_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            TWEXAFEGSMTHx_horizons = np.append(TWEXAFEGSMTHx_horizons, TWEXAFEGSMTHx_model.predict(X = forecast_storage[0:lagged_data.shape[1]].reshape(1,lagged_data.shape[1])))
            # Updating Predictor Space:
            forecast_storage = np.insert(forecast_storage, 0, RHP_horizons[h])
            forecast_storage = np.insert(forecast_storage, 1, DSPIC96_horizons[h])
            forecast_storage = np.insert(forecast_storage, 2, CPIAUCSL_horizons[h])
            forecast_storage = np.insert(forecast_storage, 3, REALSP500_horizons[h])
            forecast_storage = np.insert(forecast_storage, 4, CUSR0000SEHA_horizons[h])
            forecast_storage = np.insert(forecast_storage, 5, UNRATE_horizons[h])
            forecast_storage = np.insert(forecast_storage, 6, RMORTGAGE_horizons[h])
            forecast_storage = np.insert(forecast_storage, 7, TWEXAFEGSMTHx_horizons[h])
        # Store Forecasted Values:
        test_pred = np.append(test_pred, RHP_horizons[step_size - 1])
        # Store Training Predictions:
        if t == 0:
            train_pred = RHP_model.predict(X = feature_space_train)
            train_RMSE = np.sqrt(mean_squared_error(RHP_train, train_pred))
    # Model Evaluation:
    test_RMSE = np.sqrt(mean_squared_error(current_data.values[train_size + step_size - 1:, 0], test_pred))
    train_pred = pd.DataFrame(train_pred, index = index_values[:train_size], columns = ['train_pred'])
    test_pred = pd.DataFrame(test_pred, index = index_values[train_size + step_size - 1:], columns = ['test_pred'])
    return train_RMSE, test_RMSE, train_pred, test_pred, lambda_1, lambda_2 
# Setting Seed:
np.random.seed(12345)
# Load Data:
data = read_csv('Milunovich_National.csv', header = 0, index_col = 0, parse_dates = True)
data.index = pd.DatetimeIndex(data.index.values, freq = "MS")
# Create VAR-Type Feature Space:
# Select the Grid Point with the Lowest Validation RMSE (sort once, reuse the row):
best_model = Results.sort_values(by = 'Test_RMSE', ascending = True).iloc[0]
# Cast to int so the lag length can be used in range():
AR_Lags = int(best_model['Lags'])
current_data, lagged_data, train_size, test_size, features = DataSpace(data, p = AR_Lags)
target_series = current_data.iloc[:,0]
# Set the Selected Hyperparameters:
alpha = float(best_model['Alpha'])
l1_ratio = float(best_model['L1_Ratio'])
horizons = 1
# Evaluate Model:
train_RMSE, test_RMSE, train_pred, test_pred, lambda_1, lambda_2 = MODEL(current_data, lagged_data, train_size, test_size, features, alpha = alpha, l1_ratio = l1_ratio, step_size = horizons)

The third block presents and graphs the stored output from the MODEL function. The MODEL above is fit to housing price data in order to forecast real housing price growth rates at the U.S. national level.

In [None]:
# Evaluate Model: Growth Rates
print('-----------------------------')
print('National Housing Price Series')
print('-----------------------------')
print('Data Type: Growth Rates')
print('Model Type: VAR-Type Elastic Net Regression')
print('Alpha (Strength) Hyperparameter: ', alpha)
print('L1 Ratio (Mixture) Hyperparameter: ', l1_ratio)
print('Lambda 1 (L1) Hyperparameter: ', lambda_1)
print('Lambda 2 (L2) Hyperparameter: ', lambda_2)
print('Train RMSE: %.3f' % (train_RMSE))
print('Test RMSE: %.3f' % (test_RMSE))
# Plot Forecast: Growth Rates
sns.set_theme(style = 'whitegrid')
pyplot.figure(figsize = (12,6))
pyplot.plot(target_series, label = 'Observed')
pyplot.plot(train_pred, label = 'VAR_Elastic_Net: Train')
pyplot.plot(test_pred, label = 'VAR_Elastic_Net: Test')
pyplot.xlabel('Date')
pyplot.ylabel('Growth Rate')
pyplot.title('Real Housing Price Series (National)')
pyplot.legend()
pyplot.show()

The fourth block of code checks the training set forecast errors (residuals) for stationarity. The residuals are computed and then plotted both as a time series and as a histogram. Lastly, the autocorrelation function (ACF) is plotted and the Augmented Dickey-Fuller (ADF) unit root test is carried out.
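To make the ACF check concrete, the sketch below computes sample autocorrelations directly with NumPy and compares them with the approximate 95% white noise band $\pm 1.96/\sqrt{n}$. It is a simplified stand-in for what an ACF plot displays, run on simulated noise rather than on the model residuals; the function name `sample_acf` is an assumption.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations of x at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.sum(x ** 2)
    acf = [1.0]
    for k in range(1, max_lag + 1):
        acf.append(np.sum(x[k:] * x[:-k]) / denom)
    return np.array(acf)

# Simulated white noise (hypothetical stand-in for the residual series).
rng = np.random.default_rng(1)
resid = rng.normal(size=500)
acf = sample_acf(resid, max_lag=12)
band = 1.96 / np.sqrt(len(resid))
share_inside = np.mean(np.abs(acf[1:]) < band)
print('Share of lags inside the band:', share_inside)
```

For white noise residuals, most autocorrelations at nonzero lags should fall inside the band; persistent spikes outside it suggest that the lag length is too short.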

In [None]:
# Load Library:
import pandas as pd
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf
# Compute Model Residuals:
Error = pd.concat([target_series,train_pred], axis = 1)
Error = Error.dropna()
Error['Resids'] = Error.iloc[:,0] - Error.iloc[:,1]
# Plot Residuals:
sns.set_theme(style = 'whitegrid')
pyplot.figure(figsize = (16,4))
pyplot.subplot(1,2,1)
pyplot.plot(Error['Resids'])
pyplot.xlabel('Date')
pyplot.title('Residual Series')
pyplot.subplot(1,2,2)
pyplot.hist(Error['Resids'], bins = 20)
pyplot.title('Residual Distribution')
pyplot.tight_layout()
pyplot.show()
# Plot Autocorrelation Function (ACF):
sns.set_theme(style = 'whitegrid')
fig, ax = pyplot.subplots(figsize=(8,4))
plot_acf(Error['Resids'], title = 'Residual ACF', lags = 36, ax = ax)
pyplot.show()
# ADF Test: Non-Stationary v. Stationary
ADF_Test = adfuller(Error['Resids'])
print('----------------------')
print('  ADF Unit-Root Test  ')
print('----------------------')
print('Test Statistic: %.3f' % (ADF_Test[0]))
print('P-Value: %.3f' % (ADF_Test[1]))
print('Critical Values:')
for key, value in ADF_Test[4].items():
    print('%s: %.3f' % (key, value))

The last block of code loads in the previous .csv files "National_Train_Growth_One" and "National_Test_Growth_One" that contain the stored forecasted values. The storage files are then augmented to include the predicted values from the current algorithm in order to estimate the forecast combinations, produce the final "top performing" model plots, and carry out the final comparison tests for predictive accuracy.

In [None]:
# Load Forecast Tables: 
train_forecasts = read_csv('National_Train_Growth_One.csv', header = 0, index_col = 0, parse_dates = True)
train_forecasts.index = pd.DatetimeIndex(train_forecasts.index.values, freq = "MS")
test_forecasts = read_csv('National_Test_Growth_One.csv', header = 0, index_col = 0, parse_dates = True)
test_forecasts.index = pd.DatetimeIndex(test_forecasts.index.values, freq = "MS")
# Add New Forecast Model:
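# Assign the prediction columns as Series so they align on the date index:
train_forecasts['VAR_Elastic_Net'] = train_pred['train_pred']
test_forecasts['VAR_Elastic_Net'] = test_pred['test_pred']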
# Save Forecast:
pd.DataFrame(train_forecasts).to_csv('National_Train_Growth_One.csv')
pd.DataFrame(test_forecasts).to_csv('National_Test_Growth_One.csv')