# VARMAX Model

## Content
* Elements
* Data Preprocessing
* Model Identification
* Model Estimation
* Model Verification
* Model Use

Import required tools. 

In [None]:
import warnings
import time 
import itertools
import numpy as np
import scipy as sp
import pandas as pd
import statsmodels as sm
import sklearn
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

Get required config.

In [None]:
# Seed for reproducibility
np.random.seed(42)

## Elements
### Definition
VARMAX (Vector Autoregressive Moving Average with Exogenous Variables) is a linear model for multivariate stationary time series. That is, it models autoregressive and moving average behavior across multiple interrelated time series. It can also include exogenous variables to capture the influence of external factors on those series. 
  
The following are the components of the VARMAX model:
   
**Autoregressive Component ($AR(p)$)**  
The main idea behind the AR part of the model is that the current value of a time series is regressed on its own previous values. Essentially, the future values of the series are predicted based on its $p$ past values. 
  
**Moving Average Component ($MA(q)$)**  
The main idea behind the MA part of the model is that the current value of a time series can be expressed as a linear combination of the past $q$ error terms (shocks or noise).  
  
**Exogenous Variables Component ($BX$)**  
This component of the model considers the influence of external variables on the different endogenous time series over time.
  
Thus, the VARMAX model captures both the internal relationships between the endogenous time series and the influence of external factors. 
  
---
  
### Specification
Consider the following:  
* $\{y_t\}_1,...,\{y_t\}_n$ time series. Assume we have $m$ observations for each of the $n$ time series, arranged chronologically in a matrix $Y\in\mathcal{M}_{m\times n}(\mathbb{R})$, so that the $t$-th row $Y_t\in\mathbb{R}^n$ holds the values of the time series at time $t$. 
* $x_1,..., x_{n_{exog}}$ external variables. Assume we have $m$ observations for each of the $n_{exog}$ external variables, arranged chronologically in a matrix $X\in\mathcal{M}_{m\times n_{exog}}(\mathbb{R})$, so that the $t$-th row $X_t\in\mathbb{R}^{n_{exog}}$ holds the values of the external variables at time $t$.
* $\{\varepsilon_t\}$ a white noise process whose realization at time $t$ is $\varepsilon_t\in\mathbb{R}^n$.
* $\mu\in\mathbb{R}^n$ the intercepts. They might be set to zero. 
* $p$ the order of the autoregressive component of the model. 
* $q$ the order of the moving average component of the model. 
* $\Phi_i\in\mathcal{M}_{n}(\mathbb{R})$ for $1\leq i \leq p$ the coefficient matrices for the autoregressive model component at lag $i$.
* $\Theta_j\in\mathcal{M}_{n}(\mathbb{R})$ for $1\leq j \leq q$ the coefficient matrices for the moving average model component at lag $j$.
* $B\in\mathcal{M}_{n\times n_{exog}}(\mathbb{R})$ the coefficient matrix for the exogenous variables. 
  
The VARMAX model is represented as follows:
$$Y_t = \mu + \displaystyle\sum_{i=1}^p\Phi_i Y_{t-i} + \displaystyle\sum_{j=1}^q \Theta_j \varepsilon_{t-j} + BX_t + \varepsilon_t$$
  
Where:
* $\displaystyle\sum_{i=1}^p\Phi_i Y_{t-i}$ is the autoregressive component of the model where:
    * $Y_k\in\mathbb{R}^n$ are the time series values at time $k$.
    * $\Phi_i\in\mathcal{M}_{n}(\mathbb{R})$ are the AR coefficients. 
* $\displaystyle\sum_{j=1}^q \Theta_j \varepsilon_{t-j}$ is the moving average component of the model where:
    * $\varepsilon_k\in\mathbb{R}^n$ are the noise term values at time $k$.
    * $\Theta_j\in\mathcal{M}_{n}(\mathbb{R})$ are the MA coefficients. 
* $BX_t$ is the exogenous variable component where:
    * $X_t\in\mathbb{R}^{n_{exog}}$ are the exogenous variable values at time $t$.
    * $B\in\mathcal{M}_{n\times n_{exog}}(\mathbb{R})$ are the exogenous variable coefficients.
* $\mu\in\mathbb{R}^n$ is the intercept term.
* $\varepsilon_t\in\mathbb{R}^n$ is the noise term at time $t$. 

Alternatively, the VARMAX model can be expressed with the lag operator $L$ (where $L^kY_t = Y_{t-k}$) as follows:
$$\Phi(L)Y_t = \mu + \Theta(L)\varepsilon_t + BX_t$$
  
Where:
* $\Phi(L) = I_n -\Phi_1L - \Phi_2L^2 - ... - \Phi_pL^p$ is the lag polynomial matrix corresponding to the AR part with $\Phi_1, \Phi_2, ..., \Phi_p \in \mathcal{M}_n(\mathbb{R})$.
* $\Theta(L) = I_n + \Theta_1L + \Theta_2L^2 + ... + \Theta_qL^q$ is the lag polynomial matrix corresponding to the MA part with $\Theta_1, \Theta_2, ..., \Theta_q \in \mathcal{M}_n(\mathbb{R})$.
* $BX_t$ is the exogenous component with $X_t\in\mathbb{R}^{n_{exog}}$ and $B\in\mathcal{M}_{n\times n_{exog}}(\mathbb{R})$
* $\mu\in\mathbb{R}^n$ is the intercept term.
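
The recursion above can be simulated directly. The following is a minimal sketch, assuming a bivariate VARMAX(1,1) with one exogenous regressor. The helper name `simulate_varmax` and every coefficient value are illustrative assumptions and are not taken from the example later in this notebook.

```python
import numpy as np

def simulate_varmax(mu, Phi, Theta, B, X, Sigma, seed=0):
    """Simulate Y_t = mu + Phi Y_{t-1} + Theta eps_{t-1} + B X_t + eps_t (p = q = 1)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape[0], mu.shape[0]
    # Gaussian white noise with covariance Sigma
    eps = rng.multivariate_normal(np.zeros(n), Sigma, size=m)
    Y = np.zeros((m, n))
    for t in range(m):
        # Pre-sample values of Y and eps are set to zero (an assumption of this sketch)
        y_prev = Y[t - 1] if t > 0 else np.zeros(n)
        e_prev = eps[t - 1] if t > 0 else np.zeros(n)
        Y[t] = mu + Phi @ y_prev + Theta @ e_prev + B @ X[t] + eps[t]
    return Y, eps

# Hypothetical parameters for n = 2 series and n_exog = 1 regressor
mu = np.array([1.0, 2.0])
Phi = np.array([[0.5, 0.1], [0.0, 0.3]])
Theta = np.array([[0.2, 0.0], [0.1, 0.2]])
B = np.array([[1.5], [-0.5]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
X = np.ones((200, 1))

Y, eps = simulate_varmax(mu, Phi, Theta, B, X, Sigma)
print(Y.shape)  # (200, 2)
```

Each row of `Y` plays the role of $Y_t$ and each row of `X` the role of $X_t$ in the specification.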
  
---
  
### Parameter Estimation
Once $p$ and $q$ are chosen, $pn^2 + qn^2 + nn_{exog} + n + \frac{n(n+1)}{2}$ parameters must be estimated before the model can be used:
* $p$ matrices $\Phi_i\in\mathcal{M}_n(\mathbb{R})$ for the VAR component. Thus, $p\cdot n^2$ parameters must be estimated for this part. 
* $q$ matrices $\Theta_j\in\mathcal{M}_n(\mathbb{R})$ for the VMA component. Thus, $q\cdot n^2$ parameters must be estimated. 
* One matrix $B\in\mathcal{M}_{n\times n_{exog}}(\mathbb{R})$ for the exogenous variable component. Thus, $n\cdot n_{exog}$ parameters must be estimated. 
* One vector $\mu\in\mathbb{R}^n$ for the intercepts. Thus, $n$ parameters must be estimated. 
* One symmetric matrix $\Sigma\in\mathcal{M}_n(\mathbb{R})$ for the error covariance matrix. Thus, $\frac{n(n+1)}{2}$ parameters must be estimated.
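
As a quick arithmetic check of the count above, here is a small sketch. The function name `varmax_param_count` is hypothetical and only restates the formula.

```python
def varmax_param_count(n, p, q, n_exog, intercept=True):
    """Number of free parameters of a VARMAX(p, q) with n series and n_exog regressors."""
    ar = p * n * n             # p AR matrices of size n x n
    ma = q * n * n             # q MA matrices of size n x n
    exog = n * n_exog          # one n x n_exog coefficient matrix B
    mu = n if intercept else 0 # intercept vector
    cov = n * (n + 1) // 2     # distinct entries of the symmetric matrix Sigma
    return ar + ma + exog + mu + cov

# Two series, two exogenous variables, VARMAX(1, 1): 4 + 4 + 4 + 2 + 3
print(varmax_param_count(n=2, p=1, q=1, n_exog=2))  # 17
```

The count grows quadratically in $n$, which is worth keeping in mind when only a few dozen observations are available.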

The Maximum Likelihood Estimation (MLE) method is used to estimate the parameters of a VARMAX model by maximizing the log-likelihood function, which is based on the observed data and the assumed multivariate normal distribution of the error terms. The main parameters estimated include the autoregressive coefficients, moving average coefficients, exogenous variable coefficients, error covariance matrix, and intercepts. These estimates are obtained through iterative optimization techniques that maximize the likelihood of the observed data under the model.
  
---
  
### Assumptions
1. **Linearity**  
The model is a linear function of past values, past errors and exogenous variables. That is:
$$Y_t = \mu + \sum_{i=1}^p \Phi_i Y_{t - i} + \sum_{j=1}^q \Theta_j \varepsilon_{t - j} + BX_t + \varepsilon_t$$
  
2. **Stationarity of AR process**  
The AR part of the model is stationary. Consider the lag polynomial $\Phi(z) = I_n - \Phi_1z - \Phi_2z^2 - ... - \Phi_pz^p$ for $z\in\mathbb{C}$ and examine the roots of the determinant equation $det(\Phi(z)) = 0$. All of them must lie outside the unit circle in the complex plane. That is, $|z| > 1$ for every $z$ such that $det(\Phi(z)) = 0$.
  
3. **Invertibility of MA process**   
The MA part of the model is invertible. Consider the lag polynomial $\Theta(z) = I_n + \Theta_1z + \Theta_2z^2 + ... + \Theta_qz^q$ for $z\in\mathbb{C}$ and examine the roots of the determinant equation $det(\Theta(z)) = 0$. All of them must lie outside the unit circle in the complex plane. That is, $|z| > 1$ for every $z$ such that $det(\Theta(z)) = 0$.

4. **Exogeneity of $X_t$**  
The exogenous variables are strictly exogenous, meaning the errors have zero conditional mean given the exogenous variables at any point in time (past, present, or future). That is:

$$\mathbb{E}[\varepsilon_t \mid X_s] = 0, \quad \forall t,s$$

5. **Error Term Properties**  
The process $\{\varepsilon_t\}$ is a white noise process. That is:
* It has zero mean: $\mathbb{E}[\varepsilon_t] = \mathbf{0}_n$.
* It is homoskedastic: $\mathbb{E}[\varepsilon_t \varepsilon_t^\top] = \Sigma$, where $\Sigma$ is a constant, positive definite covariance matrix.
* It has no autocorrelation: $\mathbb{E}[\varepsilon_t \varepsilon_s^\top] = \mathbf{0}_{n \times n} \quad \forall t \neq s$.

6. **No Perfect Multicollinearity**  
The regressors formed by the lagged time series values $Y_{t-i}$, the lagged errors $\varepsilon_{t-j}$, and the exogenous variables $X_t$ should not be perfectly collinear.

7. **Finite Moments**  
The processes $\{Y_t\}$ and $\{\varepsilon_t\}$ have finite second moments. That is:

$$\mathbb{E}[\|Y_t\|^2] < \infty, \quad \mathbb{E}[\|\varepsilon_t\|^2] < \infty $$
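
The stationarity and invertibility conditions (assumptions 2 and 3) can be checked numerically. A minimal sketch follows, assuming the standard equivalence between "all roots of the determinant equation lie outside the unit circle" and "all eigenvalues of the companion matrix lie strictly inside it". The helper names and the coefficient values are hypothetical.

```python
import numpy as np

def companion_matrix(coefs):
    """Stack [A_1, ..., A_k] (each n x n) into the (n*k) x (n*k) companion matrix."""
    k = len(coefs)
    n = coefs[0].shape[0]
    top = np.hstack(coefs)
    if k == 1:
        return top
    bottom = np.hstack([np.eye(n * (k - 1)), np.zeros((n * (k - 1), n))])
    return np.vstack([top, bottom])

def is_stationary(ar_coefs):
    """True if I - Phi_1 z - ... - Phi_p z^p has all determinant roots outside the unit circle."""
    eigenvalues = np.linalg.eigvals(companion_matrix(ar_coefs))
    return bool(np.all(np.abs(eigenvalues) < 1))

def is_invertible(ma_coefs):
    """Same check for I + Theta_1 z + ... + Theta_q z^q, written as I - (-Theta_1) z - ..."""
    return is_stationary([-theta for theta in ma_coefs])

Phi = np.array([[0.5, 0.1], [0.0, 0.3]])      # eigenvalues 0.5 and 0.3
Phi_bad = np.array([[1.1, 0.0], [0.0, 0.3]])  # eigenvalue 1.1 violates the condition
Theta = np.array([[0.2, 0.0], [0.1, 0.2]])

print(is_stationary([Phi]))      # True
print(is_stationary([Phi_bad]))  # False
print(is_invertible([Theta]))    # True
```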
  
---
  
### Other considerations
**Seasonality**    
A VARMAX model can also account for seasonality, for example through seasonal indicator variables included among the exogenous variables.


**Impulse Response Analysis**  
One useful feature of the VARMAX model is that it supports impulse response analysis through the impulse response function (IRF). This analysis helps in understanding how a shock or sudden change in one variable (e.g., a marketing campaign or a sudden price change) affects the entire system of time series over time.  
For example, if a company launches a new product, VARMAX can be used to analyze how the launch affects sales, inventory, and marketing costs over time.
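
For the VARMA part of the model, the impulse responses are the moving average weights $\Psi_h$ of the representation $Y_t = \sum_{h\geq 0}\Psi_h\varepsilon_{t-h}$ (ignoring the deterministic terms), which follow the recursion $\Psi_0 = I_n$ and $\Psi_h = \Theta_h + \sum_{i=1}^{\min(h,p)}\Phi_i\Psi_{h-i}$, with $\Theta_h = 0$ for $h > q$. The sketch below computes them with NumPy. The function name and the coefficient values are hypothetical.

```python
import numpy as np

def impulse_responses(ar_coefs, ma_coefs, steps):
    """Return Psi_0, ..., Psi_steps; Psi_h[i, j] is the response of series i,
    h periods after a unit shock to the error of series j."""
    n = (ar_coefs or ma_coefs)[0].shape[0]
    psi = [np.eye(n)]
    for h in range(1, steps + 1):
        # MA contribution: Theta_h, or zero beyond lag q
        term = ma_coefs[h - 1].copy() if h <= len(ma_coefs) else np.zeros((n, n))
        # AR contribution: sum of Phi_i Psi_{h-i}
        for i in range(1, min(h, len(ar_coefs)) + 1):
            term = term + ar_coefs[i - 1] @ psi[h - i]
        psi.append(term)
    return np.array(psi)

Phi = np.array([[0.5, 0.1], [0.0, 0.3]])
Theta = np.array([[0.2, 0.0], [0.1, 0.2]])

psi = impulse_responses([Phi], [Theta], steps=3)
print(psi.shape)  # (4, 2, 2)
print(psi[1])     # equals Phi + Theta
```

When the AR part is stationary, the entries of $\Psi_h$ decay towards zero as $h$ grows, so the effect of a one-time shock fades out.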

## Example with Synthetic Data

### Required Data
The dataset contains 36 months of monthly retail sales data for two products, labeled Sales A and Sales B, spanning from January 2018 to December 2020.

* Endogenous Variables
Sales A and Sales B are the two endogenous time series representing monthly sales volumes for two distinct product categories or stores.
    * Both series exhibit an overall increasing trend over the three-year period, reflecting growth in sales.
    * Their sales patterns also show fluctuations due to seasonal and promotional effects, as well as some random noise to simulate real-world variability.

* Exogenous Variables
There are two exogenous variables included, which influence the sales. 
    * Promo Index is a numeric variable representing the intensity of promotional activities or marketing campaigns during each month. This variable follows a smooth cyclical pattern with added noise, simulating promotional cycles such as quarterly or seasonal sales campaigns. Higher promo index values generally increase sales for both products.
    * Holiday Flag is a binary indicator variable marking whether a given month includes a major holiday shopping season (specifically November and December). These months typically experience significant boosts in sales due to holidays and year-end shopping events. When the flag is 1 (holiday months), sales tend to increase markedly for both products.

The dataset is indexed by month start dates (YYYY-MM-DD) for clear time ordering and easy time series analysis.

**Generate Data**

In [None]:
### Generate monthly data for three years

# Set data range
dates = pd.date_range(start='2018-01-01', periods=36, freq='MS') 

## Generate exogenous variables

# promo_index: cyclical promotional intensity, e.g., seasonal sales campaigns
promo_index = 0.5 * np.sin(np.linspace(0, 4 * np.pi, 36)) + 0.1 * np.random.randn(36)

# holiday_flag: binary indicator for holiday month (e.g., Nov, Dec as holiday season)
holiday_flag = dates.month.isin([11,12]).astype(int)

## Generate endogenous variables
# Generate base sales with trend
base_sales_A = 200 + np.linspace(0, 50, 36)  # increasing trend over months
base_sales_B = 300 + np.linspace(0, 30, 36)

# Add effects of promo and holidays plus noise
sales_A = base_sales_A + 20 * promo_index + 30 * holiday_flag + 15 * np.random.randn(36)
sales_B = base_sales_B + 25 * promo_index + 40 * holiday_flag + 20 * np.random.randn(36)

## Set table with endogenous and exogenous variables
data = pd.DataFrame({
    'date': dates,
    'sales_A': sales_A,
    'sales_B': sales_B,
    'promo_index': promo_index,
    'holiday_flag': holiday_flag
}).set_index('date')

# Keep a copy of the last 6 months (endogenous and exogenous columns) for later verification
holdout_endog = data[['sales_A', 'sales_B', 'promo_index', 'holiday_flag']].iloc[-6:].copy()

# Erase last 6 months of sales data in the main data DataFrame by setting them to NaN
data.loc[data.index[-6:], ['sales_A', 'sales_B']] = np.nan

data.head()

In [None]:
## Plot data
sns.set(style="whitegrid")
plt.figure(figsize=(14, 7))
plt.plot(data.index, data['sales_A'], marker='o', linestyle='-', color='tab:blue', label='Product A Sales')
plt.plot(data.index, data['sales_B'], marker='o', linestyle='-', color='tab:orange', label='Product B Sales')
plt.title('Monthly Retail Sales', fontsize=16, fontweight='bold')
plt.xlabel("Date", fontsize=14)
plt.ylabel("Sales", fontsize=14)
plt.grid(True, which='both', linestyle='--', linewidth=0.5, alpha=0.7)
plt.legend(fontsize=12)
plt.xticks(data.index, [d.strftime('%Y-%m') for d in data.index], rotation=45, fontsize=10)
plt.tight_layout()
plt.show()

**Arrange data as required**  
  
We'll assume the following:
* Endogenous data is available from 2018-01 to 2020-06. The endogenous values from 2020-07 to 2020-12 are treated as unknown, but they are kept aside for evaluation purposes.
* Exogenous data is available from 2018-01 to 2020-06. The exogenous values from 2020-07 to 2020-12 are treated as projected data that feeds the forecast. 

In [None]:
# Prepare data for modeling
endog = data[['sales_A', 'sales_B']]
exog = data[['promo_index', 'holiday_flag']]

In [None]:
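# Set present data
endog_present = endog.iloc[:-6]
# The last 6 rows of `endog` were set to NaN above, so the actual values come from the holdout copy
endog_validation = holdout_endog[['sales_A', 'sales_B']]
exog_present = exog.iloc[:-6]
exog_validation = exog.tail(6)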

## Identification

### Auxiliary Functions

In [None]:
def diff_df(original_df, I):
    """
    Apply I differences (I in {0, 1, 2}) to all columns in the DataFrame
    """
    if I not in (0, 1, 2):
        raise ValueError("I must be 0, 1 or 2")

    result = original_df.copy()

    # Difference I times; each pass introduces one leading NaN row
    for _ in range(I):
        result = result.diff()

    # Drop the leading NaN rows created by differencing
    if I > 0:
        result = result.dropna()

    return result

In [None]:
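def diff_inv_df(original_df, diff_df, I):
    """
    Reverse I differences for all columns of a DataFrame that has been differenced I times.
    The first I rows of original_df provide the initial values.
    """
    if I == 0:
        # Nothing to undo: the "differenced" data is already on the original scale
        return diff_df.copy()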


    elif I == 1:
        rev_df = original_df.copy()
        for col in diff_df.columns:
            y = pd.Series(original_df[col], index = original_df.index, name=col)
            y_0 = y.iloc[0]
            y_diff_1 = pd.Series(diff_df[col], index=diff_df.index, name=col)

            y_diff_0 = pd.Series(index = y_diff_1.index, dtype='float64', name = col)
            cumulative_sum = 0
            for timestamp in y_diff_1.index:
                cumulative_sum = cumulative_sum + y_diff_1.loc[timestamp]  # same as += but explicit
                y_diff_0.loc[timestamp] = y_0 + cumulative_sum
            y_diff_0 = pd.concat([pd.Series([y_0], index=[y.index[0]]), y_diff_0])

            rev_df[col] = y_diff_0

    elif I == 2:
        rev_df = original_df.copy()
        for col in diff_df.columns:
            y = pd.Series(original_df[col], index = original_df.index, name=col)
            y_0 = y.iloc[0]
            y_1 = y.iloc[1]
            y_diff_2 = pd.Series(diff_df[col], index=diff_df.index, name=col)

            y_diff_1 = pd.Series(index = y_diff_2.index, dtype='float64', name = col)
            cumulative_sum = 0
            for timestamp in y_diff_2.index:
                cumulative_sum = cumulative_sum + y_diff_2.loc[timestamp]
                y_diff_1.loc[timestamp] = (y_1 - y_0) + cumulative_sum
            y_diff_1 = pd.concat([pd.Series([y_1 - y_0], index = [y.index[1]]), y_diff_1])
            
            y_diff_0 = pd.Series(index = y_diff_1.index, dtype='float64', name = col)
            cumulative_sum = 0
            for timestamp in y_diff_1.index:
                cumulative_sum = cumulative_sum + y_diff_1.loc[timestamp]
                y_diff_0.loc[timestamp] = y_0 + cumulative_sum
            y_diff_0 = pd.concat([pd.Series([y_0], index = [y.index[0]]), y_diff_0])
            

            rev_df[col] = y_diff_0

    return rev_df
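
The inversion implemented above relies on the fact that a cumulative sum undoes a difference once the initial value is known. A minimal sketch of that round trip on a hypothetical five-point series, using only pandas built-ins:

```python
import numpy as np
import pandas as pd

y = pd.Series([10.0, 12.0, 15.0, 14.0, 18.0],
              index=pd.date_range('2018-01-01', periods=5, freq='MS'))

# First difference and its inverse: y_t = y_0 + sum of the differences up to t
d1 = y.diff().dropna()
y_from_d1 = pd.concat([y.iloc[:1], y.iloc[0] + d1.cumsum()])

# Second difference and its inverse: rebuild the first differences, then the levels
d2 = y.diff().diff().dropna()
d1_from_d2 = pd.concat([d1.iloc[:1], d1.iloc[0] + d2.cumsum()])
y_from_d2 = pd.concat([y.iloc[:1], y.iloc[0] + d1_from_d2.cumsum()])

print(np.allclose(y_from_d1, y))  # True
print(np.allclose(y_from_d2, y))  # True
```

Note that undoing $I$ differences requires the first $I$ original values, which is why `diff_inv_df` takes the original DataFrame as an argument.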


In [None]:
from scipy.stats import boxcox

def boxcox_df(df):
    """
    Apply Box-Cox transformation to all columns in the DataFrame
    """
    transformed_data = {}
    lambdas = {}

    for col in df.columns:
        # Ensure all values are positive
        if (df[col] <= 0).any():
            raise ValueError(f"Box-Cox requires positive values. Column '{col}' has non-positive values.")

        bc, l = boxcox(df[col])
        transformed_data[col] = bc
        lambdas[col] = l

    return pd.DataFrame(transformed_data, index=df.index, columns = df.columns), lambdas


In [None]:
from scipy.special import inv_boxcox

def inv_boxcox_df(df_transformed, lambdas):
    """
    Inverse Box-Cox transformation for all columns in the DataFrame
    """
    recovered_data = {}

    for col in df_transformed.columns:
        recovered = inv_boxcox(df_transformed[col], lambdas[col])
        recovered_data[col] = recovered

    return pd.DataFrame(recovered_data, index=df_transformed.index, columns = df_transformed.columns)
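
For reference, the Box-Cox transformation of a positive value $x$ is $(x^\lambda - 1)/\lambda$ for $\lambda \neq 0$ and $\ln x$ for $\lambda = 0$. The sketch below checks this formula and the round trip with `scipy` on hypothetical log-normal data (the data and the seed are assumptions made only for illustration).

```python
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

rng = np.random.default_rng(0)
x = rng.lognormal(mean=5.0, sigma=0.5, size=200)  # strictly positive sample

# With no lambda given, boxcox estimates it by maximum likelihood
x_bc, lam = boxcox(x)

# Manual formula for comparison
manual = np.log(x) if lam == 0 else (x ** lam - 1) / lam

# The inverse needs the same lambda that was used for the transformation
x_back = inv_boxcox(x_bc, lam)

print(np.allclose(x_bc, manual))  # True
print(np.allclose(x_back, x))     # True
```

This is why `boxcox_df` returns the estimated lambdas together with the transformed data: `inv_boxcox_df` cannot recover the original scale without them.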

In [None]:
def evaluate_multivariate_forecast(actual, predicted):
    """
    Computes MAPE, R², and MAE per time series.
    """
    from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error, r2_score
    MAPE =  []
    R2 = []
    MAE =  []

    for col in actual.columns:
        y_true = actual[col]
        y_pred = predicted[col]

        mape = mean_absolute_percentage_error(y_true, y_pred)
        r2 = r2_score(y_true, y_pred)
        mae = mean_absolute_error(y_true, y_pred)

        MAPE.append(mape)
        R2.append(r2)
        MAE.append(mae)

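    # Index the metrics by series name so each row can be traced back to its column
    metrics = pd.DataFrame({
        'MAPE': MAPE,
        'R2': R2,
        'MAE': MAE
    }, index=actual.columns)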

    return metrics
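
As a worked example of the three metrics, the sketch below computes them by hand with NumPy on three hypothetical points. Note that scikit-learn's `mean_absolute_percentage_error` returns a fraction (0.1 means 10%), which is the convention followed here.

```python
import numpy as np

y_true = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 180.0, 400.0])

# MAE: mean of |error| = (10 + 20 + 0) / 3
mae = np.mean(np.abs(y_true - y_pred))

# MAPE: mean of |error / actual| = (0.1 + 0.1 + 0) / 3, expressed as a fraction
mape = np.mean(np.abs((y_true - y_pred) / y_true))

# R2: 1 - residual sum of squares / total sum of squares around the mean
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(mae)  # 10.0
print(round(mape, 4))  # 0.0667
print(round(r2, 4))    # 0.9893
```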


### Grid Search with AIC

In [None]:
# Parameter grid
param_grid = {
    'AR_p' : [0,1,2],
    'MA_q' : [0,1,2],
    'I': [0,1,2],
    'Box-Cox' : [True, False]
}

In [None]:
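# Import required tools
from itertools import product
from statsmodels.tsa.statespace.varmax import VARMAX

# Hyperparameter grid search function
def VARMAX_GRID_SEARCH_AIC(endog, exog, max_p, max_q, bc, freq):
    """
    VARMAX Grid Search Based on AIC

    IN:
    -> endog: DataFrame with endogenous variables
    -> exog: DataFrame with exogenous variables
    -> max_p: maximum AR order
    -> max_q: maximum MA order
    -> bc: whether to apply the Box-Cox transformation to the endogenous variables
    -> freq: frequency of the time series data
    
    OUT:
    -> df_metrics: DataFrame with hyperparameters and AIC, sorted by AIC
    """
    pq = list(product(range(0, max_p + 1), range(0, max_q + 1)))
    rows = []

    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        for d in [0, 1, 2]:
            # Arrange data as in the in-sample grid search: difference, then Box-Cox
            endog_data = diff_df(endog, d)
            exog_data = diff_df(exog, d)
            if bc:
                if (endog_data <= 0).any().any():
                    continue  # Box-Cox requires strictly positive values
                endog_data, _ = boxcox_df(endog_data)

            for p, q in pq:
                # VARMAX needs at least one AR or MA lag
                if p == 0 and q == 0:
                    continue
                try:
                    model = VARMAX(endog=endog_data, exog=exog_data, order=(p, q), trend='c', freq=freq)
                    results = model.fit(disp=False)
                    rows.append({'AR_p': p, 'MA_q': q, 'I': d, 'Box-Cox': bc, 'AIC': results.aic})
                except Exception:
                    # Skip combinations that fail to estimate
                    continue

    # AIC is only comparable between models fitted to the same transformed data (same I and Box-Cox)
    df_metrics = pd.DataFrame(rows, columns=['AR_p', 'MA_q', 'I', 'Box-Cox', 'AIC'])
    return df_metrics.sort_values('AIC').reset_index(drop=True)

# Example call (bc=False is an arbitrary choice here)
df_aic = VARMAX_GRID_SEARCH_AIC(endog = endog_present, exog = exog_present, max_p = 2, max_q = 2, bc = False, freq = 'MS')
print(f'Best VARMAX model by AIC:\n{df_aic.head(1)}')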

### Grid Search with In-Sample Metrics

In [None]:
# Parameter grid
param_grid = {
    'AR_p' : [0,1,2],
    'MA_q' : [0,1,2],
    'I': [0,1,2],
    'Box-Cox' : [True, False]
}

In [None]:
# Import required tools
from itertools import product
from statsmodels.tsa.statespace.varmax import VARMAX


# Hyperparameter grid search function
def VARMAX_GRID_SEARCH_IN_SAMPLE(endog, exog, param_grid, freq):
    """
    VARMAX Hyperparameter Grid Search with In-Sample Evaluation

    IN:
    -> endog: endogenous variables at present time
    -> exog: exogenous variables at present time
    -> param_grid: hyperparameter grid to search over
    -> freq: frequency of the time series data

    OUT:
    -> df_metrics: DataFrame with hyperparameters and in-sample metrics
    """

    # Print total iterations
    print('Total Iterations: ', len(list(product(*param_grid.values()))))

    # Initialize metrics df and metrics lists
    df_metrics = pd.DataFrame(columns = ['AR_p', 'MA_q', 'I', 'Box-Cox', 'IN_SAMPLE_MAPE', 'IN_SAMPLE_R2', 'IN_SAMPLE_MAE', 'IN_SAMPLE_AVG_MAPE', 'IN_SAMPLE_AVG_R2', 'IN_SAMPLE_AVG_MAE', 'IN_SAMPLE_TIME'])

    # Initialize iteration counter 
    iter = 1

    # Perform grid search
    for params in product(*param_grid.values()):
        print('------------------------------')
        print('Iteration: ', iter)
        print('Parameters: ', params)

        # Discard not suitable parameter combinations
        if params[0] == 0 and params[1] == 0:
            print('Not suitable parameter combination')
        else:

            ### Prepare data
            # Initialize hyperparameters
            p, q, I, bc = params

            # Initialize training data
            endog_train = endog.copy()
            exog_train = exog.copy()

            # Handle differences
            if I > 0:
                endog_train = diff_df(endog_train, I)
                exog_train = diff_df(exog_train, I)

            # Discard not possible Box-Cox transformation 
            if bc & (endog_train <= 0).any().any():
                print('Box-Cox not suitable because of non-positive values')
            else:
            
                # Handle Box-Cox transformation
                if bc:
                    endog_train, lambdas = boxcox_df(endog_train)

                ### Train model
                # Start time
                start_time = time.time()
                
                # Set model
                model = VARMAX(
                    endog = endog_train, 
                    exog = exog_train, order=(p, q), 
                    trend='c',
                    freq = freq
                )

                # Train model (the frequency is already set on the model itself)
                results = model.fit(disp=False)

                ### Get predicted values and actual values
                # Get predicted values
                start = endog.index[I]
                end = endog.index[-1]
                predicted = results.get_prediction(start=start, end=end, dynamic=False)
                predicted_mean = predicted.predicted_mean

                ## Return predicted values to original scale
                # Transformations are undone in reverse order of application:
                # Box-Cox was applied last, so it is inverted first
                if bc:
                    predicted_mean = inv_boxcox_df(predicted_mean, lambdas)
                # Inverse differences
                if I > 0:
                    predicted_mean = diff_inv_df(endog, predicted_mean, I)
                    predicted_mean = predicted_mean.iloc[I:]

                # Set actual values
                if I == 0:
                    actual = endog
                else:
                    actual = endog.iloc[I:]
                
                # Stop time 
                end_time = time.time()

                ### Calculate metrics
                metrics = evaluate_multivariate_forecast(actual, predicted_mean)
                avg_mape = metrics['MAPE'].mean()
                avg_r2 = metrics['R2'].mean()
                avg_mae = metrics['MAE'].mean()

                # Fill metrics data frame
                df_metrics_aux = pd.DataFrame({
                    'AR_p': [p],
                    'MA_q': [q],
                    'I': [I],
                    'Box-Cox': [bc],
                    'IN_SAMPLE_MAPE': [list(metrics['MAPE'])],
                    'IN_SAMPLE_R2': [list(metrics['R2'])],
                    'IN_SAMPLE_MAE': [list(metrics['MAE'])],
                    'IN_SAMPLE_AVG_MAPE': [avg_mape],
                    'IN_SAMPLE_AVG_R2': [avg_r2],
                    'IN_SAMPLE_AVG_MAE': [avg_mae],
                    'IN_SAMPLE_TIME': [end_time - start_time]
                })
                print(df_metrics_aux.head())
                df_metrics = pd.concat([df_metrics, df_metrics_aux])

        # Increase iteration counter
        iter = iter + 1 


    # Return hyperparameter-metrics data frame
    return df_metrics

In [None]:
df_results = VARMAX_GRID_SEARCH_IN_SAMPLE(endog = endog_present, exog = exog_present, param_grid = param_grid, freq = 'MS')
df_results.to_csv(r'VARMAX_results_1.csv', index = False)

### Train - Test Split Metrics for Grid Search

## Pending
* Look further into the Box-Cox power transformation
* Look further into the optimal number of differences
* Evaluate the model assumptions