# TVP single model

In [252]:
import pandas as pd
%load_ext autoreload
%autoreload 2
import numpy as np
import numpy.linalg as la

from src.data import import_data
from src.data import transform_data
from src.models import preliminaries
from src.data import data_class

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [253]:
# load data from DMA_ECB
df = import_data.load_dma_ecb()

In [254]:
# print variables in dataset
df.columns

Index(['HICP_SA', 'M1', 'M2', 'M3_OUT', 'M3_OUTPS', 'LPS', 'D4PSTARPSCGL',
       'D4PSTARCGL', 'D4DIV_TV', 'D4M3DEPHH', 'D4M3DFM', 'D4M3MULTI',
       'RGAPDFR', 'USD_EUR', 'R_EFF_EXCH', 'OILP', 'WPRM_EE', 'WPRM',
       'STOCKPRICES', 'STOCK_PE', 'DIVYIELD', 'STOCKPRICES_EUR', 'UNEMPL'],
      dtype='object')

## Set preliminaries

In [255]:
# initialize Settings
params = preliminaries.Settings()
# adjust settings
params.use_y = ['HICP_SA'] # use as y
params.use_x = ['USD_EUR', 'OILP', 'STOCKPRICES'] # indep vars
# transformations for X
    #     -- Tcodes:
    #                 1 Level
    #                 2 First Difference
    #                 3 Second Difference
    #                 4 Log-Level
    #                 5 Log-First-Difference
    #                 6 Log-Second-Difference
    #                 7 Detrend Log Using 1-sided HP detrending for Monthly data
    #                 8 Detrend Log Using 1-sided HP detrending for Quarterly data
    #                16 Log-Second-Difference
    #                17 (1-L)(1-L^12)
params.tcodesX = [5,5,5]
params.tcodey = 5
# specify end of training data
    # correct?
params.first_sample_ends = 0.5
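
For reference, a minimal sketch of what a few of the tcodes listed above mean, written with plain pandas. The helper `apply_tcode` and the toy `prices` series are hypothetical; this is not the implementation in `src.data.transform_data`, which may scale or align the series differently.

```python
import numpy as np
import pandas as pd

def apply_tcode(series: pd.Series, tcode: int) -> pd.Series:
    """Hypothetical helper: apply a subset of the tcodes to one series."""
    if tcode == 1:   # level
        return series
    if tcode == 2:   # first difference
        return series.diff()
    if tcode == 4:   # log-level
        return np.log(series)
    if tcode == 5:   # log first difference (approximate growth rate)
        return np.log(series).diff()
    raise NotImplementedError(f"tcode {tcode} is not covered by this sketch")

# toy data, for illustration only
prices = pd.Series([100.0, 102.0, 105.0], index=["1980.Q1", "1980.Q2", "1980.Q3"])
growth = apply_tcode(prices, 5)
```

The first element of `growth` is NaN because differencing loses one observation, which is presumably one reason observations are removed later in the data preparation.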

In [256]:
params.print_settings()

The following preliminary settings are specified:
intercept : 1
plag : 1
hlag : 0
use_x : ['USD_EUR', 'OILP', 'STOCKPRICES']
use_y : ['HICP_SA']
tcodesX : [5, 5, 5]
tcodey : 5
miss_treatment : 2
lamda : 0.99
alpha : 0.9
kappa : 0.95
forgetting_method : 1
prior_theta : 2
initial_V_0 : 2
initial_DMA_weights : 1
expert_opinion : 1
h_fore : 1
first_sample_ends : 0.5


## Transform data

In [257]:
# create data class
data = data_class.Data(df)
# inherit previously specified settings
data = data.get_settings(df, params)

In [258]:
# execute: data.prepare_data_all_steps()
data.set_data()
# transform data and shift for forecasting
data.transf_data()
# create lags
data.lagg_matrices()
# combine data of lagged y with exogenous vars
data.combine_lagged_y_X()
# remove observations
data.remove_obs()
# deal with missings
data.deal_with_missing()
# add intercept
data.add_intercept()

In [259]:
data.X

Unnamed: 0,intercept,HICP_SA,HICP_SA_lag_1,USD_EUR,OILP,STOCKPRICES
1980.Q3,1,0.020494,0.024052,0.021353,-0.089918,0.080320
1980.Q4,1,0.021663,0.020494,-0.057987,0.174227,-0.025546
1981.Q1,1,0.026490,0.021663,-0.085655,-0.045197,-0.085118
1981.Q2,1,0.024307,0.026490,-0.093685,-0.121839,-0.083670
1981.Q3,1,0.025404,0.024307,-0.083770,-0.016026,-0.110984
...,...,...,...,...,...,...
2009.Q2,1,0.002584,-0.002492,0.045120,0.298103,0.132513
2009.Q3,1,0.001842,0.002584,0.050190,0.141009,0.154005
2009.Q4,1,0.002206,0.001842,0.034368,0.099589,0.108996
2010.Q1,1,0.004671,0.002206,-0.069959,0.020666,-0.061161


In [260]:
data.y_dep.name

'HICP_SA_t+1'

In [261]:
# save vars such as N, T
data.save_data_info()
# transform to numpy
data.data_to_numpy()
X = data.X_np
y = data.y_dep_np

In [262]:
data.set_priors()

## Predict TVP

### Kalman Filter Explanation
Given the State-Space model:
$y_t = Z_t \alpha_t + d_t + \epsilon_t$, $\epsilon_t \sim iid N(0,H_t)$
$\alpha_t = T_t \alpha_{t-1} + c_t +R_t \eta_t$, $\eta_t \sim iid N(0,Q_t)$
The Kalman filter consists of
1. Prediction equations
2. Updating equations

Using:
- $a_t = E[\alpha_t|I_t]$ = optimal estimator of $\alpha_t$ based on $I_t$
- $P_t = E[(\alpha_t - a_t)(\alpha_t - a_t)'|I_t] =$ Covariance of the estimation error

Generally:
1. Prediction Equations
    - $a_{t|t-1} = E[\alpha_t|I_{t-1}] = T_t a_{t-1} + c_t$
    - $P_{t|t-1} = E[(\alpha_t - a_{t|t-1})(\alpha_t - a_{t|t-1})'|I_{t-1}] = T_t P_{t-1} T_t' + R_t Q_t R_t'$
   -> Computing the point prediction of $y_t|I_{t-1}$ as $y_{t|t-1} = Z_t a_{t|t-1} + d_t$ and comparing it with the new observation $y_t$ gives
    - the prediction error $v_t = y_t - y_{t|t-1} = Z_t(\alpha_t - a_{t|t-1}) + \epsilon_t$ and its MSE $E[v_t v_t'] = F_t = Z_t P_{t|t-1} Z_t' + H_t$

2. Updating equations
    With the new information $y_t$, the optimal predictor and its MSE are updated using the prediction error, which contains the new information about $\alpha_t$:
   - $a_t = a_{t|t-1} + P_{t|t-1} Z_t' F_t^{-1} (y_t -Z_t a_{t|t-1} -d_t) = a_{t|t-1} + P_{t|t-1} Z_t' F_t^{-1} v_t$
   - $P_t = P_{t|t-1} - P_{t|t-1} Z_t' F_t^{-1} Z_t P_{t|t-1}$
   -> $a_t$ is the filtered estimate of $\alpha_t$, i.e. the optimal estimate of $\alpha_t$ given $I_t$. $P_t$ is its MSE matrix.
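
The prediction and updating equations above can be sketched as a single NumPy function. This is a minimal illustration, not part of the project code: the name `kalman_step` and the local-level example values are assumptions, and `np.linalg.inv` is used for readability rather than numerical robustness.

```python
import numpy as np

def kalman_step(a_prev, P_prev, y_t, Z, d, H, T, c, R, Q):
    """One prediction + update step for the state-space model above (hypothetical helper)."""
    # Prediction equations
    a_pred = T @ a_prev + c
    P_pred = T @ P_prev @ T.T + R @ Q @ R.T
    # Prediction error and its MSE
    v = y_t - (Z @ a_pred + d)
    F = Z @ P_pred @ Z.T + H
    # Updating equations
    K = P_pred @ Z.T @ np.linalg.inv(F)   # Kalman gain
    a_filt = a_pred + K @ v
    P_filt = P_pred - K @ Z @ P_pred
    return a_filt, P_filt, v, F

# Local-level example (illustrative values): scalar state, scalar observation
one = np.eye(1)
a, P, v, F = kalman_step(
    a_prev=np.zeros(1), P_prev=one, y_t=np.array([1.0]),
    Z=one, d=np.zeros(1), H=one, T=one, c=np.zeros(1), R=one, Q=one,
)
```

In this example $P_{t|t-1} = 2$ and $F_t = 3$, so the gain is $2/3$ and both the filtered state and its variance equal $2/3$.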

### TVP-Model Estimation Scheme
Estimate:
(1) measurement equation:  $ y_t = X_t \theta_t + \epsilon_t$ with $\epsilon_t \sim iidN(0,H_t)$
(2) transition equation with unobservable state vector $\theta$: $ \theta_t = \theta_{t-1} + \eta_t$ with $\eta_t \sim iid N(0,Q_t)$
also $E[\epsilon_t \eta_t] = 0$

Given the variances $H_t$ and $Q_t$, the standard State-Space estimation can be used, i.e. the Kalman filter. (See notes in Liquid text to Harvey (1990))

In our case ($\alpha_t = \theta_t$, $Z_t = X_t$, $T_t = R_t = I$, $d_t = c_t = 0$):
1. Prediction equations
    - $\hat{\theta}_{t|t-1} = \hat{\theta}_{t-1}$
    - $P_{t|t-1} = P_{t-1} + Q_t$ and using the forgetting factor $P_{t|t-1} = \frac{1}{\lambda} P_{t-1}$
        - Sidenotes:
             - The covariance of the estimation error is the covariance of the estimator: $P_{t|t-1} = \Sigma_{t|t-1}$ and $P_{t} = \Sigma_{t}$
             - the predictive density of $y_t$ is $y_t|I_{t-1} \sim N(X_t \hat{\theta}_{t|t-1}, F_t)$ with $F_t = X_t P_{t|t-1} X_t' + H_t$
             - with the forgetting factor there is no need to reestimate $Q_t$ once new information arrives
        - $E[v_t v_t']=F_t = Z_t P_{t|t-1} Z_t' + H_t$
            - To get the time-dependent $H_t$ (the error variance is likely to change over time, e.g. during the Great Moderation), an Exponentially Weighted Moving Average (EWMA) estimator is used to get a consistent estimate
            - $\hat{H}_t = \sqrt{(1-\kappa)\sum_{j=1}^t \kappa^{j-1} (y_j - Z_j \hat{\theta}_j)^2}$
            - $\kappa$ is a decay factor chosen to suit the data frequency (quarterly here; the settings above use $\kappa = 0.95$)
            - recursive forecast: $\hat{H}_{t+1|t} = \kappa \hat{H}_{t|t-1} + (1-\kappa) (y_t - Z_t \hat{\theta}_t)^2$
            - No ARCH specification used to ease computational burden

2. Updating Equations
    - $\hat{\theta}_{t} = \hat{\theta}_{t|t-1} + P_{t|t-1} Z_t' F_t^{-1} v_t$
    - $P_t  = P_{t|t-1} - P_{t|t-1} Z_t' F_t^{-1} Z_t P_{t|t-1}$
        - Sidenote:
            - Now $\theta_t|I_t \sim N(\hat{\theta}_{t}, P_t)$
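
The two approximations used above can be illustrated numerically. The sketch below assumes the values $\lambda = 0.99$ and $\kappa = 0.95$ from the printed settings; the matrix `P_prev`, the sample errors, and the helper `ewma_variance` are made up for illustration. It shows that inflating $P_{t-1}$ by $1/\lambda$ is the same as adding an implied $Q_t = (\lambda^{-1} - 1) P_{t-1}$, and how the recursive EWMA forecast evolves.

```python
import numpy as np

lam, kappa = 0.99, 0.95   # values taken from the settings printed above

# Forgetting factor: P_pred = P_prev / lam equals P_prev + Q
# with the implied state covariance Q = (1/lam - 1) * P_prev.
P_prev = np.array([[2.0, 0.5], [0.5, 1.0]])   # illustrative covariance
Q_implied = (1.0 / lam - 1.0) * P_prev
P_pred = P_prev / lam

def ewma_variance(errors, kappa, h_init):
    """Recursive EWMA forecast: H_{t+1|t} = kappa*H_{t|t-1} + (1-kappa)*e_t^2."""
    h = np.empty(len(errors) + 1)
    h[0] = h_init
    for t, e in enumerate(errors):
        h[t + 1] = kappa * h[t] + (1.0 - kappa) * e**2
    return h

# illustrative prediction errors and starting value
H_path = ewma_variance(np.array([0.1, -0.2, 0.05]), kappa, h_init=0.01)
```

Because $Q_t$ is implied by $\lambda$ and $P_{t-1}$, it never has to be estimated separately, which is the computational advantage noted in the sidenotes.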

The estimation procedure goes as follows:
1. in period t=0
   - Set $\theta_0$
   - Set $P_0$
2. At the beginning of t use prediction equation to get
   - $\hat{\theta}_{t|t-1}$
   - $P_{t|t-1}$
3. With new data get prediction error
   - Estimate $H_t$
   - calculate prediction error
4. Use the prediction error in the updating equations to get
   - $\hat{\theta}_t$
   - $P_t$
5. Repeat Steps 2-4

In [264]:
X = data.X_np
y = data.y_dep_np  # NumPy array, so y[t] in the loop below is positional indexing
T = data.T
N = data.N
theta_prmean = data.theta_prmean
theta_prvar = data.theta_prvar
lamda = data.lamda
kappa = data.kappa
V_0 = data.V_0

In [352]:
# preallocate result arrays, filled with NaN until computed
theta_pred = np.full((N, T), np.nan)
theta_update = np.full((N, T), np.nan)
P_pred = np.full((N, N, T), np.nan)
P_update = np.full((N, N, T), np.nan)
y_pred = np.full(T, np.nan)
e_t = np.full(T, np.nan)
H_pred = np.full(T, np.nan)

In [353]:
for t in range(T):
    x_t = X[t, :]  # regressors at time t, shape (N,)
    # prediction step
    if t == 0:
        theta_pred[:, t] = theta_prmean
        P_pred[:, :, t] = (1/lamda)*theta_prvar
    else:
        theta_pred[:, t] = theta_update[:, t-1]
        P_pred[:, :, t] = (1/lamda)*P_update[:, :, t-1]
    # predict y_t and calculate prediction error
    y_pred[t] = x_t @ theta_pred[:, t]
    e_t[t] = y[t] - y_pred[t]
    # estimate H_t; fall back to V_0 (or the previous value) if the estimate is not positive
    if t == 0:
        H_t = e_t[t]**2 - x_t @ P_pred[:, :, t] @ x_t
        H_pred[t] = H_t if H_t > 0 else V_0
    else:
        H_t = kappa*H_pred[t-1] + (1-kappa)*(e_t[t-1]**2)
        H_pred[t] = H_t if H_t > 0 else H_pred[t-1]
    # updating step
    F = H_pred[t] + x_t @ P_pred[:, :, t] @ x_t
    gain = P_pred[:, :, t] @ x_t / F  # Kalman gain, shape (N,)
    theta_update[:, t] = theta_pred[:, t] + gain * e_t[t]
    P_update[:, :, t] = P_pred[:, :, t] - np.outer(gain, x_t) @ P_pred[:, :, t]
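
As a quick sanity check of the recursion, the same steps can be wrapped in a function and run on simulated data with constant coefficients, where the filtered estimates should settle near the true values. This is a sketch under stated assumptions: `tvp_filter`, the simulated data, and the prior values are hypothetical, and the initial $H_0$ is simply fixed at `V0` instead of using the first-period rule from the loop above.

```python
import numpy as np

def tvp_filter(X, y, theta0, P0, lam=0.99, kappa=0.95, V0=1.0):
    """Forgetting-factor Kalman filter (hypothetical helper).

    Returns filtered coefficients (T x N) and one-step-ahead predictions (T,).
    """
    T, N = X.shape
    theta = np.full((T, N), np.nan)
    y_hat = np.full(T, np.nan)
    th, P, H, e_prev = theta0.copy(), P0.copy(), V0, 0.0
    for t in range(T):
        x = X[t]
        P_pred = P / lam                      # prediction step (theta is carried over)
        y_hat[t] = x @ th
        e = y[t] - y_hat[t]                   # prediction error
        if t > 0:
            H = kappa * H + (1 - kappa) * e_prev**2   # recursive EWMA variance
        F = H + x @ P_pred @ x
        gain = P_pred @ x / F
        th = th + gain * e                    # updating step
        P = P_pred - np.outer(gain, x) @ P_pred
        theta[t] = th
        e_prev = e
    return theta, y_hat

# simulated data with constant coefficients (illustration only)
rng = np.random.default_rng(0)
T_sim = 500
X_sim = np.column_stack([np.ones(T_sim), rng.normal(size=T_sim)])
theta_true = np.array([1.0, -0.5])
y_sim = X_sim @ theta_true + 0.1 * rng.normal(size=T_sim)
theta_hat, y_hat = tvp_filter(X_sim, y_sim, np.zeros(2), 10.0 * np.eye(2))
```

Comparing `theta_hat[-1]` with `theta_true` gives a simple check that the prediction and updating steps are wired together correctly before applying the filter to the real data.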