Introduction ----------
The principle behind score-driven models is that the linear update $y_{t} - \theta_{t}$, which the Kalman filter relies upon, can be made robust by replacing it with the conditional score of a non-normal distribution. For this reason, any class of traditional state space model has a score-driven equivalent.
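To make the idea concrete, the sketch below compares the score of a normal density with the score of a Student-t density, both taken with respect to the location parameter. The function names and the numbers used are illustrative assumptions, not part of the library.

```python
import numpy as np

def gaussian_score(y, theta, scale):
    """Score of a normal density w.r.t. its mean: linear in (y - theta)."""
    return (y - theta) / scale**2

def t_score(y, theta, scale, nu):
    """Score of a Student-t density w.r.t. its location.

    The squared residual in the denominator shrinks the update for
    large residuals, which is where the robustness comes from.
    """
    resid = y - theta
    return (nu + 1.0) * resid / (nu * scale**2 + resid**2)

# Illustrative residuals, the last one playing the role of an outlier
residuals = np.array([0.5, 1.0, 5.0, 50.0])
print(gaussian_score(residuals, 0.0, 1.0))
print(t_score(residuals, 0.0, 1.0, nu=3.0))
```

The normal score grows without bound as the residual grows, whereas the t score falls back towards zero, so a single extreme observation moves the state far less.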
For example, consider a dynamic regression model in this framework:
$$p\left(y_{t} \mid \theta_{t}\right)$$
$$\theta_{t} = x_{t}^{\prime}\beta_{t}$$
$$\beta_{t} = \beta_{t-1} + \eta H_{t-1}^{-1} S_{t-1}$$
Here η represents the learning rates or scaling terms, which are the latent variables estimated in the model.
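The recursion can be sketched in a few lines of NumPy. This is a simplified illustration, not PyFlux's implementation: it assumes a Student-t observation density with fixed scale and degrees of freedom, takes the scaling matrix H as the identity, and uses the hypothetical names `t_location_score` and `score_driven_betas`.

```python
import numpy as np

def t_location_score(y, theta, scale, nu):
    """Score of a Student-t density with respect to its location theta."""
    resid = y - theta
    return (nu + 1.0) * resid / (nu * scale**2 + resid**2)

def score_driven_betas(y, X, eta, scale, nu):
    """Filter beta_t = beta_{t-1} + eta * S_{t-1}, assuming H_{t-1} = I."""
    n_obs, n_coef = X.shape
    betas = np.zeros((n_obs + 1, n_coef))  # betas[0] is the starting value
    for t in range(n_obs):
        theta_t = X[t] @ betas[t]  # theta_t = x_t' beta_t
        # Chain rule: d log p / d beta = x_t * d log p / d theta
        score_t = X[t] * t_location_score(y[t], theta_t, scale, nu)
        betas[t + 1] = betas[t] + eta * score_t
    return betas

# Simulated data (assumption): constant true slope of 0.5 and no intercept
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
X = np.column_stack([np.ones(500), x])
y = 0.5 * x + 0.1 * rng.standard_normal(500)

betas = score_driven_betas(y, X, eta=np.array([0.05, 0.05]), scale=1.0, nu=3.0)
print(betas[-1])
```

Here `eta` holds one learning rate per coefficient. In the fitted model these rates are estimated rather than fixed by hand.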
We will use a dynamic t regression to extract a dynamic β for a stock. Using t-distributed errors is more robust than the normality assumption that underlies a Kalman-filtered dynamic regression. The β captures the amount of systematic risk in the stock, i.e. the stock's relationship with the market.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pyflux as pf
from pandas_datareader import DataReader
from datetime import datetime

# Daily log returns for Amazon
a = DataReader('AMZN', 'yahoo', datetime(2012,1,1), datetime(2016,6,1))
a_returns = pd.DataFrame(np.diff(np.log(a['Adj Close'].values)))
a_returns.index = a.index.values[1:]
a_returns.columns = ["Amazon Returns"]

# Daily log returns for the market proxy (SPY)
spy = DataReader('SPY', 'yahoo', datetime(2012,1,1), datetime(2016,6,1))
spy_returns = pd.DataFrame(np.diff(np.log(spy['Adj Close'].values)))
spy_returns.index = spy.index.values[1:]
spy_returns.columns = ['S&P500 Returns']

# Daily risk-free rate derived from the 1-month Treasury series
one_mon = DataReader('DGS1MO', 'fred',datetime(2012,1,1), datetime(2016,6,1))
one_day = np.log(1+one_mon)/365

# Align the three series on common dates and compute excess returns
returns = pd.concat([one_day,a_returns,spy_returns],axis=1).dropna()
excess_m = returns["Amazon Returns"].values - returns['DGS1MO'].values
excess_spy = returns["S&P500 Returns"].values - returns['DGS1MO'].values
final_returns = pd.DataFrame(np.transpose([excess_m,excess_spy, returns['DGS1MO'].values]))
final_returns.columns=["Amazon","SP500","Risk-free rate"]
final_returns.index = returns.index

# Plot the excess returns alongside the risk-free rate
plt.figure(figsize=(15,5))
plt.title("Excess Returns")
x = plt.plot(final_returns);
plt.legend(iter(x), final_returns.columns);
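If the remote data sources are not reachable, a simulated stand-in with the same column names can be used to try out the remaining steps. Everything in this sketch is an assumption made for illustration: the values are random draws, not market data, and `synthetic_returns` is a hypothetical name.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Business-day index over roughly the same period as the example
dates = pd.bdate_range("2012-01-04", "2016-06-01")
n = len(dates)

# Simulated market returns and a slowly varying "true" beta (both made up)
market = 0.01 * rng.standard_normal(n)
true_beta = 1.0 + 0.3 * np.sin(np.linspace(0.0, 3.0 * np.pi, n))

# Heavy-tailed idiosyncratic noise, so that t errors are a sensible choice
noise = 0.01 * rng.standard_t(df=3, size=n)
stock = true_beta * market + noise

synthetic_returns = pd.DataFrame(
    {"Amazon": stock, "SP500": market, "Risk-free rate": np.zeros(n)},
    index=dates,
)
print(synthetic_returns.head())
```

Passing `synthetic_returns` in place of `final_returns` should let the later steps run, although the estimates will not match the output shown for the real data.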
We can fit a GAS Regression model with a t() family:
model = pf.GASReg('Amazon ~ SP500', data=final_returns, family=pf.t())
Next we estimate the latent variables. For this example we will use a Maximum Likelihood estimate $z^{MLE}$:
x = model.fit()
x.summary()
t GAS Regression
======================================== =================================================
Dependent Variable: Amazon               Method: MLE
Start Date: 2012-01-04 00:00:00          Log Likelihood: 3158.435
End Date: 2016-06-01 00:00:00            AIC: -6308.87
Number of observations: 1101             BIC: -6288.8541
==========================================================================================
Latent Variable           Estimate   Std Error  z        P>|z|    95% C.I.
========================= ========== ========== ======== ======== ========================
Scale 1                   0.0
Scale SP500               0.0474
t Scale                   0.0095
v                         2.8518
==========================================================================================
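As a quick check on the summary, the information criteria can be recomputed from the reported log likelihood. The sketch assumes the standard definitions AIC = 2k − 2 log L and BIC = k log(n) − 2 log L, with k = 4 latent variables (the four rows of the table) and n = 1101 observations.

```python
import math

log_likelihood = 3158.435  # from the summary table
n_obs = 1101               # number of observations
n_latent = 4               # Scale 1, Scale SP500, t Scale, v

aic = 2 * n_latent - 2 * log_likelihood
bic = n_latent * math.log(n_obs) - 2 * log_likelihood

print(round(aic, 2), round(bic, 4))
```

Under these assumptions both values agree with the AIC and BIC reported in the table.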
We can plot the fit with plot_fit:
model.plot_fit(intervals=False,figsize=(15,15))
One advantage of a GAS regression over a Kalman-filtered dynamic linear regression is that, with t errors, the GAS regression is more robust to outliers. We do not reproduce the whole analysis here, but for the same data, the filtered estimates are compared below:
Class Description ----------
Creal, D; Koopman, S.J.; Lucas, A. (2013). Generalized Autoregressive Score Models with Applications. Journal of Applied Econometrics, 28(5), 777–795. doi:10.1002/jae.1279.
Harvey, A.C. (2013). Dynamic Models for Volatility and Heavy Tails: With Applications to Financial and Economic Time Series. Cambridge University Press.