# Generalized Autoregressive Conditional Heteroskedasticity model - GARCH

GARCH extends ARCH by adding past conditional variances to the variance equation.

The motivation is volatility clustering: periods of high variance tend to be followed by high variance, and periods of low variance by low variance.

Including the conditional variance from the previous period gives the model a benchmark for what to expect today.


Var(y<sub>t</sub> | y<sub>t - 1</sub>) = Ω + α<sub>1</sub> * ε<sup>2</sup><sub>t - 1</sub> + β<sub>1</sub> * σ<sup>2</sup><sub>t - 1</sub>

Var(y<sub>t</sub> | y<sub>t - 1</sub>) - the variance today is conditional on the values of the variable yesterday

Ω - constant

α<sub>1</sub> - numeric coefficient for the squared residual for the past period

ε<sup>2</sup><sub>t - 1</sub> - squared residual for the past period 

β<sub>1</sub> - numeric coefficient for the conditional variance from last period

σ<sup>2</sup><sub>t - 1</sub> - conditional variance from last period
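To make the equation concrete, the following minimal sketch evaluates it one step at a time. The function name `garch11_step` and all numeric values are illustrative assumptions, not estimates from the data used later in this notebook.

```python
def garch11_step(omega, alpha, beta, prev_resid, prev_var):
    """Evaluate the GARCH(1,1) variance equation for one period."""
    return omega + alpha * prev_resid ** 2 + beta * prev_var


# Hypothetical coefficients, chosen only for illustration.
omega, alpha, beta = 0.01, 0.08, 0.90

# Same starting variance, small versus large residual yesterday.
var_after_calm = garch11_step(omega, alpha, beta, prev_resid=0.5, prev_var=1.0)
var_after_shock = garch11_step(omega, alpha, beta, prev_resid=3.0, prev_var=1.0)

# One period after the shock the residual is small again, but the
# beta term carries the elevated variance forward (volatility clustering).
var_next = garch11_step(omega, alpha, beta, prev_resid=0.5, prev_var=var_after_shock)

print(f"after calm day:  {var_after_calm:.3f}")
print(f"after shock day: {var_after_shock:.3f}")
print(f"day after shock: {var_next:.3f}")
```

With these made-up inputs the variance is about 0.93 after a calm day, about 1.63 after the shock, and still about 1.50 one day later, even though the residual has returned to a small value.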


GARCH(1, 1) - the first order is the ARCH order (number of past squared residuals ε<sup>2</sup>), the second is the GARCH order (number of past conditional variances σ<sup>2</sup>)

GARCH component ≈ AR component

ARCH component ≈ MA component

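GARCH(p, q) - in general, p past squared residuals and q past conditional variances; the mean and the variance are modeled by separate equations: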

μ<sub>t</sub> = ARMA(p, q)

σ<sup>2</sup><sub>t</sub> = GARCH(p, q)

GARCH is similar to ARMA because it includes past values and past errors (conditional variances and squared residuals)
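One common way to make this analogy more precise, sketched here for the GARCH(1,1) case only, is to define the variance "surprise" ν<sub>t</sub> as the gap between the squared residual and its conditional variance, and substitute it into the variance equation. The symbol ν<sub>t</sub> is introduced here for illustration and does not appear elsewhere in this notebook.

$$
\begin{aligned}
\nu_t &= \varepsilon_t^2 - \sigma_t^2 \\
\sigma_t^2 &= \Omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \\
\varepsilon_t^2 &= \Omega + (\alpha_1 + \beta_1)\,\varepsilon_{t-1}^2 + \nu_t - \beta_1 \nu_{t-1}
\end{aligned}
$$

The last line follows from replacing each σ<sup>2</sup> in the second line with ε<sup>2</sup> - ν. It has the shape of an ARMA(1,1) equation for the squared residuals: one lag of the series itself and one lag of its error term.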


In [4]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.graphics.tsaplots as sgt
import statsmodels.tsa.stattools as sts
from statsmodels.tsa.arima.model import ARIMA  # the old statsmodels.tsa.arima_model path is deprecated
from scipy.stats.distributions import chi2
from arch import arch_model
import seaborn as sns
sns.set() 

In [2]:
raw_csv_data = pd.read_csv('Index2018.csv')
df_comp = raw_csv_data.copy()
df_comp.date = pd.to_datetime(df_comp.date, dayfirst=True)
df_comp.set_index('date', inplace=True)
df_comp = df_comp.asfreq('b')
df_comp = df_comp.ffill()  # forward-fill gaps; fillna(method='ffill') is deprecated in recent pandas

In [3]:
df_comp['market_value'] = df_comp.ftse

In [5]:
del df_comp['spx']
del df_comp['dax']
del df_comp['ftse']
del df_comp['nikkei']
size = int(len(df_comp) * 0.8)
df, df_test = df_comp.iloc[:size], df_comp.iloc[size:]

In [6]:
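def LLR_test(mod_1, mod_2, DF=1):
    """Log-likelihood ratio test: mod_1 is the simpler (nested) model, DF the number of extra parameters in mod_2."""
    L1 = mod_1.fit().llf  # log-likelihood of the simpler model
    L2 = mod_2.fit().llf  # log-likelihood of the more complex model
    LR = 2 * (L2 - L1)  # test statistic
    p = chi2.sf(LR, DF).round(3)  # p-value from the chi-squared survival function
    return p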

In [7]:
df = df.copy()  # explicit copy of the training slice, avoids pandas' SettingWithCopyWarning
df['returns'] = df.market_value.pct_change(1) * 100


## Simple GARCH model

Fit a GARCH(1,1) model with serially uncorrelated data.

In [8]:
model_garch_1_1 = arch_model(df.returns[1:], mean='Constant', vol='GARCH', p=1, q=1)
results_garch_1_1 = model_garch_1_1.fit()
results_garch_1_1.summary()

# this GARCH(1,1) model outperforms ARCH(12) examined earlier

Iteration:      1,   Func. Count:      6,   Neg. LLF: 6579303469.390623
Iteration:      2,   Func. Count:     15,   Neg. LLF: 2701100877.2298183
Iteration:      3,   Func. Count:     23,   Neg. LLF: 7009.030632045198
Iteration:      4,   Func. Count:     29,   Neg. LLF: 7024.035884053223
Iteration:      5,   Func. Count:     35,   Neg. LLF: 7010.712869697414
Iteration:      6,   Func. Count:     41,   Neg. LLF: 6975.41810662336
Iteration:      7,   Func. Count:     47,   Neg. LLF: 7092.271036620865
Iteration:      8,   Func. Count:     53,   Neg. LLF: 6973.879267693578
Iteration:      9,   Func. Count:     59,   Neg. LLF: 6970.088049128948
Iteration:     10,   Func. Count:     64,   Neg. LLF: 6970.058478416067
Iteration:     11,   Func. Count:     69,   Neg. LLF: 6970.058367475404
Iteration:     12,   Func. Count:     74,   Neg. LLF: 6970.058366189876
Iteration:     13,   Func. Count:     78,   Neg. LLF: 6970.058366189162
Optimization terminated successfully    (Exit mode 0)
          

| Field | Value | Field | Value |
|---|---|---|---|
| Dep. Variable: | returns | R-squared: | 0.0 |
| Mean Model: | Constant Mean | Adj. R-squared: | 0.0 |
| Vol Model: | GARCH | Log-Likelihood: | -6970.06 |
| Distribution: | Normal | AIC: | 13948.1 |
| Method: | Maximum Likelihood | BIC: | 13974.2 |
| | | No. Observations: | 5020 |
| Date: | Sun, Oct 31 2021 | Df Residuals: | 5019 |
| Time: | 19:57:51 | Df Model: | 1 |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| mu | 0.0466 | 1.183e-02 | 3.939 | 8.187e-05 | [2.342e-02, 6.981e-02] |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| omega | 0.0109 | 3.004e-03 | 3.640 | 2.724e-04 | [5.048e-03, 1.682e-02] |
| alpha[1] | 0.0835 | 1.071e-02 | 7.794 | 6.476e-15 | [6.249e-02, 0.104] |
| beta[1] | 0.9089 | 1.148e-02 | 79.168 | 0.000 | [0.886, 0.931] |
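As a rough reading of the fitted coefficients, the sketch below plugs the rounded estimates from the summary above into three standard GARCH(1,1) quantities: persistence (alpha + beta), the implied long-run variance (omega / (1 - alpha - beta)), and the half-life of a variance shock. The variable names are illustrative, and because the inputs are rounded the results are only approximate.

```python
import math

# Rounded estimates copied from the GARCH(1,1) summary above.
omega, alpha, beta = 0.0109, 0.0835, 0.9089

# Persistence: the share of today's variance that carries over to tomorrow.
persistence = alpha + beta

# Long-run (unconditional) variance, defined only when persistence < 1.
long_run_variance = omega / (1 - persistence)

# Number of periods for a variance shock to decay by half.
half_life = math.log(0.5) / math.log(persistence)

print(f"persistence:       {persistence:.4f}")
print(f"long-run variance: {long_run_variance:.2f}")
print(f"half-life (days):  {half_life:.0f}")
```

A persistence of about 0.99 means volatility shocks die out slowly: on these numbers it takes roughly 90 business days for half of a shock to fade.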


## Higher-lag GARCH models

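In practice, higher-order GARCH models rarely outperform GARCH(1,1) when modeling the variance of market returns.

This is due to the recursive way the conditional variance is computed: σ<sup>2</sup><sub>t - 1</sub> already depends on every earlier squared residual, so additional lags carry little new information.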
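The recursion can be checked numerically. The sketch below computes the GARCH(1,1) variance recursively and then again in "unrolled" form, as a weighted sum of all past squared residuals with weights alpha * beta^k. The residuals are synthetic and the starting value is an assumption; only the coefficients are the rounded GARCH(1,1) estimates reported earlier in this notebook.

```python
import numpy as np

# Rounded GARCH(1,1) estimates from the summary earlier in this notebook.
omega, alpha, beta = 0.0109, 0.0835, 0.9089

# Synthetic residuals, used only to illustrate the algebra.
rng = np.random.default_rng(0)
eps = rng.standard_normal(300)

# Recursive form of the variance equation.
sigma2 = np.empty(len(eps))
sigma2[0] = omega / (1 - alpha - beta)  # assumed starting value
for t in range(1, len(eps)):
    sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]

# Unrolled form for the last period T: every past squared residual
# appears, with a weight that shrinks geometrically with its age.
T = len(eps) - 1
lags = np.arange(T)                       # k = 0, 1, ..., T-1
weights = alpha * beta ** lags            # weight on eps[T-1-k]**2
unrolled = (
    omega * (1 - beta ** T) / (1 - beta)  # accumulated constant
    + weights @ eps[T - 1 - lags] ** 2    # weighted past squared residuals
    + beta ** T * sigma2[0]               # fading starting value
)

print(np.isclose(unrolled, sigma2[T]))    # both forms agree
print(weights[:4].round(4))               # geometrically decaying weights
```

Since a single lag of the conditional variance already encodes the whole residual history, a second or third variance lag has little left to explain.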

In [10]:
model_garch_1_2 = arch_model(df.returns[1:], mean='Constant', vol='GARCH', p=1, q=2)
results_garch_1_2 = model_garch_1_2.fit()
results_garch_1_2.summary()

# beta[2] is estimated at practically zero with a p-value of 1. This means we have full multicollinearity due to the recursive relationship between conditional variances

Iteration:      1,   Func. Count:      7,   Neg. LLF: 136466865907811.38
Iteration:      2,   Func. Count:     17,   Neg. LLF: 623837004.6012297
Iteration:      3,   Func. Count:     25,   Neg. LLF: 10137.468310101489
Iteration:      4,   Func. Count:     33,   Neg. LLF: 7008.431369262152
Iteration:      5,   Func. Count:     40,   Neg. LLF: 6974.174338061595
Iteration:      6,   Func. Count:     46,   Neg. LLF: 6971.512115431834
Iteration:      7,   Func. Count:     52,   Neg. LLF: 6970.618692971883
Iteration:      8,   Func. Count:     58,   Neg. LLF: 6973.873501261716
Iteration:      9,   Func. Count:     65,   Neg. LLF: 6970.063569109683
Iteration:     10,   Func. Count:     71,   Neg. LLF: 6970.058392236019
Iteration:     11,   Func. Count:     77,   Neg. LLF: 6970.058367067028
Iteration:     12,   Func. Count:     83,   Neg. LLF: 6970.058366267055
Optimization terminated successfully    (Exit mode 0)
            Current function value: 6970.058366267055
            Iterations: 12

| Field | Value | Field | Value |
|---|---|---|---|
| Dep. Variable: | returns | R-squared: | 0.0 |
| Mean Model: | Constant Mean | Adj. R-squared: | 0.0 |
| Vol Model: | GARCH | Log-Likelihood: | -6970.06 |
| Distribution: | Normal | AIC: | 13950.1 |
| Method: | Maximum Likelihood | BIC: | 13982.7 |
| | | No. Observations: | 5020 |
| Date: | Sun, Oct 31 2021 | Df Residuals: | 5019 |
| Time: | 20:05:28 | Df Model: | 1 |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| mu | 0.0466 | 1.184e-02 | 3.938 | 8.219e-05 | [2.341e-02, 6.982e-02] |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| omega | 0.0109 | 2.908e-03 | 3.760 | 1.696e-04 | [5.236e-03, 1.663e-02] |
| alpha[1] | 0.0835 | 1.189e-02 | 7.019 | 2.233e-12 | [6.017e-02, 0.107] |
| beta[1] | 0.9089 | 0.188 | 4.845 | 1.269e-06 | [0.541, 1.277] |
| beta[2] | 3.7130e-09 | 0.180 | 2.065e-08 | 1.000 | [-0.352, 0.352] |


In [11]:
model_garch_1_3 = arch_model(df.returns[1:], mean='Constant', vol='GARCH', p=1, q=3)
results_garch_1_3 = model_garch_1_3.fit()
results_garch_1_3.summary()

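# beta[2] and beta[3] again have p-values of 1; with the redundant lags included, none of the variance coefficients is significant at the 5% level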

Iteration:      1,   Func. Count:      8,   Neg. LLF: 48216.80281467884
Iteration:      2,   Func. Count:     20,   Neg. LLF: 30197.37469083751
Iteration:      3,   Func. Count:     31,   Neg. LLF: 645132346.733002
Iteration:      4,   Func. Count:     39,   Neg. LLF: 1579678655.9946866
Iteration:      5,   Func. Count:     47,   Neg. LLF: 7044.915118311412
Iteration:      6,   Func. Count:     55,   Neg. LLF: 7035.999098115932
Iteration:      7,   Func. Count:     63,   Neg. LLF: 6984.463705447777
Iteration:      8,   Func. Count:     71,   Neg. LLF: 6974.307547265434
Iteration:      9,   Func. Count:     79,   Neg. LLF: 7374.448309463874
Iteration:     10,   Func. Count:     88,   Neg. LLF: 6973.180837188822
Iteration:     11,   Func. Count:     96,   Neg. LLF: 6970.1608624628025
Iteration:     12,   Func. Count:    103,   Neg. LLF: 6970.09228427
Iteration:     13,   Func. Count:    110,   Neg. LLF: 6970.059719305382
Iteration:     14,   Func. Count:    117,   Neg. LLF: 6970.05867464

| Field | Value | Field | Value |
|---|---|---|---|
| Dep. Variable: | returns | R-squared: | 0.0 |
| Mean Model: | Constant Mean | Adj. R-squared: | 0.0 |
| Vol Model: | GARCH | Log-Likelihood: | -6970.06 |
| Distribution: | Normal | AIC: | 13952.1 |
| Method: | Maximum Likelihood | BIC: | 13991.2 |
| | | No. Observations: | 5020 |
| Date: | Sun, Oct 31 2021 | Df Residuals: | 5019 |
| Time: | 20:05:48 | Df Model: | 1 |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| mu | 0.0466 | 1.179e-02 | 3.954 | 7.683e-05 | [2.351e-02, 6.972e-02] |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| omega | 0.0109 | 8.168e-03 | 1.339 | 0.181 | [-5.074e-03, 2.694e-02] |
| alpha[1] | 0.0835 | 6.069e-02 | 1.376 | 0.169 | [-3.546e-02, 0.202] |
| beta[1] | 0.9089 | 2.151 | 0.422 | 0.673 | [-3.308, 5.125] |
| beta[2] | 0.0000 | 3.380 | 0.000 | 1.000 | [-6.625, 6.625] |
| beta[3] | 7.3918e-13 | 1.296 | 5.704e-13 | 1.000 | [-2.540, 2.540] |


In [12]:
model_garch_2_1 = arch_model(df.returns[1:], mean='Constant', vol='GARCH', p=2, q=1)
results_garch_2_1 = model_garch_2_1.fit()
results_garch_2_1.summary()


Iteration:      1,   Func. Count:      7,   Neg. LLF: 159537792891988.94
Iteration:      2,   Func. Count:     17,   Neg. LLF: 1848307647.4642267
Iteration:      3,   Func. Count:     25,   Neg. LLF: 10354.116095029069
Iteration:      4,   Func. Count:     33,   Neg. LLF: 7005.361877757425
Iteration:      5,   Func. Count:     40,   Neg. LLF: 8793.711867692436
Iteration:      6,   Func. Count:     47,   Neg. LLF: 7019.706996857094
Iteration:      7,   Func. Count:     54,   Neg. LLF: 6973.161801614009
Iteration:      8,   Func. Count:     61,   Neg. LLF: 7010.720504720297
Iteration:      9,   Func. Count:     69,   Neg. LLF: 6967.937761572108
Iteration:     10,   Func. Count:     76,   Neg. LLF: 6967.731247505904
Iteration:     11,   Func. Count:     82,   Neg. LLF: 6967.731020076308
Iteration:     12,   Func. Count:     87,   Neg. LLF: 6967.7310200762295
Optimization terminated successfully    (Exit mode 0)
            Current function value: 6967.731020076308
            Iterations: 

| Field | Value | Field | Value |
|---|---|---|---|
| Dep. Variable: | returns | R-squared: | 0.0 |
| Mean Model: | Constant Mean | Adj. R-squared: | 0.0 |
| Vol Model: | GARCH | Log-Likelihood: | -6967.73 |
| Distribution: | Normal | AIC: | 13945.5 |
| Method: | Maximum Likelihood | BIC: | 13978.1 |
| | | No. Observations: | 5020 |
| Date: | Sun, Oct 31 2021 | Df Residuals: | 5019 |
| Time: | 20:06:42 | Df Model: | 1 |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| mu | 0.0466 | 1.187e-02 | 3.922 | 8.780e-05 | [2.329e-02, 6.982e-02] |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| omega | 0.0129 | 4.097e-03 | 3.158 | 1.589e-03 | [4.908e-03, 2.097e-02] |
| alpha[1] | 0.0547 | 1.665e-02 | 3.286 | 1.017e-03 | [2.208e-02, 8.735e-02] |
| alpha[2] | 0.0389 | 2.345e-02 | 1.659 | 9.709e-02 | [-7.056e-03, 8.488e-02] |
| beta[1] | 0.8974 | 1.712e-02 | 52.415 | 0.000 | [0.864, 0.931] |
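Because GARCH(1,1) is nested in GARCH(2,1), the two fits can also be compared with a log-likelihood ratio test. The sketch below does the arithmetic by hand from the log-likelihoods reported in the two summaries. It does not reuse `LLR_test`, since that helper reads `.llf` from statsmodels results, whereas `arch` result objects expose the value as `loglikelihood`. The variable names are illustrative.

```python
import math

# Log-likelihoods as reported in the GARCH(1,1) and GARCH(2,1) summaries.
llf_garch_1_1 = -6970.06
llf_garch_2_1 = -6967.73

# Likelihood-ratio statistic; GARCH(2,1) has one extra parameter (alpha[2]).
lr_stat = 2 * (llf_garch_2_1 - llf_garch_1_1)

# For 1 degree of freedom the chi-squared survival function reduces to
# erfc(sqrt(x / 2)), so this equals scipy's chi2.sf(lr_stat, 1).
p_value = math.erfc(math.sqrt(lr_stat / 2))

print(f"LR statistic: {lr_stat:.2f}")
print(f"p-value:      {p_value:.3f}")
```

The statistic is about 4.66 with a p-value of about 0.03, while the t-test on alpha[2] in the summary gives a p-value of about 0.097. The two diagnostics fall on either side of the 5% threshold, so the evidence for the extra ARCH lag is weak at best. The chi-squared approximation is itself only a rough guide here, because alpha[2] is constrained to be non-negative.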


In [15]:
model_garch_3_1 = arch_model(df.returns[1:], mean='Constant', vol='GARCH', p=3, q=1)
results_garch_3_1 = model_garch_3_1.fit()
results_garch_3_1.summary()

# alpha[3] is estimated at practically zero with a p-value of 1, so the third ARCH lag is not significant

Iteration:      1,   Func. Count:      8,   Neg. LLF: 150444968190829.7
Iteration:      2,   Func. Count:     19,   Neg. LLF: 2703507039.552999
Iteration:      3,   Func. Count:     28,   Neg. LLF: 10659.739250271865
Iteration:      4,   Func. Count:     37,   Neg. LLF: 7010.494991149147
Iteration:      5,   Func. Count:     45,   Neg. LLF: 6998.998850883414
Iteration:      6,   Func. Count:     53,   Neg. LLF: 7101.696868633609
Iteration:      7,   Func. Count:     61,   Neg. LLF: 6985.395347441217
Iteration:      8,   Func. Count:     69,   Neg. LLF: 6973.907033082958
Iteration:      9,   Func. Count:     77,   Neg. LLF: 7045.027676748644
Iteration:     10,   Func. Count:     85,   Neg. LLF: 6967.743972174774
Iteration:     11,   Func. Count:     92,   Neg. LLF: 6967.737171439584
Iteration:     12,   Func. Count:     99,   Neg. LLF: 6967.731036438667
Iteration:     13,   Func. Count:    106,   Neg. LLF: 6967.731025502168
Iteration:     14,   Func. Count:    113,   Neg. LLF: 6967.7310

| Field | Value | Field | Value |
|---|---|---|---|
| Dep. Variable: | returns | R-squared: | 0.0 |
| Mean Model: | Constant Mean | Adj. R-squared: | 0.0 |
| Vol Model: | GARCH | Log-Likelihood: | -6967.73 |
| Distribution: | Normal | AIC: | 13947.5 |
| Method: | Maximum Likelihood | BIC: | 13986.6 |
| | | No. Observations: | 5020 |
| Date: | Sun, Oct 31 2021 | Df Residuals: | 5019 |
| Time: | 20:07:33 | Df Model: | 1 |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| mu | 0.0466 | 1.187e-02 | 3.924 | 8.721e-05 | [2.330e-02, 6.982e-02] |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| omega | 0.0129 | 4.816e-03 | 2.687 | 7.217e-03 | [3.500e-03, 2.238e-02] |
| alpha[1] | 0.0547 | 1.665e-02 | 3.285 | 1.019e-03 | [2.207e-02, 8.736e-02] |
| alpha[2] | 0.0389 | 2.505e-02 | 1.553 | 0.120 | [-1.018e-02, 8.800e-02] |
| alpha[3] | 2.0879e-11 | 2.572e-02 | 8.117e-10 | 1.000 | [-5.041e-02, 5.041e-02] |
| beta[1] | 0.8974 | 2.245e-02 | 39.978 | 0.000 | [0.853, 0.941]  |
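To compare the five fits side by side, the sketch below recomputes AIC and BIC from the log-likelihoods and parameter counts reported in the summaries above, using the standard definitions AIC = 2k - 2·LL and BIC = k·ln(n) - 2·LL. The dictionary layout and helper names are illustrative assumptions; the numbers are copied from the summaries.

```python
import math

n_obs = 5020  # "No. Observations" in every summary above

# (log-likelihood, number of estimated parameters including mu and omega),
# copied from the summaries above.
models = {
    "GARCH(1,1)": (-6970.06, 4),
    "GARCH(1,2)": (-6970.06, 5),
    "GARCH(1,3)": (-6970.06, 6),
    "GARCH(2,1)": (-6967.73, 5),
    "GARCH(3,1)": (-6967.73, 6),
}


def aic(llf, k):
    """Akaike information criterion."""
    return 2 * k - 2 * llf


def bic(llf, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2 * llf


table = {name: (aic(llf, k), bic(llf, k, n_obs)) for name, (llf, k) in models.items()}

for name, (a, b) in table.items():
    print(f"{name}: AIC={a:.1f}  BIC={b:.1f}")

# Lower is better for both criteria.
best_aic = min(table, key=lambda name: table[name][0])
best_bic = min(table, key=lambda name: table[name][1])
print("lowest AIC:", best_aic)
print("lowest BIC:", best_bic)
```

On these numbers AIC narrowly prefers GARCH(2,1), while BIC, which penalizes extra parameters more heavily, prefers GARCH(1,1). The extra variance lags in GARCH(1,2) and GARCH(1,3) leave the log-likelihood unchanged and only add to the penalty.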
