# ARMA-GARCH Maximum Likelihood Estimation 

One way to fit an ARMA-GARCH model is in two stages: first estimate the conditional mean (ARMA) by maximum likelihood estimation (MLE), then estimate the conditional variance (GARCH) by MLE on the resulting residuals. Joint estimation is preferred, however. The first-stage ARMA fit implicitly assumes conditional homoskedasticity, and the second stage contradicts that assumption by explicitly modelling conditional heteroskedasticity with GARCH.

## An Example: GARCH(1,1) with Normal Distribution

Recall that a GARCH(1,1) model is defined as:

\begin{equation}
    \sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2,
\end{equation}

and the *log-likelihood* function for normally distributed innovations, up to an additive constant, is:

\begin{equation}
    L = - \frac{1}{2} \sum_{t=1}^T \left( \ln \sigma_t^2 + \left(\frac{\epsilon_{t}}{\sigma_t} \right)^2 \right).
\end{equation}
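
As a brief reminder of where this expression comes from (a sketch, assuming the innovations are conditionally Gaussian with mean zero and variance $\sigma_t^2$, and writing $\mathcal{F}_{t-1}$ for the information available at time $t-1$, a symbol introduced here only for illustration), the density of each observation is

\begin{equation}
    f(\epsilon_t \mid \mathcal{F}_{t-1}) = \frac{1}{\sqrt{2\pi\sigma_t^2}} \exp\left(-\frac{\epsilon_t^2}{2\sigma_t^2}\right),
\end{equation}

so summing the log-densities over $t = 1, \dots, T$ gives

\begin{equation}
    \ln \mathcal{L} = -\frac{T}{2}\ln(2\pi) - \frac{1}{2}\sum_{t=1}^T \left( \ln\sigma_t^2 + \frac{\epsilon_t^2}{\sigma_t^2} \right).
\end{equation}

The first term does not depend on the parameters, so dropping it leaves the maximiser unchanged.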

In [1]:
import numpy as np

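def garch(alpha0, alpha1, beta1, epsilon):
    """Return the GARCH(1,1) conditional variance series for residuals `epsilon`."""
    epsilon = np.asarray(epsilon, dtype=float)  # positional indexing, also for a pandas Series
    T = len(epsilon)
    sigma_2 = np.zeros(T)

    # initialize as unconditional variance (requires alpha1 + beta1 < 1)
    sigma_2[0] = alpha0 / (1 - alpha1 - beta1)
    for t in range(1, T):
        sigma_2[t] = alpha0 + alpha1*epsilon[t-1]**2 + beta1*sigma_2[t-1]

    return sigma_2

def garch_neg_loglike(params, epsilon):
    """Negative Gaussian log-likelihood (constant dropped), for minimization."""
    alpha0, alpha1, beta1 = params
    epsilon = np.asarray(epsilon, dtype=float)
    sigma_2 = garch(alpha0, alpha1, beta1, epsilon)
    # 0.5 * sum(ln sigma_t^2 + epsilon_t^2 / sigma_t^2) is the negative of L above
    return 0.5 * np.sum(np.log(sigma_2) + epsilon**2/sigma_2)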

In [2]:
# load data
import pandas as pd
from scipy.optimize import minimize

data = pd.read_csv('data/top10_logreturns.csv', index_col=0, parse_dates=True)['D05.SI'] * 100  # scaled for ease of optimization
bounds = tuple((0.0001, None) for i in range(3))
params_initial = (0.1, 0.05, 0.92)
# constraints: all parameters non-negative and alpha1 + beta1 <= 1
# note: scipy expects the key 'fun'; `cons` is kept for reference but is not passed to `minimize` below,
# so only the bounds are enforced in this run
cons = (
    {'type': 'ineq', 'fun': lambda x: np.array(x)},
    {'type': 'ineq', 'fun': lambda x: 1-x[1]-x[2]+0.00000000000001}
)

res = minimize(garch_neg_loglike, params_initial, args=(data,), bounds=bounds, options={'disp': True})
res.fun

1514.9967534819598
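
One way to sanity-check an estimator like this is to run it on simulated data with known parameters. The sketch below is an illustration rather than part of the original analysis: `simulate_garch11`, the seed, the sample size, the starting values and the "true" parameters are all assumptions chosen for the example. It also starts the variance recursion from the sample variance instead of the unconditional variance, so that the variance stays positive even if the optimizer tries $\alpha_1 + \beta_1 \geq 1$.

```python
import numpy as np
from scipy.optimize import minimize

def simulate_garch11(alpha0, alpha1, beta1, T, seed=0):
    """Simulate T residuals from a Gaussian GARCH(1,1) process (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(T)
    eps = np.zeros(T)
    sigma_2 = np.zeros(T)
    sigma_2[0] = alpha0 / (1 - alpha1 - beta1)  # start at the unconditional variance
    eps[0] = np.sqrt(sigma_2[0]) * z[0]
    for t in range(1, T):
        sigma_2[t] = alpha0 + alpha1 * eps[t-1]**2 + beta1 * sigma_2[t-1]
        eps[t] = np.sqrt(sigma_2[t]) * z[t]
    return eps

def neg_loglike(params, eps):
    """Negative Gaussian log-likelihood of GARCH(1,1), constant dropped."""
    alpha0, alpha1, beta1 = params
    sigma_2 = np.zeros(len(eps))
    sigma_2[0] = np.var(eps)  # assumption: sample variance as the starting value
    for t in range(1, len(eps)):
        sigma_2[t] = alpha0 + alpha1 * eps[t-1]**2 + beta1 * sigma_2[t-1]
    return 0.5 * np.sum(np.log(sigma_2) + eps**2 / sigma_2)

true_params = (0.1, 0.1, 0.8)   # illustrative values, not estimates from the data
start = (0.05, 0.05, 0.9)
eps = simulate_garch11(*true_params, T=2000)

bounds = [(1e-6, None), (1e-6, 1.0), (1e-6, 1.0)]
res = minimize(neg_loglike, start, args=(eps,), bounds=bounds, method="L-BFGS-B")

print("true:     ", true_params)
print("estimated:", np.round(res.x, 3))
```

If the estimates land far from the true values for a long simulated series, that usually points to a problem in the likelihood or the optimizer settings rather than in the data.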

## Another Example: ARMA(1,1) with Normal Distribution

We utilise the same log-likelihood function defined above, but this time we use an ARMA(1,1) model,

\begin{equation}
    \epsilon_t = r_t - \phi_0 - \phi_1 r_{t-1} - \theta_1 \epsilon_{t-1},
\end{equation}

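which has been rearranged to solve for $\epsilon_t$. In the code below, the variance in the log-likelihood is held constant at the sample variance of $r_t$.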

In [3]:
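def arma(phi0, phi1, theta1, r):
    """Return the ARMA(1,1) residual series for returns `r`."""
    r = np.asarray(r, dtype=float)  # positional indexing, also for a pandas Series
    T = len(r)
    epsilon = np.zeros(T)

    # no lagged values exist at t = 0, so start from the demeaned first observation
    epsilon[0] = r[0] - np.mean(r)
    for t in range(1, T):
        epsilon[t] = r[t] - phi0 - phi1*r[t-1] - theta1*epsilon[t-1]

    return epsilon

def arma_neg_loglike(params, r):
    """Negative Gaussian log-likelihood with constant variance, for minimization."""
    phi0, phi1, theta1 = params
    r = np.asarray(r, dtype=float)
    epsilon = arma(phi0, phi1, theta1, r)
    sigma_2 = r.var()  # conditional variance held fixed at the sample variance of r
    return 0.5 * np.sum(np.log(sigma_2) + epsilon**2/sigma_2)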
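
The ARMA likelihood above is defined but never minimized in this notebook, so here is a minimal usage sketch on simulated data. Everything in it is an assumption made for illustration: the helper names `arma11_residuals` and `arma11_neg_loglike`, the simulated series, the seed, the starting values and the bounds on $\phi_1$ and $\theta_1$ (added only to keep the residual recursion stable during the search).

```python
import numpy as np
from scipy.optimize import minimize

def arma11_residuals(phi0, phi1, theta1, r):
    """ARMA(1,1) residuals, started from the demeaned first observation."""
    eps = np.zeros(len(r))
    eps[0] = r[0] - np.mean(r)
    for t in range(1, len(r)):
        eps[t] = r[t] - phi0 - phi1 * r[t-1] - theta1 * eps[t-1]
    return eps

def arma11_neg_loglike(params, r):
    """Negative Gaussian log-likelihood with variance fixed at the sample variance."""
    eps = arma11_residuals(*params, r)
    sigma_2 = np.var(r)
    return 0.5 * np.sum(np.log(sigma_2) + eps**2 / sigma_2)

# simulate an ARMA(1,1) series with illustrative parameters
rng = np.random.default_rng(1)
T = 2000
phi0, phi1, theta1 = 0.05, 0.5, 0.3
shocks = rng.standard_normal(T)
r = np.zeros(T)
r[0] = phi0 / (1 - phi1) + shocks[0]  # start near the unconditional mean
for t in range(1, T):
    r[t] = phi0 + phi1 * r[t-1] + theta1 * shocks[t-1] + shocks[t]

start = (0.0, 0.0, 0.0)
bounds = [(None, None), (-0.99, 0.99), (-0.99, 0.99)]
res = minimize(arma11_neg_loglike, start, args=(r,), bounds=bounds, method="L-BFGS-B")

print("true:     ", (phi0, phi1, theta1))
print("estimated:", np.round(res.x, 3))
```

Because the variance is held fixed, minimizing this objective is equivalent to minimizing the sum of squared residuals, which is exactly the homoskedasticity assumption that the GARCH stage later contradicts.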

## Fitting ARMA-GARCH with MLE

Many sources suggest estimating the *ARMA* process first and then modelling its innovations with GARCH. However, this will most likely lead to inconsistent parameter estimates. Fitting an ARMA model on its own assumes that the *conditional variance* is constant, which is clearly not the case when the innovations are assumed to follow a GARCH process. The problem is especially acute for order determination of the ARMA model: the ACF and PACF confidence bounds are invalid when the residuals are GARCH-type.

Therefore, the ARMA and GARCH parameters should be estimated by MLE *simultaneously*. This simply involves substituting the *conditional mean* component from ARMA and the *conditional variance* component from GARCH into the log-likelihood equation and minimizing its negative with `scipy`. We should also account for additional lags in the ARMA component of the model. To do this, I implement a brute-force search over p and q in ARMA(p,q) and choose the order with the lowest AIC, the criterion reported in the fit output.
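
To make the substitution concrete, the following is a minimal sketch of a joint ARMA(1,1)-GARCH(1,1) negative log-likelihood. It is not taken from `armagarch.py`: the function names `arma_garch_neg_loglike` and `aic`, the starting values, the bounds and the placeholder white-noise data are all assumptions for illustration. Unlike the formula earlier in this document, it keeps the $\ln(2\pi)$ constant so that the AIC is on the usual scale.

```python
import numpy as np
from scipy.optimize import minimize

def arma_garch_neg_loglike(params, r):
    """Negative Gaussian log-likelihood of ARMA(1,1)-GARCH(1,1), constant included."""
    phi0, phi1, theta1, alpha0, alpha1, beta1 = params
    r = np.asarray(r, dtype=float)
    T = len(r)
    eps = np.zeros(T)
    sigma_2 = np.zeros(T)
    eps[0] = r[0] - np.mean(r)   # assumption: demeaned first observation
    sigma_2[0] = np.var(r)       # assumption: sample variance as the starting value
    for t in range(1, T):
        # conditional mean (ARMA) and conditional variance (GARCH) updated together
        eps[t] = r[t] - phi0 - phi1 * r[t-1] - theta1 * eps[t-1]
        sigma_2[t] = alpha0 + alpha1 * eps[t-1]**2 + beta1 * sigma_2[t-1]
    return 0.5 * np.sum(np.log(2 * np.pi) + np.log(sigma_2) + eps**2 / sigma_2)

def aic(neg_loglike_value, n_params):
    """AIC = 2k - 2 ln L, written in terms of the negative log-likelihood."""
    return 2 * n_params + 2 * neg_loglike_value

# placeholder data: replace with the actual return series
rng = np.random.default_rng(42)
r = rng.standard_normal(1000)

start = (0.0, 0.0, 0.0, 0.1, 0.05, 0.9)
bounds = [(None, None), (-0.99, 0.99), (-0.99, 0.99),
          (1e-6, None), (1e-6, 1.0), (1e-6, 1.0)]
res = minimize(arma_garch_neg_loglike, start, args=(r,), bounds=bounds, method="L-BFGS-B")

print("AIC:", aic(res.fun, len(res.x)))
```

An order search would repeat this fit for each candidate (p, q), with the residual recursion generalized to p autoregressive and q moving-average lags, and keep the specification with the smallest AIC.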

See `armagarch.py` for the VaR model implementation.

In [18]:
import armagarch as ag

model = ag.VaRModel()

In [24]:
X = pd.read_csv('data/top10_logreturns.csv', index_col=0, parse_dates=True)['D05.SI'].values
model.fit(X, max_p=3, max_q=3, verbose=True, summary_stats=True)

Current best: ARMA(0,0)-GARCH(1,1), AIC = 8050.785085605261
Current best: ARMA(0,1)-GARCH(1,1), AIC = 8044.969154630786

Order determination complete with p = 0 and q = 1
AIC = 8044.969154630786
Parameter   Estimate   Std. Err.     T-stat   p-value
c           0.053232    0.019489    2.73137   0.00635
theta0      0.059114    0.022267    2.65475   0.00798
omega       0.057600    0.037966    1.51716   0.12934
alpha       0.133531    0.051933    2.57123   0.01019
beta        0.826876    0.074948   11.03270   0.00000


## Student's t-distribution

The Student's t-distribution, standardised to have variance $\sigma_t^2$, has the density function:

\begin{equation}
f(\epsilon_t) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi(\nu-2)\sigma_t^2}} \left(1 + \frac{\epsilon_t^2}{(\nu-2)\sigma_t^2}\right)^{-\frac{\nu+1}{2}},
\end{equation}

where $\Gamma$ is the gamma function defined as:

\begin{equation}
\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t} dt,
\end{equation}

and $\nu > 2$ is the degrees-of-freedom parameter; the restriction ensures that the variance is finite.
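
As a sketch of how this density could enter the likelihood (an illustration only; it is not taken from `armagarch.py`, and the function name `t_neg_loglike` and the sample values are assumptions), the negative log-likelihood can be written with `scipy.special.gammaln`, with the normalising term $\sqrt{\pi(\nu-2)\sigma_t^2}$ written out explicitly. The example cross-checks the result against `scipy.stats.t` with scale $\sigma_t\sqrt{(\nu-2)/\nu}$, which gives a t-distributed variable with variance $\sigma_t^2$.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import t as student_t

def t_neg_loglike(epsilon, sigma_2, nu):
    """Negative log-likelihood under a Student's t density scaled to variance sigma_2 (nu > 2)."""
    epsilon = np.asarray(epsilon, dtype=float)
    sigma_2 = np.asarray(sigma_2, dtype=float)
    log_density = (
        gammaln((nu + 1) / 2) - gammaln(nu / 2)
        - 0.5 * np.log(np.pi * (nu - 2) * sigma_2)
        - (nu + 1) / 2 * np.log1p(epsilon**2 / ((nu - 2) * sigma_2))
    )
    return -np.sum(log_density)

# illustrative residuals and conditional variances (made-up numbers)
epsilon = np.array([-1.5, 0.2, 0.9, 3.0])
sigma_2 = np.array([1.0, 0.8, 1.2, 2.0])
nu = 5.0

# reference value from scipy.stats.t, rescaled so that the variance equals sigma_2
scale = np.sqrt(sigma_2 * (nu - 2) / nu)
reference = -np.sum(student_t.logpdf(epsilon, df=nu, scale=scale))

print(np.isclose(t_neg_loglike(epsilon, sigma_2, nu), reference))
```

In a joint fit, $\nu$ would be estimated as one more parameter alongside the ARMA and GARCH coefficients, with a lower bound slightly above 2.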

In [25]:
t_model = ag.VaRModel(llh_func='t')
t_model.fit(X, max_p=2, max_q=2, verbose=True, summary_stats=True)

Current best: ARMA(0,0)-GARCH(1,1), AIC = 8015.785461729878
Current best: ARMA(0,1)-GARCH(1,1), AIC = 8011.933621923644
Current best: ARMA(1,1)-GARCH(1,1), AIC = 8011.156622933455

Order determination complete with p = 1 and q = 1
AIC = 8011.156622933455
Parameter    Estimate   Std. Err.     T-stat   p-value
c            0.075817    0.031224    2.42814   0.01524
phi0        -0.657009    0.163684    4.01389   0.00006
theta0       0.699807    0.154447    4.53106   0.00001
omega        0.127998    0.052444    2.44066   0.01472
alpha        0.165976    0.051873    3.19968   0.00139
beta         0.822651    0.055870   14.72436   0.00000
v            3.000000    0.073827   40.63547   0.00000
