Wold's representation assumes homoskedastic white noise innovations. This chapter explores a new idea: conditional dynamics characterized by heteroskedasticity, or time-varying volatility.

Many applications take advantage of `time-varying volatility`. Financial asset returns, for example, behave like dynamic systems that switch between calm and turbulent periods.

We will see that `time-varying volatility` has broad implications for interval and density forecasts: interval widths and density forecast spreads vary over time.

The conditional distribution is given by $\epsilon_t | \Omega_{t-1} \sim (0, \sigma_t^2)$, where $\Omega_{t-1} = \{\epsilon_{t-1}, \epsilon_{t-2}, \ldots\}$. The conditional variance $\sigma_t^2$ will in general evolve as $\Omega_{t-1}$ evolves.

-------------
### The Basic ARCH process

Consider the general linear process,

$$ y_t = B(L)\epsilon_t $$
$$ B(L) = \sum_{i=0}^{\infty} b_i L^i $$
$$ \sum_{i=0}^{\infty} b_i^2 < \infty $$
$$ b_0 = 1 $$

where $\epsilon_t \sim WN(0, \sigma^2)$.

The conditional mean of y is time-varying: 

$$ E(y_t | \Omega_{t-1}) = \sum_{i=1}^{\infty} b_i \epsilon_{t-i} $$

where the information set is

$$ \Omega_{t-1} = \{\epsilon_{t-1}, \epsilon_{t-2}, \ldots\} $$

To allow for a time-varying conditional variance, we relax the strong white noise assumption to weak white noise with a particular `nonlinear dependence structure`. To model the conditional variance, we parameterize the innovation process in terms of its conditional density,

$$ \epsilon_t | \Omega_{t-1} $$

From now on, we suppose:

$$ \epsilon_t | \Omega_{t-1} \sim N(0, \sigma_t^2) $$
$$ \sigma_t^2 = \omega + \gamma(L) \epsilon_t^2 $$
$$ \omega > 0, \quad \gamma(L) = \sum_{i=1}^{p} \gamma_i L^i, \quad \gamma_i \geq 0, \quad \sum_{i=1}^{p} \gamma_i < 1. $$


Therefore, the innovation process has zero conditional mean and a conditional variance that depends linearly on $p$ past squared innovations. The stated regularity conditions are sufficient to ensure that the conditional and unconditional variances are positive and finite, and that $y_t$ is covariance stationary. Let's derive the first and second moments implied by this formulation.


The unconditional moments of innovations $\epsilon_t$ are constant and given by: 

$$ E(\epsilon_t) = 0 $$
$$ E(\epsilon_t - E(\epsilon_t))^2 = \dfrac{\omega}{1 - \sum_{i=1}^{p}\gamma_i } $$

Moreover, the innovations have zero conditional mean, $E(\epsilon_t | \Omega_{t-1}) = 0$, and a time-varying conditional variance:

$$ E( (\epsilon_t - E(\epsilon_t| \Omega_{t-1}))^2 | \Omega_{t-1} ) = \omega + \gamma (L) \epsilon_t^2. $$


Assembling the results for $y$, both the conditional mean and the conditional variance are time-varying:

$$ E(y_t | \Omega_{t-1}) = \sum_{i=1}^{\infty} b_i \epsilon_{t-i} $$
$$ E( (y_t - E(y_t| \Omega_{t-1}))^2 | \Omega_{t-1} ) = \omega + \gamma (L) \epsilon_t^2. $$

In the model above, $\epsilon_t$ is called an ARCH(p) process, and the full model is an infinite-order moving average with ARCH(p) innovations, where ARCH stands for autoregressive conditional heteroskedasticity.

ARCH processes are tailor-made for series with volatility clustering, where large changes tend to be followed by large changes and small changes by small ones. Finally, an ARCH process can serve as a model for the disturbance variance of a broader model or, if there are no conditional mean dynamics of interest, as a model for the whole series.
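To make the volatility clustering and the unconditional variance formula concrete, here is a minimal simulation sketch for the ARCH(1) case, $\sigma_t^2 = \omega + \gamma \epsilon_{t-1}^2$. The function names, the parameter values ($\omega = 0.7$, $\gamma = 0.3$), the seed, and the burn-in length are illustrative assumptions, not part of the model definition.

```python
import math
import random


def simulate_arch1(n, omega, gamma, seed=0, burn_in=500):
    """Simulate eps_t = sigma_t * z_t with sigma_t^2 = omega + gamma * eps_{t-1}^2."""
    rng = random.Random(seed)
    eps_prev = 0.0  # arbitrary start-up value (assumption), washed out by the burn-in
    eps, sigma2 = [], []
    for t in range(n + burn_in):
        s2 = omega + gamma * eps_prev ** 2       # conditional variance given the past
        e = math.sqrt(s2) * rng.gauss(0.0, 1.0)  # conditionally N(0, s2)
        if t >= burn_in:
            eps.append(e)
            sigma2.append(s2)
        eps_prev = e
    return eps, sigma2


def sample_variance(x):
    """Sample variance about the sample mean (divisor n)."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def lag1_autocorr(x):
    """Lag-1 sample autocorrelation."""
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den


if __name__ == "__main__":
    omega, gamma = 0.7, 0.3  # illustrative values
    eps, _ = simulate_arch1(50_000, omega, gamma, seed=1)
    print("theoretical unconditional variance:", omega / (1 - gamma))
    print("sample variance of eps:", sample_variance(eps))
    print("lag-1 autocorrelation of eps:", lag1_autocorr(eps))
    print("lag-1 autocorrelation of eps^2:", lag1_autocorr([e * e for e in eps]))
```

In a long simulated sample, the innovations themselves should look serially uncorrelated while their squares are positively autocorrelated, which is the clustering pattern described above.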

-------
### The GARCH process

A pure GARCH(p,q) process is given by:   
$y_t = \epsilon_t$ 


$$ \epsilon_t | \Omega_{t-1} \sim N(0, \sigma_t^2) $$
$$ \sigma_t^2 = \omega + \alpha(L) \epsilon_t^2 +  \beta(L) \sigma_t^2$$
$$ \alpha(L) = \sum_{i=1}^{p} \alpha_i L^i, \quad \beta(L) = \sum_{i=1}^{q} \beta_i L^i $$
$$ \omega > 0, \quad \alpha_i \geq 0, \quad \beta_i \geq 0,  \quad \sum_{i=1}^{p} \alpha_i + \sum_{i=1}^{q} \beta_i < 1.$$
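
As a worked special case, take GARCH(1,1), with $\alpha(L) = \alpha L$ and $\beta(L) = \beta L$. Repeatedly substituting the variance equation into itself, and assuming $0 \leq \beta < 1$ so the recursion converges, gives a sketch of how the lagged variance terms can be traded for an infinite sum of past squared innovations:

$$ \sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 $$
$$ \sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \left( \omega + \alpha \epsilon_{t-2}^2 + \beta \sigma_{t-2}^2 \right) $$
$$ \sigma_t^2 = \dfrac{\omega}{1 - \beta} + \alpha \sum_{i=1}^{\infty} \beta^{i-1} \epsilon_{t-i}^2 $$

The weights on past squared innovations decay geometrically and are governed by only two parameters, $\alpha$ and $\beta$.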

From the lag-operator notation, we can see that GARCH(p,q) is a restricted infinite-order ARCH process, and therefore a parsimonious approximation to it. Three important properties of GARCH processes follow.

1. The conditional variance is time-varying, given by:

$$ E( (\epsilon_t - E(\epsilon_t| \Omega_{t-1}))^2 | \Omega_{t-1} ) = \omega + \alpha (L) \epsilon_t^2 + \beta (L) \sigma_t^2 $$

2. The unconditional variance is constant, given by:

$$ E(\epsilon_t - E(\epsilon_t))^2 = \dfrac{\omega}{1 - \sum_{i=1}^{p}\alpha_i - \sum_{i=1}^{q}\beta_i} $$

3. The optimal h-step-ahead prediction is:

$$ E(\epsilon_{t+h} | \Omega_t) = 0, $$

the prediction error is

$$ \epsilon_{t+h} - E(\epsilon_{t+h} | \Omega_t) = \epsilon_{t+h}. $$

The conditional variance of this prediction error, $E(\epsilon_{t+h}^2 | \Omega_t)$, depends on both $h$ and $\Omega_t$.

4. Bonus: covariance stationary GARCH processes converge to normality under temporal aggregation.
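
The dependence of the forecast-error variance on both the horizon $h$ and the information set can be sketched for GARCH(1,1). Taking conditional expectations of the variance equation gives the recursion $E(\sigma_{t+h}^2 | \Omega_t) = \omega + (\alpha + \beta) E(\sigma_{t+h-1}^2 | \Omega_t)$ for $h \geq 2$, started from $\sigma_{t+1}^2$, which is known at time $t$. The function names and parameter values below are illustrative assumptions.

```python
def unconditional_variance(omega, alpha, beta):
    """Unconditional variance of a covariance stationary GARCH(1,1) process."""
    return omega / (1.0 - alpha - beta)


def garch11_variance_forecasts(omega, alpha, beta, sigma2_next, horizon):
    """Return [E(sigma^2_{t+1}|t), ..., E(sigma^2_{t+horizon}|t)].

    sigma2_next is the 1-step-ahead conditional variance, known at time t.
    """
    forecasts = [sigma2_next]
    for _ in range(horizon - 1):
        # Each further step pulls the forecast toward the unconditional variance.
        forecasts.append(omega + (alpha + beta) * forecasts[-1])
    return forecasts


if __name__ == "__main__":
    omega, alpha, beta = 0.1, 0.1, 0.8  # illustrative values
    # Same horizon grid, two different information sets (high vs. low current volatility).
    high = garch11_variance_forecasts(omega, alpha, beta, sigma2_next=4.0, horizon=10)
    low = garch11_variance_forecasts(omega, alpha, beta, sigma2_next=0.25, horizon=10)
    print("long-run variance:", unconditional_variance(omega, alpha, beta))
    print("starting in a volatile period:", [round(v, 3) for v in high])
    print("starting in a calm period:    ", [round(v, 3) for v in low])
```

Under these assumptions the forecasts approach the unconditional variance geometrically at rate $\alpha + \beta$, from above after a volatile period and from below after a calm one.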



--------------------
### GARCH extensions

Many extensions of GARCH have been proposed over the years. Let's look at three of the most popular.


1. Exogenous Variables. 
    It is possible to include exogenous variables in the conditional variance dynamics. We simply modify the standard GARCH volatility function by writing

$$ \sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma x_t$$


where $x$ is an exogenous variable. The extension to multiple exogenous variables is straightforward.

2. Threshold-GARCH.
    The model adds a dummy variable $D_{t-1}$ to the equation, equal to 1 when $\epsilon_{t-1} < 0$ and 0 otherwise, so that the variance responds differently to positive and negative returns. This asymmetric response is useful for modeling the "leverage effect" in stock returns.

$$ \sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \gamma \epsilon_{t-1}^2 D_{t-1} + \beta \sigma_{t-1}^2 $$

3. Exponential-GARCH.
    The model specifies the logarithm of the conditional variance, which allows volatility to be driven by both the size and the sign of shocks.
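
The asymmetric response of the threshold model can be illustrated with a one-step variance update. This is a sketch under two assumptions: a first-order specification with a coefficient `beta` on the lagged variance, and the convention that the dummy equals 1 for negative shocks. The function name and the numbers are hypothetical.

```python
def tgarch_next_variance(omega, alpha, gamma, beta, eps_prev, sigma2_prev):
    """One-step threshold-GARCH update of the conditional variance.

    The dummy d is 1 for a negative shock and 0 otherwise (assumed convention),
    so negative shocks get the extra weight gamma.
    """
    d = 1.0 if eps_prev < 0 else 0.0
    return omega + alpha * eps_prev ** 2 + gamma * eps_prev ** 2 * d + beta * sigma2_prev


if __name__ == "__main__":
    params = dict(omega=0.05, alpha=0.05, gamma=0.10, beta=0.80)  # illustrative values
    after_gain = tgarch_next_variance(eps_prev=2.0, sigma2_prev=1.0, **params)
    after_loss = tgarch_next_variance(eps_prev=-2.0, sigma2_prev=1.0, **params)
    print("variance after a +2 shock:", after_gain)
    print("variance after a -2 shock:", after_loss)
```

With a positive `gamma`, a negative shock raises next-period variance by more than a positive shock of the same size.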


---------
### Final considerations on GARCH Models.

Maximum likelihood is the standard choice for estimating GARCH models. No closed-form expression exists for the GARCH maximum likelihood estimator, so we maximize the likelihood numerically.
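
As a rough sketch of what numerical maximization involves, the code below writes out the Gaussian negative log-likelihood of a GARCH(1,1) model and minimizes it with a deliberately crude grid search, which stands in for a proper numerical optimizer. The start-up value for the variance recursion (the sample variance), the grid, the simulated data, and all function names are assumptions made for illustration.

```python
import itertools
import math
import random


def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate a pure GARCH(1,1) series to have data to fit (illustrative only)."""
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    eps = []
    for _ in range(n):
        e = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        eps.append(e)
        sigma2 = omega + alpha * e * e + beta * sigma2
    return eps


def garch11_neg_loglik(params, eps):
    """Gaussian negative log-likelihood; returns inf outside the admissible region."""
    omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return float("inf")
    sigma2 = sum(e * e for e in eps) / len(eps)  # start-up value (assumption)
    nll = 0.0
    for e in eps:
        nll += 0.5 * (math.log(2 * math.pi) + math.log(sigma2) + e * e / sigma2)
        sigma2 = omega + alpha * e * e + beta * sigma2  # variance for the next period
    return nll


def grid_search(eps, omegas, alphas, betas):
    """Pick the grid point with the smallest negative log-likelihood."""
    best = min(itertools.product(omegas, alphas, betas),
               key=lambda p: garch11_neg_loglik(p, eps))
    return best, garch11_neg_loglik(best, eps)


if __name__ == "__main__":
    data = simulate_garch11(5000, omega=0.1, alpha=0.1, beta=0.8, seed=42)
    best, nll = grid_search(data, [0.05, 0.1, 0.2], [0.05, 0.1, 0.2], [0.7, 0.8, 0.85])
    print("best grid point (omega, alpha, beta):", best)
    print("negative log-likelihood at that point:", nll)
```

In practice the same objective function would be handed to a gradient-based or derivative-free optimizer; the grid only makes the idea of searching over the parameters explicit.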

The optimal h-step-ahead volatility forecast is built from the optimal 1-step-ahead forecast, which in turn is computed from values lagged one period. Because of the volatility dynamics, the 95% interval for an h-step-ahead forecast has a time-varying width: $y_{t+h,t} \pm 1.96\sigma_{t+h,t}$
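
A small sketch of the interval computation, read as a symmetric band around the point forecast under conditional normality. The function name and the inputs are hypothetical; the variance forecasts would come from a fitted volatility model.

```python
import math


def interval_forecasts(point_forecasts, variance_forecasts, z=1.96):
    """Return (lower, upper) bands y -/+ z * sigma for each forecast horizon."""
    return [
        (y - z * math.sqrt(v), y + z * math.sqrt(v))
        for y, v in zip(point_forecasts, variance_forecasts)
    ]


if __name__ == "__main__":
    # Hypothetical inputs: zero point forecasts, variance forecasts decaying after a volatile period.
    bands = interval_forecasts([0.0, 0.0, 0.0], [4.0, 2.25, 1.0])
    for h, (lo, hi) in enumerate(bands, start=1):
        print(f"h={h}: [{lo:.2f}, {hi:.2f}]")
```

The band is wide when the forecast variance is high and narrows as the variance forecast falls, unlike the constant-width intervals of a homoskedastic model.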

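One way to evaluate the adequacy of a fitted GARCH model is through the correlogram of the standardized returns, $\epsilon_t / \hat{\sigma}_t$, where $\hat{\sigma}_t$ is the conditional standard deviation from the fitted GARCH model. If the model is adequate, neither the standardized returns nor their squares should show remaining serial correlation.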
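
The diagnostic can be sketched for a GARCH(1,1) fit as follows: filter the conditional variances from the returns, standardize, and compute sample autocorrelations. The function names and the start-up value for the variance recursion (the sample variance) are assumptions; the parameters passed in are taken to be estimates obtained elsewhere.

```python
import math


def standardized_residuals(eps, omega, alpha, beta):
    """Divide each return by its GARCH(1,1) conditional standard deviation."""
    sigma2 = sum(e * e for e in eps) / len(eps)  # start-up value (assumption)
    z = []
    for e in eps:
        z.append(e / math.sqrt(sigma2))
        sigma2 = omega + alpha * e * e + beta * sigma2  # variance for the next period
    return z


def autocorrelations(x, max_lag):
    """Sample autocorrelations at lags 1..max_lag (the correlogram)."""
    n = len(x)
    m = sum(x) / n
    den = sum((v - m) ** 2 for v in x)
    return [
        sum((x[t] - m) * (x[t - k] - m) for t in range(k, n)) / den
        for k in range(1, max_lag + 1)
    ]
```

Given a list of returns and fitted parameters, one would compare `autocorrelations([e * e for e in returns], 10)` with `autocorrelations([v * v for v in z], 10)`, where `z` is the output of `standardized_residuals`. Sizable autocorrelations in the second correlogram suggest that the fitted model has not captured all of the volatility dynamics.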