##### <p style="text-align: center; font-size: 300%"> ARCH Models in Julia </p>
<p style="text-align: center; font-size: 200%"> Simon A. Broda </p>
<p style="text-align: center; font-size: 100%"> University of Zurich and University of Amsterdam <br>
<a href="mailto:simon.broda@uzh.ch">simon.broda@uzh.ch</a> </p>
<img src="LOGO_ERC-FLAG_EU_.jpg" alt="LOGO" style="display:block; margin-left: auto; margin-right: auto; width: 20%;">
<p style="text-align: center; font-size: 100%"> This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No. 750559). </p>

# Following Along
* These slides are available at https://github.com/s-broda/brownbag2018, in the form of a [Jupyter notebook](http://jupyter.org/). 
* Jupyter notebooks contain live code. Cells are evaluated by pressing `shift-enter`. 
* You can follow along without installing Julia locally by running the notebook on Google Colab (https://colab.research.google.com/; requires a Google account).
* Julia is not officially supported on Colab yet, so a workaround is needed: click on `File -> New Python 2 notebook`, paste the following into the new notebook, and execute the cell with `shift-enter`:
```bash
!curl -sSL "https://julialang-s3.julialang.org/bin/linux/x64/1.0/julia-1.0.1-linux-x86_64.tar.gz"\
    -o julia.tar.gz
!tar -xzf julia.tar.gz -C /usr --strip-components 1
!rm -rf julia.tar.gz*
!julia -e 'using Pkg; pkg"add IJulia; precompile"'
```
* Wait for the code to execute (~ 1min), then click on `File -> Upload notebook`, choose `Github`, paste `https://github.com/s-broda/brownbag2018` into the search field, hit enter, and click on the filename once Colab has found the notebook. Optionally, click `Copy to Drive`.
* *Important*: If you see a warning when attempting to run code, uncheck *Reset all runtimes before running* before clicking `Run anyway`, or the above procedure will need to be repeated.

# Outline
* The Julia Language
* ARCH Models
* The `ARCH` Package
   * Usage
   * Benchmarks vs. Matlab

# The Julia Language
## General Information
* New programming language started at MIT.
* Designed with scientific computing in mind.
* Version 1.0 released in August 2018 after 9 years of development.
* Free and open source software. Available at https://julialang.org/.
* In the words of its creators,
> We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a
language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want
something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as
powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn,
yet keeps the most serious hackers happy. We want it interactive and we want it compiled.


## Highlights for Scientists
* Interactive REPL like Matlab, Python, R, etc. allows for exploratory analysis and rapid prototyping.
* Unlike these, Julia is JIT compiled, hence fast (typically within 2x of C).
* Syntax superficially similar to Matlab.
* Rich type system, multiple dispatch.
* Fast-growing eco-system with many state-of-the-art packages (e.g., `ForwardDiff.jl`, `DifferentialEquations.jl`, `JuMP`, and obviously now `ARCH.jl`).
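
As a rough illustration of the type system and multiple dispatch, the sketch below defines one generic function with a separate method per concrete type, plus a function whose method is chosen from the types of both arguments. The types `Innovation`, `Gaussian`, and `StudentT` and the functions `kurtosis` and `same_family` are hypothetical names made up for this example; they are not the types used by `ARCH.jl`.

```julia
# Hypothetical types for illustration only; not the types used by ARCH.jl.
abstract type Innovation end
struct Gaussian <: Innovation end
struct StudentT <: Innovation
    ν::Float64  # degrees of freedom, assumed > 4 so that the kurtosis exists
end

# One generic function, one method per concrete type.
kurtosis(::Gaussian) = 3.0
kurtosis(d::StudentT) = 3.0 + 6.0 / (d.ν - 4.0)

# The method is selected from the types of *both* arguments.
same_family(::Gaussian, ::Gaussian) = true
same_family(a::StudentT, b::StudentT) = a.ν == b.ν
same_family(::Innovation, ::Innovation) = false  # fallback for mixed pairs

println(kurtosis(Gaussian()))
println(kurtosis(StudentT(10.0)))
println(same_family(Gaussian(), StudentT(10.0)))
```

Adding a new distribution then only requires a new `struct` and the corresponding methods; existing code does not need to change.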


# ARCH Models
* Daily financial returns data exhibit a number of *stylized facts*:
  * Volatility clustering
  * Non-Gaussianity, fat tails
  * Leverage effects: negative returns increase future volatility
* Other types of data (e.g., changes in interest rates) exhibit similar phenomena.
* These effects are important in many areas in finance, in particular in risk management.
* [G]ARCH ([**G**eneralized] **A**uto**r**egressive **C**onditional **H**eteroskedasticity) models are the most popular tools for modelling them.

In [None]:
using Pkg
for package in ["BenchmarkTools", "Plots", "MATLAB"]
    if !haskey(Pkg.installed(), package)
        Pkg.add(package)
    end
end
if !isfile("returns.png") || !isfile("kde.png")
    for package in ["MarketData", "TimeSeries", "Distributions", "KernelDensity", "StatPlots"]
        if !haskey(Pkg.installed(), package)
            Pkg.add(package)
        end
    end
    using MarketData, TimeSeries, Plots, Distributions, KernelDensity, StatPlots
    r = percentchange(MarketData.AAPL[Symbol("Adj. Close")])
    p = plot(r)
    savefig("returns")
    plot(kde(getfield(r, :values)), label="Kernel Density")
    plot!(fit(Normal, getfield(r, :values)), label="Fitted Normal")
    savefig("kde")
end

## Example: volatility clustering in AAPL returns
<img src="returns.png" alt="RETURNS" style="display:block; margin-left: auto; margin-right: auto; width: 50%;">

## Example: fat tails in AAPL return density
<img src="kde.png" alt="Kernel Density" style="display:block; margin-left: auto; margin-right: auto; width: 50%;">



# (G)ARCH Models
* Basic setup: given a sample of financial returns $\{r_t\}_{t\in\{1,\ldots,T\}}$, decompose $r_t$ as
$$
r_t=\mu_t+\sigma_tz_t, \quad z_t\stackrel{i.i.d.}{\sim}(0,1),
$$
where $\mu_t\equiv\mathbb{E}[r_t\mid \mathcal{F}_{t-1}]$ and $\sigma_t^2\equiv \mathbb{E}[(r_t-\mu_t)^2\mid \mathcal{F}_{t-1}]$.
* Assume $\mu_t=0$ for simplicity. The focus is on the *volatility* $\sigma_t$. (G)ARCH models make $\sigma_t$ a function of *past* returns and variances. Examples: 

## Examples
* ARCH(q) (Engle, Ecta 1982):
$$\sigma_t^2=\omega+\sum_{i=1}^q \alpha_ir_{t-i}^2%,\quad \omega,\alpha_i>0,\quad \sum_{i=1}^q\alpha_i<1.
$$
* GARCH(p, q) (Bollerslev, JoE 1986)
$$\sigma_t^2=\omega+ \sum_{i=1}^p\beta_{i}\sigma_{t-i}^2 + \sum_{i=1}^q\alpha_ir_{t-i}^2%,\quad \omega,\alpha_i,\beta_i>0,\quad \sum_{i=1}^{\max p,q} \alpha_i+\beta_i<1.
$$
* EGARCH(o, p, q) (Nelson, Ecta 1991)
$$\log(\sigma_t^2)=\omega+\sum_{i=1}^o\gamma_{i}z_{t-i}+\sum_{i=1}^p\beta_i\log(\sigma_{t-i}^2)+\sum_{i=1}^q \alpha_i (|z_{t-i}|-\mathbb{E}|z_{t-i}|)%, \quad \sum_{i=1}^p \beta_i<0.
$$
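
To make the recursion concrete, here is a minimal sketch that simulates a zero-mean GARCH(1, 1) process in plain Julia. It assumes Gaussian innovations and starts the recursion at the unconditional variance $\omega/(1-\alpha-\beta)$; the function name `simulate_garch11` is made up for this example and is not part of `ARCH.jl`.

```julia
using Random

# Simulate T returns from a zero-mean GARCH(1, 1) with Gaussian innovations.
# Assumes ω > 0, α, β ≥ 0, and α + β < 1, so the unconditional variance exists.
function simulate_garch11(rng, ω, β, α, T)
    r = zeros(T)    # returns
    σ² = zeros(T)   # conditional variances
    σ²[1] = ω / (1 - α - β)              # start at the unconditional variance
    r[1] = sqrt(σ²[1]) * randn(rng)
    for t in 2:T
        σ²[t] = ω + β * σ²[t-1] + α * r[t-1]^2   # GARCH(1, 1) recursion
        r[t] = sqrt(σ²[t]) * randn(rng)          # r_t = σ_t z_t
    end
    return r, σ²
end

r, σ² = simulate_garch11(MersenneTwister(1), 1.0, 0.9, 0.05, 10^4)
println(sum(σ²) / length(σ²))  # average conditional variance of the sample
```

Each $\sigma_t^2$ depends on the previous variance and the previous squared return, which is what produces volatility clustering in the simulated series.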
 
  



## Estimation
* (G)ARCH models are usually estimated by maximum likelihood: with $f_z$ denoting the density of $z_t$,
$$\max \prod_t f(r_t\mid \mathcal{F}_{t-1})=\max \prod_t\frac{1}{\sigma_t}f_z(r_t/\sigma_t).$$
* Recursive nature of $\sigma_t$ means the computation cannot be "vectorized" $\Rightarrow$ loops.
* Julia is very well suited for this. Matlab (and the `rugarch` package for R) have to implement the likelihood in C.
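
A minimal sketch of such a loop, for the zero-mean GARCH(1, 1) model with Gaussian $z_t$: taking logs of the likelihood above gives $\sum_t -\tfrac12\left(\log 2\pi+\log\sigma_t^2+r_t^2/\sigma_t^2\right)$. Initializing $\sigma_1^2$ at the sample second moment is an assumption of this sketch, and `garch11_loglik` is a made-up name, not the function `ARCH.jl` uses internally.

```julia
# Gaussian log-likelihood of a zero-mean GARCH(1, 1), written as a plain loop.
# θ = (ω, β, α); initializing σ₁² at the sample second moment is an assumption.
function garch11_loglik(θ, r)
    ω, β, α = θ
    T = length(r)
    σ² = sum(abs2, r) / T   # initial conditional variance
    ll = 0.0
    for t in 1:T
        # each variance depends on the previous one, hence the loop
        t > 1 && (σ² = ω + β * σ² + α * r[t-1]^2)
        ll += -0.5 * (log(2π) + log(σ²) + r[t]^2 / σ²)
    end
    return ll
end

r = [0.5, -1.2, 0.3, 2.0, -0.7]   # toy data, for illustration only
println(garch11_loglik([0.1, 0.8, 0.1], r))
```

Maximizing this function over `θ` with a numerical optimizer would give the maximum likelihood estimates.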



# The `ARCH` Package
## Installation
* `ARCH.jl` is available at https://github.com/s-broda/ARCH.jl. 
*  Extensive documentation available at https://s-broda.github.io/ARCH.jl/dev/.
* `ARCH.jl` is not a registered Julia package yet. To install in Julia 1.0 or later, do

In [2]:
using Pkg
if !haskey(Pkg.installed(), "ARCH")
    Pkg.add(PackageSpec(url="https://github.com/s-broda/ARCH.jl"))
end

## Key Features
* Supports simulating, estimating, forecasting, and backtesting ARCH models.
* Currently: ARCH, GARCH, and EGARCH models of arbitrary orders, with Gaussian, Student's $t$, and GED errors.
* Entirely written in Julia.
* Designed to be easily extensible with new models, distributions.
* Gradients and Hessians (for both numerical maximization of the likelihood and constructing standard errors) are obtained by [automatic differentiation](http://www.autodiff.org/?module=Introduction) via `ForwardDiff.jl`.
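
The idea behind forward-mode automatic differentiation can be sketched with a toy dual-number type that carries a derivative alongside each value. This is only an illustration of the principle; the `Dual` type and `derivative` function below are made up for this example and are not `ForwardDiff.jl`'s implementation or API.

```julia
import Base: +, *, log

# A toy dual number a + b·ε with ε² = 0; not ForwardDiff.jl's implementation.
struct Dual
    val::Float64  # function value
    der::Float64  # derivative carried along with it
end

# Arithmetic rules propagate derivatives via the sum, product, and chain rules.
+(x::Dual, y::Dual) = Dual(x.val + y.val, x.der + y.der)
+(x::Dual, y::Real) = Dual(x.val + y, x.der)
+(x::Real, y::Dual) = y + x
*(x::Dual, y::Dual) = Dual(x.val * y.val, x.der * y.val + x.val * y.der)
*(x::Real, y::Dual) = Dual(x * y.val, x * y.der)
*(x::Dual, y::Real) = y * x
log(x::Dual) = Dual(log(x.val), x.der / x.val)

# Seed the input with derivative 1 and read off the derivative of the output.
derivative(f, x::Real) = f(Dual(x, 1.0)).der

f(x) = x * log(x) + 3 * x   # analytically, f'(x) = log(x) + 4
println(derivative(f, 2.0))
println(log(2.0) + 4)
```

Because `f` is ordinary Julia code, the same function evaluates both values and exact derivatives, with no finite-difference step size to tune.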

## Usage

In [3]:
using ARCH
using Random; Random.seed!(1)
T = 10^4  # sample size
volaspec = GARCH{1, 1}([1., .9, .05])  # [omega, beta, alpha]
am = simulate(volaspec, T; dist=StdT(3.))  # returns ARCHModel
fit(GARCH{1, 1}, am.data; dist=StdT)  # returns ARCHModel


GARCH{1,1} model with Student's t errors, T=10000.


Mean equation parameters:

       Estimate Std.Error z value Pr(>|z|)
μ    0.00310899  0.028261 0.11001   0.9124

Volatility parameters:

      Estimate  Std.Error z value Pr(>|z|)
ω      1.01996    0.16134 6.32181    <1e-9
β₁    0.898131  0.0121042 74.2001   <1e-99
α₁   0.0551942 0.00762137 7.24204   <1e-12

Distribution parameters:

     Estimate Std.Error z value Pr(>|z|)
ν     2.92974  0.096228 30.4458   <1e-99



In [4]:
# select an EGARCH model without intercept by minimizing AIC; o, p, q < 3
# Uses multiple threads to estimate several (here 2*2*2=8) models
am2 = selectmodel(EGARCH, am.data; meanspec=NoIntercept, criterion=aic, maxlags=2, dist=StdT)


EGARCH{1,1,1} model with Student's t errors, T=10000.


Volatility parameters:

       Estimate  Std.Error  z value Pr(>|z|)
ω      0.153944  0.0235374  6.54041   <1e-10
γ₁   0.00552636 0.00946358 0.583961   0.5592
β₁     0.955436 0.00737491  129.552   <1e-99
α₁     0.145674  0.0150556  9.67574   <1e-21

Distribution parameters:

     Estimate Std.Error z value Pr(>|z|)
ν     2.91693 0.0953735 30.5842   <1e-99
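
Conceptually, selection by AIC amounts to fitting every candidate order and keeping the one with the smallest $\mathrm{AIC}=2k-2\ell$, where $k$ is the number of parameters and $\ell$ the maximized log-likelihood. The sketch below shows that search over the $2\times2\times2$ grid in plain Julia. The function `fitted_loglik` is a hypothetical stand-in for "fit the model and return $\ell$", and `toy_loglik` returns made-up numbers; neither is part of `ARCH.jl`.

```julia
# AIC = 2k - 2ℓ, with k parameters and maximized log-likelihood ℓ.
aic_value(loglik, k) = 2k - 2loglik

# `fitted_loglik(o, p, q)` is a hypothetical stand-in for fitting an
# EGARCH{o, p, q} model and returning its maximized log-likelihood.
function select_orders(fitted_loglik; maxlags=2)
    best, best_aic = (0, 0, 0), Inf
    for (o, p, q) in Iterators.product(1:maxlags, 1:maxlags, 1:maxlags)
        k = 1 + o + p + q + 1   # ω, γ's, β's, α's, and ν (no intercept)
        a = aic_value(fitted_loglik(o, p, q), k)
        if a < best_aic
            best, best_aic = (o, p, q), a
        end
    end
    return best, best_aic
end

# Made-up log-likelihoods that barely improve with extra lags.
toy_loglik(o, p, q) = -1000.0 + 0.1 * (o + p + q)
println(select_orders(toy_loglik))
```

With these toy numbers the penalty $2k$ outweighs the small gain in fit, so the smallest model is selected.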



In [5]:
# Most of the interface of StatisticalModel is implemented:
# loglikelihood, nobs, fit, fit!, confint, aic, bic, aicc, dof, coef,
# coefnames, coeftable, CoefTable, informationmatrix, islinear, score, vcov:
confint(am2)'

2×5 LinearAlgebra.Adjoint{Float64,Array{Float64,2}}:
 0.107812  -0.0130219  0.940981  0.116165  2.73   
 0.200077   0.0240746  0.96989   0.175182  3.10385
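
These intervals appear to be standard 95% Wald intervals, i.e., estimate $\pm\,1.96\times$ standard error. As a check under that assumption, the sketch below recomputes them from the rounded estimates and standard errors printed in the EGARCH{1,1,1} output; agreement is therefore only up to the rounding of the displayed inputs.

```julia
# Estimates and standard errors copied from the EGARCH{1,1,1} output
# (order: ω, γ₁, β₁, α₁, ν).
estimates = [0.153944, 0.00552636, 0.955436, 0.145674, 2.91693]
stderrors = [0.0235374, 0.00946358, 0.00737491, 0.0150556, 0.0953735]

z = 1.959964  # 97.5% quantile of the standard normal distribution
lower = estimates .- z .* stderrors
upper = estimates .+ z .* stderrors

# Same layout as confint(am2)': first row lower bounds, second row upper bounds.
println(round.([lower upper]', digits=6))
```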

# Benchmarks
  * The Bollerslev and Ghysels (JBES 1996) data are the de facto standard for comparing implementations of GARCH models.
  * Data consist of daily German mark/British pound exchange rates (1974 observations).
  * Available in `ARCH.jl` as the constant `BG96`.

In [6]:
if !isfile("DMGBP.png")
    using Plots
    plot(BG96, legend=:none)
    savefig("DMGBP")
end

## Bollerslev and Ghysels (1996) Data
<img src="DMGBP.png" alt="Bollerslev and Ghysels (1996) data" style="display:block; margin-left: auto; margin-right: auto; width: 50%;">

## GARCH
* Fitting in Julia:

In [8]:
using BenchmarkTools
@btime fit(GARCH{1, 1}, $BG96, meanspec=NoIntercept)  # Matlab doesn't use an intercept

  3.823 ms (1720 allocations: 344.88 KiB)



GARCH{1,1} model with Gaussian errors, T=1974.


Volatility parameters:

      Estimate  Std.Error z value Pr(>|z|)
ω    0.0108661 0.00657449 1.65277   0.0984
β₁    0.804431  0.0730395 11.0136   <1e-27
α₁    0.154597  0.0539319 2.86651   0.0042



* Now Matlab:

In [10]:
using MATLAB
mat"version"

"8.2.0.701 (R2013b)"

In [18]:
# run this cell a few times to give Matlab a fair chance
mat"tic; estimate(garch(1, 1), $BG96); toc; 0";

 
    GARCH(1,1) Conditional Variance Model:
    ----------------------------------------
    Conditional Probability Distribution: Gaussian

                                  Standard          t     
     Parameter       Value          Error       Statistic 
    -----------   -----------   ------------   -----------
     Constant       0.010868    0.00129723        8.37786
     GARCH{1}       0.804517     0.0160384        50.1619
      ARCH{1}       0.154325     0.0138523        11.1408
Elapsed time is 0.440673 seconds.


* ARCH.jl is faster by a factor of about 10-20, depending on the machine, despite Matlab calling into compiled C code.
* Estimates are quite similar, but standard errors and $t$-statistics differ.
* So which standard errors are correct? Let's compare with the results from Brooks et al. (Int. J. Fcst. 2001).

* Brooks et al. compare implementations of the GARCH(1, 1) model. They use a model with intercept, so let’s re-estimate in Julia (Matlab doesn't seem to allow this):

In [19]:
fit(GARCH{1, 1}, BG96)


GARCH{1,1} model with Gaussian errors, T=1974.


Mean equation parameters:

        Estimate  Std.Error   z value Pr(>|z|)
μ    -0.00616637 0.00920163 -0.670139   0.5028

Volatility parameters:

      Estimate  Std.Error z value Pr(>|z|)
ω    0.0107606 0.00649493 1.65677   0.0976
β₁    0.805875  0.0725003 11.1155   <1e-27
α₁    0.153411  0.0536586 2.85903   0.0042



* Brooks et al. give the estimates (**$t$-stats**)
$\mu=-0.00619$ $(\mathbf{-0.67})$, $\omega=0.0108$ $(\mathbf{1.66})$, $\beta_1=0.806$ $(\mathbf{11.11})$, $\alpha_1=0.153$ $(\mathbf{2.86})$. Dead on!
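
The comparison can be reproduced by hand: each $z$ value is the estimate divided by its standard error. The sketch below does this with the rounded numbers from the Julia output for the model with intercept and sets them against the $t$-statistics quoted from Brooks et al.; the variable names are made up for this example.

```julia
# Estimates and standard errors copied from the Julia output for the
# GARCH{1,1} model with intercept.
labels    = ["μ", "ω", "β₁", "α₁"]
estimates = [-0.00616637, 0.0107606, 0.805875, 0.153411]
stderrors = [0.00920163, 0.00649493, 0.0725003, 0.0536586]
brooks_t  = [-0.67, 1.66, 11.11, 2.86]  # t-statistics quoted from Brooks et al.

zvalues = estimates ./ stderrors
for (l, z, b) in zip(labels, zvalues, brooks_t)
    println(l, ": z = ", round(z, digits=4), ", benchmark = ", b)
end

# Largest absolute deviation from the benchmark t-statistics.
println(maximum(abs.(zvalues .- brooks_t)))
```

All four statistics agree with the benchmark to within 0.01, which is the precision at which the benchmark values are quoted.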

## EGARCH
* Julia:

In [20]:
@btime fit(EGARCH{1, 1, 1}, $BG96, meanspec=NoIntercept)

  6.051 ms (2030 allocations: 414.27 KiB)



EGARCH{1,1,1} model with Gaussian errors, T=1974.


Volatility parameters:

      Estimate Std.Error  z value Pr(>|z|)
ω    -0.128026 0.0518431 -2.46948   0.0135
γ₁   -0.032216 0.0255372 -1.26153   0.2071
β₁    0.911947 0.0331381  27.5196   <1e-99
α₁    0.333243  0.070109  4.75321    <1e-5



* Matlab:

In [21]:
mat"tic; estimate(egarch(1, 1), $BG96); toc; 0"  # Matlab sets o=q

 
    EGARCH(1,1) Conditional Variance Model:
    --------------------------------------
    Conditional Probability Distribution: Gaussian

                                  Standard          t     
     Parameter       Value          Error       Statistic 
    -----------   -----------   ------------   -----------
     Constant        -0.1283     0.0157875       -8.12668
     GARCH{1}       0.911856     0.0084535        107.867
      ARCH{1}        0.33317     0.0217694        15.3045
  Leverage{1}     -0.0322515      0.012564       -2.56699
Elapsed time is 2.314886 seconds.


0.0

* Brooks et al. give no benchmark results. But again, Julia is faster by a factor of about 20.

#  TODO
*  backtesting
*  MGARCH


# References
* Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. *Journal of Econometrics* **31**, 307-327.
* Bollerslev, T. & Ghysels, E. (1996). Periodic Autoregressive Conditional Heteroscedasticity. *Journal of Business & Economic Statistics* **14**, 139-151. https://doi.org/10.1080/07350015.1996.10524640.
* Brooks, C., Burke, S. P., & Persand, G. (2001). Benchmarks and the accuracy of GARCH model estimation. *International Journal of Forecasting* **17**, 45-56. https://doi.org/10.1016/S0169-2070(00)00070-4.
* Engle, R. F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. *Econometrica* **50**, 987-1007. https://doi.org/10.2307/1912773.
* Nelson, D. B. (1991). Conditional Heteroskedasticity in Asset Returns: A New Approach. *Econometrica* **59**, 347-370. https://doi.org/10.2307/2938260.

