# Assignment 1: GARCH modelling

**Deadline**: Sunday 21 February, 23.59.

Your notebook should run without errors when executed with `Run All`. Please submit your answers via [Canvas](https://canvas.uva.nl/courses/21917/assignments/197851).

|**Name**|**Student ID**|**Email**|
|:-------|:-------------|:--------|
|        |              |         |
|        |              |         |


**Hand in the following:**
* Your notebook. N.B. **click on `Kernel`, then `Restart & Run All`** before submitting, see notes.
* A (printed) pdf version of your notebook. Tip: you can use `nbconvert` ([user guide](https://nbconvert.readthedocs.io/en/latest/)) for this, or simply print the webpage to pdf.

**NOTES**:
* The assignment is a partial stand-in for a final examination, so the usual rules regarding plagiarism and fraud apply, with all attendant consequences. Code found on the internet or elsewhere is not acceptable as a solution.
* Before submitting your work, **click on `Kernel`, then `Restart & Run All`** and verify that your notebook produces the desired results and does not error.
* If your function uses random numbers, then set the seed to 0 before calling it. This makes it much easier to grade the assignments (at least as long as the answer is correct).


**Declaration of Originality**:

By submitting these answers, we declare that
1. We have read and understood the notes above.
2. These solutions are solely our own work.
3. We have not made these solutions available to any other student.



---

## Introduction

This is an assignment about the selection, estimation, testing and Monte Carlo simulation of GARCH models for daily stock index returns. 

#### FTSE data

First download data on the FTSE 100 index for the period from January 1, 1998 to January 29, 2021, from Yahoo Finance using `pandas-datareader` ([example](https://pydata.github.io/pandas-datareader/devel/remote_data.html#remote-data-yahoo) and [function reference](https://pydata.github.io/pandas-datareader/devel/readers/yahoo.html)).

*Hint*: use `'%5EFTSE%3FP%3DFTSE'` as the ticker symbol. Using `'^FTSE'` does not work, probably because of a labelling/referencing issue within Yahoo Finance.
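
Once the prices are available, the analysis works with percentage log-returns, $r_t = 100 \cdot \Delta \log P_t$. A minimal sketch of that transformation is given below; the `prices` array is made up for illustration and merely stands in for the downloaded closing prices.

```python
import numpy as np

# Hypothetical closing prices; in the assignment these come from the FTSE 100 download.
prices = np.array([100.0, 101.0, 99.5, 100.2, 100.2])

# r_t = 100 * (log P_t - log P_{t-1}); the first observation is lost to differencing.
returns = 100 * np.diff(np.log(prices))

print(returns.round(3))
```

Note that the return series is one observation shorter than the price series, and that log-returns add up over time: their sum equals 100 times the log of the last price over the first.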

#### GARCH in Python

Make sure that the `arch` package is installed before importing it. It holds functionality to estimate GARCH models.

Uncomment the next line to install. Note: `!` executes shell commands.

In [1]:
#!conda install arch -y -c bashtage

## Questions

1. Use the theory explained in the book and the lecture notes to select, estimate and test an empirical ARMA-GARCH model for the daily log-returns (in percentages: $r_t = 100 \cdot \Delta \log P_t$). Report on your findings, paying attention to the following elements:
    
    1. Testing for autocorrelation in the returns: is there any need for ARMA terms, and if so, what would be useful orders $p$ and $q$ to start with?

    2. Testing for volatility clustering: what type of ARCH or GARCH model would be suitable?

    3. Estimation and testing of a selected ARMA-GARCH model. Do the standardised residuals behave as homoskedastic white noise, according to the available tests? Have you taken appropriate account of possible asymmetry in the news impact curve? Is the standard normal distribution appropriate for the standardised residuals, or would it be better to use another distribution?

    4. If any of the tests under 1.C indicate room for improvement, then adapt or extend the model, and check whether the revised model passes the tests.

    5. Make a plot of the estimated volatility from your final model, and also make a graph of the estimated news impact curve.
    
    
2. Use the model estimated under 1, and the resulting residual $\hat{a}_{T}$ and estimated volatility $\hat{\sigma}_{T}$ for **January 29, 2021**, to simulate the conditional distribution of the index return over the following 21 trading days (about a month). You will have to simulate the daily returns $r_{T+1},\ldots ,r_{T+21}$ to obtain a simulated total monthly return $r[21]_{T+21}=\sum_{t=1}^{21}r_{T+t}$. The function `simulation` provides a starting point for such an analysis, but you will have to complete the program with the information from your empirical analysis in the first part. After completing the program, analyse the outcomes and report on your findings, paying attention to the following:

    1. What is the standard deviation of the monthly return? Is this what you would expect from the average daily standard deviation of the returns over the full sample period (roughly 23 years)? If not, can you give an explanation for the difference? <br> [*Note*: an approximation of the $n$-period (average) volatility is $\sqrt{n}$ times the 1-period (average) volatility; this approximation is based on the assumption of uncorrelated returns.]

    2. What is the shape of the distribution of the monthly returns? Does it display skewness and/or excess kurtosis? Can you explain these findings from the model you have used for the simulations?

    3. It may be of interest to experiment a little with the effect of different parameter values on the outcomes under A. and B.; for example, you could compare the results with and without asymmetric (leverage) effects. You could also choose another month (corresponding to different starting values $r_{T}$, $\hat{a}_{T}$ and $\hat{\sigma}_{T}$ for the simulation) and compare the results.
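
The tests for autocorrelation and volatility clustering in Question 1 are commonly based on the Ljung-Box statistic $Q(m) = n(n+2)\sum_{k=1}^{m}\hat{\rho}_k^2/(n-k)$, applied to the returns and to the squared returns respectively. The sketch below computes it by hand with NumPy. The helper `ljung_box_q` and the simulated ARCH(1) series are illustrative assumptions, not part of the assignment code, and the simulated series merely stands in for the FTSE returns.

```python
import numpy as np

def ljung_box_q(x, lags=10):
    """Ljung-Box Q statistic for the first `lags` sample autocorrelations of x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    denom = np.sum(d ** 2)
    # Sample autocorrelations rho_1, ..., rho_lags
    rho = np.array([np.sum(d[k:] * d[:-k]) / denom for k in range(1, lags + 1)])
    return n * (n + 2) * np.sum(rho ** 2 / (n - np.arange(1, lags + 1)))

# Hypothetical data: a simulated ARCH(1) series with made-up parameters.
rng = np.random.default_rng(0)
n = 3000
a = np.zeros(n)
for t in range(1, n):
    sigma2 = 0.7 + 0.3 * a[t - 1] ** 2
    a[t] = np.sqrt(sigma2) * rng.standard_normal()

CRIT_95 = 18.307  # 95% quantile of the chi-squared distribution with 10 degrees of freedom

q_levels = ljung_box_q(a, lags=10)        # autocorrelation in the series itself
q_squares = ljung_box_q(a ** 2, lags=10)  # autocorrelation in the squares (volatility clustering)
print(f"Q(10) levels: {q_levels:.1f}, squares: {q_squares:.1f}, 5% critical value: {CRIT_95}")
```

In practice a library routine such as `acorr_ljungbox` from `statsmodels.stats.diagnostic` would normally be used instead, since it also reports p-values. Keep in mind that the chi-squared reference distribution for the test on the levels assumes homoskedastic data, so that result should be read with care when the squares show strong dependence.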

---
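
The $\sqrt{n}$ approximation mentioned in the note to Question 2 can be checked numerically in the benchmark case where it holds exactly: uncorrelated returns with constant variance. The sketch below uses a made-up daily standard deviation (`sigma_daily`) and i.i.d. normal draws; it is only a reference point against which the GARCH-based simulation can be compared.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_daily = 1.2   # hypothetical daily standard deviation, in percent
n_days = 21         # trading days in a month
R = 100_000         # number of simulated months

# i.i.d. normal daily returns: uncorrelated and homoskedastic by construction
daily = sigma_daily * rng.standard_normal((R, n_days))
monthly = daily.sum(axis=1)

simulated = monthly.std(ddof=1)
rule = np.sqrt(n_days) * sigma_daily
print(f"simulated monthly std: {simulated:.3f}, sqrt(n) rule: {rule:.3f}")
```

Under a GARCH model the conditional variance at the forecast date generally differs from its long-run average, so the simulated monthly standard deviation need not match this benchmark.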

In [None]:
import numpy as np
def simulation(params, rT, aT, sT, R=2000, m=22):
    '''
    Simulating the distribution of a monthly return from a daily ARMA-GARCH model.
    
    Notes: 
    * Need to fill in an expression for s2 and r depending on the model for the conditional volatility and mean.
    * The program is applicable if only one lag / starting value is needed, e.g. ARMA(1,1)-GARCH(1,1); 
      for higher-order models the program needs adjustment.
        
    INPUT
    -----
    params: parameter vector (both mean and volatility parameters) from ARMAResults
    
    R: number of replications
    m: number of trading days in a month (~21 days), plus one starting value (change this if more starting values are needed)
    
    Starting values (if more lags are involved, then more values from January 2021 are needed)
    rT: the daily return at time of forecasting
    aT: the residual at time of forecasting
    sT: the estimated volatility at time of forecasting
     
    
    OUTPUT
    ------
    monthreturn: array with R simulated monthly returns
    
    '''

    # Define variables
    r = np.zeros(m+1)  # daily returns
    a = np.zeros(m+1)  # disturbances
    s2 = np.zeros(m+1)  # conditional variance
    epsilon = np.zeros(m+1)  # standardized disturbances
    monthreturn = np.zeros(R)  # monthly return
           
    # Initialising various variables at the relevant starting values
    r[0] = rT
    a[0] = aT
    s2[0] = sT**2
    epsilon[0] = aT / sT

    # Draw R times the monthly return according to the ARMA-GARCH specification
    for rep in range(R):
        # use random number generator from other distribution for epsilon if necessary
        epsilon[1:] = np.random.randn(m)
        
        # use estimated GARCH equation, expressing conditional variance s2 in terms of s2[h-1] and epsilon[h-1]
        for h in range(1,m):
            # EGARCH(1,1)
            s2[h] = 
        
        a[1:] = np.sqrt(s2[1:])*epsilon[1:]
        
        # use estimated ARMA equation, expressing r in terms of a and possibly r[h-1] and/or a[h-1]
        for h in range(1,m):
            r[h] = 
            
        monthreturn[rep] = np.sum(r[1:])
        
    return monthreturn
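
To illustrate how the two recursions fit together, here is a self-contained sketch for a simpler model than the one indicated in the template: an AR(1)-GARCH(1,1) with normal shocks. Everything in it is an assumption made for illustration: the function name `simulate_ar1_garch11`, the parameter values and the starting values are hypothetical and are not estimates from the FTSE data. It also uses NumPy's `default_rng` with a fixed seed rather than `np.random.randn`, and draws the shocks one at a time instead of filling arrays in advance.

```python
import numpy as np

def simulate_ar1_garch11(mu, phi, omega, alpha, beta, rT, aT, sT,
                         R=2000, days=21, seed=0):
    """Simulate R cumulative `days`-day returns from an AR(1)-GARCH(1,1) model."""
    rng = np.random.default_rng(seed)
    total = np.zeros(R)
    for rep in range(R):
        # Every replication starts from the same observed state at time T
        r_prev, a_prev, s2_prev = rT, aT, sT ** 2
        for _ in range(days):
            s2 = omega + alpha * a_prev ** 2 + beta * s2_prev  # GARCH(1,1) variance recursion
            a = np.sqrt(s2) * rng.standard_normal()            # new disturbance
            r = mu + phi * r_prev + a                          # AR(1) mean equation
            total[rep] += r
            r_prev, a_prev, s2_prev = r, a, s2
    return total

# Hypothetical parameters and starting values, NOT estimates from the FTSE data.
sims = simulate_ar1_garch11(mu=0.02, phi=-0.03, omega=0.02, alpha=0.10, beta=0.88,
                            rT=-1.0, aT=-1.0, sT=1.5)

# Summary statistics of the simulated monthly return distribution
z = (sims - sims.mean()) / sims.std()
print(f"std: {sims.std(ddof=1):.2f}, "
      f"skewness: {np.mean(z ** 3):.2f}, "
      f"excess kurtosis: {np.mean(z ** 4) - 3:.2f}")
```

For an asymmetric specification such as EGARCH, only the line computing `s2` changes; the rest of the loop can stay the same as long as one lag of each variable suffices.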