# Advanced Risk Management – Assignment 1

**Deadline**: February 21, 18:00.

| |Name |Student number|Email|
|:-|:----|:-------------|:----|
|1.|  |        |     |
|2.|  |        |     |
|3.|  |        |     |




**Hand in the following via Canvas**:
* Your notebook.
* A (printed) PDF version of your notebook. In Google Colab, this is most conveniently done in the Chrome browser via the **`File` -> `Print`** menu option; you may have to print in landscape mode so that everything appears in the PDF.

**Notes**:
* The assignment is part of the examination, so the usual rules regarding plagiarism and fraud apply.
* Before submitting your work, click on **`Runtime` -> `Restart and run all ...`** and verify that your notebook runs without errors and produces the desired results.

**Declaration of Originality**: We, whose names are given in the table above, declare that these solutions are solely our own work, and that we have not made these solutions available to any other student.

## Introduction
The file `RV_data.xlsx` contains daily data (January 2000 – February 2020, or a sub-period) for a number of international stock market indices. For each index it gives the open-to-close log-return R (measured in percent) and the daily realized variance RV (obtained from 5-minute returns).
A list of the included indices is given on the website of the data provider, see
https://realized.oxford-man.ox.ac.uk/data/assets (in the Excel file, the leading '.' has been removed from the symbol; e.g. `FCHI` instead of `.FCHI`). In this assignment, you are asked to estimate, test and compare two GARCH models for one of the indices in terms of their in-sample fit and their out-of-sample forecast quality.

## Question 1: Load and display data
First, install and import the relevant libraries:

```python
# !pip install arch             # uncomment for installing the arch package
import numpy as np
import os
import pandas as pd
import datetime as dt
import matplotlib.pyplot as plt
import scipy.stats as stats
import statsmodels.api as sm
import statsmodels.formula.api as smf
from arch import arch_model
from statsmodels.graphics.tsaplots import plot_acf
```


Next, import the data and obtain the returns for one chosen index. Uncomment and adapt the lines necessary to mount the drive and change the path.

```python
# from google.colab import drive
# drive.mount('/content/drive')
# path = '/content/drive/...'    # change path to your working directory
# os.chdir(path)
df = pd.read_excel('RV_data.xlsx')
df['Date'] = pd.to_datetime(df['Date'], format='%Y%m%d')
df = df.set_index(['Date'])
sel = df['Symbol']=='ABC'   # Boolean array to select index;
                            # change 'ABC' to chosen index symbol, e.g., 'FCHI'
R = df['R'].loc[sel]
```

Display a line graph of the returns, and display the autocorrelation function of the returns and of the squared returns. Discuss whether you find the "stylized facts" mentioned in the textbook and slides of the course.
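
As a rough illustration of what to look for, the sketch below computes sample autocorrelations by hand for simulated GARCH(1,1) returns. The simulated series is a stand-in for the assignment data, and the helper `sample_acf` and all parameter values are made up for this example. It counts how many of the first 20 autocorrelations fall outside the approximate 95% band $\pm 1.96/\sqrt{T}$, for the returns and for the squared returns.

```python
import numpy as np

def sample_acf(x, nlags=20):
    """Sample autocorrelations rho_1, ..., rho_nlags of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[k:], x[:-k]) / denom for k in range(1, nlags + 1)])

# Simulated GARCH(1,1) returns (illustrative parameters, not estimates).
rng = np.random.default_rng(0)
T, omega, alpha, beta = 5000, 0.05, 0.10, 0.85
r = np.empty(T)
sig2 = omega / (1 - alpha - beta)          # start at the unconditional variance
for t in range(T):
    r[t] = np.sqrt(sig2) * rng.standard_normal()
    sig2 = omega + alpha * r[t] ** 2 + beta * sig2

acf_r = sample_acf(r)                      # autocorrelations of the returns
acf_r2 = sample_acf(r ** 2)                # autocorrelations of the squared returns
band = 1.96 / np.sqrt(T)                   # approximate 95% band under white noise

n_out_r = int(np.sum(np.abs(acf_r) > band))
n_out_r2 = int(np.sum(np.abs(acf_r2) > band))
print("lags outside band, returns:        ", n_out_r)
print("lags outside band, squared returns:", n_out_r2)
```

On the actual data, `plot_acf(R)` and `plot_acf(R**2)` draw the corresponding graphs, including a confidence band.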

Discussion of results:

## Question 2: Fitting a symmetric GARCH model
Estimate and test a GARCH model for the returns, using only data over the sub-period January 2000 – December 2012. For this question, do **not** consider a GJR-GARCH model or any model with an asymmetric NIC (see next question), but focus on standard GARCH($p,q$) models. Display and discuss the estimation output, and test the model for the absence of volatility clustering in the standardized residuals (shocks) $\hat{z}_t$. If you try out various GARCH models, only report and discuss the results for the final model.
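
One common way to test for remaining volatility clustering is a Ljung-Box test on the squared standardized residuals $\hat{z}_t^2$. The sketch below implements the statistic $Q = T(T+2)\sum_{k=1}^{m}\hat{\rho}_k^2/(T-k)$ by hand and compares it with a $\chi^2_m$ distribution. The helper name `ljung_box`, the lag choice $m=10$ and the simulated inputs are assumptions of this example; the course material may prescribe a different test or a degrees-of-freedom correction.

```python
import numpy as np
from scipy import stats

def ljung_box(x, nlags=10):
    """Ljung-Box Q statistic and chi-square p-value for a 1-D series (NaNs dropped)."""
    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]
    T = x.size
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    lags = np.arange(1, nlags + 1)
    rho = np.array([np.dot(xc[k:], xc[:-k]) / denom for k in lags])
    q = T * (T + 2) * np.sum(rho ** 2 / (T - lags))
    return q, stats.chi2.sf(q, df=nlags)

rng = np.random.default_rng(1)
T = 3000

# Case 1: i.i.d. shocks, as a well-specified model should leave behind.
z_iid = rng.standard_normal(T)

# Case 2: raw GARCH(1,1) returns, where clustering is still present.
omega, alpha, beta = 0.05, 0.10, 0.85      # illustrative values only
r = np.empty(T)
sig2 = omega / (1 - alpha - beta)
for t in range(T):
    r[t] = np.sqrt(sig2) * rng.standard_normal()
    sig2 = omega + alpha * r[t] ** 2 + beta * sig2

q_iid, p_iid = ljung_box(z_iid ** 2)
q_garch, p_garch = ljung_box(r ** 2)
print(f"i.i.d. shocks : Q = {q_iid:.2f}, p = {p_iid:.4f}")
print(f"GARCH returns : Q = {q_garch:.2f}, p = {p_garch:.4g}")
```

A small p-value rejects the null hypothesis of no autocorrelation in the squared series, i.e. it points to volatility clustering that the model has not captured.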

Note: estimation over a sub-sample using the ARCH package can be done by specifying `last_obs =` in the `.fit` function; see https://arch.readthedocs.io.

Discussion of results:

## Question 3: Fitting an asymmetric GARCH model
(a) Extend the model you obtained above with one or more asymmetric terms (leading to a GJR-GARCH model), and estimate this second model using data over the same sub-period. Analogously to Question 2, display and discuss the estimation output, and test for volatility clustering in $\hat{z}_t$.

Discussion of results:

(b) Carry out a likelihood ratio test for the symmetric model of Question 2 (the null hypothesis) against the asymmetric model of Question 3 (the alternative hypothesis). Obtain the p-value of the test from the `scipy.stats` library.
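
The mechanics of the test can be sketched as follows. The statistic is $LR = 2(\ell_1 - \ell_0)$, where $\ell_0$ and $\ell_1$ are the maximized log-likelihoods of the restricted (symmetric) and unrestricted (asymmetric) model; under the null hypothesis it is asymptotically $\chi^2$ distributed with as many degrees of freedom as there are added asymmetry coefficients. The helper `lr_test` and the log-likelihood values below are made up for illustration. In the ARCH package, a fitted result exposes its log-likelihood as the attribute `loglikelihood`.

```python
from scipy import stats

def lr_test(llf_restricted, llf_unrestricted, n_restrictions):
    """Likelihood ratio statistic and its chi-square p-value."""
    lr = 2.0 * (llf_unrestricted - llf_restricted)
    p_value = stats.chi2.sf(lr, df=n_restrictions)   # upper-tail probability
    return lr, p_value

# Hypothetical log-likelihoods; replace with the values of your two fitted models.
llf_garch = -4210.5     # symmetric model (null hypothesis)
llf_gjr = -4195.2       # asymmetric model with one extra coefficient
lr, p_value = lr_test(llf_garch, llf_gjr, n_restrictions=1)
print(f"LR = {lr:.2f}, p-value = {p_value:.3g}")
```

Both models have to be estimated on the same sample for the two log-likelihoods to be comparable.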

Discussion of results:

## Question 4: Comparing the out-of-sample fit
From the two models estimated above, obtain the out-of-sample (one-step-ahead) conditional variance predictions over the period January 2013 – February 2020. These are based on coefficient estimates from data until 2012, but they use the most recent returns $R_t$ to obtain the predictions for $\sigma_{t+1}^2$ in the period after 2012 (consult the ARCH package documentation for details).
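
To make the idea concrete, the sketch below writes out the one-step-ahead recursion for a (GJR-)GARCH(1,1) model with fixed coefficients: $\sigma_{t+1}^2 = \omega + (\alpha + \gamma\,\mathbb{1}[\varepsilon_t<0])\,\varepsilon_t^2 + \beta\sigma_t^2$ with $\varepsilon_t = R_t - \mu$. The coefficients stay at their estimated values, while each new return updates the prediction. The function name, the toy numbers and the choice of starting value are assumptions of this example; the ARCH package uses its own initialization and should be used for the actual predictions.

```python
import numpy as np

def one_step_variance(r, mu, omega, alpha, beta, gamma=0.0, sig2_0=None):
    """Fixed-coefficient one-step-ahead variance predictions.

    sig2[t] is the prediction for day t based on returns up to day t-1.
    gamma=0 gives the symmetric GARCH(1,1); gamma>0 adds the GJR term.
    """
    r = np.asarray(r, dtype=float)
    eps = r - mu
    sig2 = np.empty(r.size)
    # Starting value: an assumption of this sketch (sample variance by default).
    sig2[0] = eps.var() if sig2_0 is None else sig2_0
    for t in range(1, r.size):
        neg = 1.0 if eps[t - 1] < 0 else 0.0
        sig2[t] = omega + (alpha + gamma * neg) * eps[t - 1] ** 2 + beta * sig2[t - 1]
    return sig2

# Toy returns and coefficients, for illustration only.
r_toy = np.array([1.0, -1.0, 0.5])
sym = one_step_variance(r_toy, mu=0.0, omega=0.1, alpha=0.1, beta=0.8, sig2_0=1.0)
asym = one_step_variance(r_toy, mu=0.0, omega=0.1, alpha=0.1, beta=0.8,
                         gamma=0.1, sig2_0=1.0)
print("symmetric :", np.round(sym, 4))
print("asymmetric:", np.round(asym, 4))
```

The two series differ only on the day after the negative return, which is where the asymmetric term adds to the predicted variance.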

(a) Make a new data-frame containing the two sets of variance predictions, as well as the realized variance for the same index and over the same sub-period (January 2013 – February 2020). Make a plot of the two predicted volatility series (square root of the predicted variances) in one figure, and discuss similarities and differences.
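
A possible way to assemble such a data-frame is sketched below with made-up numbers; the series names `var_garch`, `var_gjr` and `rv` are hypothetical. An inner join keeps only the dates present in all three series. Before combining, check that each prediction is dated by the day it refers to, so that it lines up with the realized variance of that same day.

```python
import numpy as np
import pandas as pd

dates = pd.bdate_range("2013-01-02", periods=6)

# Made-up variance predictions and realized variances, for illustration only.
var_garch = pd.Series([1.1, 1.0, 0.9, 1.3, 1.2, 1.0], index=dates, name="garch")
var_gjr = pd.Series([1.0, 0.9, 0.9, 1.5, 1.3, 1.0], index=dates, name="gjr")
rv = pd.Series([0.8, 1.2, 0.7, 1.9, 1.0], index=dates[:5], name="RV")  # last day missing

# Align on the dates that all three series share.
fc = pd.concat([var_garch, var_gjr, rv], axis=1, join="inner")

# Predicted volatilities are the square roots of the predicted variances.
vol = np.sqrt(fc[["garch", "gjr"]])
print(fc)
print(vol)
```

Calling `vol.plot()` then draws both volatility series in a single figure.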

Discussion of results:

(b) Calculate the mean squared error of the two variance predictions (using the realized variance as the true $\sigma_{t+1}^2$), and discuss the result; which of the two models gives the better predictions?
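
The calculation itself is short once predictions and realized variance sit in one data-frame. The sketch below uses four made-up observations and hypothetical column names. It assumes that the predictions and RV are expressed in the same units; the returns are in percent, so it is worth checking that the realized variance is on the matching scale.

```python
import pandas as pd

# Made-up numbers, for illustration only.
fc = pd.DataFrame({
    "RV":    [1.0, 2.0, 0.5, 1.5],
    "garch": [1.2, 1.5, 0.9, 1.4],
    "gjr":   [1.1, 1.8, 0.6, 1.6],
})

# Squared prediction errors per model, averaged over time.
errors = fc[["garch", "gjr"]].sub(fc["RV"], axis=0)
mse = errors.pow(2).mean()
print(mse)
```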

Discussion of results:

(c) As discussed in Sections 4.6.3 and 5.7 of the book, variance forecasts $\hat{\sigma}_{t+1}^2$ can be evaluated in the linear regression

$RV_{t+1} = b_0 + b_1\hat{\sigma}_{t+1}^2 + e_{t+1}$,

by testing the two restrictions $b_0=0, b_1=1$ (separately and jointly).
Estimate this regression twice: first, using the symmetric GARCH model prediction (Question 2), and next, using the asymmetric GARCH model prediction (Question 3) as explanatory variable. Report the estimation results and the outcome of the $F$-test for $b_0=0, b_1=1$ (use heteroskedasticity-robust standard errors). What do you conclude?
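
The regression and the tests might be set up as in the sketch below, here on simulated data in which the forecast is unbiased by construction. The column names, the data-generating process and the choice of the `HC1` robust covariance estimator are assumptions of this example, not prescriptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated forecasts and realized variances with E[RV | sig2_hat] = sig2_hat.
rng = np.random.default_rng(2)
n = 1500
sig2_hat = rng.gamma(shape=2.0, scale=0.5, size=n)
rv = sig2_hat * rng.chisquare(df=5, size=n) / 5
data = pd.DataFrame({"RV": rv, "sig2_hat": sig2_hat})

# OLS with heteroskedasticity-robust (HC1) standard errors.
mz = smf.ols("RV ~ sig2_hat", data=data).fit(cov_type="HC1")
print(mz.summary().tables[1])

# Separate tests of b0 = 0 and b1 = 1.
print(mz.t_test("Intercept = 0"))
print(mz.t_test("sig2_hat = 1"))

# Joint test of b0 = 0 and b1 = 1.
joint = mz.f_test("Intercept = 0, sig2_hat = 1")
f_stat = float(np.squeeze(joint.fvalue))
f_pval = float(np.squeeze(joint.pvalue))
print(f"F = {f_stat:.3f}, p-value = {f_pval:.3f}")
```

Note that the default $t$-statistic for the slope in the summary table tests $b_1=0$, not $b_1=1$, which is why the restriction is passed explicitly.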

Discussion of results: