# SRISK - R

#### Goals

* Translate existing code for SRISK from Matlab to R
* Use the `ccgarch` package for the DCC-GARCH model at the heart of SRISK 
* Experiment with the Tidyverse
* Write the code in a modular way with `data_frame`s as inputs & outputs to all functions, with a view to hosting the R (or alternatively python) functions in [R](https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/r-language-modules) / [python](https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/python-language-modules) language modules in [Azure Machine Learning Studio](https://studio.azureml.net/). 
* Similar python code can be found in [Py-SRISK](https://notebooks.azure.com/ian-buckley/libraries/systemic-risk/html/Py-SRISK.ipynb)

For a **live SRISK dashboard** see **V-Lab** at NYU Stern school: https://vlab.stern.nyu.edu/welcome/risk/, with [SRISK documentation](https://vlab.stern.nyu.edu/doc/40?topic=mdls).

##### Notebook extensions
Because this is a long notebook, & a mixture of valuable code & less valuable experiments, it is a really good idea to turn on a couple of notebook extensions **Edit > nbextensions config** to open a new browser tab, & then select the following extensions:
* Collapsible headings
* Intitialization cells (allows you to conveniently run only specific cells   

Having selected those check boxes, reload the tab containing this notebook for the changes to take effect.

##### Sources for Matlab code

Belluzzo, Tommaso. SystemicRisk: A Framework for Systemic Risk Valuation and Analysis. Matlab, 2018. https://github.com/TommasoBelluzzo/SystemicRisk.  
Bisias, Dimitrios, Mark D. Flood, Andrew W. Lo, and Stavros Valavanis. A Survey of Systemic Risk Analytics. Matlab, 2012. https://financialresearch.gov/working-papers/files/OFRwp0001_BisiasFloodLoValavanis_MatlabCode-v0_3.zip.
———. “A Survey of Systemic Risk Analytics.” SSRN Scholarly Paper. Rochester, NY: Social Science Research Network, January 11, 2012. http://papers.ssrn.com/abstract=2747882.  
Dube, Qobolwakhe. SA-Systemic-Risk: Systemic Risk Ranking of South Africa’s Financial Institutions. Matlab, 2017. https://github.com/qobolwakhe/SA-systemic-risk.  
Perignon, Christophe, Sylvain Benoit, Christophe Hurlin, and Gilbert Colletaz. Run My Code - A Theoretical and Empirical Comparison of Systemic Risk Measures. Accessed July 11, 2016. http://www.runmycode.org/companion/view/175.  
V-Lab Stern NYU. “GARCH-DCC Documentation.” V-Lab. Accessed May 8, 2018. https://vlab.stern.nyu.edu/doc/13?topic=mdls.  


## Libraries

### Libraries - install

#### DCC GARCH  

Dynamic conditional correlation (DCC) generalized autoregressive conditional heteroskedastic (GARCH) model  
This is required to estimate the *marginal expected shortfall* (MES) and from that SRISK.

https://vlab.stern.nyu.edu/doc/13?topic=mdls  
https://www.stat.ncsu.edu/people/bloomfield/courses/ST810J/slides/mv-garch.pdf

In [1]:
if(!require(ccgarch)){
    install.packages("ccgarch")
    library(ccgarch)
}

Loading required package: ccgarch
“there is no package called ‘ccgarch’”Installing package into ‘/home/nbuser/R’
(as ‘lib’ is unspecified)


### Libraries - load

In [2]:
library(tibble)
library(dplyr)
library(purrr)
library(tidyr) # Reshape using `gather` and `spread`
library(lubridate)


Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


Attaching package: ‘lubridate’

The following object is masked from ‘package:base’:

    date



### Notes

#### `ccgarch`

https://cran.r-project.org/web/packages/ccgarch/  
https://github.com/cran/ccgarch    
https://www.r-project.org/conferences/useR-2008/slides/Nakatani.pdf  

#### `rmgarch`

https://cran.r-project.org/web/packages/rmgarch/     
http://www.unstarched.net/2013/01/03/the-garch-dcc-model-and-2-stage-dccmvt-estimation/    
(Unable to install `rmgarch` on Azure.)

In [None]:
# install.packages("rugarch")

## SRISK

### General strategy

Because this project started out as a means to test-case the Azure Machine Learning Studio (AMLS), the main functions are designed to accept `data_frame`s as the primary data type for input arguments & outputs. The `ccgarch` function `dcc.estimation` breaks that rule because its output is a list (~dict) with elements: `h`, `DCC`, `std.resid` etc.  
The main control structure of the calculation is a loop over firms (banks). In R this is achieved by using `group_by` & `mutate`. However, in AMLS, the data-frames being passed between steps are for all firms at once, so the loop over firms has to be duplicated in all steps that contain aggregate functions and are therefore required to be firm specific (namely estimation, quantiles, sums etc.). Obviously, we need to get the same result whether or not we compose the functions for each step of the calculation & then loop, or loop over each step individually, & then compose.

| Step | In   | Out   | Parameters   |  Notes |
|------|------|------|---------------|
|  ccgarch  | returns| h, DCC| NA     |     |
|  MES  | h,DCC | MES,LRMES| Market shock (40%), Confidence level (95%)     |     |
|  SRISK  | LRMES, Assets, Liabilities | SRISK| NA     |     |

### Generate data

In [3]:
dates3=ymd("2018-03-01", "2018-03-02", "2018-03-03")
dates50 <- seq(today()-ddays(50-1), today(), by='days')
big5_rics = c("RY.TO","TD.TO","BNS.TO","BMO.TO","CM.TO")
#big5_names = c("Royal Bank of Canada","Toronto-Dominion Bank","Bank of Nova Scotia",
#              "Bank of Montreal","Canadian Imperial Bank of Commerce")
market_ric = ".GSPTSE"

In [4]:
generate_data <- function (firms, dates, fields) 
{
    nfirms <- length(firms)
    ndates <- length(dates)
    nfields <- length(fields)
    row_ids <- crossing(Date = dates, RIC = firms)[, c("RIC", 
        "Date")]
    values <- data.frame(matrix(rnorm(nfirms * ndates * nfields), 
        ncol = nfields))
    colnames(values) <- fields
    return(cbind(row_ids, values))
}

In [20]:
firm_data = generate_data(big5_rics,dates3,c("Return","Total Liabilities","Market Cap"))
market_data = generate_data(market_ric,dates3,c("Price"))

### Munge data, calc market return

In [43]:
returns_all <- merge(
    firm_data %>% 
        rename(ret_x = Return) %>%
        mutate(ret0_x = ret_x - mean(ret_x)),
    market_data %>% 
        mutate(ret_m = (Price / lag(Price)) -1) %>%
        replace_na(list(ret_m = 0)) %>% 
        mutate(ret0_m = ret_m - mean(ret_m)) %>%
        select(-RIC,-Price) , 
    by ="Date" ) %>% 
    arrange(RIC,Date) 

In [44]:
returns_all %>% head

Date,RIC,ret_x,Total Liabilities,Market Cap,ret0_x,ret_m,ret0_m
2018-03-01,RY.TO,-0.893430975,1.04942459,0.2313125,-0.6475614,0.0,2.513625
2018-03-02,RY.TO,0.54404338,2.01914958,0.6288523,0.7899129,-4.782511,-2.2688858
2018-03-03,RY.TO,-1.542441021,-1.56483272,1.0573473,-1.2965715,-2.758364,-0.2447392
2018-03-01,TD.TO,-0.468147219,-0.07482964,-0.913422,-0.2222777,0.0,2.513625
2018-03-02,TD.TO,0.004224587,-0.30323669,-0.6371521,0.2500941,-4.782511,-2.2688858
2018-03-03,TD.TO,-0.703234423,0.8298547,-0.337691,-0.4573649,-2.758364,-0.2447392


### DCC-GARCH

##### Remarks, experiments

Another approach would be to define a function that builds the arguments for `dcc.estimation` as a `list` & then use `do.call`.  
`result = do.call('dcc.estimation', list(arg1, arg2, etc, etc))`

In [91]:
dcc_arg_builder <- function (dvar, model = "extended", method = "BFGS", message = 0) 
{
    N <- dim(dvar)[2]
    a <- numeric(N)
    A <- diag(N)
    B <- diag(N)
    ini.dcc <- rep(0.01, 2)
    return(list(inia = a, iniA = A, iniB = B, ini.dcc = ini.dcc, 
        dvar = dvar, model = model, method = method, message = message))
}

In [56]:
dcc_estimation <- function (dvar, model = "extended", method = "BFGS", message = 0) 
{
    #' Estimate DCC GARCH model using ccgarch package 
    #'
    #' Wraps ccgarch::dcc.estimation, making data the first argument for piping.
    a <- numeric(N)
    A <- diag(N)
    B <- diag(N)
    ini.dcc <- rep(0.01, 2)
    return(dcc.estimation(inia = a, iniA = A, iniB = B, ini.dcc = ini.dcc, 
        dvar = dvar, model = model, method = method, message = message) %>% 
        {
            .[c("h", "DCC", "std.resid")] # Select required outputs from list
        })
}

##### Prepare inputs, estimate, transform results

What is the challenge? The `group_by(RIC)` operation is followed by a function with three tasks to perform for each firm:
* Prepare default inputs for `dcc.estimation`
* The `dcc.estimation` estimation itself
* Selecting & transforming the (`list`) output into a `data_frame`
* Using `std.resid` to calculate the Value at Risk (VaR) for the firm. NOT USED!

Add a new variable for the quantile of the (standardized) return.

In [5]:
dcc_estimation <- function (df, model = "extended", method = "BFGS", message = 0, var_quantile = 0.05) 
{
    #' Estimate DCC GARCH model using ccgarch package 
    #'
    #' Wraps ccgarch::dcc.estimation, making data the first argument for piping.
    # 1. Prepare default inputs for `dcc.estimation`
    # 2. The `dcc.estimation` estimation itself
    # 3. Selecting & transforming the (`list`) output into a `data_frame`
    # 4. Using `std.resid` to calculate the Value at Risk (VaR) for the firm. NOT USED!
    
    # ===1=== Prepare inputs, including initial values
    N <- dim(df)[2]
    a <- numeric(N)
    A <- diag(N)
    B <- diag(N)
    ini.dcc <- rep(0.01, 2)
    # ===2=== Estimate DCC_GARCH model
    results_df <- 
        dcc.estimation(inia = a, iniA = A, iniB = B, ini.dcc = ini.dcc, 
            dvar = df, model = model, method = method, message = message) %>%
    # ===3=== Select & transform the required outputs from the list    
        {data_frame( 
                s_m = sqrt(.$h[,1]), 
                s_x = sqrt(.$h[,2]), 
                p_mx = .$DCC[,2])              } %>% 
                #,var_x = quantile(.$std.resid[,2], var_quantile))  # Firm VaR                            
        # Define beta
        mutate(beta_x = p_mx * (s_x / s_m))
    return(results_df)   # Return type is data_frame
}

#### All steps

In [38]:
srisk_main <- function (df,a=0.05, d=0.4,l=0.08){
    df %>%
        dcc_estimation %>%
        mes(a=a, d=d) %>%
        srisk(l=l) 
}

In [47]:
estimation_results <-
    bind_cols(
        # === Inputs
        returns_all,
        # === Firm dccgarch  
        returns_all[c("RIC","ret0_m","ret0_x")] %>%
            group_by(RIC) %>%
            # Call ccgarch; estimate firm by firm; result is list of vectors
            {dcc_estimation(.[c("ret0_m","ret0_x")])}) 

In [48]:
estimation_results %>% head

Date,RIC,ret_x,Total Liabilities,Market Cap,ret0_x,ret_m,ret0_m,s_m,s_x,p_mx,beta_x
2018-03-01,RY.TO,-0.893430975,1.04942459,0.2313125,-0.6475614,0.0,2.513625,1.953332,0.7456752,-0.27343669,-0.10438315
2018-03-02,RY.TO,0.54404338,2.01914958,0.6288523,0.7899129,-4.782511,-2.2688858,1.946583,0.7474155,-0.64972289,-0.24946942
2018-03-03,RY.TO,-1.542441021,-1.56483272,1.0573473,-1.2965715,-2.758364,-0.2447392,1.939857,0.7491608,-0.67685564,-0.26139753
2018-03-01,TD.TO,-0.468147219,-0.07482964,-0.913422,-0.2222777,0.0,2.513625,1.933149,0.7509148,-0.02517544,-0.00977918
2018-03-02,TD.TO,0.004224587,-0.30323669,-0.6371521,0.2500941,-4.782511,-2.2688858,1.92647,0.7526658,-0.3909386,-0.15273846
2018-03-03,TD.TO,-0.703234423,0.8298547,-0.337691,-0.4573649,-2.758364,-0.2447392,1.919813,0.7544209,-0.41427818,-0.16279713


##### `main_pro.m`

https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/main_pro.m 

In [None]:
# THIS IS MATLAB - DO NOT RUN!
ret0_x = ret_x - mean(ret_x);

#% Supply 2 series of returns (with means subtracted)
[p,s] = dcc_gjrgarch([ret0_m ret0_x]);
#% p       = An n-by-n-by-t matrix of floats containing the DCC coefficients.
#% s       = A t-by-n matrix of floats containing the conditional variances.
s_m = sqrt(s(:,1));
s_x = sqrt(s(:,2));
p_mx = squeeze(p(1,2,:)); #% Pull out the off-diagonal correlation (between market & firm)

beta_x = p_mx .* (s_x ./ s_m);
var_x = s_x * quantile((ret0_x ./ s_x),data.A); #% Find the value at risk of the firm? (optional)

[mes,lrmes] = calculate_mes(ret0_m,s_m,ret0_x,s_x,beta_x,p_mx,data.A,data.D); #% Hopefully R package ccgarch can do this step?
srisk = calculate_srisk(lrmes,data.FrmsLia(:,i),data.FrmsCap(:,i),data.L); #% SRISK needs the (LR)MES + balance sheet data + crash level (e.g. 40%)

### `mes`

https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/calculate_mes.m

In [6]:
mes <- function(df, a=0.05, d=0.4){
    #' Calculate marginal expected shortfall (MES) & long range MES (LRMES)
    #'
    #' Input data_frame must contain: ret0_m, s_m, ret0_x, s_x, beta_x, p_mx
    #' Author: Tommaso Belluzzo (Matlab)
    #' In python, vectors are pandas series objects
    #' The input series must be for a single firm only!
    #' :param ret0_m: Demeaned market index log returns.
    #' :param s_m: Volatilities of the market index log returns.
    #' :param ret0_x: Demeaned firm log returns.
    #' :param s_x: Volatilities of the firm log returns.
    #' :param beta_x: Firm CAPM betas.
    #' :param p_mx: DCC coefficients.
    #' :param a: A float [0.01,0.10] representing the complement to 1 of the confidence level (optional, default=0.05).
    #' :param d: A float representing the six-month crisis threshold for the market index decline used to calculate LRMES (optional, default=0.40).
    return(
        df %>%
            mutate(
                c = quantile(ret0_m,a),
                h = length(ret0_m) ** (-0.2),
                u = ret0_m / s_m    ,          # Standardize
                x_den = sqrt(1 - p_mx**2),
                x_num = (ret0_x / s_x) - (p_mx * u),
                x = x_den / x_num,
                f = pnorm(((c / s_m) - u) / h),   # Normal CDF
                f_sum = sum(f),
                k1 = sum(u * f) /  sum(f),       
                k2 =  sum(x * f) / sum(f) ,
                mes = (s_x * p_mx * k1) + (s_x * x_den * k2),
                lrmes = 1 - exp(log(1 - d) * beta_x) )  %>%
            select() )}

In [49]:
estimation_results %>% 
    group_by(RIC) %>% 
    mes

Adding missing grouping variables: `RIC`


RIC
RY.TO
RY.TO
RY.TO
TD.TO
TD.TO
TD.TO
BNS.TO
BNS.TO
BNS.TO
BMO.TO


### `srisk`

https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/calculate_srisk.m

In [7]:
srisk <- function(df,l=0.08){
    #' Calculate the SRISK measure of systemic risk
    #' 
    #' Data frame must contain fields: (lrmes,tl_x,mc_x)
    #' Author: Tommaso Belluzzo (Matlab)
    #' In python, input vectors are pandas series objects; output type is pandas dataframe
    #' :param lrmes:   A vector of floats containing the LRMES values.
    #' :param tl_x:    A numeric vector containing the firm total liabilities.
    #' :param mc_x:    A numeric vector containing the firm market capitalization.
    #' :param l:       A float [0.05,0.20] representing the capital adequacy ratio (optional, default=0.08).
    #' :return srisk:  A dict of series including SRISK.
    return(
        df %>%
            rename(tl_x = "Total Liabilities", mc_x = "Market Capitalization") %>%
            mutate(srisk = (l * tl_x) - ((1 - l) * (1 - lrmes) * mc_x) %>%
                                  {ifelse(. < 0, 0,.)})) 
    }

# Package `ccgarch` sample code

## Simulation - `dcc.sim`

Simulating data from the original DCC-GARCH(1,1) process  
See **P11** of https://cran.r-project.org/web/packages/ccgarch/ccgarch.pdf  
https://www.rdocumentation.org/packages/ccgarch/versions/0.2.3/topics/dcc.sim  

In [21]:
  nobs <- 50
  a <- c(0.003, 0.005, 0.001)
  A <- diag(c(0.2,0.3,0.15))
  B <- diag(c(0.75, 0.6, 0.8))
  uncR <- matrix(c(1.0, 0.4, 0.3, 0.4, 1.0, 0.12, 0.3, 0.12, 1.0),3,3)
  dcc_para <- c(0.01,0.98)
  dcc_data <- dcc.sim(nobs, a, A, B, uncR, dcc_para, model="diagonal")

Outputs are matrices:
* `z`
* `std.z`
* `dcc`
* `h`
* `eps`

In [28]:
#lapply(dcc_data,head)

## Estimation - `dcc.estimation`

#### Simple demo

Estimating a DCC-GARCH(1,1) model  
See **P6** of https://cran.r-project.org/web/packages/ccgarch/ccgarch.pdf   
https://www.rdocumentation.org/packages/ccgarch/versions/0.2.3/topics/dcc.estimation  

**`inia`** - a vector of initial values for the constants in the GARCH equation `length(inia)=N`  
**`iniA`** - a matrix of initial values for the ARCH parameter matrix (N×N)  
**`iniB`** - a matrix of initial values for the GARCH parameter matrix (N×N)  
**`ini.dcc`** - a vector of initial values for the DCC parameters (2×1)  
**`dvar`** - a matrix of the data (T×N)  
**`model`** - a character string describing the model. "diagonal" for the diagonal model and "extended" for the extended    
**`method`** - a character string specifying the optimisation method in optim.   
**`gradient`** - a switch variable that determines the optimisation algorithm in the second stage optimisation.  
**`message`** - a switch variable to turn off the display of the message   

In [31]:
dcc_results <- dcc.estimation(inia=a, iniA=A, iniB=B, ini.dcc=dcc_para, 
        dvar=dcc_data$eps, model="extended")  # or model="diagonal"

****************************************************************
*  Estimation has been completed.                              *
*  The outputs are saved in a list with components:            *
*    out    : the estimates and their standard errors          *
*    loglik : the value of the log-likelihood at the estimates *
*    h      : a matrix of estimated conditional variances      *
*    DCC    : a matrix of DCC estimates                        *
*    std.resid : a matrix of the standardised residuals        *
*    first  : the results of the first stage estimation        *
*    second : the results of the second stage estimation       *
****************************************************************


Outputs are:
* `out` - parameter estimates and their standard errors: vector `a`, matrices `A` & `B`
* `loglik` - the value of the log-likelihood at the estimates (scalar)
* `h` - estimated conditional variances ($T\times N$)
* `DCC` - a matrix of DCC estimates  ($T\times N^2$)
* `std.resid` a matrix of the standardised residuals ($T\times N$)
* `first `
* `second`

#### Results of estimation

In [292]:
#lapply(dcc_results,class)
#lapply(dcc_results,print)
dcc_results[c("h","DCC","std.resid")] %>% map(head)

0,1
0.04842365,0.03602246
0.04502395,0.02629809
0.0470647,0.02865784
0.05945404,0.05630532
0.05568064,0.05682818
0.0514102,0.04838037

0,1,2,3
1,0.6332127,0.6332127,1
1,0.6332127,0.6332127,1
1,0.6332127,0.6332127,1
1,0.6332127,0.6332127,1
1,0.6332127,0.6332127,1
1,0.6332127,0.6332127,1

0,1
-0.2330431,-0.0824776
-1.1632722,-0.5824885
2.0026542,1.9644564
0.5818376,-1.4136188
0.3460547,1.1971921
0.1706414,0.360051


#### DCC estimates - `dcc.results$DCC` - reshape the `DCC` matrix

Instead of a 3-d $T\times N\times N$ `array`, the `DCC` result is a 2-d $T\times N^2$ `matrix`, so we have to reshape it.

In [296]:
array(dcc_results$DCC, dim = c(50,3,3))[1,,]

0,1,2
1.0,1.0,0.6332127
0.6332127,1.0,1.0
0.6332127,0.6332127,1.0


##### Steps

In [47]:
head(dcc_results$DCC)

0,1,2,3,4,5,6,7,8
1,0.4153004,0.09603502,0.4153004,1,0.1762781,0.09603502,0.1762781,1
1,0.4153004,0.09603502,0.4153004,1,0.1762781,0.09603502,0.1762781,1
1,0.4153004,0.09603502,0.4153004,1,0.1762781,0.09603502,0.1762781,1
1,0.4153004,0.09603502,0.4153004,1,0.1762781,0.09603502,0.1762781,1
1,0.4153004,0.09603502,0.4153004,1,0.1762781,0.09603502,0.1762781,1
1,0.4153004,0.09603502,0.4153004,1,0.1762781,0.09603502,0.1762781,1


In [48]:
dim(dcc_results$DCC)

In [49]:
class(dcc_results$DCC)

In [26]:
dcc_results$DCC[1,]

In [27]:
matrix(dcc_results$DCC[1,],nrow = 3,ncol = 3)

0,1,2
1.0,0.4832593,0.3501419
0.4832593,1.0,0.4116337
0.3501419,0.4116337,1.0


In [32]:
matrix(dcc_results$DCC[1,],nrow = 3,ncol = 3)

0,1,2
1.0,0.4832593,0.3501419
0.4832593,1.0,0.4116337
0.3501419,0.4116337,1.0


#### Define `dcc_estimation` - pared-down `dcc.estimation`

https://www.rdocumentation.org/packages/ccgarch/versions/0.2.3/topics/dcc.estimation

In [26]:
# returns1 = generate_data(c("A","B","C","D"),dates3,c("return_mkt","return_firm"))

In [83]:
dcc_estimation <- function (dvar, model = "extended", method = "BFGS", message = 0) 
{
    N <- dim(dvar)[2]
    a <- numeric(N)
    A <- diag(N)
    B <- diag(N)
    ini.dcc <- rep(0.01, 2)
    return(dcc.estimation(inia = a, iniA = A, iniB = B, ini.dcc = ini.dcc, 
        dvar = dvar, model = model, method = method, message = message) %>% 
        {
            .[c("h", "DCC")]
        })
}

In [8]:
data_market <- generate_data(c("TSX"),dates50,c("Price")) 
data_firms <- generate_data(big5_rics,dates50,c("Return","Equity","Bond"))

In [9]:
returns_all <- merge(
    data_firms %>% 
        rename(return_firm = Return),
    data_market %>% 
        mutate(return_mkt = (Price / lag(Price)) -1) %>%
    replace_na(list(return_mkt = 0))%>% 
        select(-RIC,-Price) , 
    by ="Date" ) %>% 
    arrange(RIC,Date) 

In [57]:
returns_all %>% head(10)

Date,RIC,return_firm,Equity,Bond,return_mkt
2018-02-23,RY.TO,1.06806278,1.25109463,1.0484841,0.0
2018-02-23,TD.TO,-0.72786421,0.42168109,-1.0579882,0.0
2018-02-23,BNS.TO,-0.24959535,-0.88489474,-1.396664,0.0
2018-02-23,BMO.TO,-0.09460314,-0.94489995,-0.3218193,0.0
2018-02-23,CM.TO,0.29930143,-0.02178584,-0.3202923,0.0
2018-02-24,RY.TO,1.99680917,-1.33614968,-0.6541031,2.936656
2018-02-24,TD.TO,1.13307361,-1.14996412,-0.4109774,2.936656
2018-02-24,BNS.TO,-0.34823941,0.09843343,0.4104379,2.936656
2018-02-24,BMO.TO,0.55162127,-1.78962246,0.4118535,2.936656
2018-02-24,CM.TO,-0.52932741,0.41205111,0.3254376,2.936656


In [73]:
returns_all[c("return_mkt","return_firm")] %>%
    dcc_estimation %>%
    {list(h_mkt = .$h[,1], h_firm = .$h[,2], corr = .$DCC[,2])} %>%
    {bind_cols(returns_all,.)} %>%
    head

Date,RIC,return_firm,Debt,Equity,return_mkt,h_mkt,h_firm,corr
2018-03-01,RY.TO,-0.1053584,0.2661212,-0.879871,0.0,1.246866,1.09618,0.01041874
2018-03-02,RY.TO,-0.4432634,-1.68650993,0.1618463,-1.7539523,1.249435,1.107421,0.01041874
2018-03-03,RY.TO,-0.345755,0.4993765,-0.489369,0.8102618,1.252037,1.118778,0.01041874
2018-03-01,TD.TO,-0.7206326,-0.27909894,-0.3634839,0.0,1.254623,1.130251,0.01041874
2018-03-02,TD.TO,-1.8897634,-0.06550758,0.9047119,-1.7539523,1.257209,1.141843,0.01041874
2018-03-03,TD.TO,-1.6234864,-1.26107744,-2.6898497,0.8102618,1.259827,1.153556,0.01041876


#### Two-dimensional example

In [10]:
random_matrix <- function(N,T){
    return(matrix(rnorm(N * T, 0, 1), N, T))}

In [119]:
dcc_paras_sim = 
    list(nobs = 50,
        a = c(0.003, 0.005),
        A = diag(c(0.2,0.3)),
        B = diag(c(0.75, 0.6)),
        R = matrix(c(1.0, 0.4, 0.4, 1.0),2,2),
        dcc.para = c(0.01,0.98), 
        model="diagonal")

In [120]:
dcc_data <- do.call(dcc.sim,dcc_paras_sim)

In [181]:
# returns_2d = random_matrix(50,2)
dcc_results <- dcc.estimation(inia=a, iniA=A, iniB=B, ini.dcc=dcc_para, 
        dvar= dcc_data$eps , model="extended") 

****************************************************************
*  Estimation has been completed.                              *
*  The outputs are saved in a list with components:            *
*    out    : the estimates and their standard errors          *
*    loglik : the value of the log-likelihood at the estimates *
*    h      : a matrix of estimated conditional variances      *
*    DCC    : a matrix of DCC estimates                        *
*    std.resid : a matrix of the standardised residuals        *
*    first  : the results of the first stage estimation        *
*    second : the results of the second stage estimation       *
****************************************************************


In [173]:
returns = data.frame(
    date = seq(today()-ddays(nobs-1), today(), 1), 
    return_mkt = dcc_data$eps[,1], 
    return_firm = dcc_data$eps[,2] ) 

In [174]:
return_data = data.frame(
    date = seq(today()-ddays(nobs-1), today(), 1), 
    returns = dcc_data$eps ) 

Unpack the `matrix` elements of the `list` & create a `data.frame`

In [180]:
dcc_results %>%
    {data.frame(h_mkt = .$h[1], h_firm = .$h[2], corr = .$DCC[,2])} %>%
     head

h_mkt,h_firm,corr
0.08948357,0.06093694,0.09773793
0.08948357,0.06093694,0.09773793
0.08948357,0.06093694,0.09773795
0.08948357,0.06093694,0.09773793
0.08948357,0.06093694,0.09773791
0.08948357,0.06093694,0.09773776


# Calling DCC-GARCH

### Overview of SRISK calculation

| Step | In   | Out   | Parameters   |  Notes |
|------|------|------|---------------|
|  ccgarch  | returns| h, DCC| NA     |     |
|  MES  | h,DCC | MES,LRMES| Market shock (40%), Confidence level (95%)     |     |
|  SRISK  | LRMES, Assets, Liabilities | SRISK| NA     |     |

### Universe & market index - RICs, names

In [11]:
big5_rics = c("RY.TO","TD.TO","BNS.TO","BMO.TO","CM.TO")
big5_names = c("Royal Bank of Canada","Toronto-Dominion Bank","Bank of Nova Scotia",
              "Bank of Montreal","Canadian Imperial Bank of Commerce")
market_ric = ".GSPTSE"

### Generating fictitious returns data

http://clayford.github.io/dwir/dwr_12_generating_data.html 

#### Parameters

In [12]:
dates3 <- as.Date(c("2018-03-01", "2018-03-02", "2018-03-03"))
dates50 <- seq(today()-ddays(50-1), today(), by='days')
firms5 <- c("A","B","C","D","E")
fields2 <-  paste("Field",1:2,sep="")

#### `generate.data`   

The function `generate.data` generates random data with a pair of row indices: date & firm, & any number of fields in each column.

In [13]:
generate_data = function(firms,dates,fields) {
    nfirms <- length(firms)
    ndates <- length(dates)
    nfields <- length(fields)
    row_ids <- expand.grid(Date=dates,RIC=firms)[,c("RIC","Date")]
    values <- data.frame(matrix(rnorm(nfirms * ndates * nfields),ncol=nfields))
    colnames(values) <- fields
    # row.names(row_ids) <- paste(row_ids$"RIC",row_ids$"Date", sep="/") # See add_rownames_ric_date
    return(cbind(row_ids,values))
}

In [14]:
add_rownames_ric_date = function(df) {
    row.names(df) <- paste(df$RIC,df$Date, sep="/")
    return(df)
}

#### Test it

In [188]:
generate_data(firms5,dates3,fields2) %>% add_rownames_ric_date %>% head(5)

Unnamed: 0,RIC,Date,Field1,Field2
A/2018-03-01,A,2018-03-01,0.1571562,0.9744708
A/2018-03-02,A,2018-03-02,-0.7067523,-0.2198407
A/2018-03-03,A,2018-03-03,-1.9192296,-0.9474985
B/2018-03-01,B,2018-03-01,1.2384404,0.331944
B/2018-03-02,B,2018-03-02,0.1974105,-0.5227953


In [15]:
firm_data = generate_data(big5_rics,dates3,c("Return","Debt","Equity"))
market_data = generate_data(market_ric,dates3,c("Price"))

### Transforming series

#### Overview

* Create new fields / series from old using `mutate`  
* Create new aggregate fields / series using `group_by` and `summarise`  
* Alternative is to use `split` (to create a `list` of `data.frames`) and `map`

##### Creating new fields using `mutate`

In [253]:
f=function(x,y){return(x^2+exp(y))} # Elementwise operations on vectors.

In [255]:
firm_data %>% 
    mutate(
        d2e = Debt / Equity, 
        cumsum_debt = cumsum(Debt), 
        dot_product = Debt %*% Equity,
        fn_result = f(Debt,Equity)) %>%
    add_rownames_ric_date %>%
    head(5) 

Unnamed: 0,RIC,Date,Return,Debt,Equity,d2e,cumsum_debt,dot_product,fn_result
RY.TO/2018-03-01,RY.TO,2018-03-01,-0.9458976,-0.618973,-0.18836422,3.2860433,-0.618973,-4.506824,1.2114405
RY.TO/2018-03-02,RY.TO,2018-03-02,0.6132129,-1.4864735,0.01217025,-122.1399003,-2.105446,-4.506824,3.221848
RY.TO/2018-03-03,RY.TO,2018-03-03,0.673705,-0.1550855,-1.17686898,0.1317781,-2.260532,-4.506824,0.3322939
TD.TO/2018-03-01,TD.TO,2018-03-01,-0.9730841,0.2733288,-0.08121659,-3.3654307,-1.987203,-4.506824,0.9967026
TD.TO/2018-03-02,TD.TO,2018-03-02,-0.2943303,-1.1159619,1.90865224,-0.5846858,-3.103165,-4.506824,7.9893643


#### Creating new aggregate fields using `group_by`, `do` & `summarise`

##### Using `group_by` and `do`

In [97]:
firm_data %>% group_by(RIC) %>% do(head(.,1))

RIC,Date,Return,Debt,Equity
RY.TO,2018-03-01,0.559849,0.994846,0.138256382
TD.TO,2018-03-01,0.1393961,1.3550318,-0.001701984
BNS.TO,2018-03-01,1.082532,1.0663013,-0.799954415
BMO.TO,2018-03-01,-1.8760219,0.0297518,-0.755924374
CM.TO,2018-03-01,0.4919617,-0.9827256,1.548481889


In [99]:
firm_data %>% group_by(RIC) %>% do(new = .$Debt/.$Equity)

RIC,new
RY.TO,"7.1956606, 0.2021058, -0.3793467"
TD.TO,"-796.148268, -1.528699, 6.578500"
BNS.TO,"-1.332953, -1.971871, 2.016039"
BMO.TO,"-0.03935817, 0.14923994, -1.06347798"
CM.TO,"-0.6346381, 1.1187117, -0.3102249"


https://www.r-bloggers.com/dplyr-do-some-tips-for-using-and-programming/

In [102]:
my_fun <- function(x, y){
  res_x = mean(x) + 2
  res_y = mean(y) * 5 
  return(data.frame(res_x, res_y))
}

In [106]:
# Apply my_fun() function to ds by group
firm_data %>% group_by(RIC) %>% do(my_fun(.$Debt, .$Equity))

RIC,res_x,res_y
RY.TO,2.055879,1.5023396
TD.TO,2.685327,1.4187041
BNS.TO,2.370475,-0.8308779
BMO.TO,2.119288,-4.5232735
CM.TO,1.383114,3.3227209


##### Using `group_by` and `summarise`

In [17]:
firm_data %>%
    group_by(RIC) %>%
    summarise(avg_debt = mean(Debt), 
              min_equity = min(Equity),
              dot_product = sum(Debt * Equity) ) %>%
    left_join( firm_data, . , by = c("RIC"="RIC")) %>% # Merge with inputs
    # left_join( firm.data %>% rownames_to_column('rownames'), . ) %>%
    # column_to_rownames('rownames') %>%
    head(5)

RIC,avg_debt,min_equity,dot_product
RY.TO,0.8580499,-0.47987191,3.873163
TD.TO,0.1990649,-2.11770332,-0.2116659
BNS.TO,0.2217307,-0.07531833,-0.5978766
BMO.TO,-0.1442535,0.0319708,-0.5826413
CM.TO,-0.8334507,-2.32975937,1.1116275


##### `group_by` & `summarize`

In [168]:
returns_all[c("RIC","ret_m","ret_x")] %>%
    group_by(RIC) %>%
    summarize(var_x = quantile(.$ret_m,0.1) )

RIC,var_x
RY.TO,-0.5894609
TD.TO,-0.5894609
BNS.TO,-0.5894609
BMO.TO,-0.5894609
CM.TO,-0.5894609


#### Using `split` & `map` (from `purrr`)

This could be an alternative to using `group_by` & `summarize`.  
The idea is to `split` the data-frame into a `list` of data-frames, over which we can `map` a function. Tilde `~` is necessary for non-standard evaluation of functions. 

In [19]:
firm_data %>%
    split(.$RIC)  %>%   # This yields a list of dfs; use ~ for non-standard evaluation
    map(~left_join( ., market_data %>% select(-RIC) , by = c("Date"="Date"))) %>%
    map(~mutate(.,new = Debt + Equity)) %>%
    .[1:2]

RIC,Date,Return,Debt,Equity,Price,new
RY.TO,2018-03-01,1.20034231,-0.9431565,-0.4798719,-0.92193,-1.423028
RY.TO,2018-03-02,0.05597288,1.4167669,0.1731779,0.3036474,1.589945
RY.TO,2018-03-03,-0.74644365,2.1005394,1.5116193,0.6797938,3.612159

RIC,Date,Return,Debt,Equity,Price,new
TD.TO,2018-03-01,0.4934651,0.4360902,-1.6811713,-0.92193,-1.2450811
TD.TO,2018-03-02,0.9282905,-0.150606,-2.1177033,0.3036474,-2.2683093
TD.TO,2018-03-03,-0.3876854,0.3117105,0.6497621,0.6797938,0.9614726


### Tidyverse examples

#### Testing `select` and `rename`

In [50]:
market_data %>% select(!!c("Date","Close price"))

Date,Close price
2018-03-01,-0.9781007
2018-03-02,-1.0658338
2018-03-03,0.3797031


In [55]:
market_data %>% select(-RIC)

Date,Close price
2018-03-01,-0.9781007
2018-03-02,-1.0658338
2018-03-03,0.3797031


In [56]:
market_data %>% rename(Market = "RIC")

Market,Date,Close price
.GSPTSE,2018-03-01,-0.9781007
.GSPTSE,2018-03-02,-1.0658338
.GSPTSE,2018-03-03,0.3797031


### Putting it all together

In [13]:
firm_data %>%
    left_join( (market_data %>% select(-RIC)), . , by = c("Date"="Date")) %>%
    group_by(RIC) %>%
    summarise(avg_debt = mean(Debt), 
              min_equity = min(Equity),
              dot_product = sum(Price * Equity) ) %>%
    left_join( firm_data, . , by = c("RIC"="RIC")) %>%  # Join with inputs
    head(5)

RIC,Date,Return,Debt,Equity,avg_debt,min_equity,dot_product
RY.TO,2018-03-01,1.0499078,1.6019564,0.5659224,-0.1412319,-0.5903469,-2.171131
RY.TO,2018-03-02,1.8156609,-1.3938039,-0.5903469,-0.1412319,-0.5903469,-2.171131
RY.TO,2018-03-03,-0.8194842,-0.6318483,2.1585673,-0.1412319,-0.5903469,-2.171131
TD.TO,2018-03-01,-0.3877224,-0.9503588,-1.9986268,1.1574392,-1.9986268,2.017711
TD.TO,2018-03-02,-1.1840824,2.7635235,-1.3338509,1.1574392,-1.9986268,2.017711


### Process firms with `do_firm`, combine with input df using `bind_cols`

In python pandas it is possible to have a function that accepts series, & returns series, some of which can be (the repeated values of) aggregate functions. The assumption is that the input series are already firm specific, so a call such as `x.sum()` will be summing over dates for a single firm. 

In pandas the series conveniently retain their indices. The closest to an index in R is `row.names`, but 
* multi-index is not supported, so have to use `paste` to build a composite string!
* these get lost under most `dplyr` operations, so row names have to be copied over to a temporary field & then copied back   

Another option is to ignore the RIC & Date indices & just work with bare series, & just rely on them staying present & in the same order.

In [16]:
# Create a dataframe of sample data with two fields
data_xy = generate_data(big5_rics,dates3,c("x","y")) 

Would it better to have `do_firm` return a `data.frame` instead of `list`? Not much difference. The functions to `merge`/`cbind`/`bind_cols` are `list`/`data.frame` agnostic.

In [17]:
do_firm <- function(x,y){
    #Function of multiple numeric vectors. Returns list (~dict) of series. No aggregate functions.
    return(list(u = x, v = y))}

This function creates a pair of completely random vectors of the same length as the inputs. (Just to check that the operations are not done row-wise.)

In [18]:
do_firm_rnd <- function(x,y){
    # Return 2 random vectors: u,v
    return(length(x) %>%
            {matrix(rnorm(. * 2),ncol=2)} %>%
            data_frame %>%
            rename(u=X1,v=X2)
          )}   

Can `merge` `data.frame`s with `list`s??!

In [290]:
data_xy %>% 
    {do_firm(.$x, .$y)} %>% 
    bind_cols( data_xy, . ) %>%
    head

RIC,Date,x,y,u,v
RY.TO,2018-03-01,0.008667789,0.7217674,0.008667789,0.7217674
RY.TO,2018-03-02,-0.385569363,0.2137734,-0.385569363,0.2137734
RY.TO,2018-03-03,2.589840823,-1.4053111,2.589840823,-1.4053111
TD.TO,2018-03-01,0.446101201,-0.6425107,0.446101201,-0.6425107
TD.TO,2018-03-02,-1.199868225,0.5515122,-1.199868225,0.5515122
TD.TO,2018-03-03,0.82069346,-0.8019894,0.82069346,-0.8019894


# MES & SRISK for systemic risk

https://vlab.stern.nyu.edu/analysis/RISK.USFIN-MR.MES  

## Benoit Sylvain - Comparison of measures  - Matlab / Octave 

http://www.runmycode.org/companion/view/175  
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1973950  

### `call_fct`

In [None]:
# % Returns for the market index & the given asset
data = [index asset];
data = data(~isnan(data(:,2).*data(:,1)),:); #% select only elements which are not NaN
data_center = data - ones(size(data,1),1)*mean(data); #%demeaned returns

# %% GJR-GARCH and DCC
[parameters, loglikelihood, Ht, Rt, Qt, stdresid, likelihoods, stderrors, A,B, jointscores, H]
    =dcc_mvgarch(data_center,1,1,1,1);
ht_m=sqrt(H(:,1)); #%market conditional volatility
ht_i=sqrt(H(:,2)); #%asset conditional volatility
rho=squeeze(Rt(1,2,:)); #%conditional correlation
c = quantile(data_center(:,1),alpha); # % HS VaR (nonparametric), it's our systemic event, it's a scalar here
MES = - fct_MES(data_center,c,ht_m,ht_i,rho);
LRMES = (1-exp(-18.*MES)); # %without simulation
SRISK = k.*LTQ - (1-k).*(1-LRMES).*MV;

### `dcc_mvgarch`

#### Purpose:

Estimates a multivariate GARCH model using the DCC estimator of Engle and Sheppard

#### Usage:

In [None]:
[parameters, loglikelihood, Ht, Rt, Qt, likelihoods, stdresid, stderrors, A,B, jointscores] 
    = dcc_mvgarch(data,dccQ,dccP,archQ,garchP)

#### Inputs:

`data          = A zero mean t by k (asset number) vector of residuals from some filtration [t by k]
dccQ          = The lag length of the innovation term in the DCC estimator (a scalar)
dccP          = The lag length of the lagged correlation matrices in the DCC estimator (a scalar)
archQ         = One of two things:     A scalar, q     in which case a p innovation model is estimated for each series
                                       A k by 1 vector in which case the ith series has innovation terms p=archP(i)
garchP        = One of two things:     A scalar, p     in which case a q GARCH lags is used in estimation for each series
                                       A k by 1 vector in which case the ith series has lagged variance terms q=archQ(i)`

#### Outputs:

`parameters    = A vector of parameters estimated from the model of the form    
                    [GarchParams(1) GarchParams(2) ... GarchParams(k) DCCParams]    
                    where the Garch Parameters from each estimation are of the form    
                    [omega(i) alpha(i1) alpha(i2) ... alpha(ip(i)) beta(i1) beta(i2) ... beta(iq(i))]    
loglikelihood = The log likelihood evaluated at the optimum    
Ht            = A k by k by t array of conditional variances    
Rt            = A k by k by t array of Rt elements    
Qt            = A k by k by t array of Qt elements   
stdresid      = A [t by k] matrix of standardized residuals   
likelihoods   = the estimated likelihoods t by 1   
stderrors     = A length(parameters)^2 matrix of estimated correct standard errors   
A             = The estimated A from the robust standard errors    
B             = The estimated B from the standard errors   
jointscores   = The estimated scores of the likelihood t by length(parameters)   
H             = Conditional Volatility univariate   `

### `fct_MES`

In [None]:
function [MES] = fct_MES(data,c,ht_m,ht_i,rho)
    em=(data(:,1))./ht_m; %market first column
    xi=((data(:,2)./ht_i)-rho.*em)./ sqrt(1-rho.^2); %asset second column
    bwd=1*(size(data,1)^(-0.2)); # Scaillet's bwd p21, I put 1 instead of the standard deviation because our shocks are iid with unit variance
    K1=sum(em.*(normcdf(((c./ht_m)-em)./bwd)))./(sum(normcdf(((c./ht_m)-em)./bwd)));
    K2=sum(xi.*(normcdf(((c./ht_m)-em)./bwd)))./(sum(normcdf(((c./ht_m)-em)./bwd)));
    MES = (ht_i.*rho.*K1) + (ht_i.*sqrt(1-rho.^2).*K2);

## `TommasoBelluzzo/SystemicRisk` - Matlab / Octave

This code uses the Glosten-Jagannathan-Runkle (GJR) DCC Model   
https://vlab.stern.nyu.edu/doc/3?topic=mdls  
https://vlab.stern.nyu.edu/doc/14?topic=mdls  

### `main_pro.m`

https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/main_pro.m    

In [None]:
#% Supply 2 series of returns (with means subtracted)
[p,s] = dcc_gjrgarch([ret0_m ret0_x]);
#% p       = An n-by-n-by-t matrix of floats containing the DCC coefficients.
#% s       = A t-by-n matrix of floats containing the conditional variances.
s_m = sqrt(s(:,1));
s_x = sqrt(s(:,2));
p_mx = squeeze(p(1,2,:)); #% Pull out the off-diagonal correlation (between market & firm)

beta_x = p_mx .* (s_x ./ s_m);
var_x = s_x * quantile((ret0_x ./ s_x),data.A); #% Find the VaR of the firm? NOT USED!

[mes,lrmes] = calculate_mes(ret0_m,s_m,ret0_x,s_x,beta_x,p_mx,data.A,data.D); #% Hopefully R package ccgarch can do this step?
srisk = calculate_srisk(lrmes,data.FrmsLia(:,i),data.FrmsCap(:,i),data.L); #% SRISK needs the (LR)MES + balance sheet data + crash level (e.g. 40%)

### `calculate_mes_internal`

https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/calculate_mes.m

In [None]:
function [mes,lrmes] = calculate_mes_internal(ret0_m,s_m,ret0_x,s_x,beta_x,p_mx,a,d)
    c = quantile(ret0_m,a);
    h = 1 * (length(ret0_m) ^ (-0.2));
    u = ret0_m ./ s_m;
    x_den = sqrt(1 - (p_mx .^ 2));
    x_num = (ret0_x ./ s_x) - (p_mx .* u);
    x = x_num ./ x_den;
    f = normcdf(((c ./ s_m) - u) ./ h);
    k1 = sum(u .* f) ./ sum(f);
    k2 = sum(x .* f) ./ sum(f);
    mes = (s_x .* p_mx .* k1) + (s_x .* x_den .* k2);
    lrmes = 1 - exp(log(1 - d) .* beta_x);

### `calculate_srisk_internal` 
https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/calculate_srisk.m  

In [None]:
function srisk = calculate_srisk_internal(lrmes,tl_x,mc_x,l)
    srisk = (l .* tl_x) - ((1 - l) .* (1 - lrmes) .* mc_x);
    srisk(srisk < 0) = 0;

### `dcc_gjrgarch`

#### Inputs

In [None]:
data    = A numeric t-by-n matrix containing the input data.
dcc_q   = An integer representing the lag of the innovation term in the DCC estimator (optional, default=1).  
dcc_p   = An integer representing the lag of the lagged correlation matrices in the DCC estimator (optional, default=1). 
arch_q  = Optional argument (default=1) with two possible types:
           - An integer representing the lag of the innovation terms in the ARCH estimator. 
           - A vector of integers, of length n, containing the lag of each innovation term in the ARCH estimator.   
garch_p = Optional argument (default=1) with two possible types:
           - An integer representing the lag of the innovation terms in the GARCH estimator.
           - A vector of integers, of length n, containing the lag of each innovation term in the GARCH estimator 

#### Outputs

`p       = An n-by-n-by-t matrix of floats containing the DCC coefficients.
s       = A t-by-n matrix of floats containing the conditional variances.`