# SRISK - python

#### Goals

The python modules & parameters for the SRISK Azure Machine Learning Studio (AMLS) experiment to calculate the SRISK systemic risk measure. 

Currently, the core of the SRISK model, the DCC-GARCH model, is implemented in R only (using the `ccgarch` package). The easier parts of the calculation, namely MES & SRISK are implemented below (translated from Matlab). The original intention was to connect the R & python parts using Azure ML Studio, or [`rpy2`](https://rpy2.github.io/doc/latest/html/introduction.html#), `reticulate` etc.

https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/calculate_mes.m 
https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/calculate_srisk.m
https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/main_pro.m  (see `main_pro_internal`)

#### Hosting on Azure

This notebook can be [hosted on Azure](https://notebooks.azure.com/). It is necessary to sign in to Microsoft (perhaps using a (free) Hotmail or Outlook account), & to clone the notebook, in order to run it. Or it can be run locally or on another Jupyter notebook hosting service.

Notebooks using multi-index pandas dataframes do not render properly in preview mode on Azure.

##### Sources for Matlab code

Belluzzo, Tommaso. SystemicRisk: A Framework for Systemic Risk Valuation and Analysis. Matlab, 2018. https://github.com/TommasoBelluzzo/SystemicRisk.  
Bisias, Dimitrios, Mark D. Flood, Andrew W. Lo, and Stavros Valavanis. A Survey of Systemic Risk Analytics. Matlab, 2012. https://financialresearch.gov/working-papers/files/OFRwp0001_BisiasFloodLoValavanis_MatlabCode-v0_3.zip.
———. “A Survey of Systemic Risk Analytics.” SSRN Scholarly Paper. Rochester, NY: Social Science Research Network, January 11, 2012. http://papers.ssrn.com/abstract=2747882.  
Dube, Qobolwakhe. SA-Systemic-Risk: Systemic Risk Ranking of South Africa’s Financial Institutions. Matlab, 2017. https://github.com/qobolwakhe/SA-systemic-risk.  
Perignon, Christophe, Sylvain Benoit, Christophe Hurlin, and Gilbert Colletaz. Run My Code - A Theoretical and Empirical Comparison of Systemic Risk Measures. Accessed July 11, 2016. http://www.runmycode.org/companion/view/175.  
V-Lab Stern NYU. “GARCH-DCC Documentation.” V-Lab. Accessed May 8, 2018. https://vlab.stern.nyu.edu/doc/13?topic=mdls.  

## Libraries

In [1]:
import datetime
import pandas as pd
import numpy as np

## `generate_data`

### Some string lists for sample data generation

In [2]:
firms = ["RY.TO","TD.TO","BNS.TO","BMO.TO","CM.TO"]
dates = ["2018-03-01", "2018-03-02", "2018-03-03"]
fields = ["Field1","Field2","Field3"]
[nfirms,ndates,nfields] = [len(z) for z in [firms,dates,fields]]
dates2 = [d.date() for d in pd.to_datetime(dates)] # Convert
dates3 = pd.date_range('2018-03-01', periods=3)   # Pandas date_range

### `generate_data`

In [3]:
def generate_data(firms,dates,fields):
    [nfirms,ndates,nfields] = [len(z) for z in [firms,dates,fields]]
    dates2 = [d.date() for d in pd.to_datetime(dates)]  # Convert strings to datetimes
    firm_date_index = pd.MultiIndex.from_product([firms, dates2], names=['Firm', 'Date'])
    df = pd.DataFrame( np.random.randn(ndates * nfirms, nfields), index = firm_date_index, columns=fields)
    return(df)

In [None]:
sample_data = generate_data(firms,dates,fields)

### Manipulate dataframe with multi-index using `index`, `.loc`, `unstack`

Experiments extracting MultiIndex info from a dataframe

#### `.index`

In [11]:
sample_data.index

MultiIndex(levels=[['BMO.TO', 'BNS.TO', 'CM.TO', 'RY.TO', 'TD.TO'], [2018-03-01 00:00:00, 2018-03-02 00:00:00, 2018-03-03 00:00:00]],
           labels=[[3, 3, 3, 4, 4, 4, 1, 1, 1, 0, 0, 0, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]],
           names=['Firm', 'Date'])

#### `.index.levels[0].values`

In [13]:
sample_data.index.levels[0].values

array(['BMO.TO', 'BNS.TO', 'CM.TO', 'RY.TO', 'TD.TO'], dtype=object)

#### `.loc[]` for selecting rows, `[]` for selecting columns

In [90]:
sample_data.loc['BMO.TO']['Field1']

Date
2018-03-01    0.972150
2018-03-02   -0.288688
2018-03-03    0.825964
Name: Field1, dtype: float64

#### Splitting dataframe into a list of individual records.

In [14]:
[sample_data.loc[[i]] for i in sample_data.index][:2]

[                    Field1    Field2   Field3
 Firm  Date                                   
 RY.TO 2018-03-01 -0.510618  0.360086 -0.41942,
                     Field1   Field2    Field3
 Firm  Date                                   
 RY.TO 2018-03-02 -0.838906  0.79444  0.304467]

#### `.unstack`

In [15]:
sample_data.unstack(level='Firm')

Unnamed: 0_level_0,Field1,Field1,Field1,Field1,Field1,Field2,Field2,Field2,Field2,Field2,Field3,Field3,Field3,Field3,Field3
Firm,BMO.TO,BNS.TO,CM.TO,RY.TO,TD.TO,BMO.TO,BNS.TO,CM.TO,RY.TO,TD.TO,BMO.TO,BNS.TO,CM.TO,RY.TO,TD.TO
Date,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2
2018-03-01,0.039434,-0.134752,0.447877,-0.510618,1.399231,-0.330408,-0.30318,-0.582555,0.360086,0.736067,0.160092,0.196177,-0.380699,-0.41942,1.543249
2018-03-02,-0.764024,-1.290992,-1.125049,-0.838906,-1.775888,-1.403676,1.970138,0.621628,0.79444,-0.696343,0.390696,-0.330183,-0.407283,0.304467,0.791402
2018-03-03,0.299869,-1.067755,0.577828,0.403372,1.937433,-0.756984,-1.091377,-0.152663,0.774011,-0.427033,0.484912,1.61404,-0.151665,-1.195967,1.234595


## Sample data

Best not to use hyphen `-` in names or else not possible to use `df.field_name` construct. Underscores `_` are OK.

In [None]:
df_firm_returns = generate_data(firms,dates,['Return'])
df_dcc_garch =    generate_data(firms,dates,['ConditionalVariance_h','ConditionalCorrelation_R'])
df_mes =          generate_data(firms,dates,['LRMES','MES','Beta'])

df_market_price = generate_data(['.GSPTSE'],dates,['Close'])

### Create new series from existing ones

https://kaijento.github.io/2017/04/22/pandas-create-new-column-sum/ gives some simple examples.

In [44]:
def add2(x,y): return x + y  
def add3(x,y,z): return x + y + z

Dataframes support vectorized operations, so applying a function to multiple fields to create a new field is trivial. Scalar parameters can be mixed in with series without fuss. The function is applied at the row level.

In [53]:
df = generate_data(firms,dates,['x','y'])
df['new'] = add2(df.x ,df.y)
df['new2'] = add3(df.x ,df.y,-999)    # x,y are series, z is a scalar
df

Unnamed: 0_level_0,Unnamed: 1_level_0,x,y,new,new2
Firm,Date,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
RY.TO,2018-03-01,-0.98051,0.737289,-0.243221,-999.243221
RY.TO,2018-03-02,-1.075726,-0.005682,-1.081408,-1000.081408
RY.TO,2018-03-03,-0.810852,-0.917874,-1.728726,-1000.728726
TD.TO,2018-03-01,1.215862,0.798152,2.014014,-996.985986
TD.TO,2018-03-02,0.41234,-0.288999,0.123341,-998.876659
TD.TO,2018-03-03,-0.817653,-1.371886,-2.189539,-1001.189539
BNS.TO,2018-03-01,1.934915,0.214242,2.149157,-996.850843
BNS.TO,2018-03-02,-1.734872,0.862594,-0.872278,-999.872278
BNS.TO,2018-03-03,-0.694934,1.186807,0.491873,-998.508127
BMO.TO,2018-03-01,1.761474,-0.650017,1.111458,-997.888542


In [5]:
def calc_firm_ccgarch(firm):
    firm_return = df_firm_returns.loc[firm]['Return'] # Firm is row index (1/2); Return is col index.
    market_return = df_market_price.loc['.GSPTSE']['Close'].pct_change(1)
    pair_return = pd.concat([market_return,firm_return],axis=1)
    return pair_return    

In [26]:
calc_firm_ccgarch('BMO.TO')

Unnamed: 0_level_0,Close,Return
Date,Unnamed: 1_level_1,Unnamed: 2_level_1
2018-03-01,,-1.264835
2018-03-02,-1.435059,-0.979102
2018-03-03,3.555761,-0.19807


In [18]:
bmo_return = df_firm_returns.loc['BMO.TO']['Return']
market_close = df_market_price.loc['.GSPTSE']['Close']

In [19]:
market_return = df_market_price.loc['.GSPTSE']['Close'].pct_change(1)
market_return

Date
2018-03-01         NaN
2018-03-02   -1.435059
2018-03-03    3.555761
Name: Close, dtype: float64

### Looping over multiindex dataframes

#### Loop over firms using `for` and `df.groupby`

In [6]:
all_firms_df = generate_data(firms,dates,['x','y','z'])

Standard loop with side effects:

In [21]:
for key, sub_df in all_firms_df.groupby(level=0):
     print(key)

BMO.TO
BNS.TO
CM.TO
RY.TO
TD.TO


Dictionary comprehension:

In [22]:
dict_from_multiindex_df = {firm: sub_df for firm, sub_df in all_firms_df.groupby(level=0)}

#### Split multiindex dataframe into list using `groupby` & restore to dataframe using `concat`

In fact, a cleaner approach is to use `.apply` and `.assign`

In [23]:
list_from_multiindex_df = [sub_df for firm, sub_df in all_firms_df.groupby(level=0)]

In [24]:
list_from_multiindex_df[0:2]

[                          x         y         z
 Firm   Date                                    
 BMO.TO 2018-03-01 -0.389149  0.959650 -1.365241
        2018-03-02  0.896308 -0.377102 -0.558298
        2018-03-03  0.963550 -0.186580 -1.765016,
                           x         y         z
 Firm   Date                                    
 BNS.TO 2018-03-01 -1.123153  0.394552 -1.068644
        2018-03-02  0.287074  2.709269 -1.120250
        2018-03-03  1.095091  0.983898 -0.309522]

And, drum roll, restore the sub-frames back into a frame using `concat`

In [25]:
pd.concat(list_from_multiindex_df)[0:6]

Unnamed: 0_level_0,Unnamed: 1_level_0,x,y,z
Firm,Date,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
BMO.TO,2018-03-01,-0.389149,0.95965,-1.365241
BMO.TO,2018-03-02,0.896308,-0.377102,-0.558298
BMO.TO,2018-03-03,0.96355,-0.18658,-1.765016
BNS.TO,2018-03-01,-1.123153,0.394552,-1.068644
BNS.TO,2018-03-02,0.287074,2.709269,-1.12025
BNS.TO,2018-03-03,1.095091,0.983898,-0.309522


#### Apply operation to the dataframe at the firm level 

1. Create a dataframe of sample data with multi-index `(firm, date)`, & two fields `x`, `y` 
2. Define a function `do_firm` with arguments of type `pandas.Series`.  
3. Split the input data into sub-frames for each firm & apply the function to the sub-series. 
4. Assign the result to a new time-series (field) `z`
4. Join the outputs from individual firms back into a single data frame for the full universe.

In [53]:
# Create a dataframe of sample data with two fields
all_firms = generate_data(firms,dates,['x','y'])

def do_firm(x,y): 
    '''Function of multiple series. Returns dict of series.'''
    return {'u':2*x + y.sum(),'v':x.corr(y)}

# Looping over firms logic is a one-liner!
all_firms \
    .groupby(['Firm']) \
    .apply(lambda d: d.assign(**do_firm(d.x,d.y))) # Loop over firm sub dataframes ("d") 

Unnamed: 0_level_0,Unnamed: 1_level_0,x,y,u,v
Firm,Date,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
RY.TO,2018-03-01,-0.900286,-0.299401,-2.20331,-0.343481
RY.TO,2018-03-02,0.714358,1.336465,1.025978,-0.343481
RY.TO,2018-03-03,1.977944,-1.439802,3.553149,-0.343481
TD.TO,2018-03-01,0.531077,-0.172637,1.229492,0.377449
TD.TO,2018-03-02,0.563808,0.406482,1.294953,0.377449
TD.TO,2018-03-03,-0.230629,-0.066507,-0.29392,0.377449
BNS.TO,2018-03-01,0.256004,0.911004,-0.444059,-0.485625
BNS.TO,2018-03-02,1.349226,-1.078178,1.742384,-0.485625
BNS.TO,2018-03-03,0.044629,-0.788894,-0.86681,-0.485625
BMO.TO,2018-03-01,-0.42523,-0.69161,-1.575482,0.951139


Alternative using list comprehension, requires `concat`

In [54]:
pd.concat([df.assign(**do_firm_dict(df.x,df.y)) for k,df in all_firms.groupby(['Firm'])])[0:6]

Unnamed: 0_level_0,Unnamed: 1_level_0,x,y,u,v
Firm,Date,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
BMO.TO,2018-03-01,-0.42523,-0.69161,-1.575482,0.951139
BMO.TO,2018-03-02,0.655729,-0.260999,0.586436,0.951139
BMO.TO,2018-03-03,1.022452,0.227587,1.319881,0.951139
BNS.TO,2018-03-01,0.256004,0.911004,-0.444059,-0.485625
BNS.TO,2018-03-02,1.349226,-1.078178,1.742384,-0.485625
BNS.TO,2018-03-03,0.044629,-0.788894,-0.86681,-0.485625


## Translated code

See  
https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/calculate_mes.m  
https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/calculate_srisk.m  

In [7]:
import math
from scipy.stats import norm

### `mes`

https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/calculate_mes.m

The required series are explicitly listed as arguments (as in the orginal Matlab code).   
An alternative would be to accept a single dataframe as an argument containing all required series & select them as required within the function. 

Is this very inefficient because the calculations are being done row by row, & the `sum`, `norm` etc. operations are being repeated for each row / record??

In [8]:
def mes(ret0_m, s_m, ret0_x, s_x, beta_x, p_mx, a=0.05, d=0.4, debug=False):
    '''
    Calculate marginal expected shortfall (MES) & long range MES (LRMES)
    Author: Tommaso Belluzzo (Matlab)
    In python, vectors are pandas series objects
    The input series must be for a single firm only!
    :param ret0_m: Demeaned market index log returns.
    :param s_m: Volatilities of the market index log returns.
    :param ret0_x: Demeaned firm log returns.
    :param s_x: Volatilities of the firm log returns.
    :param beta_x: Firm CAPM betas.
    :param p_mx: DCC coefficients.
    :param a: A float [0.01,0.10] representing the complement to 1 of the confidence level (optional, default=0.05).
    :param d: A float representing the six-month crisis threshold for the market index decline used to calculate LRMES (optional, default=0.40).
    :type ret0_m: Pandas series of floats
    :type s_m: Pandas series of floats
    :type ret0_x: Pandas series of floats
    :type s_x: Pandas series of floats 
    :type beta_x: Pandas series of floats 
    :type p_mx: Pandas series of floats
    :type a: Scalar float
    :type d: Scalar float
    :type: A dataframe containing MES & LRMES values
    :rtype: Dict of pandas series
    '''
    c = np.percentile(ret0_m,a)
    h = len(ret0_m) ** (-0.2)
    u = ret0_m / s_m              # Standardize
    x_den = np.sqrt(1 - p_mx**2)
    x_num = (ret0_x / s_x) - (p_mx * u)
    x = x_den / x_num
    f = norm.cdf(((c / s_m) - u) / h)
    f_sum = f.sum()
    k1 = (u * f).sum() / f.sum()          
    k2 =  (x * f).sum() / f.sum()
    mes = (s_x * p_mx * k1) + (s_x * x_den * k2)
    lrmes = 1 - np.exp(np.log(1 - d) * beta_x) 
    
    return_series = {'mes':mes,'lrmes':lrmes}
    debug_series = {'c':c,'h':h,'u':u,'x':x,'f':f,'f_sum':f_sum,'k1':k1,'k2':k2} 
    return {**return_series,**debug_series} if debug else return_series

#### Test for single firm

In [70]:
mes_data = generate_data(firms,dates,['ret0_m','s_m','ret0_x','s_x','beta_x','p_mx'])
df = mes_data.loc['RY.TO']  # Important to select a sub dataframe for a single firm!
df.assign(**mes(df.ret0_m, df.s_m, df.ret0_x, df.s_x, df.beta_x, df.p_mx, 0.05, 0.4,debug=True))

Unnamed: 0_level_0,ret0_m,s_m,ret0_x,s_x,beta_x,p_mx,c,f,f_sum,h,k1,k2,lrmes,mes,u,x
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1
2018-03-01,-0.15918,1.075699,0.697702,1.494221,-0.096814,0.080335,-2.249517,0.007744,1.51056,0.802742,-0.063153,-0.325169,-0.050699,-0.491885,-0.147978,2.081711
2018-03-02,-0.414205,-0.121921,1.217606,1.410365,0.649527,0.739019,-2.249517,1.0,1.51056,0.802742,-0.063153,-0.325169,0.282366,-0.37478,3.39733,-0.408946
2018-03-03,-2.251354,0.324213,-1.077036,-0.461747,-0.802593,-0.788582,-2.249517,0.502816,1.51056,0.802742,-0.063153,-0.325169,-0.506795,0.069333,-6.944053,-0.195624


#### Loop over all firms using `.groupby`, `.apply` and `.assign`

The `lambda` function is `apply`ied the sub-frame `d` for each firm. The `**` operator unpacks the dictionary of the form `{series_name:series,...}` that is the result of the `mes` function to create new fields / columns / series in the dataframe.  
Optional final call to `.loc` selects outputs from union of inputs & outputs.

In [139]:
mes_data = generate_data(firms,dates,['ret0_m','s_m','ret0_x','s_x','beta_x','p_mx'])

mes_data \
    .groupby(['Firm']) \
    .apply(lambda d: d.assign(**mes(d.ret0_m, d.s_m, d.ret0_x, d.s_x, d.beta_x, d.p_mx, 0.05, 0.4,debug=False))) \
    .loc[:,['mes','lrmes']] # Select outputs to exclude the inputs



Unnamed: 0_level_0,Unnamed: 1_level_0,mes,lrmes
Firm,Date,Unnamed: 2_level_1,Unnamed: 3_level_1
RY.TO,2018-03-01,0.850347,0.611251
RY.TO,2018-03-02,,0.411874
RY.TO,2018-03-03,0.085553,-0.041825
TD.TO,2018-03-01,0.132959,0.499968
TD.TO,2018-03-02,0.273335,0.347665
TD.TO,2018-03-03,-0.895704,0.265617
BNS.TO,2018-03-01,,0.280327
BNS.TO,2018-03-02,-0.964392,-0.95252
BNS.TO,2018-03-03,-0.127518,0.352081
BMO.TO,2018-03-01,,-0.130009


### `srisk`

https://github.com/TommasoBelluzzo/SystemicRisk/blob/master/ScriptsProbabilistic/calculate_srisk.m

In [9]:
def srisk(lrmes,tl_x,mc_x,l=0.08):
    '''
    Calculate the SRISK measure of systemic risk
    Author: Tommaso Belluzzo (Matlab)
    In python, input vectors are pandas series objects; output type is pandas dataframe
    :param lrmes:   A vector of floats containing the LRMES values.
    :param tl_x:    A numeric vector containing the firm total liabilities.
    :param mc_x:    A numeric vector containing the firm market capitalization.
    :param l:       A float [0.05,0.20] representing the capital adequacy ratio (optional, default=0.08).
    :return srisk:  A dict of series including SRISK.
    '''
    srisk = ((l * tl_x) - ((1 - l) * (1 - lrmes) * mc_x)).clip(lower=0)
    return {'srisk':srisk}

#### Using `.groupby` & `.apply`

In [121]:
srisk_data = generate_data(firms,dates,['lrmes','tl_x','mc_x'])

srisk_data \
    .groupby(['Firm']) \
    .apply(lambda df: df.assign(**srisk(df.lrmes,df.tl_x,df.mc_x,0.08))) \
    .loc[:,['srisk']]

Unnamed: 0_level_0,Unnamed: 1_level_0,srisk
Firm,Date,Unnamed: 2_level_1
RY.TO,2018-03-01,0.247052
RY.TO,2018-03-02,2.207659
RY.TO,2018-03-03,0.0
TD.TO,2018-03-01,0.142899
TD.TO,2018-03-02,0.0
TD.TO,2018-03-03,0.0
BNS.TO,2018-03-01,0.551315
BNS.TO,2018-03-02,0.222487
BNS.TO,2018-03-03,0.903932
BMO.TO,2018-03-01,4.146336


#### Simple, no split, approach (no aggregation over firm series)

In fact, because `srisk` does not require any aggregate functions, it is not strictly necessary to split the full dataframe into chunks for for each firm.  

First approach simply builds a dataframe from the dict.

In [118]:
srisk_data = generate_data(firms,dates,['lrmes','tl_x','mc_x'])
pd.DataFrame(srisk(df.lrmes,df.tl_x,df.mc_x,0.08))[0:6]

Unnamed: 0_level_0,Unnamed: 1_level_0,srisk
Firm,Date,Unnamed: 2_level_1
RY.TO,2018-03-01,0.0
RY.TO,2018-03-02,0.066581
RY.TO,2018-03-03,0.1394
TD.TO,2018-03-01,0.986115
TD.TO,2018-03-02,0.0
TD.TO,2018-03-03,0.0


Alternative approach uses `.assign` (but not `.groupby`) & optionally `.loc` if we only want the output fields:

In [117]:
srisk_data = generate_data(firms,dates,['lrmes','tl_x','mc_x'])
srisk_data \
    .assign(**srisk(df.lrmes,df.tl_x,df.mc_x,0.08)) \
    .loc[:,['srisk']] # Select output to exclude inputs

Unnamed: 0_level_0,Unnamed: 1_level_0,srisk
Firm,Date,Unnamed: 2_level_1
RY.TO,2018-03-01,0.0
RY.TO,2018-03-02,0.066581
RY.TO,2018-03-03,0.1394
TD.TO,2018-03-01,0.986115
TD.TO,2018-03-02,0.0
TD.TO,2018-03-03,0.0
BNS.TO,2018-03-01,0.0
BNS.TO,2018-03-02,0.0
BNS.TO,2018-03-03,0.0
BMO.TO,2018-03-01,0.0


## Azure Machine Learning Studio - python module

### `azureml_main`

In [21]:
import pandas as pd
import numpy as np
from scipy.stats import norm
def azureml_main(df1 = None, df2 = None):
    return df3

#### Parameters - scalar floats into a single series

In [184]:
parameters_mes_df = pd.Series({'a':0.05, 'd':0.4}).to_frame(name='Parameters')

parameters_mes_df.loc[['a','d'],'Parameters']  # Can index the values

a    0.05
d    0.40
Name: Parameters, dtype: float64

In [179]:
parameters_mes_dict = parameters_mes_df.to_dict()['Parameters']

In [183]:
[a,d]=parameters_mes_dict.values()

[0.050000000000000003, 0.40000000000000002]

#### Parameters - heterogeneous types

In [188]:
pd.Series({'p1':True, 'p2':2, 'p3':2.2, 'p4':"Bla", 'p5':[1.,2.], 'p6':{'p6a':"Bla"}}).to_frame(name='Parameters')

Unnamed: 0,Parameters
p1,True
p2,2
p3,2.2
p4,Bla
p5,"[1.0, 2.0]"
p6,{'p6a': 'Bla'}


#### `azureml_main_ccgarch`

Note that this module will be written in R in the final AMLS solution. This python function is just for design & test purposes in the Azure notebook.

In [10]:
import pandas as pd
# import numpy as np
def azureml_main_ccgarch(df_firm_returns, df_market_price):
    '''Estimate parameters for a DCC-GARCH model
    :param df_firm_returns: 
    :param df_market_price: 
    :returns: Conditional variances h, DCC conditional correlations R'''
    
    return df_dcc_garch

#### `azureml_main_mes`

In [11]:
import pandas as pd
import numpy as np
from scipy.stats import norm
def azureml_main_mes(df_ccgarch , df2 = df_mes_paras):
    '''Calculate Marginal expected shortfall (MES) & beta for firms.
    :param df_dcc_garch: 
    :returns: Marginal expected shortfall (MES), long-range MES, beta''' 
    [a,d] = df_mes_paras.to_dict()['Parameters'].values()
    # Complement to 1 of the confidence level (optional, default=0.05) a in [0.01,0.10]
    # Six-month crisis threshold for the market index decline used to calculate LRMES (optional, default=0.40).
    output_fields = ['mes','lrmes','beta_x'] # + ['ret0_m','s_m','ret0_x','s_x','p_mx']
    return df_ccgarch \
                .groupby(['Firm']) \
                .apply(lambda df: d.assign(**mes(df.ret0_m, df.s_m, df.ret0_x, df.s_x, df.beta_x, df.p_mx, a, d, debug=False))) \
                .loc[:,output_fields] 

NameError: name 'df_mes_paras' is not defined

In [150]:
df_mes_paras = pd.Series({'a':0.05, 'd':0.4}).to_frame(name='Parameters')
df_ccgarch = generate_data(firms,dates,['ret0_m','s_m','ret0_x','s_x','beta_x','p_mx'])

azureml_main_mes(df_ccgarch, df_mes_paras) 

Unnamed: 0_level_0,Unnamed: 1_level_0,ret0_m,s_m,ret0_x,s_x,beta_x,p_mx
Firm,Date,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
RY.TO,2018-03-01,0.272933,0.913575,0.603346,-1.028871,1.418573,-0.386753
RY.TO,2018-03-02,0.455251,0.056976,0.151604,0.816126,-0.427207,1.795311
RY.TO,2018-03-03,2.011012,0.692939,-2.134896,0.600491,-0.25431,-0.048077
TD.TO,2018-03-01,0.90104,1.632888,0.46899,-2.744852,0.896312,-1.185023
TD.TO,2018-03-02,-0.332145,1.100406,1.935782,-0.394271,-0.219508,-2.128438
TD.TO,2018-03-03,1.246148,0.007561,1.055629,0.464768,0.75541,-2.679953
BNS.TO,2018-03-01,0.16973,1.338404,-0.090046,-0.313123,1.094004,-1.020318
BNS.TO,2018-03-02,-2.602466,3.02251,0.276737,0.193696,0.265474,1.115367
BNS.TO,2018-03-03,-2.055217,0.238081,0.144116,0.284929,-0.130618,0.667574
BMO.TO,2018-03-01,-1.200605,-0.544792,0.88666,0.016869,-1.304055,2.020138


#### `azureml_main_srisk`

In [None]:
import pandas as pd
# import numpy as np
def azureml_main_srisk(df_mes, df_balance_sheet):
    '''Calculate SRISK for a set of firms from MES values and balance sheet data: assets & liabilities.
    :param df_mes: Marginal expected shortfall (MES), long-range MES, beta
    :param df_balance_sheet: firm balance sheet data: assets & liabilities (debt)
    :returns: SRISK systemic risk measure'''
    return df_srisk