## Risk Management & Applications
As long as there have been economies, there has been financial risk.  Risk generally comes from a variety of sources including but not limited to:
    1. Market risk - risk from price changes and losses on owned securities
    2. Credit risk - risk of clients defaulting or being downgraded
    3. Operational risk - risk associated with processing and systems failures
    4. Legal risk - contract violation and any regulatory exposure through non-compliance
Market risk can be broken down further into a handful of main groups:
    1. Equity risk
    2. Interest rate risk
    3. Currency (Forex) risk
    4. Commodity risk
In this lecture we will attempt to model aggregate risk, and have a general discussion on what insights modeling can provide us.  Can you think of any other risks not outlined above?




## Value at Risk
We briefly touched on this concept in our TD lecture, but here we will look more thoroughly at the theory behind why it should work.  There are two main ways to communicate VAR:
    1. Maximum Loss - Here we state that the VAR is the max loss that can be incurred for a security or portfolio over some holding period given a specific confidence level.  For example, we will state that we are 95% confident that the loss, as measured by VAR, won't exceed $2MM over the next month.

    2. Minimum Loss - Here we state that the VAR is the min loss that can be incurred for a security or portfolio over some period given a significance level.  For example, we will state that there is a 5% chance that the loss, as measured by VAR, will exceed $2MM over the next month.  
    
Measures needed:
    1. Current Asset(s) price(s)
    2. Expected Returns (sometimes assumed 0 for conservatism)
    3. Volatility of returns (standard deviation)
    4. Covariance of all assets
    
Key Assumptions:
    1.  It is assumed, for this model, that standard conditions will continue into the projection period.  This is manifested by using historical measures to measure volatility going forward.  Generally, when black swan events occur, this model will not be an accurate predictor of near term loss expectations.
    2. The underlying distribution is normal.
    
One important point in computing VAR is that it is a one-tailed test: we only care about the loss tail of the distribution.  Also, in practice the standard deviation is generally assumed to come from a large sample, so the t factor approaches the standard normal critical value: roughly 1.65 (more precisely 1.645) for 5% VAR and 2.33 for 1% VAR.
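
As a quick check on the factors quoted above, the one-tailed critical values can be recovered from the standard normal inverse CDF. The sketch below uses only the Python standard library. The function name `parametric_var` and the example inputs are illustrative assumptions, not part of the lecture's data.

```python
from statistics import NormalDist


def parametric_var(value, mean_return, volatility, confidence=0.95):
    """One-tailed parametric VAR under the normality assumption.

    Returns the loss (as a positive number) that should be exceeded with
    probability 1 - confidence over the horizon on which mean_return and
    volatility are measured.
    """
    z = NormalDist().inv_cdf(confidence)  # about 1.645 at 95%, 2.326 at 99%
    return -value * (mean_return - z * volatility)


z_95 = NormalDist().inv_cdf(0.95)
z_99 = NormalDist().inv_cdf(0.99)
print("z at 95%:", round(z_95, 3))
print("z at 99%:", round(z_99, 3))

# Hypothetical inputs: $10,000 position, 0% expected return, 5% monthly volatility
print("5% monthly VAR:", round(parametric_var(10_000, 0.0, 0.05), 2))
```

Setting `mean_return` to 0 corresponds to the conservative choice mentioned in the list of measures above.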


    

## Testing for normality
We will begin by testing for normality of an underlying data set.  In order to test this, we will implement the Kolmogorov-Smirnov test (also known as the KS test).  While this test can theoretically be used to test goodness of fit to any distribution, we will apply it to the normal distribution.

The following is how to implement the test:
    1. Specify the cumulative distribution function (for us it will be normal)
    2. To set up the theoretical normal distribution, we need to standardize the observations to a standard normal distribution and then look up the associated cumulative probability in the tables
    3. Then take the absolute difference between the actual and the theoretical cumulative probability distribution 
    4. Find the maximum difference (D) from the above step
    5. Use this D value to compare to the critical KS table value.  If D is greater than the critical table value, then we reject the null hypothesis that the actual data is normally distributed.  
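
The steps above can be sketched by hand as follows. This is a minimal illustration on simulated data, not the lecture's SPY data: the function name `ks_statistic_normal` and both samples are assumptions. Because the empirical CDF is a step function, the sketch checks the gap on both sides of each step, a slight refinement of step 3.

```python
import random
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(data):
    """Largest gap D between the empirical CDF and a fitted normal CDF."""
    n = len(data)
    mu, sigma = mean(data), stdev(data)
    std_normal = NormalDist()
    d_max = 0.0
    for i, x in enumerate(sorted(data), start=1):
        z = (x - mu) / sigma             # step 2: standardize the observation
        theoretical = std_normal.cdf(z)  # step 2: theoretical cumulative probability
        # step 3: the empirical CDF jumps from (i - 1) / n to i / n at x
        gap = max(abs(i / n - theoretical), abs(theoretical - (i - 1) / n))
        d_max = max(d_max, gap)          # step 4: keep the maximum difference
    return d_max


random.seed(0)
normal_sample = [random.gauss(0.0, 0.01) for _ in range(1000)]    # simulated "returns"
skewed_sample = [random.expovariate(100.0) for _ in range(1000)]  # clearly non-normal

d_normal = ks_statistic_normal(normal_sample)
d_skewed = ks_statistic_normal(skewed_sample)
print("D (normal sample):", round(d_normal, 4))
print("D (skewed sample):", round(d_skewed, 4))
```

Step 5 would then compare D against a critical table value. Since the mean and standard deviation here are estimated from the sample itself, the appropriate table is the Lilliefors variant of the KS table rather than the standard one.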

In [27]:
import pandas as pd
import statsmodels.api as sm
import math

In [19]:
# For simplicity, we will assume a portfolio of $10,000, holding only 1 security: SPY

## Read in Data
spy_data = pd.read_csv('SPY_Data.csv')

## Run KS test on the daily returns
## (kstest_normal estimates the mean and variance from the data, i.e. a Lilliefors-type KS test)
x = spy_data['Return']
ks = sm.stats.diagnostic.kstest_normal(x, dist='norm', pvalmethod='table')[1]  ## element [1] is the p-value
print("p-value: " + str(round(ks,2)))

p-value: 0.06


Since our p-value for the test is greater than 0.05, we fail to reject our null hypothesis that the data is normally distributed.  While this doesn't prove that our data is normal, the test found no strong evidence against normality, so we can proceed with more confidence.

## Testing for Stationary Data
The next key assumption we are implicitly making is that the historical data is a meaningful predictor of the future.  To test this, we will use two tests:
    1. Autocorrelation function test (Durbin-Watson test)
    2. Unit Root test
Dealing with non-stationary data (i.e. the mean and variance of the underlying distribution are not constant over time) is an especially common issue in time series modeling, so being confident in the underlying data is critical.

## Autocorrelation
Our first test will be the autocorrelation test.  To conduct it, we regress each day's SPY return on the prior day's SPY return and check whether there is a pattern in the error terms of that time series regression.  If there is a relationship in the error terms, then we can conclude our data is non-stationary.

In [20]:
## Autocorrelation test
y = spy_data['Return']
x = spy_data['1_Day_Lag_Return']
model = sm.OLS(y, x)
results = model.fit()
print(results.summary())

                                 OLS Regression Results                                
Dep. Variable:                 Return   R-squared (uncentered):                   0.004
Model:                            OLS   Adj. R-squared (uncentered):             -0.000
Method:                 Least Squares   F-statistic:                             0.9172
Date:                Tue, 05 Apr 2022   Prob (F-statistic):                       0.339
Time:                        16:56:02   Log-Likelihood:                          819.71
No. Observations:                 252   AIC:                                     -1637.
Df Residuals:                     251   BIC:                                     -1634.
Df Model:                           1                                                  
Covariance Type:            nonrobust                                                  
                       coef    std err          t      P>|t|      [0.025      0.975]
-----------------------------------

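The Durbin-Watson statistic ranges from 0 to 4, and a value near 2 is what we expect when there is no first-order autocorrelation in the error terms.  The null hypothesis of the test is that there is no autocorrelation.  Because our Durbin-Watson statistic is greater than 1.96, i.e. close to 2, we fail to reject that null hypothesis for our return data.  As above, this does not prove the data is stationary; rather, our first test failed to show that it is non-stationary.  Our next test will be more rigorous.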
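
To make the Durbin-Watson statistic concrete, here is a minimal sketch of how it is computed from a list of regression residuals. The simulated series are assumptions for illustration only. In the notebook itself the statistic is reported by `results.summary()`, and statsmodels also provides `statsmodels.stats.stattools.durbin_watson`.

```python
import random


def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


random.seed(1)
# Independent errors: the statistic should land near 2
noise = [random.gauss(0.0, 1.0) for _ in range(1000)]

# Strongly autocorrelated errors: the running sum of the same noise
persistent = []
total = 0.0
for e in noise:
    total += e
    persistent.append(total)

dw_noise = durbin_watson(noise)
dw_persistent = durbin_watson(persistent)
print("DW (independent errors):", round(dw_noise, 3))
print("DW (persistent errors):", round(dw_persistent, 3))
```

Values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation.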

## Unit Root Test
The most effective and formal way of testing for stationary data is to use the Dickey-Fuller test for the existence of unit roots.  Suppose you have the following:
![Augmented Dickey-Fuller regression equation](https://wikimedia.org/api/rest_v1/media/math/render/svg/223b7673f7bf871d08b19fb9a3c6ae1f0f619a0b)


where $\alpha$ is a constant, $\beta$ the coefficient on a time trend and $p$ the lag order of the autoregressive process. Imposing the constraints $\alpha = 0$ and $\beta = 0$ corresponds to modeling a random walk, and using only the constraint $\beta = 0$ corresponds to modeling a random walk with drift. Consequently, there are three main versions of the test (no constant, constant only, and constant plus time trend), analogous to the three versions of the basic Dickey–Fuller test.

By including lags of the order p the ADF formulation allows for higher-order autoregressive processes. This means that the lag length p has to be determined when applying the test. One possible approach is to test down from high orders and examine the t-values on coefficients. An alternative approach is to examine information criteria such as the Akaike information criterion, Bayesian information criterion or the Hannan–Quinn information criterion.

The unit root test is then carried out under the null hypothesis $\gamma = 0$ against the alternative hypothesis $\gamma < 0$. Once a value for the test statistic

$\mathrm{DF}_{\tau} = \frac{\hat{\gamma}}{\operatorname{SE}(\hat{\gamma})}$

is computed, it can be compared to the relevant critical value for the Dickey–Fuller test. As this test is asymmetrical, we are only concerned with negative values of our test statistic $\mathrm{DF}_{\tau}$. If the calculated test statistic is less (more negative) than the critical value, then the null hypothesis of $\gamma = 0$ is rejected and no unit root is present.
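
As a rough illustration of the test statistic, the sketch below fits the simplest constant-only version of the regression (no time trend and no lagged difference terms) by ordinary least squares and returns $\hat{\gamma} / \operatorname{SE}(\hat{\gamma})$. The function name, the simulated series, and the use of -2.86 as the large-sample 5% critical value for the constant-only case are assumptions of this sketch. In practice `adfuller` handles lag selection and p-values.

```python
import math
import random


def dickey_fuller_tau(y):
    """t-statistic on gamma in: delta_y[t] = alpha + gamma * y[t-1] + error."""
    lagged = y[:-1]
    delta = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(delta)
    mean_lag = sum(lagged) / n
    mean_delta = sum(delta) / n
    sxx = sum((x - mean_lag) ** 2 for x in lagged)
    sxy = sum((x - mean_lag) * (d - mean_delta) for x, d in zip(lagged, delta))
    gamma_hat = sxy / sxx
    alpha_hat = mean_delta - gamma_hat * mean_lag
    residuals = [d - alpha_hat - gamma_hat * x for x, d in zip(lagged, delta)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)  # residual variance
    se_gamma = math.sqrt(sigma2 / sxx)                # standard error of gamma_hat
    return gamma_hat / se_gamma


CRITICAL_5PCT = -2.86  # assumed large-sample 5% critical value, constant-only case

random.seed(2)
shocks = [random.gauss(0.0, 0.01) for _ in range(500)]  # stationary, return-like series

random_walk = []  # unit-root series: the running sum of the shocks
level = 0.0
for s in shocks:
    level += s
    random_walk.append(level)

tau_stationary = dickey_fuller_tau(shocks)
tau_walk = dickey_fuller_tau(random_walk)
print("tau (stationary series):", round(tau_stationary, 2))
print("tau (random walk):", round(tau_walk, 2))
print("reject unit root for stationary series:", tau_stationary < CRITICAL_5PCT)
```

The stationary series produces a strongly negative statistic, while the random walk's statistic sits much closer to zero, which is the asymmetry the text describes.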

In [24]:
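## Execute DF test on the daily returns (constant only, lag length chosen by AIC)
## Pass the return series explicitly, since x was reassigned to the lagged return above
df = sm.tsa.stattools.adfuller(spy_data['Return'], maxlag=None, regression='c', autolag='AIC', store=False, regresults=False)[1]  ## element [1] is the p-value
print("p-value: "+str(round(df,5)))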

p-value: 0.0


Since our p-value for the unit root test is below 0.05, we reject the null hypothesis of a unit root and can be reasonably confident that the data we are working with is stationary.  More importantly, it means the historicals will be meaningful for estimating the underlying mean and variance of the distribution, though not for predicting what the distribution will do at any specific time.

## Test Conclusions
At this point, we can be relatively comfortable that our data is stationary and reasonably resembles a normal distribution.  With this in mind, we can proceed to digging into our Portfolio VAR.

## VAR: Single Asset Portfolio

In [55]:
#Define our inputs 
start_value = 10000
returns = spy_data["Return"]
days_into_future = 252 ##we will assume that we are calculating 1-Year VAR

In [56]:
vol = returns.std()*math.sqrt(days_into_future)  ## annual volatility (daily std scaled by sqrt of trading days)
avg = returns.mean()*(days_into_future)  ## annual expected return
VAR = start_value *(avg - 1.645*vol)  ## 1.645 is the one-tailed 5% standard normal factor
print("There is a 5% chance that the loss, as measured by VAR, will exceed $"+str(-round(VAR,2))+" over the next year for a \n $10,000 portfolio of SPY.")

There is a 5% chance that the loss, as measured by VAR, will exceed $1030.63 over the next year for a 
 $10,000 portfolio of SPY.


## Concluding Remarks
While the above can make an investor or analyst more comfortable with an investment opportunity, it is incredibly important to note that this ignores black swan events in the medium term. In the long and short term, markets tend to act normally, but in the medium term, recessions happen.  In any instance of capital planning, be it for an institutional portfolio of commodities, or a personal brokerage account of SPY shares, knowing when liquidity will be needed is just as important as knowing where to invest capital.  

## A Note on Multi-Asset Portfolios
While our example focused on a simple single asset portfolio, was it really so simple?  SPY is an aggregation of 505 stocks, and we were able to compute the VAR and run our statistical analysis on a portfolio of all 505 stocks.  As you will be asked to do on the homework, computing a multi-asset portfolio's VAR is no more complicated than what was done above: simply prep the data by finding the portfolio's historical returns, and use those in the modeling instead of the individual assets' returns.
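
A minimal sketch of that data prep, assuming fixed portfolio weights and using made-up tickers and returns (none of the names or numbers below come from the lecture's data): each day's portfolio return is the weighted sum of the asset returns, and the resulting series is then fed through the same VAR calculation as in the single-asset case.

```python
import math
from statistics import mean, stdev


def portfolio_returns(asset_returns, weights):
    """Combine per-asset daily returns into one daily portfolio return series."""
    n_days = len(next(iter(asset_returns.values())))
    return [
        sum(weights[name] * asset_returns[name][day] for name in weights)
        for day in range(n_days)
    ]


# Hypothetical daily returns for two made-up tickers
asset_returns = {
    "AAA": [0.010, -0.004, 0.006, -0.012, 0.003],
    "BBB": [-0.002, 0.005, 0.001, -0.006, 0.004],
}
weights = {"AAA": 0.6, "BBB": 0.4}  # assumed constant weights summing to 1

port = portfolio_returns(asset_returns, weights)

# Same calculation as the single-asset case, applied to the portfolio series
start_value = 10000
days_into_future = 252
vol = stdev(port) * math.sqrt(days_into_future)  # annual volatility
avg = mean(port) * days_into_future              # annual expected return
VAR = start_value * (avg - 1.645 * vol)
print("5% one-year VAR: $" + str(-round(VAR, 2)))
```

Working from the portfolio's own return series means the covariances between assets are already reflected in its volatility, so no separate covariance matrix is needed for this approach.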
