# Introduction

- We typically assume that the volatility of returns for a security is a good measurement of risk
    - The standard deviation or variance are both good measures of volatility
        - The term "volatility" often refers to standard deviation

- If we model returns using a normal distribution, it will treat a positive deviation similarly to a negative deviation because the distribution is symmetrical
    - This goes against our conventional wisdom; a negative deviation has a bigger impact than a positive deviation of the same amount
        - To account for this asymmetry, the **Sortino measure** suggests a lower partial deviation

- Another assumption for the basic measurement of volatility is that it is constant over time
    - This is not true; periods of high/low volatility are sticky but not constant
        - To account for this time variation, we can use the **Autoregressive Conditional Heteroskedasticity (ARCH)** process
            - This process is extended by the **Generalized Autoregressive Conditional Heteroskedasticity (GARCH)** process

_____

# Conventional volatility measure - standard deviation

- The term *volatility* usually refers to the standard deviation of returns for a security
    - E.g. "The volatility of IBM is 20%" means the annualized standard deviation for the returns of IBM's stock was 0.2

- Let's calculate the volatility for actual IBM data

In [1]:
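import pandas as pd
import numpy as np
from pandas_datareader import data as pdr
import fix_yahoo_finance as yf  # this package has since been renamed yfinance
yf.pdr_override()  # route pdr.get_data_yahoo through the Yahoo Finance wrapper
%matplotlib inline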

In [2]:
df_IBM = pdr.get_data_yahoo('IBM', start = '2009-01-01', end='2014-01-01')

[*********************100%***********************]  1 of 1 downloaded


In [3]:
df_IBM.head()

Unnamed: 0_level_0,Open,High,Low,Close,Adj Close,Volume
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
2009-01-02,83.889999,87.589996,83.889999,87.370003,66.577576,7558200
2009-01-05,86.419998,87.669998,86.18,86.82,66.158447,8315700
2009-01-06,87.110001,90.410004,86.370003,89.230003,67.994942,9649500
2009-01-07,87.830002,88.800003,87.120003,87.790001,66.897629,8455100
2009-01-08,87.809998,88.139999,85.980003,87.18,66.432793,7231800


- Calculating returns

In [4]:
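# Note: this is previous price / current price - 1; the conventional simple
# return is current / previous - 1, i.e. df_IBM['Adj Close'].pct_change()
df_IBM['return'] = df_IBM['Adj Close'].shift(1)/df_IBM['Adj Close']-1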
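
For reference, a minimal sketch comparing the expression used in the cell above with the conventional simple return, which pandas exposes as `pct_change()`. The prices are made-up values chosen for easy arithmetic, not IBM data.

```python
import pandas as pd

# Hypothetical prices, chosen for easy arithmetic (not real market data)
prices = pd.Series([100.0, 110.0, 99.0])

# Expression used in the cell above: previous price / current price - 1
as_written = prices.shift(1) / prices - 1

# Conventional simple return: current price / previous price - 1
conventional = prices / prices.shift(1) - 1

# pandas' built-in helper computes the conventional version
builtin = prices.pct_change()

print(pd.DataFrame({"as_written": as_written,
                    "conventional": conventional,
                    "pct_change": builtin}))
```

For small daily moves the two versions differ mainly in sign, so the standard deviation is nearly unchanged, while the mean and skew come out with roughly the opposite sign.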

### Aside

- Recall that if we have $n$ observations of daily returns, then the mean daily return is $\bar{r} = \frac{\sum{r_{i}}}{n}$
    - The standard deviation of the return column, $\sigma_{r}$, measures the volatility of a *single day's* return (it is not the standard error of the mean, which would be $\frac{\sigma_{r}}{\sqrt{n}}$)

- If daily returns are independent with the same variance, the variance of the sum of $n$ daily returns is $n\sigma_{r}^{2}$
    - Therefore, to annualize the daily volatility, we multiply $\sigma_{r}$ by $\sqrt{n}$, where $n$ is the number of trading days in a year
    
- We'll assume that there are 252 trading days in a year

In [5]:
daily_std = np.std(df_IBM['return'])
volatility = daily_std*np.sqrt(252)

In [6]:
volatility

0.20868237077512763

- So, based on this data, we could say that the volatility of IBM over the 5-year period was about 21%
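
The $\sqrt{252}$ scaling can be sanity-checked on simulated data. The sketch below assumes independent, normally distributed daily returns with a made-up daily standard deviation of 1%; none of these numbers come from the IBM data.

```python
import numpy as np

rng = np.random.default_rng(0)

daily_sigma = 0.01      # assumed daily standard deviation
trading_days = 252      # assumed trading days per year
n_years = 2000          # number of simulated years

# Simulate independent daily returns: one row per simulated year
daily = rng.normal(0.0, daily_sigma, size=(n_years, trading_days))

# Annual (summed) return for each simulated year
annual = daily.sum(axis=1)

scaled = daily.std() * np.sqrt(trading_days)   # daily volatility scaled by sqrt(252)
direct = annual.std()                          # volatility measured on annual sums

print(scaled, direct)
```

The two numbers should land close to each other. The agreement relies on the independence assumption, so "sticky" volatility in real data can make the scaled figure less accurate.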

____

# Tests of normality

## 1. Shapiro-Wilk test

- We won't get into the details, but the Shapiro-Wilk test checks whether data is normally distributed
    - Here, the null hypothesis is that the data IS normally distributed, so if the test rejects the null, the data is NOT normal

In [7]:
from scipy.stats import shapiro

In [8]:
shapiro(df_IBM['return'].dropna().values)

(0.9316559433937073, 1.5859177597029078e-23)

- The first value (0.9316...) is the test statistic, and the second (about 1.59e-23) is the p-value
    - Since this p-value is well below the usual threshold of 0.05, we can **reject the null**
        - Therefore, according to this test, the returns are NOT normally distributed
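
To see how the decision rule behaves, here is a small sketch that runs the same test on two synthetic samples: one drawn from a normal distribution and one from a heavy-tailed Student-t distribution. The sample sizes, seed, and the choice of 3 degrees of freedom are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(1)

normal_sample = rng.normal(size=1000)           # drawn from a normal distribution
fat_tailed = rng.standard_t(df=3, size=1000)    # Student-t with heavy tails

stat_n, p_n = shapiro(normal_sample)
stat_t, p_t = shapiro(fat_tailed)

alpha = 0.05  # usual significance threshold
print("normal:    ", stat_n, p_n, "reject" if p_n < alpha else "fail to reject")
print("fat-tailed:", stat_t, p_t, "reject" if p_t < alpha else "fail to reject")
```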

## 2. Anderson-Darling test

- The scipy function for this test is similar to that of the Shapiro-Wilk test, except that we can also use it for the exponential, logistic, and Gumbel distributions
    - **Note**: the function's default test is for the normal distribution

In [9]:
from scipy.stats import anderson

In [10]:
anderson(df_IBM['return'].dropna().values)

AndersonResult(statistic=14.72247062396923, critical_values=array([0.574, 0.654, 0.785, 0.915, 1.089]), significance_level=array([15. , 10. ,  5. ,  2.5,  1. ]))

- The first value (14.72...) is the test statistic

- The next two arrays hold the critical values and their corresponding significance levels (in percent)
    - For example, at the 1% significance level, we reject the null if the test statistic exceeds 1.089
        - The value of 14.72 far exceeds this threshold, so we reject the null of normality
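
The comparison against each critical value can also be done in a loop. This is a minimal sketch on synthetic heavy-tailed data (a Student-t sample with an assumed 3 degrees of freedom), not on the IBM returns.

```python
import numpy as np
from scipy.stats import anderson

rng = np.random.default_rng(2)
sample = rng.standard_t(df=3, size=1000)   # synthetic fat-tailed data

result = anderson(sample, dist='norm')

# Reject normality at a given level when the statistic exceeds its critical value
for crit, sig in zip(result.critical_values, result.significance_level):
    decision = "reject" if result.statistic > crit else "fail to reject"
    print(f"{sig:>4}% level: critical value {crit:.3f} -> {decision} normality")
```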

____

# Estimating fat tails

- Recall: the first 4 moments of a data set are:
    1. Mean
    2. Variance
    3. Skew
    4. Kurtosis

$$
\text{Mean: }\mu = \bar{r} = \frac{\sum{r_{i}}}{n} \\
\text{Variance: }\sigma^{2} = \frac{\sum \left (r_{i} - \bar{r}\right )^{2}}{n-1} \\
\text{Skew: }skew = \frac{\sum \left (r_{i} - \bar{r} \right )^{3}}{(n-1)\sigma^{3}} \\
\text{Kurtosis: } kurtosis = \frac{\sum\left (r_{i} - \bar{r} \right )^{4}}{(n-1)\sigma^{4}}\\
$$

- Let's estimate the first four moments of daily returns for the S&P 500, using the SPY ETF as a proxy

In [11]:
df_SP = pdr.get_data_yahoo('SPY', start = '1970-01-01', end = '2013-12-31')

[*********************100%***********************]  1 of 1 downloaded


In [12]:
# Same previous/current - 1 expression as for IBM (the conventional return is .pct_change())
df_SP['daily return'] = df_SP['Adj Close'].shift(1)/df_SP['Adj Close']-1

In [13]:
from scipy.stats import skew, kurtosis

In [14]:
ret_data = df_SP['daily return'].dropna().values
mean = np.mean(ret_data)
std = np.std(ret_data)
skewness = skew(ret_data)
kurtosis_ = kurtosis(ret_data)
mean, std, skewness, kurtosis_

(-0.00027222521985948664,
 0.012187253117670008,
 0.31700563261877923,
 9.552432462334776)
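
When reading the kurtosis value, note that `scipy.stats.kurtosis` returns *excess* kurtosis by default (`fisher=True`), so a normal distribution scores about 0 rather than 3. The sketch below checks this on a synthetic normal sample and compares it with the kurtosis formula given above; the sample size and seed are arbitrary.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(3)
x = rng.normal(size=100_000)   # synthetic normal sample

# Kurtosis following the formula above (n - 1 denominators)
n = x.size
x_bar = x.mean()
sigma = x.std(ddof=1)
kurt_formula = ((x - x_bar) ** 4).sum() / ((n - 1) * sigma ** 4)

kurt_excess = kurtosis(x)                  # scipy default: Fisher (excess) definition
kurt_pearson = kurtosis(x, fisher=False)   # Pearson definition, about 3 for a normal

print(kurt_formula, kurt_pearson, kurt_excess)
```

Read this way, the value of about 9.55 for the SPY returns is excess kurtosis, well above the normal benchmark of 0, which is what "fat tails" means here.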

____

# Lower partial standard deviation and the Sortino ratio

- One problem with using the simple standard deviation of daily returns is that positive deviations are penalized just like negative ones
    - Another problem is that we're comparing our performance to the mean, instead of a fixed benchmark (e.g. an assumed RFR of 5%)
    
- To account for these issues, Sortino suggested calculating the **lower partial standard deviation**:

$$
LPSD = \sqrt{\frac{\sum_{i=1}^{m} \left (r^{*}_{i} - RFR \right )^{2}}{m-1}}; \text{ where }r^{*}_{i} \leq RFR \text{ and } m \text{ is the number of such returns}
$$

- Importing the Fama-French RFR data

In [15]:
import pickle

In [16]:
df_RFR = pd.read_pickle('ffDaily.pkl')[['Rf']]

- Merging RFR onto the returns data for IBM

In [17]:
df_IBM = df_IBM.join(df_RFR, how = 'left')

In [18]:
df_IBM['difference'] = df_IBM['return'] - df_IBM['Rf']

In [19]:
df_IBM.head()

Unnamed: 0_level_0,Open,High,Low,Close,Adj Close,Volume,return,Rf,difference
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
2009-01-02,83.889999,87.589996,83.889999,87.370003,66.577576,7558200,,0.0,
2009-01-05,86.419998,87.669998,86.18,86.82,66.158447,8315700,0.006335,0.0,0.006335
2009-01-06,87.110001,90.410004,86.370003,89.230003,67.994942,9649500,-0.027009,0.0,-0.027009
2009-01-07,87.830002,88.800003,87.120003,87.790001,66.897629,8455100,0.016403,0.0,0.016403
2009-01-08,87.809998,88.139999,85.980003,87.18,66.432793,7231800,0.006997,0.0,0.006997


In [20]:
diff_data = df_IBM[df_IBM['difference']<0]['difference'].dropna().values

- Calculating the annualized LPSD

In [21]:
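# np.std measures dispersion around the mean of the negative differences,
# not around zero (the RFR benchmark) as in the LPSD formula
np.std(diff_data)*np.sqrt(252)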

0.1475022038478313
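
As a cross-check, here is a minimal sketch that follows the LPSD formula literally, measuring shortfalls from the benchmark itself, and then forms a Sortino ratio as mean excess return over LPSD. The function names, the return values, and the zero benchmark are all hypothetical; this is not the calculation used in the cell above.

```python
import numpy as np

def lower_partial_std(returns, rfr):
    """LPSD: root of the summed squared shortfalls below rfr, divided by m - 1."""
    returns = np.asarray(returns, dtype=float)
    shortfall = returns[returns <= rfr] - rfr   # only returns at or below the benchmark
    m = shortfall.size
    if m < 2:
        return np.nan                           # not enough downside observations
    return np.sqrt((shortfall ** 2).sum() / (m - 1))

def sortino_ratio(returns, rfr):
    """Mean return in excess of rfr, divided by the LPSD."""
    returns = np.asarray(returns, dtype=float)
    return (returns.mean() - rfr) / lower_partial_std(returns, rfr)

# Hypothetical daily returns and a zero benchmark
r = np.array([0.02, -0.01, 0.015, -0.03, 0.005, -0.02])
rfr = 0.0

print(lower_partial_std(r, rfr), sortino_ratio(r, rfr))
```

Multiplying the daily LPSD by `np.sqrt(252)` annualizes it in the same way as the ordinary standard deviation.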

____

# Testing equivalency of volatility over two periods

- The stock market fell dramatically in October 1987
    - We can compare market volatility before and after the drop using **Bartlett's test**, whose null hypothesis is that the samples have equal variances

- We can compare the volatility of Ford Motor Company (ticker `F`) before and after the crash

In [22]:
df_Ford_before = pdr.get_data_yahoo('F', start = '1982-09-01', end = '1987-09-01')
df_Ford_after = pdr.get_data_yahoo('F', start = '1987-12-01', end = '1992-12-01')

[*********************100%***********************]  1 of 1 downloaded
[*********************100%***********************]  1 of 1 downloaded


In [32]:
# Gross return ratio (no "- 1" here); the constant offset does not change the variance
df_Ford_before['return'] = df_Ford_before['Adj Close'].shift(1)/df_Ford_before['Adj Close']
df_Ford_after['return'] = df_Ford_after['Adj Close'].shift(1)/df_Ford_after['Adj Close']

ret_before = df_Ford_before['return'].dropna().values
ret_after = df_Ford_after['return'].dropna().values

In [33]:
np.std(ret_before), np.std(ret_after)

(0.022901697327302302, 0.022991981703357433)

In [37]:
from scipy.stats import bartlett

In [38]:
bartlett(ret_before,ret_after)

BartlettResult(statistic=0.019548491736464998, pvalue=0.8888054338011235)

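- The two standard deviations are nearly identical, and with a p-value of about 0.89 we fail to reject the null hypothesis of equal variances
    - These results don't agree with those in the book; the cause of the discrepancy is unclear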
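
To see what Bartlett's test looks like when the variances really do differ, here is a sketch on synthetic samples. The volatilities (1% and 2% per day), sample sizes, and seed are assumptions for illustration, not Ford data.

```python
import numpy as np
from scipy.stats import bartlett

rng = np.random.default_rng(4)

calm = rng.normal(0.0, 0.01, size=1000)         # assumed low-volatility period
turbulent = rng.normal(0.0, 0.02, size=1000)    # assumed high-volatility period
calm_again = rng.normal(0.0, 0.01, size=1000)   # same volatility as `calm`

stat_diff, p_diff = bartlett(calm, turbulent)   # variances differ
stat_same, p_same = bartlett(calm, calm_again)  # variances equal by construction

# Adding a constant to every observation leaves the variances unchanged
stat_shift, p_shift = bartlett(calm + 1.0, turbulent + 1.0)

print(stat_diff, p_diff)
print(stat_same, p_same)
print(stat_shift, p_shift)
```

The third call illustrates why using gross return ratios (returns plus one) instead of returns gives the same test result.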

____

# Testing heteroskedasticity

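- If volatility is constant (homoskedastic), the variance of the residuals (i.e. noise) should be roughly the same over the whole sample
    - Otherwise, the size of the errors increases/decreases with time (i.e. the residual variance is related to time), which is heteroskedasticity
        - This is a problem in regression modelling because the usual OLS standard errors are no longer reliable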

- Skipping the rest
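
The idea in the first bullet of this section can be checked numerically. The sketch below is one simple possibility, not necessarily the book's method: a Breusch-Pagan-style LM check (the $n R^{2}$ form) that regresses squared residuals on time. The residuals are synthetic, and the helper name `lm_test_against_time` is hypothetical.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(5)
n = 1000
t = np.arange(n)

# Synthetic residuals: constant variance vs. a standard deviation that grows with time
resid_const = rng.normal(0.0, 1.0, size=n)
resid_growing = rng.normal(0.0, 1.0, size=n) * (1.0 + 3.0 * t / n)

def lm_test_against_time(resid, t):
    """n * R^2 from regressing squared residuals on time (chi-square, 1 dof)."""
    y = resid ** 2
    X = np.column_stack([np.ones_like(t, dtype=float), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    r2 = 1.0 - ((y - fitted) ** 2).sum() / ((y - y.mean()) ** 2).sum()
    lm = len(y) * r2
    return lm, chi2.sf(lm, df=1)   # statistic and p-value

lm_const, p_const = lm_test_against_time(resid_const, t)
lm_growing, p_growing = lm_test_against_time(resid_growing, t)

print(lm_const, p_const)
print(lm_growing, p_growing)
```

A small p-value suggests the residual variance changes with time, i.e. heteroskedasticity.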