# Ben's Answers
* What are the differences between historical volatility and implied volatility?

Historical volatility is measured from past prices over a sample period. Implied volatility is backed out of the current trading prices of options.
  * Can instantaneous historical volatility be calculated?
  
  No. Historical volatility has to be calculated over time. Implied volatility is instantaneous because it is observed in current option prices.
* How can we evaluate our historical volatility estimators? 

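By bias and efficiency. Bias is how far the estimator's expected value sits from the true volatility. Efficiency is the variance of the benchmark close-to-close estimator divided by the variance of the estimator being tested, so a higher number means a less noisy estimate.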
* Write down the historical volatility calculations from ch. 2 

See functions below.
   * What estimators can we evaluate with Quandl/Yahoo data sources?

All estimators except the First Exit Time estimator, which requires higher-frequency data. The other calculations only require each day's open, high, low, and close prices. However, all of them can be improved by higher-frequency sampling (for example, 15-minute or half-hour bars).
* What are some ways sampling affects historical volatility?

Long sampling periods make the earliest data less relevant because volatility is not constant. Short sampling periods, however, have significant bias and sampling error.
  * Consider the overnight variance handling method (pp. 30-31)

This lets us treat non-trading periods (overnight jumps) as a weighted contribution to variance, with the weights based on historical observations of the stock's behavior.
  
* Why is the first exit time estimator fundamentally different from the other estimators?

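It measures how fast prices change, not how far prices move. The other estimators fix a time interval and measure the price move within it, while the first exit time estimator fixes a price move and measures the time taken to reach it.
  * Consider eq. (2.18)'s significance

This equation gives us a set period of time over which to measure how far prices move, based on the average velocity of stock movements.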
  
* What are some examples of fundamental analysis which can supplement observed changes in volatility?

Earnings calls, dividend announcements, upcoming litigation, clinical trials (biotech)
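
As a rough illustration of the first exit time idea discussed above, the sketch below fixes a log-price corridor of half-width `delta`, records how long the price takes to leave it, and turns the average exit time into a volatility. It relies on the standard result for driftless Brownian motion that the expected exit time is `delta**2 / sigma**2`; whether that is exactly the form of eq. (2.18) is an assumption. The function name `first_exit_vol`, the simulated one-minute path, and all parameter values are hypothetical.

```python
import numpy as np

def first_exit_vol(log_prices, delta, dt):
    """Estimate volatility as delta / sqrt(mean exit time) (illustrative sketch)."""
    exit_times = []
    anchor, start = log_prices[0], 0
    for i, x in enumerate(log_prices):
        if abs(x - anchor) >= delta:             # price has left the corridor
            exit_times.append((i - start) * dt)  # time spent inside it
            anchor, start = x, i                 # restart the clock from here
    if not exit_times:
        raise ValueError("no exits: delta is too wide for this sample")
    return delta / np.sqrt(np.mean(exit_times))

# Simulated high-frequency log prices (assumed: 20% annual vol, one-minute steps).
rng = np.random.default_rng(1)
true_sigma = 0.20
dt = 1.0 / (252 * 390)                           # one minute, in years
steps = rng.normal(0.0, true_sigma * np.sqrt(dt), size=400_000)
log_prices = np.cumsum(steps)

estimate = first_exit_vol(log_prices, delta=0.0125, dt=dt)
print(estimate)
```

Because exits are only detected at sampled points, the measured exit times run slightly long and the estimate tends to sit a little below the true value. That is one reason this estimator needs high-frequency data.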

In [1]:
import numpy as np
import pandas as pd
import quandl

In [2]:
# Close to Close Estimator
def c2c_wrong(df):
    """Close Close estimator from Euan Sinclair's Volatility Trading"""
    close=df['Adj_Close']
    n=len(close)
    return np.sqrt(1/(len(close)-1)*np.sum(np.power(close-np.mean(close),2)))

In [3]:
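def c2c(df):
    """Equation 5.3 in Espen Gaarder Haug's book The Complete Guide to Option Pricing Formulas"""
    # Work on a plain NumPy array: recent pandas versions align Series on their index inside
    # NumPy ufuncs, so dividing close[1:] by close[:-1] would not pair each close with the previous one.
    close=df['Adj_Close'].to_numpy()
    n=len(close)
    log_ret=np.log(close[1:]/close[:-1])  # ln(c_i / c_{i-1})
    return np.sqrt(1/(n-1)*np.sum(log_ret**2)-1/(n*(n-1))*np.sum(log_ret)**2)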

In [4]:
#Parkinson Estimator
def park(df):
    high=df['Adj_High']
    low=df['Adj_Low']
    return np.sqrt(1/(4*len(high)*np.log(2))*np.sum(np.power(np.log(np.divide(high,low)),2)))

In [5]:
#Garman-Klass Estimator
def gk(df):
    high=df['Adj_High']
    low=df['Adj_Low']
    close=df['Adj_Close']
    n=len(close)
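    # NOTE: close-1 subtracts 1.0 from every price. The second term of the formula is a squared log return
    # (close relative to the open or to the previous close, depending on the formulation), not ln(close/(close-1)).
    return np.sqrt(1/n*np.sum(1/2*np.power(np.log(np.divide(high,low)),2))-1/n*np.sum((2*np.log(2)-1)*(np.power(np.log(np.divide(close,close-1)),2))))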
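
For comparison, here is a minimal sketch of the Garman-Klass estimator with the second term written as the open-to-close log return ln(c_i/o_i), as in Garman and Klass's original formulation. In the cell above, `close-1` subtracts 1.0 from each price instead of selecting another price series. The name `gk_open_close`, the use of the `Adj_Open` column, and the hand-made bars are assumptions for illustration. The result on the Quandl data is not shown because it has not been run here.

```python
import numpy as np
import pandas as pd

def gk_open_close(df):
    """Garman-Klass volatility using open-to-close log returns (sketch)."""
    high = df['Adj_High'].to_numpy()
    low = df['Adj_Low'].to_numpy()
    close = df['Adj_Close'].to_numpy()
    open_ = df['Adj_Open'].to_numpy()
    hl = np.log(high / low) ** 2        # squared log range
    co = np.log(close / open_) ** 2     # squared open-to-close log return
    return np.sqrt(np.mean(0.5 * hl - (2 * np.log(2) - 1) * co))

# Tiny hand-made bars (assumed values, not market data).
bars = pd.DataFrame({
    'Adj_Open':  [100.0, 101.0, 100.5],
    'Adj_High':  [102.0, 102.5, 101.5],
    'Adj_Low':   [99.0, 100.0, 99.5],
    'Adj_Close': [101.0, 100.5, 101.0],
})
gk_value = gk_open_close(bars)
print(gk_value)
```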

In [6]:
#Rogers-Satchell-Yoon Estimator
def rsy(df):
    high=df['Adj_High']
    low=df['Adj_Low']
    close=df['Adj_Close']
    open_=df['Adj_Open']
    n=len(close)
    return np.sqrt(1/n*np.sum(np.log(np.divide(high,close))*np.log(np.divide(high,open_))+np.log(np.divide(low,close))*np.log(np.divide(low,open_))))

In [7]:
#Yang-Zhang Estimator
def yz(df):
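    # Plain NumPy arrays keep the shifted slices positional (recent pandas aligns Series on their index in ufuncs).
    high=df['Adj_High'].to_numpy()
    low=df['Adj_Low'].to_numpy()
    close=df['Adj_Close'].to_numpy()
    open_=df['Adj_Open'].to_numpy()
    n=len(close)
    # NOTE: the published Yang-Zhang constant is k = 0.34/(1.34+(n+1)/(n-1)); this cell uses 1 in place of 1.34.
    k=0.34/(1+(n+1)/(n-1))
    sig_o2=1/(n-1)*np.sum(np.power(np.log(np.divide(open_[1:],close[:-1])),2))
    # NOTE: the published open-to-close term uses same-day prices, ln(c_i/o_i); this pairs each close with the previous open.
    sig_c2=1/(n-1)*np.sum(np.power(np.log(np.divide(close[1:],open_[:-1])),2))
    sig_rs2=(1/n*np.sum(np.log(np.divide(high,close))*np.log(np.divide(high,open_))+np.log(np.divide(low,close))*np.log(np.divide(low,open_))))
    # NOTE: the published combination is sig_o2 + k*sig_c2 + (1-k)*sig_rs2; sig_c2 is not weighted by k here.
    return np.sqrt(sig_o2+sig_c2+(1-k)*sig_rs2)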
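
The Yang-Zhang result further down is noticeably higher than the other estimators, which may come from differences between the cell above and the published formula. A minimal sketch that follows the published form more closely is given here for comparison: overnight variance from ln(o_i/c_{i-1}), open-to-close variance from ln(c_i/o_i), the constant k = 0.34/(1.34 + (n+1)/(n-1)), and the combination sig_o2 + k*sig_c2 + (1-k)*sig_rs2. The name `yz_sketch`, the choice to drop the first day so every day has a previous close, and the hand-made bars are assumptions.

```python
import numpy as np
import pandas as pd

def yz_sketch(df):
    """Yang-Zhang volatility following the published formula (sketch)."""
    high = df['Adj_High'].to_numpy()
    low = df['Adj_Low'].to_numpy()
    close = df['Adj_Close'].to_numpy()
    open_ = df['Adj_Open'].to_numpy()

    overnight = np.log(open_[1:] / close[:-1])   # ln(o_i / c_{i-1})
    intraday = np.log(close[1:] / open_[1:])     # ln(c_i / o_i)
    n = len(overnight)                           # days that have a previous close

    k = 0.34 / (1.34 + (n + 1) / (n - 1))
    sig_o2 = np.var(overnight, ddof=1)           # overnight variance
    sig_c2 = np.var(intraday, ddof=1)            # open-to-close variance
    # Rogers-Satchell term computed on the same days
    h, l, c, o = high[1:], low[1:], close[1:], open_[1:]
    sig_rs2 = np.mean(np.log(h / c) * np.log(h / o) + np.log(l / c) * np.log(l / o))
    return np.sqrt(sig_o2 + k * sig_c2 + (1 - k) * sig_rs2)

# Hand-made bars (assumed values, not market data).
bars = pd.DataFrame({
    'Adj_Open':  [100.0, 101.5, 100.0, 102.0, 101.0],
    'Adj_High':  [102.0, 102.5, 102.5, 103.0, 102.0],
    'Adj_Low':   [99.0, 100.0, 99.5, 100.5, 100.0],
    'Adj_Close': [101.0, 100.5, 102.0, 101.0, 101.5],
})
yz_value = yz_sketch(bars)
print(yz_value)
```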

In [8]:
quandl.ApiConfig.api_key = ##################

In [9]:
df = quandl.get('EOD/HD', start_date='2016-12-28', end_date='2017-12-28')

In [10]:
park(df)

0.007595636044045016

In [11]:
c2c(df)

0.008253980027371693

In [12]:
gk(df)

0.007829593329408868

In [13]:
rsy(df)

0.007790844720663628

In [14]:
yz(df)

0.013910650522288003
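
To make the bias and efficiency criteria from the answers above concrete, the sketch below runs a small Monte Carlo experiment: it simulates driftless log-price paths with a known daily volatility, applies the close-to-close and Parkinson estimators to each path, then compares the average estimate with the true value (bias) and the spread of the two estimators (efficiency). The simulation setup, the helper names (`simulate_days`, `c2c_log`, `park_log`), and the parameter values are assumptions for illustration, and there are no overnight gaps in the simulated data.

```python
import numpy as np

def simulate_days(rng, n_days, steps_per_day, sigma_daily):
    """Driftless log-price path split into days: log open, high, low, close."""
    step_sd = sigma_daily / np.sqrt(steps_per_day)
    steps = rng.normal(0.0, step_sd, size=(n_days, steps_per_day))
    # Cumulative log price, continuing from one day into the next
    path = np.cumsum(steps.ravel()).reshape(n_days, steps_per_day)
    day_open = np.concatenate(([0.0], path[:-1, -1]))   # previous day's close
    high = np.maximum(path.max(axis=1), day_open)
    low = np.minimum(path.min(axis=1), day_open)
    return day_open, high, low, path[:, -1]

def c2c_log(day_open, close):
    """Close-to-close: sample standard deviation of daily log returns."""
    return np.std(close - day_open, ddof=1)

def park_log(high, low):
    """Parkinson estimator written on log prices."""
    return np.sqrt(np.sum((high - low) ** 2) / (4 * len(high) * np.log(2)))

rng = np.random.default_rng(42)
true_sigma = 0.01
c2c_est, park_est = [], []
for _ in range(500):                                    # 500 simulated 20-day samples
    o, h, l, c = simulate_days(rng, 20, 200, true_sigma)
    c2c_est.append(c2c_log(o, c))
    park_est.append(park_log(h, l))
c2c_est, park_est = np.array(c2c_est), np.array(park_est)

bias_c2c = c2c_est.mean() - true_sigma
bias_park = park_est.mean() - true_sigma
efficiency_park = c2c_est.var() / park_est.var()        # benchmark variance / estimator variance
print(bias_c2c, bias_park, efficiency_park)
```

An efficiency above 1 means the Parkinson estimate varies less from sample to sample than close-to-close. Its bias depends on how finely the path is sampled, since a coarse grid can miss the true high and low.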

What is my error in the close to close method from Sinclair? 

In [15]:
c2c_wrong(df)

12.790607221717753
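
One likely answer, offered as a sketch and not a definitive reading of Sinclair's formula: `c2c_wrong` takes the standard deviation of the price levels, whereas the close-to-close estimator is the standard deviation of the log returns ln(c_i / c_{i-1}). That would explain why the result (about 12.79) is on the scale of the share price and not near the daily volatilities of roughly 0.008 from the other estimators. The name `c2c_returns` and the synthetic prices below are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def c2c_returns(df):
    """Close-to-close volatility: sample standard deviation of daily log returns."""
    close = df['Adj_Close'].to_numpy()
    log_ret = np.log(close[1:] / close[:-1])   # ln(c_i / c_{i-1})
    return np.std(log_ret, ddof=1)             # ddof=1 gives the 1/(N-1) normalisation

# Synthetic prices (an assumption, not Quandl data) with 1% daily volatility.
rng = np.random.default_rng(0)
prices = 150 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 252)))
demo = pd.DataFrame({'Adj_Close': prices})
c2c_value = c2c_returns(demo)
print(c2c_value)
```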