Tobias Kuhlmann, Karlsruhe Institute of Technology (KIT), tobias.kuhlmann@student.kit.edu

In [1]:
%matplotlib inline

# Pretty Display of Variables
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# Double resolution plotting for retina display
%config InlineBackend.figure_format ='retina'

import numpy as np
import pylab as pl
import pandas as pd
import matplotlib.pyplot as plt

import glob
import os

import datetime as dt




## Option implied betas following Buss and Vilkov (2012)
Implements calculation of option implied correlations following Buss, A. and Vilkov, G., 2012. Measuring equity risk with option-implied correlations. The Review of Financial Studies, 25(10), pp.3113-3140.


## Idea

Buss and Vilkov use forward-looking information from option prices to estimate option-implied correlations and to construct an option-implied predictor of factor betas. All of these quantities are defined under the risk-neutral probability measure.

##### Intuition: 
- Relying only on historical information assumes that the future is sufficiently similar to the past, which is questionable in financial markets. Option prices are forward-looking and traded, so they reflect current market expectations. Using option-implied information may therefore improve predictive quality.
- Variance risk premium: Investors overinsure against volatility risk, so implied volatility is typically higher than realized volatility.
 - Intuition: Investors are willing to pay a premium to hedge against changes in variance (rising variance).
- Correlation risk premium: A premium for uncertainty about future correlation. Investors want to insure against rising correlation because it reduces diversification benefits. Correlations tend to spike in crises, so there is little diversification in large market downturns and systemic crises, except through short positions. As a result, option-implied correlations are higher than realized correlations.
 - Intuition: Investors are also willing to pay a premium to hedge against changes in correlations.
 
##### Recap: Risk neutral vs realized
- Option-implied measures are defined under the risk-neutral probability measure, which differs from the realized (objective) probability measure because of the variance and correlation risk premia. These risk premia make risk-neutral moments biased predictors of realized moments.
- Option-implied correlations are not directly observable, so estimating them requires a modeling choice: a parametric form with assumptions.

##### Technical conditions of correlation matrix
1. The parametric form must produce implied correlations that do not exceed one.
2. The correlation risk premium is negative (consistent with the literature).
3. The correlation risk premium is larger in magnitude for stocks with low or negative correlations, which are exposed to a higher risk of losing diversification benefits (consistent with the literature).
4. Correlation matrix should be consistent with empirical observations
    - Implied correlation is higher than realized
    - Correlation risk premium is larger in magnitude for pairs of stocks that provide higher diversification benefits (negative and low correlation)

## Theoretical implementation steps

##### Goal
- Implied correlation matrix $\Gamma^{Q}_{t}$ with elements $\rho^{Q}_{ij,t}$
- Implied betas $\beta^{Q}_{iM,t}$ calculated from implied correlations

##### Implied variance of market index
Observed implied variance of market index 
$$(\sigma^{Q}_{M,t})^2 = \sum_{i=1}^{N}\sum_{j=1}^{N}{w_iw_j\sigma^{Q}_{i,t}\sigma^{Q}_{j,t}\rho^{Q}_{ij,t}}$$
 
where $\sigma^{Q}_{i,t}$ is the implied volatility of stock $i$ in the index and $w_i$ are the index weights.
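
The double sum above is a quadratic form in the volatility-scaled weights $w_i\sigma^{Q}_{i,t}$. A minimal sketch with NumPy follows. The function name `index_variance` and the two-stock inputs are hypothetical and only illustrate the formula.

```python
import numpy as np

def index_variance(weights, vols, corr):
    """Index variance: sum_i sum_j w_i w_j sigma_i sigma_j rho_ij."""
    x = np.asarray(weights, dtype=float) * np.asarray(vols, dtype=float)  # w_i * sigma_i
    return float(x @ np.asarray(corr, dtype=float) @ x)

# Hypothetical two-stock index (illustrative numbers only)
w = np.array([0.6, 0.4])                    # index weights
sigma = np.array([0.2, 0.3])                # stock volatilities
rho = np.array([[1.0, 0.5],
                [0.5, 1.0]])                # correlation matrix
var_m = index_variance(w, sigma, rho)
print(var_m)
```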
 
##### Parametric form of implied correlations $\rho^{Q}_{ij,t}$
$$\rho^{Q}_{ij,t} = \rho^{P}_{ij,t} - \alpha_t(1-\rho^{P}_{ij,t})$$
 
where $\rho^{P}_{ij,t}$ is the expected correlation under the objective measure (realized) and $\alpha_t$ is a parameter that still needs to be identified.

##### Identify $\alpha_t$ in closed form
Substituting the parametric form into the index variance equation and solving for $\alpha_t$ gives
$$\alpha_t = -\frac{(\sigma^{Q}_{M,t})^2-\sum_{i=1}^{N}\sum_{j=1}^{N}{w_iw_j\sigma^{Q}_{i,t}\sigma^{Q}_{j,t}\rho^{P}_{ij,t}}}{\sum_{i=1}^{N}\sum_{j=1}^{N}{w_iw_j\sigma^{Q}_{i,t}\sigma^{Q}_{j,t}(1-\rho^{P}_{ij,t})}}$$

To satisfy the conditions above: $-1\leq\alpha_t\leq0$
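
As a sketch of how $\alpha_t$ and the implied correlation matrix could be computed for a single date: substituting the parametric form $\rho^{Q}_{ij,t} = \rho^{P}_{ij,t} - \alpha_t(1-\rho^{P}_{ij,t})$ into the index variance equation and solving for $\alpha_t$ yields a ratio with a leading minus sign, which the code below uses. The function name `implied_correlations` and all input numbers are hypothetical.

```python
import numpy as np

def implied_correlations(weights, implied_vols, corr_p, index_implied_var):
    """Return (alpha, rho_q) for one date under the parametric form
    rho_q = rho_p - alpha * (1 - rho_p)."""
    w = np.asarray(weights, dtype=float)
    s = np.asarray(implied_vols, dtype=float)
    rho_p = np.asarray(corr_p, dtype=float)
    x = w * s                                  # w_i * sigma^Q_i
    var_p = x @ rho_p @ x                      # index variance using P-correlations
    denom = x @ (1.0 - rho_p) @ x              # sum of w_i w_j s_i s_j (1 - rho^P_ij)
    alpha = -(index_implied_var - var_p) / denom
    rho_q = rho_p - alpha * (1.0 - rho_p)
    return float(alpha), rho_q

# Hypothetical three-stock index (illustrative numbers only)
w = np.array([0.5, 0.3, 0.2])
sigma_q = np.array([0.25, 0.30, 0.20])         # implied volatilities of the stocks
rho_p = np.array([[1.0, 0.3, 0.2],
                  [0.3, 1.0, 0.4],
                  [0.2, 0.4, 1.0]])            # realized correlations
var_m_q = 0.20 ** 2                            # implied variance of the index

alpha, rho_q = implied_correlations(w, sigma_q, rho_p, var_m_q)
print(alpha)
print(rho_q)
```

By construction, plugging `rho_q` back into the index variance equation reproduces `var_m_q`, which is a useful sanity check. In this example $\alpha_t$ lies between $-1$ and $0$, so the implied correlations are higher than the realized ones and the diagonal stays at one.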

##### Calculate market betas $\beta_{ik,t}$ with option implied correlation
$$\beta^{Q}_{iM,t} = \frac{\sigma^{Q}_{i,t}\sum_{j=1}^{N}{w_j\sigma^{Q}_{j,t}\rho^{Q}_{ij,t}}}{(\sigma^{Q}_{M,t})^2}$$

For a multifactor model with multiple uncorrelated factors, the beta $\beta_{ik,t}$, where $i$ is the stock and $k$ the factor, can be calculated as the ratio of the stock-to-factor covariance $\sigma_{ik,t}$ to the factor variance $\sigma^{2}_{k,t}$. Here $\rho_{ij,t}$ is the pairwise stock correlation, $\sigma_{j,t}$ are the stock volatilities, and $w_j$ are the factor-mimicking portfolio weights.
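
The market beta formula above can be sketched in a few lines of NumPy. The function name `implied_betas` and the two-stock inputs are hypothetical. In the example the index variance is computed from the same correlation matrix, so the weighted average of the betas equals one.

```python
import numpy as np

def implied_betas(weights, implied_vols, corr_q, index_implied_var):
    """beta_i = sigma_i * sum_j w_j sigma_j rho_ij / sigma_M^2."""
    w = np.asarray(weights, dtype=float)
    s = np.asarray(implied_vols, dtype=float)
    cov_q = np.outer(s, s) * np.asarray(corr_q, dtype=float)  # implied covariance matrix
    return cov_q @ w / index_implied_var                      # stock-to-index covariance / index variance

# Hypothetical two-stock index (illustrative numbers only)
w = np.array([0.6, 0.4])
sigma_q = np.array([0.2, 0.3])
rho_q = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
var_m_q = float((w * sigma_q) @ rho_q @ (w * sigma_q))        # consistent index variance

betas = implied_betas(w, sigma_q, rho_q, var_m_q)
print(betas)
```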
 
 
 

## Data

##### Data needed
- Return time series to calculate rolling window stock-stock correlations $\rho^{P}_{ij,t}$
- Stock weights in index over time 
- Implied volatilities of every single stock i $\sigma^{Q}_{i,t}$
- Implied volatilities of Index $\sigma^{Q}_{M,t}$
- Use stock-stock correlations, implied volatilities of single stocks and index implied volatility to calculate $\alpha_t$


In [2]:
# Validate relative file path and list files
os.listdir("../Option_Implied_Beta_Tobias/")

['Single_Stock_Skewness',
 'Fundamentals_SP500_Full.xlsx',
 'usdOIScurve.csv',
 '0_Paper',
 'Calculation_Process2.ipynb',
 'SP500 Prices',
 'instrumentid_and_symbol.csv',
 'CRAMnoarbEOD_USOPT0007588D1_measuresByMaturity.csv',
 '.ipynb_checkpoints']

##### Import instrument id to ticker mapping

In [3]:
id_ticker_map = pd.read_csv("../Option_Implied_Beta_Tobias/instrumentid_and_symbol.csv")
id_ticker_map.shape
id_ticker_map.head(1)


(7590, 3)

Unnamed: 0,instrumentid,symbol,name
0,USOPT0000001D1,1R,NFX (OPIS) Mont Belvieu Non-LST Propane Future


##### Import USD risk free rate

In [4]:
risk_free_rate = pd.read_csv("../Option_Implied_Beta_Tobias/usdOIScurve.csv", sep=";")
risk_free_rate.head(1)

Unnamed: 0,loctimestamp,maturity,yld
0,2002-01-01,7,1.785


##### Stock weights in index over time 

- Problem: The SP500 fundamentals Excel sheet contains only the current market cap, but the free-float market cap is needed.

The S&P 500 index is weighted by free-float market capitalization. 
Free-float weighting means that only the public float of a company, rather than its full market cap, is considered when calculating its weight. Not all shares of a company can be traded freely, since some may be subject to SEC restrictions. S&P Dow Jones assigns an IWF (Investable Weight Factor) to every constituent of its US indexes based on the constituent’s float.

References: 
- https://us.spindices.com/documents/index-policies/methodology-sp-float-adjustment.pdf
- http://siblisresearch.com/data/weights-sp-500-companies/


##### ToDo
- Get free-float market caps from all SP500 stocks (Bloomberg?)
- Calculate free-float market cap = Share price  ×  Free Float for all stocks
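
Once free-float data is available, the weights could be computed as in the sketch below. It assumes that "Free Float" means the number of freely tradable shares (shares outstanding times IWF). The tickers, column names, and numbers are hypothetical.

```python
import pandas as pd

# Hypothetical inputs for one date (illustrative numbers only)
data = pd.DataFrame(
    {"price": [100.0, 50.0, 20.0],          # share price
     "float_shares": [1e9, 1e9, 5e9]},      # assumed: shares outstanding * IWF
    index=["AAA", "BBB", "CCC"],
)

# Free-float market cap = share price * free-float shares
data["ff_mcap"] = data["price"] * data["float_shares"]

# Index weight = free-float market cap / total free-float market cap
data["weight"] = data["ff_mcap"] / data["ff_mcap"].sum()
print(data)
```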



##### Implied volatilities of Index $\sigma^{Q}_{M,t}$
- Question: Daiana's file: already cleaned? Maturities 30 days out, interpolated?
- For a given day with maturities of 14 days and 41 days -> interpolate linearly to 30 days
- Interpolation from Simon's vol surface paper
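
A minimal sketch of the linear interpolation step for a single day, using `np.interp`. The function name `interpolate_to_target` and the variance values are hypothetical. Whether to interpolate variance, total variance, or volatility is a modeling choice that these notes leave open.

```python
import numpy as np

def interpolate_to_target(days, values, target_days=30):
    """Linearly interpolate a measure observed at several maturities to target_days.
    Note: np.interp holds the end values constant outside the observed range."""
    days = np.asarray(days, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(days)                  # np.interp needs increasing x values
    return float(np.interp(target_days, days[order], values[order]))

# Hypothetical day with maturities of 14 and 41 days (illustrative values only)
value_30d = interpolate_to_target([14, 41], [0.0012, 0.0039])
print(value_30d)
```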


In [5]:
sp500_options = pd.read_csv("../Option_Implied_Beta_Tobias/CRAMnoarbEOD_USOPT0007588D1_measuresByMaturity.csv", sep=";")
sp500_options.shape
sp500_options.head(1)



(58244, 14)

Unnamed: 0,instrumentID,loctimestamp,daystomaturity,underlyingprice,underlyingforwardprice,bakshiVariance,bakshiCubic,bakshiQuartic,bakshiSkew,bakshiKurt,hellingerVar,hellingerSkew,hellingerQuart,SVIX
0,USOPT0007588D1,2004-01-02 00:00:00,15,1108.03,1110.05,0.00136,-0.000114,4.6e-05,-2.26189,24.9558,1611.08,-41.2902,8.06682,0.030755


##### Implied volatilities of every single stock i $\sigma^{Q}_{i,t}$
Read in risk neutral measures for single stocks and combine in dataframe

In [None]:
# Directory containing one csv file of risk neutral measures per instrument
mycsvdir = '../Option_Implied_Beta_Tobias/Single_Stock_Skewness/'

# get all the csv files in that directory
csvfiles_w_path = glob.glob(os.path.join(mycsvdir, '*.csv'))

# columns to keep from each file
measure_cols = ['loctimestamp',
                'daystomaturity',
                'bakshiVariance',
                'hellingerVar',
                'SVIX',
                'interpolmaturity']

# loop through the files, read them in with pandas and collect the frames
# (DataFrame.append was removed in pandas 2.0, so concatenate once at the end)
frames = []
for csvfile in csvfiles_w_path:
    df = pd.read_csv(csvfile, usecols=measure_cols)
    # add column with instrument id (file name up to the first underscore)
    filename = os.path.basename(csvfile)
    df['id'] = os.path.splitext(filename)[0].partition("_")[0]
    frames.append(df[['id'] + measure_cols])

# combine all instruments into one dataframe with a fresh index
stock_risk_neutral_measures = pd.concat(frames, ignore_index=True)

stock_risk_neutral_measures.shape
print(f"Unique instrument id: {stock_risk_neutral_measures.id.unique().shape}")

##### Stock prices
Why are there 871 files / ticker symbols when the SP500 is supposed to have 505?

In [None]:
# the path to your csv file directory
mycsvdir = '../Option_Implied_Beta_Tobias/SP500 Prices/'

# get all the csv files in that directory
csvfiles_w_path = glob.glob(os.path.join(mycsvdir, '*.csv'))

In [None]:
# loop through the files, read them in with pandas and collect the frames
frames = []
for csvfile in csvfiles_w_path:
    df = pd.read_csv(csvfile, usecols=['Date', 'CLOSE'])
    # add column with ticker symbol (file name without extension)
    filename = os.path.basename(csvfile)
    df['id'] = os.path.splitext(filename)[0]
    frames.append(df[['id', 'Date', 'CLOSE']])

# DataFrame.append was removed in pandas 2.0, so concatenate once instead
stock_prices = pd.concat(frames, ignore_index=True)

stock_prices.shape
print(f"Unique ticker labels: {stock_prices.id.unique().shape}")

##### Rolling window stock-stock correlations $\rho^{P}_{ij,t}$ under empirical measure P
- rolling window correlations of log returns: rolling window = 252
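
A sketch of the rolling correlation step with pandas. It assumes prices in wide format (one column per ticker, indexed by date), which could be obtained from a long table such as `stock_prices` via `pivot(index='Date', columns='id', values='CLOSE')`. The function name `rolling_correlations`, the simulated prices, and the shorter window in the example are hypothetical.

```python
import numpy as np
import pandas as pd

def rolling_correlations(prices, window=252):
    """Rolling pairwise correlations of log returns.
    Returns (log_returns, corr), where corr has a (date, ticker) MultiIndex."""
    log_ret = np.log(prices / prices.shift(1)).dropna(how="all")
    corr = log_ret.rolling(window).corr(pairwise=True)
    return log_ret, corr

# Simulated prices for three hypothetical tickers (illustrative only)
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=300)
shocks = rng.normal(0.0, 0.01, size=(300, 3))
prices = pd.DataFrame(100.0 * np.exp(shocks.cumsum(axis=0)),
                      index=dates, columns=["AAA", "BBB", "CCC"])

log_ret, corr = rolling_correlations(prices, window=60)

# Correlation matrix for the most recent date
latest = corr.loc[log_ret.index[-1]]
print(latest)
```

The first `window - 1` dates contain only NaN values because the window is not yet full.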

## Implementation

##### Steps
- Return time series to calculate stock-stock correlations $\rho^{P}_{ij,t}$
- Stock weights $w_i$ in index over time (time-varying or time-invariant?)
- Implied volatilities of every single stock i $\sigma^{Q}_{i,t}$
- Implied volatilities of Index $\sigma^{Q}_{M,t}$
- Use stock-stock correlations, implied volatilities of single stocks and index implied volatility to calculate $\alpha_t$