# Option Pricing Based on the Heston and Black-Scholes Models

We use Monte Carlo simulation to implement the Heston model and compare it with the Black-Scholes model.

In [1]:
import os
import time
import pathlib
import numpy as np
import pandas as pd
from tqdm import tqdm
from scipy.stats import norm

from heston import *
from blackscholes import *

## 0 Heston model for scalar values

We first implement a scalar version of the model according to the following formulae.  

$$d S_t  = \mu S_t dt + \sqrt{\nu_t} S_t dW^S_t \\ d \nu_t = \kappa (\theta - \nu_t) dt + \xi \sqrt{\nu_t} dW^\nu_t$$  

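which is discretized with a full-truncation Euler scheme, where $\nu_i^{+} = \max(\nu_i, 0)$ and $\Delta W^S_{i+1}, \Delta W^\nu_{i+1}$ are standard normal draws with correlation $\rho$, as  

$$
S_{i+1} = S_i \exp \left[\left(\mu - \frac{1}{2} \nu_i^{+}\right) \Delta t + \sqrt{\nu_i^{+}} \sqrt{\Delta t} \, \Delta W^S_{i+1}\right] \\
\nu_{i+1} = \nu_i + \kappa (\theta - \nu_i^{+}) \Delta t + \xi \sqrt{\nu_i^{+}} \sqrt{\Delta t} \, \Delta W^\nu_{i+1}
$$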
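The discretization above can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the `generate_heston_paths` function from `heston.py`, whose source is not shown here. The function name `heston_call_mc_sketch` and its signature are hypothetical. The sketch assumes that the $\Delta W$ terms are standard normal draws scaled by $\sqrt{\Delta t}$, that $\nu^{+} = \max(\nu, 0)$, and that the drift $\mu$ equals the risk-free rate $r$ for pricing.

```python
import numpy as np

def heston_call_mc_sketch(S0, K, r, T, v0, kappa, theta, xi, rho,
                          num_steps=200, num_sims=20000, seed=0):
    """Hypothetical helper: Monte Carlo price of a European call under Heston."""
    rng = np.random.default_rng(seed)
    dt = T / num_steps
    S = np.full(num_sims, S0, dtype=float)  # asset price per path
    v = np.full(num_sims, v0, dtype=float)  # variance per path
    for _ in range(num_steps):
        # Correlated standard normal draws for the asset and the variance
        z_s = rng.standard_normal(num_sims)
        z_v = rho * z_s + np.sqrt(1.0 - rho**2) * rng.standard_normal(num_sims)
        v_pos = np.maximum(v, 0.0)  # truncate negative variance (assumption)
        S = S * np.exp((r - 0.5 * v_pos) * dt + np.sqrt(v_pos * dt) * z_s)
        v = v + kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos * dt) * z_v
    payoff = np.maximum(S - K, 0.0)
    # Discounted average payoff over all simulated paths
    return np.exp(-r * T) * payoff.mean()
```

A call such as `heston_call_mc_sketch(100.0, 100.0, 0.0319, 1.0, 0.010201, 6.21, 0.019, 0.61, -0.7)` would price one at-the-money call; the result varies with the seed and the number of paths.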

In [2]:
# set some parameters
num_sims = 100000      # Number of simulated asset paths
num_intervals = 1000   # Number of intervals for the asset path to be sampled

S_0 = 100.0     # Initial spot price
K = 100.0       # Strike price
r = 0.0319      # Risk-free rate
v_0 = 0.010201  # Initial variance (volatility squared)
T = 1.00        # One year until expiry

rho = -0.7      # Correlation of asset and volatility
kappa = 6.21    # Mean-reversion rate
theta = 0.019   # Long-run average variance
xi = 0.61       # "Vol of vol"

In [3]:
result = generate_heston_paths(S_0, T, K, r, kappa, theta, v_0,
                      rho, xi, num_intervals, num_sims)

100%|██████████| 1000/1000 [00:04<00:00, 237.62it/s]


In [4]:
for x in result:
  print(x)

6.857745620433957
3.7042566528167202
0.019215234072840794


## 1 Heston model for vector values 

To produce a large number of results, we need to utilize the built-in vectorization in `numpy`.
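As an illustration of what vectorizing over options can look like, the sketch below prices several calls in one pass by giving every array the shape `(n_options, num_sims)`. It is not the `generate_heston_vec` function from `heston.py`; the name `heston_call_mc_vec_sketch`, its signature, and the use of the risk-free rate as drift are assumptions.

```python
import numpy as np

def heston_call_mc_vec_sketch(S0, K, r, T, v0, kappa, theta, xi, rho,
                              num_steps=100, num_sims=5000, seed=0):
    """Hypothetical helper: each parameter is a 1-D array, one entry per option."""
    rng = np.random.default_rng(seed)
    # Turn each parameter into a column so it broadcasts against
    # arrays of shape (n_options, num_sims).
    S0, K, r, T, v0, kappa, theta, xi, rho = (
        np.asarray(a, dtype=float)[:, None]
        for a in (S0, K, r, T, v0, kappa, theta, xi, rho)
    )
    n_options = S0.shape[0]
    dt = T / num_steps  # per-option step size, shape (n_options, 1)
    S = np.broadcast_to(S0, (n_options, num_sims)).copy()
    v = np.broadcast_to(v0, (n_options, num_sims)).copy()
    for _ in range(num_steps):
        z_s = rng.standard_normal((n_options, num_sims))
        z_v = rho * z_s + np.sqrt(1.0 - rho**2) * rng.standard_normal((n_options, num_sims))
        v_pos = np.maximum(v, 0.0)  # truncate negative variance (assumption)
        S = S * np.exp((r - 0.5 * v_pos) * dt + np.sqrt(v_pos * dt) * z_s)
        v = v + kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos * dt) * z_v
    payoff = np.maximum(S - K, 0.0)
    # One discounted average per option (row)
    return (np.exp(-r * T) * payoff.mean(axis=1, keepdims=True)).ravel()
```

The only Python-level loop left is over time steps; options and paths are handled by NumPy broadcasting. Memory grows with `n_options * num_sims`, so large dataframes may need to be processed in chunks.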

We first load the data for the experiment and preprocess it (roughly).

In [12]:
parent_path = str(pathlib.Path(os.getcwd()).parent)
compressed = False  # set to True to extract the zipped data first
if compressed:
    import zipfile
    with zipfile.ZipFile(os.path.join(parent_path, "data", "combined.zip"), "r") as zip_ref:
        zip_ref.extractall(os.path.join(parent_path, "data"))

In [16]:
df = pd.read_csv(os.path.join(parent_path, 'data/data.csv'))

print(df.shape)

df.sample(5)

(107499, 14)


| Unnamed: 0.1 | Unnamed: 0 | optionid | securityid | strike | callput | date_traded | contract_price | market_price | underlyings_price | contract_volume | days_to_maturity | moneyness | rate | volatility |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 71090 | 71616 | 158035361.0 | 702263.0 | 14.75 | C | 2014-12-05 | 0.0725 | 0.07625 | 14.7175 | 175.0 | 7.0 | 0.997797 | 0.001361 | 0.106912 |
| 79565 | 80091 | 161441744.0 | 702263.0 | 17.2 | C | 2019-07-19 | 0.145 | 0.14875 | 16.03175 | 40.0 | 154.0 | 0.932079 | 0.021512 | 0.140598 |
| 12914 | 13302 | 150256638.0 | 506528.0 | 62.25 | C | 2006-09-14 | 0.205 | 0.2 | 58.772202 | 111.0 | 64.0 | 0.944132 | 0.054346 | 0.259342 |
| 88120 | 88646 | 163738109.0 | 702263.0 | 15.7 | C | 2019-03-04 | 0.2775 | 0.2725 | 15.892 | 43.0 | 11.0 | 1.012229 | 0.024492 | 0.099781 |
| 26057 | 26570 | 156827924.0 | 506534.0 | 5.25 | C | 2007-09-07 | 0.06 | 0.0605 | 5.1743 | 1634.0 | 14.0 | 0.985581 | 0.05829 | 0.244493 |


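Here we use (almost) the same approach for each of the remaining Heston parameters: the long-run average variance $\theta$, the mean-reversion rate $\kappa$, the volatility of volatility $\xi$, and the correlation $\rho$ are each drawn independently and uniformly at random for every option.  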

**Question:** Is this the best approach? Are there better approaches? Perhaps options with the same underlying asset should take correlated values?

In [17]:
# drop contract volume
# NOTE: DataFrame.drop returns a new frame; the result is not assigned here,
# so contract_volume is still present in df (see the output below)
df.drop(['contract_volume'], axis = 1)

# drop small strike prices
df = df.drop(df[df.strike<0.1].index)

# These columns could instead be generated inside the Heston simulation function

# add long-run average variance (theta)
df['mean_volatility'] = 0.001 + 0.05 * np.random.rand(len(df))

# add mean-reversion rate of the variance (kappa)
df['reversion'] = 0.01 + 5 * np.random.rand(len(df))

# add volatility of volatility (xi)
df['var_of_vol'] = 0.01 + 0.7 * np.random.rand(len(df))

# add correlation between the two random processes (rho)
df['rho'] = -0.05 - 0.7 * np.random.rand(len(df))

df.head()

| Unnamed: 0.1 | Unnamed: 0 | optionid | securityid | strike | callput | date_traded | contract_price | market_price | underlyings_price | contract_volume | days_to_maturity | moneyness | rate | volatility | mean_volatility | reversion | var_of_vol | rho |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 150034236.0 | 504569.0 | 0.42 | C | 2006-10-18 | 0.0715 | 0.07025 | 0.4885 | 5.0 | 2.0 | 1.163095 | 0.053646 | 0.022956 | 0.034938 | 3.024025 | 0.171832 | -0.269833 |
| 1 | 1 | 150247468.0 | 504880.0 | 40.0 | C | 2006-10-18 | 0.124 | 0.1225 | 39.913799 | 56137.0 | 2.0 | 0.997845 | 0.053646 | 0.114784 | 0.009892 | 1.36782 | 0.695079 | -0.062408 |
| 2 | 2 | 150255000.0 | 506496.0 | 62.0 | C | 2006-10-18 | 0.172 | 0.174 | 61.827798 | 27369.0 | 2.0 | 0.997223 | 0.053646 | 0.106823 | 0.017232 | 4.551421 | 0.366965 | -0.609384 |
| 3 | 3 | 150255496.0 | 506497.0 | 53.5 | C | 2006-10-18 | 0.296 | 0.2655 | 53.6129 | 1224.0 | 2.0 | 1.00211 | 0.053646 | 0.110336 | 0.00448 | 0.939388 | 0.357749 | -0.68554 |
| 4 | 4 | 150255498.0 | 506497.0 | 54.0 | C | 2006-10-18 | 0.075 | 0.0645 | 53.6129 | 963.0 | 2.0 | 0.992831 | 0.053646 | 0.110336 | 0.013351 | 1.627237 | 0.627991 | -0.066153 |
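The draws above use NumPy's global random state, so every run produces different parameters. A possible variant, sketched here with a seeded generator so that results are reproducible, keeps the same uniform ranges. The helper name `add_random_heston_params` and the `seed` argument are hypothetical and not part of the original code.

```python
import numpy as np
import pandas as pd

def add_random_heston_params(df, seed=0):
    """Hypothetical helper: return a copy of df with random Heston parameters."""
    rng = np.random.default_rng(seed)  # seeded for reproducibility (assumption)
    n = len(df)
    out = df.copy()
    # Same ranges as the cell above: low + width * U(0, 1)
    out["mean_volatility"] = rng.uniform(0.001, 0.051, n)  # theta
    out["reversion"] = rng.uniform(0.01, 5.01, n)          # kappa
    out["var_of_vol"] = rng.uniform(0.01, 0.71, n)         # xi
    out["rho"] = -rng.uniform(0.05, 0.75, n)               # rho, always negative
    return out
```

Calling `add_random_heston_params(df, seed=42)` twice yields identical columns, which makes the MSE comparison later in the notebook repeatable.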


For ease of computation, we reduce `df` to a random sample of 0.5% of its rows.

In [18]:
df = df.sample(int(0.005*len(df)))
print(f"The dataframe now has a length {len(df)}")

The dataframe now has a length 537


## 2 Black-Scholes Model for vector values  

We implement the call/put option price of the Black-Scholes model analytically according to the following formulae.  

$$
\begin{aligned}
C(S,t) &= SN(d_1) - Ke^{-rT} N(d_2) \\
P(S,t) &= Ke^{-rT} - S + (SN(d_1) - Ke^{-rT} N(d_2))
\end{aligned}
$$  

where $N(x)$ is the CDF of a standard normal distribution, $T$ is the time to expiry, $\sigma$ is the volatility, and $d_1,d_2$ are defined as  

$$
\begin{aligned}
d_1 &= \frac{\log(S/K) + (r+\frac{\sigma^2}{2})T}{\sigma \sqrt{T} }\\
d_2 &= d_1 - \sigma \sqrt{T}
\end{aligned}
$$  

The implementation of the Black-Scholes model can be found in `blackscholes.py`, and that of the Heston model in `heston.py`.
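Since the contents of `blackscholes.py` are not shown here, the following is a minimal vectorized sketch of the formulae above, not the author's implementation. The function name `bs_price_sketch`, its argument order, and the `"C"`/`"P"` flag convention (borrowed from the `callput` column) are assumptions.

```python
import numpy as np
from scipy.stats import norm

def bs_price_sketch(S, K, r, sigma, T, callput="C"):
    """Hypothetical helper: Black-Scholes price for scalars or arrays."""
    S, K, r, sigma, T = (np.asarray(a, dtype=float) for a in (S, K, r, sigma, T))
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    call = S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)
    put = K * np.exp(-r * T) - S + call  # put-call parity, as in the formula above
    # Select the call price where the flag is "C", the put price otherwise
    return np.where(np.asarray(callput) == "C", call, put)
```

Note that the expression for `d1` divides by `sigma * np.sqrt(T)`, so rows with zero volatility or zero time to maturity would need to be filtered out or handled separately.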

## 3 Testing the implementation  

We test both implementations by computing the mean squared error (MSE) of their prices against the historical contract prices.

In [20]:
# price every option in df with both models
output_heston = generate_heston_vec(df, 1000, 100)
output_bs = generate_bs_vec(df)

# mean squared error against the historical contract prices
test_output_heston = np.sum((output_heston - df['contract_price'].values)**2) / len(df)
test_output_bs = np.sum((output_bs - df['contract_price'].values)**2) / len(df)

print('MSE for Heston: ', test_output_heston)
print('MSE for B-S: ', test_output_bs)

100%|██████████| 1000/1000 [00:02<00:00, 424.70it/s]

MSE for Heston:  0.05061445005096539
MSE for B-S:  0.019314992276848564



