# Part 4 - Code

## I. Libraries and data

In [1]:
import numpy as np
import pandas as pd
import cmath
import scipy.integrate as integrate
import scipy.special as special
from scipy.optimize import least_squares

In [2]:
#data = pd.read_excel('/content/drive/MyDrive/Colab Notebooks/Numerical Analysis/dataD1.xlsx')
data = pd.read_excel('dataD3.xlsx')

---

## II. The Characteristic Functions

__Start with the general idea of valuation__, which states that _the value of the option at time t is the expected value (under the risk-neutral measure) of the option payoff, discounted at the risk-free rate_

$V_t = e^{-r(T-t)}𝔼[H(S_T)]$ 

where 

$H(S_T)$ is the payoff of the option as a function of the underlying price at maturity, $S_T$

__We use the characteristic function__ instead of the density function. A distribution's characteristic function determines it uniquely, so nothing is lost by working with it in place of the density. The characteristic function is used to obtain the _probability-related quantities_ $Π_1$ (the delta of the option) and $\Pi_2$ (the risk-neutral probability of the option being exercised).

The characteristic function of Heston model is defined as below

$\large Ψ^{Heston}_{ln(S_t)}(w) = e^{C_{(t, w)}\bar{V} + D_{(t, w)} V_0 + iwln(S_0e^{rt})} $

where
- $C_{(t, w)} = a \large[ r^{-}.t - \frac{2}{\eta^2}ln(\frac{1-ge^{-ht}}{1-g})]$
- $D_{(t, w)} = r^{-}.\frac{1-e^{-ht}}{1-ge^{-ht}}$
- $r^{±} = \frac{\beta ± h}{\eta^2}$
- $h = \sqrt{\beta^2 - 4\alpha\gamma}$
- $ g = \frac{r^{-}}{r^{+}} $
- $\alpha = -\frac{w^2}{2} -\frac{iw}{2}$
- $\beta = a - \rho\eta i w$
- $\gamma = \frac{\eta^2}{2}$

Note that the characteristic function is applied on the process $ln(S_t)$

In [3]:
def chfun_heston(s0, v0, vbar, a, vvol, r, rho, t, w):
    """  
    Arguments:
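    - s0 [float] the current price of underlying asset
    - v0 [float] the current variance
    - vbar [float] the long-term variance
    - a [float] the speed of mean-reversion of the variance
    - vvol [float] the volatility of the variance (eta)
    - r [float] risk-free rate
    - rho [float] correlation coefficient of Brownian processes
    - t [float] time horizon (in years)
    - w [float or complex] argument at which the characteristic function is evaluated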
    Returns:
    - Characteristic function of ln(St) evaluated at given parameters
    """
    alpha = -w*w/2 - complex(0,1)*w/2
    beta = a - rho*vvol*complex(0,1)*w
    gamma = vvol*vvol/2
    h = np.sqrt(beta*beta - 4*alpha*gamma)
    rplus = (beta + h)/vvol/vvol
    rminus = (beta - h)/vvol/vvol
    g = rminus/rplus
    C = a*(rminus*t - (2/vvol**2)*np.log((1 - g*np.exp(-h*t))/(1-g)))
    D = rminus*(1 - np.exp(-h * t))/(1-g*np.exp(-h*t))
    
    return np.exp(C*vbar + D*v0 + complex(0,1)*w*np.log(s0*np.exp(r*t)))
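
Two quick properties can be used to sanity-check the characteristic function: $Ψ(0) = 1$, and $Ψ(-i) = 𝔼[S_t] = S_0e^{rt}$ under the risk-neutral measure. The sketch below repeats the formulas of `chfun_heston` so that it runs on its own. The parameter values are illustrative assumptions, chosen to be of the same order as the calibrated ones, not data from the notebook.

```python
import numpy as np

def chfun_heston(s0, v0, vbar, a, vvol, r, rho, t, w):
    # Same formulas as the notebook's chfun_heston, repeated so the check is self-contained
    alpha = -w * w / 2 - 1j * w / 2
    beta = a - rho * vvol * 1j * w
    gamma = vvol * vvol / 2
    h = np.sqrt(beta * beta - 4 * alpha * gamma)
    rplus = (beta + h) / vvol**2
    rminus = (beta - h) / vvol**2
    g = rminus / rplus
    C = a * (rminus * t - (2 / vvol**2) * np.log((1 - g * np.exp(-h * t)) / (1 - g)))
    D = rminus * (1 - np.exp(-h * t)) / (1 - g * np.exp(-h * t))
    return np.exp(C * vbar + D * v0 + 1j * w * np.log(s0 * np.exp(r * t)))

# Illustrative parameter values (assumed for this check)
s0, v0, vbar, a, vvol, r, rho, t = 39.63, 0.13, 0.18, 1.1, 0.54, 0.002, -0.17, 1.0

psi_zero = chfun_heston(s0, v0, vbar, a, vvol, r, rho, t, 0.0)
psi_minus_i = chfun_heston(s0, v0, vbar, a, vvol, r, rho, t, -1j)
forward = s0 * np.exp(r * t)

print("|psi(0) - 1|        =", abs(psi_zero - 1.0))
print("|psi(-i) - forward| =", abs(psi_minus_i - forward))
```

Both differences should be at the level of floating-point rounding. A larger value would point to a transcription error in one of the coefficients.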

---

## III. Heston Model for Pricing

The mathematical construction of the Heston pricing method is as follows:

By general valuation framework, 

$V_t = e^{-r(T-t)}𝔼[H(S_T)]$

As for European Call, the option payoff at maturity is defined as $H(S_T) = (S_T - K)^+$. The value of option at time $t = 0$ is thus

$C_0 = e^{-rT}∫^{∞}_{0} (S_T - K)^+ f(S_T)dS_T$ where $f(S_T)$ is the density function of $S_T$

Rewriting the density function in terms of the characteristic function $Ψ(w)$, we have

$f(S_T) = \frac{1}{2\pi} ∫^{∞}_{-∞} e^{-iwS_T}Ψ(w)dw$

Suppose that $\Pi_1$ and $\Pi_2$ are the option delta and the risk-neutral probability of exercise $P[S_T > K]$, respectively. The option price is then

$C_0 = S_0.\Pi_1 - e^{-rT}.K.\Pi_2$

with

- $\Pi_1 = \frac{1}{2} + \frac{1}{\pi}∫^{∞}_{0} Re\left[\frac{e^{-iwln(K)}Ψ_{ln(S_T)}(w-i)}{iwΨ_{ln(S_T)}(-i)}\right]dw$
- $\Pi_2 = \frac{1}{2} + \frac{1}{\pi}∫^{∞}_{0} Re\left[\frac{e^{-iwln(K)}Ψ_{ln(S_T)}(w)}{iw}\right]dw$
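
The inversion formula for $\Pi_2$ can be tried on a distribution whose answer is known in closed form. The sketch below applies the same integral to a standard normal variable $X$ and compares $P[X > k]$ with the exact tail probability. The helper names `normal_cf` and `prob_greater` and the truncation point `upper` are assumptions made for this illustration; they are not part of the notebook.

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

def normal_cf(w):
    # Characteristic function of a standard normal variable X: E[exp(i*w*X)]
    return np.exp(-0.5 * w**2)

def prob_greater(k, cf, upper=20.0):
    # P[X > k] = 1/2 + (1/pi) * int_0^inf Re[exp(-i*w*k) * cf(w) / (i*w)] dw,
    # truncated at `upper`, beyond which this characteristic function is negligible
    integrand = lambda w: (np.exp(-1j * w * k) * cf(w) / (1j * w)).real
    value, _ = integrate.quad(integrand, 0.0, upper)
    return 0.5 + value / np.pi

k = 0.5
p_inversion = prob_greater(k, normal_cf)
p_exact = norm.sf(k)  # exact tail probability of the standard normal
print(p_inversion, p_exact)
```

The two numbers should agree to within the quadrature tolerance. In the Heston case the same integral is evaluated with $ln(K)$ in place of $k$ and the Heston characteristic function of $ln(S_T)$ in place of `normal_cf`.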

The following code calculates the option price based on the above approach

In [4]:
def call_heston_cf(s0, v0, vbar, a, vvol, r, rho, t, k):

      """
      Arguments:
      - s0 [float] the current price of underlying asset
      - v0 [float] the current variance
      - vbar [float] the long-term variance
      - a [float] the speed of mean-reversion of the variance
      - vvol [float] the volatility of the variance (eta)
      - r [float] risk-free rate
      - rho [float] correlation coefficient of Brownian processes
      - t [float] time to maturity (in years)
      - k [float] strike price of the option
  
      Return [float] option price
      """
  
      # (1) pi1 - option delta
      int1 = lambda w, s0, v0, vbar, a, vvol, r, rho, t, k : \
            (np.exp(-complex(0,1)*w*np.log(k)))*chfun_heston(s0, v0, vbar, a, vvol, r, rho, t, w-complex(0,1))/ \
            (complex(0,1)*w*chfun_heston(s0, v0, vbar, a, vvol, r, rho, t, -complex(0,1)))
  
      int1 = integrate.quad(lambda w: int1(w, s0, v0, vbar, a, vvol, r, rho, t, k).real,0, 100)
  
      pi1 = 1/2 + int1[0]/np.pi
  
      # (2) pi2 - risk-neutral probability
      int2 = lambda w, s0, v0, vbar, a, vvol, r, rho, t, k : \
            (np.exp(-complex(0,1)*w*np.log(k))*chfun_heston(s0, v0, vbar, a, vvol, r, rho, t, w)/(complex(0,1)*w))
  
      int2 = integrate.quad(lambda w: int2(w, s0, v0, vbar, a, vvol, r, rho, t, k).real,0,100)
  
      pi2 = 1/2 + int2[0]/np.pi
  
      # (3)  return option price 
      return s0*pi1 - np.exp(-r*t)*k*pi2
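
A pricing routine of this kind can be checked against model-free no-arbitrage bounds: a European call is worth at least $max(S_0 - Ke^{-rT}, 0)$, at most $S_0$, and its price decreases as the strike rises. The sketch below is a compact, self-contained restatement of the two functions above, with the same integrals and the same truncation at 100. The variance parameters are the calibrated values reported later in this notebook, rounded to six decimals; the function name `call_price` and the strike grid are choices made for this illustration.

```python
import numpy as np
from scipy import integrate

def chfun_heston(s0, v0, vbar, a, vvol, r, rho, t, w):
    # Heston characteristic function of ln(S_t), same formulas as in section II
    alpha = -w * w / 2 - 1j * w / 2
    beta = a - rho * vvol * 1j * w
    gamma = vvol * vvol / 2
    h = np.sqrt(beta * beta - 4 * alpha * gamma)
    rplus = (beta + h) / vvol**2
    rminus = (beta - h) / vvol**2
    g = rminus / rplus
    C = a * (rminus * t - (2 / vvol**2) * np.log((1 - g * np.exp(-h * t)) / (1 - g)))
    D = rminus * (1 - np.exp(-h * t)) / (1 - g * np.exp(-h * t))
    return np.exp(C * vbar + D * v0 + 1j * w * np.log(s0 * np.exp(r * t)))

def call_price(s0, v0, vbar, a, vvol, r, rho, t, k, upper=100.0):
    # Same Pi_1 / Pi_2 integrals as call_heston_cf, truncated at `upper`
    cf = lambda w: chfun_heston(s0, v0, vbar, a, vvol, r, rho, t, w)
    f1 = lambda w: (np.exp(-1j * w * np.log(k)) * cf(w - 1j) / (1j * w * cf(-1j))).real
    f2 = lambda w: (np.exp(-1j * w * np.log(k)) * cf(w) / (1j * w)).real
    pi1 = 0.5 + integrate.quad(f1, 0.0, upper)[0] / np.pi
    pi2 = 0.5 + integrate.quad(f2, 0.0, upper)[0] / np.pi
    return s0 * pi1 - np.exp(-r * t) * k * pi2

# Calibrated values from section V (rounded); spot, rate and maturity from the dataset
v0, vbar, a, vvol, rho = 0.130988, 0.176179, 1.095421, 0.542435, -0.172211
s0, r, t = 39.63, 0.000707, 0.126027
strikes = [36.0, 38.0, 40.0, 42.0, 44.0]

prices = [call_price(s0, v0, vbar, a, vvol, r, rho, t, k) for k in strikes]
for k, p in zip(strikes, prices):
    lower = max(s0 - k * np.exp(-r * t), 0.0)
    print(f"K={k:.0f}  price={p:.4f}  lower bound={lower:.4f}  upper bound={s0:.2f}")
```

If a price falls outside its bounds, or the prices are not decreasing in the strike, the usual suspects are the truncation of the integral and the branch of the complex logarithm.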

---

## IV. Optimization

To bring the model prices as close as possible to the prices actually traded in the market, we optimize the hyperparameters $Ω = \{V_0, \bar{V}, a, \eta, \rho\}$.

To optimize the hyperparameters, we execute the following steps:

### Step 1: Define measure of error

In our code, the measure of error is the sum of squared errors between the quoted mid prices and the model prices
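
Written out, with $C^{Mid}_i$ the quoted mid price of option $i$, $C^{Heston}_i(Ω)$ the model price for the same spot, strike, maturity and interest rate, and $N$ the number of options in the calibration set (this notation is introduced here for illustration), the error is

$SSE(Ω) = \sum^{N}_{i=1} \left(C^{Mid}_i - C^{Heston}_i(Ω)\right)^2$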



In [5]:
def costf(x):
    """
    Arguments:
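     - x [array] parameters in order {V0, Vbar, eta, rho, c}, where c = 2*a*Vbar - eta**2
       is the left-hand side of the non-negativity constraint; the speed of
       mean-reversion is recovered as a = (c + eta**2)/(2*Vbar)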

    Return [float] sum of squared error
    """
    
    cost = np.zeros([len(data)])
    for i in range(len(data)):
        cost[i] = data.loc[i, 'Mid'] - call_heston_cf(data.loc[i,'Spot'], x[0], x[1], (x[4]+x[2]**2)/(2*x[1]), x[2], data.loc[i, 'Interest rate'], x[3], data.loc[i, 'Maturity'], data.loc[i, 'Strike'])
        
    return sum(cost**2)

__Dataset preprocessing__

Our dataset contains 30 observations: European calls written on the same underlying asset at different strike prices and maturities.

An overview of the dataset is provided below

In [6]:
data

|  | Spot | Maturity | Strike | Interest rate | Mid | Bid | Ask |
|---|---|---|---|---|---|---|---|
| 0 | 39.63 | 0.049315 | 36 | 0.000632 | 3.75 | 3.7 | 3.8 |
| 1 | 39.63 | 0.049315 | 38 | 0.000632 | 2.145 | 2.13 | 2.16 |
| 2 | 39.63 | 0.049315 | 40 | 0.000632 | 1.035 | 1.02 | 1.05 |
| 3 | 39.63 | 0.049315 | 42 | 0.000632 | 0.435 | 0.42 | 0.45 |
| 4 | 39.63 | 0.049315 | 44 | 0.000632 | 0.17 | 0.16 | 0.18 |
| 5 | 39.63 | 0.126027 | 36 | 0.000707 | 4.3 | 4.25 | 4.35 |
| 6 | 39.63 | 0.126027 | 38 | 0.000707 | 2.91 | 2.89 | 2.93 |
| 7 | 39.63 | 0.126027 | 40 | 0.000707 | 1.85 | 1.84 | 1.86 |
| 8 | 39.63 | 0.126027 | 42 | 0.000707 | 1.095 | 1.08 | 1.11 |
| 9 | 39.63 | 0.126027 | 44 | 0.000707 | 0.615 | 0.61 | 0.62 |


To calibrate the parameters and then test the performance of the model, we divide the dataset into 2 subsets:
- Set 1 containing the first 25 observations, used for calibration
- Set 2 containing the last 5 observations, used for testing the pricing accuracy

In [7]:
data_test = data.loc[25:]
data_test

|  | Spot | Maturity | Strike | Interest rate | Mid | Bid | Ask |
|---|---|---|---|---|---|---|---|
| 25 | 39.63 | 1.868493 | 35 | 0.00228 | 10.125 | 9.95 | 10.3 |
| 26 | 39.63 | 1.868493 | 37 | 0.00228 | 9.2 | 9.05 | 9.35 |
| 27 | 39.63 | 1.868493 | 40 | 0.00228 | 7.85 | 7.75 | 7.95 |
| 28 | 39.63 | 1.868493 | 42 | 0.00228 | 7.1 | 7.0 | 7.2 |
| 29 | 39.63 | 1.868493 | 45 | 0.00228 | 6.1 | 5.95 | 6.25 |


In [8]:
data = data[:25]
data

|  | Spot | Maturity | Strike | Interest rate | Mid | Bid | Ask |
|---|---|---|---|---|---|---|---|
| 0 | 39.63 | 0.049315 | 36 | 0.000632 | 3.75 | 3.7 | 3.8 |
| 1 | 39.63 | 0.049315 | 38 | 0.000632 | 2.145 | 2.13 | 2.16 |
| 2 | 39.63 | 0.049315 | 40 | 0.000632 | 1.035 | 1.02 | 1.05 |
| 3 | 39.63 | 0.049315 | 42 | 0.000632 | 0.435 | 0.42 | 0.45 |
| 4 | 39.63 | 0.049315 | 44 | 0.000632 | 0.17 | 0.16 | 0.18 |
| 5 | 39.63 | 0.126027 | 36 | 0.000707 | 4.3 | 4.25 | 4.35 |
| 6 | 39.63 | 0.126027 | 38 | 0.000707 | 2.91 | 2.89 | 2.93 |
| 7 | 39.63 | 0.126027 | 40 | 0.000707 | 1.85 | 1.84 | 1.86 |
| 8 | 39.63 | 0.126027 | 42 | 0.000707 | 1.095 | 1.08 | 1.11 |
| 9 | 39.63 | 0.126027 | 44 | 0.000707 | 0.615 | 0.61 | 0.62 |


### Step 2: Run optimization scheme to locally minimize the error function

We use `least_squares` for the optimization problem. The search is local: it starts from an initial guess and is restricted by lower and upper bounds on each hyperparameter.

We denote
- `x0` as the array of initial hyperparameters
- `bounds` as the pair of arrays holding the lower and upper bounds of the hyperparameters

The bounds are chosen on the following basis:
1. $\bar{V}, V_0 \in [0,1] $ 
  
  Long-term variance and initial variance are expected to lie between 0 and 100%
2. $\rho \in [-1, 1] $ 

  Correlation lies within [-1, 1] by definition

3. $η \in [0, 5]$

  Volatility of variance is positive and may be large

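4. Non-negativity constraint so that the variance process does not reach zero or negative values

  $2a\bar{V} - \eta^2 > 0$

  The code does not bound $a$ directly. The fifth optimization variable is $c = 2a\bar{V} - \eta^2$, bounded to $[0, 20]$, and the speed of mean reversion is recovered as $a = \frac{c + \eta^2}{2\bar{V}}$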
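
In the calibration code the constraint is enforced by reparameterisation: the optimizer works on $c = 2a\bar{V} - \eta^2$ with a lower bound of 0, and $a$ is computed from $c$ wherever it is needed. A minimal sketch of the two conversions follows; the helper names `feller_slack` and `a_from_slack` are hypothetical, and the numbers are the calibrated values reported in section V.

```python
def feller_slack(a, vbar, eta):
    # c = 2*a*Vbar - eta**2; the non-negativity constraint asks for c > 0
    return 2.0 * a * vbar - eta**2

def a_from_slack(vbar, eta, c):
    # Inverse mapping used in the cost function: a = (c + eta**2) / (2*Vbar)
    return (c + eta**2) / (2.0 * vbar)

# Calibrated values reported in section V of this notebook
a, vbar, eta = 1.09542069, 0.17617938, 0.54243515

c = feller_slack(a, vbar, eta)
a_back = a_from_slack(vbar, eta, c)
print(f"c = {c:.6f}, constraint satisfied: {c > 0}")
print(f"a recovered from c: {a_back:.8f}")
```

Bounding $c$ instead of $a$ turns a constraint that couples three parameters into a simple box constraint, which is the only kind `least_squares` accepts.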

In [9]:
# x = [V0, Vbar, eta, rho, c] with c = 2*a*Vbar - eta**2; the bounds below use the same order
x0 = [0.1, 0.3, 0.7, -0.5, 0]
D1_local_calibration = least_squares(costf, x0, bounds=([0,0,0,-1,0],[1,1,5,1,20]))
params = D1_local_calibration.x
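
`scipy.optimize.least_squares` is documented as taking a function that returns the vector of residuals; it squares and sums them internally. When the function returns a single scalar, as `costf` does, that scalar is treated as one residual, so the quantity being minimized is half the square of the sum of squared errors. The minimizer is the same, but the solver sees less structure. The sketch below shows the residual-vector form on a toy problem. `toy_model`, its parameters and the data are hypothetical stand-ins, not the Heston pricer.

```python
import numpy as np
from scipy.optimize import least_squares

strikes = np.array([36.0, 38.0, 40.0, 42.0, 44.0])
true_params = np.array([2.0, -0.5])

def toy_model(x, k):
    # Hypothetical stand-in for a pricing function (linear in the strike)
    return x[0] + x[1] * (k - 40.0)

# Synthetic "market" prices generated from known parameters
observed = toy_model(true_params, strikes)

def residuals(x):
    # One residual per quote; least_squares minimizes 0.5 * sum(residuals**2)
    return observed - toy_model(x, strikes)

fit = least_squares(residuals, x0=[0.0, 0.0], bounds=([-10.0, -10.0], [10.0, 10.0]))
print(fit.x)     # fitted parameters
print(fit.cost)  # 0.5 * sum of squared residuals at the solution
```

Applied to this notebook, the analogous change would be to return the `cost` array from `costf` instead of `sum(cost**2)`. That would change the optimization path, so the calibrated values shown below would not necessarily be reproduced exactly.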

In [10]:
# Convert the fifth entry from c = 2*a*Vbar - eta**2 back to a (run this cell only once)
params[-1] = (params[-1] + params[2]**2)/(2*params[1])

In [11]:
params

array([ 0.13098779,  0.17617938,  0.54243515, -0.17221106,  1.09542069])

In [12]:
params_df = pd.DataFrame(params).T
params_df.rename(columns = {0: 'V0',
                            1: 'Vbar',
                            2: 'eta',
                            3: 'rho',
                            4: 'a'}, inplace = True)

---

## V. Results

### 1. Parameters

In [13]:
params_df

|  | V0 | Vbar | eta | rho | a |
|---|---|---|---|---|---|
| 0 | 0.130988 | 0.176179 | 0.542435 | -0.172211 | 1.095421 |


In [14]:
results = pd.DataFrame(data.loc[:,'Mid'])
# Initialise the output columns before filling them row by row
results['Model Price'] = 0.0
results['Difference'] = 0.0

for i in range(len(results)):
    results.loc[i, 'Model Price'] = call_heston_cf(data.loc[i,'Spot'], params[0], params[1], params[4], params[2], data.loc[i,'Interest rate'], params[3], data.loc[i,'Maturity'], data.loc[i,'Strike'])
    results.loc[i, 'Difference'] = np.abs(results.loc[i,'Mid'] - results.loc[i,'Model Price'])

In [15]:
results['Bid-Ask Spread'] = np.abs(data['Bid'] - data['Ask'])
results['Within BA Spread'] = (results['Difference'] < results['Bid-Ask Spread'])
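
The flag above compares the absolute pricing error with the full width of the bid-ask spread. A stricter reading of "within the spread" is that the model price lies between the bid and the ask. The sketch below contrasts the two on rows 0 and 5, with quotes and model prices copied from the tables in this notebook; the column names are chosen for this illustration.

```python
import pandas as pd

# Rows 0 and 5: quotes from the dataset, model prices from the results table
quotes = pd.DataFrame({
    'Bid': [3.70, 4.25],
    'Ask': [3.80, 4.35],
    'Mid': [3.75, 4.30],
    'Model Price': [3.816005, 4.301760],
}, index=[0, 5])

abs_err = (quotes['Mid'] - quotes['Model Price']).abs()

# Criterion used in the notebook: error smaller than the full spread width
quotes['Error < spread width'] = abs_err < (quotes['Ask'] - quotes['Bid'])

# Stricter criterion: model price between bid and ask
quotes['Inside [Bid, Ask]'] = quotes['Model Price'].between(quotes['Bid'], quotes['Ask'])

print(quotes[['Model Price', 'Error < spread width', 'Inside [Bid, Ask]']])
```

Row 0 passes the first test but not the second, because its model price of 3.816 sits above the ask of 3.80. Which criterion is appropriate depends on how the accuracy claim is meant to be read.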

### 2. Option Price

#### Results of option pricing on the calibration dataset

In [16]:
results

|  | Mid | Model Price | Difference | Bid-Ask Spread | Within BA Spread |
|---|---|---|---|---|---|
| 0 | 3.75 | 3.816005 | 0.066005 | 0.1 | True |
| 1 | 2.145 | 2.234845 | 0.089845 | 0.03 | False |
| 2 | 1.035 | 1.097874 | 0.062874 | 0.03 | False |
| 3 | 0.435 | 0.442494 | 0.007494 | 0.03 | True |
| 4 | 0.17 | 0.145953 | 0.024047 | 0.02 | False |
| 5 | 4.3 | 4.30176 | 0.00176 | 0.1 | True |
| 6 | 2.91 | 2.925026 | 0.015026 | 0.04 | True |
| 7 | 1.85 | 1.857258 | 0.007258 | 0.02 | True |
| 8 | 1.095 | 1.100879 | 0.005879 | 0.03 | True |
| 9 | 0.615 | 0.61175 | 0.00325 | 0.01 | True |


In [17]:
mae_cal = np.mean(results['Difference'])
print(f"The mean absolute error on the calibrating data is {mae_cal:.6f}")

The mean absolute error on the calibrating data is 0.023471


#### Results of option pricing on the testing dataset

In [18]:
n_test = len(data_test)

In [19]:
price_model_test = np.zeros((n_test))
for j in range(n_test):
    i = j + 25
    price_model_test[j] = call_heston_cf(s0 = data_test.loc[i,'Spot'], 
                                         v0 = params[0], 
                                         vbar = params[1], 
                                         a = params[4], 
                                         vvol = params[2], 
                                         r = data_test.loc[i,'Interest rate'], 
                                         rho = params[3], 
                                         t = data_test.loc[i,'Maturity'], 
                                         k = data_test.loc[i,'Strike'])

In [20]:
price_test_df = pd.DataFrame({'Actual Price': data_test.Mid,
                              'Model Price': price_model_test,
                              'Bid-Ask Spread': np.abs(data_test['Bid'] - data_test['Ask'])})
price_test_df['Difference'] = np.abs(price_test_df['Actual Price'] - price_test_df['Model Price'])
price_test_df['Within Bid-Ask Spread'] = (price_test_df['Difference'] < price_test_df['Bid-Ask Spread'])

In [21]:
price_test_df

|  | Actual Price | Model Price | Bid-Ask Spread | Difference | Within Bid-Ask Spread |
|---|---|---|---|---|---|
| 25 | 10.125 | 10.278271 | 0.35 | 0.153271 | True |
| 26 | 9.2 | 9.292275 | 0.3 | 0.092275 | True |
| 27 | 7.85 | 7.973847 | 0.2 | 0.123847 | True |
| 28 | 7.1 | 7.19545 | 0.2 | 0.09545 | True |
| 29 | 6.1 | 6.165908 | 0.3 | 0.065908 | True |


In [22]:
mae_test = np.mean(price_test_df['Difference'])
print(f"The mean absolute error on the testing data is {mae_test:.6f}")

The mean absolute error on the testing data is 0.106150
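
The test options are longer-dated and considerably more expensive than most of the calibration options, so absolute errors on the two sets are not directly comparable. A relative error gives a second view. The sketch below recomputes the mean absolute error from the five test rows shown above and adds a mean absolute percentage error; the latter is an additional metric suggested here, not one computed in the notebook.

```python
import numpy as np

# Actual (mid) and model prices of the five test options, copied from the table above
actual = np.array([10.125, 9.2, 7.85, 7.1, 6.1])
model = np.array([10.278271, 9.292275, 7.973847, 7.19545, 6.165908])

abs_err = np.abs(actual - model)
mae = abs_err.mean()               # mean absolute error, in price units
mape = (abs_err / actual).mean()   # mean absolute error relative to the actual price

print(f"MAE  = {mae:.6f}")
print(f"MAPE = {mape:.2%}")
```

The recomputed MAE matches the value printed by the notebook, and the relative error comes out at a little over one percent of the option price.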
