# Option Pricing 
The model assumes that the stock price follows geometric Brownian motion, $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$, where $W$ is a standard Brownian motion (Wiener process). Under risk-neutral pricing, the drift $\mu$ is replaced by the risk-free rate $r$, and stock trajectories can be simulated with $S_t = S_{t-1} e^{(r - \frac{1}{2}\sigma^2)dt} e^{\sigma Z_t \sqrt{dt}}$, where $dt$ is the time step and $Z_t$ is a standard normal random variable.
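As a quick sanity check on the simulation formula, the sketch below verifies numerically that the simulated terminal price has the risk-neutral mean $S_0 e^{rT}$. The seed, the number of paths and the number of steps are arbitrary choices made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
S0, r, sigma, T = 100.0, 0.05, 0.12, 1.0
steps, n_paths = 50, 100_000    # illustrative sizes, not taken from the notebook
dt = T / steps

# Sum the log-increments of the update rule instead of looping over time steps
Z = rng.standard_normal((n_paths, steps))
log_increments = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z
S_T = S0 * np.exp(log_increments.sum(axis=1))

# Under the risk-neutral measure, E[S_T] = S0 * exp(r * T)
simulated_mean = S_T.mean()
expected_mean = S0 * np.exp(r * T)
print(simulated_mean, expected_mean)
```

The two printed numbers should agree up to Monte Carlo noise, which shrinks like one over the square root of the number of paths.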

In [2]:
import numpy as np
from numpy import log, sqrt, exp
from scipy.stats import norm

In [8]:
def blackscholes(S, K, T, r, σ):
  d1 = (log(S/K) + (r + σ**2 / 2) * T) / (σ*sqrt(T))
  d2 = d1 - σ * sqrt(T)
  call = S * norm.cdf(d1) - K * exp(-r*T) * norm.cdf(d2)
  put = K * exp(-r*T) * norm.cdf(-d2) - S * norm.cdf(-d1)
  return call, put

In [11]:
S = 100
K = 100
T = 1
r = .05
σ = .12

print(blackscholes(S, K, T, r, σ))

(7.50513887446543, 2.628081324536833)
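One inexpensive way to check the two prices above is put-call parity, $C - P = S - K e^{-rT}$, which must hold for European options. The sketch below restates the pricing function with an ASCII parameter name (`sigma`) so that it is self-contained.

```python
from numpy import exp, log, sqrt
from scipy.stats import norm

def blackscholes(S, K, T, r, sigma):
    """Black-Scholes prices (call, put) of European options on a non-dividend stock."""
    d1 = (log(S / K) + (r + sigma**2 / 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    call = S * norm.cdf(d1) - K * exp(-r * T) * norm.cdf(d2)
    put = K * exp(-r * T) * norm.cdf(-d2) - S * norm.cdf(-d1)
    return call, put

S, K, T, r, sigma = 100, 100, 1, 0.05, 0.12
call, put = blackscholes(S, K, T, r, sigma)

# Put-call parity: call - put equals the forward value S - K * exp(-r * T)
lhs = call - put
rhs = S - K * exp(-r * T)
print(lhs, rhs)
```

Both sides should match to floating-point precision; a mismatch would point to a sign or discounting error in the formula.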


In [75]:
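def paths(S, K, T, r, σ, N, steps):
  """Simulate N risk-neutral GBM paths on steps+1 time points spanning [0, T]."""
  # K is unused here; it is kept so existing calls keep working
  rng = np.random.default_rng()
  Z = rng.standard_normal((N, steps))
  dt = T / steps
  prices = np.zeros((N, steps + 1))
  prices[:,0] = S
  for i in range(1, steps + 1):
    # exact log-normal update over one step of length dt
    prices[:,i] = prices[:,i-1] * exp((r-σ**2/2)*dt+σ*Z[:,i-1]*sqrt(dt))
  return prices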

def european(ST, K, T, r):
  payoff = np.maximum((ST - K), 0)
  call = np.average(payoff * exp(-r * T))
  payoff = np.maximum((K - ST), 0)
  put = np.average(payoff * exp(-r * T))
  return call, put

def asian(SA, K, T, r):
  """
  arithmetic average, fixed strike
  """
  payoff = np.maximum(SA - K, 0)
  call = np.average(payoff * exp(-r * T))
  payoff = np.maximum(K - SA, 0)
  put = np.average(payoff * exp(-r * T))
  return call, put

In [76]:
S = 100
K = 100
T = 1
r = .05
σ = .12
N = 10000
steps = 20000

print(blackscholes(S, K, T, r, σ))
prices = paths(S, K, T, r, σ, N, steps)
print(asian(np.average(prices, axis=1), K, T, r))

(7.50513887446543, 2.628081324536833)
(4.083505876644273, 1.601654276038386)
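The two lines above compare different contracts: the first is the analytical price of a European option, the second the Monte Carlo price of an Asian option, which is expected to be cheaper because averaging dampens volatility. To validate the simulation itself, a like-for-like check is to price the European call by Monte Carlo and compare it with Black-Scholes. The sketch below does this using only terminal prices; the seed, the path count and the helper name `bs_call` are assumptions made for this illustration.

```python
import numpy as np
from scipy.stats import norm

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call (hypothetical helper for this sketch)."""
    d1 = (np.log(S / K) + (r + sigma**2 / 2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

rng = np.random.default_rng(1)  # fixed seed, chosen only for reproducibility
S0, K, T, r, sigma = 100.0, 100.0, 1.0, 0.05, 0.12
n_paths = 200_000               # illustrative size

# A European payoff depends only on S_T, so one exact step to maturity is enough
Z = rng.standard_normal(n_paths)
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

discounted = np.exp(-r * T) * np.maximum(S_T - K, 0.0)
mc_call = discounted.mean()
std_err = discounted.std(ddof=1) / np.sqrt(n_paths)  # Monte Carlo standard error
exact = bs_call(S0, K, T, r, sigma)

print(f"MC call: {mc_call:.4f} +/- {std_err:.4f}, Black-Scholes: {exact:.4f}")
```

If the Monte Carlo estimate sits within a few standard errors of the analytical value, the path generator and the discounting are consistent with the closed form.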


Machine learning (ML) approach using scikit-learn.

A regression model is trained to price European calls around S=100, K=100 and T=1.

The Black-Scholes analytical solution is used to generate the synthetic training data.

In [46]:
# generate synthetic data in the vicinity of S=100, K=100, T=1, etc. and save it to disk
SK = np.arange(99, 101, 0.05)
T = np.arange(0.9, 1.1, 0.05)
r = np.arange(0.02, 0.05, 0.001)
σ = np.arange(0.1, 0.2, 0.01)
SS, KK, TT, rr, σσ = np.meshgrid(SK, SK, T, r, σ)
YY = blackscholes(SS, KK, TT, rr, σσ)
YY = YY[0]
rows = np.stack([z.ravel() for z in (YY, SS, KK, TT, rr, σσ)], axis=1)
np.savetxt('data.csv', rows, fmt='%.5f', delimiter=',', newline='\n', comments='', header='Call,S,K,T,r,sigma')

In [66]:
SK = np.arange(99, 101, 0.05)
T = np.arange(0.9, 1.1, 0.05)
r = np.arange(0.02, 0.05, 0.001)
σ = np.arange(0.1, 0.2, 0.01)
SS, KK, TT, rr, σσ = np.meshgrid(SK, SK, T, r, σ)
YY = blackscholes(SS, KK, TT, rr, σσ)

# YY is the tuple (call, put); keep the call prices only
Y_flat = YY[0].ravel()
S_flat = SS.ravel()
K_flat = KK.ravel()
T_flat = TT.ravel()
r_flat = rr.ravel()
σ_flat = σσ.ravel()

rows = np.column_stack((Y_flat, S_flat, K_flat, T_flat, r_flat, σ_flat))
np.savetxt('datacall.csv', rows, fmt='%.5f', delimiter=',', newline='\n', comments='', header='Call,S,K,T,r,sigma')
print("File datacall.csv saved successfully!")

File datacall.csv saved successfully!


In [67]:
# train an MLP on the data and save the trained model to disk
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from joblib import dump, load

df = pd.read_csv("datacall.csv", dtype=np.float32)
y = df.pop('Call').values
# scaling S and K values
df.S = df.S/100
df.K = df.K/100
X = df.values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
regr = MLPRegressor(random_state=42, max_iter=500).fit(X_train, y_train)
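regr.predict(X_test[:5])  # not displayed: only the cell's last expression is echoed
regr.score(X_test, y_test)  # R² on the held-out set, likewise not displayed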
dump(regr, 'mlpregressor.joblib')

['mlpregressor.joblib']
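The cell above computes a test score but does not display it. The sketch below shows one way to report held-out metrics explicitly. To stay self-contained and fast, it rebuilds a much coarser grid in memory instead of reading `datacall.csv`; the grid resolution and the helper name `bs_call` are assumptions of this sketch, so its metrics are not those of the model trained above.

```python
import numpy as np
from scipy.stats import norm
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call (hypothetical helper for this sketch)."""
    d1 = (np.log(S / K) + (r + sigma**2 / 2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

# Coarse grid over the same nominal ranges; resolution chosen only for speed
SK = np.linspace(99, 101, 7)
T = np.linspace(0.9, 1.1, 5)
r = np.linspace(0.02, 0.05, 7)
sigma = np.linspace(0.1, 0.2, 6)
SS, KK, TT, rr, ss = np.meshgrid(SK, SK, T, r, sigma)

y = bs_call(SS, KK, TT, rr, ss).ravel()
# Same feature scaling as in the notebook: S and K divided by 100
X = np.column_stack([SS.ravel() / 100, KK.ravel() / 100,
                     TT.ravel(), rr.ravel(), ss.ravel()])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
regr = MLPRegressor(random_state=42, max_iter=500).fit(X_train, y_train)

y_pred = regr.predict(X_test)
r2 = r2_score(y_test, y_pred)              # same quantity as regr.score(X_test, y_test)
mae = mean_absolute_error(y_test, y_pred)  # average absolute pricing error
print(f"R^2 = {r2:.4f}, MAE = {mae:.4f}")
```

The mean absolute error is expressed in price units, which makes it easier to judge than R² when deciding whether the surrogate is accurate enough.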

In [70]:
# create input data for inference
!echo "S,K,T,r,sigma" > input1.csv
!echo "100,100,1,0.05,0.1" >> input1.csv
!echo "100,100,1,0.05,0.11" >> input1.csv
!echo "100,100,1,0.05,0.12" >> input1.csv
!echo "99.8,100,1,0.05,0.11" >> input1.csv
!echo "99.8,105,1,0.02,0.2" >> input1.csv
!echo "95,125,0.5,0.02,0.2" >> input1.csv

In [71]:
# reload the model and run inference on new data
# ground truth is 6.8020, 7.1543, 7.5100, 7.0121, 6.2715, 0.1829
regr = load('mlpregressor.joblib')
df = pd.read_csv('input1.csv')
print(df)
df.S = df.S/100
df.K = df.K/100
regr.predict(df.values)

       S    K    T     r  sigma
0  100.0  100  1.0  0.05   0.10
1  100.0  100  1.0  0.05   0.11
2  100.0  100  1.0  0.05   0.12
3   99.8  100  1.0  0.05   0.11
4   99.8  105  1.0  0.02   0.20
5   95.0  125  0.5  0.02   0.20


array([ 6.73915706,  7.09936597,  7.45957488,  6.96970263,  6.11334984,
       -9.57393822])

# Conclusion
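Pretty good, except for the last one!<br>
Interpolation: the training data is a fine mesh, so inputs that fall inside it are priced well.<br>
Extrapolation: for S=95, K=125 there are no such points in the training data, so the model has to extrapolate and returns an unusable negative price.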
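
Since the model is only reliable inside the region it was trained on, one possible safeguard is to flag inputs that fall outside the training ranges before trusting a prediction. The sketch below is a minimal version of that idea. The bounds are the nominal ranges read off the `np.arange` calls used to build the grid, and treating them as closed intervals is an assumption; `TRAIN_BOUNDS` and `inside_training_range` are hypothetical names.

```python
import pandas as pd

# Nominal training ranges (unscaled), treated as closed intervals: an assumption
TRAIN_BOUNDS = {
    "S": (99.0, 101.0),
    "K": (99.0, 101.0),
    "T": (0.9, 1.1),
    "r": (0.02, 0.05),
    "sigma": (0.1, 0.2),
}

def inside_training_range(df, bounds=TRAIN_BOUNDS):
    """Return a boolean Series: True where every feature lies within its bounds."""
    ok = pd.Series(True, index=df.index)
    for col, (lo, hi) in bounds.items():
        ok &= df[col].between(lo, hi)  # inclusive on both ends
    return ok

# Same six rows as input1.csv
inputs = pd.DataFrame(
    [
        (100, 100, 1, 0.05, 0.10),
        (100, 100, 1, 0.05, 0.11),
        (100, 100, 1, 0.05, 0.12),
        (99.8, 100, 1, 0.05, 0.11),
        (99.8, 105, 1, 0.02, 0.20),
        (95, 125, 0.5, 0.02, 0.20),
    ],
    columns=["S", "K", "T", "r", "sigma"],
)

inside = inside_training_range(inputs)
print(inputs.assign(inside=inside))
```

With these bounds the last two rows are flagged as outside the training region. The row with K=105 is also an extrapolation, even though its prediction happened to land close to the reference value.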