# Theoretical Option Pricing

# 1. Introduction


Traditional option pricing models often simplify the dynamics of the underlying stock or hold only under restrictive assumptions.



### Dynamic Programming

In this project we approach theoretical option pricing as a dynamic programming problem. At each time step we observe a state $s_t \in \mathcal{S}$ containing information such as the underlying price, time until expiry, and interest rates. We can then take an action $a_t \in \mathcal{A}$, such as exercising early, adjusting a hedge, or doing nothing. A function $C(s_t, a_t)$ defines the cost or reward of taking an action, and $P(s_{t + dt} ~|~ s_t, a_t)$ gives the probability of transitioning from one state to another. The transition probability is typically independent of $a_t$ and corresponds to price movements of the underlying. Our goal is to find $V^*(s_t)$, the theoretical value of an option under the assumption that the holder of the contract takes the optimal actions during the lifetime of the contract. By the Bellman Optimality Principle, $V^*(s_t)$ can be defined recursively as:

$$
V^*(s_t) = 
    \max_{a_t \in \mathcal{A}} 
    \left\{
        C(s_t, a_t) + \int_{s_{t+dt} \in \mathcal{S}} P(s_{t + dt} ~|~ s_t, a_t) V^*(s_{t + dt}) ~ ds_{t + dt}
    \right\} 
$$
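
To make the recursion concrete, here is a minimal sketch of one Bellman backup on a finite state space, where the integral becomes a sum. The toy problem (three price levels, a strike of 2, a symmetric random walk, and an absorbing `"done"` state reached after exercising) is a hypothetical example chosen for illustration, not a contract from this project.

```python
def bellman_backup(states, actions, cost, transition, v_next):
    """Apply the Bellman recursion once on a finite state space.

    cost(s, a) is the immediate reward, transition(s, a) returns a dict
    {next_state: probability}, and v_next maps states to next-step values.
    """
    return {
        s: max(
            cost(s, a) + sum(p * v_next[s2] for s2, p in transition(s, a).items())
            for a in actions
        )
        for s in states
    }


# Hypothetical toy problem: prices 1..3, strike 2, exercise ends the contract.
K = 2
PRICES = [1, 2, 3]
STATES = PRICES + ["done"]
ACTIONS = ["hold", "exercise"]


def cost(s, a):
    if s == "done" or a == "hold":
        return 0.0
    return max(0.0, float(s - K))


def transition(s, a):
    if s == "done" or a == "exercise":
        return {"done": 1.0}
    probs = {}
    for step in (-1, 1):  # move down or up one level, clipped to the grid
        s2 = min(max(s + step, PRICES[0]), PRICES[-1])
        probs[s2] = probs.get(s2, 0.0) + 0.5
    return probs


v = {s: 0.0 for s in STATES}  # after expiry the contract is worthless
for _ in range(2):            # two exercise opportunities remaining
    v = bellman_backup(STATES, ACTIONS, cost, transition, v)
print(v)  # {1: 0.0, 2: 0.5, 3: 1.0, 'done': 0.0}
```

Each call moves one step further from expiry, so repeating the backup yields the value with more exercise opportunities remaining.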




### Approximate Dynamic Programming
To fight the curse of dimensionality inherent in financial problems, we will use approximate dynamic programming, which leverages machine learning techniques to approximate $V^*(s_t)$. The goal in Approximate Dynamic Programming is to parameterize the value function as $V_\theta(s_t)$ with parameters $\theta$ chosen to minimize the Temporal Difference Value Error, defined as the absolute difference between the predicted value of a state and the value obtained by applying the Bellman recursion to the current approximation:

$$
\overline{VE} = 
    \left|
    V_\theta(s_t) - 
    \max_{a_t \in \mathcal{A}} 
    \left\{
        C(s_t, a_t) + \int_{s_{t+dt} \in \mathcal{S}} P(s_{t + dt} ~|~ s_t, a_t) V_\theta(s_{t + dt}) ~ ds_{t + dt}
    \right\} 
    \right|
$$

We can then use this error to iteratively perform gradient descent or interpolation, converging towards the true value function.
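
As a rough illustration of the iterative fitting idea, the sketch below runs gradient steps on a toy exercise problem. Several things here are assumptions made for the example: the parameterization is the simplest possible one (one parameter per state, where a regression model would normally stand in), the step minimizes the squared error rather than the absolute error, the Bellman target is held fixed during each step (a semi-gradient update), and the price grid, strike, and random walk are hypothetical.

```python
PRICES = [1, 2, 3]  # hypothetical price grid
K = 2               # hypothetical strike
HORIZON = 2         # maximum number of steps remaining


def payoff(p):
    return max(0.0, float(p - K))


def next_prices(p):
    """Hypothetical random walk: up or down one level, clipped to the grid."""
    lo, hi = PRICES[0], PRICES[-1]
    return [(min(max(p + step, lo), hi), 0.5) for step in (-1, 1)]


def bellman_target(theta, p, tau):
    """Bellman recursion applied to the current approximation V_theta."""
    if tau == 0:
        return payoff(p)  # at expiry: exercise or let the option lapse
    cont = sum(prob * theta[(p2, tau - 1)] for p2, prob in next_prices(p))
    return max(payoff(p), cont)


def value_error(theta):
    """Largest absolute value error over all states."""
    return max(abs(theta[s] - bellman_target(theta, *s)) for s in theta)


def fit(theta, lr=0.5, sweeps=100):
    for _ in range(sweeps):
        for s in theta:
            # Semi-gradient step on 0.5 * (V_theta(s) - target)^2,
            # treating the target as a constant.
            theta[s] -= lr * (theta[s] - bellman_target(theta, *s))
    return theta


theta = {(p, tau): 0.0 for p in PRICES for tau in range(HORIZON + 1)}
print("error before:", value_error(theta))
fit(theta)
print("error after: ", value_error(theta))
```

With one parameter per state the iteration recovers the exact dynamic programming solution. With a smoother function approximator the same loop would only drive the error down as far as the model class allows.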



### Organization
The rest of this notebook is organized as follows:

- In Section 2 we will model our underlying asset's price dynamics using the Heston model.
- In Section 3 we will describe the algorithm we will use to approximate $V^*(s_t)$.
- In Section 4 we will provide dynamic programming formulations for the different contracts we will be considering.
- In Section 5 we will fit the models and visualize the results.

# 2. Modeling The Underlying Asset

### Heston Model
In this project we assume the S&P 500 follows the Heston model, a stochastic volatility model defined by the following stochastic differential equations:

$$ 
\begin{align}
dS_t &= \mu S_t ~ dt + \sqrt{V_t} S_t ~ dW_{S,t}
\\
dV_t &= \kappa (\theta - V_t) ~ dt + \sigma \sqrt{V_t} ~ dW_{V,t}
\end{align}
$$ 

where:
- $S_t$ represents the stock price at time $t$,
- $V_t$ denotes the stochastic variance process, so $\sqrt{V_t}$ is the volatility,
- $\mu$ is the drift rate,
- $\kappa$ is the mean-reversion rate of the variance,
- $\theta$ is the long-term average variance (not to be confused with the value function parameters $\theta$ in Section 1),
- $\sigma$ is the volatility of volatility,
- $dW_{S,t}$ and $dW_{V,t}$ are Wiener processes for the stock price and variance, respectively.
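
One way to get a feel for these dynamics is to simulate them. The following is a minimal sketch using an Euler-Maruyama discretization, under assumptions not stated in the text: the two Wiener processes are drawn independently, the variance is floored at zero inside the square root (often called full truncation) because the discretized variance can otherwise go negative, and all parameter values are hypothetical.

```python
import math
import random


def simulate_heston(s0, v0, mu, kappa, theta, sigma, dt, n_steps, rng):
    """Euler-Maruyama path of (S_t, V_t); returns two lists of length n_steps + 1."""
    s, v = [s0], [v0]
    for _ in range(n_steps):
        v_pos = max(v[-1], 0.0)  # floor the variance so sqrt is well defined
        dw_s = rng.gauss(0.0, math.sqrt(dt))
        dw_v = rng.gauss(0.0, math.sqrt(dt))  # assumed independent of dw_s
        s.append(s[-1] + mu * s[-1] * dt + math.sqrt(v_pos) * s[-1] * dw_s)
        v.append(v[-1] + kappa * (theta - v_pos) * dt + sigma * math.sqrt(v_pos) * dw_v)
    return s, v


# Hypothetical parameters: one year of daily steps.
rng = random.Random(0)
prices, variances = simulate_heston(
    s0=100.0, v0=0.04, mu=0.05, kappa=2.0, theta=0.04, sigma=0.3,
    dt=1 / 252, n_steps=252, rng=rng,
)
print(len(prices), len(variances))  # 253 253
```

Passing the random generator in explicitly keeps the paths reproducible, which is convenient when comparing fitted value functions across runs.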


### Calculating Price Transition Probabilities
<!-- The probability of observing the price and volatility sequences $\mathbf{S} = \{S_1, S_2, \dots, S_T\}$ and $\mathbf{V} = \{V_1, V_2 \dots, V_T\}$ is:

$$
\begin{align*}
\mathbb{P}(\mathbf{S}, \mathbf{V}) &= \prod_{t = 1}^{T - 1} \mathbb{P}(S_{t + 1}, V_{t + 1} ~|~ S_t, V_t)
% \\\\
% &= \sum_{t = 1}^{T - 1} \log \bigg(\mathbb{P}(S_{t + 1}, V_{t + 1} ~|~ S_t, V_t)\bigg)
\end{align*}
$$ -->

From the definition of the Heston model, an Euler discretization with a unit time step shows that $S_{t + 1}$ and $V_{t + 1}$ are approximately sampled from the following normal distributions, written in terms of their mean and standard deviation:
$$
S_{t + 1} \sim \mathcal{N}(S_t + \mu S_t, \sqrt{V_t} S_t) ~~~\text{and}~~~ V_{t + 1} \sim \mathcal{N}(V_t + \kappa(\theta - V_t), \sigma\sqrt{V_t})
$$

Therefore, treating the two Wiener processes as independent, we can express $\mathbb{P}(S_{t + 1}, V_{t + 1} ~|~ S_t, V_t)$ in terms of $\mu, \kappa, \theta, \sigma, S_t, V_t$ as the product of the two normal densities:

$$
\mathbb{P}(S_{t + 1}, V_{t + 1} ~|~ S_t, V_t) = 
\left( 
    \frac{
        \exp\left(
            -\frac{
                \left(S_{t + 1} - (S_t + \mu S_t)\right)^2
            }{
                2(\sqrt{V_t}S_t)^2
            }
        \right)
    }{
        \sqrt{2 \pi (\sqrt{V_t}S_t)^2}
    } 
\right)
\left( 
    \frac{
        \exp\left(
            -\frac{
                \left(V_{t + 1} - (V_t + \kappa(\theta - V_t))\right)^2
            }{
                2(\sigma\sqrt{V_t})^2
            }
        \right)
    }{
        \sqrt{2 \pi (\sigma\sqrt{V_t})^2}
    } 
\right)
$$
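
A sketch of how this transition density might be evaluated numerically follows. It assumes each factor is a Gaussian density from a unit-step Euler discretization, with mean $S_t + \mu S_t$ and variance $V_t S_t^2$ for the price, and mean $V_t + \kappa(\theta - V_t)$ and variance $\sigma^2 V_t$ for the variance process, and that $V_t > 0$. Working in log space is a choice made here to avoid underflow when many transitions are multiplied together; the function names are hypothetical.

```python
import math


def normal_logpdf(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def heston_transition_logpdf(s_next, v_next, s, v, mu, kappa, theta, sigma):
    """Log of P(S_{t+1}, V_{t+1} | S_t, V_t) for a unit time step, assuming v > 0."""
    log_p_s = normal_logpdf(s_next, s + mu * s, v * s ** 2)
    log_p_v = normal_logpdf(v_next, v + kappa * (theta - v), sigma ** 2 * v)
    return log_p_s + log_p_v  # product of densities becomes a sum of logs


# Hypothetical values: evaluate the density at the mean of both distributions.
log_p = heston_transition_logpdf(
    s_next=100.0, v_next=0.04, s=100.0, v=0.04,
    mu=0.0, kappa=1.0, theta=0.04, sigma=0.5,
)
print(math.exp(log_p))
```

Summing these log densities over consecutive observations gives the log-likelihood of an observed price and variance sequence.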


<!-- Now we can simply use gradient descent to find $\mu$, $\kappa$, $\theta$, and $\sigma$ that maximize $\mathbb{P}(\mathbf{S}, \mathbf{V})$. -->

# 4. Option Contracts

## American Call Option

#### Dynamic Programming Formulation

Given a strike price $K$, our option's payoff from exercising at time $t$ is $\max\{0, S_t - K\}$. Furthermore, since this is an American call option, we can exercise at any time up to expiry. Therefore at each time step we have the choice of exercising and receiving a payoff of $\max\{0, S_t - K\}$, or not exercising. If we don't exercise, the underlying stock moves to a new price $S_{t + dt}$ with probability $P(S_{t + dt} ~|~ S_t)$ and we face a new problem of whether or not to exercise, now with less time remaining. Writing $T$ for the time remaining until expiry, this recursive relationship can be modeled with the following dynamic program:

$$
V(S_t, T) = \max\left\{ \max\{0, S_t - K\}, \int_{S_\text{min}}^{S_\text{max}}V(S_{t + dt}, T - dt) P(S_{t + dt} ~|~ S_t) ~ dS_{t + dt} \right\}
$$

At expiry no continuation is possible, so the recursion terminates with $V(S_t, 0) = \max\{0, S_t - K\}$.
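
The recursion can be sketched numerically by backward induction on a price grid, replacing the integral with a weighted sum. The sketch below makes several simplifying assumptions that are not part of the formulation above: the transition density is a Gaussian with a constant volatility `vol` (rather than the full stochastic volatility dynamics), the weights are renormalized on the truncated grid $[S_\text{min}, S_\text{max}]$, and there is no discounting, matching the recursion as written. The function name and all parameter values are hypothetical.

```python
import math


def american_call_values(strike, grid, n_steps, mu, vol, dt):
    """Backward induction on a price grid.

    Returns values[k][i]: the option value with k steps remaining at grid[i].
    """
    payoff = [max(0.0, s - strike) for s in grid]
    # Transition weights P(S_{t+dt} | S_t), normalized so each row sums to 1.
    weights = []
    for s in grid:
        mean, std = s + mu * s * dt, vol * s * math.sqrt(dt)
        row = [math.exp(-0.5 * ((s2 - mean) / std) ** 2) for s2 in grid]
        total = sum(row)
        weights.append([w / total for w in row])
    values = [payoff]  # zero steps remaining: exercise or expire worthless
    for _ in range(n_steps):
        prev = values[-1]
        # Expected value of holding for one more step.
        cont = [sum(w * v for w, v in zip(row, prev)) for row in weights]
        values.append([max(p, c) for p, c in zip(payoff, cont)])
    return values


# Hypothetical setup: prices 50..150, strike 100, ten weekly steps.
grid = [float(s) for s in range(50, 151)]
values = american_call_values(strike=100.0, grid=grid, n_steps=10,
                              mu=0.0, vol=0.2, dt=1 / 52)
at_the_money = grid.index(100.0)
print(values[10][at_the_money])  # value with ten steps remaining at S = 100
```

Values near the edges of the grid are distorted by the truncation, so in practice the grid would be chosen wide enough that the prices of interest sit well inside it.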
