In [2]:
# --- Imports ---
import numpy as np
import pandas as pd
import altair as alt

# Basics of Options Pricing
For our final project, we will be visualizing how Monte Carlo simulations are used to price options.

We'll begin with some explanations. An option is a financial contract that gives the buyer the *right* (but not the obligation, hence the name "option") to buy or sell an asset at a specific price, the **strike price**, on or before the option's **expiration date**. There are two main kinds of options: **call options**, which let you buy an asset at the strike price, and **put options**, which let you sell an asset at the strike price.

For example, let's say you want to buy a call option with a strike price of $110 that expires in 1 month on Pepsi stock, which is currently worth $100. How would you price this option? If you pay $10 for the option, you are hoping that Pepsi stock will be worth more than $120 in 1 month, so that buying it at $110 and selling it at the market price earns back more than the $10 you paid for the option.

Say we buy this option for $10. If Pepsi stock goes up to $135 in 1 month, you can use the option to buy it at $110 and then sell it for $135, netting you a profit of $135 (what you sell it for) - $110 (the strike price you buy at) - $10 (the price of the option) = $15. At expiration the option itself pays off $25, which is more than the $10 it cost, so it was a good deal!

However, assume that Pepsi stock goes down to $90 in 1 month. If you bought the option, you would not use it, because you could just buy Pepsi stock for $90 - there's no reason to buy it at the option's strike price of $110. In this case, the option is worthless, and you would have lost the $10 you paid for it.

Therefore, we only want to buy an option if we think that *on average* its payoff will exceed the price we pay for it.
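
The Pepsi arithmetic above can be sketched in a few lines of Python. The function names `call_payoff` and `call_profit` are made up for this illustration, and the scenario probabilities (25% and 75%) are purely hypothetical numbers chosen to show what "on average" means; they are not estimates of anything.

```python
def call_payoff(final_price, strike):
    """Payoff of a call at expiration: exercise only if it is worth it."""
    return max(final_price - strike, 0.0)


def call_profit(final_price, strike, premium):
    """Payoff minus the price (premium) paid for the option."""
    return call_payoff(final_price, strike) - premium


# The two scenarios from the text: stock ends at $135 or at $90.
profit_up = call_profit(135.0, 110.0, 10.0)
profit_down = call_profit(90.0, 110.0, 10.0)
print(profit_up)    # 15.0
print(profit_down)  # -10.0

# Hypothetical probabilities for the two scenarios (assumed, not estimated).
prob_up, prob_down = 0.25, 0.75
expected_profit = prob_up * profit_up + prob_down * profit_down
print(expected_profit)  # -3.75
```

With these made-up probabilities the option loses money on average, even though one scenario is profitable. Estimating that average when there are many more than two scenarios is the job of the Monte Carlo simulation.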

<!-- TODO: Should this be a visualization of how this can go? -->

# Monte Carlo Simulation with Metropolis-Hastings

TODO: Make it less jargon-y

In our Pepsi example, once we know where the stock ends up, it's easy to see what the option pays: $25 if the stock reaches $135, for a profit of $15. However, in the real world, we don't know the future stock price, so we don't know the exact value of the option. Instead, we can use a Monte Carlo simulation to estimate the value of the option.

Let's start with a simple coin tossing example to demonstrate how a Monte Carlo simulation works.

Say we have a coin that has an unknown probability $p$ of landing heads. An intuitive way to estimate $p$ is to flip the coin many times and take the fraction of flips that land heads. At its core, this is what a Monte Carlo simulation does: it repeatedly draws random samples from a probability distribution and then takes the average of a function evaluated at those samples.

Let's demonstrate this now using code. For demonstration purposes, we'll begin by randomly generating a probability $p$ of landing heads, between 0 and 1 - in the real world, we wouldn't know what $p$ is!

In [3]:
# Uncomment the next line for reproducible results.
# np.random.seed(42)

# Randomly generate the true probability of landing heads.
true_p = np.random.uniform(0, 1)

print(f"True probability p: {true_p:.4f}")

True probability p: 0.0809


From here, we take this coin and toss it $N$ times, and count the number of heads and tails. Each toss is a draw from a Binomial distribution with a single trial (a Bernoulli draw), which we simulate below:

In [5]:
# Number of times to toss our coin.
N = 1000

# Define random variable and compute heads and tails
coin_flips = np.random.binomial(1, true_p, size=N)
num_heads = np.sum(coin_flips)
num_tails = N - num_heads
print("Heads:", num_heads, "Tails:", num_tails)
print(f"Empirical estimate of p: {num_heads / N:.4f}")
print(f"True probability p: {true_p:.4f}")

# Define the target distribution (unnormalized posterior):
# posterior is proportional to p^(num_heads) * (1 - p)^(num_tails), assuming a uniform prior
def target(p):
    if p <= 0 or p >= 1:
        return 0
    return (p ** num_heads) * ((1 - p) ** num_tails)


Heads: 78 Tails: 922
Empirical estimate of p: 0.0780
True probability p: 0.0809
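
One practical caveat about `target` as defined above: it multiplies many numbers smaller than 1, so for larger $N$ the product can underflow to exactly 0 in floating point. A common workaround is to work with the logarithm of the target instead. The sketch below is an illustration only; `log_target` and the example counts (5,000 heads and 5,000 tails) are hypothetical and not part of the notebook.

```python
import numpy as np


def target_direct(p, heads, tails):
    """Unnormalized posterior, computed directly (can underflow)."""
    if p <= 0 or p >= 1:
        return 0.0
    return (p ** heads) * ((1 - p) ** tails)


def log_target(p, heads, tails):
    """Log of the same unnormalized posterior; stays finite for large counts."""
    if p <= 0 or p >= 1:
        return -np.inf
    return heads * np.log(p) + tails * np.log(1 - p)


# Hypothetical counts, large enough to trigger underflow.
heads, tails = 5000, 5000
direct_value = target_direct(0.5, heads, tails)
log_value = log_target(0.5, heads, tails)
print(direct_value)  # the direct product underflows to 0.0
print(log_value)     # the log version is a finite negative number
```

When using the log version, a ratio of two target values becomes a difference of two log values.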


However, in this case estimating $p$ is easy: we can flip the coin ourselves and count. What if we couldn't sample directly, or were dealing with a more complex set of variables? In that case, we can use the Metropolis-Hastings algorithm, which draws samples from a distribution that we can only evaluate up to a constant. Here that distribution describes how plausible each value of $p$ is given the flips we saw; in the case of options, the quantity of interest would be something like the probability of the stock price at expiration being above the strike price.

Metropolis-Hastings relies on an accept/reject step (similar in spirit to rejection sampling, but not the same algorithm). Starting from a current guess, it proposes a new candidate by adding normally distributed noise to that guess, then compares the value of the *target* distribution (the unnormalized distribution we want to sample from) at the candidate and at the current guess. If the candidate is more plausible than the current guess, it is always accepted; otherwise it is accepted with probability equal to the ratio of the two target values. If the candidate is rejected, the chain stays where it is. For example, if a proposal lands on a number greater than 1 when we are estimating the probability of heads, we reject it, because a probability cannot be greater than 1 and so the target there is 0.
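
In symbols, the accept step implemented in the code below can be written as follows. This assumes a symmetric proposal (the normal random-walk step used here); the notation $\pi$ for the unnormalized target, $p'$ for the proposed value, and $h$, $t$ for the numbers of heads and tails is introduced only for this illustration.

$$
\alpha(p \to p') = \min\left(1, \frac{\pi(p')}{\pi(p)}\right), \qquad \pi(p) \propto p^{h}(1 - p)^{t} \quad \text{for } 0 < p < 1
$$

The proposal $p'$ is accepted with probability $\alpha$; otherwise the chain repeats its current value $p$. Because only the ratio $\pi(p') / \pi(p)$ appears, the normalizing constant of the target never needs to be computed.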

A visualization of this is shown below:

In [6]:

# Apply Metropolis-Hastings algorithm.
# TODO: Turn into function.
# TODO: Add sliders for the visualization to make it interactive.
num_iterations = 100
proposal_std = 0.5  # Increase to allow broader exploration
chain = []

current_p = 0.5  # Starting guess
for i in range(num_iterations):
    proposed_p = current_p + np.random.normal(0, proposal_std)
    # Clip the candidate to [0, 1]; target() is 0 at both endpoints,
    # so a clipped candidate is always rejected below.
    proposed_p = np.clip(proposed_p, 0, 1)

    current_target = target(current_p)
    proposed_target = target(proposed_p)

    # Avoid dividing by zero: if current_target == 0, accept any
    # candidate with a positive target and reject everything else.
    if current_target == 0:
        acceptance_prob = 1 if proposed_target > 0 else 0
    else:
        acceptance_prob = min(1, proposed_target / current_target)

    if np.random.rand() < acceptance_prob:
        current_p = proposed_p

    chain.append(current_p)

chain = np.array(chain)

# Compute running mean at each step
# TODO: Visualize this as a stochastic process, i.e. an animation of some kind.
running_mean = np.cumsum(chain) / (np.arange(num_iterations) + 1)

# Prepare DataFrame
df = pd.DataFrame({
    'Iteration': np.arange(num_iterations),
    'Chain Value': chain,
    'Running Mean': running_mean
})

# TODO: Turn into function as well
# Altair visualization of the data.
# Gray line for raw chain, including rejections
base = alt.Chart(df).encode(
    x=alt.X('Iteration:Q', title='Iteration')
)

chain_line = base.mark_line(
    color='lightgray',
    strokeWidth=1
).encode(
    y=alt.Y('Chain Value:Q', 
            title='Parameter Value',
            scale=alt.Scale(domain=[0, 1]))  # Force y-axis [0,1]
)

# Blue line to track running mean
running_mean_line = base.mark_line(
    color='blue',
    strokeWidth=2
).encode(
    y='Running Mean:Q'
)

# Red horizontal rule for the true probability of the coin
true_rule = alt.Chart(pd.DataFrame({'y': [true_p]})).mark_rule(
    color='red',
    strokeWidth=2
).encode(
    y='y:Q'
)

# Combine all the layers together
final_chart = alt.layer(
    chain_line,
    running_mean_line,
    true_rule
).properties(
    width=600,
    height=400,
    title=f'Convergence of Estimated p (True p = {true_p:.4f})'
)

final_chart.display()
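
As a sanity check on the chart, the coin example has a known closed-form answer: with a uniform prior, the posterior over $p$ is a Beta distribution with parameters (heads + 1, tails + 1). The sketch below uses the counts printed earlier (78 heads, 922 tails); the variable names are chosen for this illustration so they do not overwrite the notebook's own.

```python
import numpy as np

# Counts taken from the output printed earlier in the notebook.
heads, tails = 78, 922

# Uniform prior + binomial likelihood gives a Beta(heads + 1, tails + 1) posterior.
a, b = heads + 1, tails + 1
posterior_mean = a / (a + b)
posterior_std = np.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

print(f"Analytic posterior mean: {posterior_mean:.4f}")
print(f"Analytic posterior std:  {posterior_std:.4f}")
```

A long, well-mixed chain should have a running mean close to this analytic mean. With only 100 iterations and a proposal standard deviation of 0.5, which is much wider than the posterior, most proposals are rejected, so the running mean may still be far from it.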


TODO: Introduce notion of "burn-in".

TODO: Introduce Monte Carlo simulation of *expected* value of the option, not just the probability it will return a profit or not.

TODO: Gradually introduce different variables for the user to interact with: number of path simulations, number of iterations, volatility, strike price, drift, etc.

Number of Path Simulations: The number of path simulations in a Monte Carlo simulation is the number of possible futures we explore. With one path, we map out only one possible outcome; with two paths, we look at two possible futures, and so on. The more paths we simulate, the more data we have to analyze, and the more complete a picture we have when predicting the future. The law of large numbers makes this precise: as the number of independent samples grows, their average converges to the true mean of whatever we are sampling. So, in our example, the more paths we have, the better our estimate of the stock's expected future value will be.

Number of Iterations: The number of iterations is how many steps each chain takes to refine its estimate. This is loosely analogous to time to expiration in options pricing: more time allows for greater price fluctuations, just as more iterations allow the chain to explore more of the distribution. With few iterations, our estimate is rough, like how an option close to expiration has limited opportunities for price movement. With more iterations, we get a more accurate picture of the distribution, just as a longer time to expiration gives the stock price more chances to affect an option's value.
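
To make the effect of the number of paths concrete, here is a minimal sketch. It assumes, purely for illustration, that the stock follows geometric Brownian motion, a model this notebook has not introduced yet. The function name `simulate_call_payoffs` and every parameter value (drift, volatility, 21 daily steps) are hypothetical, and the sketch ignores discounting and risk-neutral pricing, so the numbers are not real option prices.

```python
import numpy as np


def simulate_call_payoffs(n_paths, s0=100.0, strike=110.0, drift=0.05,
                          volatility=0.2, years=1 / 12, n_steps=21, rng=None):
    """Simulate stock paths (assumed GBM) and return the call payoff of each."""
    rng = np.random.default_rng() if rng is None else rng
    dt = years / n_steps
    # One standard normal shock per path per time step.
    shocks = rng.normal(0.0, 1.0, size=(n_paths, n_steps))
    log_returns = (drift - 0.5 * volatility**2) * dt + volatility * np.sqrt(dt) * shocks
    final_prices = s0 * np.exp(log_returns.sum(axis=1))
    return np.maximum(final_prices - strike, 0.0)


rng = np.random.default_rng(0)
results = []
for n_paths in (1_000, 10_000, 100_000):
    payoffs = simulate_call_payoffs(n_paths, rng=rng)
    estimate = payoffs.mean()
    # Standard error of the mean shrinks roughly like 1 / sqrt(n_paths).
    std_error = payoffs.std(ddof=1) / np.sqrt(n_paths)
    results.append((n_paths, estimate, std_error))
    print(f"{n_paths:>7} paths: mean payoff = {estimate:.4f} +/- {std_error:.4f}")
```

The standard error column is the point of the example: each tenfold increase in the number of paths should cut it by roughly a factor of three.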
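
The effect of the number of iterations, and of discarding the early part of a chain (often called burn-in), can be sketched as follows. This is an illustration rather than the notebook's implementation: it uses a log-space target, a narrower proposal (`proposal_std=0.05`, an assumed value), and the counts 78 heads and 922 tails from the earlier output. The names `log_target` and `run_chain` are made up for this sketch.

```python
import numpy as np


def log_target(p, heads, tails):
    """Log of the unnormalized posterior under a uniform prior."""
    if p <= 0 or p >= 1:
        return -np.inf
    return heads * np.log(p) + tails * np.log(1 - p)


def run_chain(n_iterations, heads, tails, proposal_std=0.05, start=0.5, seed=0):
    """Random-walk Metropolis-Hastings chain for the coin's probability of heads."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_iterations)
    current = start
    current_lt = log_target(current, heads, tails)
    for i in range(n_iterations):
        proposed = current + rng.normal(0.0, proposal_std)
        proposed_lt = log_target(proposed, heads, tails)
        # Accept with probability min(1, target ratio), compared in log space.
        if np.log(rng.random()) < proposed_lt - current_lt:
            current, current_lt = proposed, proposed_lt
        chain[i] = current
    return chain


heads, tails = 78, 922
chain = run_chain(5_000, heads, tails)
burn_in = 500  # number of early samples to discard (an assumed value)

print(f"Mean of full chain:     {chain.mean():.4f}")
print(f"Mean after burn-in:     {chain[burn_in:].mean():.4f}")
print(f"Analytic posterior mean: {(heads + 1) / (heads + tails + 2):.4f}")
```

The early samples still reflect the arbitrary starting point of 0.5, which is why dropping them usually gives an estimate closer to the analytic posterior mean.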

In [15]:
import numpy as np
import pandas as pd
import altair as alt

# Ensure Altair renders properly in Jupyter
alt.renderers.enable('default')

# Function to compute target probability (posterior distribution)
def target(p, num_heads, num_tails):
    if p <= 0 or p >= 1:
        return 0
    return (p ** num_heads) * ((1 - p) ** num_tails)

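# Function to run Metropolis-Hastings algorithm for multiple chains.
# Note: reads num_heads and num_tails from the notebook's global scope,
# so the coin flip data below must be generated before this is called.
def metropolis_hastings(num_iterations, proposal_std, num_chains):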
    chains = []

    for _ in range(num_chains):
        current_p = 0.5  # Start at 0.5
        chain = []

        for _ in range(num_iterations):
            proposed_p = current_p + np.random.normal(0, proposal_std)
            proposed_p = np.clip(proposed_p, 0, 1)  # Ensure it's in [0,1]

            current_target = target(current_p, num_heads, num_tails)
            proposed_target = target(proposed_p, num_heads, num_tails)

            if current_target == 0:
                acceptance_prob = 1 if proposed_target > 0 else 0
            else:
                acceptance_prob = min(1, proposed_target / current_target)

            if np.random.rand() < acceptance_prob:
                current_p = proposed_p

            chain.append(current_p)

        chains.append(chain)

    return np.array(chains)

# Function to plot results
def plot_chains(num_iterations, proposal_std, num_chains):
    chains = metropolis_hastings(num_iterations, proposal_std, num_chains)

    # Convert to DataFrame for visualization
    df = pd.DataFrame({
        'Iteration': np.tile(np.arange(num_iterations), num_chains),
        'Chain Value': chains.flatten(),
        'Chain': np.repeat(np.arange(1, num_chains + 1), num_iterations)
    })

    # Altair visualization
    chart = alt.Chart(df).mark_line().encode(
        x='Iteration:Q',
        y=alt.Y('Chain Value:Q', title='Parameter Value', scale=alt.Scale(domain=[0, 1])),
        color='Chain:N'
    ).properties(
        width=600,
        height=400,
        title=f'Metropolis-Hastings Chains (Iterations={num_iterations}, Chains={num_chains})'
    )

    return chart

# Generate coin flip data
np.random.seed(42)
true_p = np.random.uniform(0, 1)  # Random true probability
N = 1000
coin_flips = np.random.binomial(1, true_p, size=N)
num_heads = np.sum(coin_flips)
num_tails = N - num_heads

# Set parameters
# TODO: Make these parameters interactive in the user interface.
num_iterations = 50 # Customize number of iterations
proposal_std = 0.5
num_chains = 25  # Customize number of path simulations

# Display the result
plot_chains(num_iterations, proposal_std, num_chains).display()
