
# Binomial Tree Control-Variate Analysis

This notebook extends the base binomial tree pricer to incorporate finite-difference Greeks, Black--Scholes control variates, and efficiency studies for American put options.



## Objectives

1. Build a Cox--Ross--Rubinstein (CRR) binomial tree and recover delta and gamma by finite differences: delta from the two nodes after the first step, gamma from the three nodes (down, middle, up) after the second step.
2. Produce simultaneous European and American valuations for the same option specification.
3. Compute Black--Scholes benchmark values along with tree-derived values, Greeks, and control-variate adjustments.
4. Study convergence behaviour for two American puts under the parameter set $S_0=50$, $T=0.1$, $r=0.1$, $q=0.02$, $\sigma=0.3$.
5. Compare raw versus control-variate errors, and quantify the computational savings achievable with control variates.


In [1]:

import math
from dataclasses import dataclass
from typing import Literal

import numpy as np
import pandas as pd
from scipy.stats import norm



### Black--Scholes helpers

We implement closed-form European option analytics that deliver value, delta, and gamma. These serve as control variates for the tree.
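
For reference, the quantities these helpers evaluate can be written as follows, for spot $S_0$, strike $K$, and continuous dividend yield $q$. The symbols $N(\cdot)$ and $\varphi(\cdot)$ for the standard normal CDF and PDF are notation introduced here; the code uses `norm.cdf` and `norm.pdf`.

$$
d_1 = \frac{\ln(S_0/K) + \left(r - q + \tfrac{1}{2}\sigma^2\right)T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T}
$$

$$
C = S_0 e^{-qT} N(d_1) - K e^{-rT} N(d_2), \qquad P = K e^{-rT} N(-d_2) - S_0 e^{-qT} N(-d_1)
$$

$$
\Delta_C = e^{-qT} N(d_1), \qquad \Delta_P = e^{-qT}\left(N(d_1) - 1\right), \qquad \Gamma = \frac{e^{-qT}\varphi(d_1)}{S_0\,\sigma\sqrt{T}}
$$

Gamma is the same for the call and the put, which is why the code computes it once outside the `if` branch.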


In [2]:

OptionType = Literal["call", "put"]


def _bs_d1_d2(spot: float, strike: float, rate: float, dividend: float, sigma: float, time: float) -> tuple[float, float]:
    if time <= 0:
        raise ValueError("Time to maturity must be positive for Black--Scholes greeks.")
    sqrt_t = math.sqrt(time)
    d1 = (math.log(spot / strike) + (rate - dividend + 0.5 * sigma ** 2) * time) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return d1, d2


def black_scholes_value_delta_gamma(
    option_type: OptionType,
    spot: float,
    strike: float,
    rate: float,
    dividend: float,
    sigma: float,
    time: float,
) -> tuple[float, float, float]:
    '''Return price, delta, and gamma for a European call or put.'''
    if time <= 0:
        intrinsic = max(0.0, spot - strike) if option_type == "call" else max(0.0, strike - spot)
        if option_type == "call":
            delta = 1.0 if spot > strike else 0.0
        else:
            delta = -1.0 if strike > spot else 0.0
        gamma = 0.0
        return intrinsic, delta, gamma

    d1, d2 = _bs_d1_d2(spot, strike, rate, dividend, sigma, time)
    disc_q = math.exp(-dividend * time)
    disc_r = math.exp(-rate * time)
    if option_type == "call":
        price = disc_q * spot * norm.cdf(d1) - disc_r * strike * norm.cdf(d2)
        delta = disc_q * norm.cdf(d1)
    else:
        price = disc_r * strike * norm.cdf(-d2) - disc_q * spot * norm.cdf(-d1)
        delta = disc_q * (norm.cdf(d1) - 1.0)
    gamma = disc_q * norm.pdf(d1) / (spot * sigma * math.sqrt(time))
    return price, delta, gamma
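
As a sanity check on closed-form helpers like the ones above, European call and put prices should satisfy put-call parity, $C - P = S_0 e^{-qT} - K e^{-rT}$. The sketch below is self-contained and takes nothing from the notebook's namespace: `norm_cdf` and `bs_price` are illustrative stand-ins that use `math.erf` in place of `scipy.stats.norm`, and the strike 50 is added alongside the study strikes purely as an extra test point.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function (standard library only)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(option_type: str, spot: float, strike: float, rate: float,
             dividend: float, sigma: float, time: float) -> float:
    """European Black-Scholes price with a continuous dividend yield."""
    sqrt_t = math.sqrt(time)
    d1 = (math.log(spot / strike)
          + (rate - dividend + 0.5 * sigma ** 2) * time) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    disc_q = math.exp(-dividend * time)
    disc_r = math.exp(-rate * time)
    if option_type == "call":
        return disc_q * spot * norm_cdf(d1) - disc_r * strike * norm_cdf(d2)
    return disc_r * strike * norm_cdf(-d2) - disc_q * spot * norm_cdf(-d1)


# Market inputs from the notebook's parameter set.
S0, r, q, sigma, T = 50.0, 0.1, 0.02, 0.3, 0.1

parity_gaps = {}
for K in (46.0, 50.0, 53.0):
    call = bs_price("call", S0, K, r, q, sigma, T)
    put = bs_price("put", S0, K, r, q, sigma, T)
    # Right-hand side of put-call parity: discounted spot minus discounted strike.
    forward_leg = S0 * math.exp(-q * T) - K * math.exp(-r * T)
    parity_gaps[K] = (call - put) - forward_leg
    print(f"K={K:g}: parity gap = {parity_gaps[K]:.2e}")
```

Gaps at the level of floating-point round-off indicate that the call and put branches are mutually consistent; the check says nothing about whether a shared input such as `d1` is itself correct.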



### Binomial tree with finite-difference Greeks

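The tree function records full value layers for both European and American exercise policies. Delta is estimated from the two nodes after the first step and gamma from the three nodes (down, middle, up) after the second step, in both cases by finite differences. The control-variate estimate adds the tree's European error (Black--Scholes value minus tree European value) to the tree's American output, and likewise for delta and gamma.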
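
To make the construction concrete, here is a minimal, European-only sketch of the same idea in plain Python. It is not the notebook's implementation: `crr_parameters` and `european_put_greeks` are hypothetical helper names, and lists stand in for NumPy arrays. Delta is the slope between the two nodes after one step; gamma is the change in slope across the three nodes after two steps.

```python
import math


def crr_parameters(rate, dividend, sigma, time, steps):
    """Return (u, d, p, disc) for a CRR tree with a continuous dividend yield."""
    dt = time / steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    growth = math.exp((rate - dividend) * dt)
    p = (growth - d) / (u - d)          # risk-neutral up probability
    disc = math.exp(-rate * dt)         # one-step discount factor
    return u, d, p, disc


def european_put_greeks(spot, strike, rate, dividend, sigma, time, steps):
    """Price a European put and read delta and gamma off the first two steps."""
    if steps < 2:
        raise ValueError("Need at least two steps for a gamma estimate.")
    u, d, p, disc = crr_parameters(rate, dividend, sigma, time, steps)

    # Terminal payoffs, ordered from the lowest to the highest stock price.
    values = [max(strike - spot * u ** j * d ** (steps - j), 0.0)
              for j in range(steps + 1)]
    layers = {steps: values}
    for level in range(steps - 1, -1, -1):
        values = [disc * (p * values[j + 1] + (1.0 - p) * values[j])
                  for j in range(level + 1)]
        layers[level] = values

    # Delta: slope between the down and up nodes after one step.
    s_d, s_u = spot * d, spot * u
    delta = (layers[1][1] - layers[1][0]) / (s_u - s_d)

    # Gamma: difference of the two slopes after two steps, divided by
    # half the distance between the outer nodes.
    s_dd, s_mid, s_uu = spot * d * d, spot * u * d, spot * u * u
    slope_up = (layers[2][2] - layers[2][1]) / (s_uu - s_mid)
    slope_down = (layers[2][1] - layers[2][0]) / (s_mid - s_dd)
    gamma = 2.0 * (slope_up - slope_down) / (s_uu - s_dd)
    return layers[0][0], delta, gamma


u, d, p, disc = crr_parameters(0.1, 0.02, 0.3, 0.1, 5)
value, delta, gamma = european_put_greeks(50.0, 50.0, 0.1, 0.02, 0.3, 0.1, 5)
print(f"u={u:.6f}, d={d:.6f}, p={p:.6f}")
print(f"value={value:.6f}, delta={delta:.6f}, gamma={gamma:.6f}")
```

Note that both estimates are slopes measured one or two time steps after the valuation date, so they carry a small time offset that shrinks as the step count grows.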


In [3]:

@dataclass
class BinomialTreeOutputs:
    bs_euro_value: float
    bs_euro_delta: float
    bs_euro_gamma: float
    euro_value: float
    euro_delta: float
    euro_gamma: float
    american_value: float
    american_delta: float
    american_gamma: float
    cv_american_value: float
    cv_american_delta: float
    cv_american_gamma: float


def payoff(option_type: OptionType, spots: np.ndarray, strike: float) -> np.ndarray:
    if option_type == "call":
        return np.maximum(spots - strike, 0.0)
    return np.maximum(strike - spots, 0.0)


def crr_binomial_tree(
    option_type: OptionType,
    spot: float,
    strike: float,
    rate: float,
    dividend: float,
    sigma: float,
    time: float,
    steps: int,
) -> BinomialTreeOutputs:
    if steps < 2:
        raise ValueError("Need at least two steps to recover delta and gamma from the tree.")

    dt = time / steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    growth = math.exp((rate - dividend) * dt)
    p = (growth - d) / (u - d)
    if not (0.0 <= p <= 1.0):
        raise ValueError("Risk-neutral probability out of bounds; adjust parameters or step count.")
    disc = math.exp(-rate * dt)

    stock_levels = [
        spot * (u ** np.arange(level + 1)) * (d ** (level - np.arange(level + 1)))
        for level in range(steps + 1)
    ]

    euro_layers: list[np.ndarray] = [None] * (steps + 1)
    american_layers: list[np.ndarray] = [None] * (steps + 1)

    terminal_payoff = payoff(option_type, stock_levels[-1], strike)
    euro_values = terminal_payoff.copy()
    american_values = terminal_payoff.copy()
    euro_layers[-1] = euro_values.copy()
    american_layers[-1] = american_values.copy()

    for level in range(steps - 1, -1, -1):
        euro_values = disc * (p * euro_values[1:] + (1.0 - p) * euro_values[:-1])
        american_values = disc * (p * american_values[1:] + (1.0 - p) * american_values[:-1])
        exercise = payoff(option_type, stock_levels[level], strike)
        american_values = np.maximum(american_values, exercise)
        euro_layers[level] = euro_values.copy()
        american_layers[level] = american_values.copy()

    def finite_difference_delta(layer_values: np.ndarray, spots: np.ndarray) -> float:
        return (layer_values[1] - layer_values[0]) / (spots[1] - spots[0])

    def finite_difference_gamma(layer2_values: np.ndarray, spots: np.ndarray) -> float:
        up, mid, down = layer2_values[2], layer2_values[1], layer2_values[0]
        s_up, s_mid, s_down = spots[2], spots[1], spots[0]
        term_up = (up - mid) / (s_up - s_mid)
        term_down = (mid - down) / (s_mid - s_down)
        return 2.0 * (term_up - term_down) / (s_up - s_down)

    euro_delta = finite_difference_delta(euro_layers[1], stock_levels[1])
    american_delta = finite_difference_delta(american_layers[1], stock_levels[1])
    euro_gamma = finite_difference_gamma(euro_layers[2], stock_levels[2])
    american_gamma = finite_difference_gamma(american_layers[2], stock_levels[2])

    euro_value = float(euro_layers[0][0])
    american_value = float(american_layers[0][0])

    bs_value, bs_delta, bs_gamma = black_scholes_value_delta_gamma(
        option_type, spot, strike, rate, dividend, sigma, time
    )

    cv_value = american_value + (bs_value - euro_value)
    cv_delta = american_delta + (bs_delta - euro_delta)
    cv_gamma = american_gamma + (bs_gamma - euro_gamma)

    return BinomialTreeOutputs(
        bs_value,
        bs_delta,
        bs_gamma,
        euro_value,
        euro_delta,
        euro_gamma,
        american_value,
        american_delta,
        american_gamma,
        cv_value,
        cv_delta,
        cv_gamma,
    )



### Quick functionality check

We confirm that a five-step tree for an at-the-money put returns the 12 requested values.


In [4]:

test_outputs = crr_binomial_tree(
    option_type="put",
    spot=50.0,
    strike=50.0,
    rate=0.1,
    dividend=0.02,
    sigma=0.3,
    time=0.1,
    steps=5,
)
test_outputs


BinomialTreeOutputs(bs_euro_value=np.float64(1.688211698219881), bs_euro_delta=np.float64(-0.44669216409920925), bs_euro_gamma=np.float64(0.08321091554006842), euro_value=1.782778575654369, euro_delta=np.float64(-0.4512184598663575), euro_gamma=np.float64(0.08776105182654041), american_value=1.8168735168819756, american_delta=np.float64(-0.46297114091035735), american_gamma=np.float64(0.0914245796586076), cv_american_value=np.float64(1.7223066394474875), cv_american_delta=np.float64(-0.4584448451432091), cv_american_gamma=np.float64(0.08687444337213561))
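
The three control-variate fields in this output follow from the other nine: the tree's European error against Black--Scholes is added back to the American estimate. The short check below redoes that arithmetic on the printed numbers; the dictionary names are illustrative and not part of the notebook.

```python
# Values copied from the five-step quick-check output above.
bs_euro = {
    "value": 1.688211698219881,
    "delta": -0.44669216409920925,
    "gamma": 0.08321091554006842,
}
tree_euro = {
    "value": 1.782778575654369,
    "delta": -0.4512184598663575,
    "gamma": 0.08776105182654041,
}
tree_american = {
    "value": 1.8168735168819756,
    "delta": -0.46297114091035735,
    "gamma": 0.0914245796586076,
}
printed_cv = {
    "value": 1.7223066394474875,
    "delta": -0.4584448451432091,
    "gamma": 0.08687444337213561,
}

# Control-variate estimate: American tree output plus the European tree's
# known error (Black-Scholes minus tree European).
cv = {key: tree_american[key] + (bs_euro[key] - tree_euro[key]) for key in bs_euro}

for key in cv:
    correction = bs_euro[key] - tree_euro[key]
    print(f"{key}: correction={correction:+.6f}, cv={cv[key]:.6f}")
```

With only five steps the European tree overprices the put relative to Black--Scholes, so the correction pulls the American value down.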


## Parameter set for the study

We now specialise to the requested market inputs and to two put strikes, 46 and 53.


In [5]:

S0 = 50.0
r = 0.1
q = 0.02
sigma = 0.3
T = 0.1
strikes = [46.0, 53.0]
option_type = "put"



## High-accuracy proxies from a 1,000-step tree

Control-variate outputs from a 1,000-step tree serve as numerical stand-ins $(X_V, X_\Delta, X_\Gamma)$ for the exact American values, deltas, and gammas.


In [6]:

proxy_records = []
for K in strikes:
    outputs = crr_binomial_tree(option_type, S0, K, r, q, sigma, T, steps=1000)
    proxy_records.append(
        {
            "Strike": K,
            "X_V": outputs.cv_american_value,
            "X_Delta": outputs.cv_american_delta,
            "X_Gamma": outputs.cv_american_gamma,
        }
    )
proxy_df = pd.DataFrame(proxy_records)
proxy_df


   Strike       X_V   X_Delta   X_Gamma
0    46.0  0.408881 -0.158255  0.051440
1    53.0  3.578539 -0.711137  0.081705



## Baseline errors from large uncorrected trees

We measure the absolute error of the raw American tree outputs against the 1,000-step proxies for $N \in \{100, 105, 110, 120\}$. Averaging these errors over the four step counts, separately for each strike, defines $E_V^H$, $E_\Delta^H$, and $E_\Gamma^H$.


In [7]:

large_steps = [100, 105, 110, 120]
big_tree_rows = []
for K in strikes:
    proxy_row = proxy_df.loc[proxy_df["Strike"] == K].squeeze()
    for N in large_steps:
        outputs = crr_binomial_tree(option_type, S0, K, r, q, sigma, T, steps=N)
        big_tree_rows.append(
            {
                "Strike": K,
                "Steps": N,
                "Value_Error": abs(outputs.american_value - proxy_row["X_V"]),
                "Delta_Error": abs(outputs.american_delta - proxy_row["X_Delta"]),
                "Gamma_Error": abs(outputs.american_gamma - proxy_row["X_Gamma"]),
            }
        )
large_error_df = pd.DataFrame(big_tree_rows)
avg_errors_df = (
    large_error_df.groupby("Strike")[["Value_Error", "Delta_Error", "Gamma_Error"]]
    .mean()
    .rename(columns={
        "Value_Error": "E_V_H",
        "Delta_Error": "E_Delta_H",
        "Gamma_Error": "E_Gamma_H",
    })
)
large_error_df, avg_errors_df


(   Strike  Steps  Value_Error  Delta_Error  Gamma_Error
 0    46.0    100     0.001917     0.000123     0.000145
 1    46.0    105     0.002693     0.000715     0.000182
 2    46.0    110     0.001981     0.000137     0.000123
 3    46.0    120     0.000403     0.000140     0.000124
 4    53.0    100     0.000763     0.000853     0.000312
 5    53.0    105     0.002754     0.000103     0.000140
 6    53.0    110     0.001669     0.000273     0.000176
 7    53.0    120     0.002701     0.000017     0.000123,
            E_V_H  E_Delta_H  E_Gamma_H
 Strike                                
 46.0    0.001749   0.000279   0.000143
 53.0    0.001972   0.000311   0.000188)
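
As a worked check on the averages, write $V_A^{(N)}$ for the raw American tree value with $N$ steps (notation introduced here, not used in the code). The value threshold for each strike is then

$$
E_V^H = \frac{1}{4} \sum_{N \in \{100, 105, 110, 120\}} \left| V_A^{(N)} - X_V \right|
$$

and the rounded entries in the first table reproduce the reported means up to rounding:

$$
K = 46:\ \frac{0.001917 + 0.002693 + 0.001981 + 0.000403}{4} \approx 0.001749, \qquad
K = 53:\ \frac{0.000763 + 0.002754 + 0.001669 + 0.002701}{4} \approx 0.001972
$$

$E_\Delta^H$ and $E_\Gamma^H$ are defined the same way from the delta and gamma error columns.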


## Control-variate efficiency search

For each strike and metric we sweep $N$ from 5 to 120 and take the smallest $N$ whose control-variate error is no larger than the corresponding $E^H$. The savings factor is the mean of the large step counts (108.75) divided by that $N$.


In [8]:

mean_large_steps = np.mean(large_steps)
search_steps = range(5, 121)
results = []
for K in strikes:
    thresholds = avg_errors_df.loc[K]
    proxy_row = proxy_df.loc[proxy_df["Strike"] == K].squeeze()
    for metric, threshold_label, proxy_label in [
        ("Value", "E_V_H", "X_V"),
        ("Delta", "E_Delta_H", "X_Delta"),
        ("Gamma", "E_Gamma_H", "X_Gamma"),
    ]:
        target_error = thresholds[threshold_label]
        best_N = None
        best_error = None
        for N in search_steps:
            outputs = crr_binomial_tree(option_type, S0, K, r, q, sigma, T, steps=N)
            if metric == "Value":
                candidate = outputs.cv_american_value
                benchmark = proxy_row[proxy_label]
            elif metric == "Delta":
                candidate = outputs.cv_american_delta
                benchmark = proxy_row[proxy_label]
            else:
                candidate = outputs.cv_american_gamma
                benchmark = proxy_row[proxy_label]
            err = abs(candidate - benchmark)
            if err <= target_error:
                best_N = N
                best_error = err
                break
        if best_N is None:
            # No N in the sweep met the threshold: report NaN rather than a savings factor for an unmet target.
            best_N = float("nan")
            best_error = float("nan")
        savings = mean_large_steps / best_N
        results.append(
            {
                "Strike": K,
                "Metric": metric,
                "Threshold": target_error,
                "Best_Steps": best_N,
                "CV_Error": best_error,
                "Savings_Factor": savings,
            }
        )
results_df = pd.DataFrame(results)
results_df


   Strike Metric  Threshold  Best_Steps  CV_Error  Savings_Factor
0    46.0  Value   0.001749           5  0.000899       21.750000
1    46.0  Delta   0.000279          12  0.000053        9.062500
2    46.0  Gamma   0.000143           5  0.000028       21.750000
3    53.0  Value   0.001972          11  0.000165        9.886364
4    53.0  Delta   0.000311           6  0.000109       18.125000
5    53.0  Gamma   0.000188          11  0.000075        9.886364
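
The savings factors in this table are ratios of step counts. Since the number of nodes in a recombining tree grows roughly with the square of the step count, the saving in node evaluations is larger. The sketch below recomputes the step-count ratios from the table and adds a node-count ratio. Treating node count as a proxy for work is an assumption made here, not something the notebook measures, and `node_count` is an illustrative helper.

```python
# Step counts taken from the notebook: the four large uncorrected trees and
# the first control-variate step count that met each threshold.
large_steps = [100, 105, 110, 120]
best_steps = {
    (46.0, "Value"): 5,
    (46.0, "Delta"): 12,
    (46.0, "Gamma"): 5,
    (53.0, "Value"): 11,
    (53.0, "Delta"): 6,
    (53.0, "Gamma"): 11,
}


def node_count(steps: int) -> int:
    """Number of nodes in a recombining binomial tree with `steps` steps."""
    return (steps + 1) * (steps + 2) // 2


mean_large_steps = sum(large_steps) / len(large_steps)
mean_large_nodes = sum(node_count(n) for n in large_steps) / len(large_steps)

step_savings = {}
node_savings = {}
for key, n in best_steps.items():
    step_savings[key] = mean_large_steps / n                 # as in the table
    node_savings[key] = mean_large_nodes / node_count(n)     # assumed work proxy
    strike, metric = key
    print(f"K={strike:g} {metric:<5}: steps x{step_savings[key]:.2f}, "
          f"nodes x{node_savings[key]:.1f}")
```

The step-count column should match `Savings_Factor` above; the node-count column is only a rough indication of run-time savings, since per-node cost and fixed overheads are ignored.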



## Discussion

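* Control variates reduce the required tree depth substantially. Each savings factor above is the ratio of the mean large step count (108.75) to the smallest control-variate step count that matches the accuracy of the large uncorrected trees.
* The gains range from about 9-fold to about 22-fold in step count, and no metric is uniformly cheapest: for $K=46$, value and gamma need only 5 steps while delta needs 12; for $K=53$, delta needs 6 steps while value and gamma need 11.
* In both cases, the out-of-the-money put ($K=46$) and the in-the-money put ($K=53$), the control-variate Greeks built from the finite differences at the first two time steps met the raw-tree error thresholds with at most 12 steps. Because the search stops at the first $N$ that meets a threshold, and the raw-tree errors above are not monotone in $N$, these step counts are best read as indicative rather than as guaranteed bounds.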
