# Delta Method Visualization

An interactive applet showing why $g(\bar{X}_n)$ is approximately normal when $\bar{X}_n$ is approximately normal and $g$ is smooth.

**Data context:** California earthquakes of magnitude ≥ 4.0 (295 mainshocks over 15 years, mean interarrival ≈ 18.6 days).

**Key insights to convey:**
1. As $n$ increases, $\bar{X}_n$ becomes more concentrated and more normal (CLT)
2. As $\bar{X}_n$ concentrates, the function $g$ looks increasingly linear over the relevant range
3. A linear transformation of a normal is still normal
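
Written out, the three steps form the usual first-order delta method. The sketch below assumes $g$ is differentiable at $\mu = E[X]$ with $g'(\mu) \neq 0$, and writes $\sigma^2 = \mathrm{Var}(X)$; the symbols $\mu$ and $\sigma$ are notation added here for illustration.

$$
\bar{X}_n \overset{\text{approx.}}{\sim} N\!\left(\mu, \frac{\sigma^2}{n}\right),
\qquad
g(\bar{X}_n) \approx g(\mu) + g'(\mu)\,(\bar{X}_n - \mu)
\quad\Longrightarrow\quad
g(\bar{X}_n) \overset{\text{approx.}}{\sim} N\!\left(g(\mu), \frac{g'(\mu)^2\,\sigma^2}{n}\right)
$$

For exponential interarrival times $\sigma = \mu$, so the approximate standard deviation of $g(\bar{X}_n)$ is $|g'(\mu)|\,\mu/\sqrt{n}$.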

**Color coding:**
- **Blue (solid)**: Data and true distributions (Gamma for $\bar{X}_n$, transformed Gamma for $g(\bar{X}_n)$)
- **Red (dashed)**: Approximations (CLT normal, tangent line, delta method normal)

**Hover over curves** to see full descriptions.

**Functions available:**
- **Rate estimator**: $g(x) = 1/x$, which estimates $\lambda$ because $E[X] = 1/\lambda$ (nonlinear, convex)
- **Quantile estimator**: $g(x) = -\log(1-p) \cdot x$, which estimates the $p$-th quantile (linear!)
- **Probability estimator**: $g(x) = 1 - e^{-t/x}$, which estimates $P(X \leq t)$, e.g., the chance of an earthquake within $t$ days (nonlinear, decreasing in $x$ with an inflection point at $x = t/2$)
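
As a quick sanity check on the three functions above, the following sketch compares each analytic derivative with a central finite difference at the true mean and reports the resulting delta-method standard deviation at $n = 295$. It uses only the standard library so it can run outside the notebook; the helper names (`make_estimators`, `central_difference`, `delta_method_sd`) are hypothetical and chosen for illustration, and the defaults $p = 0.5$ and $t = 7$ mirror the widget defaults.

```python
import math

# Mean interarrival time from the data context: 15 years * 365 days / 295 events.
MU = 15 * 365 / 295


def make_estimators(p=0.5, t=7.0):
    """Return {name: (g, g_prime)} for the three estimators listed above."""
    c = -math.log(1 - p)
    return {
        "rate": (lambda x: 1 / x, lambda x: -1 / x**2),
        "quantile": (lambda x: c * x, lambda x: c),
        "prob": (lambda x: 1 - math.exp(-t / x),
                 lambda x: -(t / x**2) * math.exp(-t / x)),
    }


def central_difference(g, x, h=1e-5):
    """Numerical derivative of g at x, used only to sanity-check the formulas."""
    return (g(x + h) - g(x - h)) / (2 * h)


def delta_method_sd(g_prime, mu, n):
    """Delta-method SD of g(X-bar) for exponential data, where sigma = mu."""
    return abs(g_prime(mu)) * mu / math.sqrt(n)


for name, (g, g_prime) in make_estimators().items():
    print(f"{name:8s} g'(mu) = {g_prime(MU):+.6f}   "
          f"finite difference = {central_difference(g, MU):+.6f}   "
          f"delta-method SD (n=295) = {delta_method_sd(g_prime, MU, 295):.5f}")
```

If an analytic derivative were mistyped, the two derivative columns would disagree, which makes this a cheap check to run before trusting the red dashed curves.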

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
from scipy import stats
import ipywidgets as widgets
from IPython.display import display
import mplcursors

%matplotlib widget

In [None]:
# True parameters from California earthquake data
# 295 mainshocks of magnitude >= 4.0 over 15 years (Jan 16, 2010 - Jan 16, 2025)
N_EARTHQUAKES = 295
N_YEARS = 15
N_DAYS = N_YEARS * 365  # 5475 days (approximation: ignores the 4 leap days in this span)

TRUE_MEAN = N_DAYS / N_EARTHQUAKES  # Mean interarrival time ≈ 18.56 days
TRUE_LAMBDA = 1 / TRUE_MEAN          # Rate ≈ 0.054 per day

# Number of Monte Carlo replicates (samples drawn from the true model) for histograms
N_REPLICATES = 1000

# Number of individual points to trace through the function
N_TRACE_POINTS = 4

# Colors
COLOR_DATA = 'steelblue'       # Data, true function, true distributions
COLOR_APPROX = 'firebrick'     # Approximations (CLT, tangent, delta method)

def g_rate(x):
    """Rate estimator: g(x) = 1/x"""
    return 1 / x

def g_rate_deriv(x):
    """Derivative of rate estimator: g'(x) = -1/x^2"""
    return -1 / x**2

def g_quantile(x, p=0.5):
    """Quantile estimator: g(x) = -log(1-p) * x

    Since lambda_hat = 1/x_bar, the estimated p-th quantile is:
    Q_hat = -log(1-p) / lambda_hat = -log(1-p) * x_bar
    """
    return -np.log(1 - p) * x

def g_quantile_deriv(x, p=0.5):
    """Derivative of quantile estimator: g'(x) = -log(1-p) (constant)"""
    return -np.log(1 - p)

def g_prob(x, t=7):
    """Probability estimator: g(x) = 1 - exp(-t/x)
    
    Estimates P(X <= t) where X ~ Exp(lambda) and lambda_hat = 1/x_bar.
    P_hat = 1 - exp(-lambda_hat * t) = 1 - exp(-t / x_bar)
    """
    return 1 - np.exp(-t / x)

def g_prob_deriv(x, t=7):
    """Derivative of probability estimator: g'(x) = -(t/x^2) * exp(-t/x)"""
    return -(t / x**2) * np.exp(-t / x)

In [None]:
def generate_sample_means(n, n_replicates=N_REPLICATES):
    """
    Generate sample means from Exponential(TRUE_LAMBDA).
    Each replicate: draw n observations, compute mean.
    """
    samples = np.random.exponential(scale=1/TRUE_LAMBDA, size=(n_replicates, n))
    return samples.mean(axis=1)


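def stratified_sample(xbar_samples, n_points=4):
    """
    Take a stratified random sample of points from the distribution.
    One point from each of n_points equal-width percentile bands between the
    10th and 90th percentiles (for n_points=4: 10-30%, 30-50%, 50-70%, 70-90%).
    """
    # Build the bands from n_points so the argument is actually honored
    edges = np.linspace(10, 90, n_points + 1)
    strata = list(zip(edges[:-1], edges[1:]))
    selected = []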
    
    for lo_pct, hi_pct in strata:
        lo_val = np.percentile(xbar_samples, lo_pct)
        hi_val = np.percentile(xbar_samples, hi_pct)
        
        # Find points in this range
        in_range = xbar_samples[(xbar_samples >= lo_val) & (xbar_samples <= hi_val)]
        
        if len(in_range) > 0:
            selected.append(np.random.choice(in_range))
        else:
            # Fallback: use the midpoint percentile
            selected.append(np.percentile(xbar_samples, (lo_pct + hi_pct) / 2))
    
    return np.array(selected)

In [None]:
def create_delta_method_plot(n, g_func, g_deriv, g_name, g_formula, true_input, true_output, fig, 
                              estimator_type='rate', prob_t=None, seed=None):
    """
    Create the delta method visualization with:
    - Main panel: function g(x) with tangent line
    - Bottom: histogram of X-bar with true Gamma and CLT normal
    - Right: histogram of g(X-bar) with true transformed density and delta method normal
    - Trace lines showing how individual points map through g
    
    Color coding:
    - Blue solid: data and true distributions
    - Red dashed: approximations (CLT, tangent, delta method)
    
    estimator_type: 'rate', 'quantile', or 'prob' - determines how to compute true distribution
    prob_t: time parameter for probability estimator (required if estimator_type='prob')
    """
    if seed is not None:
        np.random.seed(seed)
    
    fig.clear()
    
    # Create grid: main plot, bottom histogram, right histogram
    gs = GridSpec(2, 2, width_ratios=[4, 1], height_ratios=[4, 1],
                  hspace=0.05, wspace=0.05, figure=fig)
    
    ax_main = fig.add_subplot(gs[0, 0])   # Main function plot
    ax_bottom = fig.add_subplot(gs[1, 0], sharex=ax_main)  # X-bar histogram
    ax_right = fig.add_subplot(gs[0, 1], sharey=ax_main)   # g(X-bar) histogram
    
    # Generate all sample means
    xbar_samples = generate_sample_means(n)
    g_samples = g_func(xbar_samples)
    
    # Pick stratified random sample of points to trace
    trace_xbars = stratified_sample(xbar_samples, N_TRACE_POINTS)
    trace_g = g_func(trace_xbars)
    
    # Theoretical values for CLT approximation
    # SD of Xbar is sigma / sqrt(n); for exponential data sigma equals the mean
    xbar_std = TRUE_MEAN / np.sqrt(n)
    
    # True distribution parameters for X-bar
    # X_i ~ Exp(rate=lambda), so X_i ~ Exp(scale=1/lambda)
    # Sum of n X_i ~ Gamma(shape=n, scale=1/lambda)
    # X_bar = Sum/n ~ Gamma(shape=n, scale=1/(n*lambda))
    gamma_shape = n
    gamma_scale = 1 / (n * TRUE_LAMBDA)
    
    # Determine plot ranges
    spread = max(3.5 * xbar_std, 0.3 * TRUE_MEAN)
    x_min = max(0.2, true_input - spread)
    x_max = true_input + spread
    
    # Handle both increasing and decreasing g
    y_at_xmin = g_func(x_min)
    y_at_xmax = g_func(x_max)
    y_min = min(y_at_xmin, y_at_xmax)
    y_max = max(y_at_xmin, y_at_xmax)
    y_pad = 0.1 * (y_max - y_min)
    y_min -= y_pad
    y_max += y_pad
    
    # Store line objects for tooltips
    tooltip_lines = []
    tooltip_labels = []
    
    # --- Main panel: the function g(x) ---
    x_grid = np.linspace(x_min, x_max, 300)
    y_grid = g_func(x_grid)
    
    # Plot the curve (BLUE SOLID - true function)
    line_true, = ax_main.plot(x_grid, y_grid, '-', color=COLOR_DATA, linewidth=2.5, zorder=2)
    tooltip_lines.append(line_true)
    tooltip_labels.append(f'True function: {g_formula}')
    
    # Plot tangent line at true mean (RED DASHED - approximation)
    slope = g_deriv(true_input)
    tangent_y = true_output + slope * (x_grid - true_input)
    line_tangent, = ax_main.plot(x_grid, tangent_y, '--', color=COLOR_APPROX, linewidth=2, 
                                  alpha=0.9, zorder=1)
    tooltip_lines.append(line_tangent)
    tooltip_labels.append('Linear approximation (tangent at true mean)')
    
    # Mark the true point
    ax_main.plot(true_input, true_output, 'ko', markersize=8, zorder=5)
    
    # Trace individual points through the function
    trace_colors = plt.cm.Blues(np.linspace(0.4, 0.8, N_TRACE_POINTS))
    
    for i, (x_pt, y_pt, color) in enumerate(zip(trace_xbars, trace_g, trace_colors)):
        # Point on bottom axis
        ax_main.plot(x_pt, y_min, 'o', color=color, markersize=10, 
                     clip_on=False, zorder=10)
        # Vertical line up to the curve
        ax_main.plot([x_pt, x_pt], [y_min, y_pt], '-', color=color, 
                     linewidth=1.5, alpha=0.7, zorder=3)
        # Horizontal line to right axis
        ax_main.plot([x_pt, x_max], [y_pt, y_pt], '-', color=color, 
                     linewidth=1.5, alpha=0.7, zorder=3)
        # Point on right axis
        ax_main.plot(x_max, y_pt, 'o', color=color, markersize=10, 
                     clip_on=False, zorder=10)
    
    ax_main.set_xlim(x_min, x_max)
    ax_main.set_ylim(y_min, y_max)
    ax_main.set_ylabel(g_name, fontsize=12)
    ax_main.tick_params(labelbottom=False)
    
    # --- Bottom panel: histogram of X-bar ---
    bins_x = np.linspace(x_min, x_max, 50)
    
    # Histogram (BLUE - data)
    ax_bottom.hist(xbar_samples, bins=bins_x, density=True, 
                   color=COLOR_DATA, alpha=0.5, edgecolor='white', linewidth=0.5)
    
    # True Gamma density (BLUE SOLID)
    gamma_pdf = stats.gamma.pdf(x_grid, a=gamma_shape, scale=gamma_scale)
    line_gamma, = ax_bottom.plot(x_grid, gamma_pdf, '-', color=COLOR_DATA, linewidth=2)
    tooltip_lines.append(line_gamma)
    tooltip_labels.append(f'True distribution: Gamma(shape={n}, scale=1/(n·λ))')
    
    # CLT Normal approximation (RED DASHED)
    normal_pdf = stats.norm.pdf(x_grid, true_input, xbar_std)
    line_clt, = ax_bottom.plot(x_grid, normal_pdf, '--', color=COLOR_APPROX, linewidth=2)
    tooltip_lines.append(line_clt)
    tooltip_labels.append('CLT approximation: Normal(μ, σ²/n)')
    
    # Mark true mean
    ax_bottom.axvline(true_input, color='black', linestyle='--', linewidth=1.5, alpha=0.5)
    
    # Mark the trace points
    for x_pt, color in zip(trace_xbars, trace_colors):
        ax_bottom.axvline(x_pt, color=color, linewidth=2, alpha=0.7)
    
    ax_bottom.set_xlabel(r'$\bar{X}_n$', fontsize=12)
    ax_bottom.set_ylabel('Density', fontsize=10)
    ax_bottom.set_xlim(x_min, x_max)
    ax_bottom.invert_yaxis()
    
    # --- Right panel: histogram of g(X-bar) ---
    bins_y = np.linspace(y_min, y_max, 50)
    
    # Histogram (BLUE - data)
    ax_right.hist(g_samples, bins=bins_y, density=True, orientation='horizontal',
                  color=COLOR_DATA, alpha=0.5, edgecolor='white', linewidth=0.5)
    
    # True distribution of g(X-bar) via change of variables
    # Use relative padding (1% of range) instead of absolute padding
    y_range = y_max - y_min
    y_grid_hist = np.linspace(y_min + 0.01 * y_range, y_max - 0.01 * y_range, 200)
    
    if estimator_type == 'quantile':
        # For g(x) = c*x (increasing linear), if X ~ Gamma(a, s), then g(X) = c*X ~ Gamma(a, c*s)
        # If g(x) = c*x, then x = y/c and |dx/dy| = 1/c
        c = g_deriv(true_input)  # constant derivative = -log(1-p)
        xbar_from_y = y_grid_hist / c
        valid = xbar_from_y > 0
        true_g_pdf = np.zeros_like(y_grid_hist)
        true_g_pdf[valid] = stats.gamma.pdf(xbar_from_y[valid], a=gamma_shape, scale=gamma_scale) / c
    elif estimator_type == 'prob':
        # For g(x) = 1 - exp(-t/x), the inverse is x = -t / log(1-y)
        # Jacobian: |dx/dy| = t / [(1-y) * log(1-y)^2]
        t = prob_t
        # Restrict to valid range (0, 1) for probability
        y_grid_prob = np.linspace(max(y_min, 0.001), min(y_max, 0.999), 200)
        log_1_minus_y = np.log(1 - y_grid_prob)
        xbar_from_y = -t / log_1_minus_y
        jacobian = t / ((1 - y_grid_prob) * log_1_minus_y**2)
        true_g_pdf_prob = stats.gamma.pdf(xbar_from_y, a=gamma_shape, scale=gamma_scale) * jacobian
        # Plot on the prob-specific grid
        line_true_g, = ax_right.plot(true_g_pdf_prob, y_grid_prob, '-', color=COLOR_DATA, linewidth=2)
        tooltip_lines.append(line_true_g)
        tooltip_labels.append(f'True distribution of {g_name} (transformed Gamma)')
    else:  # rate estimator
        # For g(x) = c/x (decreasing), use change of variables
        c = g_func(1.0)  # g(1) = c
        xbar_from_y = c / y_grid_hist
        valid = (xbar_from_y > 0) & (y_grid_hist > 0)
        true_g_pdf = np.zeros_like(y_grid_hist)
        true_g_pdf[valid] = stats.gamma.pdf(xbar_from_y[valid], a=gamma_shape, scale=gamma_scale) * c / (y_grid_hist[valid]**2)
    
    # Plot the true distribution (for non-prob cases; prob case plotted above)
    if estimator_type != 'prob':
        line_true_g, = ax_right.plot(true_g_pdf, y_grid_hist, '-', color=COLOR_DATA, linewidth=2)
        tooltip_lines.append(line_true_g)
        tooltip_labels.append(f'True distribution of {g_name} (transformed Gamma)')
    
    # Delta method Normal approximation (RED DASHED)
    g_std = abs(g_deriv(true_input)) * xbar_std
    normal_pdf_g = stats.norm.pdf(y_grid_hist, true_output, g_std)
    line_delta, = ax_right.plot(normal_pdf_g, y_grid_hist, '--', color=COLOR_APPROX, linewidth=2)
    tooltip_lines.append(line_delta)
    tooltip_labels.append('Delta method: CLT + linear approximation')
    
    # Mark true output
    ax_right.axhline(true_output, color='black', linestyle='--', linewidth=1.5, alpha=0.5)
    
    # Mark the trace points
    for y_pt, color in zip(trace_g, trace_colors):
        ax_right.axhline(y_pt, color=color, linewidth=2, alpha=0.7)
    
    ax_right.set_xlabel('Density', fontsize=10)
    ax_right.set_ylim(y_min, y_max)
    ax_right.tick_params(labelleft=False)
    
    # Title
    fig.suptitle(f'Delta Method Visualization (n = {n})', fontsize=14, fontweight='bold')
    
    # Add tooltips using mplcursors
    cursor = mplcursors.cursor(tooltip_lines, hover=True)
    
    @cursor.connect("add")
    def on_add(sel):
        idx = tooltip_lines.index(sel.artist)
        sel.annotation.set_text(tooltip_labels[idx])
        sel.annotation.get_bbox_patch().set(fc="white", alpha=0.9)
    
    fig.canvas.draw_idle()
    
    return cursor  # Return cursor to keep it alive

In [None]:
# Create the interactive widget

fig = plt.figure(figsize=(10, 8))

# Seed counter for regeneration
seed_counter = [42]
cursor_holder = [None]  # Keep cursor reference alive

def update_plot(n, estimator, quantile_p, prob_t):
    """Update the plot based on widget values."""
    if estimator == 'Rate (λ = 1/x̄)':
        g_func = g_rate
        g_deriv = g_rate_deriv
        g_name = r'$\hat{\lambda}$'
        g_formula = r'$g(x) = 1/x$'
        true_output = TRUE_LAMBDA
        estimator_type = 'rate'
        t_param = None
    elif estimator == 'Quantile':
        p = quantile_p
        g_func = lambda x: g_quantile(x, p)
        g_deriv = lambda x: g_quantile_deriv(x, p)
        g_name = fr'$\hat{{Q}}_{{{p:.1f}}}$'
        g_formula = fr'$g(x) = -\log({1-p:.1f}) \cdot x$'
        true_output = -np.log(1 - p) / TRUE_LAMBDA  # True quantile of Exp(λ)
        estimator_type = 'quantile'
        t_param = None
    else:  # Probability
        t = prob_t
        g_func = lambda x: g_prob(x, t)
        g_deriv = lambda x: g_prob_deriv(x, t)
        g_name = fr'$\hat{{P}}(X \leq {t:.0f})$'
        g_formula = fr'$g(x) = 1 - e^{{-{t:.0f}/x}}$'
        true_output = 1 - np.exp(-TRUE_LAMBDA * t)  # True probability
        estimator_type = 'prob'
        t_param = t
    
    cursor_holder[0] = create_delta_method_plot(
        n=n,
        g_func=g_func,
        g_deriv=g_deriv,
        g_name=g_name,
        g_formula=g_formula,
        true_input=TRUE_MEAN,
        true_output=true_output,
        fig=fig,
        estimator_type=estimator_type,
        prob_t=t_param,
        seed=seed_counter[0]
    )

def on_regenerate_click(b):
    seed_counter[0] = np.random.randint(0, 100000)
    update_plot(n_slider.value, estimator_dropdown.value, quantile_slider.value, prob_t_slider.value)

# Create widgets
n_slider = widgets.IntSlider(
    value=295, min=10, max=1000, step=10,
    description='Sample size n:',
    style={'description_width': '100px'},
    layout=widgets.Layout(width='400px'),
    continuous_update=False
)

estimator_dropdown = widgets.Dropdown(
    options=['Rate (λ = 1/x̄)', 'Quantile', 'Probability P(X ≤ t)'],
    value='Rate (λ = 1/x̄)',
    description='Estimator:',
    style={'description_width': '100px'}
)

quantile_slider = widgets.FloatSlider(
    value=0.5, min=0.1, max=0.9, step=0.1,
    description='Quantile p:',
    style={'description_width': '100px'},
    layout=widgets.Layout(width='300px'),
    continuous_update=False
)

prob_t_slider = widgets.FloatSlider(
    value=7, min=1, max=60, step=1,
    description='Time t (days):',
    style={'description_width': '100px'},
    layout=widgets.Layout(width='300px'),
    continuous_update=False
)

regenerate_button = widgets.Button(
    description='Regenerate Sample',
    button_style='primary',
    layout=widgets.Layout(width='150px')
)
regenerate_button.on_click(on_regenerate_click)

# Link widgets
widgets.interactive_output(
    update_plot,
    {'n': n_slider, 'estimator': estimator_dropdown, 'quantile_p': quantile_slider, 'prob_t': prob_t_slider}
)

# Layout
controls = widgets.VBox([
    widgets.HBox([n_slider, regenerate_button]),
    widgets.HBox([estimator_dropdown, quantile_slider, prob_t_slider])
])

display(controls)
plt.show()

## What to notice

**Color coding:**
- **Blue (solid)**: Data histograms and true distributions
  - Bottom: True Gamma distribution of $\bar{X}_n$
  - Right: True distribution of $g(\bar{X}_n)$ (transformed Gamma)
- **Red (dashed)**: All approximations
  - Bottom: CLT normal approximation
  - Main: Linear (tangent) approximation to $g$
  - Right: Delta method normal approximation
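
The "transformed Gamma" curves in the right panel follow from the change-of-variables formula for a monotone $g$. As a sketch, with $f_{\bar{X}}$ denoting the Gamma density of $\bar{X}_n$ and $c = -\log(1-p)$ (both are shorthand introduced here):

$$
f_{g(\bar{X})}(y) = f_{\bar{X}}\big(g^{-1}(y)\big)\,\left|\frac{d}{dy}\,g^{-1}(y)\right|
$$

$$
\begin{aligned}
\text{Rate:} \quad & g^{-1}(y) = \frac{1}{y}, & \left|\frac{dx}{dy}\right| &= \frac{1}{y^2} \\
\text{Quantile:} \quad & g^{-1}(y) = \frac{y}{c}, & \left|\frac{dx}{dy}\right| &= \frac{1}{c} \\
\text{Probability:} \quad & g^{-1}(y) = -\frac{t}{\log(1-y)}, & \left|\frac{dx}{dy}\right| &= \frac{t}{(1-y)\,\log^2(1-y)}
\end{aligned}
$$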

**Hover over any curve** to see its full description.

**Comparing the three estimators:**
- **Rate** ($g(x) = 1/x$): Nonlinear and convex, so the tangent line lies below the curve, creating a visible mismatch at small $n$
- **Quantile** ($g(x) = cx$): Linear, so the tangent is the curve and the linearization step is exact; the only remaining error comes from the CLT approximation
- **Probability** ($g(x) = 1 - e^{-t/x}$): Nonlinear and decreasing in $x$; it is convex where $x > t/2$ and concave where $x < t/2$, so the side on which the tangent misses the curve depends on $t$
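
To put a number on "how linear does $g$ look where the data lives", the sketch below measures the largest gap between each function and its tangent over $\mu \pm 2$ standard deviations of $\bar{X}_n$, scaled by the delta-method standard deviation. The helper names (`tangent_gaps`, `relative_gap`) and the choices of a $\pm 2$ SD window, $p = 0.5$, and $t = 7$ are assumptions made for illustration.

```python
import math

MU = 15 * 365 / 295  # mean interarrival time (days) from the data context


def tangent_gaps(g, g_prime, mu, n, width=2.0, grid=201):
    """Signed gaps g(x) - tangent(x) on a grid over mu +/- width * SD(X-bar)."""
    sd = mu / math.sqrt(n)  # SD of X-bar for exponential data (sigma = mu)
    lo, hi = mu - width * sd, mu + width * sd
    xs = [lo + (hi - lo) * i / (grid - 1) for i in range(grid)]
    return [g(x) - (g(mu) + g_prime(mu) * (x - mu)) for x in xs]


def relative_gap(g, g_prime, mu, n):
    """Largest |gap| divided by the delta-method SD of g(X-bar)."""
    sd_g = abs(g_prime(mu)) * mu / math.sqrt(n)
    return max(abs(d) for d in tangent_gaps(g, g_prime, mu, n)) / sd_g


c = -math.log(1 - 0.5)  # quantile constant for p = 0.5
t = 7.0                 # time horizon in days for the probability estimator
estimators = {
    "rate": (lambda x: 1 / x, lambda x: -1 / x**2),
    "quantile": (lambda x: c * x, lambda x: c),
    "prob": (lambda x: 1 - math.exp(-t / x),
             lambda x: -(t / x**2) * math.exp(-t / x)),
}

for name, (g, g_prime) in estimators.items():
    for n in (30, 295):
        print(f"{name:8s} n={n:3d}  max |gap| / SD = "
              f"{relative_gap(g, g_prime, MU, n):.3f}")
```

For $g(x) = 1/x$ the signed gap is never negative, meaning the tangent sits on or below the curve, and the scaled gap shrinks as $n$ grows. For the quantile estimator the gap is zero up to floating-point rounding.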

**Small n (e.g., n=30)**:
- The true Gamma (blue solid) is visibly right-skewed
- The CLT normal (red dashed) is symmetric, so it misses that skew
- For the nonlinear estimators, $g$ bends visibly over the range of $\bar{X}_n$, so the red tangent departs from the blue curve
- The delta method normal (red dashed) is a visibly poor fit to the true distribution (blue solid)

**Large n (e.g., n=295 or more)**:
- The true Gamma is nearly symmetric, close to normal
- The CLT approximation matches the true Gamma well
- The blue curve and red tangent nearly coincide over the data range
- The delta method normal matches the true distribution well
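
The shrinking skew can be checked numerically. A Gamma distribution with shape $n$ has skewness $2/\sqrt{n}$, which is about 0.37 at $n = 30$ and about 0.12 at $n = 295$. The sketch below compares that formula with a simulation that draws $\bar{X}_n$ directly from its Gamma distribution using the standard library; the function names, replicate count, and seed are illustrative assumptions.

```python
import math
import random

MU = 15 * 365 / 295  # mean interarrival time (days) from the data context


def theoretical_skewness(n):
    """Skewness of a Gamma distribution with shape n, which is 2 / sqrt(n)."""
    return 2 / math.sqrt(n)


def simulated_skewness(n, replicates=20000, seed=0):
    """Sample skewness of X-bar, drawn directly as Gamma(shape=n, scale=MU/n)."""
    rng = random.Random(seed)
    draws = [rng.gammavariate(n, MU / n) for _ in range(replicates)]
    mean = sum(draws) / replicates
    m2 = sum((x - mean) ** 2 for x in draws) / replicates  # second central moment
    m3 = sum((x - mean) ** 3 for x in draws) / replicates  # third central moment
    return m3 / m2 ** 1.5


for n in (30, 295):
    print(f"n={n:3d}  theory={theoretical_skewness(n):.3f}  "
          f"simulated={simulated_skewness(n):.3f}")
```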

**The delta method works because** it combines two approximations that both become accurate for large $n$:
1. CLT: $\bar{X}_n$ is approximately normal
2. Linearization: $g$ is approximately linear where the data lives
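
One way to see both approximations working together is to check how often the rate estimate $1/\bar{X}_n$ lands within 1.96 delta-method standard deviations of the true rate; if the normal approximation were exact, the share would be 0.95. The sketch below is a minimal simulation under the notebook's exponential model. The function name `coverage`, the seed, and the replicate count are assumptions for illustration.

```python
import math
import random

MU = 15 * 365 / 295   # mean interarrival time (days) from the data context
LAM = 1 / MU          # true rate per day


def coverage(n, replicates=20000, z=1.96, seed=1):
    """Fraction of simulated 1/X-bar values within z delta-method SDs of LAM."""
    rng = random.Random(seed)
    sd = LAM / math.sqrt(n)  # |g'(mu)| * mu / sqrt(n) with g(x) = 1/x
    hits = 0
    for _ in range(replicates):
        xbar = rng.gammavariate(n, MU / n)  # exact law of X-bar for exponential data
        hits += abs(1 / xbar - LAM) <= z * sd
    return hits / replicates


for n in (30, 295):
    print(f"n={n:3d}  share within 1.96 SD = {coverage(n):.3f}  (normal target: 0.950)")
```

The same loop can be pointed at the quantile or probability estimator by swapping in the corresponding $g$ and $g'$.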