# Reconciliation #09: Premium vs Expected Loss

## Overview

This notebook verifies that insurance premium calculations compose correctly from pure premium
(frequency x severity), risk loading, LAE loading, expense and profit loads, and market
cycle adjustments. We confirm that the resulting loss ratios match expectations and that
layer-specific pricing follows actuarial principles (higher layers carry lower expected
losses and lower rates on line).

## What We Test

1. **Premium composition**: Pure Premium -> Technical Premium -> Market Premium follows the
   documented formulas with correct load factors.
2. **Loss ratio identity**: Pure Premium / Market Premium equals the indicated loss ratio
   (1 - expense ratio - profit margin) divided by (1 + risk loading + LAE ratio) and by
   the cycle factor.
3. **Market cycle effects**: Hard markets produce higher premiums (lower loss ratios),
   soft markets produce lower premiums (higher loss ratios).
4. **Layer-specific pricing**: Higher excess layers have lower expected loss costs and
   different rate-on-line characteristics.
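5. **Simulation validation**: A single simulation is internally consistent (split-half
   means agree within statistical tolerance) and reproduces the structural loss ratio
   when passed through the pricing pipeline.
6. **Program-level pricing**: Pricing a multi-layer program updates each layer's rate,
   and the program total equals the sum of the layer market premiums.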
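
As a rough sketch of the chain these checks exercise (not the package's implementation: the function name `premium_chain` and the 100,000 pure premium are illustrative assumptions, and the minimum premium and rate-on-line cap are left out):

```python
def premium_chain(pure, risk_loading, lae_ratio, expense_ratio, profit_margin,
                  target_lr, cycle_lr):
    """Illustrative pure -> technical -> market buildup (floor and cap omitted)."""
    technical = pure * (1 + risk_loading + lae_ratio)
    indicated_lr = 1.0 - expense_ratio - profit_margin
    cycle_factor = target_lr / cycle_lr
    market = technical / indicated_lr * cycle_factor
    return technical, market


# Hypothetical pure premium, loaded with the parameter values used in Section 1
demo_pure = 100_000.0
demo_technical, demo_market = premium_chain(
    demo_pure, 0.10, 0.15, 0.25, 0.05, target_lr=0.70, cycle_lr=0.70
)
print(f"technical={demo_technical:,.0f}  market={demo_market:,.0f}  "
      f"LR={demo_pure / demo_market:.2f}")
```

The pure premium cancels out of the final ratio, which is why the loss ratio checks later in the notebook do not depend on the simulated loss level.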

## Prerequisites

- `ergodic_insurance` package installed
- `matplotlib`, `numpy`, `pandas` available

## Expected Runtime

< 60 seconds (uses small simulation counts: 1K-5K paths)

## Audience

Actuaries, developers, and QA engineers validating the pricing pipeline.

In [None]:
# === Setup ===
import sys
import os
import warnings

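# Ensure we can import from the project root and reconciliation helpers.
# Notebooks do not define __file__, so the quoted "__file__" is deliberate:
# abspath() resolves it against the current working directory, and dirname()
# then yields that directory.
sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath("__file__")), "..", "..", ".."))
sys.path.insert(0, os.path.dirname(os.path.abspath("__file__")))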

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

from _reconciliation_helpers import (
    ReconciliationChecker, final_summary, section_header,
    notebook_header, timed_cell, fmt_dollar, display_df,
    create_standard_loss_generator,
)

from ergodic_insurance.insurance_pricing import (
    InsurancePricer, MarketCycle, PricingParameters, LayerPricing,
)
from ergodic_insurance.loss_distributions import ManufacturingLossGenerator
from ergodic_insurance.insurance_program import (
    InsuranceProgram, EnhancedInsuranceLayer,
)

# Suppress noisy warnings from pricing parameter consistency checks
warnings.filterwarnings("ignore", category=UserWarning)
import logging
logging.getLogger("ergodic_insurance.insurance_pricing").setLevel(logging.ERROR)

# Fixed seed for reproducibility
SEED = 20240209
np.random.seed(SEED)

notebook_header(
    9,
    "Premium vs Expected Loss",
    "Verify premium composition from pure premium through market premium, "
    "loss ratio identities, and layer-specific pricing.",
)
print("Setup complete.")

---
## Section 1: Configure Loss Model

In [None]:
section_header("1. Configure Loss Model")

with timed_cell("Configure loss model"):
    # Create a standard loss generator
    loss_gen = create_standard_loss_generator(seed=SEED)

    # Define pricing parameters with consistent actuarial identity:
    # loss_ratio = 1 - expense_ratio - profit_margin = 1 - 0.25 - 0.05 = 0.70
    params = PricingParameters(
        loss_ratio=0.70,
        expense_ratio=0.25,
        profit_margin=0.05,
        risk_loading=0.10,
        alae_ratio=0.10,
        ulae_ratio=0.05,
        simulation_years=10,
        min_premium=1000.0,
        max_rate_on_line=0.50,
    )

    # Create pricer with NORMAL market cycle
    pricer = InsurancePricer(
        loss_generator=loss_gen,
        market_cycle=MarketCycle.NORMAL,
        parameters=params,
        seed=SEED,
    )

    # Define a single insurance layer for initial testing
    attachment = 250_000
    limit = 5_000_000
    expected_revenue = 15_000_000

    print(f"Attachment point:   {fmt_dollar(attachment)}")
    print(f"Limit:              {fmt_dollar(limit)}")
    print(f"Expected revenue:   {fmt_dollar(expected_revenue)}")
    print(f"\nPricing Parameters:")
    print(f"  Loss ratio:       {params.loss_ratio:.0%}")
    print(f"  Expense ratio:    {params.expense_ratio:.0%}")
    print(f"  Profit margin:    {params.profit_margin:.0%}")
    print(f"  Risk loading:     {params.risk_loading:.0%}")
    print(f"  LAE ratio:        {params.lae_ratio:.0%} (ALAE {params.alae_ratio:.0%} + ULAE {params.ulae_ratio:.0%})")
    print(f"  Market cycle:     {MarketCycle.NORMAL.name} (loss ratio = {MarketCycle.NORMAL.value:.0%})")
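
The consistency condition stated in the comments above can be sketched in isolation. `DemoParams` is a hypothetical stand-in written for this illustration, not the package's `PricingParameters`; it assumes that total LAE is the sum of ALAE and ULAE, as the printed summary suggests.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoParams:
    """Hypothetical stand-in for PricingParameters; not the package class."""
    loss_ratio: float = 0.70
    expense_ratio: float = 0.25
    profit_margin: float = 0.05
    alae_ratio: float = 0.10
    ulae_ratio: float = 0.05

    @property
    def lae_ratio(self) -> float:
        # Assumption: total LAE = allocated LAE + unallocated LAE
        return self.alae_ratio + self.ulae_ratio

    def is_consistent(self, tol: float = 1e-9) -> bool:
        # Premium identity: loss ratio + expense ratio + profit margin = 1
        total = self.loss_ratio + self.expense_ratio + self.profit_margin
        return math.isclose(total, 1.0, abs_tol=tol)


demo = DemoParams()
print(f"lae_ratio={demo.lae_ratio:.2f}  consistent={demo.is_consistent()}")
```

When the identity holds, the target loss ratio equals the indicated loss ratio `1 - V - Q`, which is what makes the NORMAL cycle factor come out to 1.0 in Section 2.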

---
## Section 2: Analytical Premium Calculation

In [None]:
section_header("2. Analytical Premium Calculation")

chk_composition = ReconciliationChecker("Premium Composition")

with timed_cell("Premium calculation"):
    # Step 1: Calculate pure premium via simulation
    pure_premium, stats = pricer.calculate_pure_premium(
        attachment_point=attachment,
        limit=limit,
        expected_revenue=expected_revenue,
    )

    # Step 2: Calculate technical premium
    technical_premium = pricer.calculate_technical_premium(pure_premium, limit)

    # Step 3: Calculate market premium
    market_premium = pricer.calculate_market_premium(technical_premium)

    # --- Verify composition formulas ---

    # Technical premium formula:
    #   tech = pure * (1 + risk_loading) + pure * lae_ratio
    expected_tech = pure_premium * (1 + params.risk_loading) + pure_premium * params.lae_ratio
    # Apply the min_premium floor, then the max_rate_on_line cap
    expected_tech = max(expected_tech, params.min_premium)
    expected_tech = min(expected_tech, limit * params.max_rate_on_line)

    chk_composition.assert_close(
        technical_premium, expected_tech, tol=0.01,
        message="Technical premium = pure*(1+risk) + pure*LAE",
        label_actual="Computed", label_expected="Formula",
    )

    # Market premium formula:
    #   indicated_lr = 1 - V - Q
    #   base = technical / indicated_lr
    #   cycle_factor = target_lr / cycle_lr
    #   market = base * cycle_factor
    V = params.expense_ratio
    Q = params.profit_margin
    indicated_lr = 1.0 - V - Q
    target_lr = params.loss_ratio
    cycle_lr = MarketCycle.NORMAL.value
    cycle_factor = target_lr / cycle_lr
    expected_market = (technical_premium / indicated_lr) * cycle_factor

    chk_composition.assert_close(
        market_premium, expected_market, tol=0.01,
        message="Market premium = tech / (1-V-Q) * cycle_factor",
        label_actual="Computed", label_expected="Formula",
    )

    # For NORMAL cycle with consistent parameters:
    # target_lr = 0.70, cycle_lr = 0.70 => cycle_factor = 1.0
    # indicated_lr = 0.70
    # So market = tech / 0.70
    chk_composition.assert_close(
        cycle_factor, 1.0, tol=1e-9,
        message="NORMAL cycle factor = 1.0 (target_lr == cycle_lr)",
    )

    # Verify pure premium > 0 (sanity)
    chk_composition.assert_greater(
        pure_premium, 0.0,
        message="Pure premium is positive",
    )

    # Verify premium ordering: pure < technical < market
    chk_composition.assert_greater(
        technical_premium, pure_premium,
        message="Technical premium > Pure premium (loadings added)",
    )
    chk_composition.assert_greater(
        market_premium, technical_premium,
        message="Market premium > Technical premium (expense/profit added)",
    )

    # Display results
    print(f"\nPremium Buildup:")
    print(f"  Pure premium:        {fmt_dollar(pure_premium)}")
    print(f"  + Risk loading:      {fmt_dollar(pure_premium * params.risk_loading)} ({params.risk_loading:.0%})")
    print(f"  + LAE loading:       {fmt_dollar(pure_premium * params.lae_ratio)} ({params.lae_ratio:.0%})")
    print(f"  = Technical premium: {fmt_dollar(technical_premium)}")
    print(f"  / (1 - V - Q):       / {indicated_lr:.2f}")
    print(f"  x Cycle factor:      x {cycle_factor:.2f}")
    print(f"  = Market premium:    {fmt_dollar(market_premium)}")
    print(f"\nSimulation statistics:")
    print(f"  E[frequency]:        {stats['expected_frequency']:.2f} claims/year")
    print(f"  E[severity]:         {fmt_dollar(stats['expected_severity'])}")
    print(f"  Years simulated:     {stats['years_simulated']}")

chk_composition.display_results()
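
The floor and cap applied to the technical premium are easy to overlook, so here is a minimal sketch of that step on its own. The function `demo_technical_premium` and the three pure premiums are hypothetical; the order of operations (floor first, then cap) mirrors the expected-value calculation in the cell above.

```python
def demo_technical_premium(pure, risk_loading, lae_ratio, limit,
                           min_premium, max_rate_on_line):
    """Loaded premium, floored at min_premium, then capped at limit * max_rate_on_line."""
    loaded = pure * (1 + risk_loading) + pure * lae_ratio
    floored = max(loaded, min_premium)
    return min(floored, limit * max_rate_on_line)


demo_limit = 5_000_000
for demo_pure in (500.0, 100_000.0, 3_000_000.0):  # hypothetical pure premiums
    tech = demo_technical_premium(demo_pure, 0.10, 0.15, demo_limit, 1_000.0, 0.50)
    print(f"pure={demo_pure:>12,.0f}  technical={tech:>12,.0f}")
```

When either bound binds, the technical premium is no longer `pure * (1 + risk + LAE)`. In the capped case it can even fall below the pure premium, so the ordering checks and the closed-form loss ratio in Section 3 hold only when neither bound binds.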

---
## Section 3: Loss Ratio Identity

In [None]:
section_header("3. Loss Ratio Identity")

chk_loss_ratio = ReconciliationChecker("Loss Ratio Checks")

with timed_cell("Loss ratio checks"):
    # The "expected loss ratio" for the insurer is:
    #   LR = pure_premium / market_premium
    # With the full pricing chain:
    #   tech = pure * (1 + risk + lae)
    #   market = tech / indicated_lr * cycle_factor
    # So:
    #   LR = pure / [pure * (1 + risk + lae) / indicated_lr * cycle_factor]
    #      = indicated_lr / [(1 + risk + lae) * cycle_factor]
    #
    # For the NORMAL cycle (cycle_factor = 1.0) with the Section 1 parameters:
    #   LR = 0.70 / (1 + 0.10 + 0.15) = 0.70 / 1.25 = 0.56

    actual_lr = pure_premium / market_premium if market_premium > 0 else 0.0
    expected_lr = indicated_lr / ((1 + params.risk_loading + params.lae_ratio) * cycle_factor)

    chk_loss_ratio.assert_close(
        actual_lr, expected_lr, tol=0.001,
        message="Loss ratio matches actuarial identity",
        label_actual=f"PP/MP = {actual_lr:.4f}",
        label_expected=f"Formula = {expected_lr:.4f}",
    )

    # The loss ratio should be LESS than the target loss ratio because
    # risk loading and LAE are added on top of pure premium.
    chk_loss_ratio.assert_greater(
        target_lr, actual_lr,
        message="PP/MP < target loss ratio (risk+LAE increase the denominator)",
    )

    # Verify the premium identity (loss_ratio + expense_ratio + profit_margin = 1).
    # This is not the combined ratio, which excludes profit.
    # Risk loading and LAE sit inside the technical premium, so the premium
    # is designed to decompose as:
    #   pure_premium + risk_loading + LAE + expenses + profit = market_premium
    implied_expense = market_premium * V
    implied_profit = market_premium * Q
    risk_component = pure_premium * params.risk_loading
    lae_component = pure_premium * params.lae_ratio
    total_components = pure_premium + risk_component + lae_component + implied_expense + implied_profit

    # With cycle_factor = 1 (NORMAL) and neither the min_premium floor nor the
    # max_rate_on_line cap binding, the sum equals market_premium:
    #   market * (1 - V - Q) = tech  =>  market = tech + market * V + market * Q
    # Under other cycles the cycle factor rescales the premium, so the sum
    # no longer closes.
    print(f"\nLoss Ratio Analysis:")
    print(f"  Pure premium / Market premium = {actual_lr:.4f}")
    print(f"  Expected from formula:         {expected_lr:.4f}")
    print(f"  Target loss ratio:             {target_lr:.4f}")
    print(f"  Indicated loss ratio (1-V-Q):  {indicated_lr:.4f}")
    print(f"  Sum of premium components:     {fmt_dollar(total_components)}")
    print(f"  Market premium:                {fmt_dollar(market_premium)}")

    # Verify across all market cycles
    print(f"\nLoss ratios by market cycle:")
    for cycle in MarketCycle:
        cycle_lr_val = cycle.value
        cf = target_lr / cycle_lr_val
        mp = pricer.calculate_market_premium(technical_premium, market_cycle=cycle)
        lr = pure_premium / mp if mp > 0 else 0.0
        expected_lr_cycle = indicated_lr / ((1 + params.risk_loading + params.lae_ratio) * cf)

        chk_loss_ratio.assert_close(
            lr, expected_lr_cycle, tol=0.001,
            message=f"{cycle.name} market: loss ratio matches formula",
            label_actual=f"LR = {lr:.4f}",
            label_expected=f"Formula = {expected_lr_cycle:.4f}",
        )
        print(f"  {cycle.name:8s}: cycle_lr={cycle_lr_val:.0%}, "
              f"cycle_factor={cf:.3f}, market_prem={fmt_dollar(mp)}, LR={lr:.4f}")

chk_loss_ratio.display_results()
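
The identity checked above can be reproduced without the pricer. This is a sketch under stated assumptions: the HARD (0.60) and NORMAL (0.70) cycle loss ratios appear in this notebook, the SOFT value of 0.80 is assumed here, and the 100,000 pure premium is hypothetical.

```python
RISK, LAE, V, Q = 0.10, 0.15, 0.25, 0.05  # Section 1 parameter values
TARGET_LR = 0.70
# HARD and NORMAL are quoted in this notebook; SOFT = 0.80 is an assumption
DEMO_CYCLE_LR = {"HARD": 0.60, "NORMAL": 0.70, "SOFT": 0.80}


def structural_loss_ratio(cycle_lr):
    """PP / MP implied by the pricing chain; the pure premium cancels out."""
    indicated_lr = 1.0 - V - Q
    cycle_factor = TARGET_LR / cycle_lr
    return indicated_lr / ((1 + RISK + LAE) * cycle_factor)


for name, cycle_lr in DEMO_CYCLE_LR.items():
    print(f"{name:6s} LR = {structural_loss_ratio(cycle_lr):.4f}")

# Additive closure under the NORMAL cycle, for a hypothetical pure premium
demo_pure = 100_000.0
demo_tech = demo_pure * (1 + RISK + LAE)
demo_market = demo_tech / (1.0 - V - Q)
demo_components = demo_tech + demo_market * V + demo_market * Q
print(f"components - market = {demo_components - demo_market:.6f}")
```

Because the target and indicated loss ratios coincide here, the structural ratio reduces to `cycle_lr / 1.25`, which makes the hard-to-soft progression easy to sanity-check by hand.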

---
## Section 4: Premium Buildup Waterfall

In [None]:
section_header("4. Premium Buildup Waterfall")

with timed_cell("Waterfall chart"):
    # Build the waterfall components
    risk_load_amount = pure_premium * params.risk_loading
    lae_load_amount = pure_premium * params.lae_ratio
    expense_load_amount = market_premium * params.expense_ratio
    profit_load_amount = market_premium * params.profit_margin

    # Waterfall data: the buildup from pure to market
    # Note: expense and profit are fractions of market premium, applied
    # via the actuarial identity (dividing by 1-V-Q), not additively.
    # For the waterfall, we show the actual dollar contribution.
    labels = [
        'Pure\nPremium',
        'Risk\nLoading',
        'LAE\nLoading',
        'Technical\nPremium',
        'Expense &\nProfit Load',
        'Market\nPremium',
    ]
    values = [
        pure_premium,
        risk_load_amount,
        lae_load_amount,
        technical_premium,
        market_premium - technical_premium,
        market_premium,
    ]

    # Waterfall: cumulative bottom positions
    bottoms = [
        0,                          # Pure premium starts at 0
        pure_premium,               # Risk loading stacks on pure
        pure_premium + risk_load_amount,  # LAE stacks on pure+risk
        0,                          # Technical is a subtotal (full bar)
        technical_premium,          # Expense+profit stacks on technical
        0,                          # Market is a subtotal (full bar)
    ]

    colors = ['#4a86c8', '#6ba3d6', '#6ba3d6', '#2d6a9f', '#8bc34a', '#1b5e20']

    fig, ax = plt.subplots(figsize=(10, 6))
    bars = ax.bar(labels, values, bottom=bottoms, color=colors, edgecolor='white', linewidth=1.5)

    # Add value labels
    for bar, val, bot in zip(bars, values, bottoms):
        y_pos = bot + val / 2
        ax.text(bar.get_x() + bar.get_width() / 2, y_pos,
                fmt_dollar(val), ha='center', va='center',
                fontsize=9, fontweight='bold', color='white')

    ax.set_ylabel('Premium ($)', fontsize=12)
    ax.set_title('Premium Buildup Waterfall: Pure to Market Premium', fontsize=14)
    ax.yaxis.set_major_formatter(mticker.FuncFormatter(lambda x, _: f'${x:,.0f}'))
    ax.grid(axis='y', alpha=0.3)
    plt.tight_layout()
    plt.show()

    # Display buildup table
    buildup_data = {
        'Component': [
            'Pure Premium (freq x sev)',
            '+ Risk Loading',
            '+ LAE Loading (ALAE + ULAE)',
            '= Technical Premium',
            '/ (1 - V - Q) x Cycle Factor',
            '= Market Premium',
        ],
        'Factor': [
            '---',
            f'{params.risk_loading:.0%}',
            f'{params.lae_ratio:.0%}',
            '---',
            f'/ {indicated_lr:.2f} x {cycle_factor:.2f}',
            '---',
        ],
        'Amount': [
            fmt_dollar(pure_premium),
            fmt_dollar(risk_load_amount),
            fmt_dollar(lae_load_amount),
            fmt_dollar(technical_premium),
            fmt_dollar(market_premium - technical_premium),
            fmt_dollar(market_premium),
        ],
    }
    display_df(pd.DataFrame(buildup_data), title="Premium Buildup Detail")
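
The `bottoms` list in the cell above is written out by hand. One possible way to derive it, shown only as a sketch (the helper `waterfall_bottoms` and the round-number amounts are hypothetical), is to mark which bars are subtotals and let the increments stack on a running total:

```python
def waterfall_bottoms(steps):
    """Return bar bottoms for a list of (value, is_subtotal) steps.

    Increments stack on the running total; subtotal bars restart from zero.
    """
    bottoms, running = [], 0.0
    for value, is_subtotal in steps:
        if is_subtotal:
            bottoms.append(0.0)
            running = value
        else:
            bottoms.append(running)
            running += value
    return bottoms


# Hypothetical amounts: pure, risk load, LAE load, technical, expense+profit, market
demo_pure, demo_risk, demo_lae = 100.0, 10.0, 15.0
demo_tech = demo_pure + demo_risk + demo_lae
demo_market = demo_tech / 0.70
demo_steps = [
    (demo_pure, False), (demo_risk, False), (demo_lae, False), (demo_tech, True),
    (demo_market - demo_tech, False), (demo_market, True),
]
print(waterfall_bottoms(demo_steps))
```

The result follows the same pattern as the hand-written list: zero for the pure premium and both subtotals, and the running total for each stacked increment.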

---
## Section 5: Market Cycle Comparison

In [None]:
section_header("5. Market Cycle Comparison")

chk_market = ReconciliationChecker("Market Cycle Effects")

with timed_cell("Market cycle comparison"):
    # Use compare_market_cycles for consistent pure/technical premium
    cycle_results = pricer.compare_market_cycles(
        attachment_point=attachment,
        limit=limit,
        expected_revenue=expected_revenue,
    )

    hard_mp = cycle_results['HARD'].market_premium
    normal_mp = cycle_results['NORMAL'].market_premium
    soft_mp = cycle_results['SOFT'].market_premium
    pp = cycle_results['NORMAL'].pure_premium  # Same for all cycles

    # Hard market should have highest premiums
    chk_market.assert_greater(
        hard_mp, normal_mp,
        message="Hard market premium > Normal market premium",
    )
    chk_market.assert_greater(
        normal_mp, soft_mp,
        message="Normal market premium > Soft market premium",
    )

    # Pure premium is the same across all cycles
    chk_market.assert_close(
        cycle_results['HARD'].pure_premium,
        cycle_results['SOFT'].pure_premium,
        tol=0.01,
        message="Pure premium identical across market cycles",
    )

    # Technical premium is also the same across all cycles
    chk_market.assert_close(
        cycle_results['HARD'].technical_premium,
        cycle_results['SOFT'].technical_premium,
        tol=0.01,
        message="Technical premium identical across market cycles",
    )

    # Verify market premium ratios match cycle factor ratios
    # Hard/Normal = (target_lr / 0.60) / (target_lr / 0.70) = 0.70/0.60
    expected_hard_normal_ratio = MarketCycle.NORMAL.value / MarketCycle.HARD.value
    actual_hard_normal_ratio = hard_mp / normal_mp if normal_mp > 0 else 0
    chk_market.assert_close(
        actual_hard_normal_ratio, expected_hard_normal_ratio, tol=0.001,
        message="Hard/Normal premium ratio matches cycle factor ratio",
        label_actual=f"{actual_hard_normal_ratio:.4f}",
        label_expected=f"{expected_hard_normal_ratio:.4f}",
    )

    # Normal/Soft ratio
    expected_normal_soft_ratio = MarketCycle.SOFT.value / MarketCycle.NORMAL.value
    actual_normal_soft_ratio = normal_mp / soft_mp if soft_mp > 0 else 0
    chk_market.assert_close(
        actual_normal_soft_ratio, expected_normal_soft_ratio, tol=0.001,
        message="Normal/Soft premium ratio matches cycle factor ratio",
        label_actual=f"{actual_normal_soft_ratio:.4f}",
        label_expected=f"{expected_normal_soft_ratio:.4f}",
    )

    # Display comparison
    cycle_data = {
        'Market Cycle': [],
        'Cycle Loss Ratio': [],
        'Pure Premium': [],
        'Technical Premium': [],
        'Market Premium': [],
        'Rate on Line': [],
        'Implied Loss Ratio': [],
    }
    for cycle_name, result in cycle_results.items():
        lr = result.pure_premium / result.market_premium if result.market_premium > 0 else 0
        cycle_data['Market Cycle'].append(cycle_name)
        cycle_data['Cycle Loss Ratio'].append(f"{MarketCycle[cycle_name].value:.0%}")
        cycle_data['Pure Premium'].append(fmt_dollar(result.pure_premium))
        cycle_data['Technical Premium'].append(fmt_dollar(result.technical_premium))
        cycle_data['Market Premium'].append(fmt_dollar(result.market_premium))
        cycle_data['Rate on Line'].append(f"{result.rate_on_line:.4f}")
        cycle_data['Implied Loss Ratio'].append(f"{lr:.4f}")

    display_df(pd.DataFrame(cycle_data), title="Market Cycle Premium Comparison")

chk_market.display_results()
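
The ratio checks above rely on the technical premium cancelling between cycles. A small sketch of that cancellation follows; `demo_market_premium` is an illustrative re-derivation rather than the pricer's method, the 125,000 technical premium is hypothetical, and the SOFT loss ratio of 0.80 is an assumption (HARD = 0.60 and NORMAL = 0.70 are quoted in this notebook).

```python
def demo_market_premium(technical, cycle_lr, target_lr=0.70,
                        expense_ratio=0.25, profit_margin=0.05):
    """Market premium = technical / (1 - V - Q) * (target_lr / cycle_lr)."""
    indicated_lr = 1.0 - expense_ratio - profit_margin
    return technical / indicated_lr * (target_lr / cycle_lr)


# Same hypothetical technical premium under each cycle (SOFT = 0.80 is assumed)
demo_hard, demo_normal, demo_soft = (
    demo_market_premium(125_000.0, lr) for lr in (0.60, 0.70, 0.80)
)
print(f"hard/normal = {demo_hard / demo_normal:.4f}")
print(f"normal/soft = {demo_normal / demo_soft:.4f}")
```

Each ratio depends only on the two cycle loss ratios involved, so the checks are insensitive to simulation noise in the pure premium.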

---
## Section 6: Layer-Specific Pricing

In [None]:
section_header("6. Layer-Specific Pricing")

chk_layers = ReconciliationChecker("Layer-Specific Pricing")

with timed_cell("Layer pricing"):
    # Define a multi-layer program
    layers_config = [
        {"name": "Primary",       "attachment": 250_000,    "limit": 4_750_000},
        {"name": "1st Excess",    "attachment": 5_000_000,  "limit": 5_000_000},
        {"name": "2nd Excess",    "attachment": 10_000_000, "limit": 15_000_000},
        {"name": "3rd Excess",    "attachment": 25_000_000, "limit": 25_000_000},
    ]

    # Price each layer individually using a fresh pricer for consistency
    layer_pricer = InsurancePricer(
        loss_generator=create_standard_loss_generator(seed=SEED + 100),
        market_cycle=MarketCycle.NORMAL,
        parameters=params,
        seed=SEED + 100,
    )

    layer_results = []
    for cfg in layers_config:
        lp = layer_pricer.price_layer(
            attachment_point=cfg["attachment"],
            limit=cfg["limit"],
            expected_revenue=expected_revenue,
        )
        layer_results.append((cfg["name"], lp))

    # Verify: higher layers have lower pure premiums
    for i in range(len(layer_results) - 1):
        name_low, lp_low = layer_results[i]
        name_high, lp_high = layer_results[i + 1]
        # Higher layers attach further into the tail, so fewer and smaller
        # amounts reach them. Layer widths differ here, so a much wider upper
        # layer could in principle carry more expected loss than a thin lower
        # one; for this tower we check pure premium directly.
        if lp_low.pure_premium > 0 and lp_high.pure_premium > 0:
            chk_layers.assert_greater(
                lp_low.pure_premium, lp_high.pure_premium,
                message=f"{name_low} pure premium > {name_high} pure premium",
            )
        elif lp_high.pure_premium == 0:
            # If higher layer has zero pure premium, that is expected for very high layers
            chk_layers.check(
                True,
                f"{name_high} has zero pure premium (too high for any simulated losses)",
            )

    # Verify rate on line decreases with layer height (for priced layers)
    priced_layers = [(n, lp) for n, lp in layer_results if lp.pure_premium > 0]
    for i in range(len(priced_layers) - 1):
        name_low, lp_low = priced_layers[i]
        name_high, lp_high = priced_layers[i + 1]
        chk_layers.assert_greater(
            lp_low.rate_on_line, lp_high.rate_on_line,
            message=f"{name_low} ROL > {name_high} ROL",
        )

    # Verify that each layer's pricing composition is internally consistent
    for name, lp in layer_results:
        if lp.pure_premium > 0:
            # Recompute technical premium from pure
            expected_tech_layer = lp.pure_premium * (1 + params.risk_loading) + lp.pure_premium * params.lae_ratio
            expected_tech_layer = max(expected_tech_layer, params.min_premium)
            expected_tech_layer = min(expected_tech_layer, lp.limit * params.max_rate_on_line)
            chk_layers.assert_close(
                lp.technical_premium, expected_tech_layer, tol=0.01,
                message=f"{name}: technical premium composition correct",
            )

            # Recompute market premium from technical
            expected_market_layer = (lp.technical_premium / indicated_lr) * cycle_factor
            chk_layers.assert_close(
                lp.market_premium, expected_market_layer, tol=0.01,
                message=f"{name}: market premium composition correct",
            )

    # Display layer comparison
    layer_table = {
        'Layer': [],
        'Attachment': [],
        'Limit': [],
        'E[Freq]': [],
        'E[Sev]': [],
        'Pure Prem': [],
        'Tech Prem': [],
        'Market Prem': [],
        'ROL': [],
        'Loss Ratio': [],
    }
    for name, lp in layer_results:
        lr = lp.pure_premium / lp.market_premium if lp.market_premium > 0 else 0
        layer_table['Layer'].append(name)
        layer_table['Attachment'].append(fmt_dollar(lp.attachment_point))
        layer_table['Limit'].append(fmt_dollar(lp.limit))
        layer_table['E[Freq]'].append(f"{lp.expected_frequency:.3f}")
        layer_table['E[Sev]'].append(fmt_dollar(lp.expected_severity) if lp.expected_severity > 0 else '$0')
        layer_table['Pure Prem'].append(fmt_dollar(lp.pure_premium))
        layer_table['Tech Prem'].append(fmt_dollar(lp.technical_premium))
        layer_table['Market Prem'].append(fmt_dollar(lp.market_premium))
        layer_table['ROL'].append(f"{lp.rate_on_line:.4f}")
        layer_table['Loss Ratio'].append(f"{lr:.4f}")

    display_df(pd.DataFrame(layer_table), title="Layer Pricing Summary")

chk_layers.display_results()
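
To see why higher layers collect less, it helps to allocate a single loss across the tower. The sketch below uses the same attachments and limits as `layers_config`; the helper `layer_loss` and the 12,000,000 ground-up loss are hypothetical.

```python
def layer_loss(ground_up, attachment, limit):
    """Portion of one ground-up loss that falls in [attachment, attachment + limit]."""
    return min(max(ground_up - attachment, 0.0), limit)


# Same tower as layers_config: (name, attachment, limit)
demo_tower = [
    ("Primary", 250_000, 4_750_000),
    ("1st Excess", 5_000_000, 5_000_000),
    ("2nd Excess", 10_000_000, 15_000_000),
    ("3rd Excess", 25_000_000, 25_000_000),
]

demo_ground_up = 12_000_000.0  # hypothetical single loss
demo_shares = {name: layer_loss(demo_ground_up, a, lim) for name, a, lim in demo_tower}
demo_retained = min(demo_ground_up, demo_tower[0][1])  # amount below the first attachment

for name, share in demo_shares.items():
    print(f"{name:11s} {share:>12,.0f}")
print(f"{'Retained':11s} {demo_retained:>12,.0f}")
```

Because the layers are contiguous, the retained amount plus the layer shares add back to the ground-up loss, and a layer receives nothing until the loss exceeds its attachment point.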

---
## Section 7: Simulation Validation

In [None]:
section_header("7. Simulation Validation")

chk_sim = ReconciliationChecker("Simulation vs Expected")

with timed_cell("Simulation validation"):
    # Simulate many years of losses and verify that:
    # 1. The empirical mean converges (self-consistency)
    # 2. The structural loss ratio (PP / MP) is preserved when we
    #    apply the pricing pipeline to the empirical pure premium
    #
    # We do NOT compare two independent simulations (which would require
    # enormous sample sizes due to heavy-tailed distributions). Instead,
    # we run one large simulation and check internal consistency.

    n_sim_years = 3000  # Enough for statistical power, fast enough
    sim_seed = SEED + 200
    sim_loss_gen = ManufacturingLossGenerator(seed=sim_seed)

    sim_attachment = 250_000
    sim_limit = 5_000_000

    # Simulate annual losses and compute the empirical loss in the layer
    annual_layer_losses = []

    for year in range(n_sim_years):
        losses, _ = sim_loss_gen.generate_losses(
            duration=1.0,
            revenue=expected_revenue,
            include_catastrophic=True,
            time=0.0,
        )
        # Calculate losses in the layer
        year_layer_loss = 0.0
        for loss in losses:
            if loss.amount > sim_attachment:
                year_layer_loss += min(loss.amount - sim_attachment, sim_limit)
        annual_layer_losses.append(year_layer_loss)

    annual_layer_losses = np.array(annual_layer_losses)

    # Empirical statistics
    empirical_mean_loss = np.mean(annual_layer_losses)
    empirical_std = np.std(annual_layer_losses)
    empirical_se = empirical_std / np.sqrt(n_sim_years)

    # --- Check 1: Structural loss ratio preserved ---
    # Apply the pricing pipeline to the empirical pure premium and verify
    # the structural loss ratio identity holds.
    sim_tech = pricer.calculate_technical_premium(empirical_mean_loss, sim_limit)
    sim_market = pricer.calculate_market_premium(sim_tech)
    sim_lr = empirical_mean_loss / sim_market if sim_market > 0 else 0.0

    # The structural LR should match the actuarial identity regardless
    # of what the pure premium actually is.
    structural_lr = indicated_lr / ((1 + params.risk_loading + params.lae_ratio) * cycle_factor)

    chk_sim.assert_close(
        sim_lr, structural_lr, tol=0.001,
        message="Simulated pure premium produces correct structural loss ratio",
        label_actual=f"LR = {sim_lr:.4f}",
        label_expected=f"Structural = {structural_lr:.4f}",
    )

    # --- Check 2: First-half vs second-half consistency ---
    # Split the simulation into two halves and check that their means
    # are within 3 sigma of each other (internal consistency).
    half = n_sim_years // 2
    first_half_mean = np.mean(annual_layer_losses[:half])
    second_half_mean = np.mean(annual_layer_losses[half:])
    half_se = empirical_std / np.sqrt(half)
    # Under the null (same distribution), the difference of two
    # independent sample means has SE = sqrt(2) * half_se.
    # Despite its name, diff_se is the z-score of that difference.
    diff_se = abs(first_half_mean - second_half_mean) / (np.sqrt(2) * half_se) if half_se > 0 else 0

    chk_sim.assert_in_range(
        diff_se, 0, 3.0,
        message=f"First-half vs second-half means consistent ({diff_se:.2f} sigma)",
    )

    # --- Check 3: Empirical mean loss is positive and finite ---
    # Check 1 holds by construction for any positive pure premium (absent the
    # floor and cap), because the pure premium cancels in PP / MP. So we also
    # confirm that the empirical mean is a usable input: positive and finite.
    chk_sim.assert_greater(
        empirical_mean_loss, 0.0,
        message="Empirical mean loss is positive",
    )
    chk_sim.check(
        np.isfinite(empirical_mean_loss),
        "Empirical mean loss is finite",
    )

    # --- Check 4: Market premium > Technical > Pure (preserved for sim estimate) ---
    chk_sim.assert_greater(
        sim_tech, empirical_mean_loss,
        message="Sim technical premium > empirical pure premium",
    )
    chk_sim.assert_greater(
        sim_market, sim_tech,
        message="Sim market premium > sim technical premium",
    )

    print(f"\nSimulation Results ({n_sim_years:,} years):")
    print(f"  Empirical mean annual loss:    {fmt_dollar(empirical_mean_loss)}")
    print(f"  Empirical std:                 {fmt_dollar(empirical_std)}")
    print(f"  Standard error:                {fmt_dollar(empirical_se)}")
    print(f"  First-half mean:               {fmt_dollar(first_half_mean)}")
    print(f"  Second-half mean:              {fmt_dollar(second_half_mean)}")
    print(f"  Half-diff in sigma:            {diff_se:.2f}")
    print(f"\n  Pipeline applied to empirical PP:")
    print(f"    Pure premium (empirical):    {fmt_dollar(empirical_mean_loss)}")
    print(f"    Technical premium:           {fmt_dollar(sim_tech)}")
    print(f"    Market premium:              {fmt_dollar(sim_market)}")
    print(f"    Loss ratio (PP/MP):          {sim_lr:.4f}")
    print(f"    Structural loss ratio:       {structural_lr:.4f}")

    # --- Histogram of simulated annual losses ---
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    # Left: histogram of annual layer losses
    ax = axes[0]
    ax.hist(annual_layer_losses, bins=50, density=True, alpha=0.7, color='#4a86c8', edgecolor='white')
    ax.axvline(empirical_mean_loss, color='red', linestyle='--', linewidth=2,
               label=f'Empirical mean: {fmt_dollar(empirical_mean_loss)}')
    ax.set_xlabel('Annual Layer Loss ($)', fontsize=11)
    ax.set_ylabel('Density', fontsize=11)
    ax.set_title('Distribution of Annual Layer Losses', fontsize=13)
    ax.xaxis.set_major_formatter(mticker.FuncFormatter(lambda x, _: f'${x/1e6:.1f}M'))
    ax.legend(fontsize=9)
    ax.grid(alpha=0.3)

    # Right: running average showing convergence
    ax = axes[1]
    running_avg = np.cumsum(annual_layer_losses) / np.arange(1, n_sim_years + 1)
    ax.plot(running_avg, color='#4a86c8', linewidth=1, alpha=0.8, label='Running average')
    ax.axhline(empirical_mean_loss, color='red', linestyle='--', linewidth=2,
               label=f'Final mean: {fmt_dollar(empirical_mean_loss)}')
    # Confidence band (+/- 2 SE around the final mean)
    n_arr = np.arange(1, n_sim_years + 1)
    ci_upper = empirical_mean_loss + 2 * empirical_std / np.sqrt(n_arr)
    ci_lower = empirical_mean_loss - 2 * empirical_std / np.sqrt(n_arr)
    ax.fill_between(n_arr, ci_lower, ci_upper, alpha=0.2, color='green', label='+/- 2 SE band')
    ax.set_xlabel('Simulation Year', fontsize=11)
    ax.set_ylabel('Running Average Loss ($)', fontsize=11)
    ax.set_title('Convergence of Running Average', fontsize=13)
    ax.yaxis.set_major_formatter(mticker.FuncFormatter(lambda x, _: f'${x/1e6:.1f}M'))
    ax.legend(fontsize=9)
    ax.grid(alpha=0.3)

    plt.tight_layout()
    plt.show()

chk_sim.display_results()
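
The split-half statistic from Check 2 can be sketched with the standard library alone. The function `split_half_z` and the lognormal sample are illustrative assumptions; the sample is not the notebook's loss model.

```python
import math
import random
import statistics


def split_half_z(values):
    """z-score of the difference between first-half and second-half means."""
    half = len(values) // 2
    first, second = values[:half], values[half:]
    pooled_std = statistics.pstdev(values)  # population std, like np.std's default
    half_se = pooled_std / math.sqrt(half)
    if half_se == 0:
        return 0.0
    diff = abs(statistics.fmean(first) - statistics.fmean(second))
    # Difference of two independent means has SE = sqrt(2) * half_se
    return diff / (math.sqrt(2) * half_se)


rng = random.Random(0)
demo_sample = [rng.lognormvariate(0.0, 1.0) for _ in range(3000)]  # hypothetical draws
demo_z = split_half_z(demo_sample)
print(f"split-half z = {demo_z:.2f}")
```

For heavy-tailed losses a single extreme year can dominate one half, so a 3-sigma threshold is a loose screen for gross inconsistency, not a formal test.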

---
## Section 8: Program-Level Premium Validation

In [None]:
section_header("8. Program-Level Premium Validation")

chk_program = ReconciliationChecker("Program Premium Validation")

with timed_cell("Program pricing"):
    # Create a multi-layer program and price it
    program_layers = [
        EnhancedInsuranceLayer(
            attachment_point=250_000,
            limit=4_750_000,
            base_premium_rate=0.01,  # Initial placeholder
        ),
        EnhancedInsuranceLayer(
            attachment_point=5_000_000,
            limit=5_000_000,
            base_premium_rate=0.005,  # Initial placeholder
        ),
        EnhancedInsuranceLayer(
            attachment_point=10_000_000,
            limit=15_000_000,
            base_premium_rate=0.003,  # Initial placeholder
        ),
    ]

    program = InsuranceProgram(
        layers=program_layers,
        deductible=250_000,
        name="Test Program",
    )

    # Create a pricer and price the program
    prog_pricer = InsurancePricer(
        loss_generator=ManufacturingLossGenerator(seed=SEED + 300),
        market_cycle=MarketCycle.NORMAL,
        parameters=params,
        seed=SEED + 300,
    )

    priced_program = prog_pricer.price_insurance_program(
        program=program,
        expected_revenue=expected_revenue,
        update_program=True,
    )

    # Verify pricing results exist
    chk_program.assert_equal(
        len(priced_program.pricing_results), 3,
        message="All 3 layers have pricing results",
    )

    # Verify that the program's calculate_premium uses the updated rates
    total_premium_from_results = sum(
        pr.market_premium for pr in priced_program.pricing_results
    )
    # The program's calculate_premium uses base_premium_rate * limit,
    # where base_premium_rate was set to rate_on_line by price_insurance_program.
    total_premium_from_program = priced_program.calculate_premium()

    # These should match: pricing_results.rate_on_line = market_premium / limit
    # and calculate_premium = sum(rate_on_line * limit) = sum(market_premium)
    chk_program.assert_close(
        total_premium_from_program, total_premium_from_results,
        tol=1.0,  # Allow $1 rounding
        message="Program total premium matches sum of layer market premiums",
        label_actual=fmt_dollar(total_premium_from_program),
        label_expected=fmt_dollar(total_premium_from_results),
    )

    # Verify each layer's base_premium_rate was updated to rate_on_line
    for i, (layer, pr) in enumerate(zip(priced_program.layers, priced_program.pricing_results)):
        chk_program.assert_close(
            layer.base_premium_rate, pr.rate_on_line, tol=1e-10,
            message=f"Layer {i}: base_premium_rate updated to rate_on_line",
        )

    # Display program pricing
    prog_data = {
        'Layer': [],
        'Attachment': [],
        'Limit': [],
        'Pure Premium': [],
        'Market Premium': [],
        'ROL': [],
        'Loss Ratio (PP/MP)': [],
    }
    for i, (layer, pr) in enumerate(zip(priced_program.layers, priced_program.pricing_results)):
        lr = pr.pure_premium / pr.market_premium if pr.market_premium > 0 else 0
        prog_data['Layer'].append(f"Layer {i}")
        prog_data['Attachment'].append(fmt_dollar(layer.attachment_point))
        prog_data['Limit'].append(fmt_dollar(layer.limit))
        prog_data['Pure Premium'].append(fmt_dollar(pr.pure_premium))
        prog_data['Market Premium'].append(fmt_dollar(pr.market_premium))
        prog_data['ROL'].append(f"{pr.rate_on_line:.6f}")
        prog_data['Loss Ratio (PP/MP)'].append(f"{lr:.4f}")

    display_df(pd.DataFrame(prog_data), title="Program Pricing Results")
    print(f"\nTotal program premium: {fmt_dollar(total_premium_from_program)}")

chk_program.display_results()
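
The program-level check rests on a simple round trip: if each layer's rate is set to its rate on line (market premium divided by limit), then summing rate times limit recovers the sum of market premiums. A minimal sketch with hypothetical premiums:

```python
import math

# Hypothetical (limit, market premium) pairs for a three-layer program
demo_priced = [
    (4_750_000, 182_000.0),
    (5_000_000, 41_500.0),
    (15_000_000, 23_250.0),
]

# Rate on line per layer, then the program total rebuilt from rates and limits
demo_rates = [premium / limit for limit, premium in demo_priced]
demo_total_from_rates = sum(
    rate * limit for rate, (limit, _) in zip(demo_rates, demo_priced)
)
demo_total_from_premiums = sum(premium for _, premium in demo_priced)

print(f"from rates:    {demo_total_from_rates:,.2f}")
print(f"from premiums: {demo_total_from_premiums:,.2f}")
```

Any gap between the two totals beyond floating-point noise would suggest that a layer's rate was not updated, or that the program applies additional adjustments when computing its premium.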

---
## Final Summary

In [None]:
section_header("Final Summary")
final_summary(chk_composition, chk_loss_ratio, chk_market, chk_layers, chk_sim, chk_program)