# Wheel Strategy v4: CSP Assignment → Covered Call Module

This notebook extends the CSP backtest to handle the full wheel cycle:
1. Sell Cash-Secured Put (CSP)
2. If assigned (expires ITM): take stock delivery
3. Sell Covered Call (CC) against assigned shares
4. If called away (CC expires ITM): wheel complete

## Key Features
- **State Machine Architecture**: Explicit trade lifecycle states prevent logic spaghetti
- **Unified P&L Accounting**: CSP + CC + Stock P&L properly aggregated
- **v1 Scope**: One CC cycle per assignment (no rolling, no early assignment)

## Trade States
```
CSP_OPEN → CSP_CLOSED_PROFIT | CSP_CLOSED_STOP | CSP_ASSIGNED | CSP_CLOSED_WORTHLESS
CSP_ASSIGNED → CC_OPEN | WHEEL_COMPLETE (no CC sold)
CC_OPEN → CC_CLOSED_PROFIT | CC_ASSIGNED | CC_CLOSED_WORTHLESS
All closed and assigned states → WHEEL_COMPLETE (the only terminal state)
```


## Imports

In [30]:
from pathlib import Path
from dotenv import dotenv_values, load_dotenv
import sys
import os
import pandas as pd
import databento as db
import pandas_market_calendars as mcal

sys.executable

env_path = Path("/Users/samuelminer/Projects/nissan_options/wheel_strategy/.env")

print("Parsed keys:", dotenv_values(env_path).keys())

load_dotenv(env_path)  # load the same .env file that was parsed above

assert os.getenv("DATABENTO_API_KEY"), "DATABENTO_API_KEY still not found"
print("os.getenv:", bool(os.getenv("DATABENTO_API_KEY")))
client = db.Historical()

pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)


Parsed keys: odict_keys(['DATABENTO_API_KEY', 'ANTHROPIC_API_KEY'])
os.getenv: True


## Configuration

All configurable parameters for the backtest. Modify this cell to change settings.

In [31]:
# =============================================================================
# UNIFIED CONFIGURATION
# =============================================================================

CONFIG = {
    # -------------------------------------------------------------------------
    # SYMBOL & TIMING
    # -------------------------------------------------------------------------
    'symbol': 'TSLA',                          # Underlying symbol to backtest
    'timezone': 'America/New_York',
    
    # Entry date/time for the single-day backtest
    'entry_date': '2023-06-06',                # Date to enter positions
    'entry_time': '15:45',                     # Time to capture option chain snapshot
    
    # Historical data lookback for technical indicators (e.g., Bollinger Bands)
    'lookback_days': 252 * 2,                  # 504 days, applied as calendar days (~1.4 years), not trading days
    
    # -------------------------------------------------------------------------
    # OPTION SELECTION CRITERIA
    # -------------------------------------------------------------------------
    'option_type': 'P',                        # 'P' for puts (CSP), 'C' for calls
    'dte_min': 30,                             # Minimum days to expiration
    'dte_max': 45,                             # Maximum days to expiration
    'delta_min': 0.25,                         # Minimum absolute delta
    'delta_max': 0.35,                         # Maximum absolute delta
    
    # -------------------------------------------------------------------------
    # LIQUIDITY MODEL (regime-aware, penalty-based)
    # -------------------------------------------------------------------------
    # Hard rejection thresholds (truly untradeable)
    'min_bid_hard': 0.10,                      # Hard floor - reject penny options
    'hard_max_spread_pct': 0.20,               # Hard ceiling - reject extreme spreads
    
    # Base target spread (calm market conditions)
    'base_max_spread_pct': 0.08,               # Target max spread in normal conditions
    
    # IV regime adjustments (allow wider spreads in high-vol)
    'ivp_high_threshold': 0.70,                # IV percentile threshold for "high vol"
    'ivp_high_max_spread_pct': 0.12,           # Allowed spread when IV is high
    'ivp_extreme_threshold': 0.90,             # IV percentile threshold for "extreme vol"
    'ivp_extreme_max_spread_pct': 0.15,        # Allowed spread when IV is extreme
    
    # DTE adjustments (short-dated options have wider spreads)
    'short_dte_threshold': 7,                  # DTE below this gets extra allowance
    'short_dte_extra_spread_pct': 0.02,        # Extra spread allowance for short DTE
    
    # Penalty tiers (execution tax based on spread quality)
    # tight:    spread <= 0.6 * allowed → penalty = 1.0 (no extra slippage)
    # moderate: spread <= allowed       → penalty = 1.15 (15% wider effective spread)
    # wide:     spread <= hard_max      → penalty = 1.35 (35% wider effective spread)
    # ugly:     spread > hard_max       → REJECT (no trade)
    
    # -------------------------------------------------------------------------
    # EXIT STRATEGY
    # -------------------------------------------------------------------------
    'exit_pct': 0.50,                          # 0.50 = buy back at 50%, keep 50% profit
    'stop_loss_multiplier': 2.0,               # Exit if option price reaches Nx premium
    'max_hold_dte': None,                      # Exit at X DTE if no other trigger (None = disabled)
    
    # -------------------------------------------------------------------------
    # TRANSACTION COSTS (NEW - will be applied later)
    # -------------------------------------------------------------------------
    'commission_per_contract': 0.65,           # Per contract commission (round trip = 2x)
    'sec_fee_per_contract': 0.01,              # SEC/TAF fees per contract
    
    # -------------------------------------------------------------------------
    # EXECUTION / FILL ASSUMPTIONS (NEW - will be applied later)
    # -------------------------------------------------------------------------
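    # NOTE: get_entry_price/get_exit_price branch on 'optimistic', 'realistic', and
    # 'pessimistic'; any other value (including 'mid') falls through to 'realistic'.
    'fill_mode': 'mid',                        # 'mid' (current), 'bid' (realistic), 'pessimistic'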
    'use_realistic_fills': False,              # When True: sell at bid, buy back at ask
    
    # -------------------------------------------------------------------------
    # PROBABILISTIC EXIT FILLS
    # -------------------------------------------------------------------------
    'execution_seed': 42,                      # Random seed for reproducible fills
    'use_probabilistic_exit_fills': True,      # Enable probabilistic fill model
    
    # Fill probability buckets by spread quality
    'pfill_tight': 0.90,                       # spread <= 5%
    'pfill_normal': 0.70,                      # spread <= 10%
    'pfill_wide': 0.40,                        # spread > 10%
    
    # Spread thresholds for buckets
    'tight_spread_pct': 0.05,
    'normal_spread_pct': 0.10,
    
    # Scaling and clamping
    'pfill_scale': 0.6,                        # Multiplier applied to the bucket probability (e.g., 0.8, 1.0, 1.2)
    'pfill_min': 0.05,
    'pfill_max': 0.98,
    
    # Optional IVP penalty multipliers
    'pfill_ivp_high_mult': 0.85,
    'pfill_ivp_extreme_mult': 0.70,
    
    # -------------------------------------------------------------------------
    # CACHE
    # -------------------------------------------------------------------------
    'cache_dir': '../cache/',
}

# -------------------------------------------------------------------------
# DERIVED VALUES (computed from CONFIG)
# -------------------------------------------------------------------------
SYMBOL = CONFIG['symbol']
TZ = CONFIG['timezone']
CACHE_DIR = CONFIG['cache_dir']
os.makedirs(CACHE_DIR, exist_ok=True)

# Entry timestamp
ENTRY_DATE = pd.Timestamp(CONFIG['entry_date'], tz=TZ)
ENTRY_TIME = pd.Timestamp(f"{CONFIG['entry_date']} {CONFIG['entry_time']}", tz=TZ)

print("=" * 60)
print("BACKTEST CONFIGURATION")
print("=" * 60)
print(f"Symbol:          {SYMBOL}")
print(f"Entry Date:      {ENTRY_DATE.date()}")
print(f"Entry Time:      {CONFIG['entry_time']}")
print(f"Option Type:     {'Cash-Secured Put' if CONFIG['option_type'] == 'P' else 'Covered Call'}")
print(f"DTE Range:       {CONFIG['dte_min']} - {CONFIG['dte_max']} days")
print(f"Delta Range:     {CONFIG['delta_min']} - {CONFIG['delta_max']}")
print(f"Exit Target:     {CONFIG['exit_pct']*100:.0f}% of premium")
print(f"Stop Loss:       {CONFIG['stop_loss_multiplier']}x premium")
print(f"Fill Mode:       {CONFIG['fill_mode']}")
print(f"Realistic Fills: {CONFIG['use_realistic_fills']}")
print(f"Commission:      ${CONFIG['commission_per_contract']}/contract")
print("=" * 60)
print("\nNOTE: Transaction costs and realistic fills are NOT yet applied.")
print("      Run both notebooks to compare baseline vs realistic results.")

BACKTEST CONFIGURATION
Symbol:          TSLA
Entry Date:      2023-06-06
Entry Time:      15:45
Option Type:     Cash-Secured Put
DTE Range:       30 - 45 days
Delta Range:     0.25 - 0.35
Exit Target:     50% of premium
Stop Loss:       2.0x premium
Fill Mode:       mid
Realistic Fills: False
Commission:      $0.65/contract

NOTE: Transaction costs and realistic fills are NOT yet applied.
      Run both notebooks to compare baseline vs realistic results.
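
As a rough illustration of how `exit_pct` and `stop_loss_multiplier` translate into option prices, the sketch below computes the buy-back target and the stop level for a hypothetical premium. It assumes that `exit_pct` is the fraction of the entry premium at which the position is bought back (so 0.50 keeps half the premium) and that the stop triggers when the option trades at `stop_loss_multiplier` times the entry premium. The helper `exit_thresholds` and the $5.00 premium are illustrative and not part of the notebook.

```python
def exit_thresholds(entry_premium, exit_pct=0.50, stop_loss_multiplier=2.0):
    """Return (profit_target_price, stop_loss_price) per share for a short option.

    Assumption: the position is bought back once the option price falls to
    exit_pct * entry_premium, or rises to stop_loss_multiplier * entry_premium.
    """
    if entry_premium <= 0:
        raise ValueError("entry_premium must be positive")
    profit_target_price = entry_premium * exit_pct
    stop_loss_price = entry_premium * stop_loss_multiplier
    return profit_target_price, stop_loss_price


# Hypothetical short put sold for $5.00 per share ($500 per contract)
premium = 5.00
target, stop = exit_thresholds(premium)
profit_at_target = (premium - target) * 100   # dollars kept per contract, before fees
loss_at_stop = (stop - premium) * 100         # dollars lost per contract, before fees
print(f"target={target:.2f} stop={stop:.2f} "
      f"profit_at_target=${profit_at_target:.0f} loss_at_stop=${loss_at_stop:.0f}")
```

With these defaults the stop costs twice what the profit target earns, so the exit rules alone need a win rate above two in three to break even before fees.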


In [32]:
# =============================================================================
# COVERED CALL CONFIGURATION
# =============================================================================

CC_CONFIG = {
    # -------------------------------------------------------------------------
    # OPTION SELECTION CRITERIA
    # -------------------------------------------------------------------------
    'dte_min': 14,                             # Minimum days to expiration
    'dte_max': 30,                             # Maximum days to expiration
    'delta_min': 0.25,                         # Minimum absolute delta
    'delta_max': 0.35,                         # Maximum absolute delta
    'strike_min_pct_above_basis': 0.0,         # Allow ATM (0% above cost basis)
    'entry_time': '15:45',                     # Same snapshot time as CSP
    
    # -------------------------------------------------------------------------
    # BEHAVIORAL FLAGS (explicit intent documentation)
    # -------------------------------------------------------------------------
    'sell_call_only_if_price_above_basis': True,  # Require strike >= cost basis per share
    
    # -------------------------------------------------------------------------
    # TIE-BREAKING FOR CALL SELECTION
    # -------------------------------------------------------------------------
    # When multiple candidates match criteria, how to select:
    # Options: 'highest_premium', 'closest_delta', 'highest_strike'
    'tie_break_method': 'highest_premium',
}

print("=" * 60)
print("COVERED CALL CONFIGURATION")
print("=" * 60)
print(f"DTE Range:       {CC_CONFIG['dte_min']} - {CC_CONFIG['dte_max']} days")
print(f"Delta Range:     {CC_CONFIG['delta_min']} - {CC_CONFIG['delta_max']}")
print(f"Entry Time:      {CC_CONFIG['entry_time']}")
print(f"Strike >= Basis: {CC_CONFIG['sell_call_only_if_price_above_basis']}")
print(f"Tie-Breaking:    {CC_CONFIG['tie_break_method']}")
print("=" * 60)



COVERED CALL CONFIGURATION
DTE Range:       14 - 30 days
Delta Range:     0.25 - 0.35
Entry Time:      15:45
Strike >= Basis: True
Tie-Breaking:    highest_premium
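
The selection rules in `CC_CONFIG` could be applied roughly as follows. This is a minimal sketch, not the notebook's implementation: `select_covered_call`, the candidate dictionaries, and the $200 cost basis are hypothetical, and `closest_delta` is assumed to target the midpoint of the delta band.

```python
def select_covered_call(candidates, cost_basis, cc_config):
    """Pick one call from candidate dicts with keys: strike, delta, dte, premium."""
    lo, hi = cc_config["delta_min"], cc_config["delta_max"]
    min_strike = cost_basis * (1 + cc_config["strike_min_pct_above_basis"])

    eligible = [
        c for c in candidates
        if cc_config["dte_min"] <= c["dte"] <= cc_config["dte_max"]
        and lo <= abs(c["delta"]) <= hi
        and (not cc_config["sell_call_only_if_price_above_basis"]
             or c["strike"] >= min_strike)
    ]
    if not eligible:
        return None  # nothing qualifies, so no call is sold

    method = cc_config["tie_break_method"]
    if method == "highest_premium":
        return max(eligible, key=lambda c: c["premium"])
    if method == "highest_strike":
        return max(eligible, key=lambda c: c["strike"])
    if method == "closest_delta":
        mid_delta = (lo + hi) / 2  # assumption: aim for the middle of the band
        return min(eligible, key=lambda c: abs(abs(c["delta"]) - mid_delta))
    raise ValueError(f"Unknown tie_break_method: {method!r}")


# Values copied from CC_CONFIG above; the chain below is made up for illustration
example_cfg = {
    "dte_min": 14, "dte_max": 30, "delta_min": 0.25, "delta_max": 0.35,
    "strike_min_pct_above_basis": 0.0,
    "sell_call_only_if_price_above_basis": True,
    "tie_break_method": "highest_premium",
}
chain = [
    {"strike": 195.0, "delta": 0.34, "dte": 21, "premium": 6.10},  # below basis
    {"strike": 205.0, "delta": 0.33, "dte": 21, "premium": 5.40},
    {"strike": 210.0, "delta": 0.28, "dte": 21, "premium": 4.20},
    {"strike": 220.0, "delta": 0.18, "dte": 21, "premium": 2.00},  # delta too low
    {"strike": 207.5, "delta": 0.30, "dte": 45, "premium": 7.00},  # DTE too long
]
pick = select_covered_call(chain, cost_basis=200.0, cc_config=example_cfg)
print("Selected strike:", pick["strike"])
```

Returning `None` when nothing qualifies leaves the caller free to end the wheel without a covered call, which matches the `CSP_ASSIGNED → WHEEL_COMPLETE` path in the state machine.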


In [33]:

# =============================================================================
# WHEEL STATE MACHINE
# =============================================================================
# Explicit state transitions prevent logic spaghetti and make logs interpretable.
# WHEEL_COMPLETE is the single canonical terminal state for all paths.

import uuid

# Valid state transitions
VALID_TRANSITIONS = {
    'CSP_OPEN': ['CSP_CLOSED_PROFIT', 'CSP_CLOSED_STOP', 'CSP_ASSIGNED', 'CSP_CLOSED_WORTHLESS'],
    'CSP_ASSIGNED': ['CC_OPEN', 'WHEEL_COMPLETE'],  # Can sell CC or mark incomplete
    'CC_OPEN': ['CC_CLOSED_PROFIT', 'CC_ASSIGNED', 'CC_CLOSED_WORTHLESS'],
    'CC_ASSIGNED': ['WHEEL_COMPLETE'],
    'CC_CLOSED_PROFIT': ['WHEEL_COMPLETE'],      # v1: no re-entry after CC profit
    'CC_CLOSED_WORTHLESS': ['WHEEL_COMPLETE'],   # v1: no re-entry after CC expires
    'CSP_CLOSED_PROFIT': ['WHEEL_COMPLETE'],
    'CSP_CLOSED_STOP': ['WHEEL_COMPLETE'],
    'CSP_CLOSED_WORTHLESS': ['WHEEL_COMPLETE'],
    'WHEEL_COMPLETE': [],  # Terminal state - no further transitions
}

# Event to state mapping
EVENT_TO_STATE = {
    # CSP phase events
    ('CSP_OPEN', 'profit_target'): 'CSP_CLOSED_PROFIT',
    ('CSP_OPEN', 'stop_loss'): 'CSP_CLOSED_STOP',
    ('CSP_OPEN', 'assigned'): 'CSP_ASSIGNED',
    ('CSP_OPEN', 'expired_worthless'): 'CSP_CLOSED_WORTHLESS',
    
    # Assignment to CC
    ('CSP_ASSIGNED', 'sell_call'): 'CC_OPEN',
    
    # CC phase events
    ('CC_OPEN', 'profit_target'): 'CC_CLOSED_PROFIT',
    ('CC_OPEN', 'called_away'): 'CC_ASSIGNED',
    ('CC_OPEN', 'expired_worthless'): 'CC_CLOSED_WORTHLESS',
    
    # Terminal transitions (all paths lead to WHEEL_COMPLETE)
    ('CC_ASSIGNED', 'complete'): 'WHEEL_COMPLETE',
    ('CC_CLOSED_PROFIT', 'complete'): 'WHEEL_COMPLETE',
    ('CC_CLOSED_WORTHLESS', 'complete'): 'WHEEL_COMPLETE',
    ('CSP_CLOSED_PROFIT', 'complete'): 'WHEEL_COMPLETE',
    ('CSP_CLOSED_STOP', 'complete'): 'WHEEL_COMPLETE',
    ('CSP_CLOSED_WORTHLESS', 'complete'): 'WHEEL_COMPLETE',
    ('CSP_ASSIGNED', 'complete'): 'WHEEL_COMPLETE',  # For incomplete wheels (no CC processed)
}


def advance_wheel_state(current_state, event):
    """
    Advance wheel state based on event. Enforces valid transitions.
    
    Args:
        current_state: Current state string (e.g., 'CSP_OPEN', 'CC_OPEN')
        event: Event triggering transition (e.g., 'profit_target', 'assigned', 'called_away')
    
    Returns:
        New state string
    
    Raises:
        ValueError if transition is invalid
    
    Example:
        >>> advance_wheel_state('CSP_OPEN', 'assigned')
        'CSP_ASSIGNED'
        >>> advance_wheel_state('CC_OPEN', 'called_away')
        'CC_ASSIGNED'
    """
    key = (current_state, event)
    
    if key not in EVENT_TO_STATE:
        valid_events = [e for (s, e) in EVENT_TO_STATE.keys() if s == current_state]
        raise ValueError(
            f"Invalid transition: state='{current_state}' + event='{event}'. "
            f"Valid events from {current_state}: {valid_events}"
        )
    
    new_state = EVENT_TO_STATE[key]
    
    # Double-check against VALID_TRANSITIONS (belt and suspenders)
    if new_state not in VALID_TRANSITIONS.get(current_state, []):
        raise ValueError(
            f"State '{new_state}' not reachable from '{current_state}'. "
            f"Valid transitions: {VALID_TRANSITIONS.get(current_state, [])}"
        )
    
    return new_state


def generate_wheel_id():
    """Generate a unique wheel ID for linking CSP + CC phases."""
    return str(uuid.uuid4())[:8]


def get_phase_from_state(state):
    """
    Extract phase from state string.
    
    Returns:
        'csp': For CSP states (CSP_OPEN, CSP_CLOSED_*, CSP_ASSIGNED)
        'cc': For CC states (CC_OPEN, CC_CLOSED_*, CC_ASSIGNED)
        'total': For WHEEL_COMPLETE (used in wheel summaries)
        'unknown': For unrecognized states
    """
    if state.startswith('CSP'):
        return 'csp'
    elif state.startswith('CC'):
        return 'cc'
    elif state == 'WHEEL_COMPLETE':
        return 'total'  # Consistent with wheel summary phase
    else:
        return 'unknown'


def is_terminal_state(state):
    """Check if state is a known terminal state (no further transitions possible)."""
    # Unknown states are not treated as terminal; WHEEL_COMPLETE is the only state with no exits.
    return state in VALID_TRANSITIONS and not VALID_TRANSITIONS[state]


# Test the state machine
print("=" * 60)
print("STATE MACHINE VALIDATION")
print("=" * 60)

# Test valid transitions
test_cases = [
    ('CSP_OPEN', 'profit_target', 'CSP_CLOSED_PROFIT'),
    ('CSP_OPEN', 'assigned', 'CSP_ASSIGNED'),
    ('CSP_ASSIGNED', 'sell_call', 'CC_OPEN'),
    ('CSP_ASSIGNED', 'complete', 'WHEEL_COMPLETE'),  # For incomplete wheels
    ('CC_OPEN', 'called_away', 'CC_ASSIGNED'),
    ('CC_ASSIGNED', 'complete', 'WHEEL_COMPLETE'),
]

for current, event, expected in test_cases:
    result = advance_wheel_state(current, event)
    status = "✓" if result == expected else "✗"
    print(f"  {status} {current} + '{event}' → {result}")

# Test invalid transition (should raise)
try:
    advance_wheel_state('CSP_OPEN', 'called_away')  # Invalid: called_away only valid for CC_OPEN
    print("  ✗ Should have raised ValueError for invalid transition")
except ValueError:
    print("  ✓ Correctly rejected invalid transition: CSP_OPEN + 'called_away'")

print("=" * 60)


STATE MACHINE VALIDATION
  ✓ CSP_OPEN + 'profit_target' → CSP_CLOSED_PROFIT
  ✓ CSP_OPEN + 'assigned' → CSP_ASSIGNED
  ✓ CSP_ASSIGNED + 'sell_call' → CC_OPEN
  ✓ CSP_ASSIGNED + 'complete' → WHEEL_COMPLETE
  ✓ CC_OPEN + 'called_away' → CC_ASSIGNED
  ✓ CC_ASSIGNED + 'complete' → WHEEL_COMPLETE
  ✓ Correctly rejected invalid transition: CSP_OPEN + 'called_away'
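
The spot checks above exercise individual transitions. A complementary structural check, sketched here under the assumption that the transition table is exactly the one defined in this cell (it is copied so the snippet runs on its own), confirms that every state can eventually reach `WHEEL_COMPLETE` and that every target state has its own entry. The helper `can_reach` is hypothetical.

```python
from collections import deque

# Copy of VALID_TRANSITIONS from the cell above
transitions = {
    'CSP_OPEN': ['CSP_CLOSED_PROFIT', 'CSP_CLOSED_STOP', 'CSP_ASSIGNED', 'CSP_CLOSED_WORTHLESS'],
    'CSP_ASSIGNED': ['CC_OPEN', 'WHEEL_COMPLETE'],
    'CC_OPEN': ['CC_CLOSED_PROFIT', 'CC_ASSIGNED', 'CC_CLOSED_WORTHLESS'],
    'CC_ASSIGNED': ['WHEEL_COMPLETE'],
    'CC_CLOSED_PROFIT': ['WHEEL_COMPLETE'],
    'CC_CLOSED_WORTHLESS': ['WHEEL_COMPLETE'],
    'CSP_CLOSED_PROFIT': ['WHEEL_COMPLETE'],
    'CSP_CLOSED_STOP': ['WHEEL_COMPLETE'],
    'CSP_CLOSED_WORTHLESS': ['WHEEL_COMPLETE'],
    'WHEEL_COMPLETE': [],
}


def can_reach(table, start, goal):
    """Breadth-first search: is `goal` reachable from `start`?"""
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        for nxt in table.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# States from which the wheel could never finish
stuck = [s for s in transitions if not can_reach(transitions, s, 'WHEEL_COMPLETE')]
# Target states that are referenced but have no entry of their own
undefined = {t for targets in transitions.values() for t in targets} - set(transitions)

print("States that cannot reach WHEEL_COMPLETE:", stuck)
print("Targets without their own entry:", sorted(undefined))
```

Running a check like this after editing the table (for example, when adding rolling or re-entry states) catches dead ends before a backtest does.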


In [34]:
# =============================================================================
# HELPER FUNCTIONS FOR REALISTIC EXECUTION
# =============================================================================

def get_entry_price(row, fill_mode='realistic', penalty=1.0):
    """
    Calculate entry price when SELLING a put (we receive premium).
    Higher price = better for us.
    
    Slippage is calculated as a percentage of the bid-ask spread from mid.
    Penalty multiplier widens the effective spread for illiquid options.
    
    | Scenario    | Formula                              | Interpretation              |
    |-------------|--------------------------------------|-----------------------------|
    | pessimistic | mid - 75% of (spread * penalty)      | Forced/stressed execution   |
    | realistic   | mid - 30% of (spread * penalty)      | Normal retail execution     |
    | optimistic  | mid                                  | Patient, favorable fills    |
    
    Args:
        row: DataFrame row with bid_px_00, ask_px_00
        fill_mode: 'optimistic', 'realistic', or 'pessimistic'
        penalty: liquidity penalty multiplier (1.0 = no extra slippage)
    """
    bid = row['bid_px_00']
    ask = row['ask_px_00']
    mid = (bid + ask) / 2
    spread = ask - bid
    
    # Apply liquidity penalty to effective spread
    effective_spread = spread * penalty
    
    if fill_mode == 'optimistic':
        return mid                              # Best case - get mid (no penalty applied)
    elif fill_mode == 'pessimistic':
        fill = mid - (0.75 * effective_spread)  # Worst case - 75% toward bid
    else:  # realistic
        fill = mid - (0.30 * effective_spread)  # Normal - 30% toward bid
    
    # Clamp to [bid, ask] to stay realistic
    return max(bid, min(ask, fill))


# TODO: confirm whether penalty should default to 1.0 (no liquidity penalty) on exit.
def get_exit_price(daily_row, fill_mode=CONFIG['fill_mode'], target_price=None, penalty=1.0):
    """
    Calculate exit price when BUYING BACK a put (we pay to close).
    Lower price = better for us.
    
    For daily OHLCV data, we estimate spread behavior from the day's range.
    Penalty multiplier widens the effective range for illiquid options.
    
    | Scenario    | Formula                              | Interpretation              |
    |-------------|--------------------------------------|-----------------------------|
    | pessimistic | close + 75% of (range * penalty)     | Forced/stressed execution   |
    | realistic   | close + 30% of (range * penalty)     | Normal retail execution     |
    | optimistic  | close - 25% of (range * penalty)     | Patient, favorable fills    |
    
    Args:
        daily_row: DataFrame row with close, high, low
        fill_mode: 'optimistic', 'realistic', or 'pessimistic'
        target_price: Optional target price (not currently used but reserved)
        penalty: liquidity penalty multiplier (1.0 = no extra slippage)
    """
    close = daily_row['close']
    high = daily_row['high']
    low = daily_row['low']
    day_range = high - low  # Proxy for intraday spread/volatility
    
    # Apply liquidity penalty to effective range
    effective_range = day_range * penalty
    
    if fill_mode == 'optimistic':
        # Patient buyer - gets below close (toward low)
        fill = close - (0.25 * effective_range)
        return max(low, fill)
    elif fill_mode == 'pessimistic':
        # Forced buyer - pays above close (toward high)
        fill = close + (0.75 * effective_range)
        return min(high, fill)
    else:  # realistic
        # Normal execution - slight slippage above close
        fill = close + (0.30 * effective_range)
        return min(high, fill)


def get_transaction_costs(config, is_round_trip=True):
    """
    Calculate total transaction costs per contract.
    
    Args:
        config: CONFIG dict with commission and fee rates
        is_round_trip: True if both entry and exit, False if entry only (e.g., expired worthless)
    
    Returns:
        Total fees in dollars per contract
    """
    per_leg = config['commission_per_contract'] + config['sec_fee_per_contract']
    return per_leg * 2 if is_round_trip else per_leg


def compute_allowed_spread(row, config):
    """
    Compute the allowed spread percentage for a single option based on regime.
    
    Regime factors:
    - IV percentile (high vol → allow wider spreads)
    - DTE (short-dated → allow wider spreads)
    
    Returns: allowed_spread_pct for this option
    """
    base = config['base_max_spread_pct']
    
    # IV regime adjustment
    ivp = row.get('ivp', 0.5)  # Default to median if not computed
    if ivp >= config['ivp_extreme_threshold']:
        base = config['ivp_extreme_max_spread_pct']
    elif ivp >= config['ivp_high_threshold']:
        base = config['ivp_high_max_spread_pct']
    
    # DTE adjustment
    dte = row.get('dte', 30)
    if dte <= config['short_dte_threshold']:
        base += config['short_dte_extra_spread_pct']
    
    return base


def compute_liquidity_penalty(spread_pct, allowed_spread_pct, hard_max_spread_pct):
    """
    Compute liquidity penalty multiplier based on spread quality.
    
    Tiers:
    - tight:    spread <= 0.6 * allowed → penalty = 1.0 (no extra slippage)
    - moderate: spread <= allowed       → penalty = 1.15
    - wide:     spread <= hard_max      → penalty = 1.35
    - ugly:     spread > hard_max       → tier 'reject', penalty None
    
    Returns: (tier_name, penalty_multiplier), or ('reject', None) if rejected
    """
    if spread_pct > hard_max_spread_pct:
        return 'reject', None
    
    tight_threshold = 0.6 * allowed_spread_pct
    
    if spread_pct <= tight_threshold:
        return 'tight', 1.0
    elif spread_pct <= allowed_spread_pct:
        return 'moderate', 1.15
    else:  # spread_pct <= hard_max_spread_pct
        return 'wide', 1.35


def apply_liquidity_model(df, config):
    """
    Apply regime-aware liquidity model with penalty tiers.
    
    Instead of binary reject, this:
    1. Computes IV percentile (ivp) for regime detection
    2. Computes allowed_spread_pct per option (regime-aware)
    3. Assigns liquidity_tier and liquidity_penalty
    4. Only hard-rejects truly ugly spreads
    
    Args:
        df: DataFrame with option quotes (needs bid_px_00, ask_px_00, spread_pct, iv, dte)
        config: CONFIG dict with liquidity model settings
    
    Returns:
        DataFrame with liquidity columns added, ugly spreads removed
    """
    if len(df) == 0:
        return df
    
    df = df.copy()
    original_count = len(df)
    
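    # Ensure required columns exist (derive mid from the quote if it is missing)
    if 'spread_pct' not in df.columns:
        if 'mid' not in df.columns:
            df['mid'] = (df['bid_px_00'] + df['ask_px_00']) / 2
        df['spread'] = df['ask_px_00'] - df['bid_px_00']
        df['spread_pct'] = df['spread'] / df['mid']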
    
    # Step 1: Compute IV percentile (cross-sectional within this snapshot)
    if 'iv' in df.columns:
        df['ivp'] = df['iv'].rank(pct=True)
    else:
        df['ivp'] = 0.5  # Default to median if IV not available
    
    # Step 2: Compute allowed spread per option
    df['allowed_spread_pct'] = df.apply(
        lambda row: compute_allowed_spread(row, config), axis=1
    )
    
    # Step 3: Compute liquidity tier and penalty
    def get_tier_and_penalty(row):
        return compute_liquidity_penalty(
            row['spread_pct'], 
            row['allowed_spread_pct'],
            config['hard_max_spread_pct']
        )
    
    tiers_penalties = df.apply(get_tier_and_penalty, axis=1)
    df['liquidity_tier'] = tiers_penalties.apply(lambda x: x[0])
    df['liquidity_penalty'] = tiers_penalties.apply(lambda x: x[1])
    
    # Step 4: Hard reject only truly ugly spreads and penny options
    df = df[
        (df['liquidity_tier'] != 'reject') &
        (df['bid_px_00'] >= config['min_bid_hard'])
    ].copy()
    
    rejected = original_count - len(df)
    
    # Print diagnostics
    print(f"\n  Liquidity Model Applied:")
    print(f"    Original: {original_count} options")
    print(f"    Hard rejected: {rejected} ({rejected/original_count*100:.1f}%)")
    print(f"    Remaining: {len(df)} options")
    
    if len(df) > 0:
        tier_counts = df['liquidity_tier'].value_counts()
        print(f"    Tier breakdown: {dict(tier_counts)}")
        print(f"    Avg spread: {df['spread_pct'].mean()*100:.1f}%, Avg allowed: {df['allowed_spread_pct'].mean()*100:.1f}%")
        print(f"    Avg penalty: {df['liquidity_penalty'].mean():.2f}x")
    return df


def calculate_pnl(premium_received, exit_price_paid, fees, cost_basis):
    """
    Calculate P&L metrics for a trade.
    
    Args:
        premium_received: Premium collected when selling (contract value)
        exit_price_paid: Price paid to close position (contract value), 0 if expired worthless
        fees: Total transaction costs
        cost_basis: Capital at risk (strike * 100 for CSP)
    
    Returns:
        dict with pnl, pnl_pct (percent of premium), roc (percent of cost basis), fees
    """
    pnl = premium_received - exit_price_paid - fees
    pnl_pct = (pnl / premium_received) * 100 if premium_received > 0 else 0
    roc = (pnl / cost_basis) * 100 if cost_basis > 0 else 0
    
    return {
        'pnl': pnl,
        'pnl_pct': pnl_pct,
        'roc': roc,
        'fees': fees
    }


def compute_p_fill_profit(row, config):
    """
    Compute probability of fill for profit target exit based on entry liquidity.
    
    Uses entry-time spread and IVP to determine fill probability:
    - tight spread (<=5%): high fill probability
    - normal spread (<=10%): moderate fill probability
    - wide spread (>10%): low fill probability
    
    Applies IVP penalty multipliers for high-volatility regimes.
    
    Args:
        row: DataFrame row with spread_pct_entry, ivp_entry
        config: CONFIG dict with fill probability settings
    
    Returns:
        float: Fill probability in [pfill_min, pfill_max]
    """
    spread_pct = row.get('spread_pct_entry', 0.05)
    
    # Bucket by spread quality
    if spread_pct <= config['tight_spread_pct']:
        p_fill = config['pfill_tight']
    elif spread_pct <= config['normal_spread_pct']:
        p_fill = config['pfill_normal']
    else:
        p_fill = config['pfill_wide']
    
    # Apply scale multiplier
    p_fill *= config['pfill_scale']
    
    # Apply IVP penalty (always use ivp_entry, not generic ivp)
    ivp = row.get('ivp_entry', 0.5)
    if ivp >= config['ivp_extreme_threshold']:
        p_fill *= config.get('pfill_ivp_extreme_mult', 1.0)
    elif ivp >= config['ivp_high_threshold']:
        p_fill *= config.get('pfill_ivp_high_mult', 1.0)
    
    # Clamp to valid range
    return max(config['pfill_min'], min(config['pfill_max'], p_fill))


def try_probabilistic_fill(p_fill, rng):
    """
    Simulate probabilistic fill by drawing uniform random number.
    
    Args:
        p_fill: Fill probability (0.0 to 1.0)
        rng: numpy random number generator
    
    Returns:
        tuple: (filled: bool, u: float) where u is the random draw
    """
    u = rng.uniform(0, 1)
    filled = (u <= p_fill)
    return filled, u


# Print summary of fill assumptions
print("=" * 60)
print("FILL ASSUMPTIONS BY SCENARIO")
print("=" * 60)
print(f"{'Scenario':<12} {'Entry (Sell)':<25} {'Exit (Buy Back)':<25}")
print("-" * 60)
print(f"{'Pessimistic':<12} {'Mid - 75% of spread':<25} {'Close + 75% of range':<25}")
print(f"{'Realistic':<12} {'Mid - 30% of spread':<25} {'Close + 30% of range':<25}")
print(f"{'Optimistic':<12} {'Mid (no slippage)':<25} {'Close - 25% of range':<25}")
print("=" * 60)
print(f"\nTransaction costs: ${CONFIG['commission_per_contract'] + CONFIG['sec_fee_per_contract']:.2f}/leg")
print(f"\nLiquidity Model (regime-aware):")
print(f"  Hard reject: bid < ${CONFIG['min_bid_hard']} or spread > {CONFIG['hard_max_spread_pct']*100:.0f}%")
print(f"  Base target spread: {CONFIG['base_max_spread_pct']*100:.0f}%")
print(f"  High IV ({CONFIG['ivp_high_threshold']*100:.0f}%ile): allow {CONFIG['ivp_high_max_spread_pct']*100:.0f}%")
print(f"  Extreme IV ({CONFIG['ivp_extreme_threshold']*100:.0f}%ile): allow {CONFIG['ivp_extreme_max_spread_pct']*100:.0f}%")
print(f"  Short DTE (≤{CONFIG['short_dte_threshold']}d): +{CONFIG['short_dte_extra_spread_pct']*100:.0f}% allowed")


FILL ASSUMPTIONS BY SCENARIO
Scenario     Entry (Sell)              Exit (Buy Back)          
------------------------------------------------------------
Pessimistic  Mid - 75% of spread       Close + 75% of range     
Realistic    Mid - 30% of spread       Close + 30% of range     
Optimistic   Mid (no slippage)         Close - 25% of range     

Transaction costs: $0.66/leg

Liquidity Model (regime-aware):
  Hard reject: bid < $0.1 or spread > 20%
  Base target spread: 8%
  High IV (70%ile): allow 12%
  Extreme IV (90%ile): allow 15%
  Short DTE (≤7d): +2% allowed
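
To make the fill assumptions concrete, here is a worked example that traces one hypothetical quote through the liquidity tier, the 'realistic' entry fill, the round-trip fees, and the profit-target fill probability. The quote (4.85 bid / 5.15 ask) and the 2.50 buy-back price are made up, the thresholds are the `CONFIG` values shown above, and `random.Random` stands in for the notebook's NumPy generator so that the snippet needs only the standard library.

```python
import random

# Hypothetical entry quote (not taken from the notebook's data)
bid, ask = 4.85, 5.15
mid = (bid + ask) / 2
spread = ask - bid
spread_pct = spread / mid                       # about 6% of mid

# Liquidity tier in the base regime: allowed spread 8%, hard max 20%
allowed, hard_max = 0.08, 0.20
if spread_pct > hard_max:
    raise SystemExit("quote rejected: spread above hard maximum")
elif spread_pct <= 0.6 * allowed:
    tier, penalty = "tight", 1.0
elif spread_pct <= allowed:
    tier, penalty = "moderate", 1.15
else:
    tier, penalty = "wide", 1.35

# 'realistic' entry: 30% of the penalised spread below mid, clamped to [bid, ask]
entry_fill = max(bid, min(ask, mid - 0.30 * spread * penalty))

# Round-trip fees per contract: two legs of commission plus SEC/TAF fee
fees = 2 * (0.65 + 0.01)

# P&L per contract if the put is later bought back at a hypothetical 2.50
pnl = (entry_fill - 2.50) * 100 - fees

# Profit-target fill probability: 'normal' bucket (spread <= 10%) times pfill_scale,
# clamped to [pfill_min, pfill_max]; no IVP multiplier is applied in this example
p_fill = max(0.05, min(0.98, 0.70 * 0.6))

# Seeded draw, mirroring try_probabilistic_fill (filled when u <= p_fill)
rng = random.Random(42)
u = rng.random()
filled = u <= p_fill

print(f"tier={tier} penalty={penalty} entry_fill={entry_fill:.4f}")
print(f"fees=${fees:.2f} pnl=${pnl:.2f} p_fill={p_fill:.2f} filled={filled}")
```

Note how the 0.6 scale factor pulls the 'normal' bucket from 0.70 down to 0.42, so under these settings a profit-target exit on a given day is more likely to miss than to fill.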


### Import Daily Equity Data For a Single Symbol

In [35]:
# Use CONFIG values (CACHE_DIR, SYMBOL, TZ already defined in CONFIG cell)
dataset = "EQUS.MINI"     # consolidated US equities (best choice)
schema = "ohlcv-1d"       # DAILY bars

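# Calculate date range for historical data.
# NOTE: the window is anchored to today's date, not CONFIG['entry_date'], and
# lookback_days is applied as calendar days.
# Use a 2-day buffer to avoid requesting data that isn't yet available in the API.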
end = pd.Timestamp.utcnow().normalize() - pd.Timedelta(days=2)
start = end - pd.Timedelta(days=CONFIG['lookback_days'])

# Generate cache filename
start_str = start.strftime('%Y%m%d')
end_str = end.strftime('%Y%m%d')
cache_file = os.path.join(CACHE_DIR, f"equity_daily_{SYMBOL}_{start_str}_{end_str}.parquet")

# Check cache first
if os.path.exists(cache_file):
    print(f"[CACHE HIT] Loading daily equity data for {SYMBOL} from cache")
    data = pd.read_parquet(cache_file)
    print(f"  Loaded {len(data)} days of data")
else:
    print(f"[API] Fetching daily equity data for {SYMBOL} from {start.date()} to {end.date()}...")
    data = client.timeseries.get_range(
        dataset=dataset,
        symbols=SYMBOL,
        schema=schema,
        stype_in="raw_symbol",
        start=start,
        end=end,
    )
    # Convert to DataFrame and save to cache
    data = data.to_df(tz=TZ)
    data.to_parquet(cache_file)
    print(f"[CACHE SAVE] Saved {len(data)} days to cache")




[CACHE HIT] Loading daily equity data for TSLA from cache
  Loaded 347 days of data
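
The row count is lower than "two years of trading days" because `lookback_days` is passed to `pd.Timedelta(days=...)`, which counts calendar days. The sketch below, using only the standard library, counts the weekdays in a 504-calendar-day window. The end date is hypothetical, `weekdays_between` is an illustrative helper, and exchange holidays are deliberately not removed.

```python
from datetime import date, timedelta


def weekdays_between(start, end):
    """Count Monday-Friday dates in [start, end). Exchange holidays are NOT excluded."""
    n_days = (end - start).days
    return sum(
        1 for i in range(n_days)
        if (start + timedelta(days=i)).weekday() < 5
    )


lookback_days = 252 * 2                 # same expression as CONFIG['lookback_days']
end = date(2025, 12, 22)                # hypothetical end of the window
start = end - timedelta(days=lookback_days)

n_weekdays = weekdays_between(start, end)
print(f"{lookback_days} calendar days contain {n_weekdays} weekdays")
```

Because 504 days is exactly 72 weeks, the window always contains 360 weekdays regardless of the end date. Removing market holidays brings that figure down further, which is roughly consistent with the 347 rows loaded above. A window of about 504 trading days would need roughly 730 calendar days instead.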


In [36]:
# data is already a DataFrame from cache or API fetch
equity_data = data
equity_data.head()

ts_event,rtype,publisher_id,instrument_id,open,high,low,close,volume,symbol
2024-08-05 20:00:00-04:00,35,95,16244,201.71,203.49,192.7,195.1,2456159,TSLA
2024-08-06 20:00:00-04:00,35,95,16244,201.97,203.48,188.51,189.17,2174677,TSLA
2024-08-07 20:00:00-04:00,35,95,16244,190.42,200.75,189.72,200.58,1998501,TSLA
2024-08-08 20:00:00-04:00,35,95,16244,201.0,201.01,195.13,200.1,1717042,TSLA
2024-08-11 20:00:00-04:00,35,95,16244,200.09,200.64,194.68,197.7,1804376,TSLA


### Equity Technical Filter

In [37]:

import pandas as pd

entry_technical_filter = equity_data.copy().sort_index()

# Bollinger Bands parameters
window = 20
k = 2.0  # 2-sigma Bollinger Bands

# Calculate rolling statistics on close price
roll = entry_technical_filter["close"].rolling(window=window, min_periods=window)
entry_technical_filter["sma20"] = roll.mean()
entry_technical_filter["std20"] = roll.std(ddof=0)

# Calculate Bollinger Bands
entry_technical_filter["bb_upper"] = entry_technical_filter["sma20"] + k * entry_technical_filter["std20"]
entry_technical_filter["bb_lower"] = entry_technical_filter["sma20"] - k * entry_technical_filter["std20"]

# Optional: Bollinger %B (position within bands)
entry_technical_filter["bb_pctb"] = (
    (entry_technical_filter["close"] - entry_technical_filter["bb_lower"]) / 
    (entry_technical_filter["bb_upper"] - entry_technical_filter["bb_lower"])
)

# Optional: Bollinger Bandwidth (width of bands relative to SMA)
entry_technical_filter["bb_bandwidth"] = (
    (entry_technical_filter["bb_upper"] - entry_technical_filter["bb_lower"]) / 
    entry_technical_filter["sma20"]
)

entry_technical_filter.dropna().head()

ts_event,rtype,publisher_id,instrument_id,open,high,low,close,volume,symbol,sma20,std20,bb_upper,bb_lower,bb_pctb,bb_bandwidth
2024-09-02 20:00:00-04:00,35,95,16244,215.8,219.9,209.06,209.06,1821502,TSLA,208.6555,9.352415,227.36033,189.95067,0.510813,0.179289
2024-09-03 20:00:00-04:00,35,95,16244,208.8,222.22,207.6,218.6,1930709,TSLA,209.8305,9.046774,227.924047,191.736953,0.742338,0.172459
2024-09-04 20:00:00-04:00,35,95,16244,223.93,235.0,222.25,229.67,2590599,TSLA,211.8555,8.72244,229.300381,194.410619,1.010594,0.164687
2024-09-05 20:00:00-04:00,35,95,16244,227.48,234.64,209.75,211.49,2733578,TSLA,212.401,8.33266,229.066321,195.735679,0.472668,0.156923
2024-09-08 20:00:00-04:00,35,95,16244,214.53,219.87,213.66,217.2,1502230,TSLA,213.256,7.892274,229.040549,197.471451,0.624932,0.148034
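
As a sanity check on the band arithmetic above, here is a minimal standard-library sketch on a made-up price series. The `bollinger` helper and the toy closes are illustrative assumptions, not part of the notebook. It uses the population standard deviation, which is what `rolling.std(ddof=0)` computes in the cell above.

```python
from statistics import fmean, pstdev


def bollinger(closes, window=20, k=2.0):
    """Return (sma, upper, lower, pct_b, bandwidth) for the last `window` closes.

    Hypothetical helper for illustration; mirrors the rolling formulas above.
    """
    if len(closes) < window:
        raise ValueError("need at least `window` closes")
    tail = closes[-window:]
    sma = fmean(tail)
    std = pstdev(tail)  # population std, same convention as ddof=0
    upper = sma + k * std
    lower = sma - k * std
    width = upper - lower
    # %B: where the latest close sits inside the band (0 = lower, 1 = upper)
    pct_b = (closes[-1] - lower) / width if width > 0 else float("nan")
    # Bandwidth: band width relative to the moving average
    bandwidth = width / sma
    return sma, upper, lower, pct_b, bandwidth


closes = [100.0, 102.0, 98.0, 100.0]  # toy data, not market data
sma, upper, lower, pct_b, bandwidth = bollinger(closes, window=4, k=2.0)
print(sma, upper, lower, pct_b, bandwidth)
```

A `%B` above 1 means the close is above the upper band. The 2024-09-04 row in the table above (`bb_pctb` of 1.010594) is such a case.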


### Bollinger Band Entry Signal

In [38]:
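# Entry signal: allow entries only when the close is at or below the upper Bollinger Band
df_equity_entry = entry_technical_filter[['close', 'sma20', 'bb_upper']].dropna().copy()
df_equity_entry['bb_entry'] = df_equity_entry['close'] <= df_equity_entry['bb_upper']
print(df_equity_entry['bb_entry'].value_counts())  # print explicitly; a bare expression mid-cell is discarded
df_equity_entry.head()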

ts_event,close,sma20,bb_upper,bb_entry
2024-09-02 20:00:00-04:00,209.06,208.6555,227.36033,True
2024-09-03 20:00:00-04:00,218.6,209.8305,227.924047,True
2024-09-04 20:00:00-04:00,229.67,211.8555,229.300381,False
2024-09-05 20:00:00-04:00,211.49,212.401,229.066321,True
2024-09-08 20:00:00-04:00,217.2,213.256,229.040549,True


### Get Options Data For Dates that Pass Technical Filter

In [39]:
# Options data settings (uses CONFIG values)
dataset = "OPRA.PILLAR"
schema = "cmbp-1"

# Use entry time from CONFIG
start = ENTRY_TIME
end = start + pd.Timedelta(minutes=1)

# Generate cache filename for options data
date_str = start.strftime('%Y%m%d')
time_str = start.strftime('%H%M')
cache_file = os.path.join(CACHE_DIR, f"options_{SYMBOL}_{date_str}_{time_str}.parquet")

# Check cache first
if os.path.exists(cache_file):
    print(f"[CACHE HIT] Loading options data for {SYMBOL} on {start.date()} at {start.time()}")
    df_opts = pd.read_parquet(cache_file)
    print(f"  Loaded {len(df_opts)} option quotes")
else:
    print(f"[API] Fetching options for {SYMBOL} on {start.date()} at {start.time()}...")
    data = client.timeseries.get_range(
        dataset=dataset,
        schema=schema,
        symbols=f"{SYMBOL}.OPT",     # ✅ parent symbology format
        stype_in="parent",           # ✅ parent lookup
        start=start,
        end=end,
    )
    
    df_opts = data.to_df(tz=TZ).sort_values("ts_event")
    
    # Save to cache
    df_opts.to_parquet(cache_file)
    print(f"[CACHE SAVE] Saved {len(df_opts)} option quotes to cache")

df_opts.head()


[CACHE HIT] Loading options data for TSLA on 2023-06-06 at 15:45:00
  Loaded 363798 option quotes


ts_recv,ts_event,rtype,publisher_id,instrument_id,action,side,price,size,flags,ts_in_delta,bid_px_00,ask_px_00,bid_sz_00,ask_sz_00,bid_pb_00,ask_pb_00,symbol
2023-06-06 15:45:00.000069556-04:00,2023-06-06 15:44:59.999864576-04:00,177,30,738198022,A,A,5.05,84,194,0,5.0,5.05,150,84,0,0,TSLA 230616P00215000
2023-06-06 15:45:00.000074115-04:00,2023-06-06 15:44:59.999868928-04:00,177,30,687866628,A,B,19.05,176,194,0,19.05,19.25,176,142,0,0,TSLA 240315C00300000
2023-06-06 15:45:00.000095165-04:00,2023-06-06 15:44:59.999889664-04:00,177,30,738198042,A,A,5.75,15,194,0,5.65,5.75,355,15,0,0,TSLA 230609P00222500
2023-06-06 15:45:00.000095264-04:00,2023-06-06 15:44:59.999890176-04:00,177,30,704643949,A,B,28.2,21,194,0,28.2,28.4,21,118,0,0,TSLA 230616C00192500
2023-06-06 15:45:00.000098559-04:00,2023-06-06 15:44:59.999894016-04:00,177,30,687866390,A,B,11.25,117,194,0,11.25,11.3,117,24,0,0,TSLA 230721C00235000


In [40]:
sym = df_opts["symbol"]

# Split ROOT and OPRA code (e.g. "AAPL" and "240119P00205000")
root_and_code = sym.str.split(expand=True)
df_opts["root"] = root_and_code[0]
code = root_and_code[1]

# Expiration: YYMMDD in positions 0–5
df_opts["expiration"] = pd.to_datetime(code.str[:6], format="%y%m%d")

# Call/Put flag: single char at position 6
df_opts["call_put"] = code.str[6]

# Strike: remaining digits, usually in 1/1000 dollars
# Example: "00205000" -> 205.000
strike_int = code.str[7:].astype("int32")
df_opts["strike"] = strike_int / 1000.0

# Calculate DTE (Days to Expiry)
# Localize expiration to match ts_event timezone, then normalize both to midnight
expiration_tz = df_opts["expiration"].dt.tz_localize(df_opts["ts_event"].dt.tz)
df_opts["dte"] = (expiration_tz - df_opts["ts_event"].dt.normalize()).dt.days
print(f'df shape: {df_opts.shape}')
df_opts.head()



df shape: (363798, 22)


ts_recv,ts_event,rtype,publisher_id,instrument_id,action,side,price,size,flags,ts_in_delta,bid_px_00,ask_px_00,bid_sz_00,ask_sz_00,bid_pb_00,ask_pb_00,symbol,root,expiration,call_put,strike,dte
2023-06-06 15:45:00.000069556-04:00,2023-06-06 15:44:59.999864576-04:00,177,30,738198022,A,A,5.05,84,194,0,5.0,5.05,150,84,0,0,TSLA 230616P00215000,TSLA,2023-06-16,P,215.0,10
2023-06-06 15:45:00.000074115-04:00,2023-06-06 15:44:59.999868928-04:00,177,30,687866628,A,B,19.05,176,194,0,19.05,19.25,176,142,0,0,TSLA 240315C00300000,TSLA,2024-03-15,C,300.0,283
2023-06-06 15:45:00.000095165-04:00,2023-06-06 15:44:59.999889664-04:00,177,30,738198042,A,A,5.75,15,194,0,5.65,5.75,355,15,0,0,TSLA 230609P00222500,TSLA,2023-06-09,P,222.5,3
2023-06-06 15:45:00.000095264-04:00,2023-06-06 15:44:59.999890176-04:00,177,30,704643949,A,B,28.2,21,194,0,28.2,28.4,21,118,0,0,TSLA 230616C00192500,TSLA,2023-06-16,C,192.5,10
2023-06-06 15:45:00.000098559-04:00,2023-06-06 15:44:59.999894016-04:00,177,30,687866390,A,B,11.25,117,194,0,11.25,11.3,117,24,0,0,TSLA 230721C00235000,TSLA,2023-07-21,C,235.0,45
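
The string slicing above can be wrapped into a small standalone parser. This is a sketch under the assumption that every symbol follows the layout seen in the output: root, whitespace, `YYMMDD`, `C` or `P`, then an 8-digit strike in thousandths of a dollar. `parse_option_symbol` and `days_to_expiry` are hypothetical helpers, not Databento functions.

```python
from datetime import date, datetime
from typing import NamedTuple


class ParsedOption(NamedTuple):
    root: str
    expiration: date
    call_put: str
    strike: float


def parse_option_symbol(symbol: str) -> ParsedOption:
    """Split a symbol such as 'TSLA 230616P00215000' into its components."""
    root, code = symbol.split()  # e.g. "TSLA", "230616P00215000"
    if len(code) != 15 or code[6] not in ("C", "P"):
        raise ValueError(f"unexpected option code: {code!r}")
    expiration = datetime.strptime(code[:6], "%y%m%d").date()
    strike = int(code[7:]) / 1000.0  # "00215000" -> 215.0
    return ParsedOption(root, expiration, code[6], strike)


def days_to_expiry(expiration: date, as_of: date) -> int:
    """Calendar days between the quote date and expiration."""
    return (expiration - as_of).days


opt = parse_option_symbol("TSLA 230616P00215000")
print(opt, days_to_expiry(opt.expiration, date(2023, 6, 6)))
```

Validating the code length and the call/put flag up front makes a malformed symbol fail loudly instead of silently producing a wrong strike.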


In [41]:
# Filter options using CONFIG values
df_opts = df_opts[
    (df_opts['dte'] >= CONFIG['dte_min']) & 
    (df_opts['dte'] <= CONFIG['dte_max']) & 
    (df_opts['call_put'] == CONFIG['option_type'])
].sort_values(['dte', 'strike'])
print(f'df shape: {df_opts.shape}')
df_opts.head()


df shape: (20043, 22)


ts_recv,ts_event,rtype,publisher_id,instrument_id,action,side,price,size,flags,ts_in_delta,bid_px_00,ask_px_00,bid_sz_00,ask_sz_00,bid_pb_00,ask_pb_00,symbol,root,expiration,call_put,strike,dte
2023-06-06 15:45:05.878516698-04:00,2023-06-06 15:45:05.878309376-04:00,177,30,721420985,A,A,0.01,188,194,0,,0.01,0,188,0,0,TSLA 230707P00020000,TSLA,2023-07-07,P,20.0,31
2023-06-06 15:45:10.838429659-04:00,2023-06-06 15:45:10.838222848-04:00,177,30,721420985,A,A,0.01,50,194,0,,0.01,0,50,0,0,TSLA 230707P00020000,TSLA,2023-07-07,P,20.0,31
2023-06-06 15:45:22.004667964-04:00,2023-06-06 15:45:22.004460800-04:00,177,30,721420985,A,A,0.01,188,194,0,,0.01,0,188,0,0,TSLA 230707P00020000,TSLA,2023-07-07,P,20.0,31
2023-06-06 15:45:22.930718876-04:00,2023-06-06 15:45:22.930510592-04:00,177,30,721420985,A,A,0.01,50,194,0,,0.01,0,50,0,0,TSLA 230707P00020000,TSLA,2023-07-07,P,20.0,31
2023-06-06 15:45:34.364728077-04:00,2023-06-06 15:45:34.364520448-04:00,177,30,721420985,A,A,0.01,188,194,0,,0.01,0,188,0,0,TSLA 230707P00020000,TSLA,2023-07-07,P,20.0,31


In [42]:
# Get unique timestamps from your filtered options
unique_timestamps = df_opts.index.unique()

# Use entry time from CONFIG
start_time = ENTRY_TIME
end_time = start_time + pd.Timedelta(minutes=1)

# Generate cache filename for minute equity data
date_str = start_time.strftime('%Y%m%d')
time_str = start_time.strftime('%H%M')
cache_file = os.path.join(CACHE_DIR, f"equity_minute_{SYMBOL}_{date_str}_{time_str}.parquet")

# Check cache first
if os.path.exists(cache_file):
    print(f"[CACHE HIT] Loading minute equity data for {SYMBOL} on {start_time.date()} at {start_time.time()}")
    equity_df = pd.read_parquet(cache_file)
    print(f"  Loaded {len(equity_df)} minute records")
else:
    print(f"[API] Fetching minute equity data for {SYMBOL} on {start_time.date()} at {start_time.time()}...")

    # Fetch the 1-minute OHLCV bar for SYMBOL at the entry timestamp.
    # A separate name avoids overwriting the daily `equity_data` DataFrame.
    minute_data = client.timeseries.get_range(
        dataset='XNAS.ITCH',  # NASDAQ for TSLA
        symbols=[SYMBOL],
        schema='ohlcv-1m',  # 1-minute OHLCV bars
        start=start_time,
        end=end_time,
        stype_in='raw_symbol'
    )

    # Convert to a DataFrame (index stays in UTC), save it, then report the save
    equity_df = minute_data.to_df()
    equity_df.to_parquet(cache_file)
    print(f"[CACHE SAVE] Saved {len(equity_df)} minute records to cache")

print(f"Total: {len(equity_df)} equity records")
equity_df


[CACHE HIT] Loading minute equity data for TSLA on 2023-06-06 at 15:45:00
  Loaded 1 minute records
Total: 1 equity records


ts_event,rtype,publisher_id,instrument_id,open,high,low,close,volume,symbol
2023-06-06 19:45:00+00:00,33,2,10274,219.75,219.91,219.75,219.86,19083,TSLA


In [43]:
import numpy as np
import pandas as pd
from py_vollib.black_scholes.implied_volatility import implied_volatility
from py_vollib.black_scholes.greeks.analytical import delta

r = 0.04  # fixed risk-free rate (4% as decimal for py_vollib)

# 0) Keep only rows that actually have a quote (bid/ask)
quotes = df_opts[df_opts["bid_px_00"].notna() & df_opts["ask_px_00"].notna()].copy()

# 1) Compute mid price per tick
quotes["mid"] = (quotes["bid_px_00"] + quotes["ask_px_00"]) / 2
quotes["spread"] = quotes["ask_px_00"] - quotes["bid_px_00"]
quotes["spread_pct"] = quotes["spread"] / quotes["mid"]

# 2) Collapse to ONE row per option contract (snapshot at ~3:45 pm)
chain_snapshot = (
    quotes
    .sort_values("ts_event")   # important: so tail(1) is the latest
    .groupby(["symbol", "expiration", "strike", "call_put"])
    .tail(1)                   # last quote for each contract
    .copy()
)
underlying_price = equity_df["close"].iloc[0]   # 15:45 close
chain_snapshot["underlying_last"] = underlying_price

# Note: Entry price will be calculated AFTER liquidity model applies penalties
# For now, just store mid price - actual entry_price calculated in backtest_candidates
print(f"Fill mode: {CONFIG['fill_mode']}")
print(f"  Mid prices available; entry prices will include liquidity penalty after filtering")


Fill mode: mid
  Mid prices available; entry prices will include liquidity penalty after filtering
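
The quote-to-snapshot step above (drop one-sided quotes, compute mid and spread, keep the latest quote per contract) can be sketched without pandas. The helper name, the dict layout, and the toy quotes are assumptions for illustration. The guard against a non-positive mid is an addition that the notebook cell does not have.

```python
def latest_quote_per_contract(quotes):
    """Return {symbol: enriched quote} using the latest two-sided quote per symbol.

    `quotes` is an iterable of dicts with keys ts, symbol, bid, ask.
    """
    snapshot = {}
    for q in sorted(quotes, key=lambda item: item["ts"]):
        if q["bid"] is None or q["ask"] is None:
            continue  # one-sided quote: no mid price available
        mid = (q["bid"] + q["ask"]) / 2
        spread = q["ask"] - q["bid"]
        spread_pct = spread / mid if mid > 0 else float("nan")
        # Later timestamps overwrite earlier ones, so the last quote wins
        snapshot[q["symbol"]] = {**q, "mid": mid, "spread": spread, "spread_pct": spread_pct}
    return snapshot


toy_quotes = [
    {"ts": 1, "symbol": "A", "bid": 5.00, "ask": 5.10},
    {"ts": 3, "symbol": "A", "bid": None, "ask": 5.20},  # ignored: no bid
    {"ts": 2, "symbol": "A", "bid": 5.00, "ask": 5.05},
    {"ts": 1, "symbol": "B", "bid": 9.10, "ask": 9.50},
]
snap = latest_quote_per_contract(toy_quotes)
print(snap["A"]["mid"], snap["B"]["spread_pct"])
```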


In [44]:
def compute_iv(row):
    price = row["mid"]
    S     = row["underlying_last"]
    K     = row["strike"]
    t     = row["dte"] / 365.0
    flag  = "p" if row["call_put"] == "P" else "c"

    if not (np.isfinite(price) and np.isfinite(S) and np.isfinite(K) and t > 0):
        return np.nan
    if price <= 0 or S <= 0 or K <= 0:
        return np.nan

    try:
        return implied_volatility(price, S, K, t, r, flag)
    except Exception:
        return np.nan


def compute_delta(row):
    sigma = row["iv"]
    if not np.isfinite(sigma):
        return np.nan

    S    = row["underlying_last"]
    K    = row["strike"]
    t    = row["dte"] / 365.0
    flag = "p" if row["call_put"] == "P" else "c"

    return abs(delta(flag, S, K, t, r, sigma))

chain_snapshot["iv"] = chain_snapshot.apply(compute_iv, axis=1)
chain_snapshot["delta"] = chain_snapshot.apply(compute_delta, axis=1)

chain_snapshot.head()

ts_recv,ts_event,rtype,publisher_id,instrument_id,action,side,price,size,flags,ts_in_delta,bid_px_00,ask_px_00,bid_sz_00,ask_sz_00,bid_pb_00,ask_pb_00,symbol,root,expiration,call_put,strike,dte,mid,spread,spread_pct,underlying_last,iv,delta
2023-06-06 15:45:14.746222286-04:00,2023-06-06 15:45:14.746014976-04:00,177,30,721421169,A,A,0.06,174,194,0,0.01,0.06,25,174,0,0,TSLA 230714P00060000,TSLA,2023-07-14,P,60.0,38,0.035,0.05,1.428571,219.86,1.424901,0.001094
2023-06-06 15:45:23.257020132-04:00,2023-06-06 15:45:23.256814080-04:00,177,30,721420718,A,A,106.55,25,194,0,104.4,106.55,25,25,0,0,TSLA 230707P00325000,TSLA,2023-07-07,P,325.0,31,105.475,2.15,0.020384,219.86,0.824226,0.932261
2023-06-06 15:45:23.365916893-04:00,2023-06-06 15:45:23.365711104-04:00,177,30,721420579,A,A,106.6,100,194,0,104.45,106.6,85,100,0,0,TSLA 230714P00325000,TSLA,2023-07-14,P,325.0,38,105.525,2.15,0.020374,219.86,0.773831,0.922747
2023-06-06 15:45:23.887944575-04:00,2023-06-06 15:45:23.887738624-04:00,177,30,721421156,A,A,191.5,45,194,0,189.4,191.5,45,45,0,0,TSLA 230721P00410000,TSLA,2023-07-21,P,410.0,45,190.45,2.1,0.011027,219.86,1.070985,0.927305
2023-06-06 15:45:23.888330057-04:00,2023-06-06 15:45:23.888124160-04:00,177,30,721421152,A,A,136.55,45,194,0,134.4,136.55,25,45,0,0,TSLA 230721P00355000,TSLA,2023-07-21,P,355.0,45,135.475,2.15,0.01587,219.86,0.860183,0.922052
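
For readers without `py_vollib`, the same two quantities can be approximated with the standard library. The sketch below prices a European option with Black-Scholes (no dividends), inverts the price for implied volatility by bisection, and returns the absolute delta. The function names and the bisection bounds are assumptions chosen for illustration. Results should be close to the library's but are not guaranteed to match digit for digit.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def _d1(S, K, t, r, sigma):
    return (log(S / K) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))


def bs_price(flag, S, K, t, r, sigma):
    """Black-Scholes price; flag is 'c' for a call, 'p' for a put."""
    d1 = _d1(S, K, t, r, sigma)
    d2 = d1 - sigma * sqrt(t)
    if flag == "c":
        return S * norm_cdf(d1) - K * exp(-r * t) * norm_cdf(d2)
    return K * exp(-r * t) * norm_cdf(-d2) - S * norm_cdf(-d1)


def bs_abs_delta(flag, S, K, t, r, sigma):
    """Absolute delta, matching the abs() applied in compute_delta above."""
    d1 = _d1(S, K, t, r, sigma)
    return abs(norm_cdf(d1) if flag == "c" else norm_cdf(d1) - 1.0)


def implied_vol(price, flag, S, K, t, r, lo=1e-4, hi=5.0, tol=1e-8):
    """Bisection on sigma; the price is increasing in sigma, so the root is unique."""
    if not bs_price(flag, S, K, t, r, lo) <= price <= bs_price(flag, S, K, t, r, hi):
        return float("nan")  # price not attainable within the assumed bounds
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_price(flag, S, K, t, r, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Inputs taken from the 210-strike, 38-DTE put seen later in the notebook (mid 9.30)
S, K, t, r = 219.86, 210.0, 38 / 365.0, 0.04
iv = implied_vol(9.30, "p", S, K, t, r)
print(round(iv, 4), round(bs_abs_delta("p", S, K, t, r, iv), 4))
```

Returning `nan` for unattainable prices mirrors how `compute_iv` swallows solver failures, so downstream filters can drop those rows uniformly.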


In [45]:
chain_snapshot['date'] = chain_snapshot['ts_event'].dt.date

candidates = chain_snapshot[
    (chain_snapshot["call_put"] == CONFIG['option_type'])
    & chain_snapshot["dte"].between(CONFIG['dte_min'], CONFIG['dte_max'])
    & chain_snapshot["delta"].abs().between(CONFIG['delta_min'], CONFIG['delta_max'])
].copy()

# Apply liquidity model (regime-aware, penalty-based)
candidates = apply_liquidity_model(candidates, CONFIG)

candidates[["symbol", "expiration", "strike", "dte", "iv", "delta",'mid']].sort_values(
    ["dte", "strike"]
)
candidates


  Liquidity Model Applied:
    Original: 7 options
    Hard rejected: 0 (0.0%)
    Remaining: 7 options
    Tier breakdown: {'tight': np.int64(7)}
    Avg spread: 2.7%, Avg allowed: 10.1%
    Avg penalty: 1.00x


ts_recv,ts_event,rtype,publisher_id,instrument_id,action,side,price,size,flags,ts_in_delta,bid_px_00,ask_px_00,bid_sz_00,ask_sz_00,bid_pb_00,ask_pb_00,symbol,root,expiration,call_put,strike,dte,mid,spread,spread_pct,underlying_last,iv,delta,date,ivp,allowed_spread_pct,liquidity_tier,liquidity_penalty
2023-06-06 15:45:57.550865022-04:00,2023-06-06 15:45:57.550656256-04:00,177,30,721420824,A,A,9.5,683,194,0,9.1,9.5,691,683,0,0,TSLA 230714P00210000,TSLA,2023-07-14,P,210.0,38,9.3,0.4,0.043011,219.86,0.508366,0.34933,2023-06-06,0.142857,0.08,tight,1.0
2023-06-06 15:45:57.551230515-04:00,2023-06-06 15:45:57.551021568-04:00,177,30,721420829,A,A,6.15,658,194,0,5.95,6.15,1,658,0,0,TSLA 230714P00200000,TSLA,2023-07-14,P,200.0,38,6.05,0.2,0.033058,219.86,0.521479,0.250938,2023-06-06,0.714286,0.12,tight,1.0
2023-06-06 15:45:57.561368377-04:00,2023-06-06 15:45:57.561160448-04:00,177,30,721420677,A,A,7.65,349,194,0,7.45,7.65,25,349,0,0,TSLA 230714P00205000,TSLA,2023-07-14,P,205.0,38,7.55,0.2,0.02649,219.86,0.514882,0.298274,2023-06-06,0.428571,0.08,tight,1.0
2023-06-06 15:45:59.421820422-04:00,2023-06-06 15:45:59.421614336-04:00,177,30,721420816,A,A,8.3,860,194,0,8.0,8.3,344,860,0,0,TSLA 230707P00210000,TSLA,2023-07-07,P,210.0,31,8.15,0.3,0.03681,219.86,0.511436,0.34268,2023-06-06,0.285714,0.08,tight,1.0
2023-06-06 15:45:59.584316824-04:00,2023-06-06 15:45:59.584111104-04:00,177,30,721420934,A,B,7.8,1085,194,0,7.8,7.9,1085,1258,0,0,TSLA 230721P00200000,TSLA,2023-07-21,P,200.0,45,7.85,0.1,0.012739,219.86,0.552869,0.270878,2023-06-06,1.0,0.15,tight,1.0
2023-06-06 15:45:59.592987302-04:00,2023-06-06 15:45:59.592781568-04:00,177,30,721420479,A,A,9.55,402,194,0,9.5,9.55,104,402,0,0,TSLA 230721P00205000,TSLA,2023-07-21,P,205.0,45,9.525,0.05,0.005249,219.86,0.548185,0.313684,2023-06-06,0.857143,0.12,tight,1.0
2023-06-06 15:45:59.995855920-04:00,2023-06-06 15:45:59.995651072-04:00,177,30,721420378,A,A,6.5,383,194,0,6.3,6.5,339,383,0,0,TSLA 230707P00205000,TSLA,2023-07-07,P,205.0,31,6.4,0.2,0.03125,219.86,0.515357,0.286492,2023-06-06,0.571429,0.08,tight,1.0


In [46]:

backtest_candidates = candidates.copy()

# Calculate entry price WITH liquidity penalty
backtest_candidates['entry_price'] = backtest_candidates.apply(
    lambda row: get_entry_price(row, CONFIG['fill_mode'], row.get('liquidity_penalty', 1.0)), 
    axis=1
)

# Premium and cost basis
backtest_candidates['per_share_premium'] = backtest_candidates['entry_price']
backtest_candidates['premium'] = backtest_candidates['per_share_premium'] * 100
backtest_candidates['cost_basis'] = backtest_candidates['strike'] * 100  # CSP cost basis = strike * 100

# Exit parameters
backtest_candidates['exit_pct'] = CONFIG['exit_pct']
backtest_candidates['exit_price_per_share'] = backtest_candidates['per_share_premium'] * backtest_candidates['exit_pct']

# Carry entry-time liquidity data for probabilistic fills
backtest_candidates['spread_pct_entry'] = candidates['spread_pct']
backtest_candidates['ivp_entry'] = candidates['ivp']  # Always use _entry suffix

# Keep liquidity info for exit calculations
backtest_candidates = backtest_candidates[[
    'symbol', 'cost_basis', 'premium', 'exit_pct', 'exit_price_per_share',
    'date', 'dte', 'expiration', 'mid', 'strike', 'entry_price',
    'liquidity_tier', 'liquidity_penalty', 'spread_pct_entry', 'ivp_entry'
]]

# Show summary
print(f"\nBacktest Candidates: {len(backtest_candidates)} options")
print(f"  Avg entry price: ${backtest_candidates['entry_price'].mean():.2f}/share")
print(f"  Avg mid price: ${backtest_candidates['mid'].mean():.2f}/share")
print(f"  Avg slippage: ${(backtest_candidates['mid'] - backtest_candidates['entry_price']).mean():.2f}/share")
print(f"  Liquidity tiers: {dict(backtest_candidates['liquidity_tier'].value_counts())}")

backtest_candidates


Backtest Candidates: 7 options
  Avg entry price: $7.77/share
  Avg mid price: $7.83/share
  Avg slippage: $0.06/share
  Liquidity tiers: {'tight': np.int64(7)}


ts_recv,symbol,cost_basis,premium,exit_pct,exit_price_per_share,date,dte,expiration,mid,strike,entry_price,liquidity_tier,liquidity_penalty,spread_pct_entry,ivp_entry
2023-06-06 15:45:57.550865022-04:00,TSLA 230714P00210000,21000.0,918.0,0.5,4.59,2023-06-06,38,2023-07-14,9.3,210.0,9.18,tight,1.0,0.043011,0.142857
2023-06-06 15:45:57.551230515-04:00,TSLA 230714P00200000,20000.0,599.0,0.5,2.995,2023-06-06,38,2023-07-14,6.05,200.0,5.99,tight,1.0,0.033058,0.714286
2023-06-06 15:45:57.561368377-04:00,TSLA 230714P00205000,20500.0,749.0,0.5,3.745,2023-06-06,38,2023-07-14,7.55,205.0,7.49,tight,1.0,0.02649,0.428571
2023-06-06 15:45:59.421820422-04:00,TSLA 230707P00210000,21000.0,806.0,0.5,4.03,2023-06-06,31,2023-07-07,8.15,210.0,8.06,tight,1.0,0.03681,0.285714
2023-06-06 15:45:59.584316824-04:00,TSLA 230721P00200000,20000.0,782.0,0.5,3.91,2023-06-06,45,2023-07-21,7.85,200.0,7.82,tight,1.0,0.012739,1.0
2023-06-06 15:45:59.592987302-04:00,TSLA 230721P00205000,20500.0,951.0,0.5,4.755,2023-06-06,45,2023-07-21,9.525,205.0,9.51,tight,1.0,0.005249,0.857143
2023-06-06 15:45:59.995855920-04:00,TSLA 230707P00205000,20500.0,634.0,0.5,3.17,2023-06-06,31,2023-07-07,6.4,205.0,6.34,tight,1.0,0.03125,0.571429
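
To make the per-share versus per-contract bookkeeping concrete, the following sketch recomputes the derived columns for the first row of the table above (entry price 9.18, strike 210, `exit_pct` 0.5). The helper names `exit_levels` and `target_touched` are hypothetical. The 2.0 stop multiplier is an assumed default, and the low/high values in the usage lines are made up.

```python
CONTRACT_MULTIPLIER = 100  # shares per standard equity option contract


def exit_levels(entry_price, strike, exit_pct, stop_loss_multiplier=2.0):
    """Derive contract-level and per-share levels from a per-share entry price."""
    return {
        "premium": entry_price * CONTRACT_MULTIPLIER,         # cash received per contract
        "cost_basis": strike * CONTRACT_MULTIPLIER,           # cash securing the put
        "profit_target_per_share": entry_price * exit_pct,    # buy back at or below this
        "stop_loss_per_share": entry_price * stop_loss_multiplier,
    }


def target_touched(low, high, target):
    """True if the target price lies inside the day's [low, high] range."""
    return low <= target <= high


levels = exit_levels(entry_price=9.18, strike=210.0, exit_pct=0.5)
print(levels)
# Hypothetical daily range for the option, used only to exercise the check
print(target_touched(low=4.20, high=5.10, target=levels["profit_target_per_share"]))
```

Keeping comparisons against OHLC data in per-share units and P&L in per-contract units avoids mixing a 4.59 target with a 918.00 premium in the same expression.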


In [64]:
def fetch_underlying_price_at_expiration(underlying_symbol, expiration_date, client, config):
    """
    Fetch underlying stock price at expiration date.
    
    Used to determine if a covered call is ITM (called away) or OTM (expires worthless).
    
    Args:
        underlying_symbol: Underlying symbol (e.g., 'TSLA')
        expiration_date: Expiration date (pandas Timestamp)
        client: Databento client
        config: Configuration dict with cache_dir and timezone
    
    Returns:
        float: Close price at expiration, or None if unavailable
    """
    cache_dir = config.get('cache_dir', '../cache/')
    tz = config.get('timezone', 'America/New_York')
    
    # Normalize expiration date
    if hasattr(expiration_date, 'tz') and expiration_date.tz:
        expiration_date = expiration_date.tz_localize(None)
    expiration_date = pd.Timestamp(expiration_date).normalize()
    
    # Try to fetch daily equity data for expiration date
    try:
        # Generate cache filename
        date_str = expiration_date.strftime('%Y%m%d')
        cache_file = os.path.join(cache_dir, f"equity_daily_{underlying_symbol}_{date_str}.parquet")
        
        # Check cache first
        if os.path.exists(cache_file):
            equity_df = pd.read_parquet(cache_file)
            if len(equity_df) > 0:
                # Return the close price
                return float(equity_df['close'].iloc[-1])
        
        # Cache miss - fetch from API
        start = pd.Timestamp(expiration_date, tz=tz)
        end = start + pd.Timedelta(days=1)
        
        equity_data = client.timeseries.get_range(
            dataset='EQUS.MINI',  # Consolidated US equities
            symbols=[underlying_symbol],
            schema='ohlcv-1d',
            start=start,
            end=end,
            stype_in='raw_symbol'
        )
        
        equity_df = equity_data.to_df(tz=tz)
        
        if len(equity_df) > 0:
            # Save to cache
            equity_df.to_parquet(cache_file)
            return float(equity_df['close'].iloc[-1])
        
    except Exception as e:
        print(f"    Warning: Could not fetch underlying price at expiration: {e}")
        return None
    
    return None
    
def fetch_daily_prices_for_option(symbol, entry_date, expiration_date, client, config):
    """
    Fetch daily OHLC prices for an option from entry date to expiration.

    Args:
        symbol: Option symbol
        entry_date: Entry date (normalized)
        expiration_date: Expiration date (normalized)
        client: Databento client
        config: Configuration dict

    Returns:
        DataFrame with daily OHLC data
    """
    # Generate cache filename for daily option prices
    entry_str = entry_date.strftime('%Y%m%d')
    exp_str = expiration_date.strftime('%Y%m%d')
    cache_file = os.path.join(config.get('cache_dir', CACHE_DIR), f"option_daily_{symbol}_{entry_str}_{exp_str}.parquet")

    # Check cache first
    if os.path.exists(cache_file):
        print(f"    [CACHE HIT] Loading daily prices for {symbol}")
        return pd.read_parquet(cache_file)

    # Cache miss - fetch from API
    print(f"    [API] Fetching daily prices for {symbol} from {entry_date.date()} to {expiration_date.date()}")

    start_daily = entry_date + pd.Timedelta(days=1)  # Day after entry
    end_daily = expiration_date + pd.Timedelta(days=1)  # Include expiration day

    daily_data = client.timeseries.get_range(
        dataset='OPRA.PILLAR',
        schema='ohlcv-1d',
        symbols=symbol,
        stype_in='raw_symbol',
        start=start_daily,
        end=end_daily,
    )

    df_daily = daily_data.to_df(tz=config['timezone'])

    # Save to cache
    df_daily.to_parquet(cache_file)
    print(f"    [CACHE SAVE] Saved {len(df_daily)} days to cache")

    return df_daily


def check_profit_target_hit(df_daily, exit_price_per_share, entry_date):
    """
    Check if the exit price target was hit in the daily price data.

    Args:
        df_daily: DataFrame with daily OHLC data (prices are per-share)
        exit_price_per_share: Target price per share to exit at
        entry_date: Entry date to skip (we can't exit same day we entered)

    Returns:
        tuple: (hit_date, daily_row) if hit, (None, None) if not hit
    """
    for check_date, daily_row in df_daily.iterrows():
        # Skip the entry date - we can't exit on the same day we entered
        check_date_normalized = check_date.tz_localize(None) if hasattr(check_date, 'tz_localize') and check_date.tz else check_date
        if check_date_normalized.date() <= entry_date.date():
            continue
            
        daily_low = daily_row['low']
        daily_high = daily_row['high']

        # Check if our exit target (per-share) is within the daily range
        if daily_low <= exit_price_per_share <= daily_high:
            return check_date, daily_row

    return None, None


def create_exit_record(symbol, entry_date, expiration_date, premium, exit_pct,
                       exit_price, exit_reason, check_date, daily_row, cost_basis,
                       wheel_id=None, initial_capital=None,  
                       touch_profit_target=None, p_fill_profit_target=None, u_fill_profit_target=None,
                       filled_profit_target=None, spread_pct_entry=None, ivp_entry=None,
                       touch_count=None):
    """
    Create an exit record dictionary.

    Args:
        symbol: Option symbol
        entry_date: Entry date
        expiration_date: Expiration date
        premium: Premium received
        exit_pct: Fraction of the premium at which to buy back (e.g., 0.25 = exit once the option price has fallen to 25% of the premium received)
        exit_price: Actual exit price
        exit_reason: Reason for exit
        check_date: Date of exit
        daily_row: Daily price data row
        cost_basis: Cost basis (strike * 100)
        wheel_id: Unique identifier for this wheel instance
        initial_capital: Initial capital at risk (strike * 100 for CSP)
        touch_profit_target: Did price ever touch limit? (bool)
        p_fill_profit_target: Computed fill probability (float)
        u_fill_profit_target: Random draw value (float)
        filled_profit_target: Actually filled? (bool)
        spread_pct_entry: Entry-time spread % (float)
        ivp_entry: Entry-time IV percentile (float)
        touch_count: Number of distinct days with low <= L before fill (int)

    Returns:
        dict: Exit record
    """

    # Map exit_reason to a CSP lifecycle state
    exit_reason_to_state = {
        'profit_target': 'CSP_CLOSED_PROFIT',
        'stop_loss': 'CSP_CLOSED_STOP',
        'expired_worthless': 'CSP_CLOSED_WORTHLESS',
        'assigned': 'CSP_ASSIGNED',
    }
    state = exit_reason_to_state.get(exit_reason, 'CSP_OPEN')

    return {
        'wheel_id': wheel_id,
        'initial_capital': initial_capital,
        'state': state,  # lifecycle state derived from exit_reason
        'symbol': symbol,
        'entry_date': entry_date,
        'exit_date': check_date.tz_localize(None) if hasattr(check_date, 'tz_localize') and check_date.tz else check_date,
        'expiration': expiration_date,
        'cost_basis': cost_basis,
        'premium': premium,
        'exit_pct': exit_pct,
        'exit_price': exit_price,
        'exit_reason': exit_reason,
        'days_held': (check_date.tz_localize(None) - entry_date).days if check_date else None,
        'daily_low': daily_row['low'] if daily_row is not None else None,
        'daily_high': daily_row['high'] if daily_row is not None else None,
        'touch_profit_target': touch_profit_target,
        'p_fill_profit_target': p_fill_profit_target,
        'u_fill_profit_target': u_fill_profit_target,
        'filled_profit_target': filled_profit_target,
        'spread_pct_entry': spread_pct_entry,
        'ivp_entry': ivp_entry,
        'touch_count': touch_count,
        'fill_model': 'ohlc_touch_prob_v1',
    }


def calculate_pnl_metrics(exits_df, config):
    """
    Calculate P&L metrics for exit results.

    Args:
        exits_df: DataFrame with exit records
        config: Configuration dict with fee settings

    Returns:
        DataFrame with P&L metrics added
    """
    if len(exits_df) > 0:
        exits_df = exits_df.copy()
        
        # Calculate transaction costs based on exit reason
        # Expired worthless = entry fee only (no buyback needed)
        # All other exits = round-trip fees
        exits_df['fees'] = exits_df['exit_reason'].apply(
            lambda reason: get_transaction_costs(config, is_round_trip=(reason != 'expired_worthless'))
        )
        
        # P&L after fees
        exits_df['exit_pnl'] = exits_df['premium'] - exits_df['exit_price'] - exits_df['fees']
        exits_df['exit_pnl_pct'] = (exits_df['exit_pnl'] / exits_df['premium']) * 100
        exits_df['roc'] = (exits_df['exit_pnl'] / exits_df['cost_basis']) * 100
        
        # Summary stats
        total_fees = exits_df['fees'].sum()
        print(f"\n  Transaction costs: ${total_fees:.2f} total ({len(exits_df)} trades)")

    return exits_df


def backtest_exit_strategy(backtest_candidates, client, config):
    """
    Backtest exit strategy for wheel options with probabilistic fills.

    Exit conditions:
    1. Profit target: Exit when option price <= premium * exit_pct
       - Uses probabilistic fill model: touch does not guarantee fill
       - Loops through days until filled or expired
    2. Stop-loss: Exit immediately when high >= stop threshold (p_fill=1.0)
    3. Expiration: If no exit by expiration, the option is recorded as expired worthless
       (exit price 0). ITM expiration (assignment) is not checked in this function.

    Args:
        backtest_candidates: DataFrame with options to backtest
        client: Databento client
        config: Configuration dict

    Returns:
        DataFrame with exit results
    """
    import numpy as np
    
    # Initialize RNG for reproducible fills
    rng = np.random.RandomState(config.get('execution_seed', 42))
    exits = []

    for idx, row in backtest_candidates.iterrows():
        symbol = row['symbol']

        # Generate wheel_id for this candidate
        wheel_id = generate_wheel_id()
        initial_capital = row['cost_basis']  # cost_basis = strike * 100

        # Normalize dates
        entry_date = pd.Timestamp(row['date']).tz_localize(None)
        expiration_date = pd.Timestamp(row['expiration']).tz_localize(None)

        # Entry details - work with per-share prices for comparison, contract prices for P&L
        premium_per_share = row['entry_price']  # Use entry_price (with slippage) not mid
        premium = premium_per_share * 100  # Contract premium (100 shares per contract)
        exit_pct = row['exit_pct']
        exit_price_per_share = premium_per_share * exit_pct  # Per-share exit price (buy back at this price)
        stop_loss_per_share = premium_per_share * config.get('stop_loss_multiplier', 2.0)
        cost_basis = row['strike'] * 100  # Contract cost basis
        liquidity_penalty = row.get('liquidity_penalty', 1.0)
        
        # Entry-time liquidity data for fill probability
        spread_pct_entry = row.get('spread_pct_entry', 0.05)
        ivp_entry = row.get('ivp_entry', 0.5)

        print(f"\nProcessing {symbol}...")
        print(f"  Entry: {entry_date.date()}, Premium: ${premium:.2f} (${premium_per_share:.2f}/share)")
        print(f"  Exit target: ${exit_price_per_share*100:.2f} (${exit_price_per_share:.2f}/share, exit at {exit_pct*100:.0f}% of premium)")
        print(f"  Stop loss: ${stop_loss_per_share*100:.2f} (${stop_loss_per_share:.2f}/share)")

        try:
            # Fetch daily prices
            df_daily = fetch_daily_prices_for_option(symbol, entry_date, expiration_date, client, config)

            # Initialize tracking variables
            touch_count = 0
            touch_profit_target = False
            filled_profit_target = False
            p_fill_profit = None
            u_fill_profit = None
            exit_date = None
            exit_daily_row = None
            exit_reason = None
            exit_price = None

            # Compute fill probability once (based on entry-time liquidity)
            if config.get('use_probabilistic_exit_fills', True):
                p_fill_profit = compute_p_fill_profit(row, config)

            # Loop through trading days until expiration
            for check_date, daily_row in df_daily.iterrows():
                # Normalize check_date
                check_date_normalized = check_date.tz_localize(None) if hasattr(check_date, 'tz_localize') and check_date.tz else check_date
                
                # Skip entry date
                if check_date_normalized.date() <= entry_date.date():
                    continue

                daily_low = daily_row['low']
                daily_high = daily_row['high']

                # Check stop-loss first (always fills with p_fill=1.0)
                if daily_high >= stop_loss_per_share:
                    exit_date = check_date_normalized
                    exit_daily_row = daily_row
                    exit_reason = 'stop_loss'
                    # Stop-loss: calculate exit price with slippage
                    actual_exit_per_share = get_exit_price(daily_row, config.get('fill_mode', 'realistic'), penalty=liquidity_penalty)
                    exit_price = actual_exit_per_share * 100
                    filled_profit_target = False  # Stop-loss is not a profit target fill
                    print(f"  ⚠ Stop-loss triggered on {exit_date.date()}")
                    print(f"    Threshold: ${stop_loss_per_share:.2f}/share, Actual fill: ${actual_exit_per_share:.2f}/share")
                    break

                # Check profit target touch
                if daily_low <= exit_price_per_share:
                    touch_profit_target = True
                    touch_count += 1

                    if config.get('use_probabilistic_exit_fills', True) and p_fill_profit is not None:
                        # Probabilistic fill: draw random number
                        filled, u = try_probabilistic_fill(p_fill_profit, rng)
                        u_fill_profit = u

                        if filled:
                            # Fill successful - exit trade
                            exit_date = check_date_normalized
                            exit_daily_row = daily_row
                            exit_reason = 'profit_target'
                            filled_profit_target = True
                            # Calculate exit price with slippage
                            actual_exit_per_share = get_exit_price(daily_row, config.get('fill_mode', 'realistic'), penalty=liquidity_penalty)
                            exit_price = actual_exit_per_share * 100
                            print(f"  ✓ Profit target hit on {exit_date.date()} (touch #{touch_count}, filled)")
                            print(f"    Target: ${exit_price_per_share:.2f}/share, Actual fill: ${actual_exit_per_share:.2f}/share")
                            print(f"    p_fill={p_fill_profit:.2f}, u={u:.3f}")
                            break
                        else:
                            # Touch but no fill - continue holding
                            print(f"    Touch #{touch_count} on {check_date_normalized.date()}: NO FILL (p_fill={p_fill_profit:.2f}, u={u:.3f})")
                    else:
                        # Deterministic fill (legacy behavior)
                        exit_date = check_date_normalized
                        exit_daily_row = daily_row
                        exit_reason = 'profit_target'
                        filled_profit_target = True
                        touch_count = 1
                        actual_exit_per_share = get_exit_price(daily_row, config.get('fill_mode', 'realistic'), penalty=liquidity_penalty)
                        exit_price = actual_exit_per_share * 100
                        print(f"  ✓ Profit target hit on {exit_date.date()} (deterministic)")
                        break

            # Handle expiration if no exit occurred
            if exit_date is None:
                exit_date = expiration_date
                exit_daily_row = None
                exit_reason = 'expired_worthless'
                exit_price = 0.0
                print(f"  🎉 Option expired worthless on {expiration_date.date()} - KEEP 100% PREMIUM!")

            # Create exit record with all tracking fields
            exit_record = create_exit_record(
                symbol, entry_date, expiration_date, premium, exit_pct,
                exit_price, exit_reason, exit_date, exit_daily_row, cost_basis,
                wheel_id=wheel_id, 
                initial_capital=initial_capital,
                touch_profit_target=touch_profit_target,
                p_fill_profit_target=p_fill_profit,
                u_fill_profit_target=u_fill_profit,
                filled_profit_target=filled_profit_target,
                spread_pct_entry=spread_pct_entry,
                ivp_entry=ivp_entry,
                touch_count=touch_count if touch_profit_target else 0
            )
            exits.append(exit_record)

        except Exception as e:
            print(f"  ✗ Error: {e}")
            import traceback
            traceback.print_exc()
            continue

    # Create results DataFrame and calculate P&L
    exits_df = pd.DataFrame(exits)
    exits_df = calculate_pnl_metrics(exits_df, config)

    return exits_df

# Run backtest (uses CONFIG from top of notebook)
exits_df = backtest_exit_strategy(
    backtest_candidates=backtest_candidates,
    client=client,
    config=CONFIG
)

# Display results
print("\n" + "="*60)
print("BACKTEST RESULTS")
print("="*60)
print(f"\nTotal exits: {len(exits_df)}")

if len(exits_df) > 0:
    print(f"\nExit reasons:")
    print(exits_df['exit_reason'].value_counts())
    print(f"\nP&L Summary:")
    print(exits_df[['exit_pnl', 'exit_pnl_pct', 'roc']].describe())
    
    # Show sample
    print("\nSample exits:")
    print(exits_df[['symbol', 'entry_date', 'exit_date', 'premium', 'exit_price', 
                   'exit_pnl', 'roc', 'exit_reason']].head(10))
else:
    print("\n⚠ No exits recorded - check for errors above")



Processing TSLA  230714P00210000...
  Entry: 2023-06-06, Premium: $918.00 ($9.18/share)
  Exit target: $459.00 ($4.59/share, exit at 50% of premium)
  Stop loss: $1836.00 ($18.36/share)
    [CACHE HIT] Loading daily prices for TSLA  230714P00210000
  ✓ Profit target hit on 2023-06-08 (touch #1, filled)
    Target: $4.59/share, Actual fill: $3.63/share
    p_fill=0.54, u=0.375

Processing TSLA  230714P00200000...
  Entry: 2023-06-06, Premium: $599.00 ($5.99/share)
  Exit target: $299.50 ($3.00/share, exit at 50% of premium)
  Stop loss: $1198.00 ($11.98/share)
    [CACHE HIT] Loading daily prices for TSLA  230714P00200000
    Touch #1 on 2023-06-08: NO FILL (p_fill=0.46, u=0.951)
    Touch #2 on 2023-06-08: NO FILL (p_fill=0.46, u=0.732)
    Touch #3 on 2023-06-08: NO FILL (p_fill=0.46, u=0.599)
  ✓ Profit target hit on 2023-06-08 (touch #4, filled)
    Target: $3.00/share, Actual fill: $2.71/share
    p_fill=0.46, u=0.156

Processing TSLA  230714P00205000...
  Entry: 2023-06-06, Premi
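
The log above prints `p_fill` and `u` for each touch of the profit target. Below is a minimal sketch of that fill rule and of the P&L arithmetic in `calculate_pnl_metrics`. It assumes a fill happens when `u < p_fill`, which is inferred from the logged values rather than from the source of `try_probabilistic_fill`. The helper names are hypothetical, and `fees=0.0` is a placeholder because the fee schedule is not shown here.

```python
import random


def sketch_probabilistic_fill(p_fill, rng):
    """Hypothetical stand-in for try_probabilistic_fill: fill when u < p_fill."""
    u = rng.random()  # uniform draw in [0, 1)
    return u < p_fill, u


def sketch_pnl(premium, exit_price, fees, cost_basis):
    """Mirror the exit_pnl, exit_pnl_pct and roc columns for a single trade."""
    exit_pnl = premium - exit_price - fees
    exit_pnl_pct = exit_pnl / premium * 100
    roc = exit_pnl / cost_basis * 100
    return exit_pnl, exit_pnl_pct, roc


# (p_fill, u, filled?) triples copied from the printed log.
logged = [(0.54, 0.375, True), (0.46, 0.951, False), (0.46, 0.156, True)]
rule_matches_log = all((u < p) == was_filled for p, u, was_filled in logged)

# One reproducible draw from a seeded generator (the seed value is arbitrary).
filled, u = sketch_probabilistic_fill(0.54, random.Random(42))

# First trade in the log: $918.00 premium, bought back at $3.63/share,
# strike 210 -> cost basis 210 * 100. fees=0.0 is a placeholder assumption.
exit_pnl, exit_pnl_pct, roc = sketch_pnl(918.00, 3.63 * 100, 0.0, 210 * 100)

print(f"rule matches log: {rule_matches_log}")
print(f"exit_pnl=${exit_pnl:.2f}, exit_pnl_pct={exit_pnl_pct:.2f}%, roc={roc:.2f}%")
```

Before fees, the first trade keeps about $555 of the $918 premium, roughly 60% of premium and 2.6% on the $21,000 of secured cash.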

In [65]:
# =============================================================================
# WHEEL MODULE: ASSIGNMENT HANDLER AND COVERED CALL SELECTION
# =============================================================================

def handle_assignment(csp_exit_record):
    """
    Create assignment record when CSP expires ITM.
    
    This captures the state when a put is assigned and shares are received.
    
    NOTE: premium_kept is recorded for audit trail but is already 
    included in CSP P&L. Do not double-count in wheel totals.
    
    Args:
        csp_exit_record: Dict from create_exit_record with exit_reason='assigned'
    
    Returns:
        dict: Assignment record with stock position details
    """
    strike = csp_exit_record['strike']
    premium = csp_exit_record['premium']
    net_stock_cost = strike * 100 - premium
    
    return {
        'wheel_id': csp_exit_record['wheel_id'],
        'symbol': csp_exit_record['symbol'].split()[0],  # Underlying (e.g., "TSLA")
        'assignment_date': csp_exit_record['expiration'],
        'strike': strike,
        'shares': 100,
        'assigned_price': strike,
        'cash_used': strike * 100,
        'premium_kept': premium,  # Audit only - already in CSP P&L
        'net_stock_cost': net_stock_cost,
        'stock_cost_per_share': net_stock_cost / 100,  # Derived field for CC strike constraint
        'underlying_at_assignment': csp_exit_record.get('underlying_at_expiration'),
        'initial_capital': csp_exit_record.get('initial_capital'),
    }


def fetch_option_chain_for_cc(underlying_symbol, entry_date, client, config, cc_config):
    """
    Fetch call option chain for covered call selection.
    
    Args:
        underlying_symbol: Underlying symbol (e.g., 'TSLA')
        entry_date: Date to enter CC position
        client: Databento client
        config: Main CONFIG dict
        cc_config: CC_CONFIG dict
    
    Returns:
        DataFrame with call options meeting criteria, or empty DataFrame
    """
    from py_vollib.black_scholes.implied_volatility import implied_volatility
    from py_vollib.black_scholes.greeks.analytical import delta as calc_delta
    import numpy as np
    
    cache_dir = config.get('cache_dir', '../cache/')
    tz = config.get('timezone', 'America/New_York')
    entry_time = cc_config.get('entry_time', '15:45')
    
    # Build entry timestamp
    entry_ts = pd.Timestamp(f"{entry_date.date()} {entry_time}", tz=tz)
    
    # Cache filename
    date_str = entry_ts.strftime('%Y%m%d')
    time_str = entry_ts.strftime('%H%M')
    cache_file = os.path.join(cache_dir, f"options_{underlying_symbol}_{date_str}_{time_str}.parquet")
    
    # Check cache first
    if os.path.exists(cache_file):
        print(f"    [CACHE HIT] Loading options for CC selection on {entry_date.date()}")
        df_opts = pd.read_parquet(cache_file)
    else:
        print(f"    [API] Fetching options for CC selection on {entry_date.date()}...")
        start = entry_ts
        end = start + pd.Timedelta(minutes=1)
        
        try:
            data = client.timeseries.get_range(
                dataset='OPRA.PILLAR',
                schema='cmbp-1',
                symbols=f"{underlying_symbol}.OPT",
                stype_in='parent',
                start=start,
                end=end,
            )
            df_opts = data.to_df(tz=tz).sort_values("ts_event")
            df_opts.to_parquet(cache_file)
        except Exception as e:
            print(f"    Error fetching options: {e}")
            return pd.DataFrame()
    
    if len(df_opts) == 0:
        return pd.DataFrame()
    
    # Parse option symbols
    sym = df_opts["symbol"]
    root_and_code = sym.str.split(expand=True)
    df_opts["root"] = root_and_code[0]
    code = root_and_code[1]
    df_opts["expiration"] = pd.to_datetime(code.str[:6], format="%y%m%d")
    df_opts["call_put"] = code.str[6]
    strike_int = code.str[7:].astype("int32")
    df_opts["strike"] = strike_int / 1000.0
    expiration_tz = df_opts["expiration"].dt.tz_localize(df_opts["ts_event"].dt.tz)
    df_opts["dte"] = (expiration_tz - df_opts["ts_event"].dt.normalize()).dt.days
    
    # Filter to calls only with valid quotes
    calls = df_opts[
        (df_opts['call_put'] == 'C') &
        (df_opts['bid_px_00'].notna()) &
        (df_opts['ask_px_00'].notna()) &
        (df_opts['dte'] >= cc_config['dte_min']) &
        (df_opts['dte'] <= cc_config['dte_max'])
    ].copy()
    
    if len(calls) == 0:
        return pd.DataFrame()
    
    # Compute mid price and bid-ask spread (absolute and relative to mid)
    calls['mid'] = (calls['bid_px_00'] + calls['ask_px_00']) / 2
    calls['spread'] = calls['ask_px_00'] - calls['bid_px_00']
    calls['spread_pct'] = calls['spread'] / calls['mid']
    
    # Get last quote per contract
    calls = calls.sort_values("ts_event").groupby(
        ["symbol", "expiration", "strike", "call_put"]
    ).tail(1).copy()
    
    # Get underlying price
    equity_cache = os.path.join(cache_dir, f"equity_minute_{underlying_symbol}_{date_str}_{time_str}.parquet")
    if os.path.exists(equity_cache):
        equity_df = pd.read_parquet(equity_cache)
        underlying_price = equity_df['close'].iloc[-1] if len(equity_df) > 0 else None
    else:
        underlying_price = None
    
    if underlying_price is None:
        print(f"    Warning: Could not get underlying price for delta calc")
        return pd.DataFrame()
    
    calls['underlying_last'] = underlying_price
    
    # Compute IV and delta
    r = 0.04  # Risk-free rate
    
    def compute_iv_delta(row):
        price = row['mid']
        S = row['underlying_last']
        K = row['strike']
        t = row['dte'] / 365.0
        
        if not (np.isfinite(price) and np.isfinite(S) and np.isfinite(K) and t > 0):
            return np.nan, np.nan
        if price <= 0 or S <= 0 or K <= 0:
            return np.nan, np.nan
        
        try:
            iv = implied_volatility(price, S, K, t, r, 'c')
            d = abs(calc_delta('c', S, K, t, r, iv))
            return iv, d
        except Exception:
            return np.nan, np.nan
    
    iv_delta = calls.apply(compute_iv_delta, axis=1, result_type='expand')
    calls['iv'] = iv_delta[0]
    calls['delta'] = iv_delta[1]
    
    # Filter by delta
    calls = calls[
        calls['delta'].between(cc_config['delta_min'], cc_config['delta_max'])
    ].copy()
    
    return calls


def select_covered_call(assignment_record, option_chain, cc_config, config):
    """
    Select covered call to sell after assignment.
    
    Args:
        assignment_record: Dict from handle_assignment()
        option_chain: DataFrame with call options (from fetch_option_chain_for_cc)
        cc_config: CC_CONFIG dict
        config: Main CONFIG dict
    
    Returns:
        Series with selected call, or None if no suitable call found
    """
    if len(option_chain) == 0:
        return None
    
    stock_cost_per_share = assignment_record['stock_cost_per_share']
    
    # Filter by strike constraint if enabled
    if cc_config.get('sell_call_only_if_price_above_basis', True):
        candidates = option_chain[
            option_chain['strike'] >= stock_cost_per_share
        ].copy()
    else:
        candidates = option_chain.copy()
    
    if len(candidates) == 0:
        print(f"    No calls with strike >= cost basis ${stock_cost_per_share:.2f}")
        return None
    
    # Apply liquidity model
    candidates = apply_liquidity_model(candidates, config)
    
    if len(candidates) == 0:
        print(f"    No calls passed liquidity filter")
        return None
    
    # Tie-breaking
    method = cc_config.get('tie_break_method', 'highest_premium')
    
    if method == 'highest_premium':
        selected = candidates.loc[candidates['mid'].idxmax()]
    elif method == 'closest_delta':
        target_delta = (cc_config['delta_min'] + cc_config['delta_max']) / 2
        candidates['delta_dist'] = (candidates['delta'].abs() - target_delta).abs()
        selected = candidates.loc[candidates['delta_dist'].idxmin()]
    elif method == 'highest_strike':
        selected = candidates.loc[candidates['strike'].idxmax()]
    else:
        selected = candidates.iloc[0]  # Fallback
    
    print(f"    Selected CC: {selected['symbol']}")
    print(f"      Strike: ${selected['strike']:.2f}, Delta: {selected['delta']:.3f}, Premium: ${selected['mid']:.2f}/share")
    print(f"      DTE: {selected['dte']}, Expiration: {selected['expiration'].date()}")
    
    return selected


print("=" * 60)
print("WHEEL MODULE: Assignment and Covered Call Functions Loaded")
print("=" * 60)
print("  - handle_assignment(): Create assignment record from CSP exit")
print("  - fetch_option_chain_for_cc(): Fetch call chain for CC selection")
print("  - select_covered_call(): Select optimal call to sell")
print("=" * 60)



WHEEL MODULE: Assignment and Covered Call Functions Loaded
  - handle_assignment(): Create assignment record from CSP exit
  - fetch_option_chain_for_cc(): Fetch call chain for CC selection
  - select_covered_call(): Select optimal call to sell
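
The assignment record and the call selection above reduce to a few lines of arithmetic. Below is a minimal sketch using plain dicts instead of DataFrames. The assigned put, the call quotes, and the target delta are hypothetical numbers. `pick_call` is an illustrative helper, not the notebook's `select_covered_call`, and it skips the liquidity model.

```python
def assignment_costs(strike, premium, shares=100):
    """Same arithmetic as handle_assignment for one assigned put."""
    cash_used = strike * shares
    net_stock_cost = cash_used - premium
    return {
        "cash_used": cash_used,
        "net_stock_cost": net_stock_cost,
        "stock_cost_per_share": net_stock_cost / shares,
    }


def pick_call(chain, min_strike, method="highest_premium", target_delta=0.25):
    """Keep calls with strike >= min_strike, then apply one tie-break rule."""
    candidates = [c for c in chain if c["strike"] >= min_strike]
    if not candidates:
        return None
    if method == "highest_premium":
        return max(candidates, key=lambda c: c["mid"])
    if method == "closest_delta":
        return min(candidates, key=lambda c: abs(c["delta"] - target_delta))
    if method == "highest_strike":
        return max(candidates, key=lambda c: c["strike"])
    return candidates[0]  # fallback: first remaining candidate


# Hypothetical assigned put: strike 210, $918 premium collected.
costs = assignment_costs(strike=210.0, premium=918.0)

# Hypothetical call quotes (per-share mid prices).
chain = [
    {"strike": 195.0, "mid": 9.10, "delta": 0.42},
    {"strike": 205.0, "mid": 4.80, "delta": 0.31},
    {"strike": 215.0, "mid": 2.90, "delta": 0.22},
    {"strike": 225.0, "mid": 1.20, "delta": 0.12},
]

floor = costs["stock_cost_per_share"]
picks = {
    method: pick_call(chain, floor, method)["strike"]
    for method in ("highest_premium", "closest_delta", "highest_strike")
}
print(costs)
print(picks)
```

The 195 strike is dropped because it sits below the $200.82 per-share cost. The three tie-break rules then choose different strikes from the remaining calls.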


In [66]:

# =============================================================================
# COVERED CALL BACKTEST FUNCTION
# =============================================================================

def backtest_covered_call(assignment_record, cc_selection, client, config, cc_config):
    """
    Backtest a covered call position after assignment.
    
    Uses same exit framework as CSP with CC-specific adjustments:
    - Profit target: Buy back call at cc_config exit_pct
    - Stop loss: Not typically used for CC (we own the shares)
    - Called away: Underlying >= strike at expiration
    - Expired worthless: Underlying < strike at expiration
    
    Args:
        assignment_record: Dict from handle_assignment()
        cc_selection: Series from select_covered_call()
        client: Databento client
        config: Main CONFIG dict
        cc_config: CC_CONFIG dict
    
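    Returns:
        tuple: (cc_exit_record, current_state), where cc_exit_record is the exit
        record dict for the covered call phase and current_state is the final state name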
    """
    import numpy as np
    
    wheel_id = assignment_record['wheel_id']
    underlying_symbol = assignment_record['symbol']
    initial_capital = assignment_record['initial_capital']
    net_stock_cost = assignment_record['net_stock_cost']
    
    symbol = cc_selection['symbol']
    strike = cc_selection['strike']
    expiration_date = pd.Timestamp(cc_selection['expiration']).tz_localize(None)
    
    # Entry details
    entry_date = assignment_record['assignment_date']
    if hasattr(entry_date, 'tz') and entry_date.tz:
        entry_date = entry_date.tz_localize(None)
    
    # CC entry is the calendar day after assignment.
    # Simplification: no market calendar is consulted, so a Friday expiration yields
    # a Saturday entry date (could be refined with a market calendar).
    cc_entry_date = pd.Timestamp(entry_date) + pd.Timedelta(days=1)
    cc_entry_date = cc_entry_date.tz_localize(None)
    
    # Calculate entry price with liquidity penalty
    liquidity_penalty = cc_selection.get('liquidity_penalty', 1.0)
    premium_per_share = get_entry_price(cc_selection, config['fill_mode'], liquidity_penalty)
    premium = premium_per_share * 100
    cost_basis = strike * 100  # CC cost basis is the strike
    
    # Exit parameters: prefer a CC-specific exit_pct, falling back to the main config
    exit_pct = cc_config.get('exit_pct', config.get('exit_pct', 0.50))
    exit_price_per_share = premium_per_share * exit_pct
    
    # Initialize state
    current_state = advance_wheel_state('CSP_ASSIGNED', 'sell_call')  # CSP_ASSIGNED → CC_OPEN
    
    print(f"\n  CC Position: {symbol}")
    print(f"    Wheel ID: {wheel_id}, State: {current_state}")
    print(f"    Entry: {cc_entry_date.date()}, Strike: ${strike:.2f}, Premium: ${premium:.2f}")
    print(f"    Exit target: ${exit_price_per_share*100:.2f} (${exit_price_per_share:.2f}/share)")
    
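    # Initialize RNG with a per-wheel seed offset.
    # zlib.crc32 is stable across runs; the built-in hash() of a string is salted
    # per process, which would make the fills non-reproducible.
    import zlib
    wheel_offset = zlib.crc32(str(wheel_id).encode('utf-8')) % 1000
    rng = np.random.RandomState(config.get('execution_seed', 42) + wheel_offset)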
    
    # Tracking variables
    touch_count = 0
    touch_profit_target = False
    filled_profit_target = False
    p_fill_profit = None
    u_fill_profit = None
    exit_date = None
    exit_daily_row = None
    exit_reason = None
    exit_price = None
    underlying_at_exp = None
    
    try:
        # Fetch daily prices for the call option
        df_daily = fetch_daily_prices_for_option(symbol, cc_entry_date, expiration_date, client, config)
        
        # Compute fill probability
        if config.get('use_probabilistic_exit_fills', True):
            p_fill_profit = compute_p_fill_profit(cc_selection, config)
        
        # Loop through trading days
        for check_date, daily_row in df_daily.iterrows():
            check_date_normalized = check_date.tz_localize(None) if hasattr(check_date, 'tz_localize') and check_date.tz else check_date
            
            if check_date_normalized.date() <= cc_entry_date.date():
                continue
            
            daily_low = daily_row['low']
            daily_high = daily_row['high']
            
            # Check profit target (call decays)
            if daily_low <= exit_price_per_share:
                touch_profit_target = True
                touch_count += 1
                
                if config.get('use_probabilistic_exit_fills', True) and p_fill_profit is not None:
                    filled, u = try_probabilistic_fill(p_fill_profit, rng)
                    u_fill_profit = u
                    
                    if filled:
                        exit_date = check_date_normalized
                        exit_daily_row = daily_row
                        exit_reason = 'profit_target'
                        current_state = advance_wheel_state(current_state, 'profit_target')
                        filled_profit_target = True
                        actual_exit_per_share = get_exit_price(daily_row, config.get('fill_mode', 'realistic'), penalty=liquidity_penalty)
                        exit_price = actual_exit_per_share * 100
                        print(f"    ✓ CC profit target on {exit_date.date()} → State: {current_state}")
                        break
                else:
                    exit_date = check_date_normalized
                    exit_daily_row = daily_row
                    exit_reason = 'profit_target'
                    current_state = advance_wheel_state(current_state, 'profit_target')
                    filled_profit_target = True
                    actual_exit_per_share = get_exit_price(daily_row, config.get('fill_mode', 'realistic'), penalty=liquidity_penalty)
                    exit_price = actual_exit_per_share * 100
                    print(f"    ✓ CC profit target on {exit_date.date()} (deterministic) → State: {current_state}")
                    break
        
        # Handle expiration - check for assignment
        if exit_date is None:
            exit_date = expiration_date
            exit_daily_row = None
            exit_price = 0.0
            
            # Fetch underlying price at expiration
            underlying_at_exp = fetch_underlying_price_at_expiration(
                underlying_symbol, expiration_date, client, config
            )
            
            if underlying_at_exp is not None:
                # For calls: ITM when underlying >= strike → called away
                if underlying_at_exp >= strike:
                    exit_reason = 'called_away'
                    current_state = advance_wheel_state(current_state, 'called_away')
                    print(f"    📤 CC CALLED AWAY on {expiration_date.date()} → State: {current_state}")
                    print(f"      Underlying: ${underlying_at_exp:.2f} >= Strike: ${strike:.2f}")
                else:
                    exit_reason = 'expired_worthless'
                    current_state = advance_wheel_state(current_state, 'expired_worthless')
                    print(f"    🎉 CC expired worthless on {expiration_date.date()} → State: {current_state}")
            else:
                exit_reason = 'expired_worthless'
                current_state = advance_wheel_state(current_state, 'expired_worthless')
                print(f"    CC expired (assumed worthless) → State: {current_state}")
    
    except Exception as e:
        print(f"    ✗ CC Error: {e}")
        import traceback
        traceback.print_exc()
        exit_date = expiration_date
        exit_reason = 'error'
        exit_price = 0.0
        current_state = 'CC_CLOSED_WORTHLESS'
    
    # Create exit record
    cc_exit_record = {
        'wheel_id': wheel_id,
        'phase': 'cc',
        'state': current_state,
        'symbol': symbol,
        'strike': strike,
        'entry_date': cc_entry_date,
        'exit_date': exit_date,
        'expiration': expiration_date,
        'cost_basis': cost_basis,
        'initial_capital': initial_capital,
        'premium': premium,
        'exit_pct': exit_pct,
        'exit_price': exit_price,
        'exit_reason': exit_reason,
        'days_held': (exit_date - cc_entry_date).days if exit_date else None,
        'underlying_at_expiration': underlying_at_exp,
        'daily_low': exit_daily_row['low'] if exit_daily_row is not None else None,
        'daily_high': exit_daily_row['high'] if exit_daily_row is not None else None,
        'touch_profit_target': touch_profit_target,
        'p_fill_profit_target': p_fill_profit,
        'u_fill_profit_target': u_fill_profit,
        'filled_profit_target': filled_profit_target,
        'touch_count': touch_count,
        'fill_model': 'ohlc_touch_prob_v1',
        # Stock position info for P&L
        'net_stock_cost': net_stock_cost,
    }
    
    return cc_exit_record, current_state


print("=" * 60)
print("COVERED CALL BACKTEST FUNCTION LOADED")
print("=" * 60)


COVERED CALL BACKTEST FUNCTION LOADED
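
Two details of the covered-call backtest are easy to get wrong: the entry date after a Friday expiration, and a per-wheel RNG seed that stays the same across runs. Below is a minimal sketch of both, using only the standard library. `next_weekday` and `stable_seed` are hypothetical helpers, not part of the notebook. The weekday rule ignores exchange holidays, and `wheel_id` is assumed to be a string.

```python
import datetime as dt
import zlib


def next_weekday(day):
    """Return the first Monday-Friday date strictly after `day` (holidays ignored)."""
    nxt = day + dt.timedelta(days=1)
    while nxt.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        nxt += dt.timedelta(days=1)
    return nxt


def stable_seed(base_seed, wheel_id, modulus=1000):
    """Derive a per-wheel seed that is identical in every Python process.

    zlib.crc32 is a fixed checksum, whereas the built-in hash() of a str is
    salted per process unless PYTHONHASHSEED is set.
    """
    return base_seed + zlib.crc32(str(wheel_id).encode("utf-8")) % modulus


# 2023-07-14 is the Friday expiration of the TSLA puts in the log above.
entry = next_weekday(dt.date(2023, 7, 14))

# "wheel-001" is a made-up identifier used only for illustration.
seed = stable_seed(42, "wheel-001")

print(entry, seed)
```

A holiday-aware version could build on `pandas_market_calendars`, which the notebook already imports as `mcal`.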


In [67]:
# =============================================================================
# WHEEL ORCHESTRATION AND P&L CALCULATION
# =============================================================================

def calculate_wheel_pnl(csp_exit, cc_exit=None, assignment_record=None):
    """
    Calculate complete wheel P&L including CSP, CC, and stock legs.
    
    P&L Components:
    - CSP P&L: premium - exit_price - fees (already calculated)
    - CC P&L: call_premium - exit_price - call_fees
    - Stock P&L: (call_strike * 100) - cash_used (only if called away), where
      cash_used = put_strike * 100. The CSP premium is already counted in CSP P&L,
      so net_stock_cost (which subtracts it) is not used here.
    
    Wheel-level ROC is based on initial CSP capital (strike * 100).
    
    Args:
        csp_exit: CSP exit record dict
        cc_exit: CC exit record dict (optional, only if assigned)
        assignment_record: Assignment record dict (optional, only if assigned)
    
    Returns:
        dict: Wheel summary with total P&L
    """
    wheel_id = csp_exit['wheel_id']
    initial_capital = csp_exit.get('initial_capital', csp_exit['cost_basis'])
    
    # CSP P&L
    csp_premium = csp_exit['premium']
    csp_exit_price = csp_exit['exit_price']
    csp_fees = get_transaction_costs(CONFIG, is_round_trip=(csp_exit['exit_reason'] not in ['expired_worthless', 'assigned']))
    csp_pnl = csp_premium - csp_exit_price - csp_fees
    
    # Initialize totals
    cc_pnl = 0.0
    stock_pnl = 0.0
    cc_fees = 0.0
    total_days = csp_exit.get('days_held', 0) or 0
    
    # CC P&L (if applicable)
    if cc_exit is not None:
        cc_premium = cc_exit['premium']
        cc_exit_price = cc_exit['exit_price']
        cc_fees = get_transaction_costs(CONFIG, is_round_trip=(cc_exit['exit_reason'] not in ['expired_worthless', 'called_away']))
        cc_pnl = cc_premium - cc_exit_price - cc_fees
        total_days += cc_exit.get('days_held', 0) or 0
        
        # Stock P&L (only if called away)
        # Use the gross assignment cost (cash_used = put strike * 100). The CSP premium
        # is already in csp_pnl, so using net_stock_cost here would count it twice.
        # Assumes the assigned CSP leg is recorded with exit_price = 0.
        if cc_exit['exit_reason'] == 'called_away' and assignment_record is not None:
            cash_used = assignment_record['cash_used']
            call_strike = cc_exit['strike']
            stock_pnl = (call_strike * 100) - cash_used
    
    # Total wheel P&L
    total_pnl = csp_pnl + cc_pnl + stock_pnl
    total_fees = csp_fees + cc_fees
    
    # ROC calculations
    csp_roc = (csp_pnl / initial_capital) * 100
    wheel_roc = (total_pnl / initial_capital) * 100
    
    # Determine final state using state machine
    if cc_exit is not None:
        # CC was processed - use CC's final state or advance to complete
        cc_state = cc_exit.get('state', 'CC_ASSIGNED')
        if cc_state == 'WHEEL_COMPLETE':
            final_state = 'WHEEL_COMPLETE'
        else:
            final_state = advance_wheel_state(cc_state, 'complete')
    else:
        # No CC processed - advance CSP state to complete
        # This works for both assigned (incomplete) and non-assigned CSPs
        final_state = advance_wheel_state(csp_exit['state'], 'complete')
    
    return {
        'wheel_id': wheel_id,
        'phase': 'total',
        'state': final_state,
        'initial_capital': initial_capital,
        
        # CSP details
        'csp_premium': csp_premium,
        'csp_exit_price': csp_exit_price,
        'csp_pnl': csp_pnl,
        'csp_fees': csp_fees,
        
        # CC details
        'cc_premium': cc_exit['premium'] if cc_exit else 0.0,
        'cc_exit_price': cc_exit['exit_price'] if cc_exit else 0.0,
        'cc_pnl': cc_pnl,
        'cc_fees': cc_fees,
        
        # Stock details
        'stock_pnl': stock_pnl,
        
        # Totals
        'total_pnl': total_pnl,
        'total_fees': total_fees,
        'csp_roc': csp_roc,
        'wheel_roc': wheel_roc,
        'total_days': total_days,
        
        # Exit info
        'csp_exit_reason': csp_exit['exit_reason'],
        'cc_exit_reason': cc_exit['exit_reason'] if cc_exit else None,
    }


def run_full_wheel_backtest(csp_exits_df, client, config, cc_config):
    """
    Run full wheel backtest: process CSP exits and handle assignments with CC.
    
    For each CSP that was assigned:
    1. Create assignment record
    2. Fetch call chain for next trading day
    3. Select covered call
    4. Backtest CC position
    5. Calculate wheel P&L
    
    Args:
        csp_exits_df: DataFrame with CSP exit records
        client: Databento client
        config: Main CONFIG dict
        cc_config: CC_CONFIG dict
    
    Returns:
        tuple: (all_exits_df, wheel_summaries_df)
    """
    all_exits = []
    wheel_summaries = []
    
    for idx, csp_exit in csp_exits_df.iterrows():
        # Convert row to dict
        csp_exit_dict = csp_exit.to_dict()
        all_exits.append(csp_exit_dict)
        
        # Check if assigned
        if csp_exit['exit_reason'] != 'assigned':
            # No assignment - calculate simple wheel P&L (CSP only)
            wheel_summary = calculate_wheel_pnl(csp_exit_dict)
            wheel_summaries.append(wheel_summary)
            print(f"\n  Wheel {csp_exit['wheel_id']}: CSP {csp_exit['exit_reason']} → {wheel_summary['state']}")
            print(f"    CSP P&L: ${wheel_summary['csp_pnl']:.2f}, ROC: {wheel_summary['csp_roc']:.2f}%")
            continue
        
        # Handle assignment
        print(f"\n{'='*60}")
        print(f"PROCESSING ASSIGNMENT: Wheel {csp_exit['wheel_id']}")
        print(f"{'='*60}")
        
        assignment_record = handle_assignment(csp_exit_dict)
        print(f"  Assignment: {assignment_record['shares']} shares at ${assignment_record['strike']:.2f}")
        print(f"  Net stock cost: ${assignment_record['net_stock_cost']:.2f} (${assignment_record['stock_cost_per_share']:.2f}/share)")
        
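        # Fetch call chain for CC selection (next trading day).
        # BDay(1) skips weekends, so a Friday assignment rolls to Monday instead of Saturday.
        # It does not skip exchange holidays; a market calendar would be needed for that.
        underlying_symbol = assignment_record['symbol']
        cc_entry_date = pd.Timestamp(assignment_record['assignment_date']) + pd.offsets.BDay(1)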
        
        call_chain = fetch_option_chain_for_cc(
            underlying_symbol, cc_entry_date, client, config, cc_config
        )
        
        if len(call_chain) == 0:
            print("    ⚠ No suitable calls found - wheel incomplete")
            wheel_summary = calculate_wheel_pnl(csp_exit_dict)
            wheel_summaries.append(wheel_summary)
            continue
        
        # Select covered call
        cc_selection = select_covered_call(assignment_record, call_chain, cc_config, config)
        
        if cc_selection is None:
            print("    ⚠ No call selected - wheel incomplete")
            wheel_summary = calculate_wheel_pnl(csp_exit_dict)
            wheel_summaries.append(wheel_summary)
            continue
        
        # Backtest covered call
        cc_exit_dict, final_cc_state = backtest_covered_call(
            assignment_record, cc_selection, client, config, cc_config
        )
        all_exits.append(cc_exit_dict)
        
        # Calculate complete wheel P&L
        wheel_summary = calculate_wheel_pnl(csp_exit_dict, cc_exit_dict, assignment_record)
        wheel_summaries.append(wheel_summary)
        
        print(f"\n  WHEEL COMPLETE: {wheel_summary['wheel_id']}")
        print(f"    CSP P&L:   ${wheel_summary['csp_pnl']:.2f}")
        print(f"    CC P&L:    ${wheel_summary['cc_pnl']:.2f}")
        print(f"    Stock P&L: ${wheel_summary['stock_pnl']:.2f}")
        print(f"    ────────────────────────")
        print(f"    TOTAL:     ${wheel_summary['total_pnl']:.2f} ({wheel_summary['wheel_roc']:.2f}% ROC)")
        print(f"    Days:      {wheel_summary['total_days']}")
    
    # Create DataFrames
    all_exits_df = pd.DataFrame(all_exits)
    wheel_summaries_df = pd.DataFrame(wheel_summaries)
    
    return all_exits_df, wheel_summaries_df


print("=" * 60)
print("WHEEL ORCHESTRATION FUNCTIONS LOADED")
print("=" * 60)
print("  - calculate_wheel_pnl(): Calculate complete wheel P&L")
print("  - run_full_wheel_backtest(): Process CSP exits and handle CCs")
print("=" * 60)


WHEEL ORCHESTRATION FUNCTIONS LOADED
  - calculate_wheel_pnl(): Calculate complete wheel P&L
  - run_full_wheel_backtest(): Process CSP exits and handle CCs
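
The aggregation in `calculate_wheel_pnl()` can be checked by hand. The sketch below is a minimal, standalone illustration of the called-away case using the same formulas: `stock_pnl = call_strike * 100 - net_stock_cost`, total P&L as the sum of the three legs, and ROC against initial capital. The function name `wheel_totals_sketch` and all dollar figures are hypothetical and do not come from a backtest run.

```python
def wheel_totals_sketch(csp_pnl, cc_pnl, call_strike, net_stock_cost, initial_capital):
    """Hypothetical helper mirroring the called-away branch of calculate_wheel_pnl()."""
    # Shares are called away at the call strike (100 shares per contract)
    stock_pnl = (call_strike * 100) - net_stock_cost
    # Total wheel P&L is the sum of the CSP, CC, and stock legs
    total_pnl = csp_pnl + cc_pnl + stock_pnl
    # ROC is expressed in percent of the capital that secured the put
    wheel_roc = (total_pnl / initial_capital) * 100
    return {'stock_pnl': stock_pnl, 'total_pnl': total_pnl, 'wheel_roc': wheel_roc}


# Made-up inputs: shares cost $20,500 net and are called away at a 210 strike
example = wheel_totals_sketch(
    csp_pnl=500.0, cc_pnl=300.0, call_strike=210.0,
    net_stock_cost=20500.0, initial_capital=20500.0,
)
print(example)
```

Keeping the three legs separate in the result makes it easy to see whether a wheel's return came from option premium or from the stock move.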


In [68]:

# =============================================================================
# RUN FULL WHEEL BACKTEST
# =============================================================================
# Process CSP exits and handle any assignments with covered calls

print("=" * 60)
print("RUNNING FULL WHEEL BACKTEST")
print("=" * 60)
print(f"Processing {len(exits_df)} CSP exits...")
print(f"Exit reasons: {dict(exits_df['exit_reason'].value_counts())}")

# Run the wheel backtest
all_exits_df, wheel_summaries_df = run_full_wheel_backtest(
    csp_exits_df=exits_df,
    client=client,
    config=CONFIG,
    cc_config=CC_CONFIG
)

print("\n" + "=" * 60)
print("WHEEL BACKTEST COMPLETE")
print("=" * 60)


RUNNING FULL WHEEL BACKTEST
Processing 7 CSP exits...
Exit reasons: {'profit_target': np.int64(7)}

  Wheel 396c8fb1: CSP profit_target → WHEEL_COMPLETE
    CSP P&L: $553.58, ROC: 2.64%

  Wheel 3a383441: CSP profit_target → WHEEL_COMPLETE
    CSP P&L: $326.68, ROC: 1.63%

  Wheel 5cef9c1c: CSP profit_target → WHEEL_COMPLETE
    CSP P&L: $456.68, ROC: 2.23%

  Wheel e9b484b8: CSP profit_target → WHEEL_COMPLETE
    CSP P&L: $484.68, ROC: 2.31%

  Wheel ea15de0c: CSP profit_target → WHEEL_COMPLETE
    CSP P&L: $395.68, ROC: 1.98%

  Wheel 44075f08: CSP profit_target → WHEEL_COMPLETE
    CSP P&L: $489.68, ROC: 2.39%

  Wheel 9a35219b: CSP profit_target → WHEEL_COMPLETE
    CSP P&L: $411.28, ROC: 2.01%

WHEEL BACKTEST COMPLETE
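
In this run every CSP hit its profit target, so each wheel went straight from a closed CSP to `WHEEL_COMPLETE` and the covered-call branch never executed. The sketch below shows one way a transition table could encode the "Trade States" diagram from the top of the notebook. It is not the notebook's `advance_wheel_state()`: the name `advance_state_sketch`, the event names other than `profit_target`, `assigned`, and `complete`, and the error handling are all assumptions, and the incomplete-wheel case (assigned with no call sold) is not modeled.

```python
# States that may advance to WHEEL_COMPLETE, per the "Trade States" diagram
TERMINAL_STATES = {
    'CSP_CLOSED_PROFIT', 'CSP_CLOSED_STOP', 'CSP_CLOSED_WORTHLESS',
    'CC_CLOSED_PROFIT', 'CC_ASSIGNED', 'CC_CLOSED_WORTHLESS',
}

# (current state, event) -> next state; several event names are hypothetical
TRANSITIONS = {
    ('CSP_OPEN', 'profit_target'): 'CSP_CLOSED_PROFIT',
    ('CSP_OPEN', 'stop_loss'): 'CSP_CLOSED_STOP',
    ('CSP_OPEN', 'assigned'): 'CSP_ASSIGNED',
    ('CSP_OPEN', 'expired_worthless'): 'CSP_CLOSED_WORTHLESS',
    ('CSP_ASSIGNED', 'sell_call'): 'CC_OPEN',
    ('CC_OPEN', 'profit_target'): 'CC_CLOSED_PROFIT',
    ('CC_OPEN', 'assigned'): 'CC_ASSIGNED',
    ('CC_OPEN', 'expired_worthless'): 'CC_CLOSED_WORTHLESS',
}


def advance_state_sketch(state, event):
    """Return the next state, or raise if the transition is not in the table."""
    if event == 'complete' and state in TERMINAL_STATES:
        return 'WHEEL_COMPLETE'
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"Illegal transition: {state} + {event}") from None


# The path taken by all seven wheels in the run above
state = advance_state_sketch('CSP_OPEN', 'profit_target')
state = advance_state_sketch(state, 'complete')
print(state)  # WHEEL_COMPLETE
```

Rejecting unknown transitions loudly is a design choice: a silent fallback would hide exactly the kind of lifecycle bug the state machine is meant to prevent.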


In [None]:

# =============================================================================
# WHEEL BACKTEST RESULTS
# =============================================================================

print("=" * 60)
print("ALL EXITS (CSP + CC)")
print("=" * 60)
print(f"Total exit records: {len(all_exits_df)}")
print(f"\nPhase breakdown:")
print(all_exits_df['phase'].value_counts())
print(f"\nState breakdown:")
print(all_exits_df['state'].value_counts())
print(f"\nExit reason breakdown:")
print(all_exits_df['exit_reason'].value_counts())

# Display key columns
display_cols = ['wheel_id', 'phase', 'state', 'symbol', 'strike', 'entry_date', 'exit_date', 
                'premium', 'exit_price', 'exit_reason', 'days_held']
available_cols = [c for c in display_cols if c in all_exits_df.columns]
all_exits_df[available_cols]


In [53]:
# =============================================================================
# WHEEL SUMMARIES
# =============================================================================

print("=" * 60)
print("WHEEL SUMMARIES")
print("=" * 60)
print(f"Total wheels: {len(wheel_summaries_df)}")
print(f"\nFinal state breakdown:")
print(wheel_summaries_df['state'].value_counts())

# Calculate aggregate statistics
total_pnl = wheel_summaries_df['total_pnl'].sum()
total_capital = wheel_summaries_df['initial_capital'].sum()
avg_wheel_roc = wheel_summaries_df['wheel_roc'].mean()
total_days = wheel_summaries_df['total_days'].sum()

print(f"\n{'─'*40}")
print(f"AGGREGATE STATISTICS")
print(f"{'─'*40}")
print(f"Total P&L:       ${total_pnl:,.2f}")
print(f"Total Capital:   ${total_capital:,.2f}")
print(f"Aggregate ROC:   {(total_pnl/total_capital)*100:.2f}%")
print(f"Avg Wheel ROC:   {avg_wheel_roc:.2f}%")
print(f"Total Days:      {total_days}")

# Show summaries
summary_cols = ['wheel_id', 'state', 'csp_pnl', 'cc_pnl', 'stock_pnl', 'total_pnl', 
                'wheel_roc', 'total_days', 'csp_exit_reason', 'cc_exit_reason']
available_cols = [c for c in summary_cols if c in wheel_summaries_df.columns]
wheel_summaries_df[available_cols].round(2)


WHEEL SUMMARIES


NameError: name 'wheel_summaries_df' is not defined

In [25]:
# =============================================================================
# SENSITIVITY ANALYSIS: Probabilistic Fill Scale
# =============================================================================

# Run backtest with different pfill_scale values to test robustness
sensitivity_results = []

# Save original config
original_pfill_scale = CONFIG['pfill_scale']
original_execution_seed = CONFIG['execution_seed']

try:
    for scale in [0.8, 1.0, 1.2]:
        print(f"\n{'='*60}")
        print(f"Running sensitivity test: pfill_scale = {scale}")
        print(f"{'='*60}")

        # Update config
        CONFIG['pfill_scale'] = scale
        CONFIG['execution_seed'] = 42  # Keep seed constant for fair comparison

        # Run backtest
        exits_sensitivity = backtest_exit_strategy(
            backtest_candidates=backtest_candidates,
            client=client,
            config=CONFIG
        )

        # Calculate metrics
        if len(exits_sensitivity) > 0:
            # Share of trades that touched the profit target but never filled
            if 'touch_profit_target' in exits_sensitivity.columns:
                touched = exits_sensitivity['touch_profit_target'] == True
                unfilled = exits_sensitivity['filled_profit_target'] == False
                pct_touch_missed = (touched & unfilled).mean() * 100
            else:
                pct_touch_missed = 0.0

            sensitivity_results.append({
                'pfill_scale': scale,
                'total_roc': exits_sensitivity['roc'].sum(),
                'avg_roc': exits_sensitivity['roc'].mean(),
                'pct_profit_target': (exits_sensitivity['exit_reason'] == 'profit_target').mean() * 100,
                'pct_touch_missed': pct_touch_missed,
                'avg_days_held': exits_sensitivity['days_held'].mean(),
                'total_trades': len(exits_sensitivity),
            })
        else:
            sensitivity_results.append({
                'pfill_scale': scale,
                'total_roc': 0.0,
                'avg_roc': 0.0,
                'pct_profit_target': 0.0,
                'pct_touch_missed': 0.0,
                'avg_days_held': 0.0,
                'total_trades': 0,
            })
finally:
    # Restore original config even if a run raises, so later cells see the baseline settings
    CONFIG['pfill_scale'] = original_pfill_scale
    CONFIG['execution_seed'] = original_execution_seed

# Display results
print(f"\n{'='*60}")
print("SENSITIVITY ANALYSIS RESULTS")
print(f"{'='*60}\n")

sensitivity_df = pd.DataFrame(sensitivity_results)
print(sensitivity_df.to_string(index=False))

print(f"\n{'='*60}")
print("INTERPRETATION")
print(f"{'='*60}")
print("pfill_scale controls fill probability sensitivity:")
print("  - 0.8 = pessimistic (lower fill rates)")
print("  - 1.0 = baseline (nominal fill rates)")
print("  - 1.2 = optimistic (higher fill rates)")
print("\nStrategy should remain robust across all scales.")
print("If performance collapses at 0.8, strategy may not be viable.")



Running sensitivity test: pfill_scale = 0.8

Processing TSLA  230714P00210000...
  Entry: 2023-06-06, Premium: $918.00 ($9.18/share)
  Exit target: $459.00 ($4.59/share, exit at 50% of premium)
  Stop loss: $1836.00 ($18.36/share)
    [CACHE HIT] Loading daily prices for TSLA  230714P00210000
  ✓ Profit target hit on 2023-06-08 (touch #1, filled)
    Target: $4.59/share, Actual fill: $3.63/share
    p_fill=0.72, u=0.375

Processing TSLA  230714P00200000...
  Entry: 2023-06-06, Premium: $599.00 ($5.99/share)
  Exit target: $299.50 ($3.00/share, exit at 50% of premium)
  Stop loss: $1198.00 ($11.98/share)
    [CACHE HIT] Loading daily prices for TSLA  230714P00200000
    Touch #1 on 2023-06-08: NO FILL (p_fill=0.61, u=0.951)
    Touch #2 on 2023-06-08: NO FILL (p_fill=0.61, u=0.732)
  ✓ Profit target hit on 2023-06-08 (touch #3, filled)
    Target: $3.00/share, Actual fill: $2.46/share
    p_fill=0.61, u=0.599

Processing TSLA  230714P00205000...
  Entry: 2023-06-06, Premium: $749.00 ($
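
The log lines above are consistent with a simple rule: a touch fills when the uniform draw `u` is below `p_fill` (0.375 against 0.72 fills; 0.951 and 0.732 against 0.61 do not; 0.599 does). The sketch below illustrates that rule under stated assumptions. How `pfill_scale` enters the real fill model is not shown in this section, so multiplying a base probability by the scale and clipping to [0, 1] is an assumption, as are the names `touch_fills_sketch` and `base_p_fill` and the base value of 0.6.

```python
import random


def touch_fills_sketch(base_p_fill, pfill_scale, u):
    """Return (filled, effective p_fill) for one touch of the profit target."""
    # ASSUMPTION: the scale multiplies the base probability, clipped to [0, 1]
    p_fill = min(1.0, max(0.0, base_p_fill * pfill_scale))
    # Rule inferred from the log: fill when the uniform draw is below p_fill
    return u < p_fill, p_fill


rng = random.Random(42)  # fixed seed so every scale sees the same draws
draws = [rng.random() for _ in range(1000)]

fill_rates = {}
for scale in (0.8, 1.0, 1.2):
    fills = sum(touch_fills_sketch(0.6, scale, u)[0] for u in draws)
    fill_rates[scale] = fills / len(draws)
    print(f"pfill_scale={scale}: {fill_rates[scale]:.1%} of touches filled")
```

Reusing the same draws across scales is the reason the sensitivity cell pins `execution_seed`: differences between runs then come from the scale alone and not from random-number noise.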

In [26]:
# Configure pandas to display all columns
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)

exits_df.round(2)

Unnamed: 0,symbol,entry_date,exit_date,expiration,cost_basis,premium,exit_pct,exit_price,exit_reason,days_held,daily_low,daily_high,touch_profit_target,p_fill_profit_target,u_fill_profit_target,filled_profit_target,spread_pct_entry,ivp_entry,touch_count,fill_model,fees,exit_pnl,exit_pnl_pct,roc
0,TSLA 230714P00210000,2023-06-06,2023-06-08 20:00:00,2023-07-14,21000.0,918.0,0.5,363.1,profit_target,2,3.55,3.82,True,0.54,0.37,True,0.04,0.14,1,ohlc_touch_prob_v1,1.32,553.58,60.3,2.64
1,TSLA 230714P00200000,2023-06-06,2023-06-08 20:00:00,2023-07-14,20000.0,599.0,0.5,271.0,profit_target,2,2.71,2.71,True,0.46,0.16,True,0.03,0.71,4,ohlc_touch_prob_v1,1.32,326.68,54.54,1.63
2,TSLA 230714P00205000,2023-06-06,2023-06-08 20:00:00,2023-07-14,20500.0,749.0,0.5,291.0,profit_target,2,2.85,3.05,True,0.54,0.16,True,0.03,0.43,1,ohlc_touch_prob_v1,1.32,456.68,60.97,2.23
3,TSLA 230707P00210000,2023-06-06,2023-06-08 20:00:00,2023-07-07,21000.0,806.0,0.5,320.0,profit_target,2,2.56,3.2,True,0.54,0.06,True,0.04,0.29,1,ohlc_touch_prob_v1,1.32,484.68,60.13,2.31
4,TSLA 230721P00200000,2023-06-06,2023-06-08 20:00:00,2023-07-21,20000.0,782.0,0.5,385.0,profit_target,2,3.16,3.85,True,0.38,0.02,True,0.01,1.0,4,ohlc_touch_prob_v1,1.32,395.68,50.6,1.98
5,TSLA 230721P00205000,2023-06-06,2023-06-08 20:00:00,2023-07-21,20500.0,951.0,0.5,460.0,profit_target,2,4.2,4.6,True,0.46,0.21,True,0.01,0.86,3,ohlc_touch_prob_v1,1.32,489.68,51.49,2.39
6,TSLA 230707P00205000,2023-06-06,2023-06-08 20:00:00,2023-07-07,20500.0,634.0,0.5,221.4,profit_target,2,2.0,2.38,True,0.54,0.18,True,0.03,0.57,1,ohlc_touch_prob_v1,1.32,411.28,64.87,2.01
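
The P&L columns in the table above can be reproduced by hand. Each row is consistent with `exit_pnl = premium - exit_price - fees` and `roc = exit_pnl / cost_basis` (in percent). That relationship is inferred from the displayed values, not read from the backtest code, so treat the sketch below as a sanity check. The helper name `recompute_sketch` is hypothetical; the inputs are copied from rows 0 and 1.

```python
def recompute_sketch(premium, exit_price, fees, cost_basis):
    """Recompute exit P&L (dollars) and ROC (percent) from the displayed columns."""
    exit_pnl = premium - exit_price - fees
    roc = exit_pnl / cost_basis * 100
    return exit_pnl, roc


# (premium, exit_price, fees, cost_basis, reported exit_pnl, reported roc)
rows = [
    (918.0, 363.1, 1.32, 21000.0, 553.58, 2.64),
    (599.0, 271.0, 1.32, 20000.0, 326.68, 1.63),
]

for premium, exit_price, fees, cost_basis, reported_pnl, reported_roc in rows:
    exit_pnl, roc = recompute_sketch(premium, exit_price, fees, cost_basis)
    print(f"exit_pnl={exit_pnl:.2f} (reported {reported_pnl}), "
          f"roc={roc:.2f}% (reported {reported_roc}%)")
```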


In [27]:
100*(exits_df.exit_pnl.sum()/exits_df.cost_basis.sum())


np.float64(2.173003484320558)
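
The figure above is a capital-weighted aggregate (total P&L over total cost basis), which is not the same as the simple mean of the per-trade `roc` column. The sketch below recomputes both from the rounded `exit_pnl` and `cost_basis` values shown in the table, using plain Python so it runs without the notebook state. The two numbers differ slightly because trades with a larger cost basis carry more weight in the aggregate.

```python
# Values copied from the exits_df display (rounded to 2 decimals)
exit_pnl = [553.58, 326.68, 456.68, 484.68, 395.68, 489.68, 411.28]
cost_basis = [21000.0, 20000.0, 20500.0, 21000.0, 20000.0, 20500.0, 20500.0]

# Capital-weighted: one ratio over the pooled totals
aggregate_roc = 100 * sum(exit_pnl) / sum(cost_basis)

# Equal-weighted: average of the per-trade ratios
mean_roc = 100 * sum(p / c for p, c in zip(exit_pnl, cost_basis)) / len(exit_pnl)

print(f"aggregate ROC:      {aggregate_roc:.4f}%")
print(f"mean per-trade ROC: {mean_roc:.4f}%")
```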

In [28]:
# We need to save backtest results with metadata as our strategy evolves.
# exits_df should contain option data such as delta at entry, peak delta, and any other fields that would help analysis.

# ROC per day held (fraction of cost basis per day; assumes days_held > 0)
exits_df['daily_adjusted_roc'] = exits_df['exit_pnl'] / exits_df['cost_basis'] / exits_df['days_held']
exits_df['daily_adjusted_roc'].describe()
exits_df['days_held'].describe()
exits_df['exit_reason'].value_counts()


exit_reason
profit_target    7
Name: count, dtype: int64
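
One possible way to act on the note about saving results with metadata is to write each run's rows to a CSV alongside a JSON sidecar that records the config used. The sketch below uses only the standard library. The function name `save_run_sketch`, the file naming scheme, and the metadata fields are all hypothetical choices, not an existing part of this notebook.

```python
import csv
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_run_sketch(rows, config, out_dir):
    """Write result rows to CSV plus a JSON sidecar describing the run."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Short hash of the config so runs with identical settings share an ID
    config_json = json.dumps(config, sort_keys=True, default=str)
    run_id = hashlib.sha256(config_json.encode()).hexdigest()[:8]

    # Assumes rows is a non-empty list of dicts with identical keys
    csv_path = out_dir / f"exits_{run_id}.csv"
    with csv_path.open('w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)

    meta = {
        'run_id': run_id,
        'saved_at': datetime.now(timezone.utc).isoformat(),
        'n_trades': len(rows),
        'config': json.loads(config_json),
    }
    meta_path = out_dir / f"exits_{run_id}.meta.json"
    meta_path.write_text(json.dumps(meta, indent=2))
    return csv_path, meta_path


# Demo with made-up inputs, written to a temporary directory
demo_rows = [{'symbol': 'TSLA 230714P00210000', 'exit_reason': 'profit_target', 'roc': 2.64}]
demo_config = {'pfill_scale': 1.0, 'execution_seed': 42}
csv_path, meta_path = save_run_sketch(demo_rows, demo_config, tempfile.mkdtemp())
print(csv_path.name, meta_path.name)
```

In the notebook, `exits_df.to_dict('records')` would supply the rows and `CONFIG` the config. Extra per-trade fields such as entry delta would flow through automatically once they are columns in `exits_df`.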