# Wheel Strategy v5: Date Range + Multi-Ticker Scheduler

This notebook extends v4 with a scheduler layer that iterates over trading dates and symbols, launching wheel instances for each qualifying entry.

> **v5 Scope**: Changes are limited to *wrappers + orchestration + aggregation*; core strategy logic from v4 is unchanged.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                    V5 SCHEDULER LAYER                       │
│  SCHEDULER_CONFIG → get_trading_days() → Date x Symbol Loop │
│                     → get_entry_candidates() → candidates   │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                 V4 WHEEL ENGINE (FROZEN)                    │
│  CONFIG, CC_CONFIG → run_single_wheel() → CSP/CC Lifecycle  │
│                     → State Machine → Exit Records          │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                     AGGREGATION                             │
│  all_wheel_results → aggregate_v5_results() → Summary Stats │
└─────────────────────────────────────────────────────────────┘
```
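
The three layers above can be sketched as a nested loop. This is a minimal illustration, not the notebook's implementation: `weekdays_between` stands in for `get_trading_days()` and ignores market holidays, while `find_candidates`, `run_wheel`, and `summarize` are hypothetical placeholders for `get_entry_candidates()`, `run_single_wheel()`, and `aggregate_v5_results()`, whose real signatures are not shown here.

```python
from datetime import date, timedelta


def weekdays_between(start, end):
    """Stand-in for get_trading_days(): Mon-Fri only, holidays NOT excluded."""
    days = []
    current = start
    while current <= end:
        if current.weekday() < 5:  # 0-4 = Monday-Friday
            days.append(current)
        current += timedelta(days=1)
    return days


def run_scheduler(start, end, symbols, find_candidates, run_wheel, summarize):
    """Date x symbol loop: collect one result per launched wheel, then aggregate."""
    all_results = []
    for day in weekdays_between(start, end):
        for symbol in sorted(symbols):  # fixed order keeps runs comparable
            for candidate in find_candidates(day, symbol):
                all_results.append(run_wheel(day, symbol, candidate))
    return summarize(all_results)


# Toy placeholders so the sketch runs end to end (all hypothetical).
def find_candidates(day, symbol):
    return [{"strike": 200.0}]


def run_wheel(day, symbol, candidate):
    return {"date": day, "symbol": symbol, "pnl": 10.0}


def summarize(results):
    return {"wheels": len(results), "total_pnl": sum(r["pnl"] for r in results)}


summary = run_scheduler(
    date(2023, 6, 6), date(2023, 6, 9), ["TSLA"],
    find_candidates, run_wheel, summarize,
)
print(summary)
```

Passing the three stages in as callables mirrors the diagram: the scheduler only decides when and for which symbol to call the engine, and never touches the engine's internals.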

## Key Features
- **Multi-Date Backtesting**: Run strategy across configurable date ranges
- **Multi-Symbol Support**: Test across multiple underlyings simultaneously  
- **Deterministic Execution**: Reproducible results via sorted candidates + execution_seed
- **Version Tracking**: Each result stamped with `execution_version` for comparison
- **Preserved v4 Engine**: State machine, exit logic, and P&L calculation unchanged
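
One way to get the reproducibility described in the list above is to sort candidates by a stable key before iterating and to draw all random numbers from a single seeded generator. The sketch below assumes candidate dicts with hypothetical `symbol`, `expiration`, and `strike` fields and uses Python's `random.Random`; the notebook's own candidate schema and generator may differ.

```python
import random


def ordered_candidates(candidates):
    """Sort by a stable composite key so iteration order never depends on input order."""
    return sorted(candidates, key=lambda c: (c["symbol"], c["expiration"], c["strike"]))


def simulate_fills(candidates, p_fill, seed):
    """Draw one uniform number per candidate from a generator seeded once."""
    rng = random.Random(seed)
    return [(c["strike"], rng.random() <= p_fill) for c in ordered_candidates(candidates)]


candidates = [
    {"symbol": "TSLA", "expiration": "2023-07-21", "strike": 210.0},
    {"symbol": "TSLA", "expiration": "2023-07-14", "strike": 200.0},
]

# Same seed + same sorted order => identical outcomes, even if input order changes.
run_a = simulate_fills(candidates, p_fill=0.54, seed=42)
run_b = simulate_fills(list(reversed(candidates)), p_fill=0.54, seed=42)
print(run_a == run_b)
```

Without the sort, the same seed could still produce different results, because each random draw would be paired with a different candidate.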

## Trade States (from v4)
```
CSP_OPEN → CSP_CLOSED_PROFIT | CSP_CLOSED_STOP | CSP_ASSIGNED | CSP_CLOSED_WORTHLESS
CSP_ASSIGNED → CC_OPEN | WHEEL_COMPLETE (no CC sold)
CC_OPEN → CC_CLOSED_PROFIT | CC_ASSIGNED | CC_CLOSED_WORTHLESS
All terminal states → WHEEL_COMPLETE
```


## Imports

In [1]:
from pathlib import Path
from dotenv import dotenv_values, load_dotenv
import sys
import os
import pandas as pd
import databento as db
import pandas_market_calendars as mcal

sys.executable

env_path = Path("/Users/samuelminer/Projects/nissan_options/wheel_strategy/.env")

print("Parsed keys:", dotenv_values(env_path).keys())

load_dotenv(env_path)  # load the same .env file that was inspected above

assert os.getenv("DATABENTO_API_KEY"), "DATABENTO_API_KEY still not found"
print("os.getenv:", bool(os.getenv("DATABENTO_API_KEY")))
client = db.Historical()

pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)


Parsed keys: odict_keys(['DATABENTO_API_KEY', 'ANTHROPIC_API_KEY'])
os.getenv: True


## Configuration

The next three cells hold all configurable parameters for the backtest: scheduler, CSP engine, and covered call settings. Edit them to change a run.

In [2]:
# =============================================================================
# SCHEDULER CONFIGURATION (V5 Addition)
# =============================================================================
# Controls multi-date, multi-symbol orchestration layer.
# CONFIG and CC_CONFIG remain frozen - scheduler only wraps them.

SCHEDULER_CONFIG = {
    # -------------------------------------------------------------------------
    # DATE RANGE
    # -------------------------------------------------------------------------
    'start_date': '2023-06-06',                # First trading day to consider
    'end_date': '2023-09-13',                  # Last trading day to consider
    'trading_calendar': 'NYSE',                # Market calendar for trading days
    
    # -------------------------------------------------------------------------
    # SYMBOLS
    # -------------------------------------------------------------------------
    'symbols': ['TSLA'],                       # Underlyings to backtest
    
    # -------------------------------------------------------------------------
    # WHEEL CONSTRAINTS
    # -------------------------------------------------------------------------
    'allow_multiple_wheels_per_symbol': True,  # If True, launch wheel for every candidate
    'max_wheels_per_symbol_per_day': None,     # None = no limit, or integer cap
    
    # -------------------------------------------------------------------------
    # EXECUTION
    # -------------------------------------------------------------------------
    'scheduler_seed': 123,                     # Reserved for v6 stochastic scheduling
    'log_level': 'INFO',                       # 'INFO' = verbose, 'QUIET' = minimal output
}

print("=" * 60)
print("SCHEDULER CONFIGURATION (V5)")
print("=" * 60)
print(f"Date Range:      {SCHEDULER_CONFIG['start_date']} to {SCHEDULER_CONFIG['end_date']}")
print(f"Calendar:        {SCHEDULER_CONFIG['trading_calendar']}")
print(f"Symbols:         {SCHEDULER_CONFIG['symbols']}")
print(f"Multiple Wheels: {SCHEDULER_CONFIG['allow_multiple_wheels_per_symbol']}")
print(f"Max per Day:     {SCHEDULER_CONFIG['max_wheels_per_symbol_per_day'] or 'Unlimited'}")
print(f"Log Level:       {SCHEDULER_CONFIG['log_level']}")
print("=" * 60)


SCHEDULER CONFIGURATION (V5)
Date Range:      2023-06-06 to 2023-09-13
Calendar:        NYSE
Symbols:         ['TSLA']
Multiple Wheels: True
Max per Day:     Unlimited
Log Level:       INFO
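
How the two wheel constraints might combine is sketched below. The helper `limit_candidates` is hypothetical, and the interpretation (at most one wheel when `allow_multiple_wheels_per_symbol` is false, otherwise an optional integer cap) is an assumption based on the config comments, not confirmed scheduler behavior.

```python
def limit_candidates(candidates, allow_multiple, max_per_day):
    """Return the candidates that would be launched for one symbol on one day."""
    if not allow_multiple:
        return candidates[:1]          # assumed: single wheel per symbol per day
    if max_per_day is None:
        return list(candidates)        # None = no limit
    return candidates[:max_per_day]    # integer cap


candidates = ["put_A", "put_B", "put_C"]
print(limit_candidates(candidates, allow_multiple=True, max_per_day=None))
print(limit_candidates(candidates, allow_multiple=True, max_per_day=2))
print(limit_candidates(candidates, allow_multiple=False, max_per_day=None))
```

The explicit `is None` check keeps a cap of `0` distinct from "no limit"; a plain truthiness test would treat both the same way.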


In [3]:
# =============================================================================
# UNIFIED CONFIGURATION
# =============================================================================

CONFIG = {
    # -------------------------------------------------------------------------
    # SYMBOL & TIMING
    # -------------------------------------------------------------------------
    'symbol': SCHEDULER_CONFIG['symbols'],                          # Underlying symbols (a list, inherited from the scheduler)
    'timezone': 'America/New_York',
    
    # Entry date/time for the single-day backtest
    'entry_date': SCHEDULER_CONFIG['start_date'],                # Date to enter positions
    'entry_time': '15:45',                     # Time to capture option chain snapshot
    
    # Historical data lookback for technical indicators (e.g., Bollinger Bands)
    'lookback_days': 252 * 2,                  # ~2 years of daily data
    
    # -------------------------------------------------------------------------
    # OPTION SELECTION CRITERIA
    # -------------------------------------------------------------------------
    'option_type': 'P',                        # 'P' for puts (CSP), 'C' for calls
    'dte_min': 30,                             # Minimum days to expiration
    'dte_max': 45,                             # Maximum days to expiration
    'delta_min': 0.25,                         # Minimum absolute delta
    'delta_max': 0.35,                         # Maximum absolute delta
    
    # -------------------------------------------------------------------------
    # LIQUIDITY MODEL (regime-aware, penalty-based)
    # -------------------------------------------------------------------------
    # Hard rejection thresholds (truly untradeable)
    'min_bid_hard': 0.10,                      # Hard floor - reject penny options
    'hard_max_spread_pct': 0.20,               # Hard ceiling - reject extreme spreads
    
    # Base target spread (calm market conditions)
    'base_max_spread_pct': 0.08,               # Target max spread in normal conditions
    
    # IV regime adjustments (allow wider spreads in high-vol)
    'ivp_high_threshold': 0.70,                # IV percentile threshold for "high vol"
    'ivp_high_max_spread_pct': 0.12,           # Allowed spread when IV is high
    'ivp_extreme_threshold': 0.90,             # IV percentile threshold for "extreme vol"
    'ivp_extreme_max_spread_pct': 0.15,        # Allowed spread when IV is extreme
    
    # DTE adjustments (short-dated options have wider spreads)
    'short_dte_threshold': 7,                  # DTE below this gets extra allowance
    'short_dte_extra_spread_pct': 0.02,        # Extra spread allowance for short DTE
    
    # Penalty tiers (execution tax based on spread quality)
    # tight:    spread <= 0.6 * allowed → penalty = 1.0 (no extra slippage)
    # moderate: spread <= allowed       → penalty = 1.15 (15% wider effective spread)
    # wide:     spread <= hard_max      → penalty = 1.35 (35% wider effective spread)
    # ugly:     spread > hard_max       → REJECT (no trade)
    
    # -------------------------------------------------------------------------
    # EXIT STRATEGY
    # -------------------------------------------------------------------------
    'exit_pct': 0.50,                          # 0.50 = buy back at 50%, keep 50% profit
    'stop_loss_multiplier': 2.0,               # Exit if option price reaches Nx premium
    'max_hold_dte': None,                      # Exit at X DTE if no other trigger (None = disabled)
    
    # -------------------------------------------------------------------------
    # TRANSACTION COSTS (NEW - will be applied later)
    # -------------------------------------------------------------------------
    'commission_per_contract': 0.65,           # Per contract commission (round trip = 2x)
    'sec_fee_per_contract': 0.01,              # SEC/TAF fees per contract
    
    # -------------------------------------------------------------------------
    # EXECUTION / FILL ASSUMPTIONS (NEW - will be applied later)
    # -------------------------------------------------------------------------
    'fill_mode': 'mid',                        # 'mid' (current), 'bid' (realistic), 'pessimistic'
    'use_realistic_fills': False,              # When True: sell at bid, buy back at ask
    
    # -------------------------------------------------------------------------
    # PROBABILISTIC EXIT FILLS
    # -------------------------------------------------------------------------
    'execution_seed': 42,                      # Random seed for reproducible fills
    'use_probabilistic_exit_fills': True,      # Enable probabilistic fill model
    
    # Fill probability buckets by spread quality
    'pfill_tight': 0.90,                       # spread <= 5%
    'pfill_normal': 0.70,                      # spread <= 10%
    'pfill_wide': 0.40,                        # spread > 10%
    
    # Spread thresholds for buckets
    'tight_spread_pct': 0.05,
    'normal_spread_pct': 0.10,
    
    # Scaling and clamping
    'pfill_scale': 0.6,                        # Sensitivity multiplier (e.g. 0.8, 1.0, 1.2)
    'pfill_min': 0.05,
    'pfill_max': 0.98,
    
    # Optional IVP penalty multipliers
    'pfill_ivp_high_mult': 0.85,
    'pfill_ivp_extreme_mult': 0.70,
    
    # -------------------------------------------------------------------------
    # CACHE
    # -------------------------------------------------------------------------
    'cache_dir': '../cache/',
}

# -------------------------------------------------------------------------
# DERIVED VALUES (computed from CONFIG)
# -------------------------------------------------------------------------
SYMBOL = CONFIG['symbol']
TZ = CONFIG['timezone']
CACHE_DIR = CONFIG['cache_dir']
os.makedirs(CACHE_DIR, exist_ok=True)

# Entry timestamp
ENTRY_DATE = pd.Timestamp(CONFIG['entry_date'], tz=TZ)
ENTRY_TIME = pd.Timestamp(f"{CONFIG['entry_date']} {CONFIG['entry_time']}", tz=TZ)

print("=" * 60)
print("BACKTEST CONFIGURATION")
print("=" * 60)
print(f"Symbol:          {SYMBOL}")
print(f"Entry Date:      {ENTRY_DATE.date()}")
print(f"Entry Time:      {CONFIG['entry_time']}")
print(f"Option Type:     {'Cash-Secured Put' if CONFIG['option_type'] == 'P' else 'Covered Call'}")
print(f"DTE Range:       {CONFIG['dte_min']} - {CONFIG['dte_max']} days")
print(f"Delta Range:     {CONFIG['delta_min']} - {CONFIG['delta_max']}")
print(f"Exit Target:     {CONFIG['exit_pct']*100:.0f}% of premium")
print(f"Stop Loss:       {CONFIG['stop_loss_multiplier']}x premium")
print(f"Fill Mode:       {CONFIG['fill_mode']}")
print(f"Realistic Fills: {CONFIG['use_realistic_fills']}")
print(f"Commission:      ${CONFIG['commission_per_contract']}/contract")
print("=" * 60)
print("\nNOTE: Transaction costs and realistic fills are NOT yet applied.")
print("      Run both notebooks to compare baseline vs realistic results.")

BACKTEST CONFIGURATION
Symbol:          ['TSLA']
Entry Date:      2023-06-06
Entry Time:      15:45
Option Type:     Cash-Secured Put
DTE Range:       30 - 45 days
Delta Range:     0.25 - 0.35
Exit Target:     50% of premium
Stop Loss:       2.0x premium
Fill Mode:       mid
Realistic Fills: False
Commission:      $0.65/contract

NOTE: Transaction costs and realistic fills are NOT yet applied.
      Run both notebooks to compare baseline vs realistic results.
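
To make the exit and fill settings concrete, the sketch below turns them into numbers for an assumed entry premium of $2.00 per share. It reads `exit_pct` as the fraction of the premium at which the option is bought back and `stop_loss_multiplier` as a multiple of the premium, per the config comments; the helper names are illustrative only.

```python
def exit_thresholds(premium, exit_pct, stop_loss_multiplier):
    """Option prices at which the profit target and stop loss would trigger."""
    return {
        "profit_target_price": premium * exit_pct,
        "stop_loss_price": premium * stop_loss_multiplier,
    }


def scaled_fill_probability(base, scale, lo, hi):
    """Bucket probability times pfill_scale, clamped to [pfill_min, pfill_max]."""
    return max(lo, min(hi, base * scale))


# With the CONFIG values: buy back at 1.00 for profit, stop out at 4.00.
levels = exit_thresholds(premium=2.00, exit_pct=0.50, stop_loss_multiplier=2.0)
print(levels)

# Tight-spread bucket with the CONFIG values: 0.90 * 0.6, then clamp.
p_tight = scaled_fill_probability(base=0.90, scale=0.6, lo=0.05, hi=0.98)
print(round(p_tight, 2))
```

With `pfill_scale` at 0.6, even the tight-spread bucket fills only a little over half the time, so this one multiplier has a large effect on how often profit targets are realized.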


In [4]:
# =============================================================================
# COVERED CALL CONFIGURATION
# =============================================================================

CC_CONFIG = {
    # -------------------------------------------------------------------------
    # OPTION SELECTION CRITERIA
    # -------------------------------------------------------------------------
    'dte_min': 14,                             # Minimum days to expiration
    'dte_max': 30,                             # Maximum days to expiration
    'delta_min': 0.25,                         # Minimum absolute delta
    'delta_max': 0.35,                         # Maximum absolute delta
    'strike_min_pct_above_basis': 0.0,         # Allow ATM (0% above cost basis)
    'entry_time': '15:45',                     # Same snapshot time as CSP
    
    # -------------------------------------------------------------------------
    # BEHAVIORAL FLAGS (explicit intent documentation)
    # -------------------------------------------------------------------------
    'sell_call_only_if_price_above_basis': True,  # Require strike >= cost basis per share
    
    # -------------------------------------------------------------------------
    # TIE-BREAKING FOR CALL SELECTION
    # -------------------------------------------------------------------------
    # When multiple candidates match criteria, how to select:
    # Options: 'highest_premium', 'closest_delta', 'highest_strike'
    'tie_break_method': 'highest_premium',
}

print("=" * 60)
print("COVERED CALL CONFIGURATION")
print("=" * 60)
print(f"DTE Range:       {CC_CONFIG['dte_min']} - {CC_CONFIG['dte_max']} days")
print(f"Delta Range:     {CC_CONFIG['delta_min']} - {CC_CONFIG['delta_max']}")
print(f"Entry Time:      {CC_CONFIG['entry_time']}")
print(f"Strike >= Basis: {CC_CONFIG['sell_call_only_if_price_above_basis']}")
print(f"Tie-Breaking:    {CC_CONFIG['tie_break_method']}")
print("=" * 60)



COVERED CALL CONFIGURATION
DTE Range:       14 - 30 days
Delta Range:     0.25 - 0.35
Entry Time:      15:45
Strike >= Basis: True
Tie-Breaking:    highest_premium
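
A sketch of how these settings could drive call selection: filter by DTE, delta, and the strike-versus-basis rule, then break ties. `select_call` and the candidate fields (`strike`, `dte`, `delta`, `premium`) are hypothetical, and the `closest_delta` rule here targets the midpoint of the delta range, which is an assumption.

```python
def select_call(calls, cost_basis, cfg):
    """Pick one call from a list of dicts, or None if nothing qualifies."""
    min_strike = cost_basis * (1 + cfg["strike_min_pct_above_basis"])
    eligible = [
        c for c in calls
        if cfg["dte_min"] <= c["dte"] <= cfg["dte_max"]
        and cfg["delta_min"] <= abs(c["delta"]) <= cfg["delta_max"]
        and (not cfg["sell_call_only_if_price_above_basis"] or c["strike"] >= min_strike)
    ]
    if not eligible:
        return None

    method = cfg["tie_break_method"]
    if method == "highest_premium":
        return max(eligible, key=lambda c: c["premium"])
    if method == "highest_strike":
        return max(eligible, key=lambda c: c["strike"])
    if method == "closest_delta":
        target = (cfg["delta_min"] + cfg["delta_max"]) / 2  # assumed target
        return min(eligible, key=lambda c: abs(abs(c["delta"]) - target))
    raise ValueError(f"Unknown tie_break_method: {method}")


cfg = {
    "dte_min": 14, "dte_max": 30, "delta_min": 0.25, "delta_max": 0.35,
    "strike_min_pct_above_basis": 0.0,
    "sell_call_only_if_price_above_basis": True,
    "tie_break_method": "highest_premium",
}
calls = [
    {"strike": 250.0, "dte": 21, "delta": 0.30, "premium": 4.10},
    {"strike": 255.0, "dte": 21, "delta": 0.26, "premium": 3.20},
    {"strike": 240.0, "dte": 21, "delta": 0.34, "premium": 5.50},  # below basis
]
chosen = select_call(calls, cost_basis=245.0, cfg=cfg)
print(chosen)
```

The 240 strike carries the highest premium but is filtered out first, because selling it would lock in a sale below the assumed cost basis of 245.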


In [5]:

# =============================================================================
# WHEEL STATE MACHINE
# =============================================================================
# Explicit state transitions prevent logic spaghetti and make logs interpretable.
# WHEEL_COMPLETE is the single canonical terminal state for all paths.

import uuid

# Valid state transitions
VALID_TRANSITIONS = {
    'CSP_OPEN': ['CSP_CLOSED_PROFIT', 'CSP_CLOSED_STOP', 'CSP_ASSIGNED', 'CSP_CLOSED_WORTHLESS'],
    'CSP_ASSIGNED': ['CC_OPEN', 'WHEEL_COMPLETE'],  # Can sell CC or mark incomplete
    'CC_OPEN': ['CC_CLOSED_PROFIT', 'CC_ASSIGNED', 'CC_CLOSED_WORTHLESS'],
    'CC_ASSIGNED': ['WHEEL_COMPLETE'],
    'CC_CLOSED_PROFIT': ['WHEEL_COMPLETE'],      # v1: no re-entry after CC profit
    'CC_CLOSED_WORTHLESS': ['WHEEL_COMPLETE'],   # v1: no re-entry after CC expires
    'CSP_CLOSED_PROFIT': ['WHEEL_COMPLETE'],
    'CSP_CLOSED_STOP': ['WHEEL_COMPLETE'],
    'CSP_CLOSED_WORTHLESS': ['WHEEL_COMPLETE'],
    'WHEEL_COMPLETE': [],  # Terminal state - no further transitions
}

# Event to state mapping
EVENT_TO_STATE = {
    # CSP phase events
    ('CSP_OPEN', 'profit_target'): 'CSP_CLOSED_PROFIT',
    ('CSP_OPEN', 'stop_loss'): 'CSP_CLOSED_STOP',
    ('CSP_OPEN', 'assigned'): 'CSP_ASSIGNED',
    ('CSP_OPEN', 'expired_worthless'): 'CSP_CLOSED_WORTHLESS',
    
    # Assignment to CC
    ('CSP_ASSIGNED', 'sell_call'): 'CC_OPEN',
    
    # CC phase events
    ('CC_OPEN', 'profit_target'): 'CC_CLOSED_PROFIT',
    ('CC_OPEN', 'called_away'): 'CC_ASSIGNED',
    ('CC_OPEN', 'expired_worthless'): 'CC_CLOSED_WORTHLESS',
    
    # Terminal transitions (all paths lead to WHEEL_COMPLETE)
    ('CC_ASSIGNED', 'complete'): 'WHEEL_COMPLETE',
    ('CC_CLOSED_PROFIT', 'complete'): 'WHEEL_COMPLETE',
    ('CC_CLOSED_WORTHLESS', 'complete'): 'WHEEL_COMPLETE',
    ('CSP_CLOSED_PROFIT', 'complete'): 'WHEEL_COMPLETE',
    ('CSP_CLOSED_STOP', 'complete'): 'WHEEL_COMPLETE',
    ('CSP_CLOSED_WORTHLESS', 'complete'): 'WHEEL_COMPLETE',
    ('CSP_ASSIGNED', 'complete'): 'WHEEL_COMPLETE',  # For incomplete wheels (no CC processed)
}


def advance_wheel_state(current_state, event):
    """
    Advance wheel state based on event. Enforces valid transitions.
    
    Args:
        current_state: Current state string (e.g., 'CSP_OPEN', 'CC_OPEN')
        event: Event triggering transition (e.g., 'profit_target', 'assigned', 'called_away')
    
    Returns:
        New state string
    
    Raises:
        ValueError if transition is invalid
    
    Example:
        >>> advance_wheel_state('CSP_OPEN', 'assigned')
        'CSP_ASSIGNED'
        >>> advance_wheel_state('CC_OPEN', 'called_away')
        'CC_ASSIGNED'
    """
    key = (current_state, event)
    
    if key not in EVENT_TO_STATE:
        valid_events = [e for (s, e) in EVENT_TO_STATE.keys() if s == current_state]
        raise ValueError(
            f"Invalid transition: state='{current_state}' + event='{event}'. "
            f"Valid events from {current_state}: {valid_events}"
        )
    
    new_state = EVENT_TO_STATE[key]
    
    # Double-check against VALID_TRANSITIONS (belt and suspenders)
    if new_state not in VALID_TRANSITIONS.get(current_state, []):
        raise ValueError(
            f"State '{new_state}' not reachable from '{current_state}'. "
            f"Valid transitions: {VALID_TRANSITIONS.get(current_state, [])}"
        )
    
    return new_state


def generate_wheel_id():
    """Generate a unique wheel ID for linking CSP + CC phases."""
    return str(uuid.uuid4())[:8]


def get_phase_from_state(state):
    """
    Extract phase from state string.
    
    Returns:
        'csp': For CSP states (CSP_OPEN, CSP_CLOSED_*, CSP_ASSIGNED)
        'cc': For CC states (CC_OPEN, CC_CLOSED_*, CC_ASSIGNED)
        'total': For WHEEL_COMPLETE (used in wheel summaries)
        'unknown': For unrecognized states
    """
    if state.startswith('CSP'):
        return 'csp'
    elif state.startswith('CC'):
        return 'cc'
    elif state == 'WHEEL_COMPLETE':
        return 'total'  # Consistent with wheel summary phase
    else:
        return 'unknown'


def is_terminal_state(state):
    """Check if state is a terminal state (no further transitions possible)."""
    return len(VALID_TRANSITIONS.get(state, [])) == 0 or state == 'WHEEL_COMPLETE'


# Test the state machine
print("=" * 60)
print("STATE MACHINE VALIDATION")
print("=" * 60)

# Test valid transitions
test_cases = [
    ('CSP_OPEN', 'profit_target', 'CSP_CLOSED_PROFIT'),
    ('CSP_OPEN', 'assigned', 'CSP_ASSIGNED'),
    ('CSP_ASSIGNED', 'sell_call', 'CC_OPEN'),
    ('CSP_ASSIGNED', 'complete', 'WHEEL_COMPLETE'),  # For incomplete wheels
    ('CC_OPEN', 'called_away', 'CC_ASSIGNED'),
    ('CC_ASSIGNED', 'complete', 'WHEEL_COMPLETE'),
]

for current, event, expected in test_cases:
    result = advance_wheel_state(current, event)
    status = "✓" if result == expected else "✗"
    print(f"  {status} {current} + '{event}' → {result}")

# Test invalid transition (should raise)
try:
    advance_wheel_state('CSP_OPEN', 'called_away')  # Invalid: called_away only valid for CC_OPEN
    print("  ✗ Should have raised ValueError for invalid transition")
except ValueError as e:
    print(f"  ✓ Correctly rejected invalid transition: CSP_OPEN + 'called_away'")

print("=" * 60)


STATE MACHINE VALIDATION
  ✓ CSP_OPEN + 'profit_target' → CSP_CLOSED_PROFIT
  ✓ CSP_OPEN + 'assigned' → CSP_ASSIGNED
  ✓ CSP_ASSIGNED + 'sell_call' → CC_OPEN
  ✓ CSP_ASSIGNED + 'complete' → WHEEL_COMPLETE
  ✓ CC_OPEN + 'called_away' → CC_ASSIGNED
  ✓ CC_ASSIGNED + 'complete' → WHEEL_COMPLETE
  ✓ Correctly rejected invalid transition: CSP_OPEN + 'called_away'
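
The checks above exercise single transitions. A whole wheel can be replayed the same way by feeding a list of events through the table one at a time. The sketch below is self-contained, so it carries a trimmed copy of the event table for one path (assignment, covered call, shares called away); `step` and `replay` are hypothetical helpers, not part of the notebook.

```python
# Trimmed copy of the event table: only the entries this path needs.
TRANSITIONS = {
    ("CSP_OPEN", "assigned"): "CSP_ASSIGNED",
    ("CSP_ASSIGNED", "sell_call"): "CC_OPEN",
    ("CC_OPEN", "called_away"): "CC_ASSIGNED",
    ("CC_ASSIGNED", "complete"): "WHEEL_COMPLETE",
}


def step(state, event):
    """Look up the next state; unknown (state, event) pairs are rejected."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"Invalid transition: {state} + {event}") from None


def replay(events, start="CSP_OPEN"):
    """Return every state visited, starting state included."""
    history = [start]
    for event in events:
        history.append(step(history[-1], event))
    return history


path = replay(["assigned", "sell_call", "called_away", "complete"])
print(" -> ".join(path))
```

Keeping the full history rather than only the final state makes a wheel's log easy to audit after the fact.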


In [6]:
# =============================================================================
# HELPER FUNCTIONS FOR REALISTIC EXECUTION
# =============================================================================

def get_entry_price(row, fill_mode='realistic', penalty=1.0):
    """
    Calculate entry price when SELLING a put (we receive premium).
    Higher price = better for us.
    
    Slippage is calculated as a percentage of the bid-ask spread from mid.
    Penalty multiplier widens the effective spread for illiquid options.
    
    | Scenario    | Formula                              | Interpretation              |
    |-------------|--------------------------------------|-----------------------------|
    | pessimistic | mid - 75% of (spread * penalty)      | Forced/stressed execution   |
    | realistic   | mid - 30% of (spread * penalty)      | Normal retail execution     |
    | optimistic  | mid                                  | Patient, favorable fills    |
    
    Args:
        row: DataFrame row with bid_px_00, ask_px_00
        fill_mode: 'optimistic', 'realistic', or 'pessimistic'
        penalty: liquidity penalty multiplier (1.0 = no extra slippage)
    """
    bid = row['bid_px_00']
    ask = row['ask_px_00']
    mid = (bid + ask) / 2
    spread = ask - bid
    
    # Apply liquidity penalty to effective spread
    effective_spread = spread * penalty
    
    if fill_mode == 'optimistic':
        return mid                              # Best case - get mid (no penalty applied)
    elif fill_mode == 'pessimistic':
        fill = mid - (0.75 * effective_spread)  # Worst case - 75% toward bid
    else:  # realistic; also the fallback for any other value (e.g. CONFIG's 'mid')
        fill = mid - (0.30 * effective_spread)  # Normal - 30% toward bid
    
    # Clamp to [bid, ask] to stay realistic
    return max(bid, min(ask, fill))


# TODO: confirm that penalty=1.0 is the right default for exits.
def get_exit_price(daily_row, fill_mode=CONFIG['fill_mode'], target_price=None, penalty=1.0):
    """
    Calculate exit price when BUYING BACK a put (we pay to close).
    Lower price = better for us.
    
    For daily OHLCV data, we estimate spread behavior from the day's range.
    Penalty multiplier widens the effective range for illiquid options.
    
    | Scenario    | Formula                              | Interpretation              |
    |-------------|--------------------------------------|-----------------------------|
    | pessimistic | close + 75% of (range * penalty)     | Forced/stressed execution   |
    | realistic   | close + 30% of (range * penalty)     | Normal retail execution     |
    | optimistic  | close - 25% of (range * penalty)     | Patient, favorable fills    |
    
    Args:
        daily_row: DataFrame row with close, high, low
        fill_mode: 'optimistic', 'realistic', or 'pessimistic'
        target_price: Optional target price (not currently used but reserved)
        penalty: liquidity penalty multiplier (1.0 = no extra slippage)
    """
    close = daily_row['close']
    high = daily_row['high']
    low = daily_row['low']
    day_range = high - low  # Proxy for intraday spread/volatility
    
    # Apply liquidity penalty to effective range
    effective_range = day_range * penalty
    
    if fill_mode == 'optimistic':
        # Patient buyer - gets below close (toward low)
        fill = close - (0.25 * effective_range)
        return max(low, fill)
    elif fill_mode == 'pessimistic':
        # Forced buyer - pays above close (toward high)
        fill = close + (0.75 * effective_range)
        return min(high, fill)
    else:  # realistic; also the fallback for any other value, including CONFIG's default 'mid'
        # Normal execution - slight slippage above close
        fill = close + (0.30 * effective_range)
        return min(high, fill)


def get_transaction_costs(config, is_round_trip=True):
    """
    Calculate total transaction costs per contract.
    
    Args:
        config: CONFIG dict with commission and fee rates
        is_round_trip: True if both entry and exit, False if entry only (e.g., expired worthless)
    
    Returns:
        Total fees in dollars per contract
    """
    per_leg = config['commission_per_contract'] + config['sec_fee_per_contract']
    return per_leg * 2 if is_round_trip else per_leg


def compute_allowed_spread(row, config):
    """
    Compute the allowed spread percentage for a single option based on regime.
    
    Regime factors:
    - IV percentile (high vol → allow wider spreads)
    - DTE (short-dated → allow wider spreads)
    
    Returns: allowed_spread_pct for this option
    """
    base = config['base_max_spread_pct']
    
    # IV regime adjustment
    ivp = row.get('ivp', 0.5)  # Default to median if not computed
    if ivp >= config['ivp_extreme_threshold']:
        base = config['ivp_extreme_max_spread_pct']
    elif ivp >= config['ivp_high_threshold']:
        base = config['ivp_high_max_spread_pct']
    
    # DTE adjustment
    dte = row.get('dte', 30)
    if dte <= config['short_dte_threshold']:
        base += config['short_dte_extra_spread_pct']
    
    return base


def compute_liquidity_penalty(spread_pct, allowed_spread_pct, hard_max_spread_pct):
    """
    Compute liquidity penalty multiplier based on spread quality.
    
    Tiers:
    - tight:    spread <= 0.6 * allowed → penalty = 1.0 (no extra slippage)
    - moderate: spread <= allowed       → penalty = 1.15
    - wide:     spread <= hard_max      → penalty = 1.35
    - ugly:     spread > hard_max       → rejected (penalty = None)
    
    Returns: (tier_name, penalty_multiplier), or ('reject', None) if rejected
    """
    if spread_pct > hard_max_spread_pct:
        return 'reject', None
    
    tight_threshold = 0.6 * allowed_spread_pct
    
    if spread_pct <= tight_threshold:
        return 'tight', 1.0
    elif spread_pct <= allowed_spread_pct:
        return 'moderate', 1.15
    else:  # spread_pct <= hard_max_spread_pct
        return 'wide', 1.35


def apply_liquidity_model(df, config):
    """
    Apply regime-aware liquidity model with penalty tiers.
    
    Instead of binary reject, this:
    1. Computes IV percentile (ivp) for regime detection
    2. Computes allowed_spread_pct per option (regime-aware)
    3. Assigns liquidity_tier and liquidity_penalty
    4. Only hard-rejects truly ugly spreads
    
    Args:
        df: DataFrame with option quotes (needs bid_px_00, ask_px_00, spread_pct, iv, dte)
        config: CONFIG dict with liquidity model settings
    
    Returns:
        DataFrame with liquidity columns added, ugly spreads removed
    """
    if len(df) == 0:
        return df
    
    df = df.copy()
    original_count = len(df)
    
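    # Ensure required columns exist (derive mid and spread from quotes if missing)
    if 'spread_pct' not in df.columns:
        if 'mid' not in df.columns:
            df['mid'] = (df['bid_px_00'] + df['ask_px_00']) / 2
        df['spread'] = df['ask_px_00'] - df['bid_px_00']
        df['spread_pct'] = df['spread'] / df['mid']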
    
    # Step 1: Compute IV percentile (cross-sectional within this snapshot)
    if 'iv' in df.columns:
        df['ivp'] = df['iv'].rank(pct=True)
    else:
        df['ivp'] = 0.5  # Default to median if IV not available
    
    # Step 2: Compute allowed spread per option
    df['allowed_spread_pct'] = df.apply(
        lambda row: compute_allowed_spread(row, config), axis=1
    )
    
    # Step 3: Compute liquidity tier and penalty
    def get_tier_and_penalty(row):
        return compute_liquidity_penalty(
            row['spread_pct'], 
            row['allowed_spread_pct'],
            config['hard_max_spread_pct']
        )
    
    tiers_penalties = df.apply(get_tier_and_penalty, axis=1)
    df['liquidity_tier'] = tiers_penalties.apply(lambda x: x[0])
    df['liquidity_penalty'] = tiers_penalties.apply(lambda x: x[1])
    
    # Step 4: Hard reject only truly ugly spreads and penny options
    df = df[
        (df['liquidity_tier'] != 'reject') &
        (df['bid_px_00'] >= config['min_bid_hard'])
    ].copy()
    
    rejected = original_count - len(df)
    
    # Print diagnostics
    print(f"\n  Liquidity Model Applied:")
    print(f"    Original: {original_count} options")
    print(f"    Hard rejected: {rejected} ({rejected/original_count*100:.1f}%)")
    print(f"    Remaining: {len(df)} options")
    
    if len(df) > 0:
        tier_counts = df['liquidity_tier'].value_counts()
        print(f"    Tier breakdown: {dict(tier_counts)}")
        print(f"    Avg spread: {df['spread_pct'].mean()*100:.1f}%, Avg allowed: {df['allowed_spread_pct'].mean()*100:.1f}%")
        print(f"    Avg penalty: {df['liquidity_penalty'].mean():.2f}x")
    return df


def calculate_pnl(premium_received, exit_price_paid, fees, cost_basis):
    """
    Calculate P&L metrics for a trade.
    
    Args:
        premium_received: Premium collected when selling (contract value)
        exit_price_paid: Price paid to close position (contract value), 0 if expired worthless
        fees: Total transaction costs
        cost_basis: Capital at risk (strike * 100 for CSP)
    
    Returns:
        dict with pnl, pnl_pct, roc
    """
    pnl = premium_received - exit_price_paid - fees
    pnl_pct = (pnl / premium_received) * 100 if premium_received > 0 else 0
    roc = (pnl / cost_basis) * 100 if cost_basis > 0 else 0
    
    return {
        'pnl': pnl,
        'pnl_pct': pnl_pct,
        'roc': roc,
        'fees': fees
    }


def compute_p_fill_profit(row, config):
    """
    Compute probability of fill for profit target exit based on entry liquidity.
    
    Uses entry-time spread and IVP to determine fill probability:
    - tight spread (<=5%): high fill probability
    - normal spread (<=10%): moderate fill probability
    - wide spread (>10%): low fill probability
    
    Applies IVP penalty multipliers for high-volatility regimes.
    
    Args:
        row: DataFrame row with spread_pct_entry, ivp_entry
        config: CONFIG dict with fill probability settings
    
    Returns:
        float: Fill probability in [pfill_min, pfill_max]
    """
    spread_pct = row.get('spread_pct_entry', 0.05)
    
    # Bucket by spread quality
    if spread_pct <= config['tight_spread_pct']:
        p_fill = config['pfill_tight']
    elif spread_pct <= config['normal_spread_pct']:
        p_fill = config['pfill_normal']
    else:
        p_fill = config['pfill_wide']
    
    # Apply scale multiplier
    p_fill *= config['pfill_scale']
    
    # Apply IVP penalty (always use ivp_entry, not generic ivp)
    ivp = row.get('ivp_entry', 0.5)
    if ivp >= config['ivp_extreme_threshold']:
        p_fill *= config.get('pfill_ivp_extreme_mult', 1.0)
    elif ivp >= config['ivp_high_threshold']:
        p_fill *= config.get('pfill_ivp_high_mult', 1.0)
    
    # Clamp to valid range
    return max(config['pfill_min'], min(config['pfill_max'], p_fill))


def try_probabilistic_fill(p_fill, rng):
    """
    Simulate probabilistic fill by drawing uniform random number.
    
    Args:
        p_fill: Fill probability (0.0 to 1.0)
        rng: numpy random number generator
    
    Returns:
        tuple: (filled: bool, u: float) where u is the random draw
    """
    u = rng.uniform(0, 1)
    filled = (u <= p_fill)
    return filled, u


def fetch_underlying_price_at_expiration(underlying_symbol, expiration_date, client, config):
    """
    Fetch the underlying price at option expiration (for ITM/OTM check).
    
    Uses cached daily equity data or fetches if not available.
    
    Args:
        underlying_symbol: Underlying ticker (e.g., 'TSLA')
        expiration_date: Expiration date (Timestamp)
        client: Databento client
        config: CONFIG dict
    
    Returns:
        float: Underlying close price on expiration date, or None if unavailable
    """
    cache_dir = config.get('cache_dir', '../cache/')
    tz = config.get('timezone', 'America/New_York')
    
    # Normalize date
    exp_date = pd.Timestamp(expiration_date)
    if exp_date.tz is not None:
        exp_date = exp_date.tz_localize(None)
    
    date_str = exp_date.strftime('%Y%m%d')
    
    # First, try any cached daily equity file for this symbol.
    # Guard against a missing cache directory so we fall through to the fetch.
    prefix = f"equity_daily_{underlying_symbol}_"
    cache_files = []
    if os.path.isdir(cache_dir):
        cache_files = sorted(f for f in os.listdir(cache_dir) if f.startswith(prefix))
    
    for cache_file in cache_files:
        try:
            df = pd.read_parquet(os.path.join(cache_dir, cache_file))
            # Ensure a DatetimeIndex before matching on calendar date
            if not isinstance(df.index, pd.DatetimeIndex):
                df.index = pd.to_datetime(df.index)
            matches = df[df.index.date == exp_date.date()]
            
            if len(matches) > 0:
                return matches.iloc[-1]['close']
        except Exception:
            # Unreadable or malformed cache file: try the next one
            continue
    
    # Fallback: try to fetch specific day
    try:
        start = exp_date
        end = exp_date + pd.Timedelta(days=1)
        
        data = client.timeseries.get_range(
            dataset='EQUS.MINI',
            symbols=underlying_symbol,
            schema='ohlcv-1d',
            stype_in='raw_symbol',
            start=start,
            end=end,
        )
        df = data.to_df(tz=tz)
        if len(df) > 0:
            return df.iloc[-1]['close']
    except Exception:
        # Fetch failed (no data, network, or entitlement error): report as unavailable
        pass
    
    return None


# Print summary of fill assumptions
print("=" * 60)
print("FILL ASSUMPTIONS BY SCENARIO")
print("=" * 60)
print(f"{'Scenario':<12} {'Entry (Sell)':<25} {'Exit (Buy Back)':<25}")
print("-" * 60)
print(f"{'Pessimistic':<12} {'Mid - 75% of spread':<25} {'Close + 75% of range':<25}")
print(f"{'Realistic':<12} {'Mid - 30% of spread':<25} {'Close + 30% of range':<25}")
print(f"{'Optimistic':<12} {'Mid (no slippage)':<25} {'Close - 25% of range':<25}")
print("=" * 60)
print(f"\nTransaction costs: ${CONFIG['commission_per_contract'] + CONFIG['sec_fee_per_contract']:.2f}/leg")
print(f"\nLiquidity Model (regime-aware):")
print(f"  Hard reject: bid < ${CONFIG['min_bid_hard']} or spread > {CONFIG['hard_max_spread_pct']*100:.0f}%")
print(f"  Base target spread: {CONFIG['base_max_spread_pct']*100:.0f}%")
print(f"  High IV ({CONFIG['ivp_high_threshold']*100:.0f}%ile): allow {CONFIG['ivp_high_max_spread_pct']*100:.0f}%")
print(f"  Extreme IV ({CONFIG['ivp_extreme_threshold']*100:.0f}%ile): allow {CONFIG['ivp_extreme_max_spread_pct']*100:.0f}%")
print(f"  Short DTE (≤{CONFIG['short_dte_threshold']}d): +{CONFIG['short_dte_extra_spread_pct']*100:.0f}% allowed")


FILL ASSUMPTIONS BY SCENARIO
Scenario     Entry (Sell)              Exit (Buy Back)          
------------------------------------------------------------
Pessimistic  Mid - 75% of spread       Close + 75% of range     
Realistic    Mid - 30% of spread       Close + 30% of range     
Optimistic   Mid (no slippage)         Close - 25% of range     

Transaction costs: $0.66/leg

Liquidity Model (regime-aware):
  Hard reject: bid < $0.1 or spread > 20%
  Base target spread: 8%
  High IV (70%ile): allow 12%
  Extreme IV (90%ile): allow 15%
  Short DTE (≤7d): +2% allowed
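
The spread bucketing and IVP penalty used by `compute_p_fill_profit` can be traced by hand. The sketch below reimplements the same steps on plain numbers. The 5% and 10% spread cutoffs come from the docstring and the 70th/90th percentile IVP thresholds from the printed summary; every `pfill_*` value is a hypothetical placeholder, not the notebook's actual `CONFIG`.

```python
# Hypothetical settings: only the spread cutoffs and IVP thresholds are taken
# from the notebook; the pfill_* numbers are made up for illustration.
example_config = {
    'tight_spread_pct': 0.05, 'normal_spread_pct': 0.10,
    'pfill_tight': 0.90, 'pfill_normal': 0.70, 'pfill_wide': 0.40,
    'pfill_scale': 1.0,
    'ivp_high_threshold': 0.70, 'ivp_extreme_threshold': 0.90,
    'pfill_ivp_high_mult': 0.90, 'pfill_ivp_extreme_mult': 0.75,
    'pfill_min': 0.05, 'pfill_max': 0.95,
}


def p_fill_sketch(spread_pct, ivp, cfg):
    """Bucket by spread, scale, apply the IVP penalty, then clamp."""
    if spread_pct <= cfg['tight_spread_pct']:
        p = cfg['pfill_tight']
    elif spread_pct <= cfg['normal_spread_pct']:
        p = cfg['pfill_normal']
    else:
        p = cfg['pfill_wide']
    p *= cfg['pfill_scale']
    if ivp >= cfg['ivp_extreme_threshold']:
        p *= cfg['pfill_ivp_extreme_mult']
    elif ivp >= cfg['ivp_high_threshold']:
        p *= cfg['pfill_ivp_high_mult']
    return max(cfg['pfill_min'], min(cfg['pfill_max'], p))


for spread_pct, ivp in [(0.03, 0.50), (0.08, 0.75), (0.15, 0.95)]:
    p = p_fill_sketch(spread_pct, ivp, example_config)
    print(f"spread={spread_pct:.0%} ivp={ivp:.0%} -> p_fill={p:.2f}")
```

With these placeholder numbers, a tight spread in a calm regime keeps the base probability, while a wide spread in an extreme-IV regime is reduced by both the bucket and the multiplier.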


## V5 Scheduler Layer

The following cells add orchestration on top of the v4 wheel engine:
1. **Trading Calendar Utility**: Generate valid trading days
2. **Entry Candidate Wrapper**: Encapsulate option chain fetch + filter logic
3. **Single Wheel Wrapper**: Run one wheel cycle for a given candidate
4. **Scheduler Loop**: Iterate over dates × symbols, launching wheels
5. **Aggregation**: Combine results across all wheels
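
How the five pieces compose can be sketched with stand-in functions. Everything below is hypothetical scaffolding: `weekdays_between`, `stub_candidates`, and `stub_wheel` only mimic the shape of `get_trading_days`, `get_entry_candidates`, and `run_single_wheel`, and the second ticker is included purely to show the symbol loop.

```python
from datetime import date, timedelta


def weekdays_between(start, end):
    """Stand-in for get_trading_days(): weekdays only, no holiday calendar."""
    days, day = [], start
    while day <= end:
        if day.weekday() < 5:
            days.append(day)
        day += timedelta(days=1)
    return days


def stub_candidates(symbol, trade_date):
    """Stand-in for get_entry_candidates(): one fake candidate each Monday."""
    if trade_date.weekday() == 0:
        return [{'symbol': symbol, 'date': trade_date}]
    return []


def stub_wheel(candidate, wheel_id):
    """Stand-in for run_single_wheel(): emits a single 'total' record."""
    return [{'wheel_id': wheel_id, 'phase': 'total', 'pnl': 0.0}]


def schedule(symbols, start, end):
    """Dates x symbols loop; a global counter keeps wheel IDs unique."""
    records, counter = [], 0
    for trade_date in weekdays_between(start, end):
        for symbol in symbols:
            for candidate in stub_candidates(symbol, trade_date):
                wheel_id = f"{symbol}_{trade_date}_{counter}"
                counter += 1
                records.extend(stub_wheel(candidate, wheel_id))
    return records


records = schedule(['TSLA', 'AAPL'], date(2023, 6, 5), date(2023, 6, 16))
print([r['wheel_id'] for r in records])
```

Because the outer loops run in a fixed order and the counter only increases, the same inputs produce the same wheel IDs on every run.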


In [7]:
# =============================================================================
# TRADING CALENDAR UTILITY (V5)
# =============================================================================

def get_trading_days(start_date, end_date, calendar='NYSE'):
    """
    Get valid trading days for a date range using market calendar.
    
    Args:
        start_date: Start date string or Timestamp
        end_date: End date string or Timestamp  
        calendar: Market calendar name (default 'NYSE')
    
    Returns:
        DatetimeIndex of valid trading days (timezone-naive, treated as NY dates)
    """
    cal = mcal.get_calendar(calendar)
    schedule = cal.schedule(start_date=start_date, end_date=end_date)
    # schedule.index is timezone-naive; we treat it as trading day in NY
    return schedule.index


# Test the function
test_days = get_trading_days('2023-01-01', '2023-01-15')
print("=" * 60)
print("TRADING CALENDAR UTILITY")
print("=" * 60)
print(f"Trading days Jan 1-15, 2023: {len(test_days)} days")
print(f"First: {test_days[0].date()}, Last: {test_days[-1].date()}")
print("=" * 60)


TRADING CALENDAR UTILITY
Trading days Jan 1-15, 2023: 9 days
First: 2023-01-03, Last: 2023-01-13


In [8]:
# =============================================================================
# ENTRY CANDIDATE WRAPPER (V5)
# =============================================================================

def get_entry_candidates(symbol, trade_date, config, client):
    """
    Get qualifying CSP entry candidates for a given symbol and date.
    
    Encapsulates:
    1. Fetch option chain snapshot at entry_time
    2. Parse option symbols and calculate DTE
    3. Compute IV and delta
    4. Apply delta/DTE filters
    5. Apply liquidity model
    6. Sort deterministically for reproducibility
    
    Args:
        symbol: Underlying symbol (e.g., 'TSLA')
        trade_date: Trading date (Timestamp or date-like)
        config: CONFIG dict with entry parameters
        client: Databento client
    
    Returns:
        DataFrame of qualifying candidates, or empty DataFrame if none found
    """
    import numpy as np
    from py_vollib.black_scholes.implied_volatility import implied_volatility
    from py_vollib.black_scholes.greeks.analytical import delta as calc_delta
    
    tz = config.get('timezone', 'America/New_York')
    cache_dir = config.get('cache_dir', '../cache/')
    entry_time = config.get('entry_time', '15:45')
    r = 0.04  # Risk-free rate
    
    # Build entry timestamp
    trade_date_str = pd.Timestamp(trade_date).strftime('%Y-%m-%d')
    entry_ts = pd.Timestamp(f"{trade_date_str} {entry_time}", tz=tz)
    
    # Cache filename for options
    date_str = entry_ts.strftime('%Y%m%d')
    time_str = entry_ts.strftime('%H%M')
    cache_file = os.path.join(cache_dir, f"options_{symbol}_{date_str}_{time_str}.parquet")
    
    # Fetch or load options data
    if os.path.exists(cache_file):
        df_opts = pd.read_parquet(cache_file)
    else:
        try:
            start = entry_ts
            end = start + pd.Timedelta(minutes=1)
            data = client.timeseries.get_range(
                dataset='OPRA.PILLAR',
                schema='cmbp-1',
                symbols=f"{symbol}.OPT",
                stype_in='parent',
                start=start,
                end=end,
            )
            df_opts = data.to_df(tz=tz).sort_values("ts_event")
            df_opts.to_parquet(cache_file)
        except Exception as e:
            # Return empty DataFrame on error
            return pd.DataFrame()
    
    if len(df_opts) == 0:
        return pd.DataFrame()
    
    # Parse option symbols
    sym = df_opts["symbol"]
    root_and_code = sym.str.split(expand=True)
    df_opts["root"] = root_and_code[0]
    code = root_and_code[1]
    df_opts["expiration"] = pd.to_datetime(code.str[:6], format="%y%m%d")
    df_opts["call_put"] = code.str[6]
    strike_int = code.str[7:].astype("int32")
    df_opts["strike"] = strike_int / 1000.0
    expiration_tz = df_opts["expiration"].dt.tz_localize(df_opts["ts_event"].dt.tz)
    df_opts["dte"] = (expiration_tz - df_opts["ts_event"].dt.normalize()).dt.days
    
    # Filter by DTE and option type
    df_opts = df_opts[
        (df_opts['dte'] >= config['dte_min']) & 
        (df_opts['dte'] <= config['dte_max']) & 
        (df_opts['call_put'] == config['option_type'])
    ]
    
    if len(df_opts) == 0:
        return pd.DataFrame()
    
    # Get underlying price
    equity_cache = os.path.join(cache_dir, f"equity_minute_{symbol}_{date_str}_{time_str}.parquet")
    if os.path.exists(equity_cache):
        equity_df = pd.read_parquet(equity_cache)
        underlying_price = equity_df['close'].iloc[-1] if len(equity_df) > 0 else None
    else:
        # Fetch underlying price
        try:
            start_time = entry_ts
            end_time = start_time + pd.Timedelta(minutes=1)
            equity_data = client.timeseries.get_range(
                dataset='XNAS.ITCH',
                symbols=[symbol],
                schema='ohlcv-1m',
                start=start_time,
                end=end_time,
                stype_in='raw_symbol'
            )
            equity_df = equity_data.to_df()
            equity_df.to_parquet(equity_cache)
            underlying_price = equity_df['close'].iloc[-1] if len(equity_df) > 0 else None
        except Exception:
            underlying_price = None
    
    if underlying_price is None:
        return pd.DataFrame()
    
    # Keep only rows with valid quotes
    quotes = df_opts[df_opts["bid_px_00"].notna() & df_opts["ask_px_00"].notna()].copy()
    if len(quotes) == 0:
        return pd.DataFrame()
    
    quotes["mid"] = (quotes["bid_px_00"] + quotes["ask_px_00"]) / 2
    quotes["spread"] = quotes["ask_px_00"] - quotes["bid_px_00"]
    quotes["spread_pct"] = quotes["spread"] / quotes["mid"]
    
    # Collapse to one row per contract (latest quote)
    chain_snapshot = (
        quotes
        .sort_values("ts_event")
        .groupby(["symbol", "expiration", "strike", "call_put"])
        .tail(1)
        .copy()
    )
    chain_snapshot["underlying_last"] = underlying_price
    
    # Compute IV and delta
    def compute_iv_delta(row):
        price = row["mid"]
        S = row["underlying_last"]
        K = row["strike"]
        t = row["dte"] / 365.0
        flag = "p" if row["call_put"] == "P" else "c"
        
        if not (np.isfinite(price) and np.isfinite(S) and np.isfinite(K) and t > 0):
            return np.nan, np.nan
        if price <= 0 or S <= 0 or K <= 0:
            return np.nan, np.nan
        
        try:
            iv = implied_volatility(price, S, K, t, r, flag)
            d = abs(calc_delta(flag, S, K, t, r, iv))
            return iv, d
        except Exception:
            return np.nan, np.nan
    
    iv_delta = chain_snapshot.apply(compute_iv_delta, axis=1, result_type='expand')
    chain_snapshot['iv'] = iv_delta[0]
    chain_snapshot['delta'] = iv_delta[1]
    chain_snapshot['date'] = chain_snapshot['ts_event'].dt.date
    
    # Filter by delta range
    candidates = chain_snapshot[
        chain_snapshot["delta"].abs().between(config['delta_min'], config['delta_max'])
    ].copy()
    
    if len(candidates) == 0:
        return pd.DataFrame()
    
    # Apply liquidity model (use the existing function - it prints output, so we suppress later)
    candidates = apply_liquidity_model(candidates, config)
    
    if len(candidates) == 0:
        return pd.DataFrame()
    
    # Calculate entry price with liquidity penalty
    candidates['entry_price'] = candidates.apply(
        lambda row: get_entry_price(row, config['fill_mode'], row.get('liquidity_penalty', 1.0)), 
        axis=1
    )
    candidates['per_share_premium'] = candidates['entry_price']
    candidates['premium'] = candidates['per_share_premium'] * 100
    candidates['cost_basis'] = candidates['strike'] * 100
    candidates['exit_pct'] = config['exit_pct']
    candidates['exit_price_per_share'] = candidates['per_share_premium'] * candidates['exit_pct']
    candidates['spread_pct_entry'] = candidates['spread_pct']
    candidates['ivp_entry'] = candidates['ivp']
    
    # DETERMINISTIC SORT for reproducibility
    candidates = candidates.sort_values(
        ['expiration', 'strike', 'symbol']
    ).reset_index(drop=True)
    
    return candidates


print("=" * 60)
print("ENTRY CANDIDATE WRAPPER LOADED")
print("=" * 60)
print("  get_entry_candidates(symbol, trade_date, config, client)")
print("  Returns DataFrame of qualifying CSP candidates")
print("=" * 60)


ENTRY CANDIDATE WRAPPER LOADED
  get_entry_candidates(symbol, trade_date, config, client)
  Returns DataFrame of qualifying CSP candidates
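
The symbol-parsing step inside `get_entry_candidates` slices fixed positions out of the option code. A minimal standalone sketch of the same slicing, using plain strings rather than a pandas column, is shown below; `parse_occ_symbol` is a hypothetical helper name, and the sample symbol is the one that appears in the scheduler output.

```python
from datetime import datetime


def parse_occ_symbol(raw_symbol):
    """Split 'ROOT  YYMMDD[C|P]SSSSSSSS' into its parts.

    The strike field is an integer in thousandths of a dollar.
    """
    root, code = raw_symbol.split()
    expiration = datetime.strptime(code[:6], "%y%m%d").date()
    call_put = code[6]
    strike = int(code[7:]) / 1000.0
    return root, expiration, call_put, strike


root, expiration, call_put, strike = parse_occ_symbol("TSLA  230707P00205000")
print(root, expiration, call_put, strike)  # TSLA 2023-07-07 P 205.0
```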


In [None]:
# =============================================================================
# SINGLE WHEEL WRAPPER (V5)
# =============================================================================

def run_single_wheel(candidate, config, cc_config, wheel_id, client, log_info=True):
    """
    Run a complete wheel cycle for a single candidate.
    
    Orchestrates:
    1. CSP entry and exit (using existing backtest_exit_strategy logic)
    2. If assigned: handle assignment and sell covered call
    3. CC exit (if applicable)
    4. Calculate wheel P&L
    
    Args:
        candidate: Series with candidate option data
        config: CONFIG dict
        cc_config: CC_CONFIG dict  
        wheel_id: Unique identifier for this wheel
        client: Databento client
        log_info: If True, print progress info
    
    Returns:
        List[dict]: Exit records for this wheel (CSP, CC if applicable, total)
    """
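    import zlib
    import numpy as np
    
    # Initialize RNG with a wheel-specific seed for reproducibility.
    # zlib.crc32 is stable across processes; the built-in hash() of a str is
    # salted per interpreter run unless PYTHONHASHSEED is fixed.
    base_seed = config.get('execution_seed', 42)
    rng = np.random.RandomState(base_seed + zlib.crc32(str(wheel_id).encode('utf-8')) % 10000)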
    
    wheel_results = []
    
    # Extract candidate info
    symbol = candidate['symbol']
    entry_date = pd.Timestamp(candidate['date']).tz_localize(None)
    expiration_date = pd.Timestamp(candidate['expiration']).tz_localize(None)
    strike = candidate['strike']
    premium_per_share = candidate['entry_price']
    premium = premium_per_share * 100
    cost_basis = candidate['cost_basis']
    exit_pct = candidate['exit_pct']
    exit_price_per_share = premium_per_share * exit_pct
    stop_loss_per_share = premium_per_share * config.get('stop_loss_multiplier', 2.0)
    liquidity_penalty = candidate.get('liquidity_penalty', 1.0)
    spread_pct_entry = candidate.get('spread_pct_entry', 0.05)
    ivp_entry = candidate.get('ivp_entry', 0.5)
    
    # Initialize CSP state
    current_state = 'CSP_OPEN'
    
    if log_info:
        print(f"\n  Wheel {wheel_id}: {symbol}")
        print(f"    Entry: {entry_date.date()}, Strike: ${strike:.2f}, Premium: ${premium:.2f}")
    
    # Initialize tracking variables BEFORE try block to avoid UnboundLocalError
    touch_count = 0
    touch_profit_target = False
    filled_profit_target = False
    p_fill_profit = None
    u_fill_profit = None
    exit_date = None
    exit_daily_row = None
    exit_reason = None
    exit_price = None
    underlying_at_exp = None
    
    # ----- CSP LIFECYCLE -----
    try:
        # Fetch daily prices for the option
        df_daily = fetch_daily_prices_for_option(symbol, entry_date, expiration_date, client, config)
        
        # Compute fill probability
        if config.get('use_probabilistic_exit_fills', True):
            p_fill_profit = compute_p_fill_profit(candidate, config)
        
        # Loop through trading days
        for check_date, daily_row in df_daily.iterrows():
            check_date_normalized = check_date.tz_localize(None) if hasattr(check_date, 'tz_localize') and check_date.tz else check_date
            
            if check_date_normalized.date() <= entry_date.date():
                continue
            
            daily_low = daily_row['low']
            daily_high = daily_row['high']
            
            # Check stop-loss first
            if daily_high >= stop_loss_per_share:
                exit_date = check_date_normalized
                exit_daily_row = daily_row
                exit_reason = 'stop_loss'
                current_state = advance_wheel_state(current_state, 'stop_loss')
                actual_exit_per_share = get_exit_price(daily_row, config.get('fill_mode', 'realistic'), penalty=liquidity_penalty)
                exit_price = actual_exit_per_share * 100
                break
            
            # Check profit target touch
            if daily_low <= exit_price_per_share:
                touch_profit_target = True
                touch_count += 1
                
                if config.get('use_probabilistic_exit_fills', True) and p_fill_profit is not None:
                    u = rng.uniform(0, 1)
                    u_fill_profit = u
                    
                    if u <= p_fill_profit:
                        exit_date = check_date_normalized
                        exit_daily_row = daily_row
                        exit_reason = 'profit_target'
                        current_state = advance_wheel_state(current_state, 'profit_target')
                        filled_profit_target = True
                        actual_exit_per_share = get_exit_price(daily_row, config.get('fill_mode', 'realistic'), penalty=liquidity_penalty)
                        exit_price = actual_exit_per_share * 100
                        break
                else:
                    exit_date = check_date_normalized
                    exit_daily_row = daily_row
                    exit_reason = 'profit_target'
                    current_state = advance_wheel_state(current_state, 'profit_target')
                    filled_profit_target = True
                    actual_exit_per_share = get_exit_price(daily_row, config.get('fill_mode', 'realistic'), penalty=liquidity_penalty)
                    exit_price = actual_exit_per_share * 100
                    break
        
        # Handle expiration if no exit
        if exit_date is None:
            exit_date = expiration_date
            exit_daily_row = None
            
            # Check if assigned (ITM at expiration)
            underlying_at_exp = fetch_underlying_price_at_expiration(
                symbol.split()[0], expiration_date, client, config
            )
            
            if underlying_at_exp is not None and underlying_at_exp < strike:
                # Put is ITM → assigned
                exit_reason = 'assigned'
                current_state = advance_wheel_state(current_state, 'assigned')
                exit_price = 0.0  # Option not bought back, shares assigned
            else:
                # OTM → expired worthless
                exit_reason = 'expired_worthless'
                current_state = advance_wheel_state(current_state, 'expired_worthless')
                exit_price = 0.0
        
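    except Exception as e:
        if log_info:
            print(f"    ✗ CSP Error: {e}")
        # Fallback record for a failed lookup. Note that exit_price = 0.0 makes
        # csp_pnl below equal the premium minus fees, so rows with
        # exit_reason == 'error' should be excluded before aggregating results.
        exit_date = expiration_date
        exit_reason = 'error'
        exit_price = 0.0
        current_state = 'CSP_CLOSED_WORTHLESS'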
    
    # Calculate CSP fees
    csp_fees = get_transaction_costs(config, is_round_trip=(exit_reason not in ['expired_worthless', 'assigned']))
    csp_pnl = premium - exit_price - csp_fees
    csp_roc = (csp_pnl / cost_basis) * 100
    
    # Create CSP exit record
    csp_exit = {
        'wheel_id': wheel_id,
        'phase': 'csp',
        'state': current_state,
        'symbol': symbol,
        'strike': strike,
        'entry_date': entry_date,
        'exit_date': exit_date,
        'expiration': expiration_date,
        'cost_basis': cost_basis,
        'initial_capital': cost_basis,
        'premium': premium,
        'exit_pct': exit_pct,
        'exit_price': exit_price,
        'exit_reason': exit_reason,
        'days_held': (exit_date - entry_date).days if exit_date is not None else None,
        'pnl': csp_pnl,
        'roc': csp_roc,
        'fees': csp_fees,
        'touch_profit_target': touch_profit_target,
        'p_fill_profit_target': p_fill_profit,
        'u_fill_profit_target': u_fill_profit,
        'filled_profit_target': filled_profit_target,
        'touch_count': touch_count,
        'spread_pct_entry': spread_pct_entry,
        'ivp_entry': ivp_entry,
        'underlying_at_expiration': underlying_at_exp,
        'execution_version': 'v5_scheduler',
    }
    wheel_results.append(csp_exit)
    
    if log_info:
        print(f"    CSP: {exit_reason} → {current_state}, P&L: ${csp_pnl:.2f}")
    
    # ----- COVERED CALL LIFECYCLE (if assigned) -----
    cc_exit = None
    if exit_reason == 'assigned':
        if log_info:
            print(f"    → Processing CC after assignment...")
        
        # Create assignment record
        assignment_record = {
            'wheel_id': wheel_id,
            'symbol': symbol.split()[0],  # Underlying
            'assignment_date': expiration_date,
            'strike': strike,
            'shares': 100,
            'assigned_price': strike,
            'cash_used': strike * 100,
            'premium_kept': premium,
            'net_stock_cost': (strike * 100) - premium,
            'stock_cost_per_share': ((strike * 100) - premium) / 100,
            'underlying_at_assignment': underlying_at_exp,
            'initial_capital': cost_basis,
        }
        
        # Fetch call chain
        underlying_symbol = assignment_record['symbol']
        cc_entry_date = pd.Timestamp(expiration_date) + pd.Timedelta(days=1)
        
        call_chain = fetch_option_chain_for_cc(
            underlying_symbol, cc_entry_date, client, config, cc_config
        )
        
        if len(call_chain) > 0:
            cc_selection = select_covered_call(assignment_record, call_chain, cc_config, config)
            
            if cc_selection is not None:
                cc_exit_dict, final_cc_state = backtest_covered_call(
                    assignment_record, cc_selection, client, config, cc_config
                )
                cc_exit_dict['execution_version'] = 'v5_scheduler'
                wheel_results.append(cc_exit_dict)
                cc_exit = cc_exit_dict
                current_state = final_cc_state
    
    # ----- WHEEL TOTAL -----
    # Calculate total P&L
    total_pnl = csp_pnl
    total_days = csp_exit.get('days_held', 0) or 0
    cc_pnl = 0.0
    stock_pnl = 0.0
    
    if cc_exit is not None:
        cc_premium = cc_exit.get('premium', 0)
        cc_exit_price = cc_exit.get('exit_price', 0)
        cc_fees = get_transaction_costs(config, is_round_trip=(cc_exit.get('exit_reason') not in ['expired_worthless', 'called_away']))
        cc_pnl = cc_premium - cc_exit_price - cc_fees
        total_pnl += cc_pnl
        total_days += cc_exit.get('days_held', 0) or 0
        
        # Stock P&L if called away. The CSP premium is already counted in
        # csp_pnl, so measure the stock leg from the assignment strike to
        # avoid counting that premium twice.
        if cc_exit.get('exit_reason') == 'called_away':
            call_strike = cc_exit.get('strike', strike)
            stock_pnl = (call_strike - strike) * 100
            total_pnl += stock_pnl
        
        # Advance to complete
        if current_state != 'WHEEL_COMPLETE':
            current_state = advance_wheel_state(current_state, 'complete')
    else:
        # No CC - advance CSP to complete
        if current_state != 'WHEEL_COMPLETE':
            current_state = advance_wheel_state(current_state, 'complete')
    
    wheel_roc = (total_pnl / cost_basis) * 100
    
    # Create total record
    total_record = {
        'wheel_id': wheel_id,
        'phase': 'total',
        'state': current_state,
        'symbol': symbol.split()[0],
        'entry_date': entry_date,
        'exit_date': cc_exit['exit_date'] if cc_exit else exit_date,
        'initial_capital': cost_basis,
        'csp_pnl': csp_pnl,
        'cc_pnl': cc_pnl,
        'stock_pnl': stock_pnl,
        'pnl': total_pnl,
        'wheel_roc': wheel_roc,
        'total_days': total_days,
        'csp_exit_reason': csp_exit['exit_reason'],
        'cc_exit_reason': cc_exit.get('exit_reason') if cc_exit else None,
        'execution_version': 'v5_scheduler',
    }
    wheel_results.append(total_record)
    
    if log_info:
        print(f"    TOTAL: ${total_pnl:.2f} ({wheel_roc:.2f}% ROC), {total_days} days")
    
    return wheel_results


print("=" * 60)
print("SINGLE WHEEL WRAPPER LOADED")
print("=" * 60)
print("  run_single_wheel(candidate, config, cc_config, wheel_id, client)")
print("  Returns List[dict] of exit records for one wheel cycle")
print("=" * 60)


SINGLE WHEEL WRAPPER LOADED
  run_single_wheel(candidate, config, cc_config, wheel_id, client)
  Returns List[dict] of exit records for one wheel cycle
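
Per-wheel seeding is only reproducible if the value derived from `wheel_id` is the same in every Python process. The built-in `hash()` of a string is salted per interpreter run unless `PYTHONHASHSEED` is fixed, so the sketch below derives the offset from `zlib.crc32` instead. `wheel_seed` and `first_draw` are hypothetical helper names, and `random.Random` stands in for NumPy's generator to keep the example dependency-free.

```python
import random
import zlib


def wheel_seed(base_seed, wheel_id):
    """Stable per-wheel seed: base seed plus a CRC32-derived offset below 10000."""
    return base_seed + zlib.crc32(wheel_id.encode("utf-8")) % 10000


def first_draw(base_seed, wheel_id):
    """First uniform draw from an RNG seeded for this wheel."""
    rng = random.Random(wheel_seed(base_seed, wheel_id))
    return rng.random()


a = first_draw(42, "TSLA_2023-06-06_0")
b = first_draw(42, "TSLA_2023-06-06_0")
print(a == b)  # same wheel_id and base seed give the same draw
```

The modulo keeps the offset small, which also means distinct wheel IDs can occasionally share a seed; that affects independence between wheels, not reproducibility.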


In [10]:
# =============================================================================
# V5 SCHEDULER - MAIN ORCHESTRATION LOOP
# =============================================================================

def run_v5_scheduler(config, cc_config, scheduler_config, client):
    """
    Main V5 scheduler: iterate over trading days and symbols, launching wheels.
    
    For each trading day in the date range:
        For each symbol in the symbol list:
            1. Get entry candidates meeting all criteria
            2. Launch a wheel for each qualifying candidate
            3. Collect results
    
    Args:
        config: CONFIG dict (frozen - not mutated)
        cc_config: CC_CONFIG dict (frozen - not mutated)
        scheduler_config: SCHEDULER_CONFIG dict with date range and symbols
        client: Databento client
    
    Returns:
        DataFrame with all wheel exit records
    """
    from copy import deepcopy
    
    all_wheel_results = []
    wheel_counter = 0
    log_info = scheduler_config.get('log_level', 'INFO') == 'INFO'
    
    # Get trading days
    trading_days = get_trading_days(
        scheduler_config['start_date'],
        scheduler_config['end_date'],
        scheduler_config['trading_calendar'],
    )
    
    if log_info:
        print("=" * 60)
        print("V5 SCHEDULER STARTING")
        print("=" * 60)
        print(f"Date range: {scheduler_config['start_date']} to {scheduler_config['end_date']}")
        print(f"Trading days: {len(trading_days)}")
        print(f"Symbols: {scheduler_config['symbols']}")
        print("=" * 60)
    
    # Main loop: dates × symbols
    for trade_date in trading_days:
        for symbol in scheduler_config['symbols']:
            # Deep copy config to avoid mutation
            config_day = deepcopy(config)
            config_day['symbol'] = symbol
            config_day['entry_date'] = trade_date.strftime('%Y-%m-%d')
            
            # Get entry candidates
            entry_candidates = get_entry_candidates(symbol, trade_date, config_day, client)
            
            if len(entry_candidates) == 0:
                continue
            
            if log_info:
                print(f"\n[{trade_date.date()}] {symbol}: {len(entry_candidates)} candidates")
            
            # Apply max wheels per day limit if configured
            max_wheels = scheduler_config.get('max_wheels_per_symbol_per_day')
            if max_wheels is not None:
                entry_candidates = entry_candidates.head(max_wheels)
            
            # Launch wheel for each candidate
            for idx, candidate in entry_candidates.iterrows():
                wheel_id = f"{symbol}_{trade_date.date()}_{wheel_counter}"
                wheel_counter += 1
                
                wheel_results = run_single_wheel(
                    candidate=candidate,
                    config=config_day,
                    cc_config=cc_config,
                    wheel_id=wheel_id,
                    client=client,
                    log_info=log_info,
                )
                all_wheel_results.extend(wheel_results)
    
    if log_info:
        print(f"\n{'='*60}")
        print(f"SCHEDULER COMPLETE: {wheel_counter} wheels launched")
        print(f"{'='*60}")
    
    # Handle empty results case
    if len(all_wheel_results) == 0:
        return pd.DataFrame()
    
    # Combine all results into DataFrame
    results_df = pd.DataFrame(all_wheel_results)
    
    return results_df


print("=" * 60)
print("V5 SCHEDULER LOADED")
print("=" * 60)
print("  run_v5_scheduler(config, cc_config, scheduler_config, client)")
print("  Iterates over trading days × symbols, returns combined results")
print("=" * 60)


V5 SCHEDULER LOADED
  run_v5_scheduler(config, cc_config, scheduler_config, client)
  Iterates over trading days × symbols, returns combined results


In [11]:
# =============================================================================
# V5 AGGREGATION FUNCTIONS
# =============================================================================

def aggregate_v5_results(df):
    """
    Aggregate V5 scheduler results into summary statistics.
    
    Focuses on 'total' phase records which contain wheel-level P&L.
    
    Args:
        df: DataFrame from run_v5_scheduler()
    
    Returns:
        dict with aggregate statistics
    """
    # Shared result for the "nothing to aggregate" cases
    empty_stats = {
        'total_wheels': 0,
        'total_pnl': 0.0,
        'avg_wheel_roc': 0.0,
        'median_wheel_roc': 0.0,
        'max_drawdown_proxy': 0.0,
        'win_rate': 0.0,
    }
    
    if len(df) == 0:
        return empty_stats
    
    # Filter to wheel totals only
    totals = df[df['phase'] == 'total'].copy()
    
    if len(totals) == 0:
        return empty_stats
    
    return {
        'total_wheels': totals['wheel_id'].nunique(),
        'total_pnl': totals['pnl'].sum(),
        'avg_wheel_roc': totals['wheel_roc'].mean(),
        'median_wheel_roc': totals['wheel_roc'].median(),
        'std_wheel_roc': totals['wheel_roc'].std(),
        'max_wheel_roc': totals['wheel_roc'].max(),
        'min_wheel_roc': totals['wheel_roc'].min(),
        'max_drawdown_proxy': totals['pnl'].min(),
        'win_rate': (totals['pnl'] > 0).mean() * 100,
        'avg_days_held': totals['total_days'].mean(),
        'total_capital_deployed': totals['initial_capital'].sum(),
    }


def display_v5_summary(df):
    """
    Display formatted summary of V5 backtest results.
    
    Args:
        df: DataFrame from run_v5_scheduler()
    """
    stats = aggregate_v5_results(df)
    
    print("=" * 60)
    print("V5 SCHEDULER BACKTEST SUMMARY")
    print("=" * 60)
    print(f"Total Wheels:      {stats['total_wheels']}")
    print(f"Total P&L:         ${stats['total_pnl']:,.2f}")
    print(f"Win Rate:          {stats['win_rate']:.1f}%")
    print("-" * 40)
    print(f"Avg Wheel ROC:     {stats['avg_wheel_roc']:.2f}%")
    print(f"Median Wheel ROC:  {stats['median_wheel_roc']:.2f}%")
    print(f"Std Wheel ROC:     {stats.get('std_wheel_roc', 0):.2f}%")
    print(f"Best Wheel ROC:    {stats.get('max_wheel_roc', 0):.2f}%")
    print(f"Worst Wheel ROC:   {stats.get('min_wheel_roc', 0):.2f}%")
    print("-" * 40)
    print(f"Avg Days Held:     {stats.get('avg_days_held', 0):.1f}")
    print(f"Total Capital:     ${stats.get('total_capital_deployed', 0):,.2f}")
    print("=" * 60)
    
    # Phase breakdown
    if len(df) > 0:
        print("\nPhase Breakdown:")
        print(df['phase'].value_counts())
        
        print("\nExit Reason Breakdown (CSP phase):")
        csp_df = df[df['phase'] == 'csp']
        if len(csp_df) > 0:
            print(csp_df['exit_reason'].value_counts())
    
    return stats


print("=" * 60)
print("V5 AGGREGATION FUNCTIONS LOADED")
print("=" * 60)
print("  aggregate_v5_results(df) - Returns summary statistics dict")
print("  display_v5_summary(df) - Prints formatted summary")
print("=" * 60)


V5 AGGREGATION FUNCTIONS LOADED
  aggregate_v5_results(df) - Returns summary statistics dict
  display_v5_summary(df) - Prints formatted summary
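
The `max_drawdown_proxy` field above is the single worst wheel P&L. A peak-to-trough drawdown on cumulative P&L is a different quantity. The sketch below shows the contrast under loud assumptions: `max_drawdown_from_pnl` is a hypothetical helper (not part of v4 or v5), wheels are ordered by `entry_date`, and each wheel's P&L is treated as realized at entry, which ignores overlapping holding periods.

```python
import pandas as pd


def max_drawdown_from_pnl(totals: pd.DataFrame) -> float:
    """Largest peak-to-trough drop in cumulative P&L (hypothetical helper)."""
    # Assumption: ordering by entry_date approximates the realization order
    ordered = totals.sort_values('entry_date')
    equity = ordered['pnl'].cumsum()
    # Floor the running peak at 0 so an opening loss counts as drawdown
    running_peak = equity.cummax().clip(lower=0)
    return float((equity - running_peak).min())


demo = pd.DataFrame({
    'entry_date': pd.to_datetime(
        ['2023-06-06', '2023-06-07', '2023-06-08', '2023-06-09']
    ),
    'pnl': [100.0, -300.0, -50.0, 200.0],
})

worst_single = float(demo['pnl'].min())    # what max_drawdown_proxy reports
drawdown = max_drawdown_from_pnl(demo)     # cumulative peak-to-trough
print(worst_single, drawdown)              # -300.0 -350.0
```

Ordering by exit date would likely be closer to a real equity curve when wheels overlap; entry date is used here only because it is a known column of the results.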


## Run V5 Scheduler

Execute the scheduler across the configured date range and symbols.
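
As a rough illustration of the date x symbol iteration, the sketch below yields pairs in a fixed order so repeated runs visit them identically. `iter_schedule` is a hypothetical stand-in, not the actual `run_v5_scheduler()` or `get_trading_days()` code, and it uses plain weekdays, so exchange holidays are not excluded.

```python
import pandas as pd


def iter_schedule(start, end, symbols):
    """Yield (date, symbol) pairs in a reproducible order (illustrative only)."""
    # Weekdays approximate trading days; market holidays are NOT removed here
    days = pd.bdate_range(start=start, end=end)
    for day in days:
        # Sorting symbols keeps the visit order independent of input order
        for symbol in sorted(symbols):
            yield day.date(), symbol


pairs = list(iter_schedule('2023-06-06', '2023-06-09', ['TSLA', 'AAPL']))
print(len(pairs))   # 8 (4 weekdays x 2 symbols)
print(pairs[0])     # (datetime.date(2023, 6, 6), 'AAPL')

# Weekdays in the range used by the run below
weekday_count = len(pd.bdate_range('2023-06-06', '2023-09-13'))
print(weekday_count)  # 72
```

For the range in this run, weekdays alone number 72 against the 69 trading days the scheduler reports, which is consistent with a calendar that also drops market holidays.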


In [12]:
# =============================================================================
# RUN V5 SCHEDULER
# =============================================================================
# Execute the multi-date, multi-symbol backtest

v5_results = run_v5_scheduler(
    config=CONFIG,
    cc_config=CC_CONFIG,
    scheduler_config=SCHEDULER_CONFIG,
    client=client,
)

# Display summary
v5_stats = display_v5_summary(v5_results)


V5 SCHEDULER STARTING
Date range: 2023-06-06 to 2023-09-13
Trading days: 69
Symbols: ['TSLA']

  Liquidity Model Applied:
    Original: 7 options
    Hard rejected: 0 (0.0%)
    Remaining: 7 options
    Tier breakdown: {'tight': np.int64(7)}
    Avg spread: 2.7%, Avg allowed: 10.1%
    Avg penalty: 1.00x

[2023-06-06] TSLA: 7 candidates

  Wheel TSLA_2023-06-06_0: TSLA  230707P00205000
    Entry: 2023-06-06, Strike: $205.00, Premium: $634.00
    ✗ CSP Error: name 'fetch_daily_prices_for_option' is not defined


UnboundLocalError: cannot access local variable 'touch_profit_target' where it is not associated with a value
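
The `UnboundLocalError` above is the kind of failure that appears when a local variable is assigned only on some code paths and then read unconditionally. The v4 source is not shown in this section, so the following is only a sketch of that pattern and its usual fix. Both function names are hypothetical; only `touch_profit_target` comes from the traceback. The preceding `NameError` for `fetch_daily_prices_for_option` may be what left the flag unassigned, and it suggests the cell defining that function was not executed in this kernel session.

```python
def check_exit_broken(prices, target):
    """Flag is assigned only if some price reaches the target."""
    for price in prices:
        if price <= target:
            touch_profit_target = True
    # Raises UnboundLocalError when the loop never assigned the flag
    return touch_profit_target


def check_exit_fixed(prices, target):
    """Same logic, with the flag initialized before any branch."""
    touch_profit_target = False
    for price in prices:
        if price <= target:
            touch_profit_target = True
    return touch_profit_target


try:
    check_exit_broken([], 1.0)
    broken_error = None
except UnboundLocalError as exc:
    broken_error = type(exc).__name__

print(broken_error)                      # UnboundLocalError
print(check_exit_fixed([], 1.0))         # False
print(check_exit_fixed([2.0, 0.9], 1.0)) # True
```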

In [None]:
# =============================================================================
# DATA MODEL VALIDATION (V5)
# =============================================================================
# Assert data integrity after scheduler run

def validate_v5_results(df):
    """
    Validate V5 results DataFrame for data integrity.
    
    Assertions:
    1. Every wheel_id has at least one phase
    2. Required fields present
    3. Execution version is consistent
    4. No wheel_id has more than one 'total' row
    5. Every wheel_id has both a 'csp' and a 'total' phase
    
    Args:
        df: DataFrame from run_v5_scheduler()
    
    Raises:
        AssertionError if validation fails
    """
    if len(df) == 0:
        print("⚠ No results to validate (empty DataFrame)")
        return
    
    print("=" * 60)
    print("DATA MODEL VALIDATION")
    print("=" * 60)
    
    # 1. Every wheel_id must have at least one phase
    phase_counts = df.groupby('wheel_id')['phase'].nunique()
    assert phase_counts.ge(1).all(), "Some wheel_ids have no phases"
    print("✓ Every wheel_id has at least one phase")
    
    # 2. Required fields present
    required_fields = [
        'wheel_id', 'phase', 'state', 'entry_date', 'exit_reason', 
        'pnl', 'execution_version'
    ]
    missing = [col for col in required_fields if col not in df.columns]
    assert len(missing) == 0, f"Missing required fields: {missing}"
    print(f"✓ All required fields present: {required_fields}")
    
    # 3. Verify execution version is consistent
    assert (df['execution_version'] == 'v5_scheduler').all(), \
        "Inconsistent execution_version values"
    print("✓ Execution version is consistent (v5_scheduler)")
    
    # 4. At most one 'total' row per wheel_id (check 5 confirms at least one)
    totals = df[df['phase'] == 'total']
    assert totals['wheel_id'].is_unique, \
        "Duplicate 'total' rows detected for same wheel_id"
    print("✓ Exactly one 'total' row per wheel_id")
    
    # 5. Phase consistency check
    wheel_ids = df['wheel_id'].unique()
    for wid in wheel_ids:
        wheel_df = df[df['wheel_id'] == wid]
        phases = set(wheel_df['phase'].unique())
        
        # Every wheel must have 'csp' and 'total'
        assert 'csp' in phases, f"Wheel {wid} missing CSP phase"
        assert 'total' in phases, f"Wheel {wid} missing total phase"
    print("✓ All wheels have required phases (csp, total)")
    
    print("=" * 60)
    print("ALL VALIDATIONS PASSED")
    print("=" * 60)


# Run validation
validate_v5_results(v5_results)


DATA MODEL VALIDATION
✓ Every wheel_id has at least one phase
✓ All required fields present: ['wheel_id', 'phase', 'state', 'entry_date', 'exit_reason', 'pnl', 'execution_version']
✓ Execution version is consistent (v5_scheduler)
✓ Exactly one 'total' row per wheel_id
✓ All wheels have required phases (csp, total)
ALL VALIDATIONS PASSED
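
The phase checks (4 and 5) can also be expressed without a per-wheel loop by counting rows per `wheel_id` and `phase` in one pass. The sketch below uses `pd.crosstab` for this; `phase_counts` and `wheels_missing_phases` are hypothetical helpers, not part of the notebook, and the demo frame is made up purely for illustration.

```python
import pandas as pd

REQUIRED_PHASES = ('csp', 'total')


def phase_counts(df, required=REQUIRED_PHASES):
    """Rows per (wheel_id, phase), with a zero column for absent phases."""
    counts = pd.crosstab(df['wheel_id'], df['phase'])
    return counts.reindex(columns=list(required), fill_value=0)


def wheels_missing_phases(df, required=REQUIRED_PHASES):
    """Map wheel_id -> list of required phases that have no rows."""
    absent = phase_counts(df, required).eq(0)
    flagged = absent.index[absent.any(axis=1)]
    return {wid: [p for p in required if absent.loc[wid, p]] for wid in flagged}


# Made-up data: B lacks a total row, C has two
demo = pd.DataFrame({
    'wheel_id': ['A', 'A', 'B', 'C', 'C', 'C'],
    'phase':    ['csp', 'total', 'csp', 'csp', 'total', 'total'],
})

counts = phase_counts(demo)
missing = wheels_missing_phases(demo)
duplicate_totals = counts.index[counts['total'] > 1].tolist()

print(missing)            # {'B': ['total']}
print(duplicate_totals)   # ['C']
```

Returning the offending wheel IDs, rather than asserting on the first failure, makes it easier to see every problem in a large run at once.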


In [101]:
# =============================================================================
# V5 RESULTS EXPLORATION
# =============================================================================

# Configure pandas display
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', 50)

# Display all results
print("=" * 60)
print("V5 ALL EXIT RECORDS")
print("=" * 60)
print(f"Total records: {len(v5_results)}")

if len(v5_results) > 0:
    # Show phase breakdown
    print("\nPhase breakdown:")
    print(v5_results['phase'].value_counts())
    
    # Show key columns for all records
    display_cols = ['wheel_id', 'phase', 'state', 'symbol', 'entry_date', 'exit_date', 
                    'pnl', 'exit_reason', 'execution_version']
    available_cols = [c for c in display_cols if c in v5_results.columns]
    
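    print("\nSample records:")
    # A bare expression inside an `if` block is not auto-rendered by the notebook,
    # so print the sample explicitly
    print(v5_results[available_cols].head(15))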


V5 ALL EXIT RECORDS
Total records: 754

Phase breakdown:
phase
csp      377
total    377
Name: count, dtype: int64

Sample records:


In [102]:
# =============================================================================
# V5 WHEEL TOTALS ANALYSIS
# =============================================================================

if len(v5_results) > 0:
    # Filter to totals only for wheel-level analysis
    totals = v5_results[v5_results['phase'] == 'total'].copy()
    
    print("=" * 60)
    print("WHEEL TOTALS (one row per wheel)")
    print("=" * 60)
    print(f"Total wheels: {len(totals)}")
    
    # Summary statistics
    print("\nP&L Statistics:")
    print(totals['pnl'].describe())
    
    print("\nROC Statistics:")
    print(totals['wheel_roc'].describe())
    
    print("\nExit Reason Breakdown:")
    print(totals['csp_exit_reason'].value_counts())
    
    # Show top and bottom performers
    perf_cols = ['wheel_id', 'symbol', 'entry_date', 'pnl', 'wheel_roc', 'csp_exit_reason']
    
    print("\nTop 5 Performers:")
    print(totals.nlargest(5, 'wheel_roc')[perf_cols])
    
    print("\nBottom 5 Performers:")
    print(totals.nsmallest(5, 'wheel_roc')[perf_cols])


WHEEL TOTALS (one row per wheel)
Total wheels: 377

P&L Statistics:
count     377.000000
mean      -69.187546
std       687.523007
min     -1716.320000
25%      -795.320000
50%       348.680000
75%       486.680000
max       935.380000
Name: pnl, dtype: float64

ROC Statistics:
count    377.000000
mean      -0.220194
std        2.817519
min       -6.241164
25%       -3.267429
50%        1.515512
75%        2.051070
max        3.890333
Name: wheel_roc, dtype: float64

Exit Reason Breakdown:
csp_exit_reason
profit_target    223
stop_loss        154
Name: count, dtype: int64

Top 5 Performers:
               wheel_id symbol entry_date     pnl  wheel_roc csp_exit_reason
95   TSLA_2023-06-15_47   TSLA 2023-06-15  933.68   3.890333   profit_target
63   TSLA_2023-06-13_31   TSLA 2023-06-13  935.38   3.817878   profit_target
83   TSLA_2023-06-14_41   TSLA 2023-06-14  906.78   3.778250   profit_target
107  TSLA_2023-06-16_53   TSLA 2023-06-16  878.18   3.584408   profit_target
93   TSLA_2023-06

In [103]:
# Notes:
# - Backtest results should be saved with metadata as the strategy evolves.
# - exists_df should contain option data such as delta at entry and peak delta,
#   plus any other fields that would help analysis.

# ROC on cost basis. Despite the column name, this is not yet normalized by days held.
v5_results['daily_adjusted_roc'] = v5_results['csp_pnl'] / v5_results['cost_basis']

# Only the last expression in a cell is auto-rendered, so print the earlier summaries
print(v5_results['daily_adjusted_roc'].describe())
print(v5_results['days_held'].describe())
v5_results['exit_reason'].value_counts()


exit_reason
profit_target    223
stop_loss        154
Name: count, dtype: int64
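
The `daily_adjusted_roc` column above divides P&L by cost basis only. If a per-day figure is the goal, one possible normalization is to divide that return by the holding period. The sketch below assumes the `csp_pnl`, `cost_basis`, and `days_held` columns used in the cell above; `add_per_day_roc` and its output column names are hypothetical. Treating same-day exits as one day held is an assumption made to avoid division by zero, and the demo values are invented.

```python
import pandas as pd


def add_per_day_roc(df, pnl_col='csp_pnl', basis_col='cost_basis',
                    days_col='days_held'):
    """Return a copy with ROC (%) and ROC per day held (%) (hypothetical helper)."""
    out = df.copy()
    # Assumption: a same-day exit (0 days) is counted as 1 day
    days = out[days_col].clip(lower=1)
    out['roc_pct'] = out[pnl_col] / out[basis_col] * 100
    out['roc_pct_per_day'] = out['roc_pct'] / days
    return out


# Invented rows for illustration only
demo = pd.DataFrame({
    'csp_pnl':    [500.0, -800.0, 200.0],
    'cost_basis': [20000.0, 20000.0, 20000.0],
    'days_held':  [10, 4, 0],
})

result = add_per_day_roc(demo)
print(result[['roc_pct', 'roc_pct_per_day']].round(4))
```

A simple division like this ignores compounding; it is meant only to make wheels with different holding periods roughly comparable.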