# DAX Local Pivot Conditional Probabilities by Volatility Regime
## Does First-Hour Volatility Change Pivot Edge Strength?

**Research Question:**
Do local pivot conditional probabilities vary by volatility regime?
- **Hypothesis 1:** Quiet markets (Q1) → stronger mean reversion to local pivot levels
- **Hypothesis 2:** Spicy markets (Q4) → weaker pivot support/resistance, more breakouts
- **Goal:** Find regime-specific conditional edges

**Methodology:**
1. **Calculate Local Pivots** from first hour (9:00-10:00) H/L/C
2. **Calculate First Hour Volatility Regime:**
   - Early_TR = Sum of True Range across first 12 M5 bars (9:00-9:55)
   - Baseline = 20-day average of Early_TR (shifted, no look-ahead)
   - Early_Ratio = Early_TR / Baseline
   - **Outlier clipping:** Clip Early_Ratio at its 5th and 95th percentiles (extreme days are capped, not removed)
   - **Dynamic quartiles:** Rolling 60-day percentile rank on clipped data
   - Q1 (Quiet) = 0-25th percentile, Q2 = 25-50th, Q3 = 50-75th, Q4 (Spicy) = 75-100th

3. **Classify Opening Zone at 10:00** relative to local pivots
4. **Track First Touch Timestamps** (10:00-17:30 session)
5. **Calculate Conditional Probabilities BY REGIME:**
   - P(Target | Condition, Zone, Regime)
   - Temporal + Spatial ordering enforced
   - Compare: Does Q1 (quiet) vs Q4 (spicy) show different conditional probabilities?
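As a minimal sketch of step 1, the local pivot ladder can be computed from a single first-hour H/L/C triple. The function name `local_pivots` and the input prices are hypothetical and chosen only for illustration; the formulas are the ones used later in Step 3.

```python
def local_pivots(high, low, close):
    """Pivot ladder from one first-hour high/low/close triple."""
    pp = (high + low + close) / 3
    rng = high - low
    return {
        'LS3': low - 2 * (high - pp),
        'LS2': pp - rng,
        'LS1': 2 * pp - high,
        'LPP': pp,
        'LR1': 2 * pp - low,
        'LR2': pp + rng,
        'LR3': high + 2 * (pp - low),
    }

# Hypothetical first-hour values, not taken from the dataset
levels = local_pivots(high=16050.0, low=15990.0, close=16020.0)
for name, value in levels.items():
    print(f'{name}: {value:.1f}')
```

Because every level is a function of the first-hour range, the ladder is narrow on days with a small first-hour range and wide on days with a large one.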

**Key Innovation:**
- Previous research: Local pivot conditional probabilities (all regimes combined)
- This research: SPLIT by volatility regime to find regime-specific edges

**Expected Insights:**
- If Q1 shows higher mean-reversion probabilities → trade reversals in quiet markets
- If Q4 shows lower pivot hold probabilities → avoid fading breakouts in spicy markets
- If no difference → regime doesn't matter for local pivots (use combined probabilities)

**Data:** M5 OHLCV, Jan 2021 - present, RTH (09:00-17:30 Berlin)

---

## Step 1: Setup and Data Loading

In [1]:
import sys
sys.path.insert(0, '../../')

from shared.database_connector import fetch_ohlcv, get_date_range
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
from scipy import stats
import warnings
warnings.filterwarnings('ignore')

sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (22, 16)

print('[OK] Dependencies loaded')
print('='*80)

[OK] Dependencies loaded


## Step 2: Fetch M5 Data

In [2]:
print('\n[STEP 2] Fetch M5 Data')
print('='*80)

date_range = get_date_range('deuidxeur', 'm5')
end_date = date_range['end']
start_date = datetime(2021, 1, 1)

print(f'Fetching M5 data: {start_date.date()} to {end_date.date()}')

df_raw = fetch_ohlcv(
    symbol='deuidxeur',
    timeframe='m5',
    start_date=start_date,
    end_date=end_date
)

df_m5 = df_raw.copy()
df_m5.index = df_m5.index.tz_convert('Europe/Berlin')

print(f'[OK] Fetched {len(df_m5)} M5 candles')

df_m5['date'] = df_m5.index.date
df_m5['hour'] = df_m5.index.hour
df_m5['minute'] = df_m5.index.minute

df_m5_rth = df_m5[
    (df_m5['hour'] >= 9) & 
    ((df_m5['hour'] < 17) | ((df_m5['hour'] == 17) & (df_m5['minute'] <= 30)))
].copy()

print(f'[OK] RTH filtered: {len(df_m5_rth)} candles')

2025-12-18 22:05:23,318 - shared.database_connector - INFO - Initializing database connection...



[STEP 2] Fetch M5 Data


2025-12-18 22:05:24,114 - shared.database_connector - INFO - [OK] Database connection successful
2025-12-18 22:05:24,347 - shared.database_connector - INFO - [OK] Date range for deuidxeur m5: 2020-09-14 22:00:00+00:00 to 2025-12-11 22:55:00+00:00
2025-12-18 22:05:24,349 - shared.database_connector - INFO - fetch_ohlcv(): symbol=deuidxeur, timeframe=m5, start=2021-01-01 00:00:00, end=2025-12-11 22:55:00+00:00


Fetching M5 data: 2021-01-01 to 2025-12-11


2025-12-18 22:05:27,847 - shared.database_connector - INFO - [OK] Fetched 339450 candles (2021-01-03 22:00:00+00:00 to 2025-12-11 22:55:00+00:00)


[OK] Fetched 339450 M5 candles
[OK] RTH filtered: 131382 candles


## Step 3: Calculate Local Pivots AND Volatility Regime

**Why combine these steps?**
Both use the same first hour data (9:00-10:00), so we calculate them together efficiently.

**Volatility Regime Calculation:**
1. True Range (TR) = max(H-L, |H-C_prev|, |L-C_prev|) for each M5 bar
2. Early_TR = Sum of TR across first 12 bars (9:00-9:55)
3. Baseline = 20-day average Early_TR (shifted to avoid look-ahead)
4. Early_Ratio = Early_TR / Baseline
5. Outlier clipping: Clip Early_Ratio at its 5th and 95th percentiles (values are capped, not removed)
6. Rolling 60-day percentile rank on clipped data
7. Quartile assignment: Q1 (0-25%), Q2 (25-50%), Q3 (50-75%), Q4 (75-100%)
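The list above can be sketched in plain Python on toy inputs. This is an illustration under assumptions, not the notebook's implementation: the helper names (`true_range`, `early_tr`, `percentile_rank_of_last`, `quartile`) and all numbers are hypothetical, and the baseline and clipping steps are left out for brevity. The rank helper is written to mirror an average-rank percentile, which is what the pandas call in the next cell computes by default.

```python
def true_range(high, low, prev_close=None):
    """TR for one bar; falls back to high - low when no previous close exists."""
    if prev_close is None:
        return high - low
    return max(high - low, abs(high - prev_close), abs(low - prev_close))

def early_tr(bars):
    """Sum TR over one day's first-hour bars, given as (high, low, close) tuples."""
    total, prev_close = 0.0, None
    for high, low, close in bars:
        total += true_range(high, low, prev_close)
        prev_close = close
    return total

def percentile_rank_of_last(window):
    """Fractional rank of the last value within the window (ties share the average rank)."""
    last = window[-1]
    n_less = sum(1 for v in window if v < last)
    n_equal = sum(1 for v in window if v == last)
    return (n_less + (n_equal + 1) / 2) / len(window)

def quartile(pct):
    """Map a percentile rank in (0, 1] to a regime label."""
    if pct <= 0.25:
        return 'Q1_Quiet'
    if pct <= 0.50:
        return 'Q2_Neutral'
    if pct <= 0.75:
        return 'Q3_Neutral'
    return 'Q4_Spicy'

# Three hypothetical bars: (high, low, close)
toy_bars = [(105.0, 100.0, 104.0), (106.0, 103.0, 105.0), (110.0, 104.0, 109.0)]
print(early_tr(toy_bars))

# Hypothetical Early_Ratio window; the last entry is "today"
toy_ratios = [0.8, 1.0, 1.2, 0.9]
pct = percentile_rank_of_last(toy_ratios)
print(pct, quartile(pct))
```

Ranking only the last value of a trailing window keeps the regime label free of future data, since each day is compared only with itself and earlier days.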

In [3]:
print('\n[STEP 3] Calculate Local Pivots AND Volatility Regime')
print('='*80)

df_first_hour = df_m5_rth[
    (df_m5_rth['hour'] == 9) & (df_m5_rth['minute'] < 60)
].copy()

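# Calculate True Range for each M5 bar.
# prev_close is shifted within each day's first-hour bars, so the 9:00 bar has no
# previous close and falls back to high - low (the overnight gap is not included).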
df_first_hour['prev_close'] = df_first_hour.groupby('date')['close'].shift(1)
df_first_hour['tr'] = df_first_hour.apply(
    lambda row: max(
        row['high'] - row['low'],
        abs(row['high'] - row['prev_close']) if pd.notna(row['prev_close']) else row['high'] - row['low'],
        abs(row['low'] - row['prev_close']) if pd.notna(row['prev_close']) else row['high'] - row['low']
    ),
    axis=1
)

daily_data = []

for date, day_bars in df_first_hour.groupby('date'):
    # Skip days with fewer than 10 of the 12 expected first-hour bars
    if len(day_bars) < 10:
        continue
    
    # Local Pivot calculation
    H = day_bars['high'].max()
    L = day_bars['low'].min()
    C = day_bars.iloc[-1]['close']
    
    LPP = (H + L + C) / 3
    LR1 = (2 * LPP) - L
    LS1 = (2 * LPP) - H
    LR2 = LPP + (H - L)
    LS2 = LPP - (H - L)
    LR3 = H + 2 * (LPP - L)
    LS3 = L - 2 * (H - LPP)
    
    # Early Volatility calculation
    Early_TR = day_bars['tr'].sum()
    
    daily_data.append({
        'date': date,
        'first_hour_high': H,
        'first_hour_low': L,
        'first_hour_close': C,
        'LPP': LPP,
        'LR1': LR1,
        'LS1': LS1,
        'LR2': LR2,
        'LS2': LS2,
        'LR3': LR3,
        'LS3': LS3,
        'Early_TR': Early_TR,
    })

df_daily = pd.DataFrame(daily_data).sort_values('date').reset_index(drop=True)

print(f'[OK] Calculated local pivots and Early_TR for {len(df_daily)} days')

# Calculate baseline (20-day average, shifted)
df_daily['Baseline'] = df_daily['Early_TR'].rolling(20, min_periods=10).mean().shift(1)

# Drop first days without baseline
df_daily = df_daily[df_daily['Baseline'].notna()].reset_index(drop=True)

# Calculate Early_Ratio
df_daily['Early_Ratio'] = df_daily['Early_TR'] / df_daily['Baseline']

print(f'[OK] Calculated Early_Ratio: {len(df_daily)} valid days')

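# Outlier handling: clip (winsorize) at the 5th/95th percentiles; values are capped, not dropped.
# Note: p5 and p95 are computed on the full sample, so this step uses information from later dates.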
p5 = df_daily['Early_Ratio'].quantile(0.05)
p95 = df_daily['Early_Ratio'].quantile(0.95)

df_daily['Early_Ratio_Trimmed'] = df_daily['Early_Ratio'].clip(lower=p5, upper=p95)

print(f'[OK] Outlier trimming: clipped at {p5:.2f} (5th percentile) and {p95:.2f} (95th percentile)')

# Rolling 60-day percentile rank (on clipped data): rank of the current day
# within its trailing window, so no later days enter the ranking
df_daily['Early_Percentile'] = df_daily['Early_Ratio_Trimmed'].rolling(60, min_periods=30).apply(
    lambda x: pd.Series(x).rank(pct=True).iloc[-1] if len(x) > 0 else np.nan
)

# Quartile assignment
def assign_quartile(percentile):
    if pd.isna(percentile):
        return None
    if percentile <= 0.25:
        return 'Q1_Quiet'
    elif percentile <= 0.50:
        return 'Q2_Neutral'
    elif percentile <= 0.75:
        return 'Q3_Neutral'
    else:
        return 'Q4_Spicy'

df_daily['Regime'] = df_daily['Early_Percentile'].apply(assign_quartile)

# Drop days without regime
df_daily = df_daily[df_daily['Regime'].notna()].reset_index(drop=True)

print(f'[OK] Volatility regime assigned: {len(df_daily)} days')
print(f'\nRegime distribution:')
for regime, count in df_daily['Regime'].value_counts().items():
    pct = count / len(df_daily) * 100
    print(f'  {regime:12} {count:4d} days ({pct:5.1f}%)')


[STEP 3] Calculate Local Pivots AND Volatility Regime
[OK] Calculated local pivots and Early_TR for 1276 days
[OK] Calculated Early_Ratio: 1266 valid days
[OK] Outlier trimming: clipped at 0.60 (5th percentile) and 1.67 (95th percentile)
[OK] Volatility regime assigned: 1237 days

Regime distribution:
  Q4_Spicy      322 days ( 26.0%)
  Q2_Neutral    322 days ( 26.0%)
  Q1_Quiet      297 days ( 24.0%)
  Q3_Neutral    296 days ( 23.9%)


## Step 4: Get 10:00 Price and Classify Zone

**Why 10:00?**
Local pivots are calculated from the 9:00-10:00 range, so the levels are only known at 10:00. The open of the 10:00 bar is used as the reference price for classifying the opening zone, and conditional probabilities are tracked from that bar onward.
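The zone lookup can also be expressed as a binary search over the sorted pivot ladder. The sketch below is an alternative formulation, assuming the same half-open intervals as the if/elif chain in the next cell (a price equal to a level belongs to the zone below it). `classify_zone` and the example levels are hypothetical.

```python
from bisect import bisect_left

LEVEL_NAMES = ['LS3', 'LS2', 'LS1', 'LPP', 'LR1', 'LR2', 'LR3']
ZONE_NAMES = [
    'Below_LS3', 'LS3_LS2', 'LS2_LS1', 'LS1_LPP',
    'LPP_LR1', 'LR1_LR2', 'LR2_LR3', 'Above_LR3',
]

def classify_zone(price, levels):
    """Return the zone whose (lower, upper] interval contains price."""
    boundaries = [levels[name] for name in LEVEL_NAMES]
    # bisect_left counts the boundaries strictly below price
    return ZONE_NAMES[bisect_left(boundaries, price)]

# Hypothetical pivot ladder, not taken from the dataset
example_levels = {
    'LS3': 15930.0, 'LS2': 15960.0, 'LS1': 15990.0, 'LPP': 16020.0,
    'LR1': 16050.0, 'LR2': 16080.0, 'LR3': 16110.0,
}
for price in (16030.0, 16020.0, 16200.0, 15930.0):
    print(price, classify_zone(price, example_levels))
```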

In [4]:
print('\n[STEP 4] Get 10:00 Price and Classify Zone')
print('='*80)

df_10am = df_m5_rth[
    (df_m5_rth['hour'] == 10) & (df_m5_rth['minute'] == 0)
][['date', 'open']].copy()

df_10am = df_10am.rename(columns={'open': 'price_10am'})
df_daily = df_daily.merge(df_10am, on='date', how='left')
df_daily = df_daily[df_daily['price_10am'].notna()].reset_index(drop=True)

def classify_local_zone(row):
    price = row['price_10am']
    if price > row['LR3']:
        return 'Above_LR3'
    elif row['LR2'] < price <= row['LR3']:
        return 'LR2_LR3'
    elif row['LR1'] < price <= row['LR2']:
        return 'LR1_LR2'
    elif row['LPP'] < price <= row['LR1']:
        return 'LPP_LR1'
    elif row['LS1'] < price <= row['LPP']:
        return 'LS1_LPP'
    elif row['LS2'] < price <= row['LS1']:
        return 'LS2_LS1'
    elif row['LS3'] < price <= row['LS2']:
        return 'LS3_LS2'
    else:
        return 'Below_LS3'

df_daily['opening_zone'] = df_daily.apply(classify_local_zone, axis=1)

print(f'[OK] Classified opening zones: {len(df_daily)} days')
print(f'\nZone distribution:')
for zone, count in df_daily['opening_zone'].value_counts().items():
    print(f'  {zone:12} {count:4d} days')


[STEP 4] Get 10:00 Price and Classify Zone
[OK] Classified opening zones: 1237 days

Zone distribution:
  LPP_LR1       680 days
  LS1_LPP       557 days


## Step 5: Define Target Levels

In [5]:
print('\n[STEP 5] Define Target Levels')
print('='*80)

def calculate_local_targets(row):
    targets = {}
    
    targets['LS3'] = row['LS3']
    targets['LS2'] = row['LS2']
    targets['LS1'] = row['LS1']
    targets['LPP'] = row['LPP']
    targets['LR1'] = row['LR1']
    targets['LR2'] = row['LR2']
    targets['LR3'] = row['LR3']
    
    for low_level, high_level, prefix in [
        ('LS3', 'LS2', 'LS2_LS3'),
        ('LS2', 'LS1', 'LS1_LS2'),
        ('LS1', 'LPP', 'LS1_LPP'),
        ('LPP', 'LR1', 'LPP_LR1'),
        ('LR1', 'LR2', 'LR1_LR2'),
        ('LR2', 'LR3', 'LR2_LR3'),
    ]:
        low_val = row[low_level]
        high_val = row[high_level]
        dist = high_val - low_val
        targets[f'{prefix}_025'] = low_val + 0.25 * dist
        targets[f'{prefix}_050'] = low_val + 0.50 * dist
        targets[f'{prefix}_075'] = low_val + 0.75 * dist
    
    return targets

df_daily['targets'] = df_daily.apply(calculate_local_targets, axis=1)

print(f'[OK] Target levels calculated')


[STEP 5] Define Target Levels
[OK] Target levels calculated


## Step 6: Track First Touch Timestamps (10:00-17:30 Session)

**Why track timestamps?**
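To enforce temporal ordering: a target only counts as reached if it is first touched AFTER the condition level was first touched. A level counts as touched when an M5 bar's low-high range contains it.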
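A minimal sketch of first-touch tracking on three toy bars, assuming bars arrive in time order as `(timestamp, high, low)` tuples. The function name `first_touches`, the bar values, and the target prices are hypothetical.

```python
from datetime import datetime, timedelta

def first_touches(bars, targets):
    """Map each target name to the timestamp of the first bar containing its price."""
    touched = {name: None for name in targets}
    for ts, high, low in bars:
        for name, price in targets.items():
            # Keep only the first touch; later touches are ignored
            if touched[name] is None and low <= price <= high:
                touched[name] = ts
    return touched

start = datetime(2024, 1, 2, 10, 0)  # hypothetical session start
bars = [
    (start, 16035.0, 16022.0),                           # touches nothing
    (start + timedelta(minutes=5), 16052.0, 16030.0),    # touches LR1
    (start + timedelta(minutes=10), 16048.0, 16018.0),   # touches LPP
]
targets = {'LPP': 16020.0, 'LR1': 16050.0, 'LR2': 16080.0}

touches = first_touches(bars, targets)
for name, ts in touches.items():
    print(name, ts)
```

Levels that are never touched stay `None`, which is what the later probability step relies on to exclude them.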

In [6]:
print('\n[STEP 6] Track First Touch Timestamps')
print('='*80)
print('[WARNING] Processing intrabar data - may take 2-3 minutes...')

df_m5_session = df_m5_rth[df_m5_rth['hour'] >= 10].copy()

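def track_first_touches_session(date, targets_dict, df_m5_day):
    """Return {target_name: timestamp of the first M5 bar whose low-high range contains the target, or None}.

    Note: the `date` argument is not used inside the function.
    """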
    first_touches = {key: None for key in targets_dict.keys()}
    
    for timestamp, bar in df_m5_day.iterrows():
        bar_high = bar['high']
        bar_low = bar['low']
        
        for target_name, target_price in targets_dict.items():
            if first_touches[target_name] is not None:
                continue
            
            if bar_low <= target_price <= bar_high:
                first_touches[target_name] = timestamp
    
    return first_touches

first_touch_data = []

for idx, day_row in df_daily.iterrows():
    date = day_row['date']
    targets_dict = day_row['targets']
    
    df_m5_day = df_m5_session[df_m5_session['date'] == date].copy()
    
    if len(df_m5_day) < 10:
        continue
    
    first_touches = track_first_touches_session(date, targets_dict, df_m5_day)
    
    first_touch_data.append({
        'date': date,
        'first_touches': first_touches
    })
    
    if (idx + 1) % 200 == 0:
        print(f'  Processed {idx + 1}/{len(df_daily)} days...')

df_touch = pd.DataFrame(first_touch_data)
df_daily = df_daily.merge(df_touch, on='date', how='left')

print(f'\n[OK] First touch timestamps tracked')


[STEP 6] Track First Touch Timestamps
  Processed 200/1237 days...
  Processed 400/1237 days...
  Processed 600/1237 days...
  Processed 800/1237 days...
  Processed 1000/1237 days...
  Processed 1200/1237 days...

[OK] First touch timestamps tracked


## Step 7: Calculate Conditional Probabilities BY REGIME (Spatial + Temporal)

**Key Innovation:**
We split the conditional probability calculation by volatility regime.

**Why this matters:**
- In Q1 (Quiet): Local pivots may act as strong support/resistance → higher mean reversion probabilities
- In Q4 (Spicy): Local pivots may break more easily → lower hold probabilities, higher breakout probabilities

**Spatial + Temporal Ordering:**
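A target counts as reached after the condition only if both checks pass:
1. **Temporal:** timestamp_target > timestamp_condition, where both are first-touch timestamps. A target first touched before the condition is not counted, even if price returns to it later.
2. **Spatial:** ALL intermediate levels between condition and target were touched at some point in the session.

This filters out cases where price gapped over a level without trading through it.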
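The two checks can be illustrated on a single toy day. This sketch uses a shortened level ladder and integer timestamps (minutes after 10:00) to keep it readable; `levels_to_cross`, `reached_after`, and the touch times are hypothetical.

```python
# Shortened ladder for illustration; the notebook's LEVEL_ORDER has 25 entries
ORDER = ['LPP', 'LPP_LR1_050', 'LR1', 'LR1_LR2_050', 'LR2']

def levels_to_cross(condition, target, order):
    """Levels between condition and target, target included, condition excluded."""
    c, t = order.index(condition), order.index(target)
    if c == t:
        return []
    if c < t:
        return order[c + 1:t + 1]
    return order[t:c][::-1]

def reached_after(touches, condition, target, order):
    """True if target's first touch is later than condition's and no level was skipped."""
    cond_ts, tgt_ts = touches.get(condition), touches.get(target)
    if cond_ts is None or tgt_ts is None:
        return False
    if tgt_ts <= cond_ts:  # temporal check on first-touch times
        return False
    # spatial check: every level on the path was touched at some point
    return all(touches.get(lv) is not None
               for lv in levels_to_cross(condition, target, order))

# Hypothetical first-touch times in minutes after 10:00 (None = never touched)
day = {'LPP': 45, 'LPP_LR1_050': 5, 'LR1': 20, 'LR1_LR2_050': None, 'LR2': None}

print(reached_after(day, 'LR1', 'LPP', ORDER))
print(reached_after(day, 'LR1', 'LPP_LR1_050', ORDER))
print(reached_after(day, 'LR1', 'LR2', ORDER))
```

In this toy day the midpoint `LPP_LR1_050` is touched at minute 5, before `LR1` at minute 20, so it is not counted as reached after `LR1` even though price must pass it again on the way down to `LPP`. Levels lying between the opening price and the condition will therefore tend to show probabilities near zero under first-touch semantics.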

In [7]:
print('\n[STEP 7] Calculate Conditional Probabilities BY REGIME')
print('='*80)

LEVEL_ORDER = [
    'LS3',
    'LS2_LS3_025', 'LS2_LS3_050', 'LS2_LS3_075',
    'LS2',
    'LS1_LS2_025', 'LS1_LS2_050', 'LS1_LS2_075',
    'LS1',
    'LS1_LPP_025', 'LS1_LPP_050', 'LS1_LPP_075',
    'LPP',
    'LPP_LR1_025', 'LPP_LR1_050', 'LPP_LR1_075',
    'LR1',
    'LR1_LR2_025', 'LR1_LR2_050', 'LR1_LR2_075',
    'LR2',
    'LR2_LR3_025', 'LR2_LR3_050', 'LR2_LR3_075',
    'LR3',
]

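def get_intermediate_levels(condition, target, level_order):
    """Return the levels price must cross from condition to target (target included, condition excluded)."""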
    try:
        cond_idx = level_order.index(condition)
        target_idx = level_order.index(target)
    except ValueError:
        return []
    
    if cond_idx == target_idx:
        return []
    
    if cond_idx < target_idx:
        return level_order[cond_idx + 1:target_idx + 1]
    else:
        return level_order[target_idx:cond_idx][::-1]

def calculate_conditional_probs_by_regime(df, zone_name, condition_level, regime_filter):
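    """
    Calculate P(Target | Condition, Zone, Regime) with spatial + temporal ordering.
    
    Args:
        df: Daily DataFrame with 'opening_zone', 'Regime', 'targets' and 'first_touches' columns
        zone_name: Opening zone to filter on (e.g. 'LPP_LR1')
        condition_level: Level that must be touched first (an entry of LEVEL_ORDER)
        regime_filter: 'Q1_Quiet', 'Q2_Neutral', 'Q3_Neutral', 'Q4_Spicy', or None (all regimes)
    
    Returns:
        DataFrame with one row per target (target, count_sequential, prob_conditional,
        n_condition), or None if fewer than 10 zone days or fewer than 5 condition days.
    """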
    zone_data = df[df['opening_zone'] == zone_name].copy()
    
    # Filter by regime if specified
    if regime_filter is not None:
        zone_data = zone_data[zone_data['Regime'] == regime_filter].copy()
    
    if len(zone_data) < 10:
        return None
    
    zone_data['condition_timestamp'] = zone_data['first_touches'].apply(
        lambda ft: ft.get(condition_level) if ft is not None else None
    )
    
    condition_met = zone_data[zone_data['condition_timestamp'].notna()].copy()
    n_condition = len(condition_met)
    
    if n_condition < 5:
        return None
    
    target_keys = list(condition_met.iloc[0]['targets'].keys())
    
    results = []
    for target_key in target_keys:
        n_sequential_spatial = 0
        
        intermediates = get_intermediate_levels(condition_level, target_key, LEVEL_ORDER)
        
        for _, row in condition_met.iterrows():
            condition_ts = row['condition_timestamp']
            target_ts = row['first_touches'].get(target_key) if row['first_touches'] is not None else None
            
            if target_ts is None or condition_ts is None:
                continue
            
            if target_ts <= condition_ts:
                continue
            
            all_intermediates_reached = True
            for inter_level in intermediates:
                inter_ts = row['first_touches'].get(inter_level) if row['first_touches'] is not None else None
                if inter_ts is None:
                    all_intermediates_reached = False
                    break
            
            if all_intermediates_reached:
                n_sequential_spatial += 1
        
        prob_conditional = n_sequential_spatial / n_condition if n_condition > 0 else 0
        
        results.append({
            'target': target_key,
            'count_sequential': n_sequential_spatial,
            'prob_conditional': prob_conditional,
            'n_condition': n_condition,
        })
    
    return pd.DataFrame(results)

print('[OK] Conditional probability function defined (regime-aware)')


[STEP 7] Calculate Conditional Probabilities BY REGIME
[OK] Conditional probability function defined (regime-aware)


## Step 8: Analyze Key Zones - Compare Regimes

**Goal:** For each zone + condition, compare conditional probabilities across regimes.

**Key Question:** Does Q1 (Quiet) show different probabilities than Q4 (Spicy)?

**Example (hypothetical numbers, illustrating the output format):**
```
Zone: LPP_LR1, Condition: LR1

Q1 (Quiet):  P(LPP | LR1) = 75% (strong mean reversion)
Q4 (Spicy):  P(LPP | LR1) = 45% (weak mean reversion, price tends to break)

Delta: +30% → In quiet markets, LR1 acts as resistance and price reverts to LPP
```

In [8]:
print('\n[STEP 8] Analyze Key Zones - Compare Across Regimes')
print('='*80)

# Focus on most common zones
key_zones = ['LPP_LR1', 'LS1_LPP']
key_conditions = {
    'LPP_LR1': ['LPP', 'LPP_LR1_050', 'LR1'],
    'LS1_LPP': ['LS1', 'LS1_LPP_050', 'LPP'],
}

# Key targets to compare (mean reversion vs continuation)
key_targets = {
    'LPP_LR1': {
        'LR1': ['LPP', 'LPP_LR1_050', 'LR1_LR2_050', 'LR2'],  # Reversion vs breakout
    },
    'LS1_LPP': {
        'LS1': ['LS2', 'LS1_LS2_050', 'LS1_LPP_050', 'LPP'],  # Reversion vs breakdown
    },
}

all_results = {}

for zone in key_zones:
    if zone not in df_daily['opening_zone'].values:
        continue
    
    zone_data = df_daily[df_daily['opening_zone'] == zone]
    n_zone = len(zone_data)
    
    print(f'\n{"="*100}')
    print(f'ZONE: {zone} (N = {n_zone} days total)')
    print(f'{"="*100}')
    
    all_results[zone] = {}
    
    for condition in key_conditions[zone]:
        print(f'\n  CONDITION: {condition}')
        print(f'  {"-"*95}')
        
        # Calculate for each regime
        regime_results = {}
        
        for regime in ['Q1_Quiet', 'Q2_Neutral', 'Q3_Neutral', 'Q4_Spicy']:
            df_regime = calculate_conditional_probs_by_regime(df_daily, zone, condition, regime)
            if df_regime is not None:
                regime_results[regime] = df_regime
        
        # Also calculate for ALL regimes combined (baseline)
        df_all = calculate_conditional_probs_by_regime(df_daily, zone, condition, None)
        if df_all is not None:
            regime_results['ALL'] = df_all
        
        all_results[zone][condition] = regime_results
        
        # Display comparison for key targets
        if condition in key_targets.get(zone, {}):
            targets_to_show = key_targets[zone][condition]
            
            print(f'\n  {"Target":<20} {"Q1 (Quiet)":>12} {"Q2":>12} {"Q3":>12} {"Q4 (Spicy)":>12} {"ALL":>12} {"Q1-Q4 Delta":>15}')
            print(f'  {"-"*110}')
            
            for target in targets_to_show:
                probs = {}
                for regime in ['Q1_Quiet', 'Q2_Neutral', 'Q3_Neutral', 'Q4_Spicy', 'ALL']:
                    if regime in regime_results:
                        df_r = regime_results[regime]
                        row = df_r[df_r['target'] == target]
                        if len(row) > 0:
                            probs[regime] = row.iloc[0]['prob_conditional']
                        else:
                            probs[regime] = np.nan
                    else:
                        probs[regime] = np.nan
                
                # Calculate delta Q1 - Q4
                if not np.isnan(probs.get('Q1_Quiet', np.nan)) and not np.isnan(probs.get('Q4_Spicy', np.nan)):
                    delta = probs['Q1_Quiet'] - probs['Q4_Spicy']
                else:
                    delta = np.nan
                
                # Format output
                q1_str = f"{probs.get('Q1_Quiet', np.nan):.0%}" if not np.isnan(probs.get('Q1_Quiet', np.nan)) else "N/A"
                q2_str = f"{probs.get('Q2_Neutral', np.nan):.0%}" if not np.isnan(probs.get('Q2_Neutral', np.nan)) else "N/A"
                q3_str = f"{probs.get('Q3_Neutral', np.nan):.0%}" if not np.isnan(probs.get('Q3_Neutral', np.nan)) else "N/A"
                q4_str = f"{probs.get('Q4_Spicy', np.nan):.0%}" if not np.isnan(probs.get('Q4_Spicy', np.nan)) else "N/A"
                all_str = f"{probs.get('ALL', np.nan):.0%}" if not np.isnan(probs.get('ALL', np.nan)) else "N/A"
                delta_str = f"{delta:+.0%}" if not np.isnan(delta) else "N/A"
                
                print(f"  {target:20} {q1_str:>12} {q2_str:>12} {q3_str:>12} {q4_str:>12} {all_str:>12} {delta_str:>15}")
            
            # Show sample sizes
            print(f'\n  Sample sizes (N days where condition reached):')
            for regime in ['Q1_Quiet', 'Q2_Neutral', 'Q3_Neutral', 'Q4_Spicy', 'ALL']:
                if regime in regime_results:
                    n = regime_results[regime].iloc[0]['n_condition'] if len(regime_results[regime]) > 0 else 0
                    print(f'    {regime:12} N = {n}')

print(f'\n{"="*100}')
print('[OK] Regime comparison complete')
print(f'{"="*100}')


[STEP 8] Analyze Key Zones - Compare Across Regimes

ZONE: LPP_LR1 (N = 680 days total)

  CONDITION: LPP
  -----------------------------------------------------------------------------------------------

  CONDITION: LPP_LR1_050
  -----------------------------------------------------------------------------------------------

  CONDITION: LR1
  -----------------------------------------------------------------------------------------------

  Target                 Q1 (Quiet)           Q2           Q3   Q4 (Spicy)          ALL     Q1-Q4 Delta
  --------------------------------------------------------------------------------------------------------------
  LPP                           20%          28%          32%          16%          24%             +4%
  LPP_LR1_050                    1%           0%           0%           0%           0%             +1%
  LR1_LR2_050                   73%          70%          77%          82%          75%             -9%
  LR2                    

## Step 9: Interpretation - Does Regime Matter?

**How to read the results:**

1. **Q1-Q4 Delta = POSITIVE (e.g., +20%):**
   - In quiet markets (Q1), this target is reached MORE often after condition
   - Example: P(LPP | LR1) higher in Q1 → stronger mean reversion in quiet markets
   - **Trading implication:** Fade breakouts in quiet markets

2. **Q1-Q4 Delta = NEGATIVE (e.g., -15%):**
   - In spicy markets (Q4), this target is reached MORE often after condition
   - Example: P(LR2 | LR1) higher in Q4 → stronger momentum in spicy markets
   - **Trading implication:** Follow breakouts in spicy markets

3. **Q1-Q4 Delta = SMALL (e.g., ±5%):**
   - Regime doesn't significantly affect this probability
   - Use the ALL (combined) probability for trading decisions

**Key Patterns to Look For:**
- **Mean reversion stronger in Q1:** P(reversion_target | breakout_level) higher in Q1
- **Momentum stronger in Q4:** P(continuation_target | breakout_level) higher in Q4
- **Symmetric across regimes:** No delta → use regime-agnostic probabilities
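Whether a Q1-Q4 delta is distinguishable from noise depends on the sample sizes behind it. One common way to check is a pooled two-proportion z-test, sketched below with the standard library only. This is an optional addition rather than part of the analysis above: the function name `two_proportion_z` and the counts are hypothetical, the test assumes days are independent, and it does not correct for the many zone/condition/target combinations being compared.

```python
from math import sqrt, erfc

def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z-test; returns (delta, z, two-sided p-value)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        # Both groups are all hits or all misses: no evidence of a difference
        return p1 - p2, 0.0, 1.0
    z = (p1 - p2) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = erfc(abs(z) / sqrt(2))
    return p1 - p2, z, p_value

# Hypothetical counts: target reached on 30 of 60 Q1 days vs 18 of 60 Q4 days
delta, z, p_value = two_proportion_z(30, 60, 18, 60)
print(f'delta = {delta:+.0%}, z = {z:.2f}, p = {p_value:.3f}')
```

With small per-regime samples, even a delta above 15 percentage points can fail such a test, which is one reason to read the deltas together with `n_condition`.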

In [9]:
print('\n[STEP 9] Interpretation Guide')
print('='*80)
print('''
HOW TO USE THESE RESULTS:

1. IDENTIFY REGIME-SPECIFIC EDGES:
   - Look for large Q1-Q4 deltas (>15%)
   - Positive delta → Q1 (Quiet) has higher probability
   - Negative delta → Q4 (Spicy) has higher probability

2. TRADING APPLICATIONS:
   
   Example A: Mean Reversion Edge in Quiet Markets
   --------------------------------------------------
   Zone: LPP_LR1, Condition: LR1
   Target: LPP (mean reversion)
   Q1: 75%, Q4: 45%, Delta: +30%
   
   → In quiet markets (Q1), after hitting LR1, price reverts to LPP 75% of the time
   → In spicy markets (Q4), only 45% reversion rate
   → TRADE: Fade LR1 breakouts in Q1, avoid fading in Q4
   
   Example B: Momentum Edge in Spicy Markets
   --------------------------------------------------
   Zone: LPP_LR1, Condition: LR1
   Target: LR2 (continuation)
   Q1: 25%, Q4: 55%, Delta: -30%
   
   → In quiet markets (Q1), only 25% continuation to LR2
   → In spicy markets (Q4), 55% continuation rate
   → TRADE: Follow LR1 breakouts in Q4, avoid chasing in Q1

3. WHEN TO IGNORE REGIME:
   - If |Q1-Q4 delta| < 10% → regime doesn't matter much
   - Use the ALL (combined) probability for trading decisions
   - Focus on temporal + spatial conditional probabilities only

4. SAMPLE SIZE CHECK:
   - Only trust results with N >= 20 days per regime
   - If sample size too small, results may be noise
   - Combine Q2+Q3 if needed to increase sample size
''')
print('='*80)


[STEP 9] Interpretation Guide

HOW TO USE THESE RESULTS:

1. IDENTIFY REGIME-SPECIFIC EDGES:
   - Look for large Q1-Q4 deltas (>15%)
   - Positive delta → Q1 (Quiet) has higher probability
   - Negative delta → Q4 (Spicy) has higher probability

2. TRADING APPLICATIONS:

   Example A: Mean Reversion Edge in Quiet Markets
   --------------------------------------------------
   Zone: LPP_LR1, Condition: LR1
   Target: LPP (mean reversion)
   Q1: 75%, Q4: 45%, Delta: +30%

   → In quiet markets (Q1), after hitting LR1, price reverts to LPP 75% of the time
   → In spicy markets (Q4), only 45% reversion rate
   → TRADE: Fade LR1 breakouts in Q1, avoid fading in Q4

   Example B: Momentum Edge in Spicy Markets
   --------------------------------------------------
   Zone: LPP_LR1, Condition: LR1
   Target: LR2 (continuation)
   Q1: 25%, Q4: 55%, Delta: -30%

   → In quiet markets (Q1), only 25% continuation to LR2
   → In spicy markets (Q4), 55% continuation rate
   → TRADE: Follow LR1 bre

## Step 10: Export Summary

In [10]:
print('\n[STEP 10] Export Summary')
print('='*80)

export_rows = []

for zone, conditions_dict in all_results.items():
    for condition, regime_dict in conditions_dict.items():
        for regime, df_result in regime_dict.items():
            for _, row in df_result.iterrows():
                export_rows.append({
                    'zone': zone,
                    'condition': condition,
                    'regime': regime,
                    'target': row['target'],
                    'prob_conditional': row['prob_conditional'],
                    'count_sequential': row['count_sequential'],
                    'n_condition': row['n_condition'],
                })

if len(export_rows) > 0:
    df_export = pd.DataFrame(export_rows)
    
    print(f'[OK] Summary table: {len(df_export)} rows')
    
    # Find strongest regime-specific edges (largest Q1-Q4 deltas)
    # Pivot to compare Q1 vs Q4
    df_pivot = df_export.pivot_table(
        index=['zone', 'condition', 'target'],
        columns='regime',
        values='prob_conditional'
    ).reset_index()
    
    if 'Q1_Quiet' in df_pivot.columns and 'Q4_Spicy' in df_pivot.columns:
        df_pivot['Q1_Q4_Delta'] = df_pivot['Q1_Quiet'] - df_pivot['Q4_Spicy']
        df_pivot = df_pivot.dropna(subset=['Q1_Q4_Delta'])
        
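        # Note: the header says 15, but each direction below lists up to 10 rows.
        print(f'\nTop 15 strongest regime effects (Q1-Q4 delta):')
        # Caution: a positive delta only means Q1 > Q4. Whether that is mean reversion or
        # continuation depends on which side of the condition level the target lies.
        print('\nMEAN REVERSION EDGES (Q1 > Q4, positive delta):')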
        top_positive = df_pivot.nlargest(10, 'Q1_Q4_Delta')[['zone', 'condition', 'target', 'Q1_Quiet', 'Q4_Spicy', 'Q1_Q4_Delta']]
        if len(top_positive) > 0:
            for _, row in top_positive.iterrows():
                print(f"  {row['zone']:12} | {row['condition']:15} → {row['target']:15} | Q1: {row['Q1_Quiet']:.0%}, Q4: {row['Q4_Spicy']:.0%}, Δ: {row['Q1_Q4_Delta']:+.0%}")
        
        print(f'\nMOMENTUM EDGES (Q4 > Q1, negative delta):')
        top_negative = df_pivot.nsmallest(10, 'Q1_Q4_Delta')[['zone', 'condition', 'target', 'Q1_Quiet', 'Q4_Spicy', 'Q1_Q4_Delta']]
        if len(top_negative) > 0:
            for _, row in top_negative.iterrows():
                print(f"  {row['zone']:12} | {row['condition']:15} → {row['target']:15} | Q1: {row['Q1_Quiet']:.0%}, Q4: {row['Q4_Spicy']:.0%}, Δ: {row['Q1_Q4_Delta']:+.0%}")
    
    # Optionally save
    # df_export.to_csv('../../output/local_pivot_conditional_by_regime.csv', index=False)
    # print(f'\n[OK] Saved to output/local_pivot_conditional_by_regime.csv')
else:
    print('[WARNING] No results to export')

print('\n[COMPLETE] Local Pivot Conditional Probabilities by Regime Analysis Finished')
print('[NEXT STEP] Review Q1-Q4 deltas to identify regime-specific edges for backtesting')
print('='*80)


[STEP 10] Export Summary
[OK] Summary table: 750 rows

Top 15 strongest regime effects (Q1-Q4 delta):

MEAN REVERSION EDGES (Q1 > Q4, positive delta):
  LS1_LPP      | LPP             → LR3             | Q1: 47%, Q4: 29%, Δ: +18%
  LS1_LPP      | LPP             → LR2_LR3_075     | Q1: 50%, Q4: 32%, Δ: +18%
  LS1_LPP      | LPP             → LR2_LR3_050     | Q1: 51%, Q4: 37%, Δ: +14%
  LPP_LR1      | LPP_LR1_050     → LR2_LR3_075     | Q1: 55%, Q4: 41%, Δ: +14%
  LPP_LR1      | LPP_LR1_050     → LR3             | Q1: 49%, Q4: 36%, Δ: +13%
  LPP_LR1      | LPP             → LR2_LR3_075     | Q1: 45%, Q4: 33%, Δ: +13%
  LS1_LPP      | LPP             → LR2             | Q1: 61%, Q4: 49%, Δ: +12%
  LS1_LPP      | LPP             → LR2_LR3_025     | Q1: 56%, Q4: 44%, Δ: +12%
  LPP_LR1      | LPP             → LR3             | Q1: 41%, Q4: 29%, Δ: +12%
  LPP_LR1      | LR1             → LR2_LR3_075     | Q1: 63%, Q4: 51%, Δ: +12%

MOMENTUM EDGES (Q4 > Q1, negative delta):
  LPP_LR1      