# Tutorial: Crypto Premium Regime Gating Final Report

This notebook is the final report for the hiring project. It reads the artifacts already written under `reports/` and demonstrates:

- Stablecoin debiasing (`BTCUSDC` vs `BTCUSDT` synthetic premium)
- Robust filtering + regime gating + optional Hawkes contagion gating
- Dedicated on-chain validation feed (DefiLlama stablecoin prices)
- Ablation ladder (`naive`, `debias_only`, `plus_robust`, `plus_regime`, `plus_hawkes`)
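
As a rough illustration of the first bullet, a market-implied stablecoin proxy can be read off the two BTC quotes. This is a minimal sketch, not the pipeline's implementation: the function name `stablecoin_log_proxy`, the sign convention (`log(BTCUSDC / BTCUSDT)`, which has units of USDC per USDT), and the sample prices are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def stablecoin_log_proxy(btc_usdc: pd.Series, btc_usdt: pd.Series) -> pd.Series:
    """Hypothetical helper: market-implied log(USDT/USDC) from two BTC quotes.

    BTCUSDC / BTCUSDT is (USDC per BTC) / (USDT per BTC) = USDC per USDT,
    i.e. the price of one USDT expressed in USDC.
    """
    # Align the two series on their timestamps and drop bars missing either quote.
    aligned = pd.concat({'usdc': btc_usdc, 'usdt': btc_usdt}, axis=1).dropna()
    return np.log(aligned['usdc'] / aligned['usdt']).rename('stablecoin_proxy')


# Made-up prices for illustration only (not real market data).
idx = pd.date_range('2023-03-11', periods=3, freq='1min', tz='UTC')
btc_usdc = pd.Series([20600.0, 20800.0, 21000.0], index=idx)
btc_usdt = pd.Series([20000.0, 20000.0, 20000.0], index=idx)

proxy = stablecoin_log_proxy(btc_usdc, btc_usdt)
print(proxy)
```

Under this convention a positive value means BTC costs more in USDC than in USDT, which is what a USDC discount would look like.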


## 1. Setup


In [None]:
import sys
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display, Image

def find_repo_root(start: Path) -> Path:
    for candidate in (start, *start.parents):
        if (candidate / '.git').exists():
            return candidate
    return start

ROOT = find_repo_root(Path.cwd().resolve())
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))

from src.pipeline import load_config, load_price_matrix, run_pipeline


## 2. Episode Inventory and Metrics Snapshot


In [None]:
EPISODES = {
    'bybit_usdc_depeg_2023': 'Bybit spot - USDC depeg (2023-03-10 to 2023-03-11)',
    'okx_usdc_depeg_2023': 'OKX futures - USDC depeg (2023-03-10 to 2023-03-11)',
    'yen_unwind_2024_binance': 'Binance futures - Yen unwind (2024-08-05 to 2024-08-06)',
    'yen_followthrough_2024_binance': 'Binance futures - Follow-through (2024-08-07 to 2024-08-08)',
    'march_vol_2024_binance': 'Binance futures - March volatility (2024-03-12 to 2024-03-13)',
}

rows = []
for episode_id, label in EPISODES.items():
    metrics_path = ROOT / 'reports' / 'episodes' / episode_id / 'tables' / 'metrics.csv'
    if not metrics_path.exists():
        continue
    metrics = pd.read_csv(metrics_path, index_col=0)
    for variant in ['naive', 'gated']:
        if variant not in metrics.index:
            continue
        row = metrics.loc[variant].to_dict()
        row['episode'] = episode_id
        row['label'] = label
        row['variant'] = variant
        rows.append(row)

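summary_columns = ['episode', 'label', 'variant', 'sharpe', 'pnl_net', 'max_drawdown', 'turnover', 'flip_rate', 'active_ratio']
# reindex (rather than [...]) avoids a KeyError when no metrics.csv was found
# or a metric column is absent; missing columns appear as NaN instead.
summary_df = pd.DataFrame(rows).reindex(columns=summary_columns)
summary_df.sort_values(['episode', 'variant'])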
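
To compare the two variants side by side, a long table of this shape can be pivoted to one row per episode. The sketch below uses a toy frame with made-up Sharpe values and hypothetical episode ids (`ep_a`, `ep_b`); only the column names `episode`, `variant`, and `sharpe` come from the cell above.

```python
import pandas as pd

# Toy data for illustration only; not results from this project.
toy = pd.DataFrame({
    'episode': ['ep_a', 'ep_a', 'ep_b', 'ep_b'],
    'variant': ['naive', 'gated', 'naive', 'gated'],
    'sharpe': [0.5, 1.0, -0.25, 0.25],
})

# One row per episode, one column per variant.
wide = toy.pivot(index='episode', columns='variant', values='sharpe')
# Positive values mean the gated variant scored higher than the naive one.
wide['gated_minus_naive'] = wide['gated'] - wide['naive']
print(wide)
```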


## 3. On-Chain Validation Layer

`onchain_proxy` comes from DefiLlama stablecoin price history (`tether` vs `usd-coin`) and is compared with the market-implied `stablecoin_proxy`.

Key diagnostics:
- `onchain_proxy`
- `onchain_divergence = stablecoin_proxy - onchain_proxy`
- `onchain_depeg_flag`
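
The diagnostics above can be sketched as follows. The divergence formula is the one stated in the list; the flag rule (absolute on-chain proxy above a threshold) and the threshold value are assumptions for illustration, since the pipeline's actual rule is not shown in this notebook.

```python
import pandas as pd

# Hypothetical threshold: 50 bps in log terms. The real pipeline setting may differ.
DEPEG_THRESHOLD = 0.005


def add_onchain_diagnostics(frame: pd.DataFrame, threshold: float = DEPEG_THRESHOLD) -> pd.DataFrame:
    """Hypothetical helper that adds the two derived diagnostic columns."""
    out = frame.copy()
    # Market-implied proxy minus the on-chain reference, as defined above.
    out['onchain_divergence'] = out['stablecoin_proxy'] - out['onchain_proxy']
    # Assumed rule: flag bars where the on-chain proxy is far from parity.
    # Comparisons against NaN evaluate to False, so gaps are never flagged.
    out['onchain_depeg_flag'] = out['onchain_proxy'].abs() > threshold
    return out


# Made-up values for illustration only.
demo = pd.DataFrame({
    'stablecoin_proxy': [0.0010, 0.0300, 0.0020],
    'onchain_proxy': [0.0005, 0.0250, None],
})
diagnostics = add_onchain_diagnostics(demo)
print(diagnostics)
```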


### Proxy Availability & Interpretation Notes

- Debiased premium `p` is most informative on episodes where cross-asset USDC/USDT proxy pairs are available (typical Binance perp windows).
- For episodes with limited cross-asset coverage, this project uses fail-closed defaults (`premium.fail_on_missing_proxy: true`) and skips incompatible windows in multi-episode reports.
- If fail-closed is intentionally disabled for diagnostics, interpret those runs primarily as depeg safety/on-chain validation checks rather than full debiased-premium demonstrations.
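
The fail-closed behaviour described above might look roughly like this. Only the config key `premium.fail_on_missing_proxy` comes from the project; `MissingProxyError`, `resolve_proxy`, and the zero-proxy fallback are hypothetical names and choices made for this sketch.

```python
import pandas as pd


class MissingProxyError(RuntimeError):
    """Raised when an episode has no usable stablecoin proxy (hypothetical name)."""


def resolve_proxy(frame: pd.DataFrame, fail_on_missing_proxy: bool = True,
                  column: str = 'stablecoin_proxy') -> pd.Series:
    """Return the proxy column, or fail closed when it is absent or entirely NaN."""
    has_proxy = column in frame.columns and frame[column].notna().any()
    if has_proxy:
        return frame[column]
    if fail_on_missing_proxy:
        # Fail closed: the caller is expected to skip this window.
        raise MissingProxyError(f'no usable {column!r}; skipping this window')
    # Diagnostic mode (assumed fallback): a zero proxy means no debiasing is applied.
    return pd.Series(0.0, index=frame.index, name=column)


# A window without any proxy column (made-up values).
window = pd.DataFrame({'premium_raw': [0.001, 0.002]})

try:
    resolve_proxy(window)
    skipped = False
except MissingProxyError:
    skipped = True

diagnostic_proxy = resolve_proxy(window, fail_on_missing_proxy=False)
print('skipped under fail-closed:', skipped)
```

Raising instead of silently substituting a default keeps incompatible windows out of multi-episode reports unless someone opts in explicitly.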



In [None]:
def plot_onchain_episode(episode_id: str, title: str):
    signal_path = ROOT / 'reports' / 'episodes' / episode_id / 'tables' / 'signal_frame.csv'
    if not signal_path.exists():
        print(f'Missing signal frame for {episode_id}: {signal_path}')
        return

    frame = pd.read_csv(signal_path)
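    frame['timestamp_utc'] = pd.to_datetime(frame['timestamp_utc'], utc=True, errors='coerce')
    # errors='coerce' turns unparseable timestamps into NaT; drop those rows before indexing
    frame = frame.dropna(subset=['timestamp_utc']).set_index('timestamp_utc').sort_index()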

    fig, axes = plt.subplots(2, 1, figsize=(14, 8), sharex=True)

    axes[0].plot(frame.index, frame['stablecoin_proxy'], label='stablecoin_proxy (market-implied)', color='#003f5c')
    if 'onchain_proxy' in frame.columns:
        axes[0].plot(frame.index, frame['onchain_proxy'], label='onchain_proxy (DefiLlama)', color='#bc5090', alpha=0.8)
    axes[0].set_title(title)
    axes[0].set_ylabel('log(USDT/USDC)')
    axes[0].legend(loc='best')
    axes[0].grid(alpha=0.2)

    if 'onchain_divergence' in frame.columns:
        axes[1].plot(frame.index, frame['onchain_divergence'], color='#ff7f0e', label='onchain_divergence')
    if 'onchain_depeg_flag' in frame.columns:
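        # A CSV column mixing True/False with blanks loads as object dtype; cast to a real boolean mask
        flag = frame['onchain_depeg_flag'].fillna(False).astype(bool)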
        axes[1].fill_between(frame.index, -0.01, 0.01, where=flag, color='#d62728', alpha=0.10, label='onchain_depeg_flag')
    axes[1].set_ylabel('divergence')
    axes[1].set_xlabel('timestamp_utc')
    axes[1].legend(loc='best')
    axes[1].grid(alpha=0.2)

    plt.tight_layout()
    plt.show()


In [None]:
plot_onchain_episode('bybit_usdc_depeg_2023', 'Bybit USDC depeg window (March 2023)')
plot_onchain_episode('okx_usdc_depeg_2023', 'OKX USDC depeg window (March 2023)')


## 4. Ablation Ladder (Automated)

Expected variants:

- `naive`
- `debias_only`
- `plus_robust`
- `plus_regime`
- `plus_hawkes`

You can regenerate with:

`python -m src.ablation_report --price-matrix <matrix_path> --output-dir reports/tables`
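
One way to think about the ladder is that each variant switches on one more stage than the previous one. The sketch below builds that mapping; the variant names come from the list above, while the stage flag names (`debias`, `robust`, `regime`, `hawkes`) are hypothetical and not taken from `src.ablation_report`.

```python
# Variant names as listed above, in ladder order.
LADDER = ['naive', 'debias_only', 'plus_robust', 'plus_regime', 'plus_hawkes']
# Hypothetical stage flags; the real config keys may be named differently.
STAGES = ['debias', 'robust', 'regime', 'hawkes']


def ablation_flags() -> dict:
    """Rung i of the ladder enables the first i stages and disables the rest."""
    return {
        variant: {stage: j < i for j, stage in enumerate(STAGES)}
        for i, variant in enumerate(LADDER)
    }


flags = ablation_flags()
for variant, enabled in flags.items():
    print(variant, enabled)
```

Because the stages are cumulative, the difference in metrics between two adjacent rungs can be attributed to the single stage that was added.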


In [None]:
ablation_path = ROOT / 'reports' / 'tables' / 'ablation_metrics.csv'
if ablation_path.exists():
    ablation_df = pd.read_csv(ablation_path, index_col=0)
    display(ablation_df)
else:
    print('No ablation_metrics.csv found yet. Run src.ablation_report first.')


## 5. Gating Parameter Tuning Results (2024 episodes)


In [None]:
tuning_files = sorted((ROOT / 'reports' / 'tables').glob('gating_tuning_*.csv'))
if tuning_files:
    # sorted() orders by filename, so "latest" assumes the names embed a sortable timestamp
    latest = tuning_files[-1]
    tuning_df = pd.read_csv(latest)
    print('Latest tuning file:', latest)
    display(tuning_df.head(15))
else:
    print('No tuning output found. Run src.tune_gating first.')


## 6. Figure Gallery (Latest Episode Exports)


In [None]:
def show_episode_figures(episode_id: str):
    fig_dir = ROOT / 'reports' / 'episodes' / episode_id / 'figures'
    if not fig_dir.exists():
        print(f'Missing figure folder: {fig_dir}')
        return
    for name in ['figure_1_timeline.png', 'figure_2_panel.png', 'figure_3_phase_space.png']:
        p = fig_dir / name
        if p.exists():
            display(Image(filename=str(p)))
        else:
            print('Missing', p)

show_episode_figures('yen_unwind_2024_binance')


## 7. Final Checklist vs Notice + Hawkes

- [x] Debiased premium and depeg logic
- [x] Robust filter, event process, stat-mech variables
- [x] Regime gating and decision stack
- [x] Optional Hawkes contagion gating
- [x] Dedicated on-chain validation feed integrated into pipeline
- [x] Automated ablation report script
- [x] Multi-source episode loaders (Binance, Bybit, OKX)


## 8. Final Presentation Pack

The polished artifacts generated by `src.presentation_pack` and `src.calibration_report` are loaded below.


In [None]:
final_dir = ROOT / 'reports' / 'final'
print('final_dir:', final_dir)

files = [
    'final_episode_metrics_long.csv',
    'final_episode_metrics_wide.csv',
    'final_onchain_snapshot.csv',
    'calibration_details.csv',
    'calibration_aggregate.csv',
    'executive_summary.md',
]
for f in files:
    p = final_dir / f
    print(f, '->', 'OK' if p.exists() else 'MISSING')


In [None]:
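metrics_long_path = ROOT / 'reports' / 'final' / 'final_episode_metrics_long.csv'
calib_path = ROOT / 'reports' / 'final' / 'calibration_details.csv'

# Guard each read so a file reported MISSING above does not abort the cell
metrics_long = pd.read_csv(metrics_long_path) if metrics_long_path.exists() else None
calib = pd.read_csv(calib_path) if calib_path.exists() else None
for path, table in [(metrics_long_path, metrics_long), (calib_path, calib)]:
    if table is None:
        print('Missing', path)
    else:
        display(table)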


In [None]:
for fig_name in ['sharpe_naive_vs_gated.png', 'pnl_naive_vs_gated.png', 'fliprate_naive_vs_gated.png']:
    p = ROOT / 'reports' / 'final' / 'figures' / fig_name
    if p.exists():
        display(Image(filename=str(p)))
    else:
        print('Missing', p)
