# Experiment: Signal Assessment Comparison (30d vs 180d)

**Objective**: For each queued signal, compare backtest results over a recent 30-day window against a longer 180-day window, to tell signals that hold up from signals that are fading.

**Success criteria**: Identify which signals are robust across windows and which are likely overfitting recent noise.


In [None]:
# Setup: imports and paths
from __future__ import annotations

from pathlib import Path
import pandas as pd

# Prefer absolute path; fall back to cwd/data if needed
DATA_DIR = Path('/Users/ducjeremyvu/cdx-trade/data')
if not DATA_DIR.exists():
    DATA_DIR = Path.cwd() / 'data'
DATA_DIR


## Plan

- Hypothesis: Signals whose `avg_r` and `median_r` stay close between the 30d and 180d windows are more reliable than signals that look good in only one window.
- Variables: window (30d vs 180d), symbol, setup.
- Metrics: trades, win_rate, avg_r, median_r, best_r, worst_r.
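
One way to make the hypothesis above testable is to put the two windows side by side and flag signals whose `avg_r` moves by more than some tolerance. The sketch below is illustrative only: the symbols, setup name, and numbers are made up, and the `TOLERANCE` threshold is an assumption, not a value taken from this project.

```python
import pandas as pd

# Hypothetical rows shaped like the summary table; NOT results from this experiment.
demo = pd.DataFrame([
    {"symbol": "AAA", "setup": "DemoSetup", "window": 30, "avg_r": 0.60},
    {"symbol": "AAA", "setup": "DemoSetup", "window": 180, "avg_r": 0.10},
    {"symbol": "BBB", "setup": "DemoSetup", "window": 30, "avg_r": 0.25},
    {"symbol": "BBB", "setup": "DemoSetup", "window": 180, "avg_r": 0.20},
])

# One row per signal, one column per window.
wide = demo.pivot_table(index=["symbol", "setup"], columns="window", values="avg_r")

# Positive delta: the recent window looks better than the long one.
wide["delta"] = wide[30] - wide[180]

# Assumed threshold (in R) for calling a signal stable across windows.
TOLERANCE = 0.25
wide["stable"] = wide["delta"].abs() <= TOLERANCE

print(wide)
```

In this made-up data, `AAA` would be flagged as unstable (its 30d `avg_r` is far above its 180d value) while `BBB` would count as stable. The same check could be repeated for `median_r`, and a low trade count in the 30d window is a reason to read any delta with caution.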


In [None]:
# Load the assessment CSVs generated by assess-signal
entries = [
    {"symbol": "QQQ", "setup": "MeanReversion_D1", "window": 30},
    {"symbol": "QQQ", "setup": "MeanReversion_D1", "window": 180},
    {"symbol": "XLP", "setup": "TwoDayBreakout_D1", "window": 30},
    {"symbol": "XLP", "setup": "TwoDayBreakout_D1", "window": 180},
    {"symbol": "KRE", "setup": "PrevDayBreakout_D1", "window": 30},
    {"symbol": "KRE", "setup": "PrevDayBreakout_D1", "window": 180},
    {"symbol": "EEM", "setup": "MeanReversion_D1", "window": 30},
    {"symbol": "EEM", "setup": "MeanReversion_D1", "window": 180},
    {"symbol": "IEMG", "setup": "MeanReversion_D1", "window": 30},
    {"symbol": "IEMG", "setup": "MeanReversion_D1", "window": 180},
]

METRICS = ("win_rate", "avg_r", "median_r", "best_r", "worst_r")

def load_summary(symbol: str, setup: str, window: int) -> dict:
    """Summarize one assessment CSV; metrics are NaN when there is no data."""
    path = DATA_DIR / f"backtest_assess_{symbol}_{setup}_{window}d.csv"
    row = {"symbol": symbol, "setup": setup, "window": window, "trades": 0}
    # NaN rather than 0.0, so a missing or empty file is not mistaken
    # for a break-even signal in the table or the plot
    empty = {m: float("nan") for m in METRICS}
    if not path.exists():
        return {**row, **empty, "missing_file": str(path)}
    df = pd.read_csv(path)
    if df.empty:
        return {**row, **empty}
    r = df["r_multiple"]
    return {
        **row,
        "trades": len(df),
        "win_rate": float((df["outcome"] == "win").mean()),
        "avg_r": float(r.mean()),
        "median_r": float(r.median()),
        "best_r": float(r.max()),
        "worst_r": float(r.min()),
    }

rows = [load_summary(e["symbol"], e["setup"], e["window"]) for e in entries]
summary = pd.DataFrame(rows).sort_values(["symbol", "window"])
summary


## What avg_r and median_r mean

Each trade has an **R-multiple** (profit/loss in units of risk). In this codebase:
- `risk = entry_price - stop_price`
- If stop is hit, `r_multiple = -1`
- If target is hit, `r_multiple = risk_multiple` (default 2.0)
- If time stop exits, `r_multiple = (exit_price - entry_price) / risk`

Then:
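- **avg_r** is the mean of `r_multiple` across trades (expectancy per trade).
- **median_r** is the median of `r_multiple`. It is less sensitive to outliers than the mean, so a large gap between `avg_r` and `median_r` points to a skewed distribution (for example, a few large winners carrying many small losers).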
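
The exit rules and the two statistics above can be sketched in a few lines. This is a minimal illustration, not the codebase's implementation: it assumes long trades (stop below entry), the `exit_reason` labels `"stop"`, `"target"`, and `"time"` are hypothetical names, and the trades are made-up numbers.

```python
from statistics import mean, median

def r_multiple(entry_price, stop_price, exit_price, exit_reason, risk_multiple=2.0):
    """R-multiple for a long trade, following the three exit rules listed above."""
    risk = entry_price - stop_price
    if risk <= 0:
        raise ValueError("stop_price must be below entry_price for a long trade")
    if exit_reason == "stop":      # stop hit: lose exactly one unit of risk
        return -1.0
    if exit_reason == "target":    # target hit: gain risk_multiple units of risk
        return risk_multiple
    # time stop: profit or loss measured in units of risk
    return (exit_price - entry_price) / risk

# Hypothetical trades: (entry_price, stop_price, exit_price, exit_reason)
trades = [
    (100.0, 98.0, 98.0, "stop"),
    (100.0, 98.0, 104.0, "target"),
    (100.0, 98.0, 101.0, "time"),
    (50.0, 49.0, 49.0, "stop"),
]

r_values = [r_multiple(*t) for t in trades]
avg_r = mean(r_values)
median_r = median(r_values)

print(r_values)   # [-1.0, 2.0, 0.5, -1.0]
print(avg_r)      # 0.125
print(median_r)   # -0.25
```

Here `avg_r` is positive while `median_r` is negative: the single target hit lifts the mean, but the typical trade still loses. That gap is the kind of skew the median is meant to expose.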


In [None]:
# Comparison table and simple visualization
display(summary)

try:
    import matplotlib.pyplot as plt

    pivot = summary.pivot_table(index=["symbol", "setup"], columns="window", values="avg_r")
    pivot = pivot.rename(columns={30: 'avg_r_30d', 180: 'avg_r_180d'})
    ax = pivot.plot(kind='bar', figsize=(10, 4), title='avg_r: 30d vs 180d')
    ax.set_ylabel('avg_r')
    plt.tight_layout()
except Exception as exc:
    print('Plot skipped:', exc)


## Results / Notes

- Key observations:
- Surprises or failure modes:
- Decision: continue, pivot, or stop:
