In [None]:
"""
================================================================================
Trailing Stop-Loss (CAPPED) — OOS Full-Year Simulation (Step8 + Step9) — Next-Day Only
================================================================================

PURPOSE
-------
This script evaluates a trailing stop-loss overlay on existing trade records using
15-minute resampled bars built from raw M5 data.

Key design:
- Trades are assumed to be entered at/near end-of-day (EOD).
- An INITIAL stop-loss is placed at the entry close (>= 1% distance by rule).
- Stop monitoring / trailing begins from the NEXT trading day only (no same-day trailing).

The output compares:
- Original exit (as per trade file)
vs
- Exit produced by Initial SL / Trailing SL logic

and reports both trade-level results and summary metrics.

--------------------------------------------------------------------------------
INPUTS
------
1) TRADE_CSV_PATH
   Trade-level data with columns (case-sensitive as used in code):
     - TradeID
     - Currency
     - Direction
     - date
     - price
     - Opening Date
     - Closing Date
   Notes:
   - Script filters trades to YEAR_FILTER based on the 'date' column year.
   - TradeID may have multiple rows; script aggregates to 1 row per TradeID.

2) M5 raw price data files (CSV)
   Each ticker file must contain columns:
     - date, open, high, low, close
   The script:
     - parses 'date' to datetime
     - filters market hours (MARKET_START to MARKET_END)
     - resamples to 15-minute OHLC (RESAMPLE_RULE)
     - computes ATR(ATR_PERIOD) on resampled bars

3) M5 file list (M5_FILE_LIST_PATH)
   CSV with columns:
     - ticker, filename
   Used to locate each ticker's raw M5 data file under M5_DIR.

4) Instrument mapping (MAPPING_XLSX_PATH)
   Excel with columns:
     - org_symbol  (ticker name used in M5 files)
     - Symbol      (trade symbol / currency, e.g., "EUR/USD")
   Mapping logic:
     - TradeBase = base symbol from Currency/Symbol (before "/")
     - TickerNorm = mapped to M5 ticker via org_symbol

5) Step8 / Step9 best configs
   - best_config_step8.csv (expects first row keys):
       ATAN_LONG, ATAN_SHORT, MAX_STOP_LONG, MAX_STOP_SHORT
   - best_config_step9.csv (expects first row keys, defaults to 1.0 if missing):
       SENS_LONG, SENS_SHORT
   These are merged into one config dict per run.

--------------------------------------------------------------------------------
CORE LOGIC (HIGH LEVEL)
----------------------
A) Trade aggregation
   For each TradeID (within YEAR_FILTER):
   - EntryDate  = first non-null Opening Date
   - ExitDate   = first non-null Closing Date
   - EntryPrice = last available 'price' row on EntryDate
   - ExitPrice  = last available 'price' row on ExitDate
   Eligibility filters:
   - Must have mapped ticker available in M5 file list
   - Direction must be LONG or SHORT
   - Entry/Exit dates and prices must exist
   - ExitDate must be strictly after EntryDate (at least 1-day window)

B) Price series preparation per ticker (cached)
   For each eligible ticker:
   - load raw M5 CSV
   - filter to market hours [MARKET_START, MARKET_END]
   - resample to 15-min OHLC
   - compute ATR(ATR_PERIOD) as a simple rolling mean of true range (default period 14)
   The resampled series is cached so multiple trades reuse it.

C) Initial Stop-Loss (placed at entry close)
   - Initial stop distance percentage must be >= INITIAL_MIN_STOP_PCT (>= 1%).
   - Attempt priority:
       1) adaptive_stop_distance_pct() computed from historical bars up to entry close
       2) ATR% * ATR_FALLBACK_MULT
       3) INITIAL_MIN_STOP_PCT
   - Always capped by MAX_STOP_LONG/MAX_STOP_SHORT from Step8; because the cap is
     applied last, it overrides the 1% minimum if the cap is ever set below it.

D) Trailing Stop-Loss (NEXT DAY ONLY)
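   - Monitoring starts at: EntryDate + 1 calendar day @ MARKET_START, i.e. the first
     available bar at or after that time (days without bars are skipped naturally).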
   - For each 15-min bar in the monitoring window:
       1) Check stop hit FIRST:
          * If TRIGGER_ON_CLOSE=True -> stop is evaluated on bar close
          * Else -> evaluated on low (long) / high (short)
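          If hit before any trailing update -> ExitReason="InitialSL"
          If hit after trailing updates -> ExitReason="TrailingSL"
          In both cases the exit is booked at the stop level itself, not at the bar's
          close/low/high, so a close or gap beyond the stop is not penalised.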
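       2) Tighten stop using adaptive_stop_distance_pct():
          - Fits a quadratic to the last REG_WINDOW closes to detect the slope regime.
          - Only bars inside the monitoring window are passed in, so no tightening is
            possible until REG_WINDOW bars have accumulated after monitoring starts.
          - If |atan(normalized slope)| exceeds the ATAN threshold: uses a slope-based
            stop distance (scaled by sensitivity).
          - Else uses ATR fallback (ATR% * ATR_FALLBACK_MULT).
          - Trailing stop minimum distance is MIN_STOP_PCT (can be tighter than 1%).
          - The stop only moves in the trade's favour (up for longs, down for shorts).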
   - If no stop hit, exit remains the original exit ("OriginalExit").

E) Outputs
   - trades_full_{YEAR_FILTER}.csv
       Trade-by-trade results: initial stop, exit reason, trailing exit, improvement metrics
   - summary_full_{YEAR_FILTER}.csv
       Aggregated metrics for ALL trades and by Direction:
         Trades, SLTriggered, PctTriggered, Avg impact, etc.
   - debug_reasons_full_{YEAR_FILTER}.csv + reason_counts_full_{YEAR_FILTER}.csv
       Diagnostics / exit reason counts

--------------------------------------------------------------------------------
IMPORTANT SETTINGS / ASSUMPTIONS
--------------------------------
- MARKET_START / MARKET_END define the intraday window used for resampling and simulation.
- RESAMPLE_RULE selects 15-minute candles.
- APPLY_FROM_NEXT_DAY_ONLY=True documents that there is NO same-day monitoring or trailing;
  the simulation hard-codes this behaviour and does not read the flag.
- TRIGGER_ON_CLOSE=True means stop triggers on 15-min CLOSE (not intrabar extremes).
  Set to False to trigger on LOW/HIGH (more realistic for stop execution).
- Timezone: the script assumes the timestamps in M5 'date' column are already aligned
  with the intended MARKET_START / MARKET_END clock times.

--------------------------------------------------------------------------------
HOW TO RUN (MINIMUM STEPS)
-------------------------
1) Update CONFIG paths:
   - TRADE_CSV_PATH
   - M5_DIR, M5_FILE_LIST_PATH
   - MAPPING_XLSX_PATH
   - STEP8_DIR / STEP9_DIR (best_config files)
   - YEAR_FILTER (or remove year filter if needed)

2) Ensure required columns exist in the trade CSV and M5 raw files.

3) Run:
   python your_script.py

--------------------------------------------------------------------------------
"""

In [None]:
import os
import numpy as np
import pandas as pd
from dataclasses import dataclass
from datetime import timedelta

# ============================================================
# CONFIG (EDIT THESE PATHS)
# ============================================================

TRADE_CSV_PATH = r"D:/work/Client/Maatra/Trade Level Data/Equities 14 vol trade data (Jan 2021 to Nov 2025).csv"

M5_DIR            = r"D:/work/Client/Maatra/Trade Level Data/M5 Raw data"
M5_FILE_LIST_PATH = r"D:/work/Client/Maatra/Trade Level Data/M5 Raw data/m5_file_list.csv"
MAPPING_XLSX_PATH = r"D:/work/Client/Maatra/Trade Level Data/Instrument Mapping.xlsx"

# Step 8 base params (dir-specific)
STEP8_DIR = r"D:/work/Client/Maatra/Trade Level Data/TrailingSL_CAPPED_MKT1430_2100_SME_15T_STEP8_DIRSPEC"
BEST_STEP8_PATH = os.path.join(STEP8_DIR, "best_config_step8.csv")

# Step 9 sensitivity
STEP9_DIR = r"D:/work/Client/Maatra/Trade Level Data/TrailingSL_CAPPED_MKT1430_2100_SME_15T_STEP9_SENSITIVITY"
BEST_STEP9_PATH = os.path.join(STEP9_DIR, "best_config_step9.csv")

YEAR_FILTER = 2025

OUT_DIR = os.path.join(os.path.dirname(TRADE_CSV_PATH), f"TrailingSL_OOS_{YEAR_FILTER}_FULLYEAR_STEP8STEP9_NEXTDAY_ONLY_INITIALSL")
os.makedirs(OUT_DIR, exist_ok=True)

# ============================================================
# Locked settings (same as pilot)
# ============================================================
MARKET_START = "14:30"
MARKET_END   = "21:00"
RESAMPLE_RULE = "15min"  # 15-minute bars; the older "15T" alias is deprecated in pandas >= 2.2

ATR_PERIOD = 14
REG_WINDOW = 16
DELAY_MIN = 0

TRIGGER_ON_CLOSE = True
ATR_FALLBACK_MULT = 3.0

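# Requirement: the initial SL must be at least 1% away from the entry price.
INITIAL_MIN_STOP_PCT = 0.01   # 1%
# Trailing minimum is kept as before; it may be tighter than the initial minimum.
MIN_STOP_PCT = 0.0010         # 0.10%

# Next-day-only rule. NOTE: this flag is informational only; simulate_trade()
# always starts monitoring on the day after entry and never reads it.
APPLY_FROM_NEXT_DAY_ONLY = True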

# Progress printing frequency
PRINT_EVERY = 500

# ============================================================
# Robust CSV reader
# ============================================================
def read_trade_csv(path: str) -> pd.DataFrame:
    for enc in ["utf-8", "cp1252", "latin1"]:
        try:
            return pd.read_csv(path, encoding=enc, low_memory=False)
        except UnicodeDecodeError:
            continue
    return pd.read_csv(path, encoding="utf-8", encoding_errors="ignore", low_memory=False)

# ============================================================
# Helpers
# ============================================================
def normalize_ticker(x: str) -> str:
    if x is None or (isinstance(x, float) and np.isnan(x)):
        return ""
    x = str(x).strip().upper()
    return "".join([ch for ch in x if ch.isalnum()])

def trade_base_from_currency(currency: str) -> str:
    if currency is None or (isinstance(currency, float) and np.isnan(currency)):
        return ""
    s = str(currency).strip().upper()
    base = s.split("/")[0].strip() if "/" in s else s
    return normalize_ticker(base)

def normalize_direction(x) -> str:
    if x is None or (isinstance(x, float) and np.isnan(x)):
        return "UNKNOWN"
    s = str(x).strip().upper()
    if s in ["LONG", "BUY", "B", "1", "1.0", "+1", "+1.0"]:
        return "LONG"
    if s in ["SHORT", "SELL", "S", "-1", "-1.0"]:
        return "SHORT"
    try:
        v = float(s)
        if v > 0: return "LONG"
        if v < 0: return "SHORT"
    except Exception:
        pass
    return "UNKNOWN"

def _todelta_hhmm(hhmm: str) -> pd.Timedelta:
    return pd.to_timedelta(hhmm + ":00")

# ============================================================
# Load mapping file
# ============================================================
def load_trade_to_m5_mapping(mapping_xlsx_path: str) -> dict:
    mp = pd.read_excel(mapping_xlsx_path)
    mp.columns = [str(c).strip() for c in mp.columns]  # Excel headers may be non-string
    if "org_symbol" not in mp.columns or "Symbol" not in mp.columns:
        raise ValueError("Mapping file must contain columns: org_symbol, Symbol")
    mp["m5_ticker"] = mp["org_symbol"].apply(normalize_ticker)
    mp["trade_base"] = mp["Symbol"].astype(str).str.upper().str.strip().str.split("/").str[0].apply(normalize_ticker)
    mp = mp[(mp["trade_base"] != "") & (mp["m5_ticker"] != "")]
    mp = mp.drop_duplicates(subset=["trade_base"], keep="first")
    return dict(zip(mp["trade_base"], mp["m5_ticker"]))

TRADE_TO_M5 = load_trade_to_m5_mapping(MAPPING_XLSX_PATH)

def map_trade_base_to_m5(trade_base: str) -> str:
    t = normalize_ticker(trade_base)
    return TRADE_TO_M5.get(t, t)

# ============================================================
# Load M5 file list
# ============================================================
def load_m5_file_map(m5_dir: str, file_list_path: str) -> dict:
    fl = pd.read_csv(file_list_path)
    fl.columns = [c.strip().lower() for c in fl.columns]
    if "ticker" not in fl.columns or "filename" not in fl.columns:
        raise ValueError("m5_file_list.csv must have columns: ticker, filename")
    fl["ticker_norm"] = fl["ticker"].apply(normalize_ticker)
    fl["path"] = fl["filename"].apply(lambda f: os.path.join(m5_dir, f))
    return dict(zip(fl["ticker_norm"], fl["path"]))

M5_MAP = load_m5_file_map(M5_DIR, M5_FILE_LIST_PATH)
M5_TICKERS = set(M5_MAP.keys())

# ============================================================
# M5 preparation (market hours + resample + ATR) with caching
# ============================================================
def filter_market_hours(df: pd.DataFrame) -> pd.DataFrame:
    if df.empty:
        return df
    dfi = df.set_index("_dt", drop=False).sort_index()
    dfi = dfi.between_time(MARKET_START, MARKET_END, inclusive="both")
    return dfi.reset_index(drop=True)

def apply_delay_filter(window: pd.DataFrame, delay_minutes: int) -> pd.DataFrame:
    if delay_minutes <= 0 or window.empty:
        return window
    w = window.copy()
    w["d"] = w["_dt"].dt.normalize()
    thr = w["d"] + pd.to_timedelta(MARKET_START + ":00")
    w = w[w["_dt"] >= (thr + pd.to_timedelta(delay_minutes, unit="m"))]
    return w.drop(columns=["d"])

def resample_ohlc(df: pd.DataFrame, rule: str) -> pd.DataFrame:
    if df.empty:
        return df
    dfi = df.set_index("_dt").sort_index()
    ohlc = dfi[["open","high","low","close"]].resample(rule).agg({
        "open": "first",
        "high": "max",
        "low": "min",
        "close": "last",
    }).dropna()
    return ohlc.reset_index()

def compute_atr(df: pd.DataFrame, period: int) -> pd.Series:
    high = df["high"]
    low = df["low"]
    close = df["close"]
    tr = pd.concat([
        high - low,
        (high - close.shift(1)).abs(),
        (low - close.shift(1)).abs()
    ], axis=1).max(axis=1)
    return tr.rolling(period, min_periods=period).mean()
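
# ------------------------------------------------------------
# Illustrative sketch (not used by the pipeline): the true-range and
# simple-mean ATR arithmetic of compute_atr(), written with plain Python
# lists so the numbers can be checked by hand. The helper name
# _sketch_true_range_atr and the sample bars below are hypothetical.
# ------------------------------------------------------------
def _sketch_true_range_atr(highs, lows, closes, period):
    """Return (true_ranges, atr); atr[i] is None until `period` TR values exist."""
    true_ranges = []
    for i, (h, l) in enumerate(zip(highs, lows)):
        if i == 0:
            # No previous close on the first bar, so TR reduces to high - low.
            tr = h - l
        else:
            prev_close = closes[i - 1]
            tr = max(h - l, abs(h - prev_close), abs(l - prev_close))
        true_ranges.append(tr)

    atr = []
    for i in range(len(true_ranges)):
        if i + 1 < period:
            atr.append(None)  # mirrors min_periods=period in compute_atr()
        else:
            window = true_ranges[i + 1 - period:i + 1]
            atr.append(sum(window) / period)
    return true_ranges, atr

# Usage (three bars, gap up on the second bar, period=2):
#   _sketch_true_range_atr([10, 12, 12], [9, 11, 10], [9.5, 11.5, 11], 2)
# The gap makes the second bar's TR larger than its own high - low range.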

_RS_CACHE = {}

def prepare_rs_for_ticker(ticker: str):
    if ticker in _RS_CACHE:
        return _RS_CACHE[ticker]

    fp = M5_MAP.get(ticker)
    if not fp or not os.path.exists(fp):
        _RS_CACHE[ticker] = None
        return None

    df = pd.read_csv(fp)
    df.columns = [c.strip().lower() for c in df.columns]
    needed = ["date", "open", "high", "low", "close"]
    if not all(c in df.columns for c in needed):
        _RS_CACHE[ticker] = None
        return None

    df["_dt"] = pd.to_datetime(df["date"], errors="coerce")
    df = df.dropna(subset=["_dt"]).sort_values("_dt").reset_index(drop=True)

    for c in ["open", "high", "low", "close"]:
        df[c] = pd.to_numeric(df[c], errors="coerce")
    df = df.dropna(subset=["open", "high", "low", "close"])

    df = filter_market_hours(df)
    if df.empty:
        _RS_CACHE[ticker] = None
        return None

    rs = resample_ohlc(df, RESAMPLE_RULE)
    if rs.empty:
        _RS_CACHE[ticker] = None
        return None

    rs = apply_delay_filter(rs, DELAY_MIN)
    if rs.empty:
        _RS_CACHE[ticker] = None
        return None

    rs["ATR"] = compute_atr(rs, ATR_PERIOD)

    _RS_CACHE[ticker] = rs
    return rs

# ============================================================
# Load & merge config (Step8 base + Step9 sensitivity)
# ============================================================
def load_best_step8(path: str) -> dict:
    b8 = pd.read_csv(path).iloc[0].to_dict()
    need = ["ATAN_LONG","ATAN_SHORT","MAX_STOP_LONG","MAX_STOP_SHORT"]
    miss = [k for k in need if k not in b8 or pd.isna(b8.get(k))]
    if miss:
        raise ValueError(f"best_config_step8.csv missing/NaN keys: {miss}")
    return {k: float(b8[k]) for k in need}

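def load_best_step9(path: str) -> dict:
    b9 = pd.read_csv(path).iloc[0].to_dict()

    def _sens(key: str) -> float:
        # Fall back to 1.0 when the column is absent or the cell is empty (NaN);
        # a NaN sensitivity would otherwise propagate into every slope-based stop.
        v = b9.get(key)
        return 1.0 if v is None or pd.isna(v) else float(v)

    return {
        "SENS_LONG": _sens("SENS_LONG"),
        "SENS_SHORT": _sens("SENS_SHORT"),
    }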

def load_merged_config() -> dict:
    base = load_best_step8(BEST_STEP8_PATH)
    sens = load_best_step9(BEST_STEP9_PATH)
    cfg = {**base, **sens}
    print("Merged config:", cfg)
    return cfg

# ============================================================
# Trade extraction (NEW CSV schema)
# ============================================================
def build_trade_summary(raw: pd.DataFrame) -> pd.DataFrame:
    df = raw.copy()
    df.columns = [c.strip() for c in df.columns]

    required = ["TradeID","Currency","Direction","date","price","Opening Date","Closing Date"]
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise ValueError(f"Trade file missing required columns: {missing}")

    df["date"] = pd.to_datetime(df["date"], errors="coerce")
    df["Opening Date"] = pd.to_datetime(df["Opening Date"], errors="coerce")
    df["Closing Date"] = pd.to_datetime(df["Closing Date"], errors="coerce")
    df["price"] = pd.to_numeric(df["price"], errors="coerce")

    df = df[df["date"].dt.year == YEAR_FILTER].copy()

    df["TradeBase"] = df["Currency"].apply(trade_base_from_currency)
    df["TickerNorm"] = df["TradeBase"].apply(map_trade_base_to_m5)
    df["DirectionNorm"] = df["Direction"].apply(normalize_direction)

    rows = []
    for tid, g in df.groupby("TradeID", dropna=True):
        g = g.sort_values("date")

        entry_date = g["Opening Date"].dropna().iloc[0] if g["Opening Date"].notna().any() else pd.NaT
        exit_date  = g["Closing Date"].dropna().iloc[0] if g["Closing Date"].notna().any() else pd.NaT

        direction = g["DirectionNorm"].dropna().iloc[0] if g["DirectionNorm"].notna().any() else "UNKNOWN"
        ticker = g["TickerNorm"].dropna().iloc[0] if g["TickerNorm"].notna().any() else ""

        entry_px = np.nan
        if pd.notna(entry_date):
            day_rows = g[g["date"].dt.normalize() == entry_date.normalize()]
            if day_rows["price"].notna().any():
                entry_px = float(day_rows["price"].dropna().iloc[-1])

        exit_px = np.nan
        if pd.notna(exit_date):
            day_rows = g[g["date"].dt.normalize() == exit_date.normalize()]
            if day_rows["price"].notna().any():
                exit_px = float(day_rows["price"].dropna().iloc[-1])

        valid_window = pd.notna(entry_date) and pd.notna(exit_date) and (exit_date.normalize() > entry_date.normalize())
        core_ok = (ticker != "" and ticker in M5_TICKERS and direction != "UNKNOWN" and
                   pd.notna(entry_date) and pd.notna(exit_date) and pd.notna(entry_px) and pd.notna(exit_px))

        rows.append({
            "TradeID": tid,
            "TickerNorm": ticker,
            "Direction": direction,
            "EntryDate": entry_date,
            "ExitDate_Original": exit_date,
            "EntryPrice": entry_px,
            "ExitPrice_Original": exit_px,
            "HasAllCoreFields": bool(core_ok),
            "HasValidTrailingWindow": bool(valid_window)
        })
    return pd.DataFrame(rows)

# ============================================================
# Trailing simulation
# ============================================================
def adaptive_stop_distance_pct(bars: pd.DataFrame,
                               atan_threshold: float,
                               atr_fallback_mult: float,
                               min_stop_pct: float,
                               max_stop_pct: float,
                               sensitivity: float) -> float:
    if len(bars) < REG_WINDOW or bars["ATR"].dropna().empty:
        return np.nan

    recent = bars["close"].iloc[-REG_WINDOW:].astype(float)
    if len(recent) < REG_WINDOW:
        return np.nan

    x = np.arange(len(recent), dtype=float)
    a, b, c = np.polyfit(x, recent.values, 2)

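    # Slope of the fitted parabola at the latest bar, normalized by the last close.
    slope_norm = (2 * a * x[-1] + b) / recent.iloc[-1]
    # Curvature is kept in raw price units per bar^2 (not normalized by price),
    # so its effect on the stop distance scales with the instrument's price level.
    accel = (2 * a)
    atan_slope = float(np.arctan(slope_norm))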

    atr = bars["ATR"].iloc[-1]
    px = bars["close"].iloc[-1]
    atr_pct = float(atr / px) if (pd.notna(atr) and atr > 0 and px > 0) else np.nan
    if pd.isna(atr_pct):
        return np.nan

    if abs(atan_slope) > atan_threshold:
        stop_pct = abs(slope_norm) * sensitivity * (1.0 + abs(accel))
        return float(np.clip(stop_pct, min_stop_pct, max_stop_pct))

    stop_pct = atr_pct * atr_fallback_mult
    return float(np.clip(stop_pct, min_stop_pct, max_stop_pct))
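
# ------------------------------------------------------------
# Illustrative sketch (not used by the pipeline): the regime switch at the end
# of adaptive_stop_distance_pct(), reduced to plain floats. It assumes the
# slope, curvature and ATR% have already been computed; the helper name
# _sketch_regime_stop_pct and its argument names are hypothetical.
# ------------------------------------------------------------
import math

def _sketch_regime_stop_pct(slope_norm, accel, atr_pct, atan_threshold,
                            atr_mult, sensitivity, min_pct, max_pct):
    if abs(math.atan(slope_norm)) > atan_threshold:
        # Strong slope regime: distance grows with slope, sensitivity and curvature.
        raw = abs(slope_norm) * sensitivity * (1.0 + abs(accel))
    else:
        # Weak slope regime: fall back to a multiple of ATR%.
        raw = atr_pct * atr_mult
    # Same bounds as np.clip(raw, min_pct, max_pct).
    return min(max(raw, min_pct), max_pct)

# Usage: with atan_threshold=0.001, a normalized slope of 0.002 takes the
# slope branch, while 0.0005 falls through to the ATR branch.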

def compute_initial_stop_pct(rs: pd.DataFrame,
                             entry_date: pd.Timestamp,
                             entry_px: float,
                             direction: str,
                             cfg: dict) -> float:
    """
    Initial SL at ENTRY CLOSE:
    - Must be >= INITIAL_MIN_STOP_PCT (>=1%)
    - Can be wider, but capped by MAX_STOP_* for direction.
    Priority:
      1) adaptive_stop_distance_pct on history up to entry close (if available)
      2) ATR% * ATR_FALLBACK_MULT (if ATR available)
      3) INITIAL_MIN_STOP_PCT
    """
    direction = normalize_direction(direction)
    is_long = (direction == "LONG")

    atan_thr = cfg["ATAN_LONG"] if is_long else cfg["ATAN_SHORT"]
    max_stop = cfg["MAX_STOP_LONG"] if is_long else cfg["MAX_STOP_SHORT"]
    sens     = cfg["SENS_LONG"] if is_long else cfg["SENS_SHORT"]

    entry_close_dt = entry_date.normalize() + _todelta_hhmm(MARKET_END)

    hist = rs[rs["_dt"] <= entry_close_dt].copy()
    if hist.empty:
        return float(np.clip(INITIAL_MIN_STOP_PCT, INITIAL_MIN_STOP_PCT, max_stop))

    stop_pct = adaptive_stop_distance_pct(
        hist, atan_thr, ATR_FALLBACK_MULT, INITIAL_MIN_STOP_PCT, max_stop, sens
    )
    if pd.notna(stop_pct):
        return float(stop_pct)

    last = hist.iloc[-1]
    atr = last.get("ATR", np.nan)
    px  = last.get("close", np.nan)
    if pd.notna(atr) and atr > 0 and pd.notna(px) and px > 0:
        atr_pct = float(atr / px)
        stop_pct = atr_pct * ATR_FALLBACK_MULT
        return float(np.clip(stop_pct, INITIAL_MIN_STOP_PCT, max_stop))

    return float(np.clip(INITIAL_MIN_STOP_PCT, INITIAL_MIN_STOP_PCT, max_stop))
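
# ------------------------------------------------------------
# Illustrative sketch (not used by the pipeline): the three-step priority of
# compute_initial_stop_pct() reduced to plain floats. The helper name
# _sketch_initial_stop_pct and its arguments are hypothetical; None stands in
# for "not available" (NaN in the real code).
# ------------------------------------------------------------
def _sketch_initial_stop_pct(adaptive_pct, atr_pct, atr_mult, min_pct, max_pct):
    def clip(x):
        # Same result as np.clip(x, min_pct, max_pct): the upper bound is
        # applied last, so the cap wins if it is ever below the minimum.
        return min(max(x, min_pct), max_pct)

    if adaptive_pct is not None:            # 1) adaptive distance
        return clip(adaptive_pct)
    if atr_pct is not None and atr_pct > 0:  # 2) ATR% * multiplier
        return clip(atr_pct * atr_mult)
    return clip(min_pct)                    # 3) minimum distance

# Usage: an adaptive value of 0.4% with a 1% floor and 5% cap is lifted to 1%;
# with no adaptive value and ATR% = 0.5%, the ATR branch gives about 1.5%.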

@dataclass
class TrailResult:
    exit_time: pd.Timestamp
    exit_price: float
    exit_reason: str
    initial_stop_pct: float
    initial_stop_price: float

def price_improvement(direction: str, new_exit: float, old_exit: float) -> float:
    d = normalize_direction(direction)
    return (new_exit - old_exit) if d == "LONG" else (old_exit - new_exit)

def simulate_trade(tr: pd.Series, rs: pd.DataFrame, cfg: dict) -> TrailResult:
    entry_date = pd.to_datetime(tr["EntryDate"])
    exit_date  = pd.to_datetime(tr["ExitDate_Original"])
    entry_px   = float(tr["EntryPrice"])
    orig_exit_px = float(tr["ExitPrice_Original"])

    if exit_date.normalize() <= entry_date.normalize():
        return TrailResult(exit_date, orig_exit_px, "SameDayOrInvalidWindow",
                           np.nan, np.nan)

    direction = normalize_direction(tr["Direction"])
    is_long = (direction == "LONG")

    # ----------------------------
    # Initial SL placed at entry close (>= 1%)
    # ----------------------------
    init_stop_pct = compute_initial_stop_pct(rs, entry_date, entry_px, direction, cfg)
    if is_long:
        stop = entry_px * (1.0 - init_stop_pct)
    else:
        stop = entry_px * (1.0 + init_stop_pct)

    initial_stop_price = float(stop)
    best_close = entry_px
    stop_source = "Initial"  # becomes "Trailing" once adaptive updates

    # Monitoring starts from next day market open
    start_dt = entry_date.normalize() + timedelta(days=1) + _todelta_hhmm(MARKET_START)
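    # Normalize first so a time-of-day on the exit timestamp cannot pull in
    # bars from the day after the original exit.
    end_dt   = exit_date.normalize() + timedelta(days=1)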

    w = rs[(rs["_dt"] >= start_dt) & (rs["_dt"] < end_dt)].copy()
    if w.empty:
        return TrailResult(exit_date, orig_exit_px, "NoRSDataInWindow",
                           float(init_stop_pct), float(initial_stop_price))

    for i in range(len(w)):
        sub = w.iloc[:i+1]
        row = w.iloc[i]
        dt = row["_dt"]

        close = float(row["close"])
        high = float(row["high"])
        low  = float(row["low"])

        if is_long:
            best_close = max(best_close, close)
        else:
            best_close = min(best_close, close)

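        # 1) Check current stop first (so Initial SL can trigger immediately).
        #    The exit is booked at the stop level itself, even when the trigger
        #    value (close, or low/high) lies beyond it.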
        if is_long:
            hit_val = close if TRIGGER_ON_CLOSE else low
            if hit_val <= stop:
                reason = "InitialSL" if stop_source == "Initial" else "TrailingSL"
                return TrailResult(dt, float(stop), reason,
                                   float(init_stop_pct), float(initial_stop_price))
        else:
            hit_val = close if TRIGGER_ON_CLOSE else high
            if hit_val >= stop:
                reason = "InitialSL" if stop_source == "Initial" else "TrailingSL"
                return TrailResult(dt, float(stop), reason,
                                   float(init_stop_pct), float(initial_stop_price))

        # 2) Then try to tighten via adaptive logic (may be NaN early)
        atan_thr = cfg["ATAN_LONG"] if is_long else cfg["ATAN_SHORT"]
        max_stop = cfg["MAX_STOP_LONG"] if is_long else cfg["MAX_STOP_SHORT"]
        sens     = cfg["SENS_LONG"] if is_long else cfg["SENS_SHORT"]

        stop_pct = adaptive_stop_distance_pct(
            sub, atan_thr, ATR_FALLBACK_MULT, MIN_STOP_PCT, max_stop, sens
        )
        if pd.isna(stop_pct):
            continue

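        if is_long:
            new_stop = max(stop, best_close * (1.0 - stop_pct))
        else:
            new_stop = min(stop, best_close * (1.0 + stop_pct))

        # Relabel only once the stop has actually moved off its previous level;
        # otherwise a hit at the untouched initial stop would be reported as TrailingSL.
        if new_stop != stop:
            stop = new_stop
            stop_source = "Trailing"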

    return TrailResult(exit_date, orig_exit_px, "OriginalExit",
                       float(init_stop_pct), float(initial_stop_price))
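
# ------------------------------------------------------------
# Illustrative sketch (not used by the pipeline): the "check the stop first,
# then tighten" ordering of simulate_trade() for a LONG trade on bar closes.
# Assumptions of this sketch: a fixed trailing distance replaces
# adaptive_stop_distance_pct(), and the exit is labelled "TrailingSL" only
# after the stop has actually been raised. The helper name _sketch_trail_long
# and the sample closes are hypothetical.
# ------------------------------------------------------------
def _sketch_trail_long(entry_px, closes, init_pct, trail_pct):
    """Return (bar_index, exit_price, reason); index and price are None if never stopped."""
    stop = entry_px * (1.0 - init_pct)
    best_close = entry_px
    moved = False  # True once the stop has been raised above its initial level

    for i, close in enumerate(closes):
        best_close = max(best_close, close)

        # 1) Test the stop carried over from the previous bar.
        if close <= stop:
            return i, stop, ("TrailingSL" if moved else "InitialSL")

        # 2) Ratchet: a long stop may only move up, never down.
        candidate = best_close * (1.0 - trail_pct)
        if candidate > stop:
            stop = candidate
            moved = True

    return None, None, "OriginalExit"

# Usage: _sketch_trail_long(100.0, [101.0, 103.0, 101.5], 0.02, 0.01)
# The stop is raised after each of the first two bars and is hit on the third,
# with the exit booked at the stop level rather than at that bar's close.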

def summarize(out: pd.DataFrame) -> pd.DataFrame:
    df = out.copy()
    df["TriggeredFlag"] = df["ExitReason"].isin(["TrailingSL", "InitialSL"])
    df["ImprovedFlag"] = df["PriceImprovement"] > 0

    def agg(g):
        n = len(g)
        trig = int(g["TriggeredFlag"].sum())
        return {
            "Trades": n,
            "SLTriggered": trig,
            "PctTriggered": trig / n if n else np.nan,
            "Triggered_PctImproved": float(g.loc[g["TriggeredFlag"], "ImprovedFlag"].mean()) if trig else np.nan,
            "AvgPctImpact_All": float(g["PctImprovement_vs_OriginalExit"].mean(skipna=True)),
            "AvgPctImpact_Triggered": float(g.loc[g["TriggeredFlag"], "PctImprovement_vs_OriginalExit"].mean(skipna=True)) if trig else np.nan,
        }

    rows = [{"Group":"ALL", **agg(df)}]
    for d, g in df.groupby(df["Direction"].astype(str).str.upper().str.strip()):
        rows.append({"Group": f"Direction={d}", **agg(g)})
    return pd.DataFrame(rows)

# ============================================================
# MAIN (FULL YEAR)
# ============================================================
def main():
    cfg = load_merged_config()

    raw = read_trade_csv(TRADE_CSV_PATH)
    trades = build_trade_summary(raw)

    print("\nTradeIDs in year (pre-filter):", trades["TradeID"].nunique())
    print("Trades with core fields OK:", int(trades["HasAllCoreFields"].sum()))
    print("Trades with valid trailing window:", int(trades["HasValidTrailingWindow"].sum()))

    eligible = trades[(trades["HasAllCoreFields"]) & (trades["HasValidTrailingWindow"])].copy()
    if eligible.empty:
        raise RuntimeError("No eligible trades after filtering (core fields + exit>entry).")

    eligible = eligible.sort_values(["TickerNorm", "EntryDate", "TradeID"]).reset_index(drop=True)
    print("\nEligible trades to process:", len(eligible))
    print("Unique tickers in eligible:", eligible["TickerNorm"].nunique())

    # Preload RS for all tickers
    tickers = sorted(eligible["TickerNorm"].unique())
    print("\nPreloading resampled market-hours data for tickers:", len(tickers))
    for i, tkr in enumerate(tickers, 1):
        prepare_rs_for_ticker(tkr)
        if i % 25 == 0:
            print(f"  Loaded {i}/{len(tickers)} tickers...")

    out_rows = []
    dbg_rows = []

    for i, tr in eligible.iterrows():
        rs = prepare_rs_for_ticker(tr["TickerNorm"])
        if rs is None or rs.empty:
            dbg_rows.append({"TradeID": tr["TradeID"], "Reason": "MissingRSData"})
            continue

        res = simulate_trade(tr, rs, cfg)

        old_exit = float(tr["ExitPrice_Original"])
        new_exit = float(res.exit_price)
        impr = price_improvement(tr["Direction"], new_exit, old_exit)

        out_rows.append({
            "TradeID": tr["TradeID"],
            "TickerNorm": tr["TickerNorm"],
            "Direction": tr["Direction"],
            "EntryDate": tr["EntryDate"],
            "ExitDate_Original": tr["ExitDate_Original"],
            "EntryPrice": float(tr["EntryPrice"]),
            "InitialStopPct": res.initial_stop_pct,
            "InitialStopPrice": res.initial_stop_price,
            "ExitPrice_Original": old_exit,
            "ExitTime_Trailing": res.exit_time,
            "ExitPrice_Trailing": new_exit,
            "ExitReason": res.exit_reason,
            "PriceImprovement": impr,
            "PctImprovement_vs_OriginalExit": (impr / old_exit) if old_exit else np.nan
        })
        dbg_rows.append({"TradeID": tr["TradeID"], "Reason": res.exit_reason})

        if (i + 1) % PRINT_EVERY == 0:
            print(f"Processed {i+1}/{len(eligible)} trades...")

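    if not out_rows:
        # summarize() needs the result columns; fail with a clear message instead of a KeyError.
        raise RuntimeError("No trades could be simulated (resampled data missing for every eligible trade).")
    out_df = pd.DataFrame(out_rows)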
    summ_df = summarize(out_df)
    dbg_df = pd.DataFrame(dbg_rows)

    reason_counts = dbg_df["Reason"].value_counts(dropna=False)

    out_path = os.path.join(OUT_DIR, f"trades_full_{YEAR_FILTER}.csv")
    summ_path = os.path.join(OUT_DIR, f"summary_full_{YEAR_FILTER}.csv")
    dbg_path = os.path.join(OUT_DIR, f"debug_reasons_full_{YEAR_FILTER}.csv")
    reason_path = os.path.join(OUT_DIR, f"reason_counts_full_{YEAR_FILTER}.csv")

    out_df.to_csv(out_path, index=False)
    summ_df.to_csv(summ_path, index=False)
    dbg_df.to_csv(dbg_path, index=False)
    reason_counts.rename_axis("Reason").reset_index(name="Count").to_csv(reason_path, index=False)

    print("\nSaved:", out_path)
    print("Saved:", summ_path)
    print("Saved:", dbg_path)
    print("Saved:", reason_path)

    print("\n=== FULL YEAR SUMMARY ===")
    print(summ_df.to_string(index=False))

    print("\n=== EXIT REASONS (full year) ===")
    print(reason_counts.to_string())

if __name__ == "__main__":
    main()
