#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
===============================================================
Strategy & Pipeline — Quick Reference
===============================================================

ENTRY (long-only)
-----------------
Signal day (D0):
  • Trend: EMA10 > EMA20.
  • Pattern: Two consecutive green candles (D-1 and D0) AND the D0 candle's
    O, H, L, C are each strictly > the D-1 candle's respective O, H, L, C.
  • Confirmation: At least ONE of the following is true on D0:
      - RSI > 50
      - MACD line > signal (hist > 0)
      - ADX > 20 AND +DI > -DI
      - Close > SMA50
      - Close > Bollinger middle band

Execution day (D1):
  • If configured (entry_on_next_open=True), the buy is placed at D1's Open.
  • Stop & Target are set off the executed entry price:
      SL = entry_price * (1 - stop_loss_pct)
      TP = entry_price * (1 + target_pct)
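
  Illustrative sketch of the two formulas above. It is not part of the
  pipeline, and the helper name stop_and_target is hypothetical:

```python
def stop_and_target(entry_price, stop_loss_pct, target_pct):
    # SL sits below the executed entry price, TP above it.
    sl = entry_price * (1 - stop_loss_pct)
    tp = entry_price * (1 + target_pct)
    return sl, tp

# Example: entry at 200.0 with a 5% stop and a 10% target.
sl, tp = stop_and_target(200.0, 0.05, 0.10)
print(round(sl, 2), round(tp, 2))
```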

EXIT
----
Checked every bar after entry:
  1) Hard StopLoss (HSL): If next bar’s Low ≤ SL → exit (filled=SL).
  2) TakeProfit (TP):     If next bar’s High ≥ TP → exit (filled=TP).
     (If both SL and TP touch on the same bar, we assume TP priority and
      mark filled=TP — optimistic fill.)
  3) Trend reverse: If EMA10 < EMA20 on the signal bar, exit next open.
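
  A minimal sketch of the per-bar SL/TP check described above, including the
  optimistic TP priority when one bar touches both levels. The function name
  check_exit and its return labels are assumptions made for illustration, and
  the trend-reverse rule is left out:

```python
def check_exit(bar_low, bar_high, sl, tp):
    # Returns (reason, fill_price), or (None, None) if neither level is touched.
    touched_sl = bar_low <= sl
    touched_tp = bar_high >= tp
    if touched_tp:
        # Also covers the both-touched case: TP takes priority (optimistic fill).
        return "target", tp
    if touched_sl:
        return "stop", sl
    return None, None

# A wide bar that touches both levels resolves to the target.
print(check_exit(bar_low=94.0, bar_high=111.0, sl=95.0, tp=110.0))
```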

Exit reasons are verbose in CSV:
  • "StopLoss hit (5.0%). filled=..., SL=..., ret=...%, held=...d"
  • "TakeProfit hit (10.0%). filled=..., TP=..., ret=...%, held=...d"
  • "Trend reverse: EMA10 cross↓ below EMA20 (10=..., 20=...);
     filled=..., ret=...%, held=...d"
    ("cross↓" is included only when EMA10 crossed below EMA20 on that bar.)
  • "Final close (backtest end). filled=..., ret=...%, held=...d"

PORTFOLIO PIPELINE (daily loop)
-------------------------------
1) Aggregate all per-ticker BUY 'candidate' legs dated for D1 from signals on D0.
2) 52-week-high filter: keep a candidate only if its Close is at least
   within_pct_of_52w_high × its 52-week (252-bar) high Close.
3) Exclude tickers we already hold.
4) Rank survivors by VOLAᵣ (excess return vs benchmark / own volatility; 252d lookback).
5) Capacity gate:
   • slots = max_concurrent_positions - current_open_positions
   • pick = min(top_k_daily, slots)
6) Sizing via Markowitz (MVO), long-only:
   • Compute mu (means) and Sigma (cov) on 252d daily returns on D0.
   • Solve for weights that maximize mean/vol (with long-only projection).
   • Cap *deployable cash* to deploy_cash_frac × current cash (e.g., 25%).
   • Allocate that capped bucket by MVO weights across today's picks.
7) Execute BUYs at D1 (close or open per config), deduct fees, open positions.
8) Process SELLs before BUYs each day; update equity curve after all legs.
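
The ranking score in step 4 can be sketched as below, assuming VOLAᵣ is the
mean daily excess return over the benchmark divided by the stock's own daily
volatility, annualized with sqrt(252). The function name volar and the sample
returns are made up for illustration:

```python
import math
import numpy as np

def volar(stock_returns, bench_returns):
    s = np.asarray(stock_returns, dtype=float)
    b = np.asarray(bench_returns, dtype=float)
    excess = s - b                 # daily excess return vs the benchmark
    vol = s.std(ddof=0)            # the stock's own daily volatility
    if vol <= 1e-8:
        return 0.0                 # flat series: no meaningful score
    return float(excess.mean() / vol * math.sqrt(252.0))

stock = [0.010, -0.004, 0.007, 0.002, -0.001]
bench = [0.004, -0.002, 0.003, 0.001, 0.000]
score = volar(stock, bench)
print(score > 0)  # positive when the stock beat the benchmark on average
```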

top_k_daily vs max_concurrent_positions
---------------------------------------
• If you have 1 open position and max_concurrent_positions=4, you have 3 slots.
• Even if top_k_daily=2, you'll only fill up to min(2, 3)=2 new entries on that day.
• No position becomes "unavailable"; we just respect the smaller of the two limits.
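
The same rule as a small sketch (the function name new_entries_allowed is
hypothetical; clamping at zero when the book is already full is an assumption
that matches the "slots > 0" guard in the engine):

```python
def new_entries_allowed(open_positions, max_concurrent_positions, top_k_daily):
    slots = max_concurrent_positions - open_positions
    # No free slots means no new entries, whatever top_k_daily says.
    return max(0, min(top_k_daily, slots))

# 1 open position, 4 allowed, top_k_daily=2 -> min(2, 3) new entries.
print(new_entries_allowed(open_positions=1, max_concurrent_positions=4, top_k_daily=2))
```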

MINI TIMELINE EXAMPLE
---------------------
Assume: entry_on_next_open=True, stop=5%, target=10%, deploy_cash_frac=25%,
        top_k_daily=3, max_concurrent_positions=4, initial cash=₹200,000.

D0 (signal day):
  Ticker A meets: EMA10>EMA20, two strict green candles, confirmations: RSI>50, Close>SMA50.
  Ticker B meets: EMA10>EMA20, two strict greens, confirmations: MACD>Signal.
  Ticker C meets: EMA10>EMA20, two strict greens, confirmations: Close>BBmid.
  → All three create BUY 'candidate' legs for D1.

  Portfolio selection on D0 for D1 execution:
    • 52w filter removes none.
    • Already-held: none.
    • Rank by VOLAᵣ (252d): A=1.2, B=0.9, C=0.6 → order: A, B, C.
    • slots = 4 - 0 = 4; pick = min(top_k_daily=3, slots=4) = 3 → keep A,B,C.
    • deployable_cash = 25% * 200k = ₹50,000
    • MVO over (A,B,C) returns  weights e.g. [0.50, 0.30, 0.20]
      → Allocations: A=₹25,000; B=₹15,000; C=₹10,000

D1 (execution/open):
  • BUY A, B, C at D1 Open; compute shares=alloc / price, pay fees.
  • For each position, set SL and TP from the executed entry price.

D3:
  • A hits TP intraday → SELL recorded with reason:
    "TakeProfit hit (10.0%). filled=..., TP=..., ret=..., held=2d"
  • B shows EMA10<EMA20 on D3 bar → exit at D4 Open with reason:
    "Trend reverse: EMA10 cross↓ below EMA20 (10=..., 20=...); filled=..., ret=...%, held=...d"
  • C continues to run.
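
The D0/D1 sizing arithmetic in this example can be reproduced as below. The
cash, deploy fraction and weights are the example's own; the D1 open prices
are hypothetical, since the timeline does not give any:

```python
import math

cash = 200_000.0
deploy_cash_frac = 0.25
weights = {"A": 0.50, "B": 0.30, "C": 0.20}      # example MVO weights
open_px = {"A": 412.0, "B": 98.5, "C": 1530.0}   # hypothetical D1 opens

deployable = cash * deploy_cash_frac
alloc = {t: w * deployable for t, w in weights.items()}
# Whole shares only: round down so the allocation is never exceeded.
shares = {t: math.floor(alloc[t] / open_px[t]) for t in alloc}

for t in alloc:
    print(t, round(alloc[t], 2), shares[t])
```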

FILES
-----
outputs/trades_legs_*.csv         : All executed legs (BUY and SELL) with rich reasons
outputs/trades_roundtrips_*.csv    : Pair-matched entries/exits with final P&L stats
outputs/equity_*.csv / equity_*.png: Daily equity series and plot
outputs/metrics_*.json             : Key summary stats (CAGR, Sharpe, MDD, Win rate, trades)
"""

import os, json, math, warnings, logging
from dataclasses import dataclass
from typing import Dict, List, Tuple, Optional

import numpy as np
import pandas as pd

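# Optional third-party dependencies: price downloads need yfinance and plotting
# needs matplotlib. Keep both names defined so a missing package does not
# surface later as a NameError.
try:
    import yfinance as yf
except ImportError:
    yf = None
try:
    import matplotlib.pyplot as plt
except ImportError:
    plt = None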

warnings.filterwarnings("ignore", category=FutureWarning)

# =========================
# LOGGING
# =========================
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
log = logging.getLogger("ema10_ema20_2green_confirm_v1")

# =========================
# CONFIG
# =========================
@dataclass
class Config:
    # Data
    start_date: str = "2015-01-01"
    end_date: str   = "2025-01-01"
    static_symbols: Optional[List[str]] = None
    static_symbols_path: Optional[str] = None
    cache_dir: str = "cache"
    out_dir: str   = "outputs"
    plot: bool     = True

    # --- Strategy params ---
    ema_fast: int = 10            # EMA10
    ema_slow: int = 20            # EMA20
    sma_confirm_len: int = 50     # SMA50 for confirm
    sma_trend_len: int = 200      # SMA200 (kept for breadth/context if needed)
    rsi_len: int = 14
    macd_fast: int = 12
    macd_slow: int = 26
    macd_signal: int = 9
    bb_len: int = 20
    bb_std: float = 2.0
    adx_len: int = 14
    adx_min: float = 20.0

    # Exits
    stop_loss_pct: float = 0.10
    target_pct: float    = 0.10

    # Portfolio
    apply_fees: bool    = True
    initial_capital: float = 500_000.0
    max_concurrent_positions: int = 5
    deploy_cash_frac: float = 0.30
    entry_on_next_open: bool = True
    exit_on_next_open: bool  = True

    # Candidate ranking & filters
    benchmark_try: Tuple[str,...] = ("^CNX500","^CRSLDX","^NSE500","^NIFTY500","^BSE500","^NSEI")
    volar_lookback: int = 252
    filter_52w_window: int = 252
    within_pct_of_52w_high: float = 0.50
    top_k_daily: int = 300

    # Liquidity guards (OFF by default)
    enable_basic_liquidity: bool = False
    min_price_inr: float = 50.0
    min_avg_vol_20d: float = 50_000.0

CFG = Config()

# =========================
# FEES (per user spec)
# =========================
APPLY_FEES = True

def calc_fees(turnover_buy: float, turnover_sell: float) -> float:
    if not APPLY_FEES:
        return 0.0
    BROKER_PCT = 0.001
    BROKER_MIN = 5.0
    BROKER_CAP = 20.0
    STT_PCT = 0.001
    STAMP_BUY_PCT = 0.00015
    EXCH_PCT = 0.0000297
    SEBI_PCT = 0.000001
    IPFT_PCT = 0.000001
    GST_PCT = 0.18
    DP_SELL = 20.0 if turnover_sell >= 100 else 0.0

    def _broker(turnover):
        if turnover <= 0:
            return 0.0
        fee = turnover * BROKER_PCT
        fee = max(BROKER_MIN, min(fee, BROKER_CAP))
        return fee

    br_buy  = _broker(turnover_buy)
    br_sell = _broker(turnover_sell)
    stt   = STT_PCT * (turnover_buy + turnover_sell)
    stamp = STAMP_BUY_PCT * turnover_buy
    exch  = EXCH_PCT * (turnover_buy + turnover_sell)
    sebi  = SEBI_PCT * (turnover_buy + turnover_sell)
    ipft  = IPFT_PCT * (turnover_buy + turnover_sell)
    dp    = DP_SELL
    gst_base = br_buy + br_sell + dp + exch + sebi + ipft
    gst   = GST_PCT * gst_base
    return float((br_buy + br_sell) + stt + stamp + exch + sebi + ipft + dp + gst)

# =========================
# Helpers
# =========================
def ensure_dirs(*paths):
    for p in paths:
        os.makedirs(p, exist_ok=True)

def today_str():
    return pd.Timestamp.today(tz="Asia/Kolkata").strftime("%Y-%m-%d")

def load_static_symbols(static_symbols: Optional[List[str]], static_symbols_path: Optional[str]) -> List[str]:
    syms: List[str] = []
    if static_symbols and len(static_symbols) > 0:
        syms = list(static_symbols)
    elif static_symbols_path and os.path.exists(static_symbols_path):
        with open(static_symbols_path, "r") as f:
            syms = [line.strip() for line in f if line.strip()]
    else:
        raise ValueError(
            "Provide CFG.static_symbols=[...] ('.NS' suffixes) or set CFG.static_symbols_path "
            "to a file containing one symbol per line."
        )
    out = []
    for s in syms:
        s = s.strip().upper()
        if not s.endswith(".NS"):
            s = f"{s}.NS"
        out.append(s)
    seen = set()
    uniq = []
    for s in out:
        if s not in seen:
            uniq.append(s)
            seen.add(s)
    return uniq

def fetch_prices(tickers: List[str], start: str, end: Optional[str], cache_dir: str) -> Dict[str, pd.DataFrame]:
    ensure_dirs(cache_dir)
    data = {}
    end = end or today_str()
    for ticker in tickers:
        cache_path = os.path.join(cache_dir, f"{ticker.replace('^', '_')}.parquet")
        if os.path.exists(cache_path):
            try:
                df = pd.read_parquet(cache_path)
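                # yfinance treats `end` as exclusive and markets close on weekends and
                # holidays, so the last cached bar normally falls a few days before `end`.
                # Accept the cache if it reaches within 5 calendar days of `end`
                # (the 5-day tolerance is an assumption, not a data-provider guarantee).
                if len(df) and pd.to_datetime(df.index[-1]) >= pd.Timestamp(end) - pd.Timedelta(days=5):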
                    data[ticker] = df
                    continue
            except Exception:
                pass
        try:
            df = yf.download(ticker, start=start, end=end, auto_adjust=True, progress=False, multi_level_index=False)
            if df is None or df.empty:
                continue
            df = df.rename(columns=str.title)  # Open, High, Low, Close, Volume
            df = df[['Open', 'High', 'Low', 'Close', 'Volume']].dropna()
            df.index.name = "date"
            df.to_parquet(cache_path)
            data[ticker] = df
        except Exception:
            continue
    return data

def ema(series: pd.Series, span: int) -> pd.Series:
    return series.ewm(span=span, adjust=False, min_periods=span).mean()

def sma(series: pd.Series, window: int) -> pd.Series:
    return series.rolling(window).mean()

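def rsi(series: pd.Series, length: int = 14) -> pd.Series:
    # Simple-moving-average RSI (rolling means of gains and losses, not Wilder smoothing).
    delta = series.diff()
    gain = (delta.where(delta > 0, 0.0)).rolling(length).mean()
    loss = (-delta.where(delta < 0, 0.0)).rolling(length).mean()
    rs = gain / loss.replace(0.0, np.nan)
    out = 100 - (100 / (1 + rs))
    # A window with gains and no losses is maximally strong, not neutral.
    out = out.mask((loss == 0.0) & (gain > 0.0), 100.0)
    # Warm-up bars and completely flat windows fall back to neutral.
    return out.fillna(50.0)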

def macd(series: pd.Series, fast=12, slow=26, signal=9):
    ema_fast = series.ewm(span=fast, adjust=False, min_periods=slow).mean()
    ema_slow = series.ewm(span=slow, adjust=False, min_periods=slow).mean()
    macd_line = ema_fast - ema_slow
    macd_signal = macd_line.ewm(span=signal, adjust=False, min_periods=signal).mean()
    hist = macd_line - macd_signal
    return macd_line, macd_signal, hist

def _true_range(high: pd.Series, low: pd.Series, prev_close: pd.Series) -> pd.Series:
    return pd.concat([
        (high - low).abs(),
        (high - prev_close).abs(),
        (low - prev_close).abs()
    ], axis=1).max(axis=1)

def dmi_adx(high: pd.Series, low: pd.Series, close: pd.Series, length: int = 14):
    prev_high = high.shift(1)
    prev_low  = low.shift(1)
    prev_close = close.shift(1)

    up_move   = high - prev_high
    down_move = prev_low - low
    plus_dm  = up_move.where((up_move > down_move) & (up_move > 0), 0.0)
    minus_dm = down_move.where((down_move > up_move) & (down_move > 0), 0.0)

    tr = _true_range(high, low, prev_close)
    alpha = 1.0 / length
    atr = tr.ewm(alpha=alpha, adjust=False, min_periods=length).mean()
    plus_di  = 100 * (plus_dm.ewm(alpha=alpha, adjust=False, min_periods=length).mean() / atr)
    minus_di = 100 * (minus_dm.ewm(alpha=alpha, adjust=False, min_periods=length).mean() / atr)

    dx = 100 * (plus_di - minus_di).abs() / (plus_di + minus_di).replace(0, np.nan)
    adx_series = dx.ewm(alpha=alpha, adjust=False, min_periods=length).mean()
    return plus_di, minus_di, adx_series

def bollinger(series: pd.Series, length=20, nstd=2.0):
    mid = series.rolling(length).mean()
    sd  = series.rolling(length).std(ddof=0)
    upper = mid + nstd * sd
    lower = mid - nstd * sd
    return mid, upper, lower

def compute_indicators(df: pd.DataFrame, cfg: Config) -> pd.DataFrame:
    out = df.copy()
    out["ema_fast"] = ema(out["Close"], cfg.ema_fast)   # 10
    out["ema_slow"] = ema(out["Close"], cfg.ema_slow)   # 20
    out["sma50"]    = sma(out["Close"], cfg.sma_confirm_len)
    out["sma200"]   = sma(out["Close"], cfg.sma_trend_len)
    out["rsi"]      = rsi(out["Close"], cfg.rsi_len)
    macd_line, macd_signal, macd_hist = macd(out["Close"], cfg.macd_fast, cfg.macd_slow, cfg.macd_signal)
    out["macd_line"] = macd_line
    out["macd_signal"] = macd_signal
    out["macd_hist"] = macd_hist
    out["bb_mid"], out["bb_up"], out["bb_low"] = bollinger(out["Close"], cfg.bb_len, cfg.bb_std)
    out["+DI"], out["-DI"], out["ADX"] = dmi_adx(out["High"], out["Low"], out["Close"], cfg.adx_len)
    out["avg_vol_20"] = out["Volume"].rolling(20).mean()
    out["high_52w"] = out["Close"].rolling(cfg.filter_52w_window).max()
    return out.dropna()

def basic_liquidity_ok(row: pd.Series, cfg: Config) -> bool:
    if not cfg.enable_basic_liquidity:
        return True
    if row["Close"] < cfg.min_price_inr:
        return False
    if row["avg_vol_20"] < cfg.min_avg_vol_20d:
        return False
    return True

# ---------- NEW ENTRY/EXIT LOGIC ----------
def two_green_strict(prev_row: pd.Series, row: pd.Series) -> bool:
    g1 = prev_row["Close"] > prev_row["Open"]
    g2 = row["Close"] > row["Open"]
    # second candle's O/H/L/C > first candle's O/H/L/C
    strictly_above = (row["Open"]  > prev_row["Open"]) and \
                     (row["High"]  > prev_row["High"]) and \
                     (row["Low"]   > prev_row["Low"])  and \
                     (row["Close"] > prev_row["Close"])
    return bool(g1 and g2 and strictly_above)

def confirmation_any(row: pd.Series, cfg: Config) -> Tuple[bool, str]:
    checks = []
    # RSI > 50
    checks.append(("RSI>50", row["rsi"] > 50.0))
    # MACD line > signal (hist>0)
    checks.append(("MACD>Signal", row["macd_hist"] > 0.0))
    # ADX>20 and +DI > -DI
    checks.append(("ADX>20 & +DI>-DI", (row["ADX"] > cfg.adx_min) and (row["+DI"] > row["-DI"])))
    # Close > SMA50
    checks.append(("Close>SMA50", row["Close"] > row["sma50"]))
    # Close > Bollinger Mid
    checks.append(("Close>BBmid", row["Close"] > row["bb_mid"]))

    passed = [name for name, ok in checks if ok]
    return (len(passed) >= 1, ", ".join(passed) if passed else "none")

def simulate_ticker(ticker: str, df: pd.DataFrame, cfg: Config):
    d = compute_indicators(df, cfg).copy()
    cols = ["ticker","side","date","price","shares","reason","signal_reason","score",
            "rsi","ADX","+DI","-DI","ema_fast","ema_slow","sma50","sma200",
            "macd_line","macd_signal","macd_hist","bb_mid","bb_up","bb_low",
            "close","high_52w"]
    if d.empty or len(d) < 60:
        return pd.DataFrame(columns=cols), pd.Series(dtype=float)

    in_pos = False
    entry_px = stop_px = tgt_px = 0.0
    entry_dt = None  # <-- added: to report held days only
    trades = []

    idx = list(d.index)

    for i in range(1, len(idx)-1):
        dt = idx[i]           # signal candle (second green)
        nxt = idx[i+1]        # execution / evaluation next bar
        prev_dt = idx[i-1]

        row = d.loc[dt]
        prev_row = d.loc[prev_dt]
        nxt_row = d.loc[nxt]

        # Trend reverse condition for exit
        trend_reverse_now = row["ema_fast"] < row["ema_slow"]

        if not in_pos:
            # ENTRY: EMA10 > EMA20 AND two strict green candles AND ≥1 confirmation
            ema_ok = row["ema_fast"] > row["ema_slow"]
            two_green_ok = two_green_strict(prev_row, row)
            confirm_ok, confirm_str = confirmation_any(row, cfg)

            if ema_ok and two_green_ok and confirm_ok and basic_liquidity_ok(row, cfg):
                # Execute at next open (or current close if configured)
                px = float(nxt_row["Open"] if cfg.entry_on_next_open else row["Close"])
                sig_reason = f"EMA10>EMA20 & 2xGreenStrict; confirms: {confirm_str}"
                score = 1.0  # placeholder; downstream selection uses VOLAᵣ
                trades.append({
                    "ticker": ticker, "side": "BUY", "date": nxt if cfg.entry_on_next_open else dt,
                    "price": px, "shares": 0, "reason": "candidate",
                    "signal_reason": sig_reason, "score": float(score),
                    "rsi": float(row["rsi"]), "ADX": float(row["ADX"]),
                    "+DI": float(row["+DI"]), "-DI": float(row["-DI"]),
                    "ema_fast": float(row["ema_fast"]), "ema_slow": float(row["ema_slow"]),
                    "sma50": float(row["sma50"]), "sma200": float(row["sma200"]),
                    "macd_line": float(row["macd_line"]), "macd_signal": float(row["macd_signal"]), "macd_hist": float(row["macd_hist"]),
                    "bb_mid": float(row["bb_mid"]), "bb_up": float(row["bb_up"]), "bb_low": float(row["bb_low"]),
                    "close": float(row["Close"]), "high_52w": float(row["high_52w"])
                })
                in_pos = True
                entry_px = px
                entry_dt = nxt if cfg.entry_on_next_open else dt  # <-- added: remember entry date for held days
                stop_px = entry_px * (1 - cfg.stop_loss_pct)
                tgt_px  = entry_px * (1 + cfg.target_pct)

        else:
            # EXIT rules: HSL / TP / Trend reverse
            hit = None
            # SL/TP are evaluated on the next bar's range, so those exits always fill on
            # that bar; only the trend-reverse exit honors exit_on_next_open.
            exec_date = nxt

            # Check next bar extremes for HSL/TP
            if nxt_row["Low"] <= stop_px and nxt_row["High"] >= tgt_px:
                hit, exec_price = "target", float(tgt_px)  # both touched: optimistic TP fill
            elif nxt_row["Low"] <= stop_px:
                hit, exec_price = "stop", float(stop_px)
            elif nxt_row["High"] >= tgt_px:
                hit, exec_price = "target", float(tgt_px)
            elif trend_reverse_now:
                exec_price = float(nxt_row["Open"] if cfg.exit_on_next_open else row["Close"])
                exec_date = nxt if cfg.exit_on_next_open else dt
                hit = "trend_reverse"

            if hit is not None:
                # ----- Only the text below is new; exits/prices are unchanged -----
                pnl_pct = float((exec_price / entry_px - 1.0) * 100.0) if entry_px else float("nan")
                held_days = int((pd.to_datetime(exec_date) - pd.to_datetime(entry_dt)).days) if entry_dt else 0

                if hit == "stop":
                    exit_reason = (
                        f"StopLoss hit ({cfg.stop_loss_pct*100:.1f}%). "
                        f"filled={exec_price:.2f}, SL={stop_px:.2f}, ret={pnl_pct:.2f}%, held={held_days}d"
                    )
                elif hit == "target":
                    exit_reason = (
                        f"TakeProfit hit ({cfg.target_pct*100:.1f}%). "
                        f"filled={exec_price:.2f}, TP={tgt_px:.2f}, ret={pnl_pct:.2f}%, held={held_days}d"
                    )
                else:  # trend_reverse
                    crossdown = (prev_row["ema_fast"] >= prev_row["ema_slow"]) and (row["ema_fast"] < row["ema_slow"])
                    cross_txt = " cross↓" if crossdown else ""
                    exit_reason = (
                        f"Trend reverse: EMA10{cross_txt} below EMA20 "
                        f"(10={row['ema_fast']:.2f}, 20={row['ema_slow']:.2f}); "
                        f"filled={exec_price:.2f}, ret={pnl_pct:.2f}%, held={held_days}d"
                    )
                # -------------------------------------------------------------------

                trades.append({
                    "ticker": ticker, "side": "SELL", "date": exec_date,
                    "price": float(exec_price), "shares": 0, "reason": exit_reason,  # <-- reason now verbose
                    "signal_reason": "", "score": np.nan,
                    "rsi": float(row["rsi"]), "ADX": float(row["ADX"]),
                    "+DI": float(row["+DI"]), "-DI": float(row["-DI"]),
                    "ema_fast": float(row["ema_fast"]), "ema_slow": float(row["ema_slow"]),
                    "sma50": float(row["sma50"]), "sma200": float(row["sma200"]),
                    "macd_line": float(row["macd_line"]), "macd_signal": float(row["macd_signal"]), "macd_hist": float(row["macd_hist"]),
                    "bb_mid": float(row["bb_mid"]), "bb_up": float(row["bb_up"]), "bb_low": float(row["bb_low"]),
                    "close": float(row["Close"]), "high_52w": float(row["high_52w"])
                })
                in_pos = False
                entry_px = stop_px = tgt_px = 0.0
                entry_dt = None  # <-- reset

    # Force close open pos at last bar
    if in_pos:
        last_dt = d.index[-1]; row = d.loc[last_dt]
        exec_price = float(row["Close"])
        pnl_pct = float((exec_price / entry_px - 1.0) * 100.0) if entry_px else float("nan")
        held_days = int((pd.to_datetime(last_dt) - pd.to_datetime(entry_dt)).days) if entry_dt else 0
        exit_reason = (
            f"Final close (backtest end). filled={exec_price:.2f}, ret={pnl_pct:.2f}%, held={held_days}d"
        )
        trades.append({
            "ticker": ticker, "side": "SELL", "date": last_dt,
            "price": exec_price, "shares": 0, "reason": exit_reason,  # <-- verbose
            "signal_reason": "", "score": np.nan,
            "rsi": float(row["rsi"]), "ADX": float(row["ADX"]),
            "+DI": float(row["+DI"]), "-DI": float(row["-DI"]),
            "ema_fast": float(row["ema_fast"]), "ema_slow": float(row["ema_slow"]),
            "sma50": float(row["sma50"]), "sma200": float(row["sma200"]),
            "macd_line": float(row["macd_line"]), "macd_signal": float(row["macd_signal"]), "macd_hist": float(row["macd_hist"]),
            "bb_mid": float(row["bb_mid"]), "bb_up": float(row["bb_up"]), "bb_low": float(row["bb_low"]),
            "close": float(row["Close"]), "high_52w": float(row["high_52w"])
        })

    return pd.DataFrame(trades, columns=cols), pd.Series(dtype=float)


# ---------- Portfolio: unchanged engine ----------
def pick_benchmark(benchmarks: Tuple[str,...], start: str, end: Optional[str], cache_dir: str) -> Tuple[str, pd.DataFrame]:
    for t in benchmarks:
        data = fetch_prices([t], start, end, cache_dir)
        df = data.get(t)
        if df is not None and not df.empty:
            log.info("Using benchmark: %s", t)
            return t, df
    idx = pd.date_range(start=start, end=end or today_str(), freq="B")
    df = pd.DataFrame({"Close": np.ones(len(idx))}, index=idx)
    log.warning("No benchmark found; using synthetic flat series.")
    return "SYNTH_BENCH", df

def compute_volar_scores(end_dt: pd.Timestamp, tickers: List[str], data_map: Dict[str,pd.DataFrame], bench_df: pd.DataFrame, lookback: int) -> Dict[str, float]:
    scores = {}
    bser = bench_df["Close"].loc[:end_dt].pct_change().dropna().iloc[-lookback:]
    for t in tickers:
        df = data_map.get(t)
        if df is None or df.empty:
            scores[t] = 0.0
            continue
        if end_dt not in df.index:
            df = df[df.index <= end_dt]
            if df.empty:
                scores[t] = 0.0
                continue
        r = df["Close"].loc[:end_dt].pct_change().dropna().iloc[-lookback:]
        common = pd.concat([r, bser], axis=1, keys=["s","b"]).dropna()
        if common.shape[0] < max(20, int(0.4*lookback)):
            scores[t] = 0.0
            continue
        excess = common["s"] - common["b"]
        vol = common["s"].std(ddof=0)
        scores[t] = 0.0 if vol <= 1e-8 else float((excess.mean() / vol) * math.sqrt(252.0))
    return scores

def markowitz_long_only(mu: np.ndarray, Sigma: np.ndarray) -> np.ndarray:
    n = len(mu)
    eps = 1e-6
    Sigma = Sigma + eps*np.eye(n)

    def solve_lambda(lmbd: float, active_mask=None):
        if active_mask is None:
            A = np.block([[2*lmbd*Sigma, np.ones((n,1))],[np.ones((1,n)), np.zeros((1,1))]])
            b = np.concatenate([mu, np.array([1.0])])
            try:
                sol = np.linalg.solve(A, b)
                w = sol[:n]
            except np.linalg.LinAlgError:
                w = np.full(n, 1.0/n)
            return w
        else:
            idx = np.where(active_mask)[0]
            if len(idx)==0:
                return np.full(n, 1.0/n)
            S = Sigma[np.ix_(idx, idx)]
            o = np.ones(len(idx))
            m = mu[idx]
            A = np.block([[2*lmbd*S, o[:,None]],[o[None,:], np.zeros((1,1))]])
            b = np.concatenate([m, np.array([1.0])])
            try:
                sol = np.linalg.solve(A, b)
                w_sub = sol[:len(idx)]
            except np.linalg.LinAlgError:
                w_sub = np.full(len(idx), 1.0/len(idx))
            w = np.zeros(n)
            w[idx] = w_sub
            return w

    best_w = np.full(n, 1.0/n)
    best_sr = -1e9
    lambdas = np.logspace(-3, 3, 31)
    for lmbd in lambdas:
        active = np.ones(n, dtype=bool)
        w = None
        for _ in range(n):
            w = solve_lambda(lmbd, active_mask=active)
            neg = w < 0
            if not neg.any():
                break
            worst = np.argmin(w)
            active[worst] = False
        if w is None:
            continue
        w = np.clip(w, 0, None)
        if w.sum() <= 0:
            continue
        w = w / w.sum()
        mu_p = float(mu @ w)
        vol_p = float(np.sqrt(w @ Sigma @ w))
        if vol_p <= 1e-8:
            continue
        sr = mu_p / vol_p
        if sr > best_sr:
            best_sr = sr
            best_w = w.copy()
    return best_w

def aggregate_and_apply(all_trades: pd.DataFrame, data_map: Dict[str, pd.DataFrame], bench_df: pd.DataFrame, cfg: Config):
    if all_trades.empty:
        return all_trades, pd.Series(dtype=float), {}

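    # Tie-break order within a date only; SELLs are still processed before BUYs
    # in the daily loop below, which filters by side explicitly.
    side_order = {"BUY": 0, "SELL": 1}
    all_trades = (all_trades
        .assign(
            date=pd.to_datetime(all_trades["date"]),  # normalize before sorting
            _sorder=all_trades["side"].map(side_order),
        )
        .sort_values(by=["date", "_sorder"], kind="stable")
        .drop(columns=["_sorder"])
        .reset_index(drop=True)
    )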

    equity_curve = []
    dates = sorted(all_trades["date"].unique().tolist())
    cash = cfg.initial_capital
    open_positions = {}
    completed_legs = []

    global APPLY_FEES
    APPLY_FEES = cfg.apply_fees

    def _get_close_on(tkr, dt):
        df = data_map.get(tkr)
        if df is None or df.empty:
            return np.nan
        if dt in df.index:
            return float(df.loc[dt, "Close"])
        prev = df[df.index <= dt]
        if prev.empty:
            return np.nan
        return float(prev["Close"].iloc[-1])

    if dates:
        seed_date = pd.to_datetime(dates[0]) - pd.Timedelta(days=1)
        equity_curve.append((seed_date, float(cash)))

    # Walk dates
    for dt in dates:
        day_trades = all_trades[all_trades["date"] == dt].copy()

        # SELLs first
        for _, tr in day_trades[day_trades["side"] == "SELL"].iterrows():
            tkr = tr["ticker"]
            price = float(tr["price"])
            pos = open_positions.get(tkr)
            if pos is None:
                continue
            shares = int(pos["shares"])
            turnover_sell = shares * price
            fee = calc_fees(0.0, turnover_sell)
            pnl = (price - pos["entry_px"]) * shares
            cash += (turnover_sell - fee)
            realized = pnl - fee - pos.get("buy_fee", 0.0)
            completed_legs.append({
                "ticker": tkr, "side": "SELL", "date": dt, "price": price,
                "shares": shares, "reason": tr.get("reason",""),
                "turnover": turnover_sell, "fees_inr": fee, "pnl_inr": realized,
                "rsi": tr.get("rsi", np.nan), "ADX": tr.get("ADX", np.nan),
                "+DI": tr.get("+DI", np.nan), "-DI": tr.get("-DI", np.nan),
                "ema_fast": tr.get("ema_fast", np.nan), "ema_slow": tr.get("ema_slow", np.nan),
                "sma50": tr.get("sma50", np.nan), "sma200": tr.get("sma200", np.nan),
                "macd_line": tr.get("macd_line", np.nan), "macd_signal": tr.get("macd_signal", np.nan), "macd_hist": tr.get("macd_hist", np.nan),
                "bb_mid": tr.get("bb_mid", np.nan), "bb_up": tr.get("bb_up", np.nan), "bb_low": tr.get("bb_low", np.nan),
                "close": tr.get("close", np.nan), "high_52w": tr.get("high_52w", np.nan),
                "volar": tr.get("volar", np.nan), "mvo_weight": np.nan, "alloc_inr": np.nan
            })
            log.info("Exit %-12s px=%8.2f sh=%6d reason=%s net=%.2f cash=%.2f",
                     tkr, price, shares, tr.get("reason",""), realized, cash)
            del open_positions[tkr]

        # BUY candidates
        buys_today = day_trades[day_trades["side"] == "BUY"].copy()

        # 52w proximity filter
        if not buys_today.empty:
            keep = []
            for _, rr in buys_today.iterrows():
                df = data_map.get(rr["ticker"])
                if df is None or df.empty or dt not in df.index:
                    continue
                close = float(df.loc[dt, "Close"])
                hist = df["Close"].loc[:dt]
                window = hist.iloc[-cfg.filter_52w_window:] if len(hist)>=cfg.filter_52w_window else hist
                high_52w = float(window.max())
                if high_52w>0 and close >= cfg.within_pct_of_52w_high * high_52w:
                    keep.append(rr)
            buys_today = pd.DataFrame(keep) if keep else pd.DataFrame(columns=buys_today.columns)

        # Exclude already-held tickers
        if not buys_today.empty:
            # .copy() so the "volar" column assignment below does not write to a slice.
            buys_today = buys_today[~buys_today["ticker"].isin(open_positions.keys())].copy()

        # Rank by VOLAᵣ
        if not buys_today.empty:
            tickers = buys_today["ticker"].tolist()
            volar_scores = compute_volar_scores(dt, tickers, data_map, bench_df, cfg.volar_lookback)
            buys_today["volar"] = buys_today["ticker"].map(volar_scores)
            buys_today = buys_today.sort_values("volar", ascending=False).reset_index(drop=True)

        slots = cfg.max_concurrent_positions - len(open_positions)
        selected = pd.DataFrame(columns=buys_today.columns)
        if slots > 0 and not buys_today.empty:
            selected = buys_today.head(min(cfg.top_k_daily, slots)).copy()

        if not selected.empty:
            log.info("Selected %d BUY candidates on %s:", selected.shape[0], dt.date())
            for i, rr in selected.reset_index(drop=True).iterrows():
                log.info("  %-12s volar=%6.2f rank=%d px=%8.2f", rr["ticker"], rr.get("volar",0.0), i+1, rr["price"])

            names = selected["ticker"].tolist()
            rets = []
            for t in names:
                df = data_map.get(t)
                ser = df["Close"].loc[:dt].pct_change().dropna().iloc[-cfg.volar_lookback:]
                rets.append(ser)
            R = pd.concat(rets, axis=1)
            R.columns = names
            R = R.dropna()
            if R.empty or R.shape[0] < max(20, int(0.4*cfg.volar_lookback)) or R.shape[1] == 0:
                weights = np.full(len(names), 1.0/len(names))
            else:
                mu = R.mean().values
                Sigma = R.cov().values
                weights = markowitz_long_only(mu, Sigma)

            deploy_cash = max(0.0, float(cash)) * float(cfg.deploy_cash_frac)

            if deploy_cash <= 0:
                log.info("No deployable cash (cap=%.0f%%) on %s", 100*cfg.deploy_cash_frac, dt.date())
            else:
                alloc = (weights / weights.sum()) * deploy_cash if weights.sum()>0 else np.full(len(names), deploy_cash/len(names))
                rank_map = {row["ticker"]: (idx+1) for idx, (_, row) in enumerate(selected.iterrows())}
                for w_amt, t in zip(alloc, names):
                    df_t = data_map[t]
                    price = float(df_t.loc[dt, "Close"] if dt in df_t.index else df_t["Close"].loc[:dt].iloc[-1])
                    shares = int(math.floor(w_amt / price))
                    if shares <= 0:
                        log.info("Skip BUY %-12s (alloc %.2f too small)", t, w_amt)
                        continue
                    turn = shares * price
                    fee = calc_fees(turn, 0.0)
                    total_cost = turn + fee
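                    if total_cost > cash:
                        # Shrink the order to fit available cash, then re-price fees on the smaller turnover.
                        shares = int(math.floor((cash - fee) / price))
                        if shares <= 0:
                            log.info("Skip BUY %-12s due to cash/fees", t)
                            continue
                        turn = shares * price
                        fee = calc_fees(turn, 0.0)
                        total_cost = turn + fee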
                    cash -= total_cost
                    open_positions[t] = {"entry_date": dt, "entry_px": price, "shares": shares, "buy_fee": fee, "entry_reason": "entry"}

                    row_sel = selected[selected["ticker"]==t].iloc[0]
                    volar_val = float(row_sel.get("volar", np.nan))
                    rank_pos = rank_map.get(t, np.nan)
                    high_52w = float(row_sel.get("high_52w", np.nan))
                    close_val = float(row_sel.get("close", np.nan))
                    pct_52w = (close_val / high_52w) if (high_52w and high_52w>0) else np.nan
                    mvo_weight_today = (w_amt / deploy_cash) if deploy_cash > 0 else 0.0
                    sig_reason = row_sel.get("signal_reason", "EMA10>EMA20 & 2xGreenStrict (≥1 confirm)")
                    reason_text = (
                        f"{sig_reason}; 52w%={pct_52w:.1%} (>= {cfg.within_pct_of_52w_high:.0%}); "
                        f"VOLAR rank {int(rank_pos)}/{len(names)} (VOLAR={volar_val:.2f}); "
                        f"MVO weight={mvo_weight_today:.1%} of capped cash ({100*cfg.deploy_cash_frac:.0f}% of available)"
                    )
                    completed_legs.append({
                        "ticker": t, "side": "BUY", "date": dt, "price": price,
                        "shares": shares, "reason": reason_text,
                        "turnover": turn, "fees_inr": fee, "pnl_inr": 0.0,
                        "rsi": float(row_sel.get("rsi", np.nan)), "ADX": float(row_sel.get("ADX", np.nan)),
                        "+DI": float(row_sel.get("+DI", np.nan)), "-DI": float(row_sel.get("-DI", np.nan)),
                        "ema_fast": float(row_sel.get("ema_fast", np.nan)), "ema_slow": float(row_sel.get("ema_slow", np.nan)),
                        "sma50": float(row_sel.get("sma50", np.nan)), "sma200": float(row_sel.get("sma200", np.nan)),
                        "macd_line": float(row_sel.get("macd_line", np.nan)), "macd_signal": float(row_sel.get("macd_signal", np.nan)), "macd_hist": float(row_sel.get("macd_hist", np.nan)),
                        "bb_mid": float(row_sel.get("bb_mid", np.nan)), "bb_up": float(row_sel.get("bb_up", np.nan)), "bb_low": float(row_sel.get("bb_low", np.nan)),
                        "close": close_val, "high_52w": high_52w,
                        "volar": volar_val, "mvo_weight": float(mvo_weight_today), "alloc_inr": float(w_amt)
                    })
                    log.info("BUY %-12s px=%8.2f sh=%6d fee=%.2f cash=%.2f :: %s",
                             t, price, shares, fee, cash, reason_text)

        # MTM valuation
        mtm = 0.0
        for _tkr, pos in open_positions.items():
            px = _get_close_on(_tkr, dt)
            if not np.isnan(px):
                mtm += pos["shares"] * px
        total_equity = cash + mtm
        equity_curve.append((dt, float(total_equity)))

    eq_ser = pd.Series([e for _, e in equity_curve], index=[d for d, _ in equity_curve])
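    legs_df = pd.DataFrame(completed_legs)
    if not legs_df.empty:
        # Sorting an empty frame by missing columns raises KeyError, so sort only when legs exist.
        legs_df = legs_df.sort_values(["date", "ticker", "side"]).reset_index(drop=True)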

    # Roundtrips
    roundtrips = []
    by_tkr_open = {}
    for _, leg in legs_df.iterrows():
        tkr = leg["ticker"]
        if leg["side"] == "BUY":
            by_tkr_open[tkr] = leg
        else:
            buy = by_tkr_open.pop(tkr, None)
            if buy is None:
                continue
            fees_total = float(buy.get("fees_inr", 0.0) + leg.get("fees_inr", 0.0))
            gross_pnl = (leg["price"] - buy["price"]) * buy["shares"]
            net_pnl   = gross_pnl - fees_total
            ret_pct   = (leg["price"] / buy["price"] - 1.0) * 100.0
            days_held = (pd.to_datetime(leg["date"]) - pd.to_datetime(buy["date"])).days
            roundtrips.append({
                "ticker": tkr,
                "entry_date": pd.to_datetime(buy["date"]),
                "entry_price": float(buy["price"]),
                "exit_date": pd.to_datetime(leg["date"]),
                "exit_price": float(leg["price"]),
                "days_held": int(days_held),
                "shares": int(buy["shares"]),
                "entry_reason": buy.get("reason",""),
                "exit_reason": leg.get("reason",""),
                "gross_pnl_inr": float(gross_pnl),
                "fees_total_inr": float(fees_total),
                "net_pnl_inr": float(net_pnl),
                "return_pct": float(ret_pct),
                "rsi_entry": float(buy.get("rsi", np.nan)),
                "adx_entry": float(buy.get("ADX", np.nan)),
                "pdi_entry": float(buy.get("+DI", np.nan)),
                "mdi_entry": float(buy.get("-DI", np.nan)),
                "ema_fast_entry": float(buy.get("ema_fast", np.nan)),
                "ema_slow_entry": float(buy.get("ema_slow", np.nan)),
                "sma50_entry": float(buy.get("sma50", np.nan)),
                "sma200_entry": float(buy.get("sma200", np.nan)),
                "macd_line_entry": float(buy.get("macd_line", np.nan)),
                "macd_signal_entry": float(buy.get("macd_signal", np.nan)),
                "macd_hist_entry": float(buy.get("macd_hist", np.nan)),
                "bb_mid_entry": float(buy.get("bb_mid", np.nan)),
                "bb_up_entry": float(buy.get("bb_up", np.nan)),
                "bb_low_entry": float(buy.get("bb_low", np.nan)),
                "close_entry": float(buy.get("close", np.nan)),
                "high_52w_entry": float(buy.get("high_52w", np.nan)),
                "volar_entry": float(buy.get("volar", np.nan)),
                "mvo_weight_entry": float(buy.get("mvo_weight", np.nan)),
                "alloc_inr_entry": float(buy.get("alloc_inr", np.nan))
            })
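    trips_df = pd.DataFrame(roundtrips)
    if not trips_df.empty:
        # Same guard as for legs_df: no completed round trips means no columns to sort on.
        trips_df = trips_df.sort_values(["entry_date", "ticker"]).reset_index(drop=True)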

    metrics = compute_metrics(eq_ser, legs_df)
    return legs_df, trips_df, eq_ser, metrics

def compute_metrics(equity: pd.Series, legs_df: pd.DataFrame):
    out = {}
    if equity is None or equity.empty:
        return out
    eq = equity.dropna()
    daily_ret = eq.pct_change().fillna(0.0)

    days = (eq.index[-1] - eq.index[0]).days or 1
    years = days / 365.25
    cagr = (eq.iloc[-1] / eq.iloc[0]) ** (1/years) - 1 if years > 0 else 0.0

    if daily_ret.std(ddof=0) > 0:
        sharpe = (daily_ret.mean() / daily_ret.std(ddof=0)) * np.sqrt(252)
    else:
        sharpe = 0.0

    cummax = eq.cummax()
    dd = (eq - cummax) / cummax
    max_dd = dd.min()

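    # Win rate counts SELL legs with positive realized PnL; tolerate a missing or empty legs frame.
    if legs_df is not None and not legs_df.empty and "side" in legs_df.columns:
        sells = legs_df[legs_df["side"] == "SELL"]
    else:
        sells = pd.DataFrame()
    n_sells = sells.shape[0]
    wins = int((sells["pnl_inr"].astype(float) > 0).sum()) if n_sells > 0 and "pnl_inr" in sells.columns else 0
    win_rate = (wins / n_sells) * 100.0 if n_sells > 0 else 0.0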

    out.update({
        "start_equity_inr": float(eq.iloc[0]),
        "final_equity_inr": float(eq.iloc[-1]),
        "cagr_pct": float(cagr * 100),
        "sharpe": float(sharpe),
        "max_drawdown_pct": float(max_dd * 100),
        "win_rate_pct": float(win_rate),
        "n_trades": int(n_sells),
    })
    return out

def plot_equity(equity: pd.Series, out_path: str):
    if equity is None or equity.empty:
        return
    try:
        import matplotlib.pyplot as plt
        plt.figure(figsize=(10,5))
        plt.plot(equity.index, equity.values)
        plt.title("Equity Curve")
        plt.xlabel("Date")
        plt.ylabel("Equity (INR)")
        plt.tight_layout()
        plt.savefig(out_path)
        plt.close()
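    except Exception as exc:
        # Plotting is optional; report the failure instead of hiding it.
        log.warning("Equity plot skipped: %s", exc)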

def backtest(cfg: Config):
    ensure_dirs(cfg.cache_dir, cfg.out_dir)
    log.info("Universe: loading static symbols...")
    symbols = load_static_symbols(cfg.static_symbols, cfg.static_symbols_path)
    log.info("Loaded %d symbols.", len(symbols))

    log.info("Data: fetching OHLCV from yfinance (adjusted)...")
    data_map = fetch_prices(symbols, cfg.start_date, cfg.end_date, cfg.cache_dir)
    log.info("Downloaded %d symbols with data.", len(data_map))

    bench_tkr, bench_df = pick_benchmark(cfg.benchmark_try, cfg.start_date, cfg.end_date, cfg.cache_dir)
    log.info("Benchmark selected: %s", bench_tkr)

    log.info("Signals: generating new EMA10/EMA20 + 2xGreen + ≥1 confirm entries...")
    all_trades = []
    for i, tkr in enumerate(symbols, 1):
        df = data_map.get(tkr)
        if df is None or df.empty:
            continue
        tr, _ = simulate_ticker(tkr, df, cfg)
        if not tr.empty:
            all_trades.append(tr)
        if i % 50 == 0:
            log.info("  processed %d/%d tickers...", i, len(symbols))

    if not all_trades:
        log.warning("No signals generated; check thresholds or universe.")
        return None, None, None, {}
    all_trades = pd.concat(all_trades, ignore_index=True)

    log.info("Portfolio: cap daily deploy to %.0f%% of cash; 52w>=%.0f%% high; top-%d by VOLAᵣ; MVO; max %d positions.",
             cfg.deploy_cash_frac*100, cfg.within_pct_of_52w_high*100, cfg.top_k_daily, cfg.max_concurrent_positions)
    legs_df, trips_df, equity, metrics = aggregate_and_apply(all_trades, data_map, bench_df, cfg)

    stamp = pd.Timestamp.today(tz="Asia/Kolkata").strftime("%Y%m%d_%H%M%S")
    legs_path = os.path.join(cfg.out_dir, f"trades_legs_{stamp}.csv")
    trips_path = os.path.join(cfg.out_dir, f"trades_roundtrips_{stamp}.csv")
    equity_path = os.path.join(cfg.out_dir, f"equity_{stamp}.csv")
    metrics_path = os.path.join(cfg.out_dir, f"metrics_{stamp}.json")
    eq_plot_path = os.path.join(cfg.out_dir, f"equity_{stamp}.png")

    if legs_df is not None:
        legs_df.to_csv(legs_path, index=False)
    if trips_df is not None:
        trips_df.to_csv(trips_path, index=False)
    if equity is not None:
        pd.DataFrame({"date": equity.index, "equity": equity.values}).to_csv(equity_path, index=False)
    with open(metrics_path, "w") as f:
        json.dump(metrics, f, indent=2)

    if cfg.plot and equity is not None:
        plot_equity(equity, eq_plot_path)

    log.info("=== METRICS ===\n%s", json.dumps(metrics, indent=2))
    log.info("Files written:\n  %s\n  %s\n  %s\n  %s", legs_path, trips_path, equity_path, metrics_path)
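    if cfg.plot:
        log.info("  %s", eq_plot_path)
    # Mirror the early-exit return shape so callers always get the same 4-tuple.
    return legs_df, trips_df, equity, metrics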

def main():
    global APPLY_FEES
    APPLY_FEES = bool(CFG.apply_fees)
    # Example: set your universe here or via file path.
    # CFG.static_symbols = ['TCS.NS','HDFCBANK.NS','RELIANCE.NS','INFY.NS','ICICIBANK.NS']  # put your long list back here
    CFG.static_symbols_path = "nifty500.txt"
    backtest(CFG)

if __name__ == "__main__":
    main()


2025-10-08 17:13:36 | INFO | Universe: loading static symbols...
2025-10-08 17:13:36 | INFO | Loaded 500 symbols.
2025-10-08 17:13:36 | INFO | Data: fetching OHLCV from yfinance (adjusted)...
2025-10-08 17:13:40 | ERROR | 
1 Failed download:
2025-10-08 17:13:40 | ERROR | ['ABLBL.NS']: YFPricesMissingError('possibly delisted; no price data found  (1d 2015-01-01 -> 2025-01-01) (Yahoo error = "Data doesn\'t exist for startDate = 1420050600, endDate = 1735669800")')
2025-10-08 17:13:52 | ERROR | 
1 Failed download:
2025-10-08 17:13:52 | ERROR | ['AEGISVOPAK.NS']: YFPricesMissingError('possibly delisted; no price data found  (1d 2015-01-01 -> 2025-01-01) (Yahoo error = "Data doesn\'t exist for startDate = 1420050600, endDate = 1735669800")')
2025-10-08 17:13:53 | ERROR | 
1 Failed download:
2025-10-08 17:13:53 | ERROR | ['AGARWALEYE.NS']: YFPricesMissingError('possibly delisted; no price data found  (1d 2015-01-01 -> 2025-01-01) (Yahoo error = "Data doesn\'t exist for startDate = 1420050600

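As a standalone check of the formulas in `compute_metrics` from the cell above, the sketch below recomputes CAGR, annualized Sharpe (zero risk-free rate, population standard deviation) and maximum drawdown on a four-point toy equity curve. The function name `summarize_equity` and the numbers are made up for illustration; they are not part of the backtest.

```python
import numpy as np
import pandas as pd


def summarize_equity(equity: pd.Series) -> dict:
    """CAGR, annualized Sharpe (risk-free rate assumed 0), and max drawdown."""
    eq = equity.dropna()
    daily_ret = eq.pct_change().fillna(0.0)

    # Calendar days between first and last observation, converted to years.
    days = (eq.index[-1] - eq.index[0]).days or 1
    years = days / 365.25
    cagr = (eq.iloc[-1] / eq.iloc[0]) ** (1 / years) - 1

    # Mean over population std of per-bar returns, scaled by sqrt(252).
    vol = daily_ret.std(ddof=0)
    sharpe = (daily_ret.mean() / vol) * np.sqrt(252) if vol > 0 else 0.0

    # Drawdown is the fractional distance below the running peak.
    drawdown = eq / eq.cummax() - 1.0
    return {
        "cagr": float(cagr),
        "sharpe": float(sharpe),
        "max_drawdown": float(drawdown.min()),
    }


# Toy equity curve (hypothetical values): peak at 120, trough at 90, ends at 110.
toy = pd.Series(
    [100.0, 120.0, 90.0, 110.0],
    index=pd.to_datetime(["2020-01-01", "2020-07-01", "2021-01-01", "2022-01-01"]),
)
stats = summarize_equity(toy)
print(stats)
```

The trough of 90 against the running peak of 120 gives a maximum drawdown of 25%, which is a quick way to sanity-check the sign convention (drawdowns are reported as negative fractions). Note that the sqrt(252) scaling assumes one observation per trading day, so the Sharpe value on this sparse toy series is only a formula check.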
In [4]:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
===============================================================
Strategy & Pipeline — Quick Reference
===============================================================

ENTRY (long-only)
-----------------
Signal day (D0):
  • Trend: Close > EMA50.
  • Pattern: "Pivot Reclaim". The D-1 candle's Close was BELOW a pivot level
    (PP, S1, or S2, each computed from D-1's High/Low/Close) AND the D0
    candle's Close is AT or ABOVE that same level.
  • Confirmation: At least ONE of the following is true on D0:
      - RSI > 50
      - MACD line > signal (hist > 0)
      - ADX > 20 AND +DI > -DI
      - Close > SMA50
      - Close > Bollinger middle band

Execution day (D1):
  • If configured (entry_on_next_open=True), the buy is placed at D1's Open.
  • Stop & Target are set off the executed entry price:
      SL = entry_price * (1 - stop_loss_pct)
      TP = entry_price * (1 + target_pct)

EXIT
----
Checked every bar after entry:
  1) Hard StopLoss (HSL): If next bar’s Low ≤ SL → exit (filled=SL).
  2) TakeProfit (TP):     If next bar’s High ≥ TP → exit (filled=TP).
     (If both SL and TP touch on the same bar, we assume TP priority.)
  3) Trend reverse: If Close < EMA50 on the signal bar, exit next open.

Exit reasons are verbose in CSV:
  • "StopLoss hit (5.0%). filled=..., SL=..., ret=...%, held=...d"
  • "TakeProfit hit (10.0%). filled=..., TP=..., ret=...%, held=...d"
  • "Trend reverse: Close below EMA50 (Close=..., EMA50=...);
     filled=..., ret=...%, held=...d"

PORTFOLIO PIPELINE (daily loop)
-------------------------------
1) Aggregate all per-ticker BUY 'candidate' legs dated for D1 from signals on D0.
2) Optional filter: Close >= within_pct_of_52w_high × 52-week high (default 50%).
3) Exclude tickers we already hold.
4) Rank survivors by VOLAᵣ (excess return vs benchmark / own volatility; 252d lookback).
5) Capacity gate:
   • slots = max_concurrent_positions - current_open_positions
   • pick = min(top_k_daily, slots)
6) Sizing via Markowitz (MVO), long-only:
   • Compute mu (means) and Sigma (cov) on 252d daily returns on D0.
   • Solve for weights that maximize mean/vol (with long-only projection).
   • Cap *deployable cash* to deploy_cash_frac × current cash (e.g., 25%).
   • Allocate that capped bucket by MVO weights across today's picks.
7) Execute BUYs at D1 (close or open per config), deduct fees, open positions.
8) Process SELLs before BUYs each day; update equity curve after all legs.
"""

import os, json, math, warnings, logging
from dataclasses import dataclass
from typing import Dict, List, Tuple, Optional

import numpy as np
import pandas as pd

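try:
    import yfinance as yf
    import matplotlib.pyplot as plt
except Exception as exc:
    # Keep the import soft, but say so: downloads need yfinance and plotting needs matplotlib.
    warnings.warn(f"Optional dependency failed to import: {exc}")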

warnings.filterwarnings("ignore", category=FutureWarning)

# =========================
# LOGGING
# =========================
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
log = logging.getLogger("pivot_reclaim_ema50_confirm_v1")

# =========================
# CONFIG
# =========================
@dataclass
class Config:
    # Data
    start_date: str = "2015-01-01"
    end_date: str   = "2025-01-01"
    static_symbols: Optional[List[str]] = None
    static_symbols_path: Optional[str] = None
    cache_dir: str = "cache"
    out_dir: str   = "outputs"
    plot: bool     = True

    # --- Strategy params ---
    ema_trend_len: int = 50       # EMA50 for main trend
    sma_confirm_len: int = 50     # SMA50 for confirm
    sma_trend_len: int = 200      # SMA200 (kept for breadth/context if needed)
    rsi_len: int = 14
    macd_fast: int = 12
    macd_slow: int = 26
    macd_signal: int = 9
    bb_len: int = 20
    bb_std: float = 2.0
    adx_len: int = 14
    adx_min: float = 20.0

    # Exits
    stop_loss_pct: float = 0.10
    target_pct: float    = 0.10

    # Portfolio
    apply_fees: bool    = True
    initial_capital: float = 500_000.0
    max_concurrent_positions: int = 5
    deploy_cash_frac: float = 0.30
    entry_on_next_open: bool = True
    exit_on_next_open: bool  = True

    # Candidate ranking & filters
    benchmark_try: Tuple[str,...] = ("^CNX500","^CRSLDX","^NSE500","^NIFTY500","^BSE500","^NSEI")
    volar_lookback: int = 252
    filter_52w_window: int = 252
    within_pct_of_52w_high: float = 0.50
    top_k_daily: int = 300

    # Liquidity guards (OFF by default)
    enable_basic_liquidity: bool = False
    min_price_inr: float = 50.0
    min_avg_vol_20d: float = 50_000.0

CFG = Config()

# =========================
# FEES (per user spec)
# =========================
APPLY_FEES = True

def calc_fees(turnover_buy: float, turnover_sell: float) -> float:
    if not APPLY_FEES:
        return 0.0
    BROKER_PCT = 0.001
    BROKER_MIN = 5.0
    BROKER_CAP = 20.0
    STT_PCT = 0.001
    STAMP_BUY_PCT = 0.00015
    EXCH_PCT = 0.0000297
    SEBI_PCT = 0.000001
    IPFT_PCT = 0.000001
    GST_PCT = 0.18
    DP_SELL = 20.0 if turnover_sell >= 100 else 0.0

    def _broker(turnover):
        if turnover <= 0:
            return 0.0
        fee = turnover * BROKER_PCT
        fee = max(BROKER_MIN, min(fee, BROKER_CAP))
        return fee

    br_buy  = _broker(turnover_buy)
    br_sell = _broker(turnover_sell)
    stt   = STT_PCT * (turnover_buy + turnover_sell)
    stamp = STAMP_BUY_PCT * turnover_buy
    exch  = EXCH_PCT * (turnover_buy + turnover_sell)
    sebi  = SEBI_PCT * (turnover_buy + turnover_sell)
    ipft  = IPFT_PCT * (turnover_buy + turnover_sell)
    dp    = DP_SELL
    gst_base = br_buy + br_sell + dp + exch + sebi + ipft
    gst   = GST_PCT * gst_base
    return float((br_buy + br_sell) + stt + stamp + exch + sebi + ipft + dp + gst)

# =========================
# Helpers
# =========================
def ensure_dirs(*paths):
    for p in paths:
        os.makedirs(p, exist_ok=True)

def today_str():
    return pd.Timestamp.today(tz="Asia/Kolkata").strftime("%Y-%m-%d")

def load_static_symbols(static_symbols: Optional[List[str]], static_symbols_path: Optional[str]) -> List[str]:
    syms: List[str] = []
    if static_symbols and len(static_symbols) > 0:
        syms = list(static_symbols)
    elif static_symbols_path and os.path.exists(static_symbols_path):
        with open(static_symbols_path, "r") as f:
            syms = [line.strip() for line in f if line.strip()]
    else:
        raise ValueError(
            "Provide CFG.static_symbols=[...] ('.NS' suffixes) or set CFG.static_symbols_path "
            "to a file containing one symbol per line."
        )
    out = []
    for s in syms:
        s = s.strip().upper()
        if not s.endswith(".NS"):
            s = f"{s}.NS"
        out.append(s)
    seen = set()
    uniq = []
    for s in out:
        if s not in seen:
            uniq.append(s)
            seen.add(s)
    return uniq

def fetch_prices(tickers: List[str], start: str, end: Optional[str], cache_dir: str) -> Dict[str, pd.DataFrame]:
    ensure_dirs(cache_dir)
    data = {}
    end = end or today_str()
    for ticker in tickers:
        cache_path = os.path.join(cache_dir, f"{ticker.replace('^', '_')}.parquet")
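        if os.path.exists(cache_path):
            try:
                df = pd.read_parquet(cache_path)
                # yfinance treats `end` as exclusive and the last bar can precede it by a
                # weekend or holiday, so accept a cache that ends within 5 days of `end`.
                fresh_from = (pd.Timestamp(end) - pd.Timedelta(days=5)).strftime("%Y-%m-%d")
                if len(df) and pd.to_datetime(df.index[-1]).strftime("%Y-%m-%d") >= fresh_from:
                    data[ticker] = df
                    continue
            except Exception:
                pass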
        try:
            df = yf.download(ticker, start=start, end=end, auto_adjust=True, progress=False, multi_level_index=False)
            if df is None or df.empty:
                continue
            df = df.rename(columns=str.title)  # Open, High, Low, Close, Volume
            df = df[['Open', 'High', 'Low', 'Close', 'Volume']].dropna()
            df.index.name = "date"
            df.to_parquet(cache_path)
            data[ticker] = df
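        except Exception as exc:
            # Skip this ticker, but leave a trace of why it is missing from the universe.
            log.warning("Download failed for %s: %s", ticker, exc)
            continue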
    return data

def ema(series: pd.Series, span: int) -> pd.Series:
    return series.ewm(span=span, adjust=False, min_periods=span).mean()

def sma(series: pd.Series, window: int) -> pd.Series:
    return series.rolling(window).mean()

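def rsi(series: pd.Series, length: int = 14) -> pd.Series:
    # RSI on simple rolling means of gains and losses (not Wilder's smoothing).
    delta = series.diff()
    gain = (delta.where(delta > 0, 0.0)).rolling(length).mean()
    loss = (-delta.where(delta < 0, 0.0)).rolling(length).mean()
    rs = gain / loss.replace(0.0, np.nan)
    out = 100 - (100 / (1 + rs))
    # A window with gains and no losses is maximally strong (100), not neutral.
    out = out.mask((loss == 0.0) & (gain > 0.0), 100.0)
    # Warm-up bars and perfectly flat windows stay at the neutral value.
    return out.fillna(50.0)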

def macd(series: pd.Series, fast=12, slow=26, signal=9):
    ema_fast = series.ewm(span=fast, adjust=False, min_periods=slow).mean()
    ema_slow = series.ewm(span=slow, adjust=False, min_periods=slow).mean()
    macd_line = ema_fast - ema_slow
    macd_signal = macd_line.ewm(span=signal, adjust=False, min_periods=signal).mean()
    hist = macd_line - macd_signal
    return macd_line, macd_signal, hist

def _true_range(high: pd.Series, low: pd.Series, prev_close: pd.Series) -> pd.Series:
    return pd.concat([
        (high - low).abs(),
        (high - prev_close).abs(),
        (low - prev_close).abs()
    ], axis=1).max(axis=1)

def dmi_adx(high: pd.Series, low: pd.Series, close: pd.Series, length: int = 14):
    prev_high = high.shift(1)
    prev_low  = low.shift(1)
    prev_close = close.shift(1)

    up_move   = high - prev_high
    down_move = prev_low - low
    plus_dm  = up_move.where((up_move > down_move) & (up_move > 0), 0.0)
    minus_dm = down_move.where((down_move > up_move) & (down_move > 0), 0.0)

    tr = _true_range(high, low, prev_close)
    alpha = 1.0 / length
    atr = tr.ewm(alpha=alpha, adjust=False, min_periods=length).mean()
    plus_di  = 100 * (plus_dm.ewm(alpha=alpha, adjust=False, min_periods=length).mean() / atr)
    minus_di = 100 * (minus_dm.ewm(alpha=alpha, adjust=False, min_periods=length).mean() / atr)

    dx = 100 * (plus_di - minus_di).abs() / (plus_di + minus_di).replace(0, np.nan)
    adx_series = dx.ewm(alpha=alpha, adjust=False, min_periods=length).mean()
    return plus_di, minus_di, adx_series

def bollinger(series: pd.Series, length=20, nstd=2.0):
    mid = series.rolling(length).mean()
    sd  = series.rolling(length).std(ddof=0)
    upper = mid + nstd * sd
    lower = mid - nstd * sd
    return mid, upper, lower

def compute_indicators(df: pd.DataFrame, cfg: Config) -> pd.DataFrame:
    out = df.copy()
    
    # --- NEW: Add EMA50 for trend
    out["ema50"] = ema(out["Close"], cfg.ema_trend_len)
    
    # --- NEW: Add Standard Pivot Points ---
    prev_high = out['High'].shift(1)
    prev_low = out['Low'].shift(1)
    prev_close = out['Close'].shift(1)
    out['PP'] = (prev_high + prev_low + prev_close) / 3
    out['R1'] = (2 * out['PP']) - prev_low
    out['S1'] = (2 * out['PP']) - prev_high
    out['R2'] = out['PP'] + (prev_high - prev_low)
    out['S2'] = out['PP'] - (prev_high - prev_low)
    # --- End NEW ---

    # Keep all other indicators for the 'confirmation_any' function
    out["sma50"]    = sma(out["Close"], cfg.sma_confirm_len)
    out["sma200"]   = sma(out["Close"], cfg.sma_trend_len)
    out["rsi"]      = rsi(out["Close"], cfg.rsi_len)
    macd_line, macd_signal, macd_hist = macd(out["Close"], cfg.macd_fast, cfg.macd_slow, cfg.macd_signal)
    out["macd_line"] = macd_line
    out["macd_signal"] = macd_signal
    out["macd_hist"] = macd_hist
    out["bb_mid"], out["bb_up"], out["bb_low"] = bollinger(out["Close"], cfg.bb_len, cfg.bb_std)
    out["+DI"], out["-DI"], out["ADX"] = dmi_adx(out["High"], out["Low"], out["Close"], cfg.adx_len)
    
    out["avg_vol_20"] = out["Volume"].rolling(20).mean()
    out["high_52w"] = out["Close"].rolling(cfg.filter_52w_window).max()
    return out.dropna()

def basic_liquidity_ok(row: pd.Series, cfg: Config) -> bool:
    if not cfg.enable_basic_liquidity:
        return True
    if row["Close"] < cfg.min_price_inr:
        return False
    if row["avg_vol_20"] < cfg.min_avg_vol_20d:
        return False
    return True

# ---------- NEW ENTRY/EXIT LOGIC ----------
# (two_green_strict function removed as it's no longer used)

def confirmation_any(row: pd.Series, cfg: Config) -> Tuple[bool, str]:
    # This function is UNCHANGED. It serves as our confirmation.
    checks = []
    # RSI > 50
    checks.append(("RSI>50", row["rsi"] > 50.0))
    # MACD line > signal (hist>0)
    checks.append(("MACD>Signal", row["macd_hist"] > 0.0))
    # ADX>20 and +DI > -DI
    checks.append(("ADX>20 & +DI>-DI", (row["ADX"] > cfg.adx_min) and (row["+DI"] > row["-DI"])))
    # Close > SMA50
    checks.append(("Close>SMA50", row["Close"] > row["sma50"]))
    # Close > Bollinger Mid
    checks.append(("Close>BBmid", row["Close"] > row["bb_mid"]))

    passed = [name for name, ok in checks if ok]
    return (len(passed) >= 1, ", ".join(passed) if passed else "none")

def simulate_ticker(ticker: str, df: pd.DataFrame, cfg: Config):
    d = compute_indicators(df, cfg).copy()
    
    # --- NEW: Updated columns
    cols = ["ticker","side","date","price","shares","reason","signal_reason","score",
            "rsi","ADX","+DI","-DI",
            "ema50", "PP", "S1", "S2", # <-- NEW
            "sma50","sma200",
            "macd_line","macd_signal","macd_hist","bb_mid","bb_up","bb_low",
            "close","high_52w"]
            
    if d.empty or len(d) < 60:
        return pd.DataFrame(columns=cols), pd.Series(dtype=float)

    in_pos = False
    entry_px = stop_px = tgt_px = 0.0
    entry_dt = None
    trades = []

    idx = list(d.index)

    for i in range(1, len(idx)-1):
        dt = idx[i]
        nxt = idx[i+1]
        prev_dt = idx[i-1]

        row = d.loc[dt]
        prev_row = d.loc[prev_dt]
        nxt_row = d.loc[nxt]

        # --- NEW: Trend reverse condition for exit
        trend_reverse_now = row["Close"] < row["ema50"]

        if not in_pos:
            # --- NEW: Replaced entry logic ---
            # 1. Trend: Close > EMA50
            trend_ok = row["Close"] > row["ema50"]
            
            # 2. Pattern: "Pivot Reclaim" (was below, now above)
            reclaim_s1 = (prev_row["Close"] < row["S1"]) and (row["Close"] >= row["S1"])
            reclaim_s2 = (prev_row["Close"] < row["S2"]) and (row["Close"] >= row["S2"])
            reclaim_pp = (prev_row["Close"] < row["PP"]) and (row["Close"] >= row["PP"])
            pivot_ok = reclaim_s1 or reclaim_s2 or reclaim_pp
            
            # 3. Confirmation: Use existing function
            confirm_ok, confirm_str = confirmation_any(row, cfg)

            if trend_ok and pivot_ok and confirm_ok and basic_liquidity_ok(row, cfg):
                # --- End NEW logic ---
                
                # Execute at next open (or current close if configured)
                px = float(nxt_row["Open"] if cfg.entry_on_next_open else row["Close"])
                
                # --- NEW: Updated signal reason
                reclaimed_level = "S1" if reclaim_s1 else ("S2" if reclaim_s2 else "PP")
                sig_reason = f"Close>EMA50 & PivotReclaim ({reclaimed_level}); confirms: {confirm_str}"
                
                score = 1.0  # placeholder; downstream selection uses VOLAᵣ
                
                # --- NEW: Updated trade dict
                trades.append({
                    "ticker": ticker, "side": "BUY", "date": nxt if cfg.entry_on_next_open else dt,
                    "price": px, "shares": 0, "reason": "candidate",
                    "signal_reason": sig_reason, "score": float(score),
                    "rsi": float(row["rsi"]), "ADX": float(row["ADX"]),
                    "+DI": float(row["+DI"]), "-DI": float(row["-DI"]),
                    "ema50": float(row["ema50"]), # <-- NEW
                    "PP": float(row["PP"]), "S1": float(row["S1"]), "S2": float(row["S2"]), # <-- NEW
                    "sma50": float(row["sma50"]), "sma200": float(row["sma200"]),
                    "macd_line": float(row["macd_line"]), "macd_signal": float(row["macd_signal"]), "macd_hist": float(row["macd_hist"]),
                    "bb_mid": float(row["bb_mid"]), "bb_up": float(row["bb_up"]), "bb_low": float(row["bb_low"]),
                    "close": float(row["Close"]), "high_52w": float(row["high_52w"])
                })
                in_pos = True
                entry_px = px
                entry_dt = nxt if cfg.entry_on_next_open else dt
                stop_px = entry_px * (1 - cfg.stop_loss_pct)
                tgt_px  = entry_px * (1 + cfg.target_pct)

        else:
            # EXIT rules: HSL / TP / Trend reverse
            hit = None
            exec_date = nxt if cfg.exit_on_next_open else dt

            # SL/TP are tested against the next bar, so those fills are always dated `nxt`.
            # TP is checked first: when both levels are touched on one bar, TP wins (optimistic).
            if nxt_row["High"] >= tgt_px:
                hit, exec_price, exec_date = "target", float(tgt_px), nxt
            elif nxt_row["Low"] <= stop_px:
                hit, exec_price, exec_date = "stop", float(stop_px), nxt
            elif trend_reverse_now:
                exec_price = float(nxt_row["Open"] if cfg.exit_on_next_open else row["Close"])
                hit = "trend_reverse"

            if hit is not None:
                pnl_pct = float((exec_price / entry_px - 1.0) * 100.0) if entry_px else float("nan")
                held_days = int((pd.to_datetime(exec_date) - pd.to_datetime(entry_dt)).days) if entry_dt else 0

                if hit == "stop":
                    exit_reason = (
                        f"StopLoss hit ({cfg.stop_loss_pct*100:.1f}%). "
                        f"filled={exec_price:.2f}, SL={stop_px:.2f}, ret={pnl_pct:.2f}%, held={held_days}d"
                    )
                elif hit == "target":
                    exit_reason = (
                        f"TakeProfit hit ({cfg.target_pct*100:.1f}%). "
                        f"filled={exec_price:.2f}, TP={tgt_px:.2f}, ret={pnl_pct:.2f}%, held={held_days}d"
                    )
                else:  # trend_reverse
                    # --- NEW: Updated trend reverse exit reason
                    exit_reason = (
                        f"Trend reverse: Close below EMA50 "
                        f"(Close={row['Close']:.2f}, EMA50={row['ema50']:.2f}); "
                        f"filled={exec_price:.2f}, ret={pnl_pct:.2f}%, held={held_days}d"
                    )
                
                # --- NEW: Updated trade dict
                trades.append({
                    "ticker": ticker, "side": "SELL", "date": exec_date,
                    "price": float(exec_price), "shares": 0, "reason": exit_reason,
                    "signal_reason": "", "score": np.nan,
                    "rsi": float(row["rsi"]), "ADX": float(row["ADX"]),
                    "+DI": float(row["+DI"]), "-DI": float(row["-DI"]),
                    "ema50": float(row["ema50"]), # <-- NEW
                    "PP": np.nan, "S1": np.nan, "S2": np.nan, # <-- NEW (nan for exits)
                    "sma50": float(row["sma50"]), "sma200": float(row["sma200"]),
                    "macd_line": float(row["macd_line"]), "macd_signal": float(row["macd_signal"]), "macd_hist": float(row["macd_hist"]),
                    "bb_mid": float(row["bb_mid"]), "bb_up": float(row["bb_up"]), "bb_low": float(row["bb_low"]),
                    "close": float(row["Close"]), "high_52w": float(row["high_52w"])
                })
                in_pos = False
                entry_px = stop_px = tgt_px = 0.0
                entry_dt = None

    # Force close open pos at last bar
    if in_pos:
        last_dt = d.index[-1]
        row = d.loc[last_dt]
        exec_price = float(row["Close"])
        pnl_pct = float((exec_price / entry_px - 1.0) * 100.0) if entry_px else float("nan")
        held_days = int((pd.to_datetime(last_dt) - pd.to_datetime(entry_dt)).days) if entry_dt else 0
        exit_reason = (
            f"Final close (backtest end). filled={exec_price:.2f}, ret={pnl_pct:.2f}%, held={held_days}d"
        )
        # --- NEW: Updated trade dict
        trades.append({
            "ticker": ticker, "side": "SELL", "date": last_dt,
            "price": exec_price, "shares": 0, "reason": exit_reason,
            "signal_reason": "", "score": np.nan,
            "rsi": float(row["rsi"]), "ADX": float(row["ADX"]),
            "+DI": float(row["+DI"]), "-DI": float(row["-DI"]),
            "ema50": float(row["ema50"]), # <-- NEW
            "PP": np.nan, "S1": np.nan, "S2": np.nan, # <-- NEW (nan for exits)
            "sma50": float(row["sma50"]), "sma200": float(row["sma200"]),
            "macd_line": float(row["macd_line"]), "macd_signal": float(row["macd_signal"]), "macd_hist": float(row["macd_hist"]),
            "bb_mid": float(row["bb_mid"]), "bb_up": float(row["bb_up"]), "bb_low": float(row["bb_low"]),
            "close": float(row["Close"]), "high_52w": float(row["high_52w"])
        })

    return pd.DataFrame(trades, columns=cols), pd.Series(dtype=float)


# ---------- Portfolio: unchanged engine ----------
def pick_benchmark(benchmarks: Tuple[str,...], start: str, end: Optional[str], cache_dir: str) -> Tuple[str, pd.DataFrame]:
    for t in benchmarks:
        data = fetch_prices([t], start, end, cache_dir)
        df = data.get(t)
        if df is not None and not df.empty:
            log.info("Using benchmark: %s", t)
            return t, df
    idx = pd.date_range(start=start, end=end or today_str(), freq="B")
    df = pd.DataFrame({"Close": np.ones(len(idx))}, index=idx)
    log.warning("No benchmark found; using synthetic flat series.")
    return "SYNTH_BENCH", df

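def compute_volar_scores(end_dt: pd.Timestamp, tickers: List[str], data_map: Dict[str,pd.DataFrame], bench_df: pd.DataFrame, lookback: int) -> Dict[str, float]:
    """Score each ticker by VOLAR up to and including end_dt.

    VOLAR = mean(daily stock return - daily benchmark return) / std(daily stock return),
    annualised by sqrt(252), over the last `lookback` overlapping observations.
    Tickers with no data, fewer than max(20, 40% of lookback) overlapping
    observations, or near-zero volatility score 0.0.
    """
    scores = {}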
    bser = bench_df["Close"].loc[:end_dt].pct_change().dropna().iloc[-lookback:]
    for t in tickers:
        df = data_map.get(t)
        if df is None or df.empty:
            scores[t] = 0.0
            continue
        if end_dt not in df.index:
            df = df[df.index <= end_dt]
            if df.empty:
                scores[t] = 0.0
                continue
        r = df["Close"].loc[:end_dt].pct_change().dropna().iloc[-lookback:]
        common = pd.concat([r, bser], axis=1, keys=["s","b"]).dropna()
        if common.shape[0] < max(20, int(0.4*lookback)):
            scores[t] = 0.0
            continue
        excess = common["s"] - common["b"]
        vol = common["s"].std(ddof=0)
        scores[t] = 0.0 if vol <= 1e-8 else float((excess.mean() / vol) * math.sqrt(252.0))
    return scores

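def markowitz_long_only(mu: np.ndarray, Sigma: np.ndarray) -> np.ndarray:
    """Heuristic long-only mean-variance weights.

    For each risk aversion lambda on a log grid, solve
    max mu'w - lambda * w'Sigma w subject to sum(w) = 1 via its KKT system,
    drop the most negative weight and re-solve until no weight is negative,
    then keep the candidate with the highest mu_p / vol_p ratio.
    Returns equal weights if no candidate qualifies.
    """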
    n = len(mu)
    eps = 1e-6
    Sigma = Sigma + eps*np.eye(n)

    def solve_lambda(lmbd: float, active_mask=None):
        if active_mask is None:
            A = np.block([[2*lmbd*Sigma, np.ones((n,1))],[np.ones((1,n)), np.zeros((1,1))]])
            b = np.concatenate([mu, np.array([1.0])])
            try:
                sol = np.linalg.solve(A, b)
                w = sol[:n]
            except np.linalg.LinAlgError:
                w = np.full(n, 1.0/n)
            return w
        else:
            idx = np.where(active_mask)[0]
            if len(idx)==0:
                return np.full(n, 1.0/n)
            S = Sigma[np.ix_(idx, idx)]
            o = np.ones(len(idx))
            m = mu[idx]
            A = np.block([[2*lmbd*S, o[:,None]],[o[None,:], np.zeros((1,1))]])
            b = np.concatenate([m, np.array([1.0])])
            try:
                sol = np.linalg.solve(A, b)
                w_sub = sol[:len(idx)]
            except np.linalg.LinAlgError:
                w_sub = np.full(len(idx), 1.0/len(idx))
            w = np.zeros(n)
            w[idx] = w_sub
            return w

    best_w = np.full(n, 1.0/n)
    best_sr = -1e9
    lambdas = np.logspace(-3, 3, 31)
    for lmbd in lambdas:
        active = np.ones(n, dtype=bool)
        w = None
        for _ in range(n):
            w = solve_lambda(lmbd, active_mask=active)
            neg = w < 0
            if not neg.any():
                break
            worst = np.argmin(w)
            active[worst] = False
        if w is None:
            continue
        w = np.clip(w, 0, None)
        if w.sum() <= 0:
            continue
        w = w / w.sum()
        mu_p = float(mu @ w)
        vol_p = float(np.sqrt(w @ Sigma @ w))
        if vol_p <= 1e-8:
            continue
        sr = mu_p / vol_p
        if sr > best_sr:
            best_sr = sr
            best_w = w.copy()
    return best_w

def aggregate_and_apply(all_trades: pd.DataFrame, data_map: Dict[str, pd.DataFrame], bench_df: pd.DataFrame, cfg: Config):
    if all_trades.empty:
        # Return four values to match the normal return path and the caller's unpacking.
        return all_trades, pd.DataFrame(), pd.Series(dtype=float), {}

    # Normalise dates before sorting so mixed date types compare consistently.
    all_trades = all_trades.copy()
    all_trades["date"] = pd.to_datetime(all_trades["date"])
    side_order = {"BUY": 0, "SELL": 1}
    all_trades = (all_trades
        .assign(_sorder=all_trades["side"].map(side_order))
        .sort_values(by=["date", "_sorder"], kind="stable")
        .drop(columns=["_sorder"])
        .reset_index(drop=True)
    )

    equity_curve = []
    dates = sorted(all_trades["date"].unique().tolist())
    cash = cfg.initial_capital
    open_positions = {}
    completed_legs = []

    global APPLY_FEES
    APPLY_FEES = cfg.apply_fees

    def _get_close_on(tkr, dt):
        df = data_map.get(tkr)
        if df is None or df.empty:
            return np.nan
        if dt in df.index:
            return float(df.loc[dt, "Close"])
        prev = df[df.index <= dt]
        if prev.empty:
            return np.nan
        return float(prev["Close"].iloc[-1])

    if dates:
        seed_date = pd.to_datetime(dates[0]) - pd.Timedelta(days=1)
        equity_curve.append((seed_date, float(cash)))

    # Walk dates
    for dt in dates:
        day_trades = all_trades[all_trades["date"] == dt].copy()

        # SELLs first
        for _, tr in day_trades[day_trades["side"] == "SELL"].iterrows():
            tkr = tr["ticker"]
            price = float(tr["price"])
            pos = open_positions.get(tkr)
            if pos is None:
                continue
            shares = int(pos["shares"])
            turnover_sell = shares * price
            fee = calc_fees(0.0, turnover_sell)
            pnl = (price - pos["entry_px"]) * shares
            cash += (turnover_sell - fee)
            realized = pnl - fee - pos.get("buy_fee", 0.0)
            
            # --- NEW: Updated completed_legs dict
            completed_legs.append({
                "ticker": tkr, "side": "SELL", "date": dt, "price": price,
                "shares": shares, "reason": tr.get("reason",""),
                "turnover": turnover_sell, "fees_inr": fee, "pnl_inr": realized,
                "rsi": tr.get("rsi", np.nan), "ADX": tr.get("ADX", np.nan),
                "+DI": tr.get("+DI", np.nan), "-DI": tr.get("-DI", np.nan),
                "ema50": tr.get("ema50", np.nan), # <-- NEW
                "sma50": tr.get("sma50", np.nan), "sma200": tr.get("sma200", np.nan),
                "macd_line": tr.get("macd_line", np.nan), "macd_signal": tr.get("macd_signal", np.nan), "macd_hist": tr.get("macd_hist", np.nan),
                "bb_mid": tr.get("bb_mid", np.nan), "bb_up": tr.get("bb_up", np.nan), "bb_low": tr.get("bb_low", np.nan),
                "close": tr.get("close", np.nan), "high_52w": tr.get("high_52w", np.nan),
                "volar": tr.get("volar", np.nan), "mvo_weight": np.nan, "alloc_inr": np.nan
            })
            log.info("Exit %-12s px=%8.2f sh=%6d reason=%s net=%.2f cash=%.2f",
                     tkr, price, shares, tr.get("reason",""), realized, cash)
            del open_positions[tkr]

        # BUY candidates
        buys_today = day_trades[day_trades["side"] == "BUY"].copy()

        # 52w proximity filter
        if not buys_today.empty:
            keep = []
            for _, rr in buys_today.iterrows():
                df = data_map.get(rr["ticker"])
                if df is None or df.empty or dt not in df.index:
                    continue
                # Use signal candle close (from trade row) for 52w check
                close = float(rr.get("close", df.loc[dt, "Close"])) 
                high_52w = float(rr.get("high_52w", 0.0))
                if high_52w > 0 and close >= cfg.within_pct_of_52w_high * high_52w:
                    keep.append(rr)
            buys_today = pd.DataFrame(keep) if keep else pd.DataFrame(columns=buys_today.columns)

        # Exclude already-held tickers
        if not buys_today.empty:
            buys_today = buys_today[~buys_today["ticker"].isin(open_positions.keys())]

        # Rank by VOLAᵣ
        if not buys_today.empty:
            tickers = buys_today["ticker"].tolist()
            volar_scores = compute_volar_scores(dt, tickers, data_map, bench_df, cfg.volar_lookback)
            buys_today["volar"] = buys_today["ticker"].map(volar_scores)
            buys_today = buys_today.sort_values("volar", ascending=False).reset_index(drop=True)

        slots = cfg.max_concurrent_positions - len(open_positions)
        selected = pd.DataFrame(columns=buys_today.columns)
        if slots > 0 and not buys_today.empty:
            selected = buys_today.head(min(cfg.top_k_daily, slots)).copy()

        if not selected.empty:
            log.info("Selected %d BUY candidates on %s:", selected.shape[0], dt.date())
            for i, rr in selected.reset_index(drop=True).iterrows():
                log.info("  %-12s volar=%6.2f rank=%d px=%8.2f", rr["ticker"], rr.get("volar",0.0), i+1, rr["price"])

            names = selected["ticker"].tolist()
            rets = []
            for t in names:
                df = data_map.get(t)
                ser = df["Close"].loc[:dt].pct_change().dropna().iloc[-cfg.volar_lookback:]
                rets.append(ser)
            R = pd.concat(rets, axis=1)
            R.columns = names
            R = R.dropna()
            if R.empty or R.shape[0] < max(20, int(0.4*cfg.volar_lookback)) or R.shape[1] == 0:
                weights = np.full(len(names), 1.0/len(names))
            else:
                mu = R.mean().values
                Sigma = R.cov().values
                weights = markowitz_long_only(mu, Sigma)

            deploy_cash = max(0.0, float(cash)) * float(cfg.deploy_cash_frac)

            if deploy_cash <= 0:
                log.info("No deployable cash (cap=%.0f%%) on %s", 100*cfg.deploy_cash_frac, dt.date())
            else:
                alloc = (weights / weights.sum()) * deploy_cash if weights.sum()>0 else np.full(len(names), deploy_cash/len(names))
                rank_map = {row["ticker"]: (idx+1) for idx, (_, row) in enumerate(selected.iterrows())}
                for w_amt, t in zip(alloc, names):
                    row_sel = selected[selected["ticker"]==t].iloc[0]
                    # Use the actual execution price from the 'BUY' candidate row
                    price = float(row_sel["price"])
                    shares = int(math.floor(w_amt / price))
                    if shares <= 0:
                        log.info("Skip BUY %-12s (alloc %.2f too small)", t, w_amt)
                        continue
                    turn = shares * price
                    fee = calc_fees(turn, 0.0)
                    total_cost = turn + fee
                    if total_cost > cash:
                        shares = int(math.floor((cash - fee) / price))
                        if shares <= 0:
                            log.info("Skip BUY %-12s due to cash/fees", t)
                            continue
                        turn = shares * price
                        fee = calc_fees(turn, 0.0)  # recompute the fee for the reduced turnover
                        total_cost = turn + fee
                    cash -= total_cost
                    # Use actual execution price for position tracking
                    open_positions[t] = {"entry_date": dt, "entry_px": price, "shares": shares, "buy_fee": fee, "entry_reason": "entry"} 

                    volar_val = float(row_sel.get("volar", np.nan))
                    rank_pos = rank_map.get(t, np.nan)
                    high_52w = float(row_sel.get("high_52w", np.nan))
                    close_val = float(row_sel.get("close", np.nan))
                    pct_52w = (close_val / high_52w) if (high_52w and high_52w>0) else np.nan
                    mvo_weight_today = (w_amt / deploy_cash) if deploy_cash > 0 else 0.0
                    sig_reason = row_sel.get("signal_reason", "Close>EMA50 & PivotReclaim (≥1 confirm)")
                    reason_text = (
                        f"{sig_reason}; 52w%={pct_52w:.1%} (>= {cfg.within_pct_of_52w_high:.0%}); "
                        f"VOLAR rank {int(rank_pos)}/{len(names)} (VOLAR={volar_val:.2f}); "
                        f"MVO weight={mvo_weight_today:.1%} of capped cash ({100*cfg.deploy_cash_frac:.0f}% of available)"
                    )
                    
                    # --- NEW: Updated completed_legs dict
                    completed_legs.append({
                        "ticker": t, "side": "BUY", "date": dt, "price": price,
                        "shares": shares, "reason": reason_text,
                        "turnover": turn, "fees_inr": fee, "pnl_inr": 0.0,
                        "rsi": float(row_sel.get("rsi", np.nan)), "ADX": float(row_sel.get("ADX", np.nan)),
                        "+DI": float(row_sel.get("+DI", np.nan)), "-DI": float(row_sel.get("-DI", np.nan)),
                        "ema50": float(row_sel.get("ema50", np.nan)), # <-- NEW
                        "PP": float(row_sel.get("PP", np.nan)), "S1": float(row_sel.get("S1", np.nan)), "S2": float(row_sel.get("S2", np.nan)), # <-- NEW
                        "sma50": float(row_sel.get("sma50", np.nan)), "sma200": float(row_sel.get("sma200", np.nan)),
                        "macd_line": float(row_sel.get("macd_line", np.nan)), "macd_signal": float(row_sel.get("macd_signal", np.nan)), "macd_hist": float(row_sel.get("macd_hist", np.nan)),
                        "bb_mid": float(row_sel.get("bb_mid", np.nan)), "bb_up": float(row_sel.get("bb_up", np.nan)), "bb_low": float(row_sel.get("bb_low", np.nan)),
                        "close": close_val, "high_52w": high_52w,
                        "volar": volar_val, "mvo_weight": float(mvo_weight_today), "alloc_inr": float(w_amt)
                    })
                    log.info("BUY %-12s px=%8.2f sh=%6d fee=%.2f cash=%.2f :: %s",
                             t, price, shares, fee, cash, reason_text)

        # MTM valuation
        mtm = 0.0
        for _tkr, pos in open_positions.items():
            px = _get_close_on(_tkr, dt)
            if not np.isnan(px):
                mtm += pos["shares"] * px
        total_equity = cash + mtm
        equity_curve.append((dt, float(total_equity)))

    eq_ser = pd.Series([e for _, e in equity_curve], index=[d for d, _ in equity_curve])
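    legs_df = pd.DataFrame(completed_legs)
    if not legs_df.empty:
        # Within a date, list SELLs before BUYs (the order the engine executes them),
        # so a same-day exit and re-entry of one ticker pairs correctly below.
        legs_df = (legs_df
            .sort_values(["date", "side", "ticker"], ascending=[True, False, True], kind="stable")
            .reset_index(drop=True)
        )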

    # Roundtrips
    roundtrips = []
    by_tkr_open = {}
    for _, leg in legs_df.iterrows():
        tkr = leg["ticker"]
        if leg["side"] == "BUY":
            by_tkr_open[tkr] = leg
        else:
            buy = by_tkr_open.pop(tkr, None)
            if buy is None:
                continue
            fees_total = float(buy.get("fees_inr", 0.0) + leg.get("fees_inr", 0.0))
            gross_pnl = (leg["price"] - buy["price"]) * buy["shares"]
            net_pnl   = gross_pnl - fees_total
            ret_pct   = (leg["price"] / buy["price"] - 1.0) * 100.0
            days_held = (pd.to_datetime(leg["date"]) - pd.to_datetime(buy["date"])).days
            
            # --- NEW: Updated roundtrips dict
            roundtrips.append({
                "ticker": tkr,
                "entry_date": pd.to_datetime(buy["date"]),
                "entry_price": float(buy["price"]),
                "exit_date": pd.to_datetime(leg["date"]),
                "exit_price": float(leg["price"]),
                "days_held": int(days_held),
                "shares": int(buy["shares"]),
                "entry_reason": buy.get("reason",""),
                "exit_reason": leg.get("reason",""),
                "gross_pnl_inr": float(gross_pnl),
                "fees_total_inr": float(fees_total),
                "net_pnl_inr": float(net_pnl),
                "return_pct": float(ret_pct),
                "rsi_entry": float(buy.get("rsi", np.nan)),
                "adx_entry": float(buy.get("ADX", np.nan)),
                "pdi_entry": float(buy.get("+DI", np.nan)),
                "mdi_entry": float(buy.get("-DI", np.nan)),
                "ema50_entry": float(buy.get("ema50", np.nan)), # <-- NEW
                "pp_entry": float(buy.get("PP", np.nan)), # <-- NEW
                "s1_entry": float(buy.get("S1", np.nan)), # <-- NEW
                "s2_entry": float(buy.get("S2", np.nan)), # <-- NEW
                "sma50_entry": float(buy.get("sma50", np.nan)),
                "sma200_entry": float(buy.get("sma200", np.nan)),
                "macd_line_entry": float(buy.get("macd_line", np.nan)),
                "macd_signal_entry": float(buy.get("macd_signal", np.nan)),
                "macd_hist_entry": float(buy.get("macd_hist", np.nan)),
                "bb_mid_entry": float(buy.get("bb_mid", np.nan)),
                "bb_up_entry": float(buy.get("bb_up", np.nan)),
                "bb_low_entry": float(buy.get("bb_low", np.nan)),
                "close_entry": float(buy.get("close", np.nan)),
                "high_52w_entry": float(buy.get("high_52w", np.nan)),
                "volar_entry": float(buy.get("volar", np.nan)),
                "mvo_weight_entry": float(buy.get("mvo_weight", np.nan)),
                "alloc_inr_entry": float(buy.get("alloc_inr", np.nan))
            })
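    trips_df = pd.DataFrame(roundtrips)
    if not trips_df.empty:
        # Sorting an empty frame by missing columns would raise KeyError.
        trips_df = trips_df.sort_values(["entry_date","ticker"]).reset_index(drop=True)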

    metrics = compute_metrics(eq_ser, legs_df)
    return legs_df, trips_df, eq_ser, metrics

def compute_metrics(equity: pd.Series, legs_df: pd.DataFrame):
    out = {}
    if equity is None or equity.empty:
        return out
    eq = equity.dropna()
    daily_ret = eq.pct_change().fillna(0.0)

    days = (eq.index[-1] - eq.index[0]).days or 1
    years = days / 365.25
    cagr = (eq.iloc[-1] / eq.iloc[0]) ** (1/years) - 1 if years > 0 else 0.0

    # eq is sampled only on dates that have trades, so the sqrt(252) annualisation is approximate.
    if daily_ret.std(ddof=0) > 0:
        sharpe = (daily_ret.mean() / daily_ret.std(ddof=0)) * np.sqrt(252)
    else:
        sharpe = 0.0

    cummax = eq.cummax()
    dd = (eq - cummax) / cummax
    max_dd = dd.min()

    # Win rate over SELL legs; tolerate a missing or empty legs frame.
    if legs_df is not None and not legs_df.empty and "side" in legs_df.columns:
        sells = legs_df[legs_df["side"] == "SELL"]
    else:
        sells = pd.DataFrame(columns=["pnl_inr"])
    n_sells = sells.shape[0]
    wins = int((sells["pnl_inr"] > 0).sum()) if n_sells and "pnl_inr" in sells.columns else 0
    win_rate = (wins / n_sells) * 100.0 if n_sells > 0 else 0.0

    out.update({
        "start_equity_inr": float(eq.iloc[0]),
        "final_equity_inr": float(eq.iloc[-1]),
        "cagr_pct": float(cagr * 100),
        "sharpe": float(sharpe),
        "max_drawdown_pct": float(max_dd * 100),
        "win_rate_pct": float(win_rate),
        "n_trades": int(n_sells),
    })
    return out

def plot_equity(equity: pd.Series, out_path: str):
    if equity is None or equity.empty:
        return
    try:
        import matplotlib.pyplot as plt
        plt.figure(figsize=(10,5))
        plt.plot(equity.index, equity.values)
        plt.title("Equity Curve")
        plt.xlabel("Date")
        plt.ylabel("Equity (INR)")
        plt.tight_layout()
        plt.savefig(out_path)
        plt.close()
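    except Exception as exc:
        # Plotting is best-effort; report the failure instead of hiding it.
        log.warning("Equity plot failed: %s", exc)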

def backtest(cfg: Config):
    ensure_dirs(cfg.cache_dir, cfg.out_dir)
    log.info("Universe: loading static symbols...")
    symbols = load_static_symbols(cfg.static_symbols, cfg.static_symbols_path)
    log.info("Loaded %d symbols.", len(symbols))

    log.info("Data: fetching OHLCV from yfinance (adjusted)...")
    data_map = fetch_prices(symbols, cfg.start_date, cfg.end_date, cfg.cache_dir)
    log.info("Downloaded %d symbols with data.", len(data_map))

    bench_tkr, bench_df = pick_benchmark(cfg.benchmark_try, cfg.start_date, cfg.end_date, cfg.cache_dir)
    log.info("Benchmark selected: %s", bench_tkr)

    # --- NEW: Updated log message
    log.info("Signals: generating new EMA50-Trend + PivotReclaim + ≥1 confirm entries...")
    all_trades = []
    for i, tkr in enumerate(symbols, 1):
        df = data_map.get(tkr)
        if df is None or df.empty:
            continue
        tr, _ = simulate_ticker(tkr, df, cfg)
        if not tr.empty:
            all_trades.append(tr)
        if i % 50 == 0:
            log.info("  processed %d/%d tickers...", i, len(symbols))

    if not all_trades:
        log.warning("No signals generated; check thresholds or universe.")
        return None, None, None, {}
    all_trades = pd.concat(all_trades, ignore_index=True)

    log.info("Portfolio: cap daily deploy to %.0f%% of cash; 52w>=%.0f%% high; top-%d by VOLAᵣ; MVO; max %d positions.",
             cfg.deploy_cash_frac*100, cfg.within_pct_of_52w_high*100, cfg.top_k_daily, cfg.max_concurrent_positions)
    legs_df, trips_df, equity, metrics = aggregate_and_apply(all_trades, data_map, bench_df, cfg)

    stamp = pd.Timestamp.today(tz="Asia/Kolkata").strftime("%Y%m%d_%H%M%S")
    legs_path = os.path.join(cfg.out_dir, f"trades_legs_{stamp}.csv")
    trips_path = os.path.join(cfg.out_dir, f"trades_roundtrips_{stamp}.csv")
    equity_path = os.path.join(cfg.out_dir, f"equity_{stamp}.csv")
    metrics_path = os.path.join(cfg.out_dir, f"metrics_{stamp}.json")
    eq_plot_path = os.path.join(cfg.out_dir, f"equity_{stamp}.png")

    if legs_df is not None:
        legs_df.to_csv(legs_path, index=False)
    if trips_df is not None:
        trips_df.to_csv(trips_path, index=False)
    if equity is not None:
        pd.DataFrame({"date": equity.index, "equity": equity.values}).to_csv(equity_path, index=False)
    with open(metrics_path, "w") as f:
        json.dump(metrics, f, indent=2)

    if cfg.plot and equity is not None:
        plot_equity(equity, eq_plot_path)

    log.info("=== METRICS ===\n%s", json.dumps(metrics, indent=2))
    log.info("Files written:\n  %s\n  %s\n  %s\n  %s", legs_path, trips_path, equity_path, metrics_path)
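    if cfg.plot:
        log.info("  %s", eq_plot_path)
    # Mirror the early-exit path, which also returns a 4-tuple.
    return legs_df, trips_df, equity, metrics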

def main():
    global APPLY_FEES
    APPLY_FEES = bool(CFG.apply_fees)
    # Example: set your universe here or via file path.
    # CFG.static_symbols = ['TCS.NS','HDFCBANK.NS','RELIANCE.NS','INFY.NS','ICICIBANK.NS']
    CFG.static_symbols_path = "nifty500.txt" # Make sure you have this file
    backtest(CFG)

if __name__ == "__main__":
    main()

2025-10-24 20:53:51 | INFO | Universe: loading static symbols...
2025-10-24 20:53:51 | INFO | Loaded 500 symbols.
2025-10-24 20:53:51 | INFO | Data: fetching OHLCV from yfinance (adjusted)...
2025-10-24 20:53:55 | ERROR | 
1 Failed download:
2025-10-24 20:53:55 | ERROR | ['ABLBL.NS']: YFPricesMissingError('possibly delisted; no price data found  (1d 2015-01-01 -> 2025-01-01) (Yahoo error = "Data doesn\'t exist for startDate = 1420050600, endDate = 1735669800")')
2025-10-24 20:54:01 | ERROR | 
1 Failed download:
2025-10-24 20:54:01 | ERROR | ['AEGISVOPAK.NS']: YFPricesMissingError('possibly delisted; no price data found  (1d 2015-01-01 -> 2025-01-01) (Yahoo error = "Data doesn\'t exist for startDate = 1420050600, endDate = 1735669800")')
2025-10-24 20:54:02 | ERROR | 
1 Failed download:
2025-10-24 20:54:02 | ERROR | ['AGARWALEYE.NS']: YFPricesMissingError('possibly delisted; no price data found  (1d 2015-01-01 -> 2025-01-01) (Yahoo error = "Data doesn\'t exist for startDate = 1420050600

KeyboardInterrupt: 
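
The log above shows yfinance reporting tickers with no price data before the run was interrupted. The signal loop in backtest() already skips symbols whose frame is None or empty. A minimal sketch of listing those symbols up front is given below, assuming data_map has the same Dict[str, pd.DataFrame] shape that fetch_prices returns. The helper name split_by_availability and the toy data are hypothetical and not part of the notebook.

```python
from typing import Dict, List, Tuple

import pandas as pd


def split_by_availability(
    symbols: List[str], data_map: Dict[str, pd.DataFrame]
) -> Tuple[List[str], List[str]]:
    """Split symbols into (usable, missing) using the same test as backtest():
    a symbol is missing when its frame is absent or empty."""
    usable: List[str] = []
    missing: List[str] = []
    for sym in symbols:
        df = data_map.get(sym)
        if df is None or df.empty:
            missing.append(sym)
        else:
            usable.append(sym)
    return usable, missing


# Toy data standing in for the output of fetch_prices (hypothetical values).
idx = pd.date_range("2024-01-01", periods=3, freq="B")
toy_map = {
    "TCS.NS": pd.DataFrame({"Close": [100.0, 101.0, 102.0]}, index=idx),
    "ABLBL.NS": pd.DataFrame(),  # download returned nothing
}
usable, missing = split_by_availability(
    ["TCS.NS", "ABLBL.NS", "AEGISVOPAK.NS"], toy_map
)
print("usable:", usable)
print("missing:", missing)
```

Logging the missing list once after the fetch makes it easy to prune the universe file before the next run.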