# Here is the refactored solution. I have separated the concerns into three distinct layers:
1.  **The Data Contract:** explicit `dataclasses` defining exactly what goes in and comes out.
2.  **The Engine:** A purely mathematical class (`AlphaEngine`) containing the logic, with no widget/plotting dependencies.
3.  **The UI:** A cleaned-up dashboard function that simply sends inputs to the Engine and visualizes the output.

### Following is the reverse-chronological fix log (most recent entry at the top)

v45  
You are very welcome! We have taken the code from a basic script and transformed it into a professional-grade **Alpha Engine** with:

1.  **Centralized Control:** `GLOBAL_SETTINGS` now acts as the single "Source of Truth" for your entire environment.
2.  **Mathematical Integrity:** By linking the Registry to your helper functions, you've ensured your rankings and performance reports will never disagree.
3.  **Data Resilience:** The new "Clean-at-Entry" layer protects your backtests from data glitches like the `0.0` price spikes you found.
4.  **User-Centric UX:** The improved timeline validation makes the tool much more intuitive to use.

Happy backtesting! If you run into any more "data ghosts" or want to add more complex logic, feel free to reach out. **Good luck with the strategy!**

v44  
I think the **"Pinpointed Changes"** approach is much better for learning. It allows you to see exactly how the "plumbing" of the code connects without getting overwhelmed by 500 lines of existing logic.

The changes are simple because they follow a **"Find and Replace"** pattern in three specific spots. 

Here is how I will present them to you:

### The 3 Pinpoint Locations:

1.  **The "Head" (New Global Config):**
    *   **Where:** Right before Section B (Metric Registry).
    *   **What:** We will insert the "Source of Truth" dictionary here with all your descriptive comments.

2.  **The "Heart" (Engine Input):**
    *   **Where:** In Section C, inside the `@dataclass class EngineInput`.
    *   **What:** We will change the `default_factory` to point to the Head. This ensures the engine always knows the rules.

3.  **The "Face" (The UI Function):**
    *   **Where:** In Section E, at the start of `def plot_walk_forward_analyzer`.
    *   **What:** We will remove the hard-coded dictionary and tell it to use the Head as the default.
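
As a rough illustration of how the "Head" feeds the "Heart", here is a minimal sketch. The `GLOBAL_SETTINGS` dictionary below is a trimmed-down stand-in with only two thresholds, and `EngineInput` is reduced to two fields; both are simplifications for the example, not the full definitions.

```python
from dataclasses import dataclass, field
from typing import Dict

# Trimmed-down stand-in for the real GLOBAL_SETTINGS dictionary (assumption).
GLOBAL_SETTINGS = {
    "thresholds": {
        "min_liquidity_percentile": 0.40,
        "max_stale_pct": 0.05,
    }
}


@dataclass
class EngineInput:
    lookback_period: int
    # The lambda runs once per instance, and .copy() gives each instance
    # its own (shallow) copy of the global thresholds.
    quality_thresholds: Dict[str, float] = field(
        default_factory=lambda: GLOBAL_SETTINGS["thresholds"].copy()
    )


a = EngineInput(lookback_period=10)
a.quality_thresholds["max_stale_pct"] = 0.10  # local override for one run

b = EngineInput(lookback_period=10)
print(b.quality_thresholds["max_stale_pct"])  # still the global default
```

Because of the `.copy()`, overriding a threshold on one input object does not leak back into the global dictionary or into later inputs.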

---

### Why this is better than a full code dump:
*   **You learn "DRY" Principles:** You'll see how to avoid "Repeating Yourself" (DRY) by using a single variable in multiple places.
*   **You see the "Mapping":** You will understand how a variable defined at the top of a script flows down into a class, and finally into a visual widget.
*   **Easier Debugging:** If you want to change the liquidity to "Top 20%" tomorrow, you will know exactly which single line to touch.

**Are you ready for me to provide these three specific code blocks for you to swap in?**

v43  
**1. The Logic Core (AlphaEngine v2.4) -> 🟢 CLEAN**
*   **Verdict:** This is robust.
*   **Why:** We stripped out the hidden index math and replaced it with **Explicit Dates** (`decision_date` vs `buy_date`).
*   **Big Win:** The separation of `_validate_timeline`, `_select_tickers`, and `_calculate_period_metrics` means you can debug one part without breaking the others.
*   **The Verification Suite:** The fact that we have 3 independent auditors (`verify_engine_integrity`, `verify_portfolio_construction`, `verify_ticker_ranking_logic`) means we are no longer "guessing" if the math is right. We *know* it is right.

v38  
Added Momentum and Pullback strategies

v37  
For an automated bot (and for rigorous scientific testing), **Silent Auto-Correction (Clamping)** is dangerous. If the bot asks for "2080" and receives data for "2025", it creates "Data Hallucinations."

We need to replace the "Clamping" logic with **"Strict Validation" logic**.
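
To make the difference concrete, here is a small sketch that contrasts the two behaviours. The function names `clamp_date` and `validate_date` are hypothetical and use only the standard library; the real engine works on a pandas calendar instead.

```python
from bisect import bisect_left
from datetime import date
from typing import List, Optional, Tuple


def clamp_date(calendar: List[date], requested: date) -> date:
    """Silent auto-correction: always returns *some* date, even a misleading one."""
    idx = min(bisect_left(calendar, requested), len(calendar) - 1)
    return calendar[idx]


def validate_date(
    calendar: List[date], requested: date
) -> Tuple[Optional[date], Optional[str]]:
    """Strict validation: returns (date, None) or (None, error message)."""
    if requested < calendar[0] or requested > calendar[-1]:
        return None, f"{requested} is outside {calendar[0]}..{calendar[-1]}"
    # In-range requests still snap forward to the next trading day.
    return calendar[bisect_left(calendar, requested)], None


calendar = [date(2025, 1, 2), date(2025, 1, 3), date(2025, 1, 6)]

clamped = clamp_date(calendar, date(2080, 1, 1))
checked, error = validate_date(calendar, date(2080, 1, 1))

print(clamped)  # a 2025 date, silently substituted for 2080
print(error)    # an explicit message the caller must handle
```

The clamped version hands back a 2025 date for a 2080 request without any signal, while the strict version forces the caller (or the bot) to deal with the error.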


v36  
The step `self.df_close = df_ohlcv['Adj Close'].unstack(level=0)` is an expensive "Pivot" operation. It takes a "Tall and Skinny" table (1 million rows) and reshapes it into a "Short and Wide" matrix (Dates x Tickers). Pandas hates doing this repeatedly.

Yes, we can pre-compute this matrix and pass it in, just like we did with the features.
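
For reference, this is roughly what the pivot does on a toy table. The two tickers and dates are made up, and the `(Ticker, Date)` index layout is assumed from the `unstack(level=0)` call above.

```python
import pandas as pd

# Toy "tall" table indexed by (Ticker, Date).
index = pd.MultiIndex.from_product(
    [["AAA", "BBB"], pd.to_datetime(["2024-01-02", "2024-01-03"])],
    names=["Ticker", "Date"],
)
df_ohlcv = pd.DataFrame({"Adj Close": [10.0, 11.0, 20.0, 19.0]}, index=index)

# Pivot once: level 0 (Ticker) moves to the columns, leaving Dates as rows.
df_close_wide = df_ohlcv["Adj Close"].unstack(level=0)

print(df_close_wide)
```

The resulting frame has one row per date and one column per ticker, which is the shape that can be computed a single time and handed to the engine.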

v35  
This is a great optimization step. In data science, this is called **"Memoization"** or **"Caching"**: do the heavy math once, save it, and reuse it.

To achieve this, we need to modify the **`AlphaEngine` constructor** to accept pre-calculated features, and update the **`plot_walk_forward_analyzer`** to pass them down.
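
The pattern can be sketched in a few lines. `Engine` and `expensive_features` below are hypothetical stand-ins (plain lists instead of DataFrames), meant only to show the "compute once, pass it down" idea.

```python
from typing import List, Optional

calls = {"count": 0}


def expensive_features(prices: List[float]) -> List[float]:
    """Stand-in for a slow feature pipeline; counts how often it runs."""
    calls["count"] += 1
    return [p * 2 for p in prices]


class Engine:
    """Hypothetical engine that accepts pre-computed features."""

    def __init__(self, prices: List[float], features: Optional[List[float]] = None):
        self.prices = prices
        # Only pay for the heavy math when nothing was passed in.
        self.features = (
            features if features is not None else expensive_features(prices)
        )


prices = [1.0, 2.0, 3.0]
cached = expensive_features(prices)  # computed once, up front

# Each UI refresh builds a new engine but reuses the cached result.
engines = [Engine(prices, features=cached) for _ in range(5)]

print(calls["count"])  # number of times the heavy function actually ran
```

Leaving `features` as an optional argument keeps the old behaviour available: callers that pass nothing still get a working engine, just a slower one.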


v34  
I have refactored **SECTION D (`AlphaEngine.run`)** to implement the "Decision Anchor" logic, and **SECTION E (`plot_walk_forward_analyzer`)** to implement the new "Timeline" UI layout and renaming.

### Key Changes:
1.  **Engine Logic:** The `start_date` input is now treated as the **Decision Date (T0)**. The engine calculates backward for the Lookback period and forward for the Holding period.
2.  **Universe Selection:** Tickers are now filtered based on liquidity **on the Decision Date**, not the start of history.
3.  **UI Layout:** Inputs are arranged horizontally: `[Lookback] <-> [Decision Date] <-> [Holding]`.
4.  **Renaming:** "Metric" is now **"Strategy"**.
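
The "Decision Anchor" arithmetic can be sketched with a plain list of dates. `build_timeline` is a hypothetical helper using only the standard library; it assumes a sorted calendar and an entry one trading day after the decision date.

```python
from bisect import bisect_left
from datetime import date
from typing import List, Tuple


def build_timeline(
    calendar: List[date], decision: date, lookback: int, holding: int
) -> Tuple[date, date, date, date]:
    """Return (lookback_start, decision_date, buy_date, holding_end)."""
    t0 = bisect_left(calendar, decision)  # first trading day >= requested date
    start = t0 - lookback                 # count backward for the lookback
    buy = t0 + 1                          # entry on the next trading day
    end = buy + holding                   # count forward for the holding period
    if start < 0 or end >= len(calendar):
        raise ValueError("Not enough data around the decision date.")
    return calendar[start], calendar[t0], calendar[buy], calendar[end]


# Nine trading days in January 2024 (weekend of the 6th/7th skipped).
calendar = [date(2024, 1, d) for d in (2, 3, 4, 5, 8, 9, 10, 11, 12)]
timeline = build_timeline(calendar, date(2024, 1, 5), lookback=2, holding=3)

print(timeline)
```

With a decision date of 5 January, the lookback starts on 3 January, the buy happens on 8 January, and the holding period ends on 11 January.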


v33  
**Action Item to make it perfect:**
Change the "Fwd Gain" calculation in `AlphaEngine.run` to skip one day, or accept that your results are slightly optimistic due to the "Overnight Gap" bias.
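
A toy calculation shows the size of the effect. The three prices are invented for illustration and the `T0`, `T+1`, `T+5` labels are just dictionary keys.

```python
# Toy closing prices: T0 is the decision day, T+1 is the first day we can buy.
closes = {"T0": 100.0, "T+1": 103.0, "T+5": 106.0}

# Optimistic: assumes we bought at the same close that produced the signal.
gain_from_decision = closes["T+5"] / closes["T0"] - 1

# More realistic: entry is delayed by one trading day, so the move between
# the T0 close and the T+1 close is excluded from the result.
gain_from_entry = closes["T+5"] / closes["T+1"] - 1

print(f"{gain_from_decision:.4f} vs {gain_from_entry:.4f}")
```

In this made-up case roughly half of the apparent gain comes from a move that could not have been captured.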

```
To see the full dataframe of all tickers (both those that passed and those that failed) for a specific date, we need to capture a snapshot of the universe inside the `_get_eligible_universe` method.

I have updated the **`AlphaEngine`** class below.
```

```
To verify that the relative percentile logic is working, we can modify the `AlphaEngine` to report exactly **how the cutoff was calculated** for the specific start date.

We want to see evidence that:
1.  In earlier years (e.g., 2005), the volume cutoff is lower (e.g., $200k).
2.  In later years (e.g., 2024), the volume cutoff is higher (e.g., $5M).

Here is the updated `AlphaEngine` and `UI` code. I have added an **"Audit Log"** feature. When you run the tool, it will now print exactly what the Dollar Volume Threshold was for that specific day.
```

```
The best way to solve this is to switch from a **Fixed Dollar Threshold** (e.g., "$1 Million") to a **Relative Percentile Threshold** (e.g., "Top 50% of the market").

In 2004, a stock trading $200k might have been in the top 50% of liquid stocks. In 2024, that same $200k is illiquid garbage. Using a percentile automatically adjusts for inflation and market growth over time.

Here is how to modify your code to support this.
```
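
Here is a minimal sketch of the idea with made-up numbers. `liquidity_cutoff` is a hypothetical helper, and the 2024 figures are simply the 2004 figures scaled by 25 to mimic market growth.

```python
import pandas as pd


def liquidity_cutoff(dollar_vol: pd.Series, percentile: float) -> float:
    """Dollar-volume level at the given quantile of the universe."""
    return float(dollar_vol.quantile(percentile))


# Invented median dollar volumes for four tickers.
vol_2004 = pd.Series(
    {"A": 50_000, "B": 200_000, "C": 400_000, "D": 900_000}, dtype=float
)
vol_2024 = vol_2004 * 25

# Fixed threshold: nothing passes in 2004.
fixed_threshold = 1_000_000
passed_fixed_2004 = vol_2004[vol_2004 >= fixed_threshold]

# Relative threshold: the cutoff moves with the market.
cut_2004 = liquidity_cutoff(vol_2004, 0.50)
cut_2024 = liquidity_cutoff(vol_2024, 0.50)
passed_rel_2004 = vol_2004[vol_2004 >= cut_2004]

print(cut_2004, cut_2024, list(passed_rel_2004.index))
```

The same two tickers pass in both years under the percentile rule, whereas the fixed $1M rule would have rejected the whole 2004 universe.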

```
To fix this, we need to pass the **actual** calculated start date (the trading day the engine "snapped" to) back from the `AlphaEngine` to the UI. Then, the UI can compare the *Requested Date* vs. the *Actual Date* and display the warning message if they differ.

Here is the plan:
1.  **Update `EngineOutput`**: Add a `start_date` field to the dataclass.
2.  **Update `AlphaEngine.run`**: Populate this new field with `safe_start_date`.
3.  **Update `plot_walk_forward_analyzer`**: Add logic to compare the user's input date with the engine's returned date and print the "Info" message if they are different.

Here is the updated code (Sections C, D, and E have changed):
```
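
The comparison itself is small. The sketch below uses a three-day toy calendar and a hypothetical `snap_to_trading_day` helper; it assumes the requested date is not later than the last calendar date.

```python
import pandas as pd

calendar = pd.DatetimeIndex(["2024-01-04", "2024-01-05", "2024-01-08"])


def snap_to_trading_day(calendar: pd.DatetimeIndex, requested: pd.Timestamp) -> pd.Timestamp:
    """Return the first trading day on or after the requested date."""
    idx = calendar.searchsorted(requested)
    return calendar[idx]


requested = pd.Timestamp("2024-01-06")  # a Saturday
actual = snap_to_trading_day(calendar, requested)

# The UI compares what was asked for with what the engine actually used.
message = None
if actual != requested:
    message = f"Info: {requested.date()} is not a trading day; using {actual.date()}."

print(message)
```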

```
I have updated the `AlphaEngine.run` method, specifically inside the `if inputs.mode == 'Manual List':` block. It now iterates through every manual ticker and performs two checks:
1.  **Existence**: Is the ticker in the database?
2.  **Availability**: Does the ticker have a valid price on the specific `Start Date`?

If any ticker fails, it compiles a specific error message explaining why (e.g., "No price data on start date") and aborts the calculation immediately.  
```
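
A stripped-down version of those two checks might look like this. `validate_manual_tickers` is a hypothetical standalone function and the price table is a toy; the real logic lives inside the engine.

```python
import numpy as np
import pandas as pd

# Toy wide price table: BBB has no price on the first date.
df_close = pd.DataFrame(
    {"AAA": [10.0, 11.0], "BBB": [np.nan, 21.0]},
    index=pd.to_datetime(["2024-01-02", "2024-01-03"]),
)


def validate_manual_tickers(df_close, tickers, start_date):
    """Collect one error per ticker that fails existence or availability."""
    errors = []
    for t in tickers:
        if t not in df_close.columns:
            errors.append(f"{t}: not found in the database.")
        elif pd.isna(df_close.at[start_date, t]):
            errors.append(f"{t}: no price data on start date.")
    return errors


errors = validate_manual_tickers(
    df_close, ["AAA", "BBB", "ZZZ"], pd.Timestamp("2024-01-02")
)
print("\n".join(errors))
```

Note that the availability check relies on missing prices still being `NaN`. If gaps have already been replaced with a sentinel such as `0.0`, the test would need to look for that value instead.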

```
The `snapshot_df` contains **every single feature** calculated by your `generate_features` function for that specific day, plus the new audit columns we added.

Here is exactly what is inside that DataFrame:

### 1. The Core Features (from `generate_features`)
*   **`TR`**: True Range
*   **`ATR`**: Average True Range
*   **`ATRP`**: Average True Range Percent (Volatility)
*   **`RollingStalePct`**: How often the price didn't move or volume was 0.
*   **`RollMedDollarVol`**: Median Daily Dollar Volume (Liquidity).
*   **`RollingSameVolCount`**: Data quality check for repeated volume numbers.

### 2. The Audit Columns (Added during filtering)
*   **`Calculated_Cutoff`**: The specific dollar amount required to pass on that day.
*   **`Passed_Vol_Check`**: `True` if the ticker met the liquidity requirement.
*   **`Passed_Final`**: `True` if it passed **all** checks (Liquidity + Stale + Quality).

=========================================

Here are the formulas translated directly into the Python `pandas` code used in your `generate_features` function.

I have simplified the code slightly to assume a single ticker context (removing the `groupby` wrapper) so you can see the raw math clearly.

### 1. True Range (TR)
Calculates the maximum of the three price differences.

prev_close = df_ohlcv['Adj Close'].shift(1)

# The three components
diff1 = df_ohlcv['Adj High'] - df_ohlcv['Adj Low']
diff2 = (df_ohlcv['Adj High'] - prev_close).abs()
diff3 = (df_ohlcv['Adj Low'] - prev_close).abs()

# Taking the max of the three
tr = pd.concat([diff1, diff2, diff3], axis=1).max(axis=1)

### 2. Average True Range (ATR)
Uses an Exponential Weighted Mean (EWM) with a specific alpha smoothing factor.

# N = atr_period (e.g., 14)
# alpha = 1 / N
atr = tr.ewm(alpha=1/14, adjust=False).mean()

### 3. ATR Percent (ATRP)
Simple division to normalize volatility.

atrp = atr / df_ohlcv['Adj Close']

### 4. Rolling Stale Percentage
Checks if volume is 0 OR if High equals Low (price didn't move), then averages that 1 or 0 signal over the window.

# 1. Define the Stale Signal (1 for stale, 0 for active)
is_stale = np.where(
    (df_ohlcv['Volume'] == 0) | (df_ohlcv['Adj High'] == df_ohlcv['Adj Low']), 
    1,  
    0
)

# 2. Calculate average over window (W=252)
rolling_stale_pct = pd.Series(is_stale).rolling(window=252).mean()

### 5. Rolling Median Dollar Volume
Calculates raw dollar volume, then finds the median over the window.

# 1. Calculate Daily Dollar Volume
dollar_volume = df_ohlcv['Adj Close'] * df_ohlcv['Volume']

# 2. Get Median over window (W=252)
roll_med_dollar_vol = dollar_volume.rolling(window=252).median()

### 6. Rolling Same Volume Count
Checks if today's volume is exactly the same as yesterday's (a sign of bad data), then sums those occurrences.

# 1. Check if Volume(t) - Volume(t-1) equals 0
# .diff() calculates current row minus previous row
has_same_volume = (df_ohlcv['Volume'].diff() == 0).astype(int)

# 2. Sum the errors over window (W=252)
rolling_same_vol_count = has_same_volume.rolling(window=252).sum()

```
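
To check the first three formulas by hand, here is a runnable single-ticker example on three invented bars. It uses a 2-day ATR period (`alpha = 1/2`) so the numbers stay easy to follow. This is a toy setting, not the 14-day default.

```python
import pandas as pd

# Three invented daily bars for one ticker.
df = pd.DataFrame(
    {
        "Adj High": [10.0, 12.0, 11.0],
        "Adj Low": [9.0, 10.5, 9.5],
        "Adj Close": [9.5, 11.5, 10.0],
    }
)

prev_close = df["Adj Close"].shift(1)

# True Range: largest of the three distances. On the first bar there is no
# previous close, so only High - Low contributes (NaNs are skipped by default).
tr = pd.concat(
    [
        df["Adj High"] - df["Adj Low"],
        (df["Adj High"] - prev_close).abs(),
        (df["Adj Low"] - prev_close).abs(),
    ],
    axis=1,
).max(axis=1)

# ATR: recursive exponential smoothing with alpha = 1 / period.
atr = tr.ewm(alpha=1 / 2, adjust=False).mean()

# ATRP: ATR as a fraction of the closing price.
atrp = atr / df["Adj Close"]

print(pd.DataFrame({"TR": tr, "ATR": atr, "ATRP": atrp}))
```

On the second bar the gap up from the previous close (12.0 vs 9.5) dominates the day's own range, which is exactly the case True Range is designed to capture.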

================================  

import pandas as pd
import numpy as np
import plotly.graph_objects as go
import ipywidgets as widgets

from dataclasses import dataclass, field
from typing import List, Dict, Optional, Any, Union, TypedDict
from collections import Counter
from datetime import datetime, date
from pandas.testing import assert_series_equal


# pd.set_option('display.max_rows', None)  display all rows
pd.set_option("display.max_rows", 100)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", 1000)
pd.set_option("display.max_colwidth", 50)
pd.set_option("display.precision", 4)


# ==============================================================================
# GLOBAL SETTINGS: The "Control Panel" for the Strategy
# ==============================================================================

GLOBAL_SETTINGS = {
    # ENVIRONMENT (The "Where")
    "benchmark_ticker": "SPY",
    "calendar_ticker": "SPY",  # Used as the "Master Clock" for trading days
    # DATA SANITIZER (The "Glitches & Gaps" Protector)
    "handle_zeros_as_nan": True,  # Convert 0.0 prices to NaN to prevent math errors
    "max_data_gap_ffill": 1,  # Max consecutive days to "Forward Fill" missing data
    # IMPLICATION OF nan_price_replacement:
    # - This defines what happens if the "Forward Fill" limit is exceeded.
    # - If set to 0.0: A permanent data gap will look like a "total loss" (-100%).
    #   The equity curve will plummet. Good for "disaster detection."
    #   Sharpe and Sharpe(ATR) drop because: return (gets smaller) / std (gets larger)
    # - If set to np.nan: A permanent gap will cause portfolio calculations to return NaN.
    #   The chart may break or show gaps. Good for "math integrity."
    "nan_price_replacement": 0.0,
    # STRATEGY PARAMETERS (The "How")
    "atr_period": 14,  # Used for volatility normalization
    "quality_window": 252,  # 1 year lookback for liquidity/quality stats
    "quality_min_periods": 126,  # Min history required to judge a stock
    # QUALITY THRESHOLDS (The "Rules")
    "thresholds": {
        # HARD LIQUIDITY FLOOR
        # Logic: Calculates (Adj Close * Volume) daily, then takes the ROLLING MEDIAN
        # over the quality_window (252 days). Filters out stocks where the
        # typical daily dollar turnover is below this absolute value.
        "min_median_dollar_volume": 1_000_000,
        # DYNAMIC LIQUIDITY CUTOFF (Relative to Universe)
        # Logic: On the decision date, the engine calculates the X-quantile
        # of 'RollMedDollarVol' across ALL available stocks.
        # Setting this to 0.40 calculates the 40th percentile and requires
        # stocks to be above it, effectively keeping only the TOP 60% of the market.
        "min_liquidity_percentile": 0.40,
        # PRICE/VOLUME STALENESS
        # Logic: Creates a binary flag (1 if Volume is 0 OR High equals Low).
        # It then calculates the ROLLING MEAN of this flag.
        # A value of 0.05 means the stock is rejected if it was "stale"
        # for more than 5% of the trading days in the rolling window.
        "max_stale_pct": 0.05,
        # DATA INTEGRITY (FROZEN VOLUME)
        # Logic: Checks if Volume is identical to the previous day (Volume.diff() == 0).
        # It calculates the ROLLING SUM of these occurrences over the window.
        # If the exact same volume is reported more than 10 times, the stock
        # is rejected as having "frozen" or low-quality data.
        "max_same_vol_count": 10,
    },
}


# ==============================================================================
# SECTION A: CORE HELPER FUNCTIONS & FEATURE GENERATION
# (Unchanged from previous version)
# ==============================================================================
# ... (Keep generate_features, calculate_gain, calculate_sharpe,
#      calculate_sharpe_atr, calculate_buy_and_hold_performance as is) ...


def generate_features(
    df_ohlcv: pd.DataFrame,
    atr_period: int = 14,
    quality_window: int = 252,
    quality_min_periods: int = 126,
) -> pd.DataFrame:
    # 1. Sort and Group
    if not df_ohlcv.index.is_monotonic_increasing:
        df_ohlcv = df_ohlcv.sort_index()
    grouped = df_ohlcv.groupby(level="Ticker")

    # 2. ATR Calculation (Existing)
    prev_close = grouped["Adj Close"].shift(1)
    tr = pd.concat(
        [
            df_ohlcv["Adj High"] - df_ohlcv["Adj Low"],
            abs(df_ohlcv["Adj High"] - prev_close),
            abs(df_ohlcv["Adj Low"] - prev_close),
        ],
        axis=1,
    ).max(axis=1, skipna=False)

    atr = tr.groupby(level="Ticker").transform(
        lambda x: x.ewm(alpha=1 / atr_period, adjust=False).mean()
    )
    atrp = (atr / df_ohlcv["Adj Close"]).replace([np.inf, -np.inf], np.nan)

    # 3. --- NEW: MOMENTUM / RETURN FEATURES ---
    # We calculate percentage change for specific windows useful for Pullbacks (3D, 5D) and Trends (21D)
    # Note: We use grouped.pct_change to respect Ticker boundaries
    roc_1 = grouped["Adj Close"].pct_change(1)
    roc_3 = grouped["Adj Close"].pct_change(3)
    roc_5 = grouped["Adj Close"].pct_change(5)
    roc_10 = grouped["Adj Close"].pct_change(10)
    roc_21 = grouped["Adj Close"].pct_change(21)

    indicator_df = pd.DataFrame(
        {
            "ATR": atr,
            "ATRP": atrp,
            "ROC_1": roc_1,
            "ROC_3": roc_3,
            "ROC_5": roc_5,
            "ROC_10": roc_10,
            "ROC_21": roc_21,
        }
    )

    # 4. Quality/Liquidity Features (Existing)
    quality_temp_df = pd.DataFrame(
        {
            "IsStale": np.where(
                (df_ohlcv["Volume"] == 0)
                | (df_ohlcv["Adj High"] == df_ohlcv["Adj Low"]),
                1,
                0,
            ),
            "DollarVolume": df_ohlcv["Adj Close"] * df_ohlcv["Volume"],
            "HasSameVolume": (grouped["Volume"].diff() == 0).astype(int),
        },
        index=df_ohlcv.index,
    )

    rolling_result = (
        quality_temp_df.groupby(level="Ticker")
        .rolling(window=quality_window, min_periods=quality_min_periods)
        .agg({"IsStale": "mean", "DollarVolume": "median", "HasSameVolume": "sum"})
        .rename(
            columns={
                "IsStale": "RollingStalePct",
                "DollarVolume": "RollMedDollarVol",
                "HasSameVolume": "RollingSameVolCount",
            }
        )
        .reset_index(level=0, drop=True)
    )

    # 5. Merge
    return pd.concat([indicator_df, rolling_result], axis=1)


def calculate_buy_and_hold_performance(
    df_close, features_df, tickers, start_date, end_date
):
    if not tickers:
        return pd.Series(dtype=float), pd.Series(dtype=float), pd.Series(dtype=float)
    ticker_counts = Counter(tickers)
    initial_weights = pd.Series({t: c / len(tickers) for t, c in ticker_counts.items()})
    prices_raw = df_close[initial_weights.index.tolist()].loc[start_date:end_date]
    if prices_raw.dropna(how="all").empty:
        return pd.Series(dtype=float), pd.Series(dtype=float), pd.Series(dtype=float)
    prices_norm = prices_raw.div(prices_raw.bfill().iloc[0])
    weighted_growth = prices_norm.mul(initial_weights, axis="columns")
    value_series = weighted_growth.sum(axis=1)
    return_series = value_series.ffill().pct_change()
    full_idx = pd.MultiIndex.from_product(
        [initial_weights.index.tolist(), return_series.index], names=["Ticker", "Date"]
    )
    feat_subset = features_df.reindex(full_idx)["ATRP"].unstack(level="Ticker")
    # Time-varying portfolio weights, aligned once with the per-ticker ATRP columns
    weights_ts = weighted_growth.div(value_series, axis="index")
    weights_aligned, atrp_aligned = weights_ts.align(feat_subset, join="inner", axis=1)
    atrp_series = (weights_aligned * atrp_aligned).sum(axis=1)
    return value_series, return_series, atrp_series


def calculate_summary_gain(price_series: pd.Series) -> float:
    """REPORTING: Returns the total return of a single series."""
    if price_series.dropna().shape[0] < 2: 
        return 0.0
    # (Final Price / Starting Price) - 1
    res = (price_series.ffill().iloc[-1] / price_series.bfill().iloc[0]) - 1
    return float(res) if np.isfinite(res) else 0.0

def calculate_cross_sectional_gain(price_df: pd.DataFrame) -> pd.Series:
    """RANKING: Returns the total return for every ticker in the universe."""
    if price_df.empty: 
        return pd.Series(dtype=float)
    # Vectorized calculation across all columns (tickers)
    res = (price_df.ffill().iloc[-1] / price_df.bfill().iloc[0]) - 1
    return res.replace([np.inf, -np.inf], np.nan).fillna(0.0)

def calculate_summary_sharpe(return_series: pd.Series) -> float:
    """REPORTING: Returns a single Reward value."""
    if return_series.dropna().shape[0] < 2: return 0.0
    mu, std = return_series.mean(), return_series.std()
    
    # SENIOR FIX: Volatility floor to prevent 'Infinity' or 'Exploding' rewards
    if std < 1e-6: return 0.0 
    
    with np.errstate(divide='ignore', invalid='ignore'):
        res = (mu / std) * np.sqrt(252)
    return float(res) if np.isfinite(res) else 0.0

def calculate_cross_sectional_sharpe(return_df: pd.DataFrame) -> pd.Series:
    """RANKING: Returns a Series of values for the whole universe."""
    if return_df.empty: return pd.Series(dtype=float)
    mu, std = return_df.mean(), return_df.std()
    
    with np.errstate(divide='ignore', invalid='ignore'):
        res = (mu / std) * np.sqrt(252)
    
    # SENIOR FIX: Convert 'Broken' data (std=0) into 0.0 reward
    return res.replace([np.inf, -np.inf], np.nan).fillna(0.0)

def calculate_summary_sharpe_atr(return_series: pd.Series, atrp_input: Union[pd.Series, float]) -> float:
    """REPORTING: Returns a single Reward value normalized by Volatility."""
    if return_series.dropna().shape[0] < 2: return 0.0
    avg_atrp = atrp_input.mean() if hasattr(atrp_input, 'mean') else atrp_input
    
    if avg_atrp < 1e-6: return 0.0 # Safety floor
    
    with np.errstate(divide='ignore', invalid='ignore'):
        res = return_series.mean() / avg_atrp
    return float(res) if np.isfinite(res) else 0.0

def calculate_cross_sectional_sharpe_atr(return_df: pd.DataFrame, atrp_series: pd.Series) -> pd.Series:
    """RANKING: Returns a Series of Volatility-normalized values."""
    with np.errstate(divide='ignore', invalid='ignore'):
        res = return_df.mean() / atrp_series
    return res.replace([np.inf, -np.inf], np.nan).fillna(0.0)


# ==============================================================================
# SECTION B: METRIC REGISTRY (UPDATED VARIABLES)
# ==============================================================================


class MarketObservation(TypedDict):
    """
    The 'STATE' (Observation) in Reinforcement Learning.
    This defines the context given to the agent to make a decision.
    """

    lookback_returns: pd.DataFrame  # (Time x Tickers)
    lookback_close: pd.DataFrame  # (Time x Tickers)
    atrp: pd.Series  # (Tickers,) - The mean ATR% over lookback
    roc_1: pd.Series  # (Tickers,) - Current 1D Momentum
    roc_3: pd.Series  # ... etc
    roc_5: pd.Series
    roc_10: pd.Series
    roc_21: pd.Series


# Use the centralized helper functions for calculations


def metric_price(obs: MarketObservation) -> pd.Series:
    return calculate_cross_sectional_gain(obs["lookback_close"])


def metric_sharpe(obs: MarketObservation) -> pd.Series:
    return calculate_cross_sectional_sharpe(obs["lookback_returns"])


def metric_sharpe_atr(obs: MarketObservation) -> pd.Series:
    return calculate_cross_sectional_sharpe_atr(obs["lookback_returns"], obs["atrp"])


METRIC_REGISTRY = {
    "Price": metric_price,
    "Sharpe": metric_sharpe,
    "Sharpe (ATR)": metric_sharpe_atr,
    "Momentum 1D": lambda obs: obs["roc_1"],
    "Momentum 3D": lambda obs: obs["roc_3"],
    "Momentum 5D": lambda obs: obs["roc_5"],
    "Momentum 10D": lambda obs: obs["roc_10"],
    "Momentum 1M": lambda obs: obs["roc_21"],
    "Pullback 1D": lambda obs: -obs["roc_1"],
    "Pullback 3D": lambda obs: -obs["roc_3"],
    "Pullback 5D": lambda obs: -obs["roc_5"],
    "Pullback 10D": lambda obs: -obs["roc_10"],
    "Pullback 1M": lambda obs: -obs["roc_21"],
}


# ==============================================================================
# SECTION C: DATA CONTRACTS (UPDATED v2.2 - Verification Ready)
# ==============================================================================


@dataclass
class EngineInput:
    mode: str
    start_date: pd.Timestamp
    lookback_period: int
    holding_period: int
    metric: str
    benchmark_ticker: str
    rank_start: int = 1
    rank_end: int = 10
    # Default factory pulls from Global thresholds
    quality_thresholds: Dict[str, float] = field(
        default_factory=lambda: GLOBAL_SETTINGS["thresholds"].copy()
    )
    manual_tickers: List[str] = field(default_factory=list)
    debug: bool = False


@dataclass
class EngineOutput:
    portfolio_series: pd.Series
    benchmark_series: pd.Series
    normalized_plot_data: pd.DataFrame
    tickers: List[str]
    initial_weights: pd.Series
    perf_metrics: Dict[str, float]
    results_df: pd.DataFrame

    # Dates
    start_date: pd.Timestamp
    decision_date: pd.Timestamp
    buy_date: pd.Timestamp
    holding_end_date: pd.Timestamp

    error_msg: Optional[str] = None
    debug_data: Optional[Dict[str, Any]] = None


class AlphaEngine:
    def __init__(
        self,
        df_ohlcv: pd.DataFrame,
        features_df: Optional[pd.DataFrame] = None,
        df_close_wide: Optional[pd.DataFrame] = None,
        master_ticker: str = GLOBAL_SETTINGS["calendar_ticker"],
    ):
        print("--- ⚙️ Initializing AlphaEngine v2.2 (Sanitized) ---")

        # 1. SETUP PRICES (CLEAN-AT-ENTRY)
        if df_close_wide is not None:
            self.df_close = df_close_wide
        else:
            print("🐢 Pivoting and Sanitizing Price Data...")
            self.df_close = df_ohlcv["Adj Close"].unstack(level=0)

        # APPLY DATA SANITIZER LOGIC
        if GLOBAL_SETTINGS["handle_zeros_as_nan"]:
            # Replace 0.0 with NaN so math functions (mean/std) ignore them
            self.df_close = self.df_close.replace(0, np.nan)

        # Smooth over 1-2 day glitches (The "FNV" Fix)
        self.df_close = self.df_close.ffill(limit=GLOBAL_SETTINGS["max_data_gap_ffill"])

        # Handle the remaining "unfillable" gaps
        self.df_close = self.df_close.fillna(GLOBAL_SETTINGS["nan_price_replacement"])

        # 2. SETUP FEATURES
        if features_df is not None:
            self.features_df = features_df
        else:
            # We pass the cleaned price data if needed, or calculate from raw
            self.features_df = generate_features(
                df_ohlcv,
                atr_period=GLOBAL_SETTINGS["atr_period"],
                quality_window=GLOBAL_SETTINGS["quality_window"],
                quality_min_periods=GLOBAL_SETTINGS["quality_min_periods"],
            )

        # 3. Setup Calendar
        if master_ticker not in self.df_close.columns:
            master_ticker = self.df_close.columns[0]
        self.trading_calendar = (
            self.df_close[master_ticker].dropna().index.unique().sort_values()
        )

    def run(self, inputs: EngineInput) -> EngineOutput:

        # --- Step 1: Validate Timeline ---
        dates, error = self._validate_timeline(inputs)
        if error:
            return self._error_result(error)
        (safe_start, safe_decision, safe_buy, safe_end) = dates

        # --- Step 2: Select Assets ---
        tickers_to_trade, results_table, debug_dict, error = self._select_tickers(
            inputs, safe_start, safe_decision
        )
        if error:
            return self._error_result(error)

        # --- Step 3: Generate Equity Curves ---
        p_val, p_ret, p_atrp = calculate_buy_and_hold_performance(
            self.df_close, self.features_df, tickers_to_trade, safe_start, safe_end
        )
        b_val, b_ret, b_atrp = calculate_buy_and_hold_performance(
            self.df_close,
            self.features_df,
            [inputs.benchmark_ticker],
            safe_start,
            safe_end,
        )

        # --- Step 4: Calculate Unified Metrics & CAPTURE SLICES ---
        metrics = {}

        # Portfolio Calculation
        p_metrics, p_slices = self._calculate_period_metrics(
            p_val, p_ret, p_atrp, safe_decision, safe_buy, prefix="p"
        )
        metrics.update(p_metrics)

        # Benchmark Calculation
        b_metrics, b_slices = self._calculate_period_metrics(
            b_val, b_ret, b_atrp, safe_decision, safe_buy, prefix="b"
        )
        metrics.update(b_metrics)

        # Store Verification Data
        debug_dict["verification"] = {"portfolio": p_slices, "benchmark": b_slices}
        # --- VERIFICATION ADDITION: Portfolio Raw Components ---
        debug_dict["portfolio_raw_components"] = {
            "prices": self.df_close[tickers_to_trade].loc[safe_start:safe_end],
            "atrp": self.features_df.loc[(tickers_to_trade, slice(safe_start, safe_end)), "ATRP"].unstack(level=0)
        }
        # Add benchmark components for ATRP verification
        debug_dict["benchmark_raw_components"] = {
            "atrp": self.features_df.loc[([inputs.benchmark_ticker], slice(safe_start, safe_end)), "ATRP"].unstack(level=0)
        }

        # --- Step 5: Final Packaging ---
        plot_data = self._get_normalized_plot_data(
            tickers_to_trade, safe_start, safe_end
        )

        if not plot_data.empty and not results_table.empty:
            holding_period_slice = plot_data.loc[safe_buy:]
            if len(holding_period_slice) > 0:
                gains = (
                    holding_period_slice.iloc[-1] / holding_period_slice.iloc[0]
                ) - 1
                results_table["Holding Gain"] = results_table.index.map(gains)

        ticker_counts = Counter(tickers_to_trade)
        weights = pd.Series(
            {t: c / len(tickers_to_trade) for t, c in ticker_counts.items()}
        )

        return EngineOutput(
            portfolio_series=p_val,
            benchmark_series=b_val,
            normalized_plot_data=plot_data,
            tickers=tickers_to_trade,
            initial_weights=weights,
            perf_metrics=metrics,
            results_df=results_table,
            start_date=safe_start,
            decision_date=safe_decision,
            buy_date=safe_buy,
            holding_end_date=safe_end,
            error_msg=None,
            debug_data=debug_dict,
        )

    # ==============================================================================
    # INTERNAL LOGIC MODULES
    # ==============================================================================

    def _validate_timeline(self, inputs: EngineInput):
        cal = self.trading_calendar
        last_idx = len(cal) - 1

        if len(cal) <= inputs.lookback_period:
            return (
                None,
                f"❌ Dataset too small.\nNeed > {inputs.lookback_period} days of history.",
            )

        # 2. Check "Past" Constraints (Lookback)
        min_decision_date = cal[inputs.lookback_period]
        if inputs.start_date < min_decision_date:
            return None, (
                f"❌ Not enough history for a {inputs.lookback_period}-day lookback.\n"
                f"Earliest valid Decision Date: {min_decision_date.date()}"
            )

        # 3. Check "Future" Constraints (Entry T+1 and Holding Period)
        required_future_days = 1 + inputs.holding_period
        latest_valid_idx = last_idx - required_future_days

        if latest_valid_idx < 0:
            return (
                None,
                f"❌ Holding period too long.\n{inputs.holding_period} days exceeds available data.",
            )

        # If user picked a date beyond the available "future" runway
        if inputs.start_date > cal[latest_valid_idx]:
            latest_date = cal[latest_valid_idx].date()
            return None, (
                f"❌ Decision Date too late for a {inputs.holding_period}-day hold.\n"
                f"Latest valid date: {latest_date}. Please move picker back."
            )

        # 4. Map the safe indices
        decision_idx = cal.searchsorted(inputs.start_date)
        if decision_idx > latest_valid_idx:
            decision_idx = latest_valid_idx

        start_idx = decision_idx - inputs.lookback_period
        entry_idx = decision_idx + 1
        end_idx = entry_idx + inputs.holding_period

        return (cal[start_idx], cal[decision_idx], cal[entry_idx], cal[end_idx]), None

    def _select_tickers(self, inputs: EngineInput, start_date, decision_date):
        debug_dict = {}
        if inputs.mode == "Manual List":
            validation_errors = []
            valid_tickers = []
            for t in inputs.manual_tickers:
                if t not in self.df_close.columns:
                    validation_errors.append(f"❌ {t}: Not found.")
                    continue
                if pd.isna(self.df_close.at[start_date, t]):
                    validation_errors.append(f"⚠️ {t}: No data on start date.")
                    continue
                valid_tickers.append(t)

            if validation_errors:
                return [], pd.DataFrame(), {}, "\n".join(validation_errors)
            if not valid_tickers:
                return [], pd.DataFrame(), {}, "No valid tickers found."
            return valid_tickers, pd.DataFrame(index=valid_tickers), {}, None

        else:  # Ranking
            audit_info = {}
            eligible_tickers = self._filter_universe(
                decision_date, inputs.quality_thresholds, audit_info
            )
            debug_dict["audit_liquidity"] = audit_info

            if not eligible_tickers:
                return (
                    [],
                    pd.DataFrame(),
                    debug_dict,
                    "No tickers passed quality filters.",
                )

            lookback_close = self.df_close.loc[
                start_date:decision_date, eligible_tickers
            ]
            idx_product = pd.MultiIndex.from_product(
                [eligible_tickers, lookback_close.index], names=["Ticker", "Date"]
            )

            feat_slice_current = self.features_df.xs(
                decision_date, level="Date"
            ).reindex(eligible_tickers)
            feat_slice_period = self.features_df.loc[
                (slice(None), lookback_close.index), :
            ].reindex(idx_product)
            atrp_mean = feat_slice_period["ATRP"].groupby(level="Ticker").mean()

            # 1. Package the Observation (The 'State')
            observation: MarketObservation = {
                "lookback_close": lookback_close,
                "lookback_returns": lookback_close.ffill().pct_change(),
                "atrp": atrp_mean,
                "roc_1": feat_slice_current["ROC_1"],
                "roc_3": feat_slice_current["ROC_3"],
                "roc_5": feat_slice_current["ROC_5"],
                "roc_10": feat_slice_current["ROC_10"],
                "roc_21": feat_slice_current["ROC_21"],
            }

            # 2. Run the Strategy (The 'Agent')
            if inputs.metric not in METRIC_REGISTRY:
                return [], pd.DataFrame(), {}, f"Strategy '{inputs.metric}' not found."

            metric_vals = METRIC_REGISTRY[inputs.metric](observation)
            sorted_tickers = metric_vals.sort_values(ascending=False)
            start_r = max(0, inputs.rank_start - 1)
            end_r = inputs.rank_end
            selected_tickers = sorted_tickers.iloc[start_r:end_r].index.tolist()

            # --- VERIFICATION ADDITION: Ranking Audit (Bot Version) ---
            debug_dict["full_universe_ranking"] = pd.DataFrame(
                {
                    "Strategy_Score": metric_vals,
                    "Lookback_Return_Ann": observation["lookback_returns"].mean() * 252,
                    "Lookback_ATRP": observation["atrp"],
                }
            )

            if not selected_tickers:
                return (
                    [],
                    pd.DataFrame(),
                    debug_dict,
                    "No tickers generated from ranking.",
                )

            results_table = pd.DataFrame(
                {
                    "Rank": range(
                        inputs.rank_start, inputs.rank_start + len(selected_tickers)
                    ),
                    "Ticker": selected_tickers,
                    "Strategy Value": sorted_tickers.loc[selected_tickers].values,
                }
            ).set_index("Ticker")

            return selected_tickers, results_table, debug_dict, None

    def _filter_universe(self, date_ts, thresholds, audit_container=None):
        avail_dates = (
            self.features_df.index.get_level_values("Date").unique().sort_values()
        )
        valid_dates = avail_dates[avail_dates <= date_ts]
        if valid_dates.empty:
            return []
        target_date = valid_dates[-1]
        day_features = self.features_df.xs(target_date, level="Date")

        vol_cutoff = thresholds.get("min_median_dollar_volume", 0)
        percentile_used = "N/A"
        if "min_liquidity_percentile" in thresholds:
            percentile_used = thresholds["min_liquidity_percentile"]
            dynamic_val = day_features["RollMedDollarVol"].quantile(percentile_used)
            vol_cutoff = max(vol_cutoff, dynamic_val)

        mask = (
            (day_features["RollMedDollarVol"] >= vol_cutoff)
            & (day_features["RollingStalePct"] <= thresholds["max_stale_pct"])
            & (day_features["RollingSameVolCount"] <= thresholds["max_same_vol_count"])
        )

        if audit_container is not None:
            audit_container["date"] = target_date
            audit_container["total_tickers_available"] = len(day_features)
            audit_container["percentile_setting"] = percentile_used
            audit_container["final_cutoff_usd"] = vol_cutoff
            audit_container["tickers_passed"] = mask.sum()
            snapshot = day_features.copy()
            snapshot["Calculated_Cutoff"] = vol_cutoff
            snapshot["Passed_Vol_Check"] = snapshot["RollMedDollarVol"] >= vol_cutoff
            snapshot["Passed_Final"] = mask
            snapshot = snapshot.sort_values("RollMedDollarVol", ascending=False)
            audit_container["universe_snapshot"] = snapshot

        return day_features[mask].index.tolist()



    def _calculate_period_metrics(
        self, val_series, ret_series, atrp_series, decision_date, buy_date, prefix
    ):
        """
        Returns (metrics_dict, verification_slices_dict)
        """
        metrics = {}
        slices = {}  # Store the exact Series used for math

        if val_series.empty:
            return metrics, slices

        # --- A. Define Slices (The 'State' Slices) ---

        # 1. Full (The entire timeline provided)
        # We create aliases here so the math section (Part B) has a perfect pattern.
        full_val = val_series
        full_ret = ret_series
        full_atrp = atrp_series

        # 2. Lookback (Start -> Decision Date)
        lookback_val = val_series.loc[:decision_date]
        lookback_ret = ret_series.loc[:decision_date]
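        # reindex (as in the holding slice below) so dates missing from
        # atrp_series become NaN instead of raising a KeyError
        lookback_atrp = atrp_series.reindex(lookback_ret.index)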

        # 3. Holding (Buy Date -> End)
        holding_val = val_series.loc[buy_date:]
        if not holding_val.empty:
            holding_ret = holding_val.pct_change()
            holding_atrp = atrp_series.reindex(holding_ret.index)
        else:
            holding_ret = pd.Series(dtype=float)
            holding_atrp = pd.Series(dtype=float)

        # --- B. Calculate Metrics (The 'Rewards') ---
        # We use the 'summary' versions because we are evaluating ONE equity curve.
        # Notice the clean pattern: Full -> Lookback -> Holding

        # 1. Gain Metrics (Total Growth)
        metrics[f"full_{prefix}_gain"] = calculate_summary_gain(full_val)
        metrics[f"lookback_{prefix}_gain"] = calculate_summary_gain(lookback_val)
        metrics[f"holding_{prefix}_gain"] = calculate_summary_gain(holding_val)

        # 2. Sharpe Metrics (Risk-Adjusted Returns)
        metrics[f"full_{prefix}_sharpe"] = calculate_summary_sharpe(full_ret)
        metrics[f"lookback_{prefix}_sharpe"] = calculate_summary_sharpe(lookback_ret)
        metrics[f"holding_{prefix}_sharpe"] = calculate_summary_sharpe(holding_ret)

        # 3. Sharpe (ATR) Metrics (Volatility-Normalized Returns)
        metrics[f"full_{prefix}_sharpe_atr"] = calculate_summary_sharpe_atr(full_ret, full_atrp)
        metrics[f"lookback_{prefix}_sharpe_atr"] = calculate_summary_sharpe_atr(lookback_ret, lookback_atrp)
        metrics[f"holding_{prefix}_sharpe_atr"] = calculate_summary_sharpe_atr(holding_ret, holding_atrp)

        # --- C. Populate Slices for Verification (FIXED) ---
        slices["full_val"] = full_val
        slices["full_ret"] = full_ret
        slices["full_atrp"] = full_atrp
        
        slices["lookback_val"] = lookback_val
        slices["lookback_ret"] = lookback_ret
        slices["lookback_atrp"] = lookback_atrp
        
        slices["holding_val"] = holding_val
        slices["holding_ret"] = holding_ret
        slices["holding_atrp"] = holding_atrp

        return metrics, slices

    def _get_normalized_plot_data(self, tickers, start_date, end_date):
        if not tickers:
            return pd.DataFrame()
        # Deduplicate while preserving ticker order (set() would scramble the columns)
        data = self.df_close[list(dict.fromkeys(tickers))].loc[start_date:end_date]
        if data.empty:
            return pd.DataFrame()
        return data / data.bfill().iloc[0]

    def _error_result(self, msg):
        return EngineOutput(
            portfolio_series=pd.Series(dtype=float),
            benchmark_series=pd.Series(dtype=float),
            normalized_plot_data=pd.DataFrame(),
            tickers=[],
            initial_weights=pd.Series(dtype=float),
            perf_metrics={},
            results_df=pd.DataFrame(),
            start_date=pd.Timestamp.min,
            decision_date=pd.Timestamp.min,
            buy_date=pd.Timestamp.min,
            holding_end_date=pd.Timestamp.min,
            error_msg=msg,
        )


# ==============================================================================
# SECTION E: THE UI (Visualization) - UPDATED v2.4 (Complete Timeline)
# ==============================================================================


def plot_walk_forward_analyzer(
    df_ohlcv,
    precomputed_features=None,
    precomputed_close=None,
    default_start_date="2020-01-01",
    default_lookback=126,
    default_holding=63,
    default_strategy="Sharpe (ATR)",
    default_rank_start=1,
    default_rank_end=10,
    default_benchmark_ticker="SPY",
    master_calendar_ticker="XOM",
    quality_thresholds=None,
    debug=False,
):

    engine = AlphaEngine(
        df_ohlcv,
        features_df=precomputed_features,
        df_close_wide=precomputed_close,
        master_ticker=master_calendar_ticker,
    )

    # Initialize containers
    results_container = [None]
    debug_container = [{}]

    # If no thresholds passed, use the global Source of Truth
    if quality_thresholds is None:
        quality_thresholds = GLOBAL_SETTINGS["thresholds"]

    # --- Widgets ---
    mode_selector = widgets.RadioButtons(
        options=["Ranking", "Manual List"],
        value="Ranking",
        description="Mode:",
        layout={"width": "max-content"},
        style={"description_width": "initial"},
    )
    lookback_input = widgets.IntText(
        value=default_lookback,
        description="Lookback (Days):",
        layout={"width": "200px"},
        style={"description_width": "initial"},
    )
    decision_date_picker = widgets.DatePicker(
        description="Decision Date:",
        value=pd.to_datetime(default_start_date),
        layout={"width": "auto"},
        style={"description_width": "initial"},
    )
    holding_input = widgets.IntText(
        value=default_holding,
        description="Holding (Days):",
        layout={"width": "200px"},
        style={"description_width": "initial"},
    )
    strategy_dropdown = widgets.Dropdown(
        options=list(METRIC_REGISTRY.keys()),
        value=default_strategy,
        description="Strategy:",
        layout={"width": "220px"},
        style={"description_width": "initial"},
    )
    benchmark_input = widgets.Text(
        value=default_benchmark_ticker,
        description="Benchmark:",
        placeholder="Enter Ticker",
        layout={"width": "180px"},
        style={"description_width": "initial"},
    )
    rank_start_input = widgets.IntText(
        value=default_rank_start,
        description="Rank Start:",
        layout={"width": "150px"},
        style={"description_width": "initial"},
    )
    rank_end_input = widgets.IntText(
        value=default_rank_end,
        description="Rank End:",
        layout={"width": "150px"},
        style={"description_width": "initial"},
    )
    manual_tickers_input = widgets.Textarea(
        value="",
        placeholder="Enter tickers...",
        description="Manual Tickers:",
        layout={"width": "400px", "height": "80px"},
        style={"description_width": "initial"},
    )
    update_button = widgets.Button(description="Run Simulation", button_style="primary")
    ticker_list_output = widgets.Output()

    # --- Layouts ---
    timeline_box = widgets.HBox(
        [lookback_input, decision_date_picker, holding_input],
        layout=widgets.Layout(
            justify_content="space-between",
            border="1px solid #ddd",
            padding="10px",
            margin="5px",
        ),
    )
    strategy_box = widgets.HBox([strategy_dropdown, benchmark_input])
    ranking_box = widgets.HBox([rank_start_input, rank_end_input])

    def on_mode_change(c):
        ranking_box.layout.display = "flex" if c["new"] == "Ranking" else "none"
        manual_tickers_input.layout.display = (
            "none" if c["new"] == "Ranking" else "flex"
        )
        strategy_dropdown.disabled = c["new"] == "Manual List"

    mode_selector.observe(on_mode_change, names="value")
    on_mode_change({"new": mode_selector.value})

    ui = widgets.VBox(
        [
            widgets.HTML(
                "<b>1. Timeline Configuration:</b> (Past <--- Decision ---> Future)"
            ),
            timeline_box,
            widgets.HTML("<b>2. Strategy Settings:</b>"),
            widgets.HBox([mode_selector, strategy_box]),
            ranking_box,
            manual_tickers_input,
            widgets.HTML("<hr>"),
            update_button,
            ticker_list_output,
        ],
        layout=widgets.Layout(margin="10px 0 20px 0"),
    )

    fig = go.FigureWidget()
    fig.update_layout(
        title="Event-Driven Walk-Forward Analysis",
        height=600,
        template="plotly_white",
        hovermode="x unified",
    )
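    # Pre-allocate 50 hidden traces for individual tickers; update_plot assumes
    # this count, and the benchmark and portfolio traces follow at indices 50 and 51.
    for _ in range(50):
        fig.add_trace(go.Scatter(visible=False, line=dict(width=2)))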
    fig.add_trace(
        go.Scatter(
            name="Benchmark",
            visible=True,
            line=dict(color="black", width=3, dash="dash"),
        )
    )
    fig.add_trace(
        go.Scatter(
            name="Group Portfolio", visible=True, line=dict(color="green", width=3)
        )
    )

    # --- Update Logic ---
    def update_plot(b):
        ticker_list_output.clear_output()
        manual_list = [
            t.strip().upper()
            for t in manual_tickers_input.value.split(",")
            if t.strip()
        ]
        decision_date_raw = pd.to_datetime(decision_date_picker.value)

        inputs = EngineInput(
            mode=mode_selector.value,
            start_date=decision_date_raw,
            lookback_period=lookback_input.value,
            holding_period=holding_input.value,
            metric=strategy_dropdown.value,
            benchmark_ticker=benchmark_input.value.strip().upper(),
            rank_start=rank_start_input.value,
            rank_end=rank_end_input.value,
            quality_thresholds=quality_thresholds,
            manual_tickers=manual_list,
            debug=debug,
        )

        # --- CAPTURE INPUTS FOR AUDIT ---
        debug_container[0]["inputs"] = inputs

        with ticker_list_output:
            res = engine.run(inputs)
            results_container[0] = res

            # --- MERGE ENGINE DEBUG DATA ---
            if res.debug_data:
                debug_container[0].update(res.debug_data)

            if res.error_msg:
                print(f"⚠️ Simulation Stopped: {res.error_msg}")
                return

            # Plotting
            with fig.batch_update():
                cols = res.normalized_plot_data.columns.tolist()
                for i in range(50):
                    if i < len(cols):
                        fig.data[i].update(
                            x=res.normalized_plot_data.index,
                            y=res.normalized_plot_data[cols[i]],
                            name=cols[i],
                            visible=True,
                        )
                    else:
                        fig.data[i].visible = False

                fig.data[50].update(
                    x=res.benchmark_series.index,
                    y=res.benchmark_series.values,
                    name=f"Benchmark ({inputs.benchmark_ticker})",
                    visible=not res.benchmark_series.empty,
                )
                fig.data[51].update(
                    x=res.portfolio_series.index,
                    y=res.portfolio_series.values,
                    visible=True,
                )

                # Visual Lines
                fig.layout.shapes = [
                    dict(
                        type="line",
                        x0=res.decision_date,
                        y0=0,
                        x1=res.decision_date,
                        y1=1,
                        xref="x",
                        yref="paper",
                        line=dict(color="red", width=2, dash="dash"),
                    ),
                    dict(
                        type="line",
                        x0=res.buy_date,
                        y0=0,
                        x1=res.buy_date,
                        y1=1,
                        xref="x",
                        yref="paper",
                        line=dict(color="blue", width=2, dash="dot"),
                    ),
                ]

                fig.layout.annotations = [
                    dict(
                        x=res.decision_date,
                        y=0.05,
                        xref="x",
                        yref="paper",
                        text="DECISION",
                        showarrow=False,
                        bgcolor="red",
                        font=dict(color="white"),
                    ),
                    dict(
                        x=res.buy_date,
                        y=1.0,
                        xref="x",
                        yref="paper",
                        text="ENTRY (T+1)",
                        showarrow=False,
                        bgcolor="blue",
                        font=dict(color="white"),
                    ),
                ]

            start_date = res.start_date.date()
            act_date = res.decision_date.date()
            entry_date = res.buy_date.date()

            # Liquidity Audit Print
            if (
                inputs.mode == "Ranking"
                and res.debug_data
                and "audit_liquidity" in res.debug_data
            ):
                audit = res.debug_data["audit_liquidity"]
                if audit:
                    raw_percentile = audit.get("percentile_setting", "N/A")
                    cut_val = audit.get("final_cutoff_usd", 0)

                    print("-" * 60)
                    print(f"🔍 LIQUIDITY CHECK (On Decision Date: {act_date})")
                    print(
                        f"   Universe Size: {audit.get('total_tickers_available')} tickers"
                    )
                    # The engine reports "N/A" when no percentile threshold is set,
                    # so only do percentile arithmetic on numeric values.
                    if isinstance(raw_percentile, (int, float)):
                        keep_pct = (1 - raw_percentile) * 100  # share of universe kept
                        print(
                            f"   Liquidity Threshold: {raw_percentile*100:.0f}th Percentile"
                        )
                        print(f"   Action: Keeping the Top {keep_pct:.0f}% of Market")
                    else:
                        print("   Liquidity Threshold: fixed dollar-volume cutoff only")
                    print(f"   Calculated Cutoff: ${cut_val:,.0f} / day")
                    print(f"   Tickers Remaining: {audit.get('tickers_passed')}")
                    print("-" * 60)

            # --- UPDATED TIMELINE PRINT ---
            print(
                f"Timeline: Start [ {start_date} ] --> Decision [ {act_date} ] --> Cash (1d) --> Entry [ {entry_date} ] --> End [ {res.holding_end_date.date()} ]"
            )

            if inputs.mode == "Ranking":
                print(f"Ranked Tickers ({len(res.tickers)}):")
            else:
                print("Manual Portfolio Tickers:")
            # Print tickers ten per line
            for i in range(0, len(res.tickers), 10):
                print(", ".join(res.tickers[i : i + 10]))

            m = res.perf_metrics

            # --- DRY UI GENERATION ---
            # 1. Define the metrics we want to display
            metrics_to_show = [
                ("Gain", "gain"),
                ("Sharpe", "sharpe"),
                ("Sharpe (ATR)", "sharpe_atr"),
            ]

            rows = []
            for label, key in metrics_to_show:
                p_row = {
                    "Metric": f"Group {label}",
                    "Full": m.get(f"full_p_{key}"),
                    "Lookback": m.get(f"lookback_p_{key}"),
                    "Holding": m.get(f"holding_p_{key}"),
                }
                b_row = {
                    "Metric": f"Benchmark {label}",
                    "Full": m.get(f"full_b_{key}"),
                    "Lookback": m.get(f"lookback_b_{key}"),
                    "Holding": m.get(f"holding_b_{key}"),
                }

                # Delta calculation
                d_row = {"Metric": f"== {label} Delta"}
                for col in ["Full", "Lookback", "Holding"]:
                    d_row[col] = (p_row[col] or 0) - (b_row[col] or 0)

                rows.extend([p_row, b_row, d_row])

            # "Metric" becomes the row index so each label sits beside its values
            df_report = pd.DataFrame(rows).set_index("Metric")

            # --- 2. STYLING (Sleek & Proportional) ---
            def apply_sleek_style(styler):
                # Signed 4-decimal numbers; missing values render as "N/A"
                styler.format("{:+.4f}", na_rep="N/A")

                # Dynamic Row Highlighting
                def row_logic(row):
                    if "Delta" in row.name:
                        return [
                            "background-color: #f9f9f9; font-weight: 600; border-top: 1px solid #ddd"
                        ] * len(row)
                    if "Group" in row.name:
                        return ["color: #2c5e8f; background-color: #fcfdfe"] * len(row)
                    return ["color: #555"] * len(
                        row
                    )  # Benchmark rows are slightly muted

                styler.apply(row_logic, axis=1)

                styler.set_table_styles(
                    [
                        # Base Table Font - Scaling down to match standard text
                        {
                            "selector": "",
                            "props": [
                                ("font-family", "inherit"),
                                ("font-size", "12px"),
                                ("border-collapse", "collapse"),
                                ("width", "auto"),
                                ("margin-left", "0"),
                            ],
                        },
                        # Header Row - Flattened and Muted
                        {
                            "selector": "th",
                            "props": [
                                ("background-color", "white"),
                                ("color", "#222"),
                                ("font-weight", "600"),
                                ("padding", "6px 12px"),
                                ("border-bottom", "2px solid #444"),
                                ("text-align", "center"),
                                (
                                    "vertical-align",
                                    "bottom",
                                ),  # Aligns 'Metric' with others
                            ],
                        },
                        # Index Column (The "Metric" labels)
                        {
                            "selector": "th.row_heading",
                            "props": [
                                ("text-align", "left"),
                                ("padding-right", "30px"),
                                ("border-bottom", "1px solid #eee"),
                            ],
                        },
                        # Cell Data - Tighter padding
                        {
                            "selector": "td",
                            "props": [
                                ("padding", "4px 12px"),
                                ("border-bottom", "1px solid #eee"),
                            ],
                        },
                        # Keep the first header row rendered as ordinary table cells
                        {
                            "selector": "thead tr:nth-child(1) th",
                            "props": [("display", "table-cell")],
                        },
                    ]
                )

                # Clear the index name so pandas does not render a separate
                # header row just for the "Metric" label.
                styler.index.name = None

                return styler

            display(apply_sleek_style(df_report.style))

    update_button.on_click(update_plot)
    update_plot(None)
    display(ui, fig)
    return results_container, debug_container
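
The timeline line printed by the dashboard (Start, Decision, Entry, End) comes from plain index arithmetic on the trading calendar. The sketch below restates that arithmetic in isolation using only the standard library. `business_days` and `map_timeline` are hypothetical helpers written for this illustration (they are not part of `AlphaEngine`), the weekday-only calendar is an assumption, and `bisect_left` stands in for the `searchsorted` call the engine makes on its pandas calendar.

```python
from bisect import bisect_left
from datetime import date, timedelta


def business_days(first, count):
    """Toy trading calendar: the first `count` weekdays on or after `first`."""
    days, current = [], first
    while len(days) < count:
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days.append(current)
        current += timedelta(days=1)
    return days


def map_timeline(cal, decision_date, lookback, holding):
    """Hypothetical helper: return (start, decision, entry, end) dates."""
    last_idx = len(cal) - 1
    # Reserve one day for the T+1 entry plus the holding period itself.
    latest_valid_idx = last_idx - (1 + holding)
    if latest_valid_idx < lookback:
        raise ValueError("calendar too short for this lookback/holding pair")
    if decision_date < cal[lookback]:
        raise ValueError("not enough history before the decision date")
    if decision_date > cal[latest_valid_idx]:
        raise ValueError("decision date too late for the holding period")

    # First calendar position >= decision_date, so a weekend snaps forward.
    decision_idx = bisect_left(cal, decision_date)
    start_idx = decision_idx - lookback
    entry_idx = decision_idx + 1          # buy on the next trading day
    end_idx = entry_idx + holding
    return cal[start_idx], cal[decision_idx], cal[entry_idx], cal[end_idx]


cal = business_days(date(2024, 1, 1), 30)
start, decision, entry, end = map_timeline(
    cal, date(2024, 1, 20), lookback=5, holding=3
)
print(start, decision, entry, end)
# prints: 2024-01-15 2024-01-22 2024-01-23 2024-01-26
```

The requested date (2024-01-20) is a Saturday, so the decision lands on Monday 2024-01-22. Entry follows one trading day later and the hold ends three trading days after entry.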

# ==============================================================================
# SECTION F: UTILITIES
# ==============================================================================


# def print_nested(d, indent=0, width=4):
#     """Pretty-print any nested dict/list/tuple combination."""
#     spacing = " " * indent
#     if isinstance(d, dict):
#         for k, v in d.items():
#             print(f"{spacing}{k}:")
#             print_nested(v, indent + width, width)
#     elif isinstance(d, (list, tuple)):
#         for item in d:
#             print_nested(item, indent, width)
#     else:
#         print(f"{spacing}{d}")


def print_nested(d, indent=0, width=4):
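    """Pretty-print nested dicts, lists, and tuples with indentation.

    Each dict key is printed on its own line, tagged [SEP] when its value is a
    dict whose values are all dicts and [NEST] for any other dict. The value
    follows on the next line(s), indented by `width` spaces."""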
    spacing = " " * indent

    def _kind(node):
        if not isinstance(node, dict):
            return None
        return "sep" if all(isinstance(v, dict) for v in node.values()) else "nest"

    if isinstance(d, dict):
        for k, v in d.items():
            kind = _kind(v)
            tag = "" if kind is None else f"  [{'SEP' if kind == 'sep' else 'NEST'}]"
            print(f"{spacing}{k}{tag}")
            print_nested(v, indent + width, width)

    elif isinstance(d, (list, tuple)):
        for idx, item in enumerate(d):
            print(f"{spacing}[{idx}]")
            print_nested(item, indent + width, width)

    else:  # leaf – primitive value
        print(f"{spacing}{d}")


def get_ticker_OHLCV(
    df_ohlcv: pd.DataFrame,
    tickers: Union[str, List[str]],
    date_start: str,
    date_end: str,
    return_format: str = "dataframe",
    verbose: bool = True,
) -> Union[pd.DataFrame, dict, list]:
    """
    Get OHLCV data for specified tickers within a date range.

    Parameters
    ----------
    df_ohlcv : pd.DataFrame
        DataFrame with MultiIndex of (ticker, date) and OHLCV columns
    tickers : str or list of str
        Ticker symbol(s) to retrieve
    date_start : str
        Start date in 'YYYY-MM-DD' format
    date_end : str
        End date in 'YYYY-MM-DD' format
    return_format : str, optional
        Format to return data in. Options:
        - 'dataframe': Single DataFrame with MultiIndex (default)
        - 'dict': Dictionary with tickers as keys and DataFrames as values
        - 'separate': List of separate DataFrames for each ticker
    verbose : bool, optional
        Whether to print summary information (default: True)

    Returns
    -------
    Union[pd.DataFrame, dict, list]
        Filtered OHLCV data in specified format

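    Raises
    ------
    TypeError
        If df_ohlcv is not a DataFrame, or tickers is not a str or list
    ValueError
        If the index is not a 2-level MultiIndex, the dates are invalid,
        or return_format is not recognized
    KeyError
        If tickers not found in DataFrame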

    Examples
    --------
    >>> # Get data for single ticker
    >>> vlo_data = get_ticker_OHLCV(df_ohlcv, 'VLO', '2025-08-13', '2025-09-04')

    >>> # Get data for multiple tickers
    >>> multi_data = get_ticker_OHLCV(df_ohlcv, ['VLO', 'JPST'], '2025-08-13', '2025-09-04')

    >>> # Get data as dictionary
    >>> data_dict = get_ticker_OHLCV(df_ohlcv, ['VLO', 'JPST'], '2025-08-13',
    ...                              '2025-09-04', return_format='dict')
    """

    # Input validation
    if not isinstance(df_ohlcv, pd.DataFrame):
        raise TypeError("df_ohlcv must be a pandas DataFrame")

    if not isinstance(df_ohlcv.index, pd.MultiIndex):
        raise ValueError("DataFrame must have MultiIndex of (ticker, date)")

    if len(df_ohlcv.index.levels) != 2:
        raise ValueError("MultiIndex must have exactly 2 levels: (ticker, date)")

    # Convert single ticker to list for consistent processing
    if isinstance(tickers, str):
        tickers = [tickers]
    elif not isinstance(tickers, list):
        raise TypeError("tickers must be a string or list of strings")

    # Convert dates to Timestamps
    try:
        start_date = pd.Timestamp(date_start)
        end_date = pd.Timestamp(date_end)
    except ValueError as e:
        # Chain the original exception so the traceback keeps the root cause
        raise ValueError(f"Invalid date format. Use 'YYYY-MM-DD': {e}") from e

    if start_date > end_date:
        raise ValueError("date_start must be before or equal to date_end")

    # Check if tickers exist in the DataFrame
    available_tickers = df_ohlcv.index.get_level_values(0).unique()
    missing_tickers = [t for t in tickers if t not in available_tickers]

    if missing_tickers:
        raise KeyError(f"Ticker(s) not found in DataFrame: {missing_tickers}")

    # Filter the data using MultiIndex slicing
    try:
        filtered_data = df_ohlcv.loc[(tickers, slice(date_start, date_end)), :]
    except Exception as e:
        raise ValueError(f"Error filtering data: {e}") from e

    # Handle empty results
    if filtered_data.empty:
        if verbose:
            print(
                f"No data found for tickers {tickers} in date range {date_start} to {date_end}"
            )
        return filtered_data

    # Print summary if verbose
    if verbose:
        print(
            f"Data retrieved for {len(tickers)} ticker(s) from {date_start} to {date_end}"
        )
        print(f"Total rows: {len(filtered_data)}")
        print(
            f"Date range in data: {filtered_data.index.get_level_values(1).min()} to "
            f"{filtered_data.index.get_level_values(1).max()}"
        )

        # Print ticker-specific counts
        ticker_counts = filtered_data.index.get_level_values(0).value_counts()
        for ticker in tickers:
            count = ticker_counts.get(ticker, 0)
            if count > 0:
                print(f"  {ticker}: {count} rows")
            else:
                print(f"  {ticker}: No data in range")

    # Return in requested format
    if return_format == "dict":
        result = {}
        for ticker in tickers:
            try:
                result[ticker] = filtered_data.xs(ticker, level=0).loc[
                    date_start:date_end
                ]
            except KeyError:
                result[ticker] = pd.DataFrame()
        return result

    elif return_format == "separate":
        result = []
        for ticker in tickers:
            try:
                result.append(
                    filtered_data.xs(ticker, level=0).loc[date_start:date_end]
                )
            except KeyError:
                result.append(pd.DataFrame())
        return result

    elif return_format == "dataframe":
        return filtered_data

    else:
        raise ValueError(
            f"Invalid return_format: {return_format}. "
            f"Must be 'dataframe', 'dict', or 'separate'"
        )


def get_ticker_features(
    features_df: pd.DataFrame,
    tickers: Union[str, List[str]],
    date_start: str,
    date_end: str,
    return_format: str = "dataframe",
    verbose: bool = True,
) -> Union[pd.DataFrame, dict, list]:
    """
    Get features data for specified tickers within a date range.

    Parameters
    ----------
    features_df : pd.DataFrame
        DataFrame with MultiIndex of (ticker, date) and feature columns
    tickers : str or list of str
        Ticker symbol(s) to retrieve
    date_start : str
        Start date in 'YYYY-MM-DD' format
    date_end : str
        End date in 'YYYY-MM-DD' format
    return_format : str, optional
        Format to return data in. Options:
        - 'dataframe': Single DataFrame with MultiIndex (default)
        - 'dict': Dictionary with tickers as keys and DataFrames as values
        - 'separate': List of separate DataFrames for each ticker
    verbose : bool, optional
        Whether to print summary information (default: True)

    Returns
    -------
    Union[pd.DataFrame, dict, list]
        Filtered features data in specified format
    """
    # Convert single ticker to list for consistent processing
    if isinstance(tickers, str):
        tickers = [tickers]

    # Filter the data using MultiIndex slicing
    try:
        filtered_data = features_df.loc[(tickers, slice(date_start, date_end)), :]
    except Exception as e:
        if verbose:
            print(f"Error filtering data: {e}")
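        # Return an empty container that matches the requested format
        if return_format == "dataframe":
            return pd.DataFrame()
        return [] if return_format == "separate" else {}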

    # Handle empty results
    if filtered_data.empty:
        if verbose:
            print(
                f"No data found for tickers {tickers} in date range {date_start} to {date_end}"
            )
        return filtered_data

    # Print summary if verbose
    if verbose:
        print(
            f"Features data retrieved for {len(tickers)} ticker(s) from {date_start} to {date_end}"
        )
        print(f"Total rows: {len(filtered_data)}")
        print(
            f"Date range in data: {filtered_data.index.get_level_values(1).min()} to "
            f"{filtered_data.index.get_level_values(1).max()}"
        )
        print(f"Available features: {', '.join(filtered_data.columns.tolist())}")

        # Print ticker-specific counts
        ticker_counts = filtered_data.index.get_level_values(0).value_counts()
        for ticker in tickers:
            count = ticker_counts.get(ticker, 0)
            if count > 0:
                print(f"  {ticker}: {count} rows")
            else:
                print(f"  {ticker}: No data in range")

    # Return in requested format
    if return_format == "dict":
        result = {}
        for ticker in tickers:
            try:
                result[ticker] = filtered_data.xs(ticker, level=0).loc[
                    date_start:date_end
                ]
            except KeyError:
                result[ticker] = pd.DataFrame()
        return result

    elif return_format == "separate":
        result = []
        for ticker in tickers:
            try:
                result.append(
                    filtered_data.xs(ticker, level=0).loc[date_start:date_end]
                )
            except KeyError:
                result.append(pd.DataFrame())
        return result

    elif return_format == "dataframe":
        return filtered_data

    else:
        raise ValueError(
            f"Invalid return_format: {return_format}. "
            f"Must be 'dataframe', 'dict', or 'separate'"
        )


def create_combined_dict(
    df_ohlcv: pd.DataFrame,
    features_df: pd.DataFrame,
    tickers: Union[str, List[str]],
    date_start: str,
    date_end: str,
    verbose: bool = True,
) -> dict:
    """
    Create a combined dictionary with both OHLCV and features data for each ticker.

    Parameters
    ----------
    df_ohlcv : pd.DataFrame
        DataFrame with OHLCV data (MultiIndex: ticker, date)
    features_df : pd.DataFrame
        DataFrame with features data (MultiIndex: ticker, date)
    tickers : str or list of str
        Ticker symbol(s) to retrieve
    date_start : str
        Start date in 'YYYY-MM-DD' format
    date_end : str
        End date in 'YYYY-MM-DD' format
    verbose : bool, optional
        Whether to print progress information (default: True)

    Returns
    -------
    dict
        Dictionary with tickers as keys and combined DataFrames (OHLCV + features) as values
    """
    # Convert single ticker to list
    if isinstance(tickers, str):
        tickers = [tickers]

    if verbose:
        print(f"Creating combined dictionary for {len(tickers)} ticker(s)")
        print(f"Date range: {date_start} to {date_end}")
        print("=" * 60)

    # Get OHLCV data as dictionary
    ohlcv_dict = get_ticker_OHLCV(
        df_ohlcv, tickers, date_start, date_end, return_format="dict", verbose=verbose
    )

    # Get features data as dictionary
    features_dict = get_ticker_features(
        features_df,
        tickers,
        date_start,
        date_end,
        return_format="dict",
        verbose=verbose,
    )

    # Create combined_dict
    combined_dict = {}

    for ticker in tickers:
        if verbose:
            print(f"\nProcessing {ticker}...")

        # Check if ticker exists in both dictionaries
        if ticker in ohlcv_dict and ticker in features_dict:
            ohlcv_data = ohlcv_dict[ticker]
            features_data = features_dict[ticker]

            # Check if both dataframes have data
            if not ohlcv_data.empty and not features_data.empty:
                # Combine OHLCV and features data
                # Note: pd.concat(axis=1) aligns rows on the date index (outer join),
                # so a date present in only one frame yields NaN in the other's columns
                combined_df = pd.concat([ohlcv_data, features_data], axis=1)

                # Ensure proper index naming
                combined_df.index.name = "Date"

                # Store in combined_dict
                combined_dict[ticker] = combined_df

                if verbose:
                    print(f"  ✓ Successfully combined data")
                    print(f"  OHLCV shape: {ohlcv_data.shape}")
                    print(f"  Features shape: {features_data.shape}")
                    print(f"  Combined shape: {combined_df.shape}")
                    print(
                        f"  Date range: {combined_df.index.min()} to {combined_df.index.max()}"
                    )
            else:
                if verbose:
                    print(f"  ✗ Cannot combine: One or both dataframes are empty")
                    print(f"    OHLCV empty: {ohlcv_data.empty}")
                    print(f"    Features empty: {features_data.empty}")
                combined_dict[ticker] = pd.DataFrame()
        else:
            if verbose:
                print(f"  ✗ Ticker not found in both dictionaries")
                if ticker not in ohlcv_dict:
                    print(f"    Not in OHLCV data")
                if ticker not in features_dict:
                    print(f"    Not in features data")
            combined_dict[ticker] = pd.DataFrame()

    # Print summary
    if verbose:
        print("\n" + "=" * 60)
        print("SUMMARY")
        print("=" * 60)
        print(f"Total tickers processed: {len(tickers)}")

        tickers_with_data = [
            ticker for ticker, df in combined_dict.items() if not df.empty
        ]
        print(f"Tickers with combined data: {len(tickers_with_data)}")

        if tickers_with_data:
            print("\nTicker details:")
            for ticker in tickers_with_data:
                df = combined_dict[ticker]
                print(f"  {ticker}: {df.shape} - {df.index.min()} to {df.index.max()}")
                print(f"    Columns: {len(df.columns)}")

        empty_tickers = [ticker for ticker, df in combined_dict.items() if df.empty]
        if empty_tickers:
            print(f"\nTickers with no data: {', '.join(empty_tickers)}")

    return combined_dict
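
The retrieval helpers above all rely on the same two pandas moves: a `(tickers, slice(start, end))` lookup on the `(Ticker, Date)` MultiIndex, followed by `xs()` to drop the ticker level before concatenating. Here is a minimal sketch of that pattern on toy data. The tickers `AAA`/`BBB`, the prices, and the `TR` values are made up for illustration and are not from your dataset.

```python
import pandas as pd

# Hypothetical toy data: two tickers, three dates each (illustrative values only).
dates = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"])
index = pd.MultiIndex.from_product([["AAA", "BBB"], dates], names=["Ticker", "Date"])
df_ohlcv = pd.DataFrame({"Adj Close": [10.0, 11.0, 12.0, 20.0, 21.0, 22.0]}, index=index)
features_df = pd.DataFrame({"TR": [float("nan"), 1.0, 1.0, float("nan"), 1.0, 1.0]}, index=index)

# Label-range slicing on a MultiIndex needs a sorted index.
df_ohlcv = df_ohlcv.sort_index()
features_df = features_df.sort_index()

start, end = pd.Timestamp("2024-01-02"), pd.Timestamp("2024-01-03")

# Same (tickers, slice(start, end)) pattern the helpers use.
ohlcv_subset = df_ohlcv.loc[(["AAA"], slice(start, end)), :]
features_subset = features_df.loc[(["AAA"], slice(start, end)), :]

# xs() drops the ticker level, leaving one date-indexed frame per ticker.
ohlcv_aaa = ohlcv_subset.xs("AAA", level=0)
features_aaa = features_subset.xs("AAA", level=0)

# concat(axis=1) lines the two frames up on their shared date index.
combined = pd.concat([ohlcv_aaa, features_aaa], axis=1)
print(combined)
```

If the two frames ever cover different dates, the concatenation still succeeds, but the missing side shows up as `NaN`, so it is worth checking the combined shape against the inputs.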

In [None]:
# ==============================================================================
# SECTION G: UNIT TEST FOR GENERATED FEATURES
# ==============================================================================


def test_true_range_calculation():
    """Test TR = max(High-Low, |High-PrevClose|, |Low-PrevClose|) using robust pandas testing"""
    print("Running test_true_range_calculation (Robust Version)...")

    # 1. SETUP: Create test data
    test_data = {
        "Adj Open": [100, 105, 95, 98, 102],
        "Adj High": [105, 108, 97, 102, 105],
        "Adj Low": [95, 103, 93, 100, 98],
        "Adj Close": [100, 106, 96, 99, 103],
        "Volume": [1000, 1200, 800, 900, 1100],
    }

    index = pd.MultiIndex.from_tuples(
        [
            ("TEST", pd.Timestamp("2024-01-01")),
            ("TEST", pd.Timestamp("2024-01-02")),
            ("TEST", pd.Timestamp("2024-01-03")),
            ("TEST", pd.Timestamp("2024-01-04")),
            ("TEST", pd.Timestamp("2024-01-05")),
        ],
        names=["Ticker", "Date"],
    )

    df_test = pd.DataFrame(test_data, index=index)
    print(f"test_true_range_df input:\n{df_test}\n")

    # 2. EXECUTION: Run function
    result = generate_features(df_test, quality_window=5, quality_min_periods=2)

    # 3. ASSERTION: Define EXACT expected series
    # Day 1: NaN (No prev close)
    # Day 2: 8.0 (PrevClose=100, High=108, Low=103. |108-100|=8)
    # Day 3: 13.0 (PrevClose=106, High=97, Low=93. |93-106|=13)
    # Day 4: 6.0 (PrevClose=96, High=102, Low=100. |102-96|=6)
    # Day 5: 7.0 (PrevClose=99, High=105, Low=98. High-Low=7)
    expected_tr_values = [np.nan, 8.0, 13.0, 6.0, 7.0]

    expected_series = pd.Series(
        expected_tr_values, index=result.index, name="TR", dtype="float64"
    )

    try:
        # Check Series Equality
        # check_exact=False allows for minor floating point differences (rtol=1e-4)
        assert_series_equal(result["TR"], expected_series, check_exact=False, rtol=1e-4)

        print("✅ PD Testing Assertion Passed! All TR values match expected logic.")
        return True

    except AssertionError as e:
        print(f"❌ PD Testing Assertion Failed: {e}")

        # Helper output to see what went wrong
        print("\nDetailed Comparison (Actual vs Expected):")
        comparison = pd.concat(
            [result["TR"], expected_series], axis=1, keys=["Actual_TR", "Expected_TR"]
        )
        comparison["Diff"] = comparison["Actual_TR"] - comparison["Expected_TR"]
        print(comparison)

        return False


def test_atr_calculation():
    """Test ATR = EWMA of TR with alpha=1/period"""
    print("\n" + "=" * 50)
    print("Running test_atr_calculation...")

    # Test data with 5 days
    test_data = {
        "Adj Open": [100, 102, 103, 110, 108],
        "Adj High": [101, 103, 103, 112, 110],
        "Adj Low": [99, 101, 103, 108, 107],
        "Adj Close": [100, 102, 103, 111, 109],
        "Volume": [1000, 1000, 1000, 1000, 1000],  # All non-zero for simplicity
    }

    index = pd.MultiIndex.from_tuples(
        [("TEST", pd.Timestamp(f"2024-01-{i:02d}")) for i in range(1, 6)],
        names=["Ticker", "Date"],
    )

    df_test = pd.DataFrame(test_data, index=index)

    print(f"test_atr_df:\n{df_test}\n")

    result = generate_features(df_test, atr_period=14)

    print("\nATR Calculation Results:")
    print(result[["TR", "ATR", "ATRP"]])

    # Expected values from a manual calculation with period=14:
    # ATR_t = (13/14) * ATR_{t-1} + (1/14) * TR_t, seeded with the first valid TR
    # (TR = 3, 1, 9, 4 for days 2-5)
    expected_atr = {
        "2024-01-02": 3.0,
        "2024-01-03": 40 / 14,  # ≈ 2.857142857142857
        "2024-01-04": 646 / 196,  # ≈ 3.2959183673469388
        "2024-01-05": 9182 / 2744,  # ≈ 3.3462099125364433
    }

    all_passed = True
    for date_str, expected in expected_atr.items():
        actual = result.loc[("TEST", pd.Timestamp(date_str)), "ATR"]
        if abs(actual - expected) < 0.0001:
            print(f"✓ {date_str} ATR: {actual:.6f} ≈ {expected:.6f}")
        else:
            print(f"✗ {date_str} ATR: {actual:.6f} != {expected:.6f}")
            all_passed = False

    if all_passed:
        print("\n✅ All ATR tests passed!")
    else:
        print("\n❌ Some ATR tests failed!")

    return all_passed


def test_is_stale_calculation():
    """Test IsStale = 1 when Volume=0 OR High=Low"""
    print("\n" + "=" * 50)
    print("Running test_is_stale_calculation...")

    test_data = {
        "Adj Open": [100, 102, 103, 104],
        "Adj High": [101, 103, 103, 105],  # Day 3: High=Low
        "Adj Low": [99, 101, 103, 104],
        "Adj Close": [100, 102, 103, 105],
        "Volume": [1000, 0, 500, 1000],  # Day 2: Volume=0
    }

    index = pd.MultiIndex.from_tuples(
        [("TEST", pd.Timestamp(f"2024-01-{i:02d}")) for i in range(1, 5)],
        names=["Ticker", "Date"],
    )

    df_test = pd.DataFrame(test_data, index=index)

    print(f"test_is_stale_df:\n{df_test}\n")

    # Create IsStale manually to verify
    is_stale_manual = np.where(
        (df_test["Volume"] == 0) | (df_test["Adj High"] == df_test["Adj Low"]), 1, 0
    )

    print("\n📊 Manual IsStale Calculation:")
    print("=" * 50)
    print("IsStale = 1 if EITHER condition is true:")
    print("  1. Volume == 0")
    print("  2. Adj High == Adj Low (no price movement)")
    print("Otherwise, IsStale = 0")
    print("=" * 50)

    # Create a temporary DataFrame to display the calculation clearly
    manual_calc_df = df_test.copy()
    manual_calc_df["IsStale_Manual"] = is_stale_manual
    manual_calc_df["Volume==0"] = manual_calc_df["Volume"] == 0
    manual_calc_df["High==Low"] = (
        manual_calc_df["Adj High"] == manual_calc_df["Adj Low"]
    )

    print("\nCalculation details:")
    for idx, row in manual_calc_df.iterrows():
        ticker_date = f"{idx[0]}, {idx[1].strftime('%Y-%m-%d')}"
        conditions = []
        if row["Volume==0"]:
            conditions.append("Volume=0")
        if row["High==Low"]:
            conditions.append("High=Low")

        condition_str = " OR ".join(conditions) if conditions else "None (both False)"
        result = row["IsStale_Manual"]

        print(f"  {ticker_date}:")
        print(
            f"    Volume={row['Volume']}, High={row['Adj High']}, Low={row['Adj Low']}"
        )
        print(f"    Conditions met: {condition_str}")
        print(f"    → IsStale = {result}")
        print()

    expected = [
        0,
        1,
        1,
        0,
    ]  # Day 1: normal, Day 2: vol=0, Day 3: high=low, Day 4: normal

    print(f"\nManual IsStale calculation: {is_stale_manual}")
    print(f"Expected: {expected}")

    if list(is_stale_manual) == expected:
        print("✓ IsStale calculation logic is correct")
        return True
    else:
        print(
            f"✗ IsStale calculation failed. Got {is_stale_manual}, expected {expected}"
        )
        return False


def test_multiple_tickers():
    """Test that calculations don't mix data between tickers"""
    print("\n" + "=" * 50)
    print("Running test_multiple_tickers...")

    test_data = {
        "Adj Open": [100, 102, 50, 51],
        "Adj High": [101, 103, 52, 53],
        "Adj Low": [99, 101, 48, 49],
        "Adj Close": [100, 102, 49, 52],
        "Volume": [1000, 1000, 2000, 2000],
    }

    index = pd.MultiIndex.from_tuples(
        [
            ("A", pd.Timestamp("2024-01-01")),
            ("A", pd.Timestamp("2024-01-02")),
            ("B", pd.Timestamp("2024-01-01")),
            ("B", pd.Timestamp("2024-01-02")),
        ],
        names=["Ticker", "Date"],
    )

    df_test = pd.DataFrame(test_data, index=index)

    print(f"test_multiple_tickers_df:\n{df_test}\n")

    result = generate_features(df_test)

    print("\nMultiple Ticker Results:")
    print(result[["TR", "ATR"]])

    # Ticker A day 2 TR should use A day 1 close, not B day 1 close
    tr_a2 = result.loc[("A", "2024-01-02"), "TR"]
    expected_a2 = 3.0  # max(103-101=2, |103-100|=3, |101-100|=1) = 3

    tr_b2 = result.loc[("B", "2024-01-02"), "TR"]
    expected_b2 = 4.0  # max(53-49=4, |53-49|=4, |49-49|=0) = 4

    tests_passed = 0
    total_tests = 2

    if abs(tr_a2 - expected_a2) < 0.0001:
        print(f"✓ Ticker A TR: {tr_a2} (expected {expected_a2})")
        tests_passed += 1
    else:
        print(f"✗ Ticker A TR: {tr_a2} != {expected_a2}")

    if abs(tr_b2 - expected_b2) < 0.0001:
        print(f"✓ Ticker B TR: {tr_b2} (expected {expected_b2})")
        tests_passed += 1
    else:
        print(f"✗ Ticker B TR: {tr_b2} != {expected_b2}")

    if tests_passed == total_tests:
        print("✅ Ticker separation test passed!")
        return True
    else:
        print(f"❌ Ticker separation test failed: {tests_passed}/{total_tests} passed")
        return False


def test_edge_cases():
    """Test edge cases: very low prices (penny stock) and single-row input."""
    print("\n" + "=" * 50)
    print("Running test_edge_cases...")

    all_passed = True

    # Test 1: Very low price (penny stock)
    print("\n1. Testing penny stock with low price...")
    test_data = {
        "Adj Open": [0.10, 0.11],
        "Adj High": [0.10, 0.11],
        "Adj Low": [0.10, 0.11],
        "Adj Close": [0.10, 0.11],
        "Volume": [1000, 1000],
    }

    index = pd.MultiIndex.from_tuples(
        [
            ("PENNY", pd.Timestamp("2024-01-01")),
            ("PENNY", pd.Timestamp("2024-01-02")),
        ],
        names=["Ticker", "Date"],
    )

    df_penny = pd.DataFrame(test_data, index=index)

    print(f"df_penny_stock:\n{df_penny}\n")

    result = generate_features(df_penny)

    # Check ATRP is reasonable (not inf/nan)
    atrp_val = result.loc[("PENNY", "2024-01-02"), "ATRP"]
    if pd.isna(atrp_val) or np.isinf(atrp_val):
        print(f"✗ Penny stock ATRP is {atrp_val} (should be finite)")
        all_passed = False
    else:
        print(f"✓ Penny stock ATRP is {atrp_val:.4f}")

    # Test 2: Single row
    print("\n2. Testing single row data...")
    test_data_single = {
        "Adj Open": [100],
        "Adj High": [101],
        "Adj Low": [99],
        "Adj Close": [100],
        "Volume": [1000],
    }

    index_single = pd.MultiIndex.from_tuples(
        [("SINGLE", pd.Timestamp("2024-01-01"))], names=["Ticker", "Date"]
    )

    df_single = pd.DataFrame(test_data_single, index=index_single)

    print(f"df_single:\n{df_single}\n")

    result_single = generate_features(
        df_single, quality_window=3, quality_min_periods=2
    )

    # TR should be NaN (no previous close)
    if pd.isna(result_single.loc[("SINGLE", "2024-01-01"), "TR"]):
        print("✓ Single row TR is NaN (correct)")
    else:
        print(
            f"✗ Single row TR should be NaN but got {result_single.loc[('SINGLE', '2024-01-01'), 'TR']}"
        )
        all_passed = False

    # Rolling metrics should be NaN with min_periods=2
    if pd.isna(result_single.loc[("SINGLE", "2024-01-01"), "RollingStalePct"]):
        print("✓ Single row rolling metrics are NaN (correct - insufficient periods)")
    else:
        print(
            f"✗ Rolling metrics should be NaN but got {result_single.loc[('SINGLE', '2024-01-01'), 'RollingStalePct']}"
        )
        all_passed = False

    if all_passed:
        print("\n✅ All edge case tests passed!")
    else:
        print("\n❌ Some edge case tests failed!")

    return all_passed


def test_zero_division_protection():
    """Test that Zero Price doesn't cause Inf values in ATRP"""
    print("\n" + "=" * 50)
    print("Running test_zero_division_protection...")

    test_data = {
        "Adj Open": [10, 10, 10],
        "Adj High": [12, 12, 12],
        "Adj Low": [8, 8, 8],
        "Adj Close": [10, 0, 10],  # Day 2 Price is ZERO
        "Volume": [1000, 1000, 1000],
    }
    index = pd.MultiIndex.from_tuples(
        [("ZERO", pd.Timestamp(f"2024-01-0{i}")) for i in range(1, 4)],
        names=["Ticker", "Date"],
    )
    df_test = pd.DataFrame(test_data, index=index)

    print(f"test_zero_division_protection_df:\n{df_test}\n")

    # Run features
    result = generate_features(df_test, atr_period=2)

    # Check Day 2 ATRP
    atrp_val = result.loc[("ZERO", "2024-01-02"), "ATRP"]

    if pd.isna(atrp_val):
        print("✅ Zero Division Test Passed: ATRP is NaN when Close is 0.")
        return True
    elif np.isinf(atrp_val):
        print(f"❌ Zero Division Test Failed: ATRP is Infinite ({atrp_val}).")
        return False
    else:
        print(f"❌ Zero Division Test Failed: Unexpected value {atrp_val}")
        return False


def test_unsorted_input_handling():
    """Test that function handles unsorted dates correctly via sorting"""
    print("\n" + "=" * 50)
    print("Running test_unsorted_input_handling...")

    # Data is out of order: Day 2, Day 1, Day 3
    index = pd.MultiIndex.from_tuples(
        [
            ("A", pd.Timestamp("2024-01-02")),
            ("A", pd.Timestamp("2024-01-01")),
            ("A", pd.Timestamp("2024-01-03")),
        ],
        names=["Ticker", "Date"],
    )

    # Prices: 100 -> 105 -> 110
    # If processed in order given:
    # 1. 105 (No prev)
    # 2. 100 (Prev is 105) -> Change -5
    # 3. 110 (Prev is 100) -> Change +10

    # If sorted correctly:
    # 1. 100 (No prev)
    # 2. 105 (Prev is 100) -> Change +5
    # 3. 110 (Prev is 105) -> Change +5

    test_data = {
        "Adj Open": [100, 100, 100],
        "Adj High": [105, 100, 110],
        "Adj Low": [105, 100, 110],
        "Adj Close": [105, 100, 110],  # 105, 100, 110
        "Volume": [100, 100, 100],
    }

    df_test = pd.DataFrame(test_data, index=index)

    print(f"test_unsorted_input_handling_df:\n{df_test}\n")

    # Run features
    result = generate_features(df_test)

    # Inspect 2024-01-02 (Should be Day 2 in sorted order)
    # Prev close (Jan 1) was 100. Current High 105. TR should be roughly 5.
    tr_day_2 = result.loc[("A", "2024-01-02"), "TR"]

    # If it wasn't sorted, Day 2 would be treated as the first row (TR=NaN)
    # or compared against whatever came before it in memory.

    if pd.isna(tr_day_2):
        print("❌ Sorting Test Failed: Day 2 TR is NaN (likely treated as first row).")
        return False
    elif abs(tr_day_2 - 5.0) < 0.1:
        print("✅ Sorting Test Passed: Logic applied in correct chronological order.")
        return True
    else:
        print(f"❌ Sorting Test Failed: Day 2 TR is {tr_day_2}, expected ~5.0")
        return False


def test_quality_rolling_features():
    """
    Test RollingStalePct, RollMedDollarVol, and RollingSameVolCount
    verifying logic for Stale(Vol=0), Stale(H=L), SameVolume, and Median calculations.
    """
    print("\n" + "=" * 50)
    print("Running test_quality_rolling_features...")

    # 1. SETUP: Create specific test data
    # We set up 5 days to test a window of 4
    test_data = {
        # Day 1: Normal Base Day. $Vol = 10*100 = 1000.
        # Day 2: Same Volume as D1. $Vol = 10*100 = 1000.
        # Day 3: Stale (Volume=0). $Vol = 20*0 = 0.
        # Day 4: Stale (High=Low). $Vol = 20*50 = 1000.
        # Day 5: Normal High Vol. $Vol = 30*200 = 6000.
        "Adj Open": [10, 10, 20, 20, 30],
        "Adj High": [12, 12, 22, 20, 35],  # Day 4 High=20
        "Adj Low": [8, 8, 18, 20, 25],  # Day 4 Low=20 (H=L)
        "Adj Close": [10, 10, 20, 20, 30],
        "Volume": [100, 100, 0, 50, 200],  # Day 2 same as D1, Day 3 is 0
    }

    index = pd.MultiIndex.from_tuples(
        [("TEST", pd.Timestamp(f"2024-01-0{i}")) for i in range(1, 6)],
        names=["Ticker", "Date"],
    )

    df_test = pd.DataFrame(test_data, index=index)
    print(f"Input Data:\n{df_test}")

    # 2. EXECUTION: Use window=4, min_periods=2 to capture partial rolling
    # We expect Day 1 to be NaN (count=1 < min_periods=2)
    result = generate_features(df_test, quality_window=4, quality_min_periods=2)

    # 3. VERIFICATION

    # --- A. Test RollingStalePct ---
    print("\n--- Testing RollingStalePct ---")
    # Logic:
    # Day 1: IsStale=0. Result=NaN (min_periods)
    # Day 2: IsStale=0. Window=[0,0]. Mean=0.0
    # Day 3: IsStale=1 (Vol=0). Window=[0,0,1]. Mean=1/3 (~0.333)
    # Day 4: IsStale=1 (High=Low). Window=[0,0,1,1]. Mean=2/4 = 0.5
    # Day 5: IsStale=0. Window=[0,1,1,0] (Day 1 drops off). Mean=2/4 = 0.5

    expected_stale = pd.Series(
        [np.nan, 0.0, 1 / 3, 0.5, 0.5], index=result.index, name="RollingStalePct"
    )

    try:
        assert_series_equal(
            result["RollingStalePct"], expected_stale, check_exact=False, rtol=1e-4
        )
        print("✅ RollingStalePct Passed")
    except AssertionError as e:
        print(f"❌ RollingStalePct Failed: {e}")
        return False

    # --- B. Test RollMedDollarVol ---
    print("\n--- Testing RollMedDollarVol ---")
    # Logic: $Vol = Close * Volume
    # D1: 1000. Result=NaN
    # D2: 1000. Window=[1000, 1000]. Median=1000
    # D3: 0.    Window=[1000, 1000, 0]. Sorted=[0, 1000, 1000]. Median=1000
    # D4: 1000. Window=[1000, 1000, 0, 1000]. Sorted=[0, 1000, 1000, 1000]. Median=(1000+1000)/2 = 1000
    # D5: 6000. Window=[1000, 0, 1000, 6000]. Sorted=[0, 1000, 1000, 6000]. Median=1000

    expected_dollar = pd.Series(
        [np.nan, 1000.0, 1000.0, 1000.0, 1000.0],
        index=result.index,
        name="RollMedDollarVol",
    )

    try:
        assert_series_equal(
            result["RollMedDollarVol"], expected_dollar, check_exact=False, rtol=1e-4
        )
        print("✅ RollMedDollarVol Passed")
    except AssertionError as e:
        print(f"❌ RollMedDollarVol Failed: {e}")
        return False

    # --- C. Test RollingSameVolCount ---
    print("\n--- Testing RollingSameVolCount ---")
    # Logic: HasSameVolume = (Volume == PrevVolume)
    # D1: NaN/0 (First row diff is NaN, astype(int) -> 0). Result=NaN (min_periods)
    # D2: 1 (100==100). Window=[0, 1]. Sum=1
    # D3: 0 (0!=100).   Window=[0, 1, 0]. Sum=1
    # D4: 0 (50!=0).    Window=[0, 1, 0, 0]. Sum=1
    # D5: 0 (200!=50).  Window=[1, 0, 0, 0] (D1 drops). Sum=1

    expected_same_vol = pd.Series(
        [np.nan, 1.0, 1.0, 1.0, 1.0], index=result.index, name="RollingSameVolCount"
    )

    try:
        assert_series_equal(
            result["RollingSameVolCount"],
            expected_same_vol,
            check_exact=False,
            rtol=1e-4,
        )
        print("✅ RollingSameVolCount Passed")
    except AssertionError as e:
        print(f"❌ RollingSameVolCount Failed: {e}")
        return False

    print("\n🎉 All Rolling Quality Feature Tests Passed!")
    return True


def test_math_metrics():
    """Verifies calculate_gain and calculate_sharpe are mathematically precise."""
    print("Running test_math_metrics...")
    all_passed = True

    # --- 1. Test calculate_gain ---
    s_gain = pd.Series([100, 105, 110])
    res_gain = calculate_gain(s_gain)
    if abs(res_gain - 0.10) < 1e-9:
        print("✅ Gain Calc (Positive): Passed")
    else:
        print(f"❌ Gain Calc (Positive): Failed. Got {res_gain}")
        all_passed = False

    s_loss = pd.Series([100, 95, 90])
    res_loss = calculate_gain(s_loss)
    if abs(res_loss - (-0.10)) < 1e-9:
        print("✅ Gain Calc (Negative): Passed")
    else:
        print(f"❌ Gain Calc (Negative): Failed. Got {res_loss}")
        all_passed = False

    # --- 2. Test calculate_sharpe ---
    s_flat = pd.Series([0.01, 0.01, 0.01])
    res_sharpe_flat = calculate_sharpe(s_flat)
    if res_sharpe_flat == 0.0:
        print("✅ Sharpe (Zero Volatility): Passed (Returns 0.0)")
    else:
        print(f"❌ Sharpe (Zero Volatility): Failed. Got {res_sharpe_flat}")
        all_passed = False

    return all_passed


def test_engine_lag_logic():
    """
    CRITICAL TEST: Verifies the '1-Day Lag' logic using NEW Variable Names.
    """
    print("\nRunning test_engine_lag_logic...")

    # 1. Setup Mock Data
    dates = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"])
    # Prices:
    # Jan 1: 100
    # Jan 2: 100 (Decision T0)
    # Jan 3: 110 (Entry T1 - We buy HERE)
    # Jan 4: 121 (Exit - We sell HERE)
    prices = [100.0, 100.0, 110.0, 121.0]

    df_ohlcv = pd.DataFrame(
        {
            "Adj Close": prices,
            "Adj High": prices,
            "Adj Low": prices,
            "Adj Open": prices,
            "Volume": [1000] * 4,
        },
        index=pd.MultiIndex.from_product([["MOCK"], dates], names=["Ticker", "Date"]),
    )

    # 2. Initialize Engine
    engine = AlphaEngine(df_ohlcv)

    # 3. Run Strategy
    # Decision Date = 2024-01-02.
    inputs = EngineInput(
        mode="Manual List",
        start_date=pd.Timestamp("2024-01-02"),
        lookback_period=1,
        holding_period=1,  # 1 day, so the trade fits inside the 4-day dataset
        metric="Price",
        benchmark_ticker="MOCK",
        manual_tickers=["MOCK"],
    )

    res = engine.run(inputs)

    # --- SAFETY CHECK ---
    if res.error_msg:
        print(f"❌ TEST FAILED: Engine returned error: {res.error_msg}")
        return False

    # 4. Analyze Results
    metrics = res.perf_metrics

    print(f"  Decision Date: {res.decision_date.date()}")
    print(f"  Buy Date (T+1): {res.buy_date.date()}")

    # Use the NEW metric key
    if "holding_p_gain" in metrics:
        holding_gain = metrics["holding_p_gain"]
        print(f"  Holding Gain: {holding_gain:.4f}")
    else:
        print("❌ 'holding_p_gain' not found in metrics!")
        return False

    # Verification Logic
    # Buy @ Jan 3 Close (110) -> Sell @ Jan 4 Close (121)
    # Math: (121 / 110) - 1 = 1.1 - 1 = 0.10
    expected_gain = 0.10

    if abs(holding_gain - expected_gain) < 1e-4:
        print(f"✅ LAG VERIFIED: Holding Gain is {holding_gain:.2%}.")
        return True
    else:
        print(f"❌ LAG FAILURE: Holding Gain is {holding_gain:.2%}.")
        print(f"   Expected {expected_gain:.2%} (based on 121/110).")
        return False


def run_all_tests():
    """Run all tests"""
    print("=" * 60)
    print("STARTING UNIT TESTS")
    print("=" * 60)

    results = {}

    # Run the math tests
    results["Math Metrics"] = test_math_metrics()

    # Run the engine logic test (Requires updated Engine code)
    try:
        results["Engine Lag Logic"] = test_engine_lag_logic()
    except Exception as e:
        print(f"❌ Engine Logic crashed: {e}")
        results["Engine Lag Logic"] = False

    print("\n" + "=" * 60)
    print("TEST SUMMARY")
    print("=" * 60)

    passed = sum(results.values())
    total = len(results)

    for test_name, result in results.items():
        status = "✅ PASS" if result else "❌ FAIL"
        print(f"{status}: {test_name}")

    print("\n" + "=" * 60)

    return passed == total


run_all_tests()
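
The lag rule exercised by `test_engine_lag_logic` can be sketched without the engine. This is a minimal illustration, not `AlphaEngine`'s own code: `lagged_holding_gain` is a hypothetical helper, and it assumes the buy happens at the close one trading day after the decision date and the sale `holding_period` trading days after the buy.

```python
import pandas as pd


def lagged_holding_gain(close: pd.Series, decision_date, holding_period: int) -> float:
    """Hypothetical helper: gain when entry is one bar after the decision date."""
    decision_pos = close.index.get_loc(pd.Timestamp(decision_date))
    buy_pos = decision_pos + 1  # T+1 entry (the assumed 1-day lag)
    sell_pos = buy_pos + holding_period
    if sell_pos >= len(close):
        raise ValueError("Not enough data after the decision date")
    return float(close.iloc[sell_pos] / close.iloc[buy_pos] - 1.0)


# Same mock prices as the unit test above
dates = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"])
close = pd.Series([100.0, 100.0, 110.0, 121.0], index=dates)

gain = lagged_holding_gain(close, "2024-01-02", holding_period=1)
print(f"Lagged holding gain: {gain:.4f}")
```

With these particular prices an entry without the lag (buy at 100 on Jan 2, sell at 110 on Jan 3) also returns 10%, so a stricter version of the test might use prices where the lagged and unlagged gains differ.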

In [None]:
# ==============================================================================
# SECTION H: UNIT TEST FOR GENERATED FEATURES
# ==============================================================================


def verify_portfolio_construction(df_close_wide, res_container):
    """
    Independent Auditor:
    1. Reads Tickers & Weights from the Engine Output.
    2. Re-pulls raw price data from the source.
    3. Re-calculates the weighted sum (Portfolio Value).
    4. Compares it against the Engine's reported Portfolio Series.
    """
    if not res_container[0]:
        print("⚠️ No results found. Run simulation first.")
        return

    res = res_container[0]

    print(f"--- 🏗️ PORTFOLIO CONSTRUCTION AUDIT ---")
    print(f"Period: {res.start_date.date()} to {res.holding_end_date.date()}")
    print(f"Tickers ({len(res.tickers)}): {', '.join(res.tickers)}")

    # 1. RECONSTRUCT WEIGHTS
    # We verify that weights sum to 1.0
    weights = res.initial_weights
    print(f"\n1. WEIGHT DISTRIBUTION:")
    print(weights.to_string())
    if abs(weights.sum() - 1.0) > 1e-6:
        print(f"⚠️ WARNING: Weights do not sum to 1.0! Sum: {weights.sum()}")
    else:
        print(f"✅ Weights sum to 1.0")

    # 2. RECONSTRUCT RAW DATA
    # We pull the exact slice the engine claimed to use
    raw_slice = df_close_wide[res.tickers].loc[res.start_date : res.holding_end_date]

    # 3. RECONSTRUCT NORMALIZATION (The "Index" Logic)
    # The engine normalizes everyone to start at 1.0 based on the first valid price
    start_prices = raw_slice.bfill().iloc[0]
    normalized_prices = raw_slice / start_prices

    # 4. APPLY WEIGHTS
    # Multiply every column by its weight
    weighted_components = normalized_prices.mul(weights)

    # 5. CALCULATE SHADOW SERIES
    # Sum across the row
    shadow_series = weighted_components.sum(axis=1)

    # 6. COMPARE VS ENGINE
    engine_series = res.portfolio_series

    # Align indexes just in case
    shadow_series, engine_series = shadow_series.align(engine_series, join="inner")

    delta = shadow_series - engine_series
    max_error = delta.abs().max()

    print(f"\n2. SERIES COMPARISON:")
    df_audit = pd.DataFrame(
        {
            "Shadow_Calc": shadow_series,
            "Engine_Output": engine_series,
            "Difference": delta,
        }
    )

    # Show Head (Start of sim)
    print("\n--- Start of Simulation (Head) ---")
    print(df_audit.head())

    # Show T1 (The Entry Point)
    if res.buy_date in df_audit.index:
        print(f"\n--- Entry Date (T+1: {res.buy_date.date()}) ---")
        print(df_audit.loc[[res.buy_date]])

    # Show Tail (End of sim)
    print("\n--- End of Simulation (Tail) ---")
    print(df_audit.tail())

    print("-" * 60)
    if max_error < 1e-9:
        print(
            f"✅ SUCCESS: Portfolio Construction Verified (Max Error: {max_error:.12f})"
        )
    else:
        print(f"❌ FAILURE: Series Mismatch (Max Error: {max_error:.12f})")

    return df_audit


def verify_ticker_ranking_logic(df_close_wide, features_df, inputs, res_container):
    """
    Shadow Ranker:
    1. Re-applies Liquidity Filters on the full universe.
    2. Re-calculates the Strategy Metric for ALL eligible stocks.
    3. Sorts them independently.
    4. Slices the Top N.
    5. Compares the "Shadow List" against the "Engine List".
    """
    if not res_container[0]:
        print("⚠️ No results found. Run simulation first.")
        return

    res = res_container[0]
    decision_date = res.decision_date
    print(f"--- 🕵️ TICKER SELECTION AUDIT ---")
    print(f"Decision Date: {decision_date.date()}")
    print(f"Strategy:      {inputs.metric}")
    print(f"Rank Window:   {inputs.rank_start} to {inputs.rank_end}")

    # --- STEP 1: SHADOW FILTERING (Replicating Universe Selection) ---
    # We manually inspect the feature dataframe at the decision date
    day_feats = features_df.xs(decision_date, level="Date")

    # Re-apply thresholds manually
    thresh = inputs.quality_thresholds

    # Handle dynamic percentile if present
    vol_cutoff = thresh.get("min_median_dollar_volume", 0)
    if "min_liquidity_percentile" in thresh:
        dynamic_val = day_feats["RollMedDollarVol"].quantile(
            thresh["min_liquidity_percentile"]
        )
        vol_cutoff = max(vol_cutoff, dynamic_val)

    # Boolean Mask
    mask = (
        (day_feats["RollMedDollarVol"] >= vol_cutoff)
        & (day_feats["RollingStalePct"] <= thresh["max_stale_pct"])
        & (day_feats["RollingSameVolCount"] <= thresh["max_same_vol_count"])
    )

    shadow_universe = day_feats[mask].index.tolist()

    print(f"\n1. UNIVERSE FILTERING:")
    print(f"   Total Tickers: {len(day_feats)}")
    print(f"   Eligible:      {len(shadow_universe)}")

    # --- STEP 2: SHADOW SCORING (Re-calculating the Metric) ---
    # We need to build the 'ingredients' dictionary for the metric function
    # but ONLY for the eligible universe

    start_date = res.start_date  # Start of lookback

    # Pull Raw Data
    lookback_close = df_close_wide.loc[start_date:decision_date, shadow_universe]

    # Re-assemble Ingredients (Independent of Engine internals)
    # We grab features specifically for the ranking
    feat_slice_period = features_df.loc[(slice(None), lookback_close.index), :].reindex(
        pd.MultiIndex.from_product(
            [shadow_universe, lookback_close.index], names=["Ticker", "Date"]
        )
    )
    atrp_mean = feat_slice_period["ATRP"].groupby(level="Ticker").mean()
    current_feats = day_feats.loc[shadow_universe]

    shadow_ingredients = {
        "lookback_close": lookback_close,
        "lookback_returns": lookback_close.ffill().pct_change(),
        "atrp": atrp_mean,
        "roc_1": current_feats["ROC_1"],
        "roc_3": current_feats["ROC_3"],
        "roc_5": current_feats["ROC_5"],
        "roc_10": current_feats["ROC_10"],
        "roc_21": current_feats["ROC_21"],
    }

    # Calculate Score
    if inputs.metric not in METRIC_REGISTRY:
        print(f"❌ Cannot verify unknown metric: {inputs.metric}")
        return

    shadow_scores = METRIC_REGISTRY[inputs.metric](shadow_ingredients)
    shadow_scores = shadow_scores.sort_values(ascending=False)

    # --- STEP 3: SHADOW RANKING ---
    # Apply the Rank Window (e.g., 1 to 10)
    # Python index is 0-based, so Rank 1 is index 0.
    start_idx = max(0, inputs.rank_start - 1)
    end_idx = inputs.rank_end

    shadow_picks = shadow_scores.iloc[start_idx:end_idx]
    shadow_tickers = shadow_picks.index.tolist()

    # --- STEP 4: COMPARISON ---
    engine_tickers = res.tickers

    print(f"\n2. SELECTION COMPARISON:")

    match = shadow_tickers == engine_tickers

    comparison_df = pd.DataFrame(
        {
            "Rank": range(inputs.rank_start, inputs.rank_start + len(shadow_tickers)),
            "Shadow_Ticker": shadow_tickers,
            "Shadow_Score": shadow_picks.values,
            "Engine_Ticker": (
                engine_tickers + [""] * (len(shadow_tickers) - len(engine_tickers))
                if len(shadow_tickers) > len(engine_tickers)
                else engine_tickers[: len(shadow_tickers)]
            ),
        }
    ).set_index("Rank")

    # Check for mismatches
    comparison_df["MATCH"] = (
        comparison_df["Shadow_Ticker"] == comparison_df["Engine_Ticker"]
    )

    print(comparison_df)
    print("-" * 60)

    if match:
        print("✅ SUCCESS: Ticker Selection Logic Verified (Exact Match).")
    else:
        print("❌ FAILURE: Ticker Mismatch detected.")
        # Debugging hint
        if len(shadow_tickers) != len(engine_tickers):
            print(
                f"   Count Mismatch: Shadow {len(shadow_tickers)} vs Engine {len(engine_tickers)}"
            )
        else:
            print("   Order or Identity Mismatch. Check Strategy Logic.")


def verify_engine_integrity(res_container, debug_container):
    """
    Loops through reported metrics and recalculates them using the
    raw verification series stored in debug_container.
    """
    if not res_container[0] or not debug_container[0]:
        print("⚠️ No results found. Run the simulation first.")
        return

    metrics = res_container[0].perf_metrics
    verification_data = debug_container[0]["verification"]

    print(
        f"{'METRIC':<30} | {'REPORTED':<12} | {'CALCULATED':<12} | {'DIFF':<12} | {'STATUS'}"
    )
    print("-" * 85)

    # 1. Map prefixes to the keys in the verification dictionary
    targets = [("p", "portfolio"), ("b", "benchmark")]

    # 2. Define the periods we explicitly saved in v2.2
    # (Note: v2.2 saved Lookback and Holding. Full is implied but not explicitly saved in slices)
    periods = ["lookback", "holding"]

    for prefix, target_key in targets:
        slices = verification_data[target_key]

        for period in periods:
            # --- A. VERIFY GAIN ---
            metric_name = f"{period}_{prefix}_gain"
            val_key = f"{period}_val"

            if val_key in slices and not slices[val_key].empty:
                s = slices[val_key]
                # Gain Formula: (End / Start) - 1
                calc_gain = (s.iloc[-1] / s.iloc[0]) - 1
                reported_gain = metrics.get(metric_name, 0.0)

                diff = abs(calc_gain - reported_gain)
                status = "✅ PASS" if diff < 1e-8 else "❌ FAIL"
                print(
                    f"{metric_name:<30} | {reported_gain:12.6f} | {calc_gain:12.6f} | {diff:12.9f} | {status}"
                )

            # --- B. VERIFY SHARPE ---
            metric_name = f"{period}_{prefix}_sharpe"
            ret_key = f"{period}_ret"

            if ret_key in slices and not slices[ret_key].empty:
                s = slices[ret_key]
                # Sharpe Formula: (Mean / Std) * sqrt(252)
                # Note: Pandas std() uses ddof=1 by default, which is correct
                if s.std() > 0:
                    calc_sharpe = (s.mean() / s.std()) * np.sqrt(252)
                else:
                    calc_sharpe = 0.0

                reported_sharpe = metrics.get(metric_name, 0.0)

                diff = abs(calc_sharpe - reported_sharpe)
                status = "✅ PASS" if diff < 1e-8 else "❌ FAIL"
                print(
                    f"{metric_name:<30} | {reported_sharpe:12.6f} | {calc_sharpe:12.6f} | {diff:12.9f} | {status}"
                )

            # --- C. VERIFY SHARPE (ATR) ---
            metric_name = f"{period}_{prefix}_sharpe_atr"
            atrp_key = f"{period}_atrp"

            # Only verify if we saved the ATRP slice (v2.2 saves holding_atrp)
            if ret_key in slices and atrp_key in slices:
                s_ret = slices[ret_key]
                s_atrp = slices[atrp_key]

                if not s_ret.empty and not s_atrp.empty:
                    # Sharpe ATR Formula: Mean(Returns) / Mean(ATRP)
                    mean_atrp = s_atrp.mean()
                    if mean_atrp > 0:
                        calc_sharpe_atr = s_ret.mean() / mean_atrp
                    else:
                        calc_sharpe_atr = 0.0

                    reported_satr = metrics.get(metric_name, 0.0)

                    diff = abs(calc_sharpe_atr - reported_satr)
                    status = "✅ PASS" if diff < 1e-8 else "❌ FAIL"
                    print(
                        f"{metric_name:<30} | {reported_satr:12.6f} | {calc_sharpe_atr:12.6f} | {diff:12.9f} | {status}"
                    )


def run_full_system_audit(df_close_wide, features_df, res_container, deb_container):
    """
    Automated wrapper that executes the 3 Audit Functions on the latest simulation.
    It records only whether each auditor ran without raising an exception. It does
    not parse the printed output, so a logical failure (a '❌' line) still has to
    be spotted by reading the individual module output.
    """
    print("=" * 80)
    print("🚀 STARTING FULL SYSTEM AUDIT")
    print("=" * 80)

    # 0. Pre-Flight Checks
    if not res_container[0] or not deb_container[0]:
        print("❌ ABORT: No simulation results found. Run the visualizer first.")
        return False

    inputs = deb_container[0].get("inputs")
    if not inputs:
        print(
            "❌ ABORT: 'inputs' not found in debug_container. Update plot_walk_forward_analyzer to v2.3+."
        )
        return False

    summary = {}

    # Helper to run a test and capture output to check for failure symbols
    def run_auditor(name, func, *args):
        print(f"\n>>> Running {name}...")
        try:
            # The auditors print their findings to stdout.
            # An exception is a code failure and is reported as a crash below.
            # A printed '❌' is a logical failure, but stdout is not captured here,
            # so it is not detected automatically. Read the printed details instead.
            func(*args)
            return True
        except Exception as e:
            print(f"❌ CRITICAL ERROR in {name}: {e}")
            import traceback

            traceback.print_exc()
            return False

    # --- 1. Verify Engine Integrity (Math Check) ---
    summary["Math Integrity"] = run_auditor(
        "Engine Integrity Check", verify_engine_integrity, res_container, deb_container
    )

    # --- 2. Verify Portfolio Construction (Assembly Check) ---
    summary["Portfolio Construction"] = run_auditor(
        "Portfolio Construction Auditor",
        verify_portfolio_construction,
        df_close_wide,
        res_container,
    )

    # --- 3. Verify Ticker Ranking (Selection Check) ---
    summary["Ranking Logic"] = run_auditor(
        "Ticker Ranking Logic Auditor",
        verify_ticker_ranking_logic,
        df_close_wide,
        features_df,
        inputs,
        res_container,
    )

    # --- SUMMARY REPORT ---
    print("\n" + "=" * 80)
    print(f"{'AUDIT MODULE':<40} | {'STATUS'}")
    print("-" * 80)

    all_passed = True
    for name, executed in summary.items():
        # Note: This checks if the function RAN successfully.
        # You still need to visually check the printed table for '✅' vs '❌'
        status = "✅ EXECUTED" if executed else "❌ CRASHED"
        print(f"{name:<40} | {status}")
        if not executed:
            all_passed = False

    print("-" * 80)
    if all_passed:
        print(
            "ℹ️  NOTE: Check individual module outputs above for specific math failures."
        )
    print("=" * 80)
    return all_passed  # True only if every auditor ran without raising
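
The wrapper above only records whether each auditor raised an exception. One possible way to also flag logical failures is sketched below, under the assumption that auditors signal them by printing a '❌' marker: capture stdout with `contextlib.redirect_stdout`, echo it back, and scan it for the marker. The names `run_and_scan`, `passing_auditor` and `failing_auditor` are illustrative and not part of the notebook.

```python
import contextlib
import io

FAIL_MARK = "\u274c"  # the cross-mark symbol the auditors print on failure


def run_and_scan(func, *args, fail_mark: str = FAIL_MARK) -> bool:
    """Run func, echo what it printed, return False if it raised or printed fail_mark."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            func(*args)
    except Exception as exc:
        print(buffer.getvalue(), end="")  # still show whatever was printed
        print(f"{fail_mark} CRITICAL ERROR in {func.__name__}: {exc}")
        return False
    output = buffer.getvalue()
    print(output, end="")  # keep the audit details visible
    return fail_mark not in output


def passing_auditor():
    print("\u2705 all good")


def failing_auditor():
    print("\u274c mismatch found")


ok_result = run_and_scan(passing_auditor)
bad_result = run_and_scan(failing_auditor)
print(ok_result, bad_result)
```

Scanning printed text is brittle compared with having each auditor return a boolean, so this is best read as a stopgap that avoids rewriting the existing audit functions.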

In [5]:
data_path = (
    r"c:\Users\ping\Files_win10\python\py311\stocks\data\df_OHLCV_stocks_etfs.parquet"
)
df_ohlcv = pd.read_parquet(data_path, engine="pyarrow")
print("df_ohlcv.info():")
df_ohlcv.info()  # info() prints its report itself and returns None
df_ohlcv

df_ohlcv.info():
<class 'pandas.core.frame.DataFrame'>
MultiIndex: 9459773 entries, ('A', Timestamp('1999-11-18 00:00:00')) to ('ZWS', Timestamp('2025-12-19 00:00:00'))
Data columns (total 5 columns):
 #   Column     Dtype  
---  ------     -----  
 0   Adj Open   float64
 1   Adj High   float64
 2   Adj Low    float64
 3   Adj Close  float64
 4   Volume     int64  
dtypes: float64(4), int64(1)
memory usage: 397.6+ MB


                   Adj Open  Adj High  Adj Low  Adj Close    Volume
Ticker Date
A      1999-11-18   27.2452   29.9398  23.9518    26.3470  74716411
       1999-11-19   25.7108   25.7482  23.8396    24.1764  18198349
       1999-11-22   24.7378   26.3470  23.9893    26.3470   7857766
       1999-11-23   25.4488   26.1225  23.9518    23.9518   7138324
       1999-11-24   24.0267   25.1120  23.9518    24.5881   5785609
...                     ...       ...      ...        ...       ...
ZWS    2025-12-15   47.4700   47.5700  47.0100    47.4200   1065100
       2025-12-16   47.6800   47.6800  46.8700    47.1600    862500
       2025-12-17   47.0100   47.6800  46.3500    46.6400   1070400
       2025-12-18   47.1900   47.7100  46.9800    47.3600   1119300


In [6]:
# Calculate features ONCE and store them in a variable
print("Calculating features... this might take a few minutes...")
print("1. Calculating Features...")
my_features = generate_features(
    df_ohlcv=df_ohlcv, atr_period=14, quality_window=252, quality_min_periods=126
)

print("2. Pivoting Price Matrix...")
# This is the line that takes 12 seconds, but now we only run it ONCE.
my_close_matrix = df_ohlcv["Adj Close"].unstack(level=0)

print("✅ Optimization Complete. Ready for UI.")

Calculating features... this might take a few minutes...
1. Calculating Features...
2. Pivoting Price Matrix...
✅ Optimization Complete. Ready for UI.


In [7]:
print(f"my_features:\n{my_features}\n")
my_features.info()

my_features:
                      ATR    ATRP   ROC_1   ROC_3   ROC_5  ROC_10  ROC_21  RollingStalePct  RollMedDollarVol  RollingSameVolCount
Ticker Date                                                                                                                      
A      1999-11-18     NaN     NaN     NaN     NaN     NaN     NaN     NaN              NaN               NaN                  NaN
       1999-11-19  2.5074  0.1037 -0.0824     NaN     NaN     NaN     NaN              NaN               NaN                  NaN
       1999-11-22  2.4967  0.0948  0.0898     NaN     NaN     NaN     NaN              NaN               NaN                  NaN
       1999-11-23  2.4895  0.1039 -0.0909 -0.0909     NaN     NaN     NaN              NaN               NaN                  NaN
       1999-11-24  2.3945  0.0974  0.0266  0.0170     NaN     NaN     NaN              NaN               NaN                  NaN
...                   ...     ...     ...     ...     ...     ...     ...    

In [26]:
results_container, debug_container = plot_walk_forward_analyzer(
    df_ohlcv=df_ohlcv,
    precomputed_features=my_features,  # Fast Features
    precomputed_close=my_close_matrix,  # Fast Prices
    debug=True,  # <-- Activate the new mode!
)

--- ⚙️ Initializing AlphaEngine v2.2 (Sanitized) ---


[Interactive output omitted: an ipywidgets VBox of controls, starting with "1. Timeline Configuration: (Past <--- Decision ---> Future)", and a Plotly FigureWidget whose first trace is 'BIL' with x values beginning 2019-07-03.]

In [35]:
import os
import pandas as pd
from dataclasses import asdict, is_dataclass


def export_debug_to_csv(container, source_label="Simulation"):
    """
    Flattens the debug_container and saves all components to high-precision CSVs.
    """
    if not container or not container[0]:
        print("❌ Error: Debug container is empty.")
        return

    data = container[0]
    inputs = data.get("inputs")

    # 1. Create a clean folder name
    # e.g., Audit_Golden_2025-12-11_SharpeATR
    date_str = inputs.start_date.strftime("%Y-%m-%d")
    strategy_str = inputs.metric.replace(" ", "").replace("(", "").replace(")", "")
    folder_name = f"Audit_{source_label}_{date_str}_{strategy_str}"

    # exist_ok=True makes the call safe when the folder already exists
    os.makedirs(folder_name, exist_ok=True)

    print(f"📂 Exporting audit data to: ./{folder_name}/")

    # 2. Recursive function to traverse the dictionary and save files
    def process_item(item, path_prefix=""):
        # Handle Nested Dictionaries (like 'verification' or 'raw_components')
        if isinstance(item, dict):
            for key, value in item.items():
                new_prefix = f"{path_prefix}{key}_" if path_prefix else f"{key}_"
                process_item(value, new_prefix)

        # Handle DataFrames
        elif isinstance(item, pd.DataFrame):
            filename = f"{path_prefix.strip('_')}.csv"
            item.to_csv(os.path.join(folder_name, filename), float_format="%.8f")
            print(f"   ✅ Saved DataFrame: {filename}")

        # Handle Series (Convert to DataFrame for CSV preservation of Index)
        elif isinstance(item, pd.Series):
            filename = f"{path_prefix.strip('_')}.csv"
            item.to_frame().to_csv(
                os.path.join(folder_name, filename), float_format="%.8f"
            )
            print(f"   ✅ Saved Series:    {filename}")

        # Handle Dataclasses (like 'inputs')
        elif is_dataclass(item):
            filename = f"{path_prefix}Metadata.csv"
            # Convert to a vertical 2-column table for easy Excel reading
            meta_df = pd.DataFrame.from_dict(
                asdict(item), orient="index", columns=["Value"]
            )
            meta_df.to_csv(os.path.join(folder_name, filename))
            print(f"   ✅ Saved Metadata:  {filename}")

    # 3. Start the extraction
    process_item(data)
    print(f"\n✨ Export Complete. All numbers saved with 8 decimal places.")
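
The recursive traversal above names each file by joining the dictionary keys on the path that leads to it. A stripped-down sketch of just that naming rule follows, assuming a plain nested dict and using a hypothetical `flatten_keys` helper. No files are written, and the sample values are made up.

```python
def flatten_keys(item, prefix=""):
    """Yield (flat_name, leaf) pairs, joining nested dict keys with underscores."""
    if isinstance(item, dict):
        for key, value in item.items():
            # Extend the prefix exactly as the exporter does: "<prefix><key>_"
            yield from flatten_keys(value, f"{prefix}{key}_")
    else:
        # Leaves get the accumulated prefix without the trailing underscore
        yield prefix.strip("_"), item


nested = {
    "verification": {
        "portfolio": {"full_val": [1.0, 1.1], "full_ret": [0.0, 0.1]},
        "benchmark": {"full_val": [1.0, 1.05]},
    },
    "full_universe_ranking": ["AAA", "BBB"],
}

names = [f"{name}.csv" for name, _ in flatten_keys(nested)]
for name in names:
    print(name)
```

This mirrors how a key path such as `verification` → `portfolio` → `full_val` becomes `verification_portfolio_full_val.csv` in the export log.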

In [36]:
export_debug_to_csv(debug_container, source_label="bot_v46")

📂 Exporting audit data to: ./Audit_bot_v46_2025-12-11_SharpeATR/
   ✅ Saved Metadata:  inputs_Metadata.csv
   ✅ Saved DataFrame: audit_liquidity_universe_snapshot.csv
   ✅ Saved DataFrame: full_universe_ranking.csv
   ✅ Saved Series:    verification_portfolio_full_val.csv
   ✅ Saved Series:    verification_portfolio_full_ret.csv
   ✅ Saved Series:    verification_portfolio_full_atrp.csv
   ✅ Saved Series:    verification_portfolio_lookback_val.csv
   ✅ Saved Series:    verification_portfolio_lookback_ret.csv
   ✅ Saved Series:    verification_portfolio_lookback_atrp.csv
   ✅ Saved Series:    verification_portfolio_holding_val.csv
   ✅ Saved Series:    verification_portfolio_holding_ret.csv
   ✅ Saved Series:    verification_portfolio_holding_atrp.csv
   ✅ Saved Series:    verification_benchmark_full_val.csv
   ✅ Saved Series:    verification_benchmark_full_ret.csv
   ✅ Saved Series:    verification_benchmark_full_atrp.csv
   ✅ Saved Series:    verific

In [33]:
pd.set_option("display.precision", 8)  # Sets display precision to 8
pd.options.display.float_format = (
    "{:.8f}".format
)  # Forces 8 decimals (prevents scientific notation)

_rows = debug_container[0]["full_universe_ranking"]
pd.DataFrame(_rows).to_csv("bot_v46_full_universe_ranking.csv", index=True)

_rows = debug_container[0]["audit_liquidity"]["universe_snapshot"]
pd.DataFrame(_rows).to_csv("bot_v46_universe_snapshot.csv", index=True)

In [34]:
print_nested(debug_container[0])

inputs
    EngineInput(mode='Ranking', start_date=Timestamp('2025-12-11 00:00:00'), lookback_period=10, holding_period=5, metric='Sharpe (ATR)', benchmark_ticker='SPY', rank_start=1, rank_end=2, quality_thresholds={'min_median_dollar_volume': 1000000, 'min_liquidity_percentile': 0.4, 'max_stale_pct': 0.05, 'max_same_vol_count': 10}, manual_tickers=[], debug=True)
audit_liquidity  [NEST]
    date
        2025-12-11 00:00:00
    total_tickers_available
        1589
    percentile_setting
        0.4
    final_cutoff_usd
        92234060.67830002
    tickers_passed
        943
    universe_snapshot
                       ATR       ATRP       ROC_1       ROC_3       ROC_5     ROC_10      ROC_21  RollingStalePct     RollMedDollarVol  RollingSameVolCount  Calculated_Cutoff  Passed_Vol_Check  Passed_Final
Ticker                                                                                                                                                                                        
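
The `audit_liquidity` block above reports a cutoff that combines a fixed dollar-volume floor with a percentile of the day's universe. Following the shadow filter in `verify_ticker_ranking_logic`, that rule can be sketched as below. The four tickers and their volumes are invented for illustration, and the engine's own implementation may differ in detail.

```python
import pandas as pd

# Threshold names follow the quality_thresholds dict shown above
thresholds = {"min_median_dollar_volume": 1_000_000, "min_liquidity_percentile": 0.4}

# Made-up rolling median dollar volumes for a tiny universe
roll_med_dollar_vol = pd.Series({"AAA": 5e5, "BBB": 2e6, "CCC": 8e6, "DDD": 5e7})

# Start from the static floor, then raise it to the percentile value if that is higher
cutoff = thresholds.get("min_median_dollar_volume", 0)
if "min_liquidity_percentile" in thresholds:
    dynamic = roll_med_dollar_vol.quantile(thresholds["min_liquidity_percentile"])
    cutoff = max(cutoff, dynamic)

eligible = roll_med_dollar_vol[roll_med_dollar_vol >= cutoff].index.tolist()
print(f"cutoff={cutoff:,.0f} eligible={eligible}")
```

Because the larger of the two values wins, the percentile only matters on days when it exceeds the static floor, which is what the reported `final_cutoff_usd` suggests for the run above.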

In [32]:
my_tickers = ["SHV", "CFLT"]
date_start = "2025-11-26"
date_end = "2025-12-19"

# Create combined dictionary
combined_dict = create_combined_dict(
    df_ohlcv=df_ohlcv,
    features_df=my_features,
    tickers=my_tickers,
    date_start=date_start,
    date_end=date_end,
    verbose=False,
)

print_nested(combined_dict)

SHV
                Adj Open  Adj High  Adj Low  Adj Close   Volume     ATR    ATRP       ROC_1   ROC_3   ROC_5  ROC_10  ROC_21  RollingStalePct  RollMedDollarVol  RollingSameVolCount
Date                                                                                                                                                                           
2025-11-26   109.712   109.722  109.712    109.712  1910952  0.0192  0.0002  1.8233e-04  0.0004  0.0008  0.0015  0.0029              0.0        3.2964e+08                  0.0
2025-11-28   109.752   109.752  109.742    109.742  2520560  0.0207  0.0002  2.7344e-04  0.0005  0.0010  0.0017  0.0033              0.0        3.2700e+08                  0.0
2025-12-01   109.764   109.764  109.754    109.764  3517277  0.0208  0.0002  2.0047e-04  0.0007  0.0008  0.0017  0.0034              0.0        3.2700e+08                  0.0
2025-12-02   109.774   109.784  109.774    109.784  1894327  0.0207  0.0002  1.8221e-04  0.0007  0.0009  0.0017 

In [None]:
# ==============================================================================
# EXECUTION
# ==============================================================================
# Run this after clicking "Run Simulation" in the UI
# Assumes my_close_matrix and my_features were computed in the cells above
run_full_system_audit(
    df_close_wide=my_close_matrix,
    features_df=my_features,
    res_container=results_container,
    deb_container=debug_container,
)
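
The construction audit invoked here rebuilds the portfolio curve as a weighted sum of prices normalized to their first value, and the integrity audit recomputes gain and Sharpe from the resulting series. A minimal sketch of both steps on made-up data is shown below. Tickers `AAA` and `BBB` are placeholders, equal weights are an assumption rather than the engine's weighting rule, and deriving returns with `pct_change` is likewise assumed.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(["2025-01-02", "2025-01-03", "2025-01-06"])
prices = pd.DataFrame(
    {"AAA": [10.0, 11.0, 12.0], "BBB": [50.0, 50.0, 45.0]}, index=dates
)
weights = pd.Series({"AAA": 0.5, "BBB": 0.5})

# Normalize every ticker to 1.0 at its first valid price, then apply weights
normalized = prices / prices.bfill().iloc[0]
shadow = normalized.mul(weights).sum(axis=1)

# Gain: (End / Start) - 1
gain = shadow.iloc[-1] / shadow.iloc[0] - 1

# Sharpe: (Mean / Std) * sqrt(252), with a guard for zero volatility
returns = shadow.pct_change().dropna()
std = returns.std()  # pandas uses ddof=1 by default
sharpe = (returns.mean() / std) * np.sqrt(252) if std > 0 else 0.0

print(shadow.round(6).tolist())
print(f"gain={gain:.4f}")
```

Comparing `shadow` against the engine's `portfolio_series` with a tight tolerance, as `verify_portfolio_construction` does, is what turns this recomputation into an audit.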

In [None]:
my_tickers = ["COHR", "APP"]
date_start = "2025-07-30"
date_end = "2025-08-21"

# Create combined dictionary
combined_dict = create_combined_dict(
    df_ohlcv=df_ohlcv,
    features_df=my_features,
    tickers=my_tickers,
    date_start=date_start,
    date_end=date_end,
    verbose=False,
)

print_nested(combined_dict)

In [None]:
_rows = debug_container[0]["full_universe_ranking"]
pd.DataFrame(_rows).to_csv("GOLDEN_bot_v45_full_universe_ranking.csv", index=True)

_rows = debug_container[0]["audit_liquidity"]["universe_snapshot"]
pd.DataFrame(_rows).to_csv("GOLDEN_bot_v45_universe_snapshot.csv", index=True)

================================  
================================  
================================  