### 1. Price Gain
**The "Vertical Velocity" (Pure Growth)**

*   **Formula:**
    The raw percentage change in price over the lookback window ($N$):
    $$\text{Price Gain} = \frac{\text{Price}_{t}}{\text{Price}_{t-N}} - 1$$

*   **Interpretation:**
    *   **High Value:** The stock has delivered strong raw capital appreciation over the window. The metric ignores "how" the price got there (volatility) and focuses only on the "where."
    *   **Low Value:** The stock is stagnant or in a downtrend.

*   **How the RL Agent Uses It:**
    *   **"Trend Chaser":** In strong bull markets, the Agent may pivot to raw Gain to capture the "hottest" runners where volatility is secondary to momentum.
    *   **Baseline:** It serves as the simplest performance benchmark against which risk-adjusted metrics are compared.
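
As a rough illustration of the formula above, the sketch below computes the raw gain from a plain list of closing prices. The function name `price_gain` and the sample prices are hypothetical, chosen only for this example.

```python
def price_gain(prices, n):
    """Raw percentage change over the last n periods: P_t / P_{t-n} - 1."""
    if n <= 0 or len(prices) <= n:
        raise ValueError("need n > 0 and more than n prices")
    return prices[-1] / prices[-1 - n] - 1.0


# Hypothetical closes; the 4-period lookback compares 110.0 against 100.0.
closes = [100.0, 102.0, 101.0, 105.0, 110.0]
gain = price_gain(closes, n=4)
print(f"{gain:.2%}")  # 10.00%
```

Note that the path between the two endpoints has no effect on the result, which is why this metric says nothing about volatility.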

---

### 2. Sharpe (Standard)
**The "Institutional Standard" (Volatility-Adjusted Return)**

*   **Formula:**
    The ratio of mean daily returns to the standard deviation of those returns (annualized):
    $$\text{Sharpe} = \frac{\text{Mean}(\text{Daily Returns}, N)}{\text{Std}(\text{Daily Returns}, N)} \times \sqrt{252}$$

*   **Interpretation:**
    *   **High Value:** The stock provides a "smooth" ride. It earns its returns with low variance, suggesting a stable, predictable uptrend.
    *   **Low Value:** The returns are "noisy." Even if the gain is high, the high standard deviation suggests a chaotic price path that is prone to sharp reversals.

*   **How the RL Agent Uses It:**
    *   **"Portfolio Stabilizer":** The Agent selects high Sharpe stocks when the Macro VIX signals suggest a regime of rising uncertainty, favoring stability over raw speed.
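
A minimal sketch of the annualized ratio, assuming simple daily returns, a sample standard deviation (`ddof=1`), and no risk-free rate, since the formula above does not subtract one. The name `sharpe` and both return series are illustrative only.

```python
import math
import statistics


def sharpe(returns, n, periods_per_year=252):
    """Mean over std of the last n daily returns, scaled by sqrt(252)."""
    window = returns[-n:]
    if len(window) < 2:
        raise ValueError("need at least two returns")
    sd = statistics.stdev(window)  # sample std; an assumption of this sketch
    if sd == 0:
        return float("nan")  # ratio is undefined when returns never vary
    return statistics.mean(window) / sd * math.sqrt(periods_per_year)


# Two hypothetical stocks with the same mean daily return (1.04%).
smooth = [0.010, 0.012, 0.009, 0.011, 0.010]
choppy = [0.060, -0.040, 0.050, -0.030, 0.012]
print(sharpe(smooth, 5) > sharpe(choppy, 5))  # True
```

Both series earn the same average return, so the gap in the score comes entirely from the denominator, which is the "smooth ride" effect described above.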

---

### 3. Sharpe (ATRP)
**The "Regime Efficiency" Ratio (Volatility-Normalized Performance)**

*   **Formula:**
    Unlike the standard Sharpe (which uses realized standard deviation), this uses the **ATRP** (Average True Range Percent) to normalize returns by the stock's current volatility "personality":
    $$\text{ATRP} = \frac{\text{ATR}(14)}{\text{Price}}$$
    $$\text{Sharpe(ATRP)} = \frac{\text{Mean}(\text{Daily Returns}, N)}{\text{Mean}(\text{ATRP}, N)}$$

*   **Interpretation:**
    *   **High Value:** The stock is outperforming its "typical" daily volatility. It is effectively "quietly outperforming."
    *   **Low Value:** The stock is moving, but its average return is small compared to its typical daily range, so the gains do not compensate for the volatility taken on.

*   **How the RL Agent Uses It:**
    *   **"Smart Beta" Selection:** The Agent uses this to find stocks that are "punching above their weight class" without the wild swings associated with high-beta names.
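
The sketch below shows one way the two formulas could fit together. It assumes the true-range values are already computed, that ATR uses Wilder's recursive smoothing seeded with a simple mean, and that "Price" means the daily close; none of these choices is confirmed by the text above. The example uses a 3-day period instead of 14 only to keep the data short.

```python
import statistics


def wilder_atr(true_ranges, period=14):
    """ATR series: seed with a simple mean, then Wilder's recursive smoothing."""
    if len(true_ranges) < period:
        raise ValueError("need at least `period` true-range values")
    atr = [statistics.mean(true_ranges[:period])]
    for tr in true_ranges[period:]:
        atr.append((atr[-1] * (period - 1) + tr) / period)
    return atr  # aligned with true_ranges[period - 1:]


def sharpe_atrp(returns, true_ranges, closes, n, period=14):
    """Mean daily return over mean ATRP, both over the last n days."""
    atr = wilder_atr(true_ranges, period)
    atrp = [a / c for a, c in zip(atr, closes[period - 1:])]
    return statistics.mean(returns[-n:]) / statistics.mean(atrp[-n:])


# Hypothetical data: a steady climb with a constant 2-point daily range.
closes = [100.0, 101.0, 102.0, 103.0, 104.0]
true_ranges = [2.0] * 5
returns = [c1 / c0 - 1.0 for c0, c1 in zip(closes, closes[1:])]
ratio = sharpe_atrp(returns, true_ranges, closes, n=3, period=3)
print(round(ratio, 3))
```

Unlike the standard Sharpe, this ratio is not annualized in the formula above, so its scale is not directly comparable to it.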

---

### 4. Sharpe (TRP)
**The "Daily Stress Test" Ratio (Event-Sensitive Performance)**

*   **Formula:**
    Unlike ATRP, which uses a smoothed 14-day average, TRP uses the **raw daily True Range Percent**, i.e. the single-day volatility "shock" without any smoothing:
    $$\text{True Range} = \max\left(\text{High} - \text{Low}, \ |\text{High} - \text{Prev Close}|, \ |\text{Low} - \text{Prev Close}|\right)$$
    $$\text{TRP} = \frac{\text{True Range}}{\text{Price}}$$
    $$\text{Sharpe(TRP)} = \frac{\text{Mean}(\text{Daily Returns}, N)}{\text{Mean}(\text{TRP}, N)}$$

*   **Interpretation:**
    *   **High Value:** The stock is delivering returns while keeping its raw *single-day* ranges contained across the window. It shows resilience against gap-downs and intraday shocks.
    *   **Low Value:** The stock may have "quiet" average days but suffers from occasional extreme moves (gaps, flash crashes) that the average return doesn't compensate for.

*   **How the RL Agent Uses It:**
    *   **"Crash Avoidance" Screening:** The Agent uses TRP to filter out names that appear stable on average (low ATRP) but have hidden tail risk: large single-day moves that can stop out positions or trigger risk limits. TRP captures the "what's the worst that can happen tomorrow" metric that ATRP smooths away.


### Key Difference: ATRP vs TRP

| Metric | Smoothing | Best For | Captures |
|--------|-----------|----------|----------|
| **ATRP** | 14-day EMA | Trend-following, stable regimes | Sustained volatility "personality" |
| **TRP** | None (daily) | Risk management, event detection | Single-day shocks, gap risk, tail events |

**When to use TRP:** When you care about *overnight risk* and *maximum adverse excursion* more than average behavior, e.g., before earnings, during high macro uncertainty, or when position sizing must account for worst-case single-day loss.
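
To make the gap sensitivity concrete, here is a small sketch of the True Range and TRP formulas on three hypothetical bars, the last of which gaps down. It assumes "Price" is the same day's close; the function names are illustrative.

```python
import statistics


def true_range(high, low, prev_close):
    """Largest of the intraday range and the two gaps from the prior close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))


def trp_series(highs, lows, closes):
    """Daily true range as a fraction of that day's close (first bar skipped)."""
    return [
        true_range(h, l, pc) / c
        for h, l, c, pc in zip(highs[1:], lows[1:], closes[1:], closes)
    ]


def sharpe_trp(returns, trp, n):
    """Mean daily return over mean raw TRP, both over the last n days."""
    return statistics.mean(returns[-n:]) / statistics.mean(trp[-n:])


# Hypothetical bars: the third day gaps down from a 101 close to a 93-95 range.
highs = [101.0, 102.0, 95.0]
lows = [99.0, 100.0, 93.0]
closes = [100.0, 101.0, 94.0]
trp = trp_series(highs, lows, closes)
print([round(x, 4) for x in trp])
```

On the gap day the intraday range is only 2 points, but the true range is 8 because it is measured from the previous close. That jump shows up immediately in TRP, whereas a 14-day smoothed ATRP would absorb it gradually.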

---

### 4. Momentum (21d)
**The "Velocity Vector" (Medium-Term Strength)**

*   **Formula:**
    The 21-trading day (one month) rate of change:
    $$\text{Mom\_21} = \frac{\text{Price}_{t}}{\text{Price}_{t-21}} - 1$$

*   **Interpretation:**
    *   **High Value:** Strong recent capital inflow. The stock is "in play."
    *   **Low Value:** Negative momentum; the stock is under distribution (persistent selling) or is being ignored.

*   **How the RL Agent Uses It:**
    *   **"The Kickoff":** Momentum is often the first signal of a regime shift. The Agent uses this to identify the start of a "breakout" before it shows up in longer-term Sharpe metrics.

---

### 5. Information Ratio (IR)
**The "Alpha Specialist" (Consistency vs. Benchmark)**

*   **Formula:**
    The ratio of **Active Return** (Stock Ret - Market Ret) to the **Tracking Error** (Std Dev of Active Returns):
    $$\text{Active Ret} = R_{\text{stock}} - R_{\text{benchmark}}$$
    $$\text{IR} = \frac{\text{Mean}(\text{Active Ret}, 63)}{\text{Std}(\text{Active Ret}, 63)}$$

*   **Interpretation:**
    *   **High Value:** The stock consistently beats the market with very little "wavering." It is a reliable alpha generator.
    *   **Low Value:** The stock is either underperforming or its outperformance is erratic and unpredictable compared to the S&P 500.

*   **How the RL Agent Uses It:**
    *   **"The Hedge Fund Move":** In sideways markets, the Agent uses IR to find stocks that can decouple from the index and provide idiosyncratic gains.
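
A minimal sketch of the IR calculation, assuming the stock and benchmark return series are aligned day by day and that tracking error is a sample standard deviation. The function name and the four-day sample are hypothetical; the 63-day default follows the formula above.

```python
import statistics


def information_ratio(stock_returns, benchmark_returns, n=63):
    """Mean active return over its standard deviation for the last n days."""
    active = [s - b for s, b in zip(stock_returns, benchmark_returns)][-n:]
    if len(active) < 2:
        raise ValueError("need at least two aligned returns")
    tracking_error = statistics.stdev(active)  # sample std; an assumption
    if tracking_error == 0:
        return float("nan")  # undefined when active return never varies
    return statistics.mean(active) / tracking_error


# Hypothetical daily returns: the stock beats the benchmark by 0.1-0.2% a day.
stock = [0.012, 0.004, -0.001, 0.009]
bench = [0.010, 0.003, -0.003, 0.008]
ir = information_ratio(stock, bench)
print(round(ir, 2))
```

The stock has a down day in this sample, yet the IR stays high because it still lost less than the benchmark; only the active return matters.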

---

### 6. Consistency (Win Rate)
**The "Reliability Metric" (Frequency of Green Days)**

*   **Formula:**
    The percentage of positive-return days over the last 10 trading days:
    $$\text{Consistency} = \frac{\text{Count}(R_{daily} > 0)}{10}$$

*   **Interpretation:**
    *   **Value of 0.8:** The stock has closed "green" 8 out of the last 10 days. 
    *   **High Value:** Indicates a "relentless" bid. This often precedes a parabolic move as shorts are squeezed and buyers FOMO in.

*   **How the RL Agent Uses It:**
    *   **"The Stealth Bid Detector":** Even if daily returns are small, high consistency tells the Agent that a "strong hand" is accumulating the stock daily.
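
The win-rate formula can be sketched as follows. The function name `consistency` and the sample returns are hypothetical; flat days (a return of exactly zero) are counted as non-wins, matching the strict inequality in the formula.

```python
def consistency(returns, window=10):
    """Fraction of strictly positive daily returns over the last `window` days."""
    recent = returns[-window:]
    if len(recent) < window:
        raise ValueError("need at least `window` returns")
    return sum(1 for r in recent if r > 0) / window


# Hypothetical returns: eight small gains and two losses.
daily = [0.002, 0.001, -0.004, 0.003, 0.001, 0.002, -0.001, 0.004, 0.001, 0.002]
print(consistency(daily))  # 0.8
```

The size of each gain is irrelevant here, which is what lets the metric pick up small but persistent accumulation.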

---

### 7. Oversold (RSI)
**The "Elastic Snap" (Mean Reversion)**

*   **Formula:**
    The Relative Strength Index (standard 14-day), sorted in reverse (lower RSI ranks higher):
    $$RS = \frac{\text{Avg Gain}}{\text{Avg Loss}}$$
    $$RSI = 100 - \left(\frac{100}{1 + RS}\right)$$
    $$\text{Rank Score} = -RSI$$

*   **Interpretation:**
    *   **High Rank (Low RSI):** The stock is "Oversold" (e.g., RSI < 30). The price has been beaten down too fast, stretching the "rubber band."
    *   **Low Rank (High RSI):** The stock is "Overbought."

*   **How the RL Agent Uses It:**
    *   **"The Contrarian":** When the Agent detects a "Macro Panic" (High VIX), it may switch to RSI to buy the "blood in the streets," betting on a mean-reversion bounce.
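
Below is a sketch of the RSI and its reversed rank score. It assumes simple averages of gains and losses over the last 14 price changes; Wilder's recursive smoothing is a common alternative and would give somewhat different values. Function names are illustrative.

```python
def rsi(prices, period=14):
    """RSI from simple average gain and loss over the last `period` changes."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    window = prices[-(period + 1):]
    changes = [curr - prev for prev, curr in zip(window, window[1:])]
    avg_gain = sum(max(ch, 0.0) for ch in changes) / period
    avg_loss = sum(max(-ch, 0.0) for ch in changes) / period
    if avg_loss == 0:
        return 100.0  # no losing days in the window; treat RSI as 100
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def oversold_rank_score(prices, period=14):
    """Negated RSI, so that lower RSI sorts higher."""
    return -rsi(prices, period)


# Hypothetical series: 15 straight declines give the most oversold reading.
falling = [float(p) for p in range(30, 14, -1)]
print(rsi(falling), oversold_rank_score(falling))
```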

---

### 8. Dip Buyer (Drawdown)
**The "Value Trap or Bargain" Finder**

*   **Formula:**
    The distance from the 21-day high, sorted in reverse (smaller/more negative drawdown ranks higher):
    $$\text{DD} = \frac{\text{Price}_{t}}{\text{MaxPrice}(21)} - 1$$
    $$\text{Rank Score} = -DD$$

*   **Interpretation:**
    *   **High Rank:** The stock is significantly off its recent highs (a deep "Dip").
    *   **Low Rank:** The stock is trading at or near its 21-day high (no dip).

*   **How the RL Agent Uses It:**
    *   **"Buy the Dip":** The Agent uses this metric during bull market pullbacks. It identifies stocks that are historically strong but are currently "on sale" relative to their recent peak.
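
A short sketch of the drawdown and its rank score. It assumes the 21-day high is taken over closing prices and that the window includes the current day, so the drawdown is never positive. Names and prices are hypothetical.

```python
def drawdown_from_high(prices, window=21):
    """Current price relative to the highest price in the last `window` days."""
    recent = prices[-window:]
    if not recent:
        raise ValueError("need at least one price")
    return recent[-1] / max(recent) - 1.0


def dip_rank_score(prices, window=21):
    """Negated drawdown, so that deeper dips sort higher."""
    return -drawdown_from_high(prices, window)


# Hypothetical closes: a peak at 120 followed by a pullback to 108.
dipped = [100.0, 120.0, 108.0]
at_high = [100.0, 110.0, 120.0]
print(round(drawdown_from_high(dipped), 4), drawdown_from_high(at_high))
```

The score alone cannot tell a bargain from a value trap, because a stock in a sustained decline also sits far below its 21-day high.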

---

### 9. Low Volatility
**The "Safety First" Filter**

*   **Formula:**
    The negative of the ATRP (Average True Range Percent):
    $$\text{Score} = -\left(\frac{ATR(14)}{\text{Price}}\right)$$

*   **Interpretation:**
    *   **High Rank:** The stock has very small daily price swings relative to its price (e.g., a "boring" utility or consumer staple).
    *   **Low Rank:** The stock is a "mover": high volatility, large daily percentage swings.

*   **How the RL Agent Uses It:**
    *   **"Capital Preservation":** During high-risk macro regimes (VIX Backwardation), the Agent uses this to hide in the "quietest" stocks in the universe, minimizing the risk of a "flash crash" hit.
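
Because the score is simply the negated ATRP, its effect is easiest to see in a ranking. The sketch below sorts a few hypothetical tickers (the symbols and ATRP values are made up) so that the quietest name comes first.

```python
def low_vol_scores(atrp_by_ticker):
    """Negate each ATRP so that the quietest stocks receive the highest score."""
    return {ticker: -atrp for ticker, atrp in atrp_by_ticker.items()}


def rank_descending(scores):
    """Tickers ordered from highest score to lowest."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical ATRP readings (ATR(14) / Price) for three made-up tickers.
atrp = {"UTIL": 0.012, "TECH": 0.045, "STPL": 0.015}
ranking = rank_descending(low_vol_scores(atrp))
print(ranking)  # ['UTIL', 'STPL', 'TECH']
```

The same descending sort can serve the other "reversed" metrics (Oversold, Dip Buyer), since each of them already negates its raw value.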

---

### 1. Market Regime (200d MA Deviation)
**The "Trend Guardian"**

*   **Formula:**
    The percentage distance of the benchmark price from its 200-day Simple Moving Average (SMA):
    $$\text{Regime} = \left( \frac{\text{Price}_{\text{Benchmark}}}{\text{SMA}_{200}(\text{Price}_{\text{Benchmark}})} \right) - 1$$

*   **Interpretation:**
    *   **Positive (> 0%):** The market is in a long-term uptrend (Bull Regime). Shaded green in the plot, this represents a "healthy" environment where momentum strategies usually thrive.
    *   **Negative (< 0%):** The market is in a long-term downtrend (Bear Regime). Trading below the 200d MA is a primary indicator of systemic risk and potential "tail-risk" events.

*   **How the RL Agent Uses It:**
    *   **"Environmental Context":** This is the master "On/Off" switch for risk. If this value is negative, the Agent may shift its internal reward function to favor cash or defensive assets, as the statistical probability of "buy-the-dip" success drops significantly when the long-term trend is broken.
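
The regime value can be sketched as below, assuming the SMA window includes the current price. The function names and the synthetic benchmark series are illustrative only.

```python
def sma(values, window):
    """Simple moving average of the last `window` values."""
    if len(values) < window:
        raise ValueError("need at least `window` values")
    return sum(values[-window:]) / window


def market_regime(benchmark_prices, window=200):
    """Fractional distance of the latest price from its moving average."""
    return benchmark_prices[-1] / sma(benchmark_prices, window) - 1.0


# Hypothetical benchmark: flat at 100 for 199 days, then a jump to 110.
prices = [100.0] * 199 + [110.0]
regime = market_regime(prices)
print("bull" if regime > 0 else "bear", round(regime, 4))
```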

---

### 2. Market Momentum (21d Z-Score)
**The "Acceleration Gauge"**

*   **Formula:**
    The 21-day change (velocity) in the Market Regime, standardized by its 3-month (63-day) rolling volatility:
    $$\text{Velocity} = \text{Regime}_{t} - \text{Regime}_{t-21}$$
    $$\text{Z-Score} = \frac{\text{Velocity}}{\sigma_{63}(\text{Regime})}$$

*   **Interpretation:**
    *   **Accel (Red Dash @ +2.0):** The market is "overheating." Price is moving higher much faster than its recent volatility profile suggests, often signaling a "blow-off top."
    *   **Decel (Green Dash @ -2.0):** The trend is rapidly losing steam or rolling over. Even in a bull market, a deep negative Z-score indicates the "brakes" are being slammed on.

*   **How the RL Agent Uses It:**
    *   **"Inflection Point Detection":** The Agent uses this to distinguish between a healthy pullback and a rapid trend reversal. A high "Accel" reading might trigger the Agent to tighten stop-losses, while a move into "Decel" serves as an early warning to exit momentum trades before the 200d MA is even breached.
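
A minimal sketch of the Z-score, taking a precomputed series of Market Regime values as input. It follows the formula literally, dividing the 21-day velocity by the 63-day standard deviation of the regime level, and assumes a sample standard deviation. The `"neutral"` label is a hypothetical name for readings between the two dashed thresholds.

```python
import statistics


def regime_momentum_z(regime, lookback=21, vol_window=63):
    """21-day change in the regime value over its 63-day standard deviation."""
    if len(regime) < max(lookback + 1, vol_window):
        raise ValueError("not enough regime history")
    velocity = regime[-1] - regime[-1 - lookback]
    sigma = statistics.stdev(regime[-vol_window:])
    if sigma == 0:
        return float("nan")  # undefined when the regime value never moved
    return velocity / sigma


def momentum_label(z):
    """Map the Z-score onto the +2.0 / -2.0 thresholds drawn on the plot."""
    if z >= 2.0:
        return "accel"
    if z <= -2.0:
        return "decel"
    return "neutral"


# Hypothetical regime series rising by 0.1 percentage points per day.
regime = [0.001 * i for i in range(63)]
z = regime_momentum_z(regime)
print(round(z, 3), momentum_label(z))
```

A steady climb like this one stays below the +2.0 line, because the denominator grows along with the trend; a reading beyond the threshold needs a move that is fast relative to the last three months.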

---

### 3. Volatility Regime (VIX Z-Score)
**The "Fear Meter"**

*   **Formula:**
    The current VIX index standardized relative to its recent 3-month (63-day) mean and standard deviation:
    $$\text{VIX Z} = \frac{\text{VIX} - \mu_{63}(\text{VIX})}{\sigma_{63}(\text{VIX})}$$

*   **Interpretation:**
    *   **Fear (Red Dash @ +2.0):** Panic state. VIX is spiking far above its recent norm, typically coinciding with sharp price corrections and "forced selling" by institutions.
    *   **Calm (Green Dash @ -1.5):** Complacency. Volatility is unusually suppressed, which often occurs during long "melt-up" phases but can precede a sudden volatility "shock."

*   **How the RL Agent Uses It:**
    *   **"Contrarian Timing & Protection":** While high "Fear" signals danger, it also identifies potential "climax bottoms." The Agent looks for VIX Z-score spikes to identify when the "blood is in the streets" (a buying opportunity). Conversely, prolonged "Calm" may signal the Agent to rotate into "Low Volatility" stocks to protect against a volatility mean-reversion.
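
The VIX Z-score and its two thresholds can be sketched as follows. This assumes the 63-day window includes the current reading and uses a sample standard deviation; the `"neutral"` label is a hypothetical name for values between the two lines.

```python
import statistics


def vix_zscore(vix, window=63):
    """Latest VIX reading standardized against its trailing window."""
    recent = vix[-window:]
    if len(recent) < window:
        raise ValueError("need at least `window` VIX readings")
    sigma = statistics.stdev(recent)
    if sigma == 0:
        return float("nan")  # undefined when VIX never moved in the window
    return (recent[-1] - statistics.mean(recent)) / sigma


def fear_label(z):
    """Map the Z-score onto the +2.0 (fear) and -1.5 (calm) thresholds."""
    if z >= 2.0:
        return "fear"
    if z <= -1.5:
        return "calm"
    return "neutral"


# Hypothetical history: VIX sits at 15 for 62 days, then spikes to 30.
vix = [15.0] * 62 + [30.0]
z = vix_zscore(vix)
print(round(z, 2), fear_label(z))
```

The thresholds are asymmetric (+2.0 versus -1.5), as on the plot, which fits the fact that VIX spikes upward far more sharply than it drifts down.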



---
### 4. Volatility Regime (Term Structure Map)
**The "Systemic Crash" Dashboard**

*   **Logic:**
    This hybrid plot standardizes market fear intensity (The Line) against the underlying "Term Structure" of volatility (The Background). It measures not just *how much* the market is panicking, but the *urgency* of that panic by comparing current expectations against future hedges.

*   **Formulas:**
    1. **Intensity (VIX Z-Score):** Measures how extreme current volatility is relative to its own recent 3-month (63-day) behavior.
       $$\text{VIX Z} = \frac{\text{VIX}_t - \text{SMA}_{63}(\text{VIX})}{\sigma_{63}(\text{VIX})}$$
    2. **Stability (VIX Ratio):** Measures the "Term Structure" by comparing the 30-day VIX against the 90-day VIX3M. This determines if the market is in a healthy "Contango" or a crisis "Backwardation."
       $$\text{VIX Ratio} = \frac{\text{VIX}}{\text{VIX3M}}$$

*   **The 3 Colored Regimes:**
    *   🟢 **Green (STABILITY / Ratio < 0.9):** **Institutional Calm.** Current fear is significantly lower than future expectations (Deep Contango). This indicates a healthy, "Risk-On" environment where institutions are comfortable selling volatility.
    *   ⚪ **Grey (TRANSITION / Ratio 0.9 - 1.0):** **Structural Shift.** The gap between current and future fear is closing. This "flattening" of the term structure often precedes market distribution or a high-volatility regime shift.
    *   🔴 **Red (SYSTEMIC SHOCK / Ratio > 1.0):** **Backwardation.** Current panic exceeds future expectations. This structural inversion indicates an immediate liquidity crisis or crash event; "buying the dip" is statistically dangerous until the ratio resets.

*   **How the RL Agent Uses It:**
    *   **Dynamic Title Monitoring:** The Agent monitors the **Dynamic Status Header** and **VIX Ratio Value** to adjust its feature weights. It favors **Mom_21 (Momentum)** in the Green zone but rotates toward **Low Volatility (ATRP)** or **Cash** as the map turns Grey.
    *   **The "Emergency Brake":** If the background turns **Red (Ratio > 1.0)** and the purple line spikes, the Agent recognizes a systemic event. It overrides individual ticker "Buy" signals to prioritize capital preservation (SHV/BIL) until the structural inversion clears and the background returns to Grey or Green.
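
The three colored regimes reduce to a pair of threshold checks on the VIX Ratio. The sketch below uses the 0.9 and 1.0 boundaries listed above and assumes a ratio of exactly 1.0 still counts as Grey, since Red is defined as strictly above 1.0. The function names and index levels are hypothetical.

```python
def vix_ratio(vix, vix3m):
    """Spot 30-day VIX relative to the 3-month VIX3M."""
    return vix / vix3m


def term_structure_regime(ratio):
    """Classify the ratio into the Green / Grey / Red background zones."""
    if ratio < 0.9:
        return "green"  # deep contango: stability
    if ratio <= 1.0:
        return "grey"   # flattening term structure: transition
    return "red"        # backwardation: systemic shock


# Hypothetical index levels for three different days.
for vix, vix3m in [(14.0, 17.5), (18.0, 19.0), (32.0, 26.0)]:
    ratio = vix_ratio(vix, vix3m)
    print(round(ratio, 3), term_structure_regime(ratio))
```

In this sketch the "Emergency Brake" would correspond to a `"red"` result combined with a spiking VIX Z-score; how large that spike must be is not specified above.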