feat(features): per-ticker risk features (Stage 2a regime-conditioning rebuild) #202

Merged
cipher813 merged 2 commits into main from feat/risk-features-stage-2a on May 10, 2026

@cipher813 (Owner)

Summary

  • Stage 2a of the regime-conditioning rebuild — adds 4 institutional per-ticker risk-decomposition features to features/feature_engineer.py
  • Each feature varies cross-sectionally on a given date (rank-norm pipeline-compatible) and captures a distinct risk dimension that the existing 6 vol features give the GBM no way to split on
  • Cross-repo: alpha-engine-predictor Stage 2b consumes these features via the same parallel-observation pattern Stage 1b/1c uses (parallel prod_vol_risk_aug GBM + expected_move_risk_aug parallel field)

Plan reference

Plan doc: ~/Development/alpha-engine-docs/private/regime-conditioning-260510.md (gitignored). ROADMAP entry in alpha-engine-config #103.

New features

| Feature | What it captures | Compute |
|---|---|---|
| beta_60d | Systematic market exposure | 60d rolling cov(stock, SPY) / var(SPY) on log-returns. NaN if SPY unavailable. |
| idio_vol_60d | Idiosyncratic (non-market) risk | 60d rolling std of (stock_log_return − beta × spy_log_return) × sqrt(252) |
| vol_of_vol_30d | Vol regime stability | 30d rolling std of realized_vol_20d |
| max_drawdown_60d | Recent left-tail risk | Worst peak-to-trough within trailing 60d window. Distinct from dist_from_52w_high which is current depth-from-rolling-252-high |
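
A minimal pandas sketch of the beta_60d and idio_vol_60d rows above, for orientation only. The function name `beta_and_idio_vol`, its signature, and the choice to use the same-day rolling beta when forming the residual are assumptions; the actual code in features/feature_engineer.py may differ.

```python
from typing import Optional, Tuple

import numpy as np
import pandas as pd


def beta_and_idio_vol(
    close: pd.Series,
    spy_close: Optional[pd.Series],
    window: int = 60,
) -> Tuple[pd.Series, pd.Series]:
    """Hypothetical helper: rolling beta vs SPY and annualized residual vol."""
    stock_ret = np.log(close).diff()

    # No SPY series available: both features are NaN for every date.
    if spy_close is None:
        nan = pd.Series(np.nan, index=close.index)
        return nan, nan.copy()

    spy_ret = np.log(spy_close).diff().reindex(stock_ret.index)

    # beta = cov(stock, SPY) / var(SPY), both over the same trailing window.
    beta = stock_ret.rolling(window).cov(spy_ret) / spy_ret.rolling(window).var()

    # Residual return after removing market exposure (assumption: same-day beta).
    residual = stock_ret - beta * spy_ret
    idio_vol = residual.rolling(window).std() * np.sqrt(252)
    return beta, idio_vol
```

Under this sketch, passing SPY as its own stock series gives a beta of 1 and an idiosyncratic vol of roughly 0, which mirrors the beta-of-self and idio_vol-of-self checks in the test plan. Note that idio_vol needs about two windows of history before it is non-NaN, because the residual itself depends on a rolling beta.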

FEATURE_CFG additions: beta_window=60, vol_of_vol_window=30, max_drawdown_window=60.
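
The two window-only features can be sketched the same way. This is an illustration under stated assumptions, not the merged implementation: the helper names are hypothetical, `FEATURE_CFG` here is a local dict holding only the values quoted above, and realized_vol_20d is assumed to be an annualized rolling std of log-returns.

```python
import numpy as np
import pandas as pd

# Local stand-in for the real config; only the values quoted in this PR.
FEATURE_CFG = {"vol_of_vol_window": 30, "max_drawdown_window": 60}


def realized_vol(close: pd.Series, window: int = 20) -> pd.Series:
    """Assumed definition of realized_vol_20d: annualized std of log-returns."""
    return np.log(close).diff().rolling(window).std() * np.sqrt(252)


def vol_of_vol(close: pd.Series, cfg: dict = FEATURE_CFG) -> pd.Series:
    """Rolling std of realized_vol_20d; non-negative by construction."""
    return realized_vol(close, 20).rolling(cfg["vol_of_vol_window"]).std()


def _worst_drawdown(prices: np.ndarray) -> float:
    """Deepest peak-to-trough move inside one window (0.0 if never below peak)."""
    running_peak = np.maximum.accumulate(prices)
    return float((prices / running_peak - 1.0).min())


def max_drawdown(close: pd.Series, cfg: dict = FEATURE_CFG) -> pd.Series:
    """Worst drawdown within each trailing window; always <= 0."""
    window = cfg["max_drawdown_window"]
    return close.rolling(window).apply(_worst_drawdown, raw=True)
```

The running peak is recomputed inside each window, so a drawdown that has since fully recovered still registers for as long as its trough stays within the trailing 60 days. That is the distinction from dist_from_52w_high drawn in the table.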

Stage sequencing

Operator follow-up (post-merge)

The new feature columns will appear in compute_features() output starting with the next Saturday SF firing of feature_engineer. Historical rows in the ArcticDB universe library need a one-shot backfill before Stage 2b training can consume them. Backfill command (to be confirmed against the existing universe library backfill pattern):

# Per the existing universe-library backfill discipline; exact command TBD
# alongside Stage 2b sequencing
python -m features.compute --backfill --columns beta_60d,idio_vol_60d,vol_of_vol_30d,max_drawdown_60d

The Stage 2b PR description will include the canonical backfill invocation once the predictor side is scoped.

Test plan

  • 16 new tests covering: schema contract (FEATURES + compute_features output), beta-of-self=1, beta-of-independent-series≈0, idio_vol-of-self=0, monotone-increasing→drawdown=0, post-decline→drawdown<-0.15, NaN-when-no-SPY for beta+idio_vol, vol_of_vol nonnegative
  • Full data suite: 644 passed + 1 skipped (16 new + 628 unchanged)

🤖 Generated with Claude Code

cipher813 and others added 2 commits May 10, 2026 08:29
…oning rebuild)

Stage 2a of the regime-conditioning rebuild (plan doc:
alpha-engine-docs/private/regime-conditioning-260510.md). Adds 4
institutional risk-decomposition features to features/feature_engineer.py
that capture per-ticker risk dimensions distinct from the existing
volatility features. Each varies cross-sectionally on a given date
(rank-norm pipeline-compatible) and gives the volatility GBM new splits
the existing 6 vol features cannot make.

Per-feature definitions:
- beta_60d:        60d rolling regression slope of stock log-returns vs
                   SPY log-returns. Systematic market exposure. NaN when
                   spy_series is unavailable.
- idio_vol_60d:    Residual vol after removing market-beta exposure.
                   residual = stock_log_return - beta * spy_log_return;
                   60d rolling std × sqrt(252). Idiosyncratic risk.
- vol_of_vol_30d:  30d rolling stdev of realized_vol_20d. Stability of
                   vol regime — high values signal vol-regime instability.
- max_drawdown_60d: Worst peak-to-trough drawdown within trailing 60d
                   window. Distinct from dist_from_52w_high (current
                   depth from rolling-252-high): captures the deepest
                   drawdown that occurred during the recent 60d, even if
                   the stock has since recovered. Always non-positive.

These four features are consumed by Stage 2b (alpha-engine-predictor)
which trains a parallel macro+risk-augmented volatility GBM
(prod_vol_risk_aug) alongside the plain vol GBM and the
vol_macro_aug variant from Stage 1b. Each variant is independently
gated by the variant_cutover_gate (≥15% relative IC lift over plain
baseline) before any cutover.
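
As a rough illustration of the gate described above, the check below computes relative IC lift against the plain baseline. The function name, signature, and the handling of a non-positive baseline IC are assumptions; the real variant_cutover_gate lives in alpha-engine-predictor and is not shown in this PR.

```python
def passes_cutover_gate(
    variant_ic: float,
    baseline_ic: float,
    min_relative_lift: float = 0.15,
) -> bool:
    """Hypothetical gate: variant must beat the plain baseline IC by >= 15% relative."""
    # Assumption: with a zero or negative baseline, relative lift is not
    # meaningful, so the gate stays closed.
    if baseline_ic <= 0:
        return False
    relative_lift = (variant_ic - baseline_ic) / baseline_ic
    return relative_lift >= min_relative_lift
```

For example, a variant IC of 0.05 against a baseline of 0.04 is a 25% relative lift and would pass, while 0.042 against 0.04 is a 5% lift and would not.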

FEATURE_CFG additions: beta_window=60, vol_of_vol_window=30,
max_drawdown_window=60.

644 tests pass + 1 skip (16 new tests for the 4 risk features:
schema contract, beta-of-self=1, beta-of-independent≈0,
idio_vol-of-self=0, monotone-increasing→drawdown=0,
post-decline→drawdown<0, NaN-when-no-SPY).

Operator follow-up: ArcticDB universe library backfill for the 4 new
columns. Saturday SF feature_engineer run will start writing them on
the next firing; historical rows need a one-shot backfill script
invocation against the universe library before Stage 2b training can
consume them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds 63-day realized volatility (per-ticker) to the v3.2 risk feature
set. Pairs with the existing realized_vol_20d to give the volatility
GBM a vol-term-structure signal — steep upward slope in the (20d, 63d)
pair indicates vol expansion; flat or inverted indicates mean-revert
regime.

Captures a slower vol regime than 20d. Trees can split on the
relationship between short and medium-window vol naturally.
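
A small sketch of the (20d, 63d) pair, assuming both are annualized rolling stds of log-returns. The column name `realized_vol_63d` and the helper names are hypothetical; this commit message does not state the actual feature name.

```python
import numpy as np
import pandas as pd


def realized_vol(close: pd.Series, window: int) -> pd.Series:
    """Assumed definition: annualized rolling std of daily log-returns."""
    return np.log(close).diff().rolling(window).std() * np.sqrt(252)


def vol_term_structure(close: pd.Series) -> pd.DataFrame:
    """Return the short- and medium-window vol side by side for a tree model."""
    return pd.DataFrame(
        {
            "realized_vol_20d": realized_vol(close, 20),
            # Hypothetical column name for the new 63-day feature.
            "realized_vol_63d": realized_vol(close, 63),
        }
    )
```

No explicit slope or ratio column is computed here, since a tree can split on each column in turn and recover the relationship. The 63d column needs 63 returns of warm-up, so its first 63 rows are NaN.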

3 new tests (warmup, non-negativity, smoother-than-20d).

Stage 2a-extended scope discipline: this is the only piece of the
expanded macro/risk set that lives in alpha-engine-data — 200d breadth,
VIX/VIX3M ratio, 10Y-2Y curve, and HY OAS features all live in
alpha-engine-predictor's regime_predictor.build_features() (macros) or
require new daily_closes.py time-series ingestion (DGS2 + HY OAS).
Those are scoped as Stage 2.5 (data-side ingestion) + Stage 2c
(predictor-side wiring).

19 risk feature tests pass total (16 pre-existing + 3 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit 479a5f0 into main May 10, 2026
1 check passed
@cipher813 cipher813 deleted the feat/risk-features-stage-2a branch May 10, 2026 18:11
cipher813 added a commit that referenced this pull request May 18, 2026
…ht + EvalJudge/Rationale/Replay/CF dry_run_llm) (#263)

Closes the keystone gap: the 5 documented shell-run skip-exceptions are
flipped skip→dry. Under shell_run EVERY substantive workload now boots +
runs dry; ZERO skip-exceptions remain. All prerequisite dry flags were
already MERGED on origin/main of their repos.

Per-state mechanism:

| State                       | Type   | Mechanism (under shell_run)                          |
|-----------------------------|--------|------------------------------------------------------|
| DriftDetection              | spot   | commands.$ States.Format($.preflight_args) → ` --preflight-only` (data #261) |
| EvalJudgeSubmitFirstSaturday| Lambda | Payload "dry_run_llm.$": "$.research_dry" (research #202) |
| EvalJudgeSubmitWeekly       | Lambda | Payload "dry_run_llm.$": "$.research_dry" (research #202) |
| EvalJudgePoll               | Lambda | Payload "dry_run_llm.$": "$.research_dry" (research #202) |
| EvalJudgeProcess            | Lambda | Payload "dry_run_llm.$": "$.research_dry" (research #202) |
| RationaleClustering         | Lambda | Payload "dry_run_llm.$": "$.research_dry" (research #202) |
| ReplayConcordance           | Lambda | Payload "dry_run_llm.$": "$.research_dry" (backtester #225) |
| Counterfactual              | Lambda | Payload "dry_run_llm.$": "$.research_dry" (backtester #225) |

Exact canonical dry var: $.research_dry. It is THE canonical shell-run
LLM-dry signal — InitializeInput seeds it false on every run (so the
absent path / real Sat 02:00 PT firing is unchanged); ApplyShellRunDefaults
already sets it true under shell_run (it backed Research from the
keystone). No new var invented — research #202 / backtester #225 PR bodies
specify dry_run_llm, and reusing $.research_dry keeps the absent-path
guarantee automatic (no extra seeding needed; the seed already exists).
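
The seeding logic above can be modelled with plain dict merges. This is a simplified Python stand-in for the state machine, not its actual definition: the function names mirror the state names, but the exact set of seeded keys, the assumption that ApplyShellRunDefaults also sets preflight_args, and user overrides winning for these two vars are all assumptions.

```python
def initialize_input(user_input: dict) -> dict:
    """Stand-in for InitializeInput: seed defaults, then layer user input on top."""
    seeds = {"shell_run": False, "preflight_args": "", "research_dry": False}
    return {**seeds, **user_input}


def apply_shell_run_defaults(state: dict, user_input: dict) -> dict:
    """Stand-in for ApplyShellRunDefaults: only active when shell_run is set."""
    if not state["shell_run"]:
        return state  # absent path: seeds are left untouched
    shell_defaults = {"preflight_args": " --preflight-only", "research_dry": True}
    # Assumed merge order: explicit user flags still win over shell defaults.
    return {**state, **shell_defaults, **user_input}


def eval_lambda_payload(state: dict) -> dict:
    """Mirror of the Payload mapping "dry_run_llm.$": "$.research_dry"."""
    return {"dry_run_llm": state["research_dry"]}


# Real Saturday firing: no shell_run flag, so the eval Lambdas run live.
real = apply_shell_run_defaults(initialize_input({}), {})
# Shell run: every eval Lambda receives dry_run_llm=True.
shell_input = {"shell_run": True}
shell = apply_shell_run_defaults(initialize_input(shell_input), shell_input)
```

Because the seed always exists, the Payload reference resolves on every path without a new variable or an extra seeding step.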

Changes:
- ApplyShellRunDefaults: removed skip_drift_detection / skip_eval_judge /
  skip_rationale_clustering / skip_replay_concordance / skip_counterfactual
  from the force-set JsonMerge blob. It now force-sets ZERO skip_*.
  Per-flag user overrides still win (merge order unchanged). The
  Choice-gated CheckSkip<State> gates are LEFT INTACT (still valid for
  targeted operator skips — verified by test_skip_gates_still_intact).
- DriftDetection: literal `commands` array → `commands.$` States.Array
  whose final entry is States.Format('bash infrastructure/
  spot_drift_detection.sh{} 2>&1 | tee /var/log/drift-detection.log',
  $.preflight_args). {} sits immediately after the script token with no
  literal space; preflight_args carries its leading space inside the var,
  so preflight_args="" reproduces the origin/main command char-for-char
  and " --preflight-only" yields exactly one separating space.
- 7 eval Lambda Payloads: added "dry_run_llm.$": "$.research_dry".
  EvalRollingMean (alpha-engine-research-eval-rolling-mean) was NOT touched
  — it has no skip gate, was never a keystone exception, and is a pure
  historical-metric reader (out of scope).
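
The leading-space convention for preflight_args in the DriftDetection change can be checked with a few lines of Python. `states_format` below is a minimal stand-in for the States.Format intrinsic, handling only plain `{}` placeholders, and the template is the command quoted above with the commit-message line wrap rejoined.

```python
TEMPLATE = (
    "bash infrastructure/spot_drift_detection.sh{} 2>&1 "
    "| tee /var/log/drift-detection.log"
)


def states_format(template: str, *args: str) -> str:
    """Minimal stand-in for States.Format: fill each {} left to right."""
    out = template
    for arg in args:
        out = out.replace("{}", arg, 1)
    return out


# Absent path: empty preflight_args reproduces the pre-rewire literal.
live_cmd = states_format(TEMPLATE, "")
# shell_run path: the var carries its own leading space.
dry_cmd = states_format(TEMPLATE, " --preflight-only")
```

Keeping the separator inside the variable rather than in the template is what lets the empty value collapse to the original command with no stray space before `2>&1`.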

Byte-identical proof approach:
- shell_run absent ⇒ CheckShellRun.Default = CheckSkipMorningEnrich
  (unchanged); InitializeInput seeds preflight_args="", research_dry=false.
  Every spot States.Format resolves char-for-char to the frozen
  origin/main literal; every eval Lambda dry_run_llm.$ resolves to false
  (handlers default it false ⇒ behaviourally identical to pre-rewire).
- The frozen baseline fixture tests/fixtures/sf_prekeystone_spot_commands
  .json now INCLUDES DriftDetection's pre-rewire origin/main literal
  command (regenerated via the established generator at preflight_args="";
  the existing 7 entries are unchanged). The byte-identical test asserts
  DriftDetection's resolved command at preflight_args="" equals that
  frozen baseline and carries --preflight-only (single space) under
  shell_run.
- CI-safe: tests read only the committed fixture (no `git show
  origin/main` shell-out — that was the #260 CI failure).

Tests:
- _SPOT_STATES grew to 8 (added DriftDetection); _DRY_LAMBDA_STATES grew
  to 11 (added the 7 eval states); _KEYSTONE_SKIP_EXCEPTIONS = empty set.
- test_shell_defaults_force_set_ZERO_skip_exceptions asserts the blob
  force-sets no skip_* and none of the 16 workload skips (incl. the 5
  ex-exceptions) appear.
- TestHappyPathTraversal: under shell_run nothing is skipped (skipped ==
  set()); DriftDetection is VISITED (runs dry), not jumped past.
- Module + class docstrings updated to the rewire semantics.

JSON valid (58 top-level states, 91 incl. parallel branches). Full
alpha-engine-data suite: 1351 passed, 1 skipped, 0 failed.

Zero skip-exceptions remain — every substantive task runs dry under
shell_run (spots → --preflight-only, Lambdas → dry_run_llm).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>