feat(morning-enrich): chronic-polygon-gap self-heal via yfinance backfill #193
Merged
Closes the 2026-05-09 weekly-SF DataPhase1 postflight failure. PSTG
ended at 5/5 in ArcticDB while SPY was at 5/8 (3d stale, > 2d threshold);
the other 3 chronic polygon gaps (BF-B / BRK-B / MOG-A) ended at 5/6
(2d, exactly at the threshold today; could fail tomorrow). Polygon does
not reliably serve these 4 tickers (class B/A share dot-vs-dash naming
on 3 of them, intermittent coverage on PSTG since ~2026-04), so
MorningEnrich's polygon_only daily_append leaves them at whatever the
prior EOD yfinance pass landed; on days when EOD also dropped the
ticker the gap compounds and postflight catches them as stale.
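
The staleness arithmetic above can be checked with a short sketch. It assumes postflight measures staleness in calendar days against SPY's last date and fails only when the gap is strictly greater than the threshold; the function names and the constant are hypothetical, not taken from the repository.

```python
from datetime import date

# Assumption: postflight fails when staleness is strictly greater than 2 days.
STALE_THRESHOLD_DAYS = 2


def stale_days(ticker_last: date, benchmark_last: date) -> int:
    """Calendar days by which a ticker's last row trails the benchmark's last row."""
    return max((benchmark_last - ticker_last).days, 0)


def is_stale(ticker_last: date, benchmark_last: date,
             threshold: int = STALE_THRESHOLD_DAYS) -> bool:
    """True when the ticker trails the benchmark by more than the threshold."""
    return stale_days(ticker_last, benchmark_last) > threshold


# Dates from the 2026-05-09 incident described above.
spy_last = date(2026, 5, 8)
last_dates = {
    "PSTG": date(2026, 5, 5),
    "BF-B": date(2026, 5, 6),
    "BRK-B": date(2026, 5, 6),
    "MOG-A": date(2026, 5, 6),
}
report = {
    ticker: (stale_days(last, spy_last), is_stale(last, spy_last))
    for ticker, last in last_dates.items()
}
```

Under these assumptions PSTG is 3 days behind and fails the check, while the other three sit at exactly 2 days and pass until one more day is missed.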
This adds a `_self_heal_chronic_polygon_gaps` step that runs after
daily_append in MorningEnrich. For each ticker in
`config.chronic_polygon_gaps.tickers`:
- Read ArcticDB universe last_date.
- If last_date >= target_date, skip (idempotent).
- Else yfinance-fetch [last_date+1, target_date], patch
  `predictor/price_cache/{ticker}.parquet` with the new rows
  (dedupe by date keep="last"), and invoke
  `builders.backfill(ticker_filter=ticker)` so the ArcticDB write
  goes through the same per-ticker compute_features path as every
  other ticker.
Best-effort by design: a yfinance hiccup on one chronic ticker logs
the error but doesn't halt MorningEnrich. Postflight remains the
load-bearing gate on freshness — if a ticker is still stale after
this step, postflight surfaces it.
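
The per-ticker flow and its best-effort error isolation can be sketched as below. This is a minimal illustration, not the merged implementation: the ArcticDB read, the yfinance fetch, the parquet patch, and the backfill call are passed in as hypothetical callables, and the `would_heal` bucket for dry runs is an assumption.

```python
from datetime import date, timedelta
from typing import Callable, Iterable


def merge_rows(existing: list[dict], new: list[dict]) -> list[dict]:
    """Dedupe by date keeping the last occurrence, then return rows in date order."""
    by_date = {}
    for row in [*existing, *new]:
        by_date[row["date"]] = row  # a later row overwrites an earlier one
    return [by_date[d] for d in sorted(by_date)]


def self_heal_chronic_polygon_gaps(
    tickers: Iterable[str],
    target_date: date,
    read_last_date: Callable[[str], date],            # hypothetical ArcticDB read
    fetch_rows: Callable[[str, date, date], list],    # hypothetical yfinance fetch
    patch_cache: Callable[[str, list], None],         # hypothetical parquet patch
    backfill: Callable[[str], None],                  # hypothetical backfill call
    dry_run: bool = False,
) -> dict:
    result = {"healed": [], "already_fresh": [], "would_heal": [], "errors": {}}
    for ticker in tickers:
        try:
            last_date = read_last_date(ticker)
            if last_date >= target_date:
                result["already_fresh"].append(ticker)  # idempotent skip
                continue
            if dry_run:
                result["would_heal"].append(ticker)     # no fetch, no writes
                continue
            start = last_date + timedelta(days=1)
            rows = fetch_rows(ticker, start, target_date)
            patch_cache(ticker, rows)
            backfill(ticker)
            result["healed"].append(ticker)
        except Exception as exc:
            # Best-effort: one ticker's failure never halts the remaining tickers.
            result["errors"][ticker] = repr(exc)
    return result
```

Injecting the I/O steps keeps the control flow testable without ArcticDB or network access, which matches the test cases listed in this commit (already-fresh skip, dry run, per-ticker error isolation).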
Tickers absent from `chronic_polygon_gaps.tickers` still hard-fail
polygon_only collection when missing, preserving the strict
"no silent fails" default for the ~900 healthy tickers.
Companion config update lives in alpha-engine-config (private repo)
under `data/config.yaml`. The example here documents the schema for
future operators.
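
A minimal sketch of a permissive loader for this schema follows. It assumes the parsed YAML arrives as a plain dict with a `chronic_polygon_gaps.tickers` list, and that "sorted" means the returned allowlist is sorted and de-duplicated; the function name is hypothetical.

```python
def load_chronic_polygon_gap_tickers(config) -> list:
    """Return the sorted allowlist, or [] when the section is missing or malformed."""
    if not isinstance(config, dict):
        return []
    section = config.get("chronic_polygon_gaps")
    if not isinstance(section, dict):
        return []
    tickers = section.get("tickers")
    if not isinstance(tickers, list):
        return []
    # Keep only non-empty strings; sort so downstream logs and tests are stable.
    return sorted({t.strip() for t in tickers if isinstance(t, str) and t.strip()})


# Shape of the parsed config section, using the four tickers named in this PR.
example_config = {
    "chronic_polygon_gaps": {"tickers": ["PSTG", "BRK-B", "MOG-A", "BF-B"]}
}
allowlist = load_chronic_polygon_gap_tickers(example_config)
```

Returning an empty list on a missing or malformed section means the self-heal step becomes a no-op rather than an error, so strict polygon_only collection still applies to every ticker.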
Tests cover: config loader (sorted keys, missing/malformed
permissive), already-fresh skip, yfinance fetch + parquet patch +
backfill invocation, dry-run no-side-effect, per-ticker error
isolation, empty-list no-op.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `git -C alpha-engine-config pull --ff-only origin main` to the SF DataPhase1 task command list, alongside the existing alpha-engine-data pull. Without this, a config change merged to alpha-engine-config (e.g. adding/removing a chronic_polygon_gaps ticker) doesn't reach the trading instance until something else triggers an external pull, masking the change while polluting downstream behavior.

Triggered by the chronic_polygon_gaps allowlist landing in alpha-engine-config #88: the dispatcher's local clone was 2 days behind origin/main during the 2026-05-09 weekly-SF DataPhase1 recovery, and the new config section never reached weekly_collector until I SSM-ran a manual pull. Closing the loop here so future config changes are SF-pulled automatically.

Scope: DataPhase1 only. Other states (RAGIngestion, Predictor, Backtester, Dashboard) don't read alpha-engine-config directly today; adding the pull universally is a separate decision (cheap +30ms each but expands the trust surface).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request on May 9, 2026
…ha-engine-config (#194)

Closes the same staleness vector PR #193 closed for DataPhase1: the SF PredictorTraining task pulls alpha-engine-predictor on every run but relies on the dispatcher's local ``alpha-engine-predictor/config/predictor.yaml`` for training config. That file is gitignored in the predictor repo and must be staged from the alpha-engine-config sibling clone, but nothing in the SF flow was keeping the staged copy in lockstep with origin/main of alpha-engine-config. The 2026-05-09 horizon migration (alpha-engine-config #90: forward_days 5 → 21, output_distribution_gate_blocking false → true, purge_days bump) would not have reached the next Saturday training without a manual SSM-side intervention to copy the config from alpha-engine-config to alpha-engine-predictor.

Adds two commands before the spot_train.sh invocation:
- ``git -C alpha-engine-config pull --ff-only origin main``
- ``cp alpha-engine-config/predictor/predictor.yaml alpha-engine-predictor/config/predictor.yaml``

Now any merged config change in alpha-engine-config reaches the next PredictorTraining cycle automatically. Mirrors the symmetric DataPhase1 fix from PR #193.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
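
The ordering requirement (sync the config before training starts) can be sketched as a small helper over a task's command list. The two sync commands are quoted from the commit message; representing the SF task as a Python list of shell strings, and the helper name itself, are assumptions for illustration only.

```python
# The two commands this commit adds, quoted from the commit message.
CONFIG_SYNC_COMMANDS = [
    "git -C alpha-engine-config pull --ff-only origin main",
    "cp alpha-engine-config/predictor/predictor.yaml "
    "alpha-engine-predictor/config/predictor.yaml",
]


def with_config_sync(commands: list[str], marker: str = "spot_train.sh") -> list[str]:
    """Insert the sync commands immediately before the first command mentioning marker."""
    for i, cmd in enumerate(commands):
        if marker in cmd:
            # Skip anything already present ahead of the marker so reruns are idempotent.
            missing = [c for c in CONFIG_SYNC_COMMANDS if c not in commands[:i]]
            return commands[:i] + missing + commands[i:]
    raise ValueError(f"no command mentions {marker!r}")
```

Placing the `cp` after the pull matters: copying first would stage the stale file that the pull is meant to replace.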
cipher813 added a commit that referenced this pull request on May 9, 2026
…195)

Pairs with the chronic-gap self-heal step from PR #193. The chronic_polygon_gaps allowlist was added because polygon doesn't reliably serve BF-B / BRK-B / MOG-A / PSTG. If polygon coverage RECOVERS for any of these (polygon adds a Berkshire B share class CIK or fixes a flaky data feed), the allowlist entry becomes silent operational debt: yfinance fallback would still happen even though polygon now has the data.

Adds `_detect_chronic_gap_polygon_recovery` step in MorningEnrich, runs BEFORE the self-heal so the signal is a clean read of what polygon shipped today (not contaminated by our yfinance backfill). Reads `staging/daily_closes/{date}.parquet` written by `daily_closes.collect(source="polygon_only")` and checks each chronic ticker for membership. Emits CW gauge `AlphaEngine/Data/chronic_gap_polygon_recovery_count` (always; a gauge of 0 anchors the alarm baseline, and CW missing-data is harder to alarm on than a steady 0).

Operator action when count > 0 across multiple cycles: prune the allowlist entry from alpha-engine-config predictor.yaml.

Best-effort by design: read errors / metric emit errors log a warning but never raise. MorningEnrich is not blocked by drift detection; postflight remains the load-bearing freshness gate.

Tests: 5 new in test_chronic_gap_drift_detection.py
- no recovery → metric emits 0 + absent_as_expected list
- partial recovery (BRK-B + PSTG covered) → recovery list + metric=2
- parquet read failure → status=skipped, no raise
- empty chronic list → noop, no S3 read, no metric
- CW emit failure → swallowed, result still records counts

Closes the chronic_polygon_gaps loop: self-heal (PR #193) backfills the gap; drift detection (this PR) flags when the gap closes upstream.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
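
The membership check and its never-raise behavior can be sketched as follows. The parquet read and the CloudWatch emit are hypothetical injected callables, and the `"ok"` status plus the `metric_emitted` flag are assumptions; only the `skipped` and `noop` outcomes and the metric name come from the commit message.

```python
from typing import Callable, Iterable

METRIC_NAME = "AlphaEngine/Data/chronic_gap_polygon_recovery_count"


def detect_chronic_gap_polygon_recovery(
    chronic_tickers: Iterable[str],
    read_polygon_tickers: Callable[[], Iterable[str]],  # hypothetical staging read
    emit_gauge: Callable[[str, int], None],             # hypothetical CW emitter
) -> dict:
    chronic = sorted(set(chronic_tickers))
    if not chronic:
        # Empty allowlist: no read and no metric.
        return {"status": "noop", "recovered": [], "absent_as_expected": []}
    try:
        served = set(read_polygon_tickers())
    except Exception as exc:
        # Read failure is reported, never raised.
        return {"status": "skipped", "reason": repr(exc),
                "recovered": [], "absent_as_expected": []}
    recovered = [t for t in chronic if t in served]
    absent = [t for t in chronic if t not in served]
    result = {"status": "ok", "recovered": recovered,
              "absent_as_expected": absent, "recovery_count": len(recovered)}
    try:
        emit_gauge(METRIC_NAME, len(recovered))  # emitted even when the count is 0
        result["metric_emitted"] = True
    except Exception:
        result["metric_emitted"] = False         # swallowed; counts are still recorded
    return result
```

Running this before the self-heal step, as the commit describes, keeps the input limited to what polygon itself delivered for the day.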
cipher813 added a commit that referenced this pull request on May 10, 2026
…econciliation (#199)

PR 1 of the windowed-data-reconciliation arc (plan doc: alpha-engine-docs/private/windowed-data-reconciliation-260510.md).

**Origin:** 2026-05-09 evaluator email "121 tickers with >5d gap" diagnostic. Investigation revealed 110 of the 121 are matrix-pivot artifacts: front-of-history mismatches and tail drift from the chronic-polygon-gap self-heal (data #193) widening the gap between the 5 chronically-stale tickers (which heal daily) and the rest of the universe (which only refresh on weekly DataPhase1).

**This PR's scope:** structural orchestration only.
- New ``window_days: int = 1`` parameter on ``collect()``. Default 1 preserves all single-date legacy behavior; no consumer call sites change shape.
- New ``_previous_business_days(run_date, n)`` helper enumerating ``n`` BDays ending at ``run_date`` (inclusive), newest first. Saturday/Sunday run_date normalizes to the prior Friday so a Sat SF firing at 02:00 PT doesn't burn a slot on a non-trading day.
- New ``_collect_window()`` helper that iterates oldest → newest, calling the existing ``collect(window_days=1)`` per date so all the fetch / coverage-gate / write logic reuses unchanged. Per-date failures don't kill the rest of the window: the aggregate's ``status`` flips to ``"partial"`` and successful dates still write.

**Polygon free-tier rate-limit contract:** one ``grouped-daily`` call per date in the window, total ``window_days`` polygon calls, the only way to honor 14/day at the free tier. Test ``test_polygon_only_window_makes_one_grouped_daily_per_date`` pins this invariant for the production default ``window_days=14``.

**Out of scope (later PRs in the arc):**
- PR 2: per-cell skip-if-canonical optimization on the yfinance side (cells where ``source ∈ {"yfinance", "polygon"}`` skip the yfinance refetch, keeping yfinance batch cost near zero in steady state).
- PR 3: SF wiring + ``window_days=14`` config knob.
- PR 4: simulator gap-warning metric refactor reading the ``source`` column.
- PR 5: ``chronic_polygon_gaps`` allowlist deprecation.

+14 tests pinning the legacy-parity contract (``window_days=1`` produces byte-identical single-date result shape), the orchestration contract (window mode fans out to N per-date calls oldest first), the polygon rate-limit invariant, and the per-date-failure non-blocking behavior.

Suite: 633 passed (was 619; +14 new).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
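
The window enumeration and the per-date fan-out can be sketched as below. This is an illustration under stated assumptions, not the merged helpers: business days are treated as plain weekdays (market holidays are ignored), the single-date collector is a hypothetical injected callable, and the aggregate's `"ok"` status and field names are invented for the example.

```python
from datetime import date, timedelta
from typing import Callable


def previous_business_days(run_date: date, n: int) -> list[date]:
    """Return n weekdays ending at run_date (inclusive), newest first."""
    if n < 1:
        raise ValueError("n must be >= 1")
    day = run_date
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday: fall back to Friday
        day -= timedelta(days=1)
    days = []
    while len(days) < n:
        if day.weekday() < 5:
            days.append(day)
        day -= timedelta(days=1)
    return days


def collect_window(run_date: date, window_days: int,
                   collect_one: Callable[[date], dict]) -> dict:
    """Call the single-date collector once per date, oldest first."""
    per_date, failed = {}, []
    for day in reversed(previous_business_days(run_date, window_days)):
        try:
            per_date[day] = collect_one(day)  # exactly one call per date
        except Exception as exc:
            # A failed date does not stop the rest of the window.
            failed.append(day)
            per_date[day] = {"status": "error", "error": repr(exc)}
    # Assumption: any failure yields "partial"; total-failure handling is not specified.
    return {"status": "partial" if failed else "ok", "dates": per_date, "failed": failed}
```

Because the collector is invoked once per date, a 14-day window makes 14 calls in total, which is the rate-limit invariant the commit message pins with a test.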
Summary

Root-cause fix for the 2026-05-09 weekly-SF DataPhase1 postflight failure. After alpha-engine-data PR #192 fixed the arcticdb backfill regression, postflight failed at PSTG = 3d stale vs SPY (`>2d` threshold), exposing a deeper problem: polygon does not reliably serve 4 known chronic-gap tickers (BF-B / BRK-B / MOG-A / PSTG), so they accumulate multi-day rot whenever the EOD yfinance pass also drops a day.

The current design conflates two orthogonal concerns into a binary `source=polygon_only`. This PR separates them: coverage expectation moves to config (`chronic_polygon_gaps.tickers`), and a new self-heal step in MorningEnrich yfinance-backfills any ArcticDB row gap for chronic-gap tickers. Tickers absent from the allowlist still hard-fail polygon_only collection, preserving the strict "no silent fails" default for the ~900 healthy tickers.

Companion config update in alpha-engine-config (private repo) at `data/config.yaml`.

This is PR A of a 3-PR root-cause arc:
- `source` column on ArcticDB OHLCV rows so postflight + downstream consumers can distinguish polygon vs yfinance origin per row

Test plan
- `tests/test_chronic_polygon_gap_self_heal.py` covers the config loader (sorted/missing/malformed) + self-heal helper (already-fresh skip, yfinance + parquet patch + backfill invocation, dry-run, per-ticker error isolation, empty-list no-op)
- `pytest` full suite (582 passed, 1 skipped, vs 574 on origin/main)
- `chronic-gap self-heal: 1 healed, 3 already-fresh, 0 errors` (or similar) on the rerun
- `already_fresh` (no work after the initial heal)

Followups (separate PRs)
- `source` column on ArcticDB rows (additive schema)

🤖 Generated with Claude Code