weekday SF: add MorningEnrich step (polygon overwrite before predictor inference) #91
Merged
Operational wiring for the split-by-source design from PR 1 (alpha-engine-data #90).

step_function_daily.json (weekday SF, Mon-Fri 6:05 AM PT):

Insert MorningEnrich SSM step on ae-trading between the trading-day check and PredictorInference. Runs:

    python weekly_collector.py --morning-enrich

which finds the previous trading day, fetches polygon grouped-daily for it (hard-fails on PolygonForbiddenError, no yfinance fallback), and re-runs daily_append to overwrite the prior day's ArcticDB row with polygon's authoritative OHLCV+VWAP.

PredictorInference is gated on this succeeding: failure routes to HandleFailure, not silent inference on uncorrected data (per feedback_no_silent_fails). This closes the operational loop on the 2026-04-17→2026-04-23 silent VWAP outage where the EOD yfinance pass was the only source and ArcticDB's VWAP column stayed universally null across the window.

step_function_eod.json (EOD SF, daemon-shutdown trigger):

Move PostMarketData from micro to ae-trading (InstanceIds.$ now uses $.trading_instance_id; same change for WaitForPostMarketData polling). Avoids the OOM regression that originally moved DailyData off micro on 2026-04-16. Bumps executionTimeout 180→720 to match observed ~7 min runtime + safety margin (15-min window between daemon shutdown at 1:15 PM PT and EC2 stop at 1:30 PM PT; 8-10 min usage, comfortable margin).

Simplified the two-command pattern (--only daily_closes + builders.daily_append) to a single `python weekly_collector.py --daily` since PR 1 unified --daily under source=yfinance_only and the full _run_daily flow now does closes + features + append together.

Comment updated: this SF is now the sole canonical EOD path. The alpha-engine-daily-data systemd timer that was racing this SF gets deleted in the paired alpha-engine PR.

Validation:
- Both SF JSONs parse cleanly (json.load smoke check)
- 153/153 unit tests pass
- Production validation gated on:
  1. Deploy via infrastructure/deploy_step_function_daily.sh + deploy_step_function.sh (or equivalent)
  2. Paired alpha-engine PR deletes systemd timer + retargets daemon._trigger_eod_pipeline (or accepts ec2_instance_id field being unused)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
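The gating described in this commit (MorningEnrich placed before PredictorInference, with failures routed to HandleFailure) could be asserted as part of the json.load smoke check. The following is a minimal sketch under stated assumptions: the embedded state machine is a heavily simplified stand-in rather than the real step_function_daily.json, the `CheckTradingDay` and `NotTradingDay` state names and the `check_morning_enrich_wiring` helper are hypothetical, and the real definition likely has extra polling states between the SSM command and PredictorInference.

```python
import json

# Hypothetical, simplified stand-in for step_function_daily.json.
# Only MorningEnrich, PredictorInference and HandleFailure are names from the PR.
SF_JSON = """
{
  "StartAt": "CheckTradingDay",
  "States": {
    "CheckTradingDay": {
      "Type": "Choice",
      "Choices": [
        {"Variable": "$.is_trading_day", "BooleanEquals": true, "Next": "MorningEnrich"}
      ],
      "Default": "NotTradingDay"
    },
    "MorningEnrich": {
      "Type": "Task",
      "Resource": "arn:aws:states:::aws-sdk:ssm:sendCommand",
      "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "HandleFailure"}],
      "Next": "PredictorInference"
    },
    "PredictorInference": {"Type": "Pass", "End": true},
    "HandleFailure": {"Type": "Fail"},
    "NotTradingDay": {"Type": "Succeed"}
  }
}
"""


def check_morning_enrich_wiring(definition: dict) -> list[str]:
    """Return a list of wiring problems; an empty list means the gate is in place."""
    states = definition.get("States", {})
    step = states.get("MorningEnrich")
    if step is None:
        return ["MorningEnrich state is missing"]

    problems = []
    # Success path must lead to inference (directly, in this simplified sketch).
    if step.get("Next") != "PredictorInference":
        problems.append("MorningEnrich does not lead to PredictorInference")
    # Failure path must route to HandleFailure instead of falling through.
    catch_targets = {catcher.get("Next") for catcher in step.get("Catch", [])}
    if "HandleFailure" not in catch_targets:
        problems.append("MorningEnrich has no Catch routing to HandleFailure")
    return problems


if __name__ == "__main__":
    print(check_morning_enrich_wiring(json.loads(SF_JSON)))
```

A check like this goes one step beyond "the JSON parses": it fails if a later edit drops the Catch clause and lets inference run on uncorrected data.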
…ally triggered

alpha-engine PR #94 (merged 2026-04-22) removed _trigger_eod_pipeline from executor/daemon.py, so the EOD SF (alpha-engine-eod-pipeline) is no longer fired from daemon shutdown. The canonical EOD path is now:

* 1:05 PM PT: alpha-engine-daily-data.timer (systemd, ae-trading) runs `python weekly_collector.py --daily` (post-PR-1 = yfinance_only)
* 1:20 PM PT: alpha-engine-eod.timer (systemd, ae-trading) runs `python executor/eod_reconcile.py`

The EOD SF JSON exists only for manual disaster recovery. Modifying it (moving from micro→trading + simplifying commands) was based on stale context from earlier in this session; the original "EOD SF as canonical path" framing was true a week ago but no longer holds.

Reverting keeps the EOD SF unchanged so this PR's scope stays minimal: just the MorningEnrich SSM step in step_function_daily.json.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request Apr 24, 2026
#92)

Surfaced by the 2026-04-24 historical VWAP repair after PR #91 deployed: ArcticDB's universe_lib.update() raises "index must be monotonic increasing or decreasing" when asked to insert behind the latest stored date. daily_append was originally designed for "append today's row at the head", which is fine for the steady-state daily pass but useless for historical backfills.

This commit adds _write_row_backfill_safe(lib, symbol, new_row, existing_series=None) which routes by mode:

* append (target_ts > all existing dates): use lib.update(), the fast path with a single-row write. Same behavior as before for the steady-state daily pass.
* backfill (target_ts ≤ some existing date, including target_ts == latest): read full series, splice in the new row (replacing any existing same-date row, matching update() semantics), write the monotonic-sorted full series via lib.write(prune_previous_versions=True). ~10-100x slower per ticker but only fires for rare backfill operations.

Per-ticker write site rewired to call _write_row_backfill_safe; the existing `hist` (already read for feature warmup) is passed in as existing_series so the helper doesn't double-read.

Macro + sector ETF write sites also rewired. Each call's mode is captured into macro_write_modes so the post-write verification check can apply the right correctness assertion:

* append mode → readback last index must equal target_ts (catches the 2026-04-15 silent-stale failure)
* backfill mode → target_ts must be IN the readback index, anywhere (the last date is naturally future relative to a backfilled historical date)

Existing semantics tests (test_daily_append_semantics.py) updated to match the new helper-routed call sites while preserving regression intent: lib.append() must never appear, counters increment after write, etc.

9 new tests in tests/test_daily_append_backfill_safe.py:
- append uses lib.update when target after latest
- append when existing series is empty (first-write-after-empty)
- first write to nonexistent symbol uses write()
- backfill uses lib.write when target before latest
- backfill replaces existing same-date row (semantic match w/ update)
- backfill target in middle of series, sorts monotonic
- target == latest takes backfill path (conservative >, not >=)
- lib.write called with prune_previous_versions=True
- passing existing_series avoids extra lib.read

Full suite: 162/162 pass.

After deploy to ae-trading: re-run the historical backfill loop

    for D in 2026-04-17 2026-04-20 2026-04-21 2026-04-22 2026-04-23; do
      python weekly_collector.py --morning-enrich --date $D
    done

to repair the universally-NaN VWAP column for the 2026-04-17→2026-04-23 window the polygon outage left behind.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
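The append/backfill routing and the mode-specific readback assertion described in this commit can be sketched without ArcticDB. This is an illustrative sketch only, not the code of _write_row_backfill_safe: the helper names `choose_write_mode`, `splice_row` and `verify_readback` are hypothetical, a series is modelled as a plain list of (date, row) tuples, and the "nonexistent symbol uses write()" case is left out.

```python
from datetime import date


def choose_write_mode(target, existing_dates):
    """Return "append" only when target is strictly after every stored date.

    target == latest deliberately takes the backfill path (conservative >, not >=).
    """
    if not existing_dates or target > max(existing_dates):
        return "append"
    return "backfill"


def splice_row(series, target, row):
    """Return a new date-sorted series with `row` stored at `target`.

    Any existing row for the same date is replaced, mirroring update() semantics.
    """
    merged = dict(series)
    merged[target] = row
    return sorted(merged.items(), key=lambda item: item[0])


def verify_readback(mode, target, readback_dates):
    """Apply the assertion that matches the write mode."""
    if mode == "append":
        # The new row must be the head of the series.
        return bool(readback_dates) and readback_dates[-1] == target
    # Backfill: the row may sit anywhere; later dates legitimately follow it.
    return target in readback_dates


if __name__ == "__main__":
    series = [(date(2026, 4, 16), {"VWAP": 101.2}), (date(2026, 4, 24), {"VWAP": 103.9})]
    target = date(2026, 4, 17)
    mode = choose_write_mode(target, [d for d, _ in series])
    repaired = splice_row(series, target, {"VWAP": 102.4})
    print(mode, verify_readback(mode, target, [d for d, _ in repaired]))
```

Keeping the mode decision separate from the write lets the post-write check pick its assertion from the recorded mode, which is the role macro_write_modes plays in the commit.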
cipher813 added a commit that referenced this pull request Apr 27, 2026
2026-04-27 weekday SF dropped through to executor with a 65h-stale daily_data stamp because _run_morning_enrich (added 2026-04-24 PR #91) never refreshed it. The post-close DailyData run is the only writer of the stamp; on Mondays, that is Friday afternoon, well past the executor's 26h staleness gate. Today was the first Monday since MorningEnrich shipped, which is why this surfaced now.

Fix: on the success path, _run_morning_enrich now calls _write_module_health(module_name="daily_data", ...) with the polygon overwrite results in the summary, so the stamp's last_success refreshes to the morning-enrich run time.

Failure paths intentionally leave the prior stamp untouched: writing a fresh "ok" stamp on failure would mask outages from the executor's gate. Dry runs also skip the write.

3 new tests cover: stamp written on success with morning_enrich=true in the summary, stamp NOT written on PolygonForbiddenError, stamp NOT written in dry-run.

Full suite: 218/218 pass.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
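The 65h figure follows from the schedule: a stamp written by the Friday post-close run is about 65 hours old when the Monday weekday SF starts. A small sketch of that arithmetic and of the "refresh only on real success" rule, assuming illustrative naive PT timestamps (Friday 1:05 PM, Monday 6:05 AM); the `is_stale` and `refresh_stamp` helpers and the dict layout of the stamp are hypothetical, not the actual _write_module_health interface.

```python
from datetime import datetime, timedelta

STALENESS_GATE = timedelta(hours=26)  # executor's gate, per the commit message


def is_stale(last_success: datetime, now: datetime) -> bool:
    """True when the stamp is older than the executor's staleness gate."""
    return now - last_success > STALENESS_GATE


def refresh_stamp(stamp: dict, *, succeeded: bool, dry_run: bool, now: datetime) -> dict:
    """Return the stamp to persist: refreshed only on a non-dry-run success."""
    if succeeded and not dry_run:
        return {**stamp, "last_success": now, "summary": {"morning_enrich": True}}
    # Failure or dry run: leave the prior stamp alone so the gate still trips.
    return stamp


if __name__ == "__main__":
    friday_close = datetime(2026, 4, 24, 13, 5)   # assumed time of the last post-close write
    monday_run = datetime(2026, 4, 27, 6, 5)      # assumed weekday SF start
    stamp = {"last_success": friday_close}

    age_hours = (monday_run - friday_close).total_seconds() / 3600
    print(age_hours, is_stale(stamp["last_success"], monday_run))  # 65.0 True

    refreshed = refresh_stamp(stamp, succeeded=True, dry_run=False, now=monday_run)
    print(is_stale(refreshed["last_success"], monday_run))  # False
```

Returning the old stamp unchanged on failure is the important branch: a failed enrichment on a Monday still leaves a 65h-old stamp, so the executor's gate blocks trading rather than trusting uncorrected data.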
cipher813 added a commit that referenced this pull request Apr 27, 2026
…ma (#105)

Background - 2026-04-27 EOD-email blackout investigation
========================================================

The structural fix in PR #104 decoupled macro/SPY freshness from stock-coverage correctness. Validation today exposed a second, latent issue: with the universe-coverage guard now passing, daily_append's per-stock writes finally execute, and 100% of them fail with an ArcticDB schema-mismatch error.

Schema audit (2026-04-27 22:14 UTC) revealed heterogeneous universe state:

- 816 symbols (~90%): 64 cols, no VWAP at all
- 88 symbols (~10%): 65 cols, VWAP at idx=64 (appended at end)

daily_append writes via OHLCV_COLS = [Open, High, Low, Close, Volume, VWAP, ...features], which puts VWAP at idx=5. ArcticDB update() requires the column order to match, so both schema variants fail. Per-stock universe writes have therefore been failing since the polygon-VWAP work landed on 2026-04-24 (PRs #90/#91/#92), masked until today by the macro-coupled universe-coverage guard.

Operational design (yfinance EOD → polygon morning)
====================================================

- yfinance EOD post-close hook writes daily_closes parquet with VWAP=NaN (yfinance does not expose true volume-weighted VWAP).
- polygon morning enrichment overwrites the parquet with real VWAP values from polygon grouped-daily.
- daily_append runs end-of-day and writes whatever VWAP is in the parquet to ArcticDB universe: NaN initially, real values after the morning enrichment re-runs daily_append.

For that flow to work, VWAP must be a first-class column in the universe schema with a stable position. This migration normalizes every symbol to the canonical layout:

    [Open, High, Low, Close, Volume, VWAP] + FEATURES

NaN-fills VWAP historically for the 816 symbols that didn't have it. Repositions VWAP for the 88 symbols that had it appended at idx=64. Existing FEATURES block keeps its relative order.

Idempotent: symbols already in canonical order are skipped. Per-symbol error isolation: one symbol's write failure does not abort the batch (records into errors[], continues with the rest).

Tests
=====

- _canonical_column_order: VWAP inserted at idx=5, feature block preserved in relative order, drops nothing.
- _is_canonical: recognizes correct layout, rejects appended-VWAP and missing-VWAP variants.
- migrate_universe_vwap apply path:
  - Inserts VWAP at idx=5 with FLOAT64 NaN when absent.
  - Relocates VWAP from idx=last when appended (preserving values).
  - Skips already-canonical symbols (idempotent).
  - Honors --tickers override for canary / subset runs.
  - Per-symbol error isolation: partial-status return on partial failure.
- All 275 tests pass (261 existing + 14 new).

Operational follow-up (not in this PR)
======================================

After merge, deploy + run:

    python -m builders.migrate_universe_vwap --apply

on ae-trading. Expected: 904 symbols migrated (816 + 88), audit JSON written to s3://alpha-engine-research/builders/migrate_universe_vwap_audit/. Then rerun alpha-engine-daily-data.service (per-stock writes succeed) and alpha-engine-eod.service (held-stock close lookups succeed; EOD email + 2026-04-27 eod_pnl row land).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
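The column normalization at the heart of this migration can be illustrated on plain column-name lists. This is a sketch of the idea only, not the implementation of _canonical_column_order or _is_canonical: the function names below are hypothetical, the `feat_N` feature names are placeholders, and it assumes the five OHLCV columns are always present.

```python
OHLCV = ["Open", "High", "Low", "Close", "Volume"]


def canonical_column_order(columns):
    """Return [Open, High, Low, Close, Volume, VWAP] + features.

    Features keep their relative order; VWAP is added if it was absent
    (the caller would NaN-fill the new column).
    """
    features = [c for c in columns if c not in OHLCV and c != "VWAP"]
    return OHLCV + ["VWAP"] + features


def is_canonical(columns):
    """True only when the columns are already in the canonical layout."""
    return list(columns) == canonical_column_order(columns)


if __name__ == "__main__":
    features = [f"feat_{i}" for i in range(59)]      # placeholder feature names
    no_vwap = OHLCV + features                       # 64 cols, no VWAP
    appended_vwap = OHLCV + features + ["VWAP"]      # 65 cols, VWAP at idx=64

    for layout in (no_vwap, appended_vwap):
        fixed = canonical_column_order(layout)
        print(len(layout), is_canonical(layout), fixed.index("VWAP"), is_canonical(fixed))
```

Because `is_canonical` is defined in terms of the same ordering function, re-running the migration on an already-fixed symbol is a no-op, which is what makes the batch idempotent.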
This was referenced May 3, 2026
Summary
Adds a single new step (`MorningEnrich`) to the weekday Step Function that runs `python weekly_collector.py --morning-enrich` on ae-trading between the trading-day check and `PredictorInference`.

The morning enrichment finds the previous trading day, fetches polygon's grouped-daily for it (hard-fails on `PolygonForbiddenError`, no yfinance fallback), and re-runs `daily_append` to overwrite the prior day's ArcticDB row with polygon's authoritative OHLCV+VWAP. Predictor inference is gated on this succeeding: failure routes to `HandleFailure`, not silent inference on uncorrected data.

This closes the operational loop on the 2026-04-17→2026-04-23 silent VWAP outage where the EOD yfinance pass was the only source.
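The control flow summarized above (resolve the previous trading day, fetch, hard-fail with no fallback, then overwrite) can be sketched as follows. This is a minimal illustration, not the body of weekly_collector.py: the `PolygonForbiddenError` class here is a local stand-in for the real exception, `previous_trading_day` only skips weekends plus an optional caller-supplied holiday set, and the `fetch_grouped_daily` and `overwrite_row` callables are hypothetical injection points.

```python
from datetime import date, timedelta


class PolygonForbiddenError(Exception):
    """Local stand-in for the exception named in the PR."""


def previous_trading_day(today: date, holidays=frozenset()) -> date:
    """Step back one day at a time, skipping weekends and known holidays."""
    day = today - timedelta(days=1)
    while day.weekday() >= 5 or day in holidays:
        day -= timedelta(days=1)
    return day


def morning_enrich(today: date, fetch_grouped_daily, overwrite_row) -> date:
    """Overwrite the previous trading day's row with the fetched bars.

    Any PolygonForbiddenError from the fetch propagates to the caller:
    there is deliberately no fallback source, so the step fails loudly.
    """
    target = previous_trading_day(today)
    bars = fetch_grouped_daily(target)
    overwrite_row(target, bars)
    return target


if __name__ == "__main__":
    written = {}
    target = morning_enrich(
        date(2026, 4, 27),                                   # a Monday
        fetch_grouped_daily=lambda d: {"AAPL": {"VWAP": 1.0}},  # made-up data
        overwrite_row=written.__setitem__,
    )
    print(target, list(written))
```

Letting the exception escape is what allows the Step Function to route to `HandleFailure`; catching it and continuing would recreate the silent-inference behavior the PR is meant to remove.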
Why no other changes
Earlier in the design conversation we discussed deleting the alpha-engine-daily-data systemd timer + retargeting the EOD SF from micro to ae-trading. Both ideas were based on the assumption that the EOD SF was the operational EOD path. That's no longer true: alpha-engine PR #94 (2026-04-22) removed `_trigger_eod_pipeline` from `executor/daemon.py`, leaving the EOD SF as a manual-only disaster-recovery artifact. The actual canonical EOD flow is:

- `alpha-engine-daily-data.timer` (systemd) runs `weekly_collector.py --daily` on ae-trading. After PR #90 (Split daily collection by source: yfinance EOD + polygon morning enrichment) lands, this is the yfinance-only EOD pass.
- `alpha-engine-eod.timer` (systemd) runs `executor/eod_reconcile.py` directly.

The systemd timer + the morning SF step give us the full split-by-source flow with no race. So this PR doesn't touch the systemd units OR the EOD SF: they're either operationally correct already or non-operational.
Sequencing
- `infrastructure/deploy_step_function_daily.sh` to deploy the SF JSON
- `for D in 2026-04-17 2026-04-20 2026-04-21 2026-04-22 2026-04-23 2026-04-24; do python weekly_collector.py --morning-enrich --date $D; done` on ae-trading to repair the affected window

Test plan
🤖 Generated with Claude Code