weekday SF: add MorningEnrich step (polygon overwrite before predictor inference)#91

Merged
cipher813 merged 2 commits into main from feat/morning-enrich-sf-step on Apr 24, 2026


@cipher813 cipher813 commented Apr 24, 2026

Summary

Adds a single new step (MorningEnrich) to the weekday Step Function that runs python weekly_collector.py --morning-enrich on ae-trading between the trading-day check and PredictorInference.
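As a rough illustration of where the new step sits, the state could look something like the fragment below, written as a Python dict and serialized to JSON. This is a sketch, not the contents of step_function_daily.json: the SSM SDK integration, the AWS-RunShellScript document, and the direct hand-off to PredictorInference are assumptions, and a real definition would presumably poll for command completion before moving on. Only the state names MorningEnrich, PredictorInference, and HandleFailure, the command line, and the `$.trading_instance_id` path come from this PR.

```python
import json

# Hypothetical MorningEnrich state (assumed shape, not the deployed JSON).
morning_enrich_state = {
    "Type": "Task",
    # Assumption: the step sends an SSM command via the AWS SDK integration.
    "Resource": "arn:aws:states:::aws-sdk:ssm:sendCommand",
    "Parameters": {
        "DocumentName": "AWS-RunShellScript",
        "InstanceIds.$": "States.Array($.trading_instance_id)",
        "Parameters": {
            "commands": ["python weekly_collector.py --morning-enrich"],
        },
    },
    # Any error routes to HandleFailure instead of falling through.
    "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "HandleFailure"}],
    # Simplified: completion polling is omitted in this sketch.
    "Next": "PredictorInference",
}

fragment = json.dumps({"MorningEnrich": morning_enrich_state}, indent=2)
print(fragment)
```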

The morning enrichment finds the previous trading day, fetches polygon's grouped-daily for it (hard-fails on PolygonForbiddenError — no yfinance fallback), and re-runs daily_append to overwrite the prior day's ArcticDB row with polygon's authoritative OHLCV+VWAP. Predictor inference is gated on this succeeding — failure routes to HandleFailure, not silent inference on uncorrected data.
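The hard-fail behaviour described above can be sketched as follows. This is a minimal illustration, not the code in weekly_collector.py: `previous_weekday` is a weekend-only approximation of "previous trading day" (market holidays are ignored), `PolygonForbiddenError` is a local stand-in for the collector's exception of the same name, and `fetch_grouped_daily` and `overwrite_rows` are hypothetical callables supplied by the caller.

```python
from datetime import date, timedelta


class PolygonForbiddenError(RuntimeError):
    """Local stand-in for the collector's exception of the same name."""


def previous_weekday(today):
    """Most recent Mon-Fri date strictly before `today` (holidays ignored)."""
    day = today - timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


def morning_enrich(today, fetch_grouped_daily, overwrite_rows):
    """Overwrite the prior day's rows with polygon data, or fail loudly."""
    target = previous_weekday(today)
    # No try/except here on purpose: PolygonForbiddenError propagates to the
    # caller, so there is no fallback source and no silent success.
    bars = fetch_grouped_daily(target)
    overwrite_rows(target, bars)
    return target
```

Because the exception is never caught, the process exits non-zero and the Step Function can route to its failure handler rather than continuing to inference.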

This closes the operational loop on the 2026-04-17→2026-04-23 silent VWAP outage where the EOD yfinance pass was the only source.

Why no other changes

Earlier in the design conversation we discussed deleting the alpha-engine-daily-data systemd timer and retargeting the EOD SF from micro to ae-trading. Both ideas assumed that the EOD SF was the operational EOD path. That is no longer true: alpha-engine PR #94 (2026-04-22) removed _trigger_eod_pipeline from executor/daemon.py, leaving the EOD SF as a manual-only disaster-recovery artifact. The actual canonical EOD flow is:

  • 1:05 PM PT: alpha-engine-daily-data.timer (systemd, ae-trading) runs python weekly_collector.py --daily
  • 1:20 PM PT: alpha-engine-eod.timer (systemd, ae-trading) runs python executor/eod_reconcile.py

The systemd timer + the morning SF step give us the full split-by-source flow with no race. So this PR doesn't touch the systemd units OR the EOD SF — they're either operationally correct already or non-operational.

Sequencing

  1. Merge this PR
  2. Run infrastructure/deploy_step_function_daily.sh to deploy the SF JSON
  3. Verify on Monday's 6:05 AM PT run: MorningEnrich logs show polygon overwrite, ArcticDB has VWAP populated for Friday
  4. One-shot historical backfill: for D in 2026-04-17 2026-04-20 2026-04-21 2026-04-22 2026-04-23 2026-04-24; do python weekly_collector.py --morning-enrich --date $D; done on ae-trading to repair the affected window

Test plan

  • Both SF JSONs parse cleanly (json.load smoke)
  • 153/153 alpha-engine-data unit tests pass
  • After merge + deploy: the next weekday SF run shows MorningEnrich firing successfully
  • Verify ArcticDB row for previous trading day has VWAP populated post-MorningEnrich
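A json.load smoke check of the kind listed in the test plan might look like the sketch below. The check for `StartAt` and `States` goes slightly beyond a bare parse and is an assumption about what is worth asserting; the state names in the demo document are placeholders, not the real definition.

```python
import io
import json


def smoke_check(fh):
    """Parse a Step Function definition and confirm its top-level fields."""
    definition = json.load(fh)  # raises json.JSONDecodeError on malformed JSON
    missing = {"StartAt", "States"} - definition.keys()
    if missing:
        raise ValueError(f"definition is missing fields: {sorted(missing)}")
    return definition


# Demo on an in-memory document; real usage would pass open("step_function_daily.json").
demo = io.StringIO('{"StartAt": "MorningEnrich", "States": {"MorningEnrich": {"Type": "Pass", "End": true}}}')
parsed = smoke_check(demo)
```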

🤖 Generated with Claude Code

cipher813 and others added 2 commits April 24, 2026 13:43
Operational wiring for the split-by-source design from PR 1 (alpha-engine-data #90).

step_function_daily.json (weekday SF, Mon-Fri 6:05 AM PT):
  Insert MorningEnrich SSM step on ae-trading between the trading-day check
  and PredictorInference. Runs:

    python weekly_collector.py --morning-enrich

  which finds the previous trading day, fetches polygon grouped-daily for
  it (hard-fails on PolygonForbiddenError — no yfinance fallback), and
  re-runs daily_append to overwrite the prior day's ArcticDB row with
  polygon's authoritative OHLCV+VWAP. PredictorInference is gated on this
  succeeding — failure routes to HandleFailure, not silent inference on
  uncorrected data (per feedback_no_silent_fails).

  This closes the operational loop on the 2026-04-17→2026-04-23 silent
  VWAP outage where the EOD yfinance pass was the only source and
  ArcticDB's VWAP column stayed universally null across the window.

step_function_eod.json (EOD SF, daemon-shutdown trigger):
  Move PostMarketData from micro to ae-trading (InstanceIds.$ now uses
  $.trading_instance_id; same change for WaitForPostMarketData polling).
  Avoids the OOM regression that originally moved DailyData off micro
  on 2026-04-16. Bumps executionTimeout 180→720 to match observed
  ~7 min runtime + safety margin (15-min window between daemon
  shutdown at 1:15 PM PT and EC2 stop at 1:30 PM PT — 8-10 min usage,
  comfortable margin).

  Simplified the two-command pattern (--only daily_closes + builders.daily_append)
  to a single `python weekly_collector.py --daily` since PR 1 unified
  --daily under source=yfinance_only and the full _run_daily flow now
  does closes + features + append together.

  Comment updated: this SF is now the sole canonical EOD path. The
  alpha-engine-daily-data systemd timer that was racing this SF gets
  deleted in the paired alpha-engine PR.

Validation:
  - Both SF JSONs parse cleanly (json.load smoke check)
  - 153/153 unit tests pass
  - Production validation gated on:
    1. Deploy via infrastructure/deploy_step_function_daily.sh +
       deploy_step_function.sh (or equivalent)
    2. Paired alpha-engine PR deletes systemd timer + retargets
       daemon._trigger_eod_pipeline (or accepts ec2_instance_id field
       being unused)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ally triggered

alpha-engine PR #94 (merged 2026-04-22) removed _trigger_eod_pipeline
from executor/daemon.py — the EOD SF (alpha-engine-eod-pipeline) is no
longer fired from daemon shutdown. The canonical EOD path is now:

  * 1:05 PM PT — alpha-engine-daily-data.timer (systemd, ae-trading)
    runs `python weekly_collector.py --daily` (post-PR-1 = yfinance_only)
  * 1:20 PM PT — alpha-engine-eod.timer (systemd, ae-trading) runs
    `python executor/eod_reconcile.py`

The EOD SF JSON exists only for manual disaster recovery. Modifying it
(moving from micro→trading + simplifying commands) was based on stale
context from earlier in this session — the original "EOD SF as canonical
path" framing was true a week ago but no longer holds.

Reverting keeps the EOD SF unchanged so this PR's scope stays minimal:
just the MorningEnrich SSM step in step_function_daily.json.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 changed the title SF: add MorningEnrich step + move EOD PostMarketData to ae-trading weekday SF: add MorningEnrich step (polygon overwrite before predictor inference) Apr 24, 2026
@cipher813 cipher813 marked this pull request as ready for review April 24, 2026 20:49
@cipher813 cipher813 merged commit b70ba49 into main Apr 24, 2026
1 check passed
@cipher813 cipher813 deleted the feat/morning-enrich-sf-step branch April 24, 2026 20:51
cipher813 added a commit that referenced this pull request Apr 24, 2026
#92)

Surfaced by the 2026-04-24 historical VWAP repair after PR #91 deployed:
ArcticDB's universe_lib.update() raises "index must be monotonic
increasing or decreasing" when asked to insert behind the latest stored
date. daily_append was originally designed for "append today's row at
the head" — fine for the steady-state daily pass but useless for
historical backfills.

This commit adds _write_row_backfill_safe(lib, symbol, new_row,
existing_series=None) which routes by mode:

  * append (target_ts > all existing dates): use lib.update() — fast
    path, single-row write. Same behavior as before for the steady-
    state daily pass.

  * backfill (target_ts ≤ some existing date, including target_ts ==
    latest): read full series, splice in the new row (replacing any
    existing same-date row, matching update() semantics), write the
    monotonic-sorted full series via lib.write(prune_previous_versions=
    True). ~10-100x slower per ticker but only fires for rare
    backfill operations.
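
The routing rule above can be sketched with plain Python data structures. This is an illustration of the append-versus-backfill decision and the splice step only; the real helper works on ArcticDB series, and the function names here (`choose_write_mode`, `splice_row`) are hypothetical.

```python
from datetime import date


def choose_write_mode(existing_dates, target):
    """Return "append" only when `target` is strictly after every stored date."""
    if not existing_dates or target > max(existing_dates):
        return "append"
    # target <= latest, including target == latest, takes the slower safe path.
    return "backfill"


def splice_row(series, target, row):
    """Return a date-sorted copy of `series` with `row` stored at `target`.

    `series` is a list of (date, row) pairs; an existing row for the same
    date is replaced rather than duplicated.
    """
    merged = dict(series)
    merged[target] = row
    return sorted(merged.items())


history = [(date(2026, 4, 16), {"VWAP": None}), (date(2026, 4, 20), {"VWAP": None})]
mode = choose_write_mode([d for d, _ in history], date(2026, 4, 17))
repaired = splice_row(history, date(2026, 4, 17), {"VWAP": 101.5})
```

In the backfill case the whole spliced series would then be rewritten in one call, which is why it costs far more than a single-row update.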

Per-ticker write site rewired to call _write_row_backfill_safe; the
existing `hist` (already read for feature warmup) is passed in as
existing_series so the helper doesn't double-read.

Macro + sector ETF write sites also rewired. Each call's mode is
captured into macro_write_modes so the post-write verification check
can apply the right correctness assertion:

  * append mode → readback last index must equal target_ts (catches
    the 2026-04-15 silent-stale failure)
  * backfill mode → target_ts must be IN the readback index, anywhere
    (the last date is naturally future relative to a backfilled
    historical date)
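
A minimal sketch of the mode-dependent readback check, assuming the readback index is available as a sorted list of dates. `verify_readback` is a hypothetical name, not the function in the codebase.

```python
from datetime import date


def verify_readback(readback_dates, target, mode):
    """Return True when the stored index is consistent with the write mode."""
    if mode == "append":
        # The new row must be the head of the series.
        return bool(readback_dates) and readback_dates[-1] == target
    if mode == "backfill":
        # The new row may sit anywhere; later dates legitimately follow it.
        return target in readback_dates
    raise ValueError(f"unknown write mode: {mode!r}")


index = [date(2026, 4, 16), date(2026, 4, 17), date(2026, 4, 20)]
append_ok = verify_readback(index, date(2026, 4, 20), "append")
backfill_ok = verify_readback(index, date(2026, 4, 17), "backfill")
```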

Existing semantics tests (test_daily_append_semantics.py) updated to
match the new helper-routed call sites while preserving regression
intent — lib.append() must never appear, counters increment after
write, etc.

9 new tests in tests/test_daily_append_backfill_safe.py:
  - append uses lib.update when target after latest
  - append when existing series is empty (first-write-after-empty)
  - first write to nonexistent symbol uses write()
  - backfill uses lib.write when target before latest
  - backfill replaces existing same-date row (semantic match w/ update)
  - backfill target in middle of series — sorts monotonic
  - target == latest takes backfill path (conservative >, not >=)
  - lib.write called with prune_previous_versions=True
  - passing existing_series avoids extra lib.read

Full suite: 162/162 pass.

After deploy to ae-trading: re-run the historical backfill loop
  for D in 2026-04-17 2026-04-20 2026-04-21 2026-04-22 2026-04-23; do
    python weekly_collector.py --morning-enrich --date $D
  done
to repair the universally-NaN VWAP column for the 2026-04-17→2026-04-23
window the polygon outage left behind.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request Apr 27, 2026
2026-04-27 weekday SF dropped through to executor with a 65h-stale
daily_data stamp because _run_morning_enrich (added 2026-04-24 PR #91)
never refreshed it. The post-close DailyData run is the only writer of
the stamp; on Mondays, that's Friday afternoon — well past the executor's
26h staleness gate. Today was the first Monday since MorningEnrich
shipped, which is why this surfaced now.

Fix: on the success path, _run_morning_enrich now calls
_write_module_health(module_name="daily_data", ...) with the polygon
overwrite results in the summary, so the stamp's last_success refreshes
to the morning-enrich run time. Failure paths intentionally leave the
prior stamp untouched — writing a fresh "ok" stamp on failure would mask
outages from the executor's gate. Dry runs also skip the write.
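
The 65h figure and the stamp-writing rule can be reproduced with a short sketch. The run times are approximations taken from the schedules quoted in this thread (1:05 PM PT post-close run, 6:05 AM PT weekday SF), and `is_stale` and `should_write_stamp` are hypothetical helper names.

```python
from datetime import datetime, timedelta

STALENESS_GATE = timedelta(hours=26)  # the executor's gate, per the text above


def is_stale(last_success, now, gate=STALENESS_GATE):
    """True when the stamp is older than the executor's gate."""
    return now - last_success > gate


def should_write_stamp(succeeded, dry_run):
    """Refresh the stamp only after a real, successful run."""
    return succeeded and not dry_run


friday_post_close = datetime(2026, 4, 24, 13, 5)  # last DailyData stamp
monday_sf_run = datetime(2026, 4, 27, 6, 5)       # first Monday with MorningEnrich
stamp_age = monday_sf_run - friday_post_close
print(stamp_age.total_seconds() / 3600)  # hours between the two runs
```

On Tuesday through Friday the same gap is about 17 hours, which is why the gate only tripped on a Monday.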

3 new tests cover: stamp written on success with morning_enrich=true in
the summary, stamp NOT written on PolygonForbiddenError, stamp NOT
written in dry-run. Full suite: 218/218 pass.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request Apr 27, 2026
…ma (#105)

Background — 2026-04-27 EOD-email blackout investigation
========================================================
The structural fix in PR #104 decoupled macro/SPY freshness from
stock-coverage correctness. Validation today exposed a second, latent
issue: with the universe-coverage guard now passing, daily_append's
per-stock writes finally execute — and 100% of them fail with an
ArcticDB schema-mismatch error.

Schema audit (2026-04-27 22:14 UTC) revealed heterogeneous universe state:

  - 816 symbols (~90%): 64 cols, no VWAP at all
  - 88  symbols (~10%): 65 cols, VWAP at idx=64 (appended at end)

daily_append writes via OHLCV_COLS = [Open, High, Low, Close, Volume,
VWAP, ...features], which puts VWAP at idx=5. ArcticDB update() requires
column order match — both schema variants fail. Per-stock universe
writes have therefore been failing since the polygon-VWAP work landed
on 2026-04-24 (PRs #90/#91/#92), masked until today by the macro-coupled
universe-coverage guard.

Operational design (yfinance EOD → polygon morning)
====================================================
- yfinance EOD post-close hook writes daily_closes parquet with
  VWAP=NaN (yfinance does not expose a true VWAP).
- polygon morning enrichment overwrites the parquet with real VWAP
  values from polygon grouped-daily.
- daily_append runs end-of-day and writes whatever VWAP is in the
  parquet to ArcticDB universe — NaN initially, real values after the
  morning enrichment re-runs daily_append.

For that flow to work, VWAP must be a first-class column in the
universe schema with a stable position. This migration normalizes
every symbol to the canonical layout:

    [Open, High, Low, Close, Volume, VWAP] + FEATURES

NaN-fills VWAP historically for the 816 symbols that didn't have it.
Repositions VWAP for the 88 symbols that had it appended at idx=64.
Existing FEATURES block keeps its relative order.

Idempotent — symbols already in canonical order are skipped.
Per-symbol error isolation — one symbol's write failure does not abort
the batch (records into errors[], continues with the rest).
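
The column normalization described above can be sketched on plain lists of column names. This is an illustration only: `PRICE_COLS`, `canonical_column_order`, and `is_canonical` are hypothetical stand-ins for the migration's helpers, the feature names are invented, and the sketch assumes the five OHLCV columns are already present.

```python
PRICE_COLS = ["Open", "High", "Low", "Close", "Volume", "VWAP"]


def canonical_column_order(columns):
    """Price columns first (VWAP at index 5), then features in their existing order."""
    features = [c for c in columns if c not in PRICE_COLS]
    return PRICE_COLS + features


def is_canonical(columns):
    """True when the layout already matches the canonical order."""
    return list(columns) == canonical_column_order(columns)


# Hypothetical feature names, used only to show the two broken variants.
missing_vwap = ["Open", "High", "Low", "Close", "Volume", "rsi_14", "macd"]
appended_vwap = ["Open", "High", "Low", "Close", "Volume", "rsi_14", "macd", "VWAP"]

fixed_missing = canonical_column_order(missing_vwap)
fixed_appended = canonical_column_order(appended_vwap)
```

Both variants converge on the same layout, and a second pass over an already-fixed symbol is a no-op, which is what makes the migration idempotent.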

Tests
=====
- _canonical_column_order: VWAP inserted at idx=5, feature block
  preserved in relative order, drops nothing.
- _is_canonical: recognizes correct layout, rejects appended-VWAP and
  missing-VWAP variants.
- migrate_universe_vwap apply path:
  - Inserts VWAP at idx=5 with FLOAT64 NaN when absent.
  - Relocates VWAP from idx=last when appended (preserving values).
  - Skips already-canonical symbols (idempotent).
  - Honors --tickers override for canary / subset runs.
  - Per-symbol error isolation — partial-status return on partial failure.
- Full suite: 275 tests pass (261 existing + 14 new).

Operational follow-up (not in this PR)
======================================
After merge, deploy + run:
    python -m builders.migrate_universe_vwap --apply
on ae-trading. Expected: 904 symbols migrated (816 + 88), audit JSON
written to s3://alpha-engine-research/builders/migrate_universe_vwap_audit/.
Then rerun alpha-engine-daily-data.service (per-stock writes succeed)
and alpha-engine-eod.service (held-stock close lookups succeed; EOD
email + 2026-04-27 eod_pnl row land).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>