Feat/sat morning enrich cron 09utc #107
Merged
Sequential migration hit SSM's 1-hour timeout at 60% complete
(542/904 symbols). Each symbol is read → reorder columns → write
back, all S3 round-trips, all GIL-released: a perfect fit for the
thread-pool fan-out pattern daily_append already uses for its
Phase 2 writes.
- One ThreadPoolExecutor across every target symbol
- Worker count env-overridable via MIGRATE_UNIVERSE_VWAP_WORKERS
(default 16), same shape as DAILY_APPEND_WRITE_WORKERS, so prod
can tune without redeploy
- Per-symbol outcome dict captures read/write errors instead of
raising, so one bad symbol can't abort the batch
- Aggregation runs on the main thread (counter mutation stays
single-threaded; no locks needed)
- Summary includes elapsed_seconds + workers so SSM-timeout-vs-finish
postmortems can see actual runtime
Tests:
- test_migration_uses_threadpool_executor: source-text invariant
- test_migration_workers_env_overridable: env-var override invariant
- test_migration_threaded_all_writes_succeed: N=20 functional check
that every result lands in the right bucket (catches
generator-double-iter regressions)
- test_migration_threaded_summary_includes_elapsed_and_workers: ops
field
- test_migration_threaded_aggregates_mixed_outcomes: mixed-outcome
run (canonical + needs-fix + read-fail + write-fail) all aggregate
correctly
- All 280 existing tests still pass
Expected runtime on 904-symbol universe: ~10-15 min at 16 workers
(down from sequential ~120 min). The migration is idempotent:
running it against a partially-migrated universe (already-canonical
symbols skipped) just resumes from where the prior run died.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
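The fan-out described in this commit can be sketched roughly as follows. This is a minimal illustration, not the migration's actual code: `migrate_symbol`, `run_migration`, the `read`/`write` callables, the `"canonical"` flag, and the status names are all hypothetical. Only the `MIGRATE_UNIVERSE_VWAP_WORKERS` variable, its default of 16, and the `elapsed_seconds` / `workers` summary fields come from the commit message.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def migrate_symbol(symbol, read, write):
    """Migrate one symbol and return an outcome dict instead of raising."""
    try:
        frame = read(symbol)
    except Exception as exc:
        return {"symbol": symbol, "status": "read_failed", "error": str(exc)}
    if frame["canonical"]:
        # Already in canonical column order: skipping makes reruns idempotent.
        return {"symbol": symbol, "status": "already_canonical"}
    try:
        write(symbol, frame)  # hypothetical: column reordering happens here
    except Exception as exc:
        return {"symbol": symbol, "status": "write_failed", "error": str(exc)}
    return {"symbol": symbol, "status": "migrated"}


def run_migration(symbols, read, write):
    """Fan out over all symbols with one pool; aggregate on the main thread."""
    workers = int(os.environ.get("MIGRATE_UNIVERSE_VWAP_WORKERS", "16"))
    started = time.monotonic()
    summary = {
        "migrated": 0,
        "already_canonical": 0,
        "read_failed": 0,
        "write_failed": 0,
    }
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Workers only return dicts; the counters below are mutated by the
        # main thread alone, so no lock is needed.
        for outcome in pool.map(lambda s: migrate_symbol(s, read, write), symbols):
            summary[outcome["status"]] += 1
    summary["workers"] = workers
    summary["elapsed_seconds"] = round(time.monotonic() - started, 3)
    return summary
```

Because each worker returns an outcome dict rather than raising, a single failing symbol shows up as a `read_failed` or `write_failed` count in the summary while the remaining symbols still complete.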
…C Sat
Two coupled changes that prepare the Saturday Step Function for the
post-Tier-4 re-enable.
(1) Add `--morning-enrich` step to `infrastructure/spot_data_weekly.sh`
BEFORE Phase 1 + builders.prune_delisted_tickers.
Polygon's grouped-daily aggregate for date T isn't fully settled
until calendar day T+1. The Friday weekday-SF run (Friday ~13:05 PT
via systemd timer + the weekday-SF MorningEnrich Lambda) collects
daily_closes pre-settlement, so Friday's row in S3 + ArcticDB may
carry stale / partial polygon data.
By the time the Saturday SF kicks off (09:00 UTC Sat; see (2) below),
polygon's Friday data IS settled. This step calls
`weekly_collector.py --morning-enrich` (same code path the weekday
SF MorningEnrich Lambda uses, exists since alpha-engine-data#91)
to refetch Friday's daily_closes via polygon and re-append to
ArcticDB so all downstream Saturday work (Phase 1 prices, RAG,
predictor training, backtester) reads polygon-authoritative
Friday closes.
Hard-fail on morning_enrich failure: the bundle aborts so RAG +
Phase 1 don't run on stale upstream data. Matches the no-silent-
fails posture for unstable system state.
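The hard-fail ordering in (1) amounts to "stop the bundle at the first non-zero exit". The real step lives in a bash script; the Python sketch below only illustrates the control flow, and `run_bundle`, `BundleAborted`, and the step names in the usage note are hypothetical.

```python
import subprocess


class BundleAborted(RuntimeError):
    """Raised when a step exits non-zero; later steps are never started."""

    def __init__(self, step, returncode, completed):
        super().__init__(f"{step} exited {returncode}; aborting bundle")
        self.step = step
        self.returncode = returncode
        self.completed = completed


def run_bundle(steps):
    """Run (name, argv) steps in order and abort on the first failure."""
    completed = []
    for name, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            # No retry and no "continue anyway": downstream steps must not
            # run against stale upstream data.
            raise BundleAborted(name, result.returncode, completed)
        completed.append(name)
    return completed
```

With a step list such as `[("morning_enrich", [...]), ("phase1", [...]), ("rag", [...])]`, a failing `morning_enrich` raises before `phase1` or `rag` is ever launched.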
(2) Reschedule EventBridge rule cron from `(0 0 ? * SAT *)` to
`(0 9 ? * SAT *)`.
Old: Sat 00:00 UTC = Fri ~5pm PT. Polygon's Friday data NOT yet
settled — morning_enrich step (above) would refetch stale / not-
yet-settled data, defeating its purpose.
New: Sat 09:00 UTC = 02:00 AM PT Sat (PDT) / 01:00 AM PT (PST).
Polygon T+1 settle complete by 09:00 UTC; morning_enrich pulls
authoritative Friday closes; downstream work reads correct data.
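The UTC-to-Pacific arithmetic behind the old and new schedules can be checked with a short sketch. It uses fixed offsets (PDT = UTC-7, PST = UTC-8) rather than a timezone database, and an arbitrary reference Saturday (2026-05-02); `local_fire_times` is a hypothetical helper for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, applied to the same reference date purely for comparison.
PDT = timezone(timedelta(hours=-7), "PDT")
PST = timezone(timedelta(hours=-8), "PST")


def local_fire_times(utc_hour):
    """Return the Pacific wall-clock time of a Saturday fire at utc_hour UTC."""
    fire = datetime(2026, 5, 2, utc_hour, tzinfo=timezone.utc)  # a Saturday
    return {
        tz.tzname(None): fire.astimezone(tz).strftime("%a %H:%M")
        for tz in (PDT, PST)
    }


print(local_fire_times(0))  # old rule: cron(0 0 ? * SAT *)
print(local_fire_times(9))  # new rule: cron(0 9 ? * SAT *)
```

The old rule lands on Friday afternoon Pacific time (17:00 PDT / 16:00 PST), while the new rule lands early Saturday morning (02:00 PDT / 01:00 PST), which is the gap the reschedule relies on.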
Updated in:
- infrastructure/deploy_step_function.sh:216 (put-rule schedule
+ description) + line 277 (echo'd summary)
- infrastructure/cloudformation/alpha-engine-orchestration.yaml:111
(CFN template ScheduleExpression + Description)
The live AWS rule (currently DISABLED at cron(0 0 ? * SAT *)) still
needs an `aws events put-rule` to apply this change + an
`enable-rule` to start firing. CLI-side step ordering: this PR
captures IaC intent; live rule update happens after merge.
Tests: 280 unit tests pass (no test surface for the spot script
itself; bash -n syntax-check clean). morning_enrich-related tests
(test_weekly_collector_morning_enrich.py, test_daily_closes_source_modes.py)
verify the underlying CLI semantics this step depends on.
Companion changes (separate repos):
- alpha-engine-backtester: flip use_vectorized_sweep default-on
(PR #123-ish, this same session)
- alpha-engine-docs: SYSTEM_STATE.md Tier 4 deploy entry + Sat-fill
addition + new cron rationale
- CLI ops: aws events put-rule --schedule-expression "cron(0 9 ? * SAT *)"
+ aws events enable-rule (final step that actually starts firing)
Closes ROADMAP P0 "Re-enable Saturday SF EventBridge after Tier 4
lands" (added 2026-04-27, 5-PR Tier 4 deployment arc closes here).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
No description provided.