Feat/sat morning enrich cron 09utc #107

Merged
cipher813 merged 3 commits into main from feat/sat-morning-enrich-cron-09utc
Apr 28, 2026

Conversation

@cipher813
Owner

No description provided.

cipher813 and others added 3 commits April 27, 2026 18:32
Sequential migration hit SSM's 1-hour timeout at 60% complete (542/904
symbols). Each symbol is read → reorder columns → write back, all S3
round-trips, all GIL-released — perfect fit for the thread-pool fan-out
pattern daily_append already uses for its Phase 2 writes.

- One ThreadPoolExecutor across every target symbol
- Worker count env-overridable via MIGRATE_UNIVERSE_VWAP_WORKERS (default 16),
  same shape as DAILY_APPEND_WRITE_WORKERS — prod can tune without redeploy
- Per-symbol outcome dict captures read/write errors instead of raising,
  so one bad symbol can't abort the batch
- Aggregation runs on the main thread (counter mutation stays single-
  threaded; no locks needed)
- Summary includes elapsed_seconds + workers so SSM-timeout-vs-finish
  postmortems can see actual runtime
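
The fan-out described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `migrate_symbol`, `run_migration`, the `CANONICAL` column order, and the in-memory `store` dict (standing in for the S3 round-trips) are all hypothetical. Only the `MIGRATE_UNIVERSE_VWAP_WORKERS` variable, its default of 16, and the `elapsed_seconds` / `workers` summary fields come from the commit message.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical canonical column order; the real one is not shown in this PR.
CANONICAL = ["open", "high", "low", "close", "volume", "vwap"]


def migrate_symbol(symbol, store):
    """Read -> reorder columns -> write back for one symbol.

    Returns an outcome dict instead of raising, so one bad symbol
    cannot abort the whole batch.
    """
    try:
        columns = store[symbol]  # stands in for the S3 read
    except Exception as exc:
        return {"symbol": symbol, "status": "read_error", "error": repr(exc)}
    if columns == CANONICAL:
        # Already migrated: skipping here is what makes reruns resumable.
        return {"symbol": symbol, "status": "already_canonical"}
    try:
        store[symbol] = sorted(columns, key=CANONICAL.index)  # stands in for the S3 write
    except Exception as exc:
        return {"symbol": symbol, "status": "write_error", "error": repr(exc)}
    return {"symbol": symbol, "status": "migrated"}


def run_migration(symbols, store):
    # Worker count is env-overridable, defaulting to 16.
    workers = int(os.environ.get("MIGRATE_UNIVERSE_VWAP_WORKERS", "16"))
    started = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the result iterator exactly once.
        outcomes = list(pool.map(lambda s: migrate_symbol(s, store), symbols))
    # Aggregation happens on the main thread, so the counters need no locks.
    summary = {"migrated": 0, "already_canonical": 0, "read_error": 0, "write_error": 0}
    for outcome in outcomes:
        summary[outcome["status"]] += 1
    summary["workers"] = workers
    summary["elapsed_seconds"] = round(time.monotonic() - started, 3)
    return summary
```

Calling `run_migration` a second time on the same `store` should report every previously migrated symbol as `already_canonical`, which is the resume-after-timeout behaviour the message relies on.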

Tests:
- test_migration_uses_threadpool_executor — source-text invariant
- test_migration_workers_env_overridable — env-var override invariant
- test_migration_threaded_all_writes_succeed — N=20 functional check that
  every result lands in the right bucket (catches generator-double-iter
  regressions)
- test_migration_threaded_summary_includes_elapsed_and_workers — ops field
- test_migration_threaded_aggregates_mixed_outcomes — mixed-outcome run
  (canonical + needs-fix + read-fail + write-fail) all aggregate correctly
- All 280 existing tests still pass

Expected runtime on 904-symbol universe: ~10-15 min at 16 workers (down
from sequential ~120 min). The migration is idempotent — running it
against a partially-migrated universe (already-canonical symbols skipped)
just resumes from where the prior run died.
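
As a back-of-envelope check on those runtime figures, the sketch below linearly extrapolates the sequential runtime from the timeout numbers quoted earlier (542 of 904 symbols in 60 minutes). The helper name `extrapolate_minutes` is hypothetical, and the assumption of uniform per-symbol cost and perfect 16x scaling is an idealisation, not a measurement.

```python
def extrapolate_minutes(done, total, elapsed_minutes):
    """Linear extrapolation: assumes every symbol costs the same wall time."""
    return elapsed_minutes * total / done


# Figures from the commit message: 542/904 symbols done at the 1-hour timeout.
sequential = extrapolate_minutes(done=542, total=904, elapsed_minutes=60)

# Ideal lower bound if 16 workers scaled perfectly (they rarely do).
ideal_parallel = sequential / 16

print(f"sequential ~{sequential:.0f} min, ideal at 16 workers ~{ideal_parallel:.1f} min")
```

The extrapolation lands near 100 minutes sequential, the same order as the ~120 min quoted above, and about 6 minutes under perfect scaling, so the ~10-15 min estimate leaves headroom for S3 latency variance and sub-linear speedup.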

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…C Sat

Two coupled changes that prepare the Saturday Step Function for the
post-Tier-4 re-enable.

(1) Add `--morning-enrich` step to `infrastructure/spot_data_weekly.sh`
    BEFORE Phase 1 + builders.prune_delisted_tickers.

    Polygon's grouped-daily aggregate for date T isn't fully settled
    until calendar day T+1. The Friday weekday-SF run (Friday ~13:05 PT
    via systemd timer + the weekday-SF MorningEnrich Lambda) collects
    daily_closes pre-settlement, so Friday's row in S3 + ArcticDB may
    carry stale / partial polygon data.

    By the time the Saturday SF kicks off (09:00 UTC Sat; see item (2)),
    polygon's Friday data IS settled. This step calls
    `weekly_collector.py --morning-enrich` (same code path the weekday
    SF MorningEnrich Lambda uses, exists since alpha-engine-data#91)
    to refetch Friday's daily_closes via polygon and re-append to
    ArcticDB so all downstream Saturday work (Phase 1 prices, RAG,
    predictor training, backtester) reads polygon-authoritative
    Friday closes.

    Hard-fail on morning_enrich failure: the bundle aborts so RAG +
    Phase 1 don't run on stale upstream data. Matches the no-silent-
    fails posture for unstable system state.

(2) Reschedule EventBridge rule cron from `(0 0 ? * SAT *)` to
    `(0 9 ? * SAT *)`.

    Old: Sat 00:00 UTC = Fri 5pm PDT / 4pm PST. Polygon's Friday data NOT yet
    settled — morning_enrich step (above) would refetch stale / not-
    yet-settled data, defeating its purpose.

    New: Sat 09:00 UTC = Sat 02:00 AM PDT / 01:00 AM PST.
    Polygon T+1 settle complete by 09:00 UTC; morning_enrich pulls
    authoritative Friday closes; downstream work reads correct data.

    Updated in:
      - infrastructure/deploy_step_function.sh:216 (put-rule schedule
        + description) + line 277 (echo'd summary)
      - infrastructure/cloudformation/alpha-engine-orchestration.yaml:111
        (CFN template ScheduleExpression + Description)

    The live AWS rule (currently DISABLED at cron(0 0 ? * SAT *)) still
    needs an `aws events put-rule` to apply this change + an
    `enable-rule` to start firing. CLI-side step ordering: this PR
    captures IaC intent; live rule update happens after merge.
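
The UTC-to-Pacific arithmetic behind the old and new schedules can be checked with a small sketch. The helper `pacific_times` is hypothetical, and fixed UTC-7 / UTC-8 offsets are used in place of a tz database lookup; 2026-05-02 is chosen only because it is a Saturday.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for illustration; a production check would use a tz database.
PDT = timezone(timedelta(hours=-7), "PDT")
PST = timezone(timedelta(hours=-8), "PST")


def pacific_times(utc_hour):
    """Map a Saturday UTC hour to its local day/time under PDT and PST."""
    saturday_utc = datetime(2026, 5, 2, utc_hour, 0, tzinfo=timezone.utc)
    return {
        tz.tzname(None): saturday_utc.astimezone(tz).strftime("%a %H:%M")
        for tz in (PDT, PST)
    }


print("old cron(0 0 ? * SAT *):", pacific_times(0))
print("new cron(0 9 ? * SAT *):", pacific_times(9))
```

The old schedule falls on Friday afternoon Pacific time in both offsets, while the new one falls in the early hours of Saturday, after the calendar day has rolled over.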

Tests: 280 unit tests pass (no test surface for the spot script
itself; bash -n syntax-check clean). morning_enrich-related tests
(test_weekly_collector_morning_enrich.py, test_daily_closes_source_modes.py)
verify the underlying CLI semantics this step depends on.

Companion changes (separate repos):
  - alpha-engine-backtester: flip use_vectorized_sweep default-on
    (PR #123-ish, this same session)
  - alpha-engine-docs: SYSTEM_STATE.md Tier 4 deploy entry + Sat-fill
    addition + new cron rationale
  - CLI ops: aws events put-rule --schedule-expression "cron(0 9 ? * SAT *)"
    + aws events enable-rule (final step that actually starts firing)

Closes ROADMAP P0 "Re-enable Saturday SF EventBridge after Tier 4
lands" (added 2026-04-27, 5-PR Tier 4 deployment arc closes here).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit f741a5b into main Apr 28, 2026
1 check passed
@cipher813 cipher813 deleted the feat/sat-morning-enrich-cron-09utc branch April 28, 2026 20:55