fix(daily_append): scope missing-from-closes check to caller's request list #132
Merged
…t list

MorningEnrich's missing-from-closes hard-fail was tripping every S&P churn week. Today (2026-05-02): 8 tickers got dropped from the index this past week (ASGN, GTM, HOLX, KMPR, LW, MOH, MTCH, PAYC). They're still in the ArcticDB universe (awaiting the next prune cycle); they're absent from the new constituents.json Phase 1 wrote at 09:20; MorningEnrich no longer requests them from polygon. The pre-existing check (arctic_universe - closes) saw 8 churn-outs + 4 chronic polygon-coverage gaps = 12 missing > threshold of 5 → SF halt at MorningEnrich.

Add an optional ``expected_tickers`` parameter to ``daily_append``. When the caller passes its request list, the check scopes to ``arctic ∩ expected`` instead of the full ArcticDB universe. Tickers absent from the request (S&P churn-out stragglers) are excluded from the alarm and logged at INFO so operators see drift building up between prune cycles.

Backward compatible: callers that don't pass it retain the prior whole-universe behavior.

Both call sites (``_run_morning_enrich`` and ``_run_daily``) now pass their constituents-derived ticker list. The ticker list was already in scope at both sites.

Net effect on the 2026-05-02 SF redrive: missing-from-closes count drops 12 → 4 (only the chronic BF-B/BRK-B/MOG-A/PSTG remain, which is under the threshold of 5 and so takes the WARN-only path), MorningEnrich completes, Phase 1 runs, PR #130's backfill regression preflight passes, ArcticDB lands at 5/1, postflight passes, downstream Research/Predictor Training/Backtester all run.

5 new tests in tests/test_daily_append_missing_from_closes.py (360 total) cover:
- stragglers excluded
- real constituents-gap still raises
- caret-prefix stripping
- None preserves legacy behavior
- straggler-count INFO log fires

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
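The scoping described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `check_missing_from_closes`, the `MISSING_THRESHOLD` constant, the strict `>` comparison, and the `AAPL`/`MSFT` placeholder tickers are all assumptions. Only the `expected_tickers` parameter, the `arctic ∩ expected` scoping, the caret-prefix stripping, the INFO log for stragglers, and the ticker names from the incident come from the text above.

```python
import logging

logger = logging.getLogger("daily_append_sketch")

# Assumed: hard-fail only when the missing count exceeds 5 ("12 missing > threshold of 5").
MISSING_THRESHOLD = 5


def check_missing_from_closes(arctic_universe, closes, expected_tickers=None):
    """Return the sorted list of in-scope symbols that have no close today."""
    arctic = set(arctic_universe)
    if expected_tickers is None:
        # Legacy behavior: audit the whole ArcticDB universe.
        scope = arctic
    else:
        # Assumed: request lists may carry caret-prefixed symbols; strip the prefix.
        expected = {t.lstrip("^") for t in expected_tickers}
        scope = arctic & expected
        stragglers = arctic - expected
        if stragglers:
            logger.info(
                "%d ArcticDB symbols absent from the request list (excluded): %s",
                len(stragglers), sorted(stragglers),
            )
    missing = sorted(scope - set(closes))
    if len(missing) > MISSING_THRESHOLD:
        raise RuntimeError(f"{len(missing)} symbols missing from closes: {missing}")
    if missing:
        logger.warning("missing from closes (under threshold): %s", missing)
    return missing


# The 2026-05-02 situation, with two placeholder tickers that did get closes.
STRAGGLERS = ["ASGN", "GTM", "HOLX", "KMPR", "LW", "MOH", "MTCH", "PAYC"]
CHRONIC = ["BF-B", "BRK-B", "MOG-A", "PSTG"]
COVERED = ["AAPL", "MSFT"]  # hypothetical, for illustration only

if __name__ == "__main__":
    arctic = STRAGGLERS + CHRONIC + COVERED
    # Scoped call: only the 4 chronic gaps are counted, so nothing is raised.
    print(check_missing_from_closes(arctic, COVERED, expected_tickers=CHRONIC + COVERED))
```

Calling the same function without `expected_tickers` on this data counts all 12 symbols and raises, which mirrors the halt described above.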
This was referenced May 2, 2026
cipher813 added a commit that referenced this pull request May 2, 2026
#133)

Companion to PR #132. The pre-write missing-from-closes check correctly excluded 8 S&P churn-out stragglers (ASGN, GTM, HOLX, KMPR, LW, MOH, MTCH, PAYC) on today's redrive, and daily writes completed cleanly (n_ok=898). Then ``_scan_universe_and_emit_freshness_receipt``, the post-write scan that audits every ArcticDB universe symbol's last-row date, re-tripped on the same 8 stragglers (HOLX 25d stale, the rest 8d stale) and halted the SF a 4th time.

Plumb ``expected_tickers`` through to the freshness scan with the same semantics as the pre-write check: scope to ``arctic ∩ expected``, exclude stragglers, log them at INFO so operators see drift building up between prune cycles. A genuinely-stale symbol that IS in expected_tickers still raises (silent-fail rule preserved). Empty intersection raises loudly so a misconfigured caller can't silently emit a meaningless all-fresh receipt.

Backward compatible: expected_tickers=None preserves the prior whole-library scan.

The only call site (``daily_append`` line 1013) passes its own expected_tickers parameter through. No new wiring needed in weekly_collector.py.

6 new tests in tests/test_daily_append_universe_freshness.py cover:
- stragglers excluded
- stale-in-expected still raises
- INFO log fires
- caret-prefix stripping
- None preserves legacy
- empty-intersection raises loudly

366 tests total.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
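A rough sketch of the scoped freshness scan, under stated assumptions: the function name `scan_universe_freshness`, the shape of `last_dates` (symbol mapped to last-row date), the returned receipt fields, and the `>= 5` day staleness cutoff are illustrative choices, not the PR's implementation. The empty-intersection raise and the rule that a stale symbol inside `expected_tickers` still raises follow the commit message above.

```python
from datetime import date, timedelta


def scan_universe_freshness(last_dates, as_of, expected_tickers=None, max_stale_days=5):
    """Raise if any in-scope symbol is stale; otherwise return a small receipt."""
    symbols = set(last_dates)
    if expected_tickers is None:
        # Legacy behavior: scan every symbol in the library.
        scoped = symbols
    else:
        expected = {t.lstrip("^") for t in expected_tickers}
        scoped = symbols & expected
        if not scoped:
            # A misconfigured caller must not get an "all fresh" receipt for nothing.
            raise ValueError("expected_tickers shares no symbols with the library")
    ages = {s: (as_of - last_dates[s]).days for s in scoped}
    stale = {s: d for s, d in ages.items() if d >= max_stale_days}
    if stale:
        raise RuntimeError(f"stale symbols (days behind): {stale}")
    return {"n_checked": len(scoped), "n_excluded": len(symbols - scoped)}


# Illustrative data: one fresh symbol plus two of the stragglers named above.
AS_OF = date(2026, 5, 2)
LAST_DATES = {
    "AAPL": AS_OF - timedelta(days=1),   # hypothetical fresh symbol
    "HOLX": AS_OF - timedelta(days=25),  # 25d stale, per the incident
    "MOH": AS_OF - timedelta(days=8),    # 8d stale, per the incident
}

if __name__ == "__main__":
    # Stragglers are outside the request list, so the scoped scan passes.
    print(scan_universe_freshness(LAST_DATES, AS_OF, expected_tickers=["AAPL"]))
```

Passing `expected_tickers=["AAPL", "MOH"]` on the same data still raises, since MOH is both expected and stale.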
cipher813 added a commit that referenced this pull request May 2, 2026
…rite (#134)

Architectural fix for the 2026-05-02 SF-halt class. PR #132 + PR #133, shipped earlier today, scope the missing-from-closes + freshness checks to the caller's expected_tickers list, gracefully tolerating S&P churn-out stragglers that linger in ArcticDB awaiting the post-Phase-1 prune cycle. Those bandages are sound but they're now the load-bearing path for every MorningEnrich invocation.

Reorder MorningEnrich to make the universe coherent BEFORE any check fires:

1. constituents.collect(): pulls fresh S&P 500/400 from Wikipedia, writes the new constituents.json + sector_map.json. Hard-fails the step on any error (Wikipedia outage / sector-mapping completeness) per feedback_no_silent_fails, because daily_closes can't proceed against stale tickers.
2. prune_delisted_tickers(constituents_override=fresh_set, absent_days=5, apply=True): drops ArcticDB stragglers absent from the fresh constituents and ≥5 days stale (matching the freshness scan threshold). Best-effort: a prune failure logs ERROR and lands a ``prune_preflight_warning`` entry on the result, but does NOT block the rest of MorningEnrich (PR #132/#133 still tolerate stragglers as fallback).
3. Existing daily_closes + daily_append flow runs against a coherent universe. The bandage scoping in PR #132/#133 becomes a quiet no-op for the happy path.

The new ``constituents_override`` parameter on prune_delisted_tickers swaps the freshness reference without updating the public ``latest_weekly.json`` pointer (which has cross-module read fan-out: alternative.py, macro.py, features/compute.py all depend on it). Mutually exclusive with ``tickers_override``.

prune_delisted_tickers also runs at its existing post-Phase-1 site with the conservative 14d default, catching any newcomers the SF picked up between MorningEnrich and Phase 1.

9 new tests:
- 5 in tests/test_weekly_collector_morning_enrich.py: refreshes constituents before collect / prunes before daily_append / aborts on constituents failure / continues on prune failure / dry-run skips preflight writes
- 4 in tests/test_prune_delisted_tickers.py: constituents_override uses in-process set / accepts list-or-set / still gates on last_date / mutually-exclusive with tickers_override

369 tests total.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
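The ordering and failure semantics of the three steps above can be sketched as below. This is a hedged illustration only: `run_morning_enrich` and its three injected callables are hypothetical stand-ins for the real collectors, and the result dictionary's `"daily"` key is an assumption. The keyword arguments passed to the prune callable and the ``prune_preflight_warning`` key are taken from the commit message.

```python
import logging

logger = logging.getLogger("morning_enrich_sketch")


def run_morning_enrich(collect_constituents, prune_delisted, run_daily_flow):
    """Run the reordered steps; the callables are injected so the sketch is testable."""
    result = {}

    # Step 1: hard-fail. Any exception from the constituents refresh propagates.
    fresh_set = set(collect_constituents())

    # Step 2: best-effort. A prune failure is logged and recorded, not re-raised.
    try:
        prune_delisted(constituents_override=fresh_set, absent_days=5, apply=True)
    except Exception as exc:
        logger.error("prune preflight failed: %s", exc)
        result["prune_preflight_warning"] = str(exc)

    # Step 3: the existing daily flow runs against the refreshed universe.
    result["daily"] = run_daily_flow(expected_tickers=sorted(fresh_set))
    return result
```

Injecting the steps as callables keeps the ordering logic separate from network and storage access, which is one way the "aborts on constituents failure" and "continues on prune failure" cases could be exercised with simple fakes.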
3 tasks
cipher813 added a commit that referenced this pull request May 2, 2026
* feat(preflight): add sf_preflight.py - Saturday SF dry-rehearsal

Predicts whether the Saturday SF would succeed BEFORE launching a spot. Today's recovery cycle (5 SF redrives, ~5 polygon API calls each) burned free-tier quota and operator hours discovering bugs sequentially. This module simulates the critical pre-Phase-1 path against real S3 + ArcticDB state and reports per-step pass/fail in ~30s with 1 polygon call total.

Eight independent checks, mapped to today's incident stack:

- PR #130 (backfill regression) → check_backfill_source_freshness
- PR #131 (polygon coverage flake) → check_polygon_grouped_coverage
- PR #132 (missing-from-closes scoping) → check_predicted_missing_from_closes
- PR #133 (freshness scan scoping) → check_universe_sample_freshness
- PR #134 (workflow ordering) → check_universe_drift
- PR #135 (return shape) → check_constituents_fetch
- Postflight contracts → check_postflight_contracts
- ArcticDB reachability → check_arctic_connectivity

Each check is a pure function taking a PreflightContext, returning a CheckResult. The orchestrator runs them all (catching per-check exceptions so one fail doesn't abort the suite) and emits human or JSON output. Exit code 1 on any failure.

Two macOS-specific design notes:

1. ArcticDB libs are initialized once in check_arctic_connectivity and reused across downstream checks via the context, because re-initializing adb.Arctic() crashes Aws::S3::S3Client::S3Client on macOS.
2. Checks are ordered with arctic_connectivity FIRST so its bundled AWS SDK loads before boto3 (which gets pulled in by collectors imports).

Polygon check skips gracefully (WARN, not FAIL) when POLYGON_API_KEY is unset, which supports laptop-side preflight where the .env isn't loaded. On the spot the key is present and the check fires.

18 tests in tests/test_sf_preflight.py: happy path + each failure mode each check is designed to catch + orchestrator isolation. 394 tests total.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(sf_preflight): set POLYGON_API_KEY in polygon-coverage tests

CI runs without POLYGON_API_KEY in env, so the no-key skip-to-WARN guard short-circuited the 3 polygon-coverage tests before they reached the mocked client. Set the env var via monkeypatch so the guard passes through to the polygon mock. Also add explicit test for the no-key path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
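The orchestrator pattern described in the first commit above (independent checks, per-check exception isolation, WARN versus FAIL, exit code 1 on any failure) might look something like this. `PreflightContext` and `CheckResult` are names from the commit message, but their fields here are assumptions, as are `run_checks`, `to_json`, and the two toy checks; this is not the contents of sf_preflight.py.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PreflightContext:
    # Assumed: a shared bag for handles created once and reused by later checks.
    shared: dict = field(default_factory=dict)


@dataclass
class CheckResult:
    name: str
    status: str  # assumed vocabulary: "PASS", "WARN" or "FAIL"
    detail: str = ""


def run_checks(checks, ctx):
    """Run every check in order; an exception becomes a FAIL result, not an abort."""
    results = []
    for check in checks:
        name = getattr(check, "__name__", repr(check))
        try:
            results.append(check(ctx))
        except Exception as exc:
            results.append(CheckResult(name, "FAIL", f"{type(exc).__name__}: {exc}"))
    exit_code = 1 if any(r.status == "FAIL" for r in results) else 0
    return results, exit_code


def to_json(results):
    """Serialize results for machine-readable output."""
    return json.dumps([asdict(r) for r in results])


# Two toy checks, purely illustrative.
def check_connectivity(ctx):
    ctx.shared["lib"] = object()  # stand-in for a handle initialized once
    return CheckResult("check_connectivity", "PASS")


def check_api_key(ctx):
    # WARN rather than FAIL models the "skip gracefully" behavior described above.
    return CheckResult("check_api_key", "WARN", "key unset, check skipped")


if __name__ == "__main__":
    results, code = run_checks([check_connectivity, check_api_key], PreflightContext())
    print(to_json(results), code)
```

Because WARN results do not affect the exit code in this sketch, a skipped check leaves the run green while still appearing in the output.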
Summary
- Add optional `expected_tickers` to `daily_append`. When the caller passes its request list, the check scopes to `arctic ∩ expected` instead of the full ArcticDB universe. Stragglers are excluded from the alarm but logged at INFO so operators see drift building up between prune cycles.
- Backward compatible: callers that don't pass `expected_tickers` retain the prior whole-universe behavior.
- Both call sites (`_run_morning_enrich`, `_run_daily`) updated; the ticker list was already in scope at both.

Net effect on the 2026-05-02 redrive: missing-from-closes count drops 12 → 4 (only the chronic BF-B/BRK-B/MOG-A/PSTG remain, which is under the threshold of 5 and so takes the WARN-only path).
Test plan
- 5 new tests in `tests/test_daily_append_missing_from_closes.py` cover: stragglers excluded; real constituents-gap still raises; caret-prefix stripping; `None` preserves legacy behavior; straggler-count INFO log fires

🤖 Generated with Claude Code