
fix(constituents): include tickers in collect() return dict#135

Merged
cipher813 merged 1 commit into main from fix/constituents-collect-returns-tickers
May 2, 2026
Conversation

@cipher813
Owner

Summary

  • PR #134 (fix(morning_enrich): refresh constituents + prune stragglers before write) added a pre-MorningEnrich preflight that reads cons_result.get("tickers", []) to feed prune_delisted_tickers and the daily_closes request list. But collect() was returning only {"status": "ok", "count": N}, with no tickers. The preflight got [] → "No tickers available for morning enrichment" → SF halt.
  • The 2026-05-02 SF redrive #5 was the live evidence: the architectural fix worked (prune correctly dropped all 8 churn-out stragglers: ASGN, GTM, HOLX, KMPR, LW, MOH, MTCH, PAYC), but the empty ticker list killed the rest of MorningEnrich.
  • Add tickers to both happy-path returns (ok + ok_dry_run). Additive — no caller breaks.
  • _run_phase1's previous workaround was an S3 round-trip (re-reading what was just written). Cleaned up to use const_result["tickers"] directly.
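
The return-shape change in the summary can be sketched as follows. This is a minimal illustration, not the repository's code: `collect_sketch`, its parameters, and `preflight_tickers` are hypothetical names. Only the `status`, `count`, and `tickers` keys and the `ok` / `ok_dry_run` statuses come from this PR.

```python
from __future__ import annotations

from typing import Any


def collect_sketch(tickers: list[str], dry_run: bool = False) -> dict[str, Any]:
    """Hypothetical stand-in for constituents.collect().

    The real function fetches and persists constituents; here the ticker
    list is passed in so the sketch shows only the return contract.
    """
    status = "ok_dry_run" if dry_run else "ok"
    # Before this PR the dict carried only "status" and "count".
    return {"status": status, "count": len(tickers), "tickers": list(tickers)}


def preflight_tickers(cons_result: dict[str, Any]) -> list[str]:
    """Same read as the preflight: a missing key silently degrades to []."""
    return cons_result.get("tickers", [])


old_shape = {"status": "ok", "count": 2}       # pre-fix return shape
new_shape = collect_sketch(["AAPL", "MSFT"])   # post-fix return shape

print(preflight_tickers(old_shape))  # empty list, the condition that halted the SF
print(preflight_tickers(new_shape))
```

Because the consumer uses `.get("tickers", [])`, the missing key never raised; it surfaced only later as an empty request list.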

Test plan

  • New contract test test_collect_returns_tickers_in_dict locks both return shapes — sneaks-back protection for this exact regression class
  • Existing 4 constituents tests + 17 morning_enrich tests still pass
  • Full suite green: 376 passed
  • Live verification via Saturday SF redrive: constituents.collect → 903 tickers in return ✓ → prune drops 8 stragglers ✓ → daily_closes called with 921 tickers ✓ → daily_append clean ✓ → Phase 1 → backfill (PR #130, fix(backfill): apply daily_closes delta + regression preflight) → ArcticDB at 5/1 → postflight → Research / Predictor / Backtester
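
A contract check of the kind the test plan describes might look like the sketch below. The helper name `assert_collect_contract` is hypothetical, and the `count == len(tickers)` invariant is an assumption that the PR text does not state. The actual test, test_collect_returns_tickers_in_dict, is not shown on this page.

```python
def assert_collect_contract(result: dict) -> None:
    """Hypothetical checker for the happy-path return shape of collect()."""
    assert result["status"] in {"ok", "ok_dry_run"}
    # Indexing (not .get) so a dropped key fails loudly with KeyError.
    tickers = result["tickers"]
    assert isinstance(tickers, list)
    assert all(isinstance(t, str) for t in tickers)
    # Assumed invariant, not confirmed by the PR: count matches the list length.
    assert result["count"] == len(tickers)


assert_collect_contract({"status": "ok", "count": 2, "tickers": ["AAPL", "MSFT"]})
assert_collect_contract({"status": "ok_dry_run", "count": 0, "tickers": []})
```

Indexing the key directly, rather than using `.get`, is what lets a test like this catch the regression that the preflight's defaulted read hid.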

🤖 Generated with Claude Code

PR #134's pre-MorningEnrich preflight calls ``constituents.collect()`` and
reads ``cons_result.get("tickers", [])`` to feed prune_delisted_tickers'
``constituents_override`` and the daily_closes request list. But collect()
was returning only ``{"status": "ok", "count": N}`` — no tickers — so the
preflight got [] and MorningEnrich aborted with "No tickers available".

2026-05-02 SF redrive #5 was the live failure: prune correctly dropped
all 8 stragglers (architectural fix worked!), but then no tickers got
fed to daily_closes. The whole MorningEnrich step exited 1.

Add ``tickers`` to both happy-path returns (ok + ok_dry_run). Additive,
no breakage:
- ``_run_phase1`` (the only other caller) previously round-tripped to S3
  to re-read what it just wrote — now uses ``const_result["tickers"]``
  directly.
- The dry-run fork in _run_phase1 (which separately called the private
  ``_fetch_constituents``) is also collapsed.

Contract test in tests/test_constituents_sector_map.py locks both return
shapes — sneaks-back protection for this exact regression class.

376 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit 3fd2b8c into main May 2, 2026
1 check passed
@cipher813 cipher813 deleted the fix/constituents-collect-returns-tickers branch May 2, 2026 15:00
cipher813 added a commit that referenced this pull request May 2, 2026
* feat(preflight): add sf_preflight.py — Saturday SF dry-rehearsal

Predicts whether the Saturday SF would succeed BEFORE launching a spot.
Today's recovery cycle (5 SF redrives, ~5 polygon API calls each) burned
free-tier quota and operator hours discovering bugs sequentially. This
module simulates the critical pre-Phase-1 path against real S3 + ArcticDB
state and reports per-step pass/fail in ~30s with 1 polygon call total.

Eight independent checks, mapped to today's incident stack:

  PR #130 (backfill regression)         → check_backfill_source_freshness
  PR #131 (polygon coverage flake)      → check_polygon_grouped_coverage
  PR #132 (missing-from-closes scoping) → check_predicted_missing_from_closes
  PR #133 (freshness scan scoping)      → check_universe_sample_freshness
  PR #134 (workflow ordering)           → check_universe_drift
  PR #135 (return shape)                → check_constituents_fetch
  Postflight contracts                  → check_postflight_contracts
  ArcticDB reachability                 → check_arctic_connectivity

Each check is a pure function taking a PreflightContext, returning a
CheckResult. The orchestrator runs them all (catching per-check
exceptions so one fail doesn't abort the suite) and emits human or
JSON output. Exit code 1 on any failure.
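
The check/orchestrator pattern described above can be sketched roughly as follows. The names `PreflightContext` and `CheckResult` come from the commit message, but their fields, the status strings, and `run_checks` are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PreflightContext:
    """Shared state handed to every check (field is illustrative)."""
    state: dict = field(default_factory=dict)


@dataclass
class CheckResult:
    name: str
    status: str  # assumed values: "PASS", "WARN", "FAIL"
    detail: str = ""


Check = Callable[[PreflightContext], CheckResult]


def run_checks(checks: list[Check], ctx: PreflightContext) -> tuple[list[CheckResult], int]:
    """Run every check; convert an exception into a FAIL for that check only."""
    results: list[CheckResult] = []
    for check in checks:
        try:
            results.append(check(ctx))
        except Exception as exc:  # one crashing check must not abort the suite
            results.append(
                CheckResult(check.__name__, "FAIL", f"{type(exc).__name__}: {exc}")
            )
    exit_code = 1 if any(r.status == "FAIL" for r in results) else 0
    return results, exit_code


def check_raises(ctx: PreflightContext) -> CheckResult:
    raise RuntimeError("boom")


def check_passes(ctx: PreflightContext) -> CheckResult:
    return CheckResult("check_passes", "PASS")


results, code = run_checks([check_raises, check_passes], PreflightContext())
for r in results:
    print(r.name, r.status, r.detail)
```

In this sketch the second check still runs after the first one raises, and the exit code reflects the failure.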

Two macOS-specific design notes:

1. ArcticDB libs are initialized once in check_arctic_connectivity and
   reused across downstream checks via the context — re-initializing
   adb.Arctic() crashes Aws::S3::S3Client::S3Client on macOS.
2. Checks are ordered with arctic_connectivity FIRST so its bundled AWS
   SDK loads before boto3 (which gets pulled in by collectors imports).

Polygon check skips gracefully (WARN, not FAIL) when POLYGON_API_KEY
is unset — supports laptop-side preflight where the .env isn't loaded.
On the spot the key is present and the check fires.
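
A minimal sketch of the skip-to-WARN guard, assuming the check inspects the environment directly. `polygon_key_guard` and its return shape are hypothetical; only the POLYGON_API_KEY variable and the WARN-not-FAIL behaviour come from the commit message.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional


def polygon_key_guard(env: Optional[Mapping[str, str]] = None) -> tuple[str, str]:
    """Return (status, detail); WARN when the key is missing so laptops can run it."""
    env = os.environ if env is None else env
    if not env.get("POLYGON_API_KEY"):
        return ("WARN", "POLYGON_API_KEY unset; polygon coverage check skipped")
    # With a key present the real check would go on to call polygon.
    return ("PROCEED", "key present; coverage check would run")


print(polygon_key_guard({}))
print(polygon_key_guard({"POLYGON_API_KEY": "dummy"}))
```

Passing the environment as a parameter is a design choice for this sketch only; it makes both branches testable without mutating the process environment.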

18 tests in tests/test_sf_preflight.py: happy path, the failure mode
each check is designed to catch, and orchestrator isolation.

394 tests total.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(sf_preflight): set POLYGON_API_KEY in polygon-coverage tests

CI runs without POLYGON_API_KEY in env, so the no-key skip-to-WARN
guard short-circuited the 3 polygon-coverage tests before they
reached the mocked client. Set the env var via monkeypatch so the
guard passes through to the polygon mock. Also add explicit test
for the no-key path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request May 28, 2026
…across all artifacts (#339)

Closes the gap surfaced 2026-05-28: current-state probe answers
'is the artifact present now?' but operators also need 'did it
land last weekend? are there gaps in the producer's history?'
Filed per the same feedback memory observe_mode_unconditional_gates:
absence-of-artifact is the failure mode, and a single-cycle
absence could be a false positive, whereas a multi-cycle gap is a
real producer regression.

Adds:

- event['mode']='historical' dispatch in handler(). Routes to a
  new _handle_historical(s3, now, started_at, lookback_overrides)
  path that walks the registry, probes the last N cycles per
  artifact, and writes _freshness_monitor/history.json (page 26
  will surface per-row history expanders + gap counts).
- New EB cron alpha-engine-freshness-monitor-historical-cron
  (daily 04:00 UTC, off-peak) wired in deploy.sh --bootstrap.
- Default lookback: 12 saturday_sf + 30 weekday_sf/eod_sf cycles
  (~3 months each). continuous skipped (current-state covers).
  Tunable via event['lookback'] override.
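
Calendar-naive cycle-date resolution for the lookback could be sketched as below. The function names are hypothetical, and whether the probe counts the current day as a cycle is an assumption. The sketch shows only the "last N Saturdays" and "last N weekdays" arithmetic.

```python
from __future__ import annotations

from datetime import date, timedelta


def last_saturdays(today: date, n: int) -> list[date]:
    """Most recent n Saturdays on or before today, newest first."""
    # date.weekday(): Monday=0 ... Saturday=5, Sunday=6
    days_since_saturday = (today.weekday() - 5) % 7
    latest = today - timedelta(days=days_since_saturday)
    return [latest - timedelta(weeks=i) for i in range(n)]


def last_weekdays(today: date, n: int) -> list[date]:
    """Most recent n Mon-Fri dates on or before today, newest first.

    Calendar-naive: market holidays are not excluded.
    """
    out: list[date] = []
    d = today
    while len(out) < n:
        if d.weekday() < 5:
            out.append(d)
        d -= timedelta(days=1)
    return out


today = date(2026, 5, 28)
print(last_saturdays(today, 3))
print(last_weekdays(today, 5))
```

A count of zero returns an empty list from both helpers, which matches the zero-count short-circuit listed among the unit tests.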

403/404/NoSuchKey normalization: S3 returns 403 (not 404) for
missing keys when the Lambda lacks s3:ListBucket. Treat both as
cleanly-absent (no error_code in output) so page 26 doesn't show
spurious '403 errors' on legitimately-absent historical cycles.
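
The normalization can be illustrated with a small pure function. `normalize_probe` and its output keys are hypothetical; the three codes treated as absent are the ones named above. With botocore, the code would typically be read from `ClientError.response["Error"]["Code"]`.

```python
from __future__ import annotations

from typing import Optional

# Codes treated as "cleanly absent", per the commit message.
ABSENT_CODES = {"403", "404", "NoSuchKey"}


def normalize_probe(error_code: Optional[str]) -> dict:
    """Map an S3 probe outcome to a row for the history output (shape assumed)."""
    if error_code is None:
        return {"present": True}
    if error_code in ABSENT_CODES:
        # No error_code key, so the dashboard shows a plain absence.
        return {"present": False}
    # Anything else is a real error and is surfaced as such.
    return {"present": False, "error_code": error_code}


print(normalize_probe(None))
print(normalize_probe("403"))
print(normalize_probe("SlowDown"))
```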

9 new unit tests cover: saturday/weekday/eod cycle-date
resolution, continuous skip, zero-count short-circuit,
date/trading_day/no-placeholder template rendering, and handler
mode-dispatch.

Live smoke (post-deploy + manual invoke):
  n_artifacts=51, n_cycles_probed=474, duration=10.08s

Surfaced 1 real finding for follow-up: several artifacts use
calendar-vs-trading-day-anchored templates that don't match
producer behavior. research_signals is registered as
signals/{date}/signals.json with cadence=saturday_sf, but the
producer writes mostly to Friday trading-day keys (2026-05-22,
2026-05-15, etc.). The historical probe correctly reports the
Saturday keys as absent — which IS the right answer given the
registry template. ROADMAP follow-up filed separately to audit
all registry templates for calendar-vs-trading-day mismatch.
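
The mismatch can be reproduced in a few lines. The template string is the one quoted above; `render_key` is a hypothetical helper, and the Saturday is simply the calendar day after the 2026-05-22 Friday key mentioned in the finding.

```python
from datetime import date, timedelta

TEMPLATE = "signals/{date}/signals.json"


def render_key(template: str, cycle_date: date) -> str:
    """Fill the {date} placeholder with an ISO date."""
    return template.format(date=cycle_date.isoformat())


saturday = date(2026, 5, 23)              # cycle date for cadence=saturday_sf
friday = saturday - timedelta(days=1)     # trading day the producer keys on

probed_key = render_key(TEMPLATE, saturday)
written_key = render_key(TEMPLATE, friday)

print(probed_key)
print(written_key)
print(probed_key == written_key)
```

The probe and the producer resolve the same template to different keys, so the probe reports an absence even though the artifact exists.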

Calendar-naive by design — NYSE holidays surface as
false-positive absent days but operators can interpret in
context. Calendar-aware backfill is a P3 follow-up if the
noise becomes worth the dependency lift.

Composes with the OBSERVATION_REGISTRY arc (#349/#351/#352/#355
+ #135/#136/#137).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>