fix(daily_closes): date-bound FRED fetcher + repair for 14-BDay clobber #219
Merged
…clobber

The windowed-reconciliation cutover (PRs #199/#200/#201 + alpha-engine-config flip to daily_closes:{window_days:14, skip_if_canonical:true}, activated 2026-05-11) amplified a latent bug in _fetch_fred_closes: the FRED query used sort_order=desc + limit=5 with no upper bound, so per-date calls across the rolling window all returned today's most-recent observation. Every historical date's parquet got today's VIX/VIX3M/TNX/IRX/TWO/HYOAS/BAA10Y stamped on it, clobbering correct historical closes.

FlowDoctor surfaced the regression 2026-05-12 ~13:01/13:04 UTC with paired "polygon_only OVERWRITE VIX" ERROR alerts for 2026-04-22 and 2026-04-28, both showing identical pre (18.36) and post (17.19) closes, the signature of "every per-date stamp got today's latest".

Fix:
- _fetch_fred_closes sends observation_end=date_str so per-date calls return that date's actual FRED observation (or the most-recent on-or-before observation for the same-day case where FRED hasn't published yet; this preserves the legacy "today's parquet carries yesterday's FRED close" semantic).
- Defensive guard refuses to write a future-dated observation if FRED somehow returns one despite observation_end.

Repair tool (collectors/daily_closes_fred_repair.py) re-fetches correct FRED values across an operator-specified window and rewrites only the FRED-ticker rows of each affected daily_closes parquet. Polygon stock rows are untouched (their fetcher was always per-date-correct). Idempotent.

Tests:
- +7 per-date regression tests pinning observation_end + same-day fallback + future-date refusal + missing-value skip
- +11 repair tests covering business-day enumeration + on-or-before lookup + idempotent no-op + dry-run + missing-parquet skip
- Suite 774 → 792.

Operator follow-up: after merge, run

    python -m collectors.daily_closes_fred_repair \
        --bucket alpha-engine-research \
        --start 2026-04-22 --end 2026-05-12 [--dry-run]

to repair the clobbered window before tomorrow's MorningEnrich (which now writes correct per-date FRED values going forward).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request May 12, 2026
#220)

``requests.exceptions.HTTPError.__str__()`` embeds the full request URL, including ``api_key=<credential>``, in its message. The existing ``_fetch_fred_closes`` except-block logged that raw message via ``logger.warning("FRED fetch failed for %s (%s): %s", t, sid, e)``, so any FRED 5xx in production MorningEnrich / EOD dumped the credential to CloudWatch.

Surfaced 2026-05-12 during the post-merge repair run for PR #219: a transient FRED 500 on VXVCLS during the operator step leaked the key to the conversation transcript. The FRED key is rotated separately; this commit closes the leak class.

Changes:
- ``_scrub_api_key(msg) -> str`` helper in ``collectors/daily_closes.py`` masks the ``api_key=...`` querystring fragment with ``api_key=***``
- ``_fetch_fred_closes`` warn-log routes the exception through the scrubber
- Repair script's ``_fetch_fred_range`` already added retry+scrub in the cherry-pick from the live-debug session; this commit consolidates by importing the shared ``_scrub_api_key`` from ``daily_closes`` so the regex doesn't drift across files

Tests (+7, suite 792 → 799):
- ``_scrub_api_key`` masks URL-embedded keys, handles exception objects directly, passes through unchanged when no key is present, terminates at ``&``
- ``_fetch_fred_closes`` warn-log scrub regression
- ``_fetch_fred_range`` retry-warn + final-error scrub regressions

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
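The masking behaviour described in this commit message can be sketched as below. This is a minimal illustration, not the code from ``collectors/daily_closes.py``: the name ``scrub_api_key`` and the exact regex are assumptions, chosen only to match the stated behaviour (mask the value after ``api_key=``, stop at ``&``, accept exception objects, pass through text without a key).

```python
import re

# Assumed pattern: everything after "api_key=" up to the next "&" or whitespace.
_API_KEY_RE = re.compile(r"(api_key=)[^&\s]+")


def scrub_api_key(msg: object) -> str:
    """Return str(msg) with any api_key=<value> fragment masked as api_key=***.

    Accepts any object (for example an exception) because it stringifies first.
    """
    return _API_KEY_RE.sub(r"\1***", str(msg))


if __name__ == "__main__":
    # Hypothetical URL and key, for illustration only.
    err = ValueError(
        "500 Server Error for url: "
        "https://api.stlouisfed.org/fred/series/observations"
        "?series_id=VXVCLS&api_key=not-a-real-key&file_type=json"
    )
    print(scrub_api_key(err))
```

Stringifying inside the helper means a call site such as ``logger.warning("... %s", scrub_api_key(e))`` never hands the raw exception to the logger.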
Summary
Closes the FlowDoctor `polygon_only OVERWRITE VIX` ERROR alerts that fired 2026-05-12 ~13:01 / ~13:04 UTC on MorningEnrich. The windowed-reconciliation cutover (PRs #199/#200/#201 + alpha-engine-config flip to `daily_closes: {window_days: 14, skip_if_canonical: true}`, activated 2026-05-11) amplified a latent bug in `_fetch_fred_closes`: the FRED query used `sort_order=desc, limit=5` with no upper bound, so per-date calls across the rolling 14-BDay window all returned today's most-recent observation. Every historical date's `staging/daily_closes/{date}.parquet` got today's VIX/VIX3M/TNX/IRX/TWO/HYOAS/BAA10Y stamped on it, clobbering correct historical closes.

Identifying signature in this morning's alerts: 2026-04-22 and 2026-04-28 both showed identical pre (18.36) and post (17.19) closes. Two different trading days can't legitimately have the same VIX close to the cent.
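A minimal sketch of the date-bounded query may help make the bug concrete. It is not the code in `_fetch_fred_closes`: the helper names `build_fred_params` and `pick_close` are hypothetical, and skipping a future-dated row (rather than aborting) is an assumption about how the guard behaves. The endpoint and the `series_id`, `sort_order`, `limit`, `observation_end` parameters are FRED's own, and FRED reports a missing value as `"."`.

```python
from __future__ import annotations

from datetime import date
from typing import Optional

FRED_URL = "https://api.stlouisfed.org/fred/series/observations"


def build_fred_params(series_id: str, api_key: str, date_str: str) -> dict:
    """Query params for one per-date call, bounded above by date_str."""
    return {
        "series_id": series_id,
        "api_key": api_key,
        "file_type": "json",
        "sort_order": "desc",
        "limit": 5,
        # Without this bound, every per-date call returns today's latest rows.
        "observation_end": date_str,
    }


def pick_close(observations: list[dict], date_str: str) -> Optional[float]:
    """Most recent usable close on or before date_str, or None if there is none.

    Expects observations newest-first, as sort_order=desc returns them.
    """
    target = date.fromisoformat(date_str)
    for obs in observations:
        if date.fromisoformat(obs["date"]) > target:
            continue  # defensive guard: never stamp a future-dated value
        if obs["value"] in (".", ""):
            continue  # missing value on that date
        return float(obs["value"])
    return None
```

With `observation_end` set, a same-day morning call simply gets the previous published observation as its newest row, which matches the "most-recent on-or-before" fallback.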
What changed
- `collectors/daily_closes.py::_fetch_fred_closes`: sends `observation_end=date_str` so per-date calls return that date's actual FRED observation. Defensive guard refuses to write any returned observation date > date_str. Same-day morning call (FRED's T-1 publishing lag) still falls back to the most-recent on-or-before observation, preserving the legacy "today's parquet carries yesterday's FRED close" semantic.
- `collectors/daily_closes_fred_repair.py` (new): one-shot operator CLI that re-fetches correct FRED values over a date window and rewrites only the FRED-ticker rows of each affected `staging/daily_closes/` parquet. Polygon stock rows untouched (their fetcher was always per-date-correct). Idempotent.
- `tests/test_daily_closes_fred_per_date.py` (new): 7 regression tests pinning per-date semantics: `observation_end` bound, distinct-value-per-date, same-day fallback, future-date refusal, missing-value skip, missing-observation skip.
- `tests/test_daily_closes_fred_repair.py` (new): 11 tests covering business-day enumeration, most-recent-on-or-before lookup, end-to-end overwrite of clobbered rows leaving polygon rows untouched, idempotent no-op on already-correct parquets, dry-run, missing-parquet skip, unknown-ticker rejection, missing-API-key rejection.

Suite 774 → 792.
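The repair steps listed above (enumerate business days, look up the most-recent observation on or before each day, rewrite only FRED-ticker rows) can be sketched with the standard library alone. Everything here is illustrative: the function names, the `ticker`/`close` row schema, and the weekday-only definition of a business day (holidays ignored) are assumptions, and the real tool works on parquet files rather than lists of dicts.

```python
from __future__ import annotations

from bisect import bisect_right
from datetime import date, timedelta
from typing import Optional


def business_days(start: date, end: date) -> list[date]:
    """Weekdays in [start, end], inclusive. Market holidays are not excluded."""
    days, d = [], start
    while d <= end:
        if d.weekday() < 5:  # Monday=0 .. Friday=4
            days.append(d)
        d += timedelta(days=1)
    return days


def on_or_before(series: dict[date, float], target: date) -> Optional[float]:
    """Value of the latest observation dated on or before target, else None."""
    dates = sorted(series)
    i = bisect_right(dates, target)
    return series[dates[i - 1]] if i else None


def repair_rows(rows: list[dict], correct: dict[str, float]) -> tuple[list[dict], bool]:
    """Overwrite closes only for tickers present in `correct`.

    Returns the new rows and whether anything changed, so a second run on
    already-correct rows is a no-op (changed is False).
    """
    changed, out = False, []
    for row in rows:
        fixed = correct.get(row["ticker"])
        if fixed is not None and row["close"] != fixed:
            row = {**row, "close": fixed}
            changed = True
        out.append(row)
    return out, changed
```

Returning a `changed` flag gives a dry-run mode something to report without writing, and lets the caller skip the write entirely when a parquet is already correct.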
Scope of the clobber
- Affected: ~14 BDay rolling window per MorningEnrich firing since 2026-05-11. Two firings so far (Mon 2026-05-11, Tue 2026-05-12).
- Affected tickers: VIX, VIX3M, TNX, IRX, TWO, HYOAS, BAA10Y.
- Downstream impact: predictor's 6 macro features (`vix_level`, `vix_term_slope` = VIX3M/VIX, `yield_curve_slope` = TNX-IRX, `market_breadth`, `spy_20d_*`) are L2 Ridge inputs; today's predictor inference at 6:15 AM PT consumed contaminated values.
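To show how a clobbered close propagates, here is a small sketch of the three derived features whose formulas are stated above. The function name `macro_features` and the input dict shape are hypothetical, the numbers in the usage are made up, and `market_breadth` and `spy_20d_*` are omitted because their definitions are not given here.

```python
def macro_features(closes: dict) -> dict:
    """Derive the FRED-backed macro features from one date's closes.

    `closes` maps ticker -> close for a single date (assumed shape).
    """
    return {
        "vix_level": closes["VIX"],
        "vix_term_slope": closes["VIX3M"] / closes["VIX"],
        "yield_curve_slope": closes["TNX"] - closes["IRX"],
    }


if __name__ == "__main__":
    # Made-up values for two different dates that were stamped with the same
    # "today" closes: the derived features come out identical as well.
    stamped = {"VIX": 20.0, "VIX3M": 22.0, "TNX": 4.5, "IRX": 4.0}
    by_date = {"2026-04-22": stamped, "2026-04-28": stamped}
    feats = {d: macro_features(c) for d, c in by_date.items()}
    print(feats["2026-04-22"] == feats["2026-04-28"])
```

Because every feature is a pure function of that date's closes, rewriting the FRED-ticker rows is sufficient to correct the inputs for any later recomputation.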
Test plan
- `pytest -q` clean (792 passed, 1 skipped pre-existing) ✓
- … `OVERWRITE VIX` ERROR alerts fire
- … `--dry-run` … once changes look correct.

🤖 Generated with Claude Code