fix(backfill): use min(valid_dates) so single fresh ticker can't suppress delta load #192

Merged
cipher813 merged 1 commit into main from feat/apply-daily-delta-min-last-date on May 9, 2026

@cipher813
Owner

Summary

  • Closes the 2026-05-09 weekly-SF DataPhase1 PARTIAL incident. _apply_daily_delta used max(valid_dates) to pick slim_last_date, so a single freshly refreshed ticker became the lookup anchor: VEEV had been refreshed to 5/8 by prices.collect's mtime check while every other parquet ended at 5/6. On the Saturday run, bdate_range(slim_last_date+1, today) was therefore empty, the delta loader returned {}, and every other cache parquet stayed at 5/6. The backfill regression preflight then rejected the write because the planned last date (5/6) was older than the date already in ArcticDB (5/8) across 38 symbols (SPY/VIX/XL*/sampled-universe).
  • Switched to min(valid_dates) so the load always covers the oldest ticker; the existing keep="last" dedupe in the combine step handles the overlap with any freshly refreshed ticker.
  • Updated the preflight error message in builders/backfill.py so the recovery hint actually addresses the underlying cause (the legacy text blamed the cache mtime check, which was working as designed — the bug was downstream in delta loading).
  • New regression test tests/test_apply_daily_delta_min_last_date.py pins the multi-mtime case + the all-tickers-current early-return.
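
For readers who want to see the date arithmetic, here is a minimal sketch of the anchor choice using only the standard library. It is not the code from this PR: `business_days` is a hypothetical Mon-Fri stand-in for the `bdate_range` call, `delta_dates` is a hypothetical helper, and the ticker set is trimmed to three symbols named in the summary.

```python
from datetime import date, timedelta


def business_days(start: date, end: date) -> list[date]:
    """Mon-Fri dates in [start, end]; hypothetical stand-in for a business-day range (no holidays)."""
    days = []
    current = start
    while current <= end:
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            days.append(current)
        current += timedelta(days=1)
    return days


def delta_dates(last_dates: dict[str, date], today: date, anchor=min) -> list[date]:
    """Dates a delta load would request, given each ticker's last cached date (hypothetical helper)."""
    slim_last_date = anchor(last_dates.values())
    return business_days(slim_last_date + timedelta(days=1), today)


# Incident shape: one ticker refreshed to Fri 5/8, the others still end on Wed 5/6.
last_dates = {
    "VEEV": date(2026, 5, 8),
    "SPY": date(2026, 5, 6),
    "VIX": date(2026, 5, 6),
}
today = date(2026, 5, 9)  # Saturday

old_range = delta_dates(last_dates, today, anchor=max)  # pre-fix anchor
new_range = delta_dates(last_dates, today, anchor=min)  # post-fix anchor

# All tickers already at the latest business day: an empty range is legitimate here.
all_current = delta_dates({t: date(2026, 5, 8) for t in last_dates}, today)

print(old_range)    # empty: the only candidate day is a Saturday
print(new_range)    # Thu 5/7 and Fri 5/8
print(all_current)  # empty: nothing left to load
```

With `max`, the stale tickers never get a chance to catch up because the range is computed from the freshest one. With `min`, an empty range can only occur when every ticker is already current, which is the early-return case the new test pins.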

Test plan

  • pytest tests/test_apply_daily_delta_min_last_date.py -v (2 new tests pass)
  • pytest full suite (574 passed, 1 skipped — same as origin/main)
  • After merge + boot-pull, redrive failed Sat SF execution ce500327-cf08-6731-5d44-882f5b380a30_0ae54554-5564-587a-5947-5cb99bf1130f and verify DataPhase1 → RAGIngestion → Research → ... chain completes
  • Verify _apply_daily_delta log line shows non-empty delta range on next Sat SF firing

🤖 Generated with Claude Code

…ress delta load

`_apply_daily_delta` used `max(valid_dates)` to compute `slim_last_date`, so
when `prices.collect` flagged a single ticker as stale and refreshed it via
yfinance to the prior trading day, that one ticker's date became
`slim_last_date`. On a Saturday SF run that turned `bdate_range(slim_last_date+1,
today)` into an empty range — the loader returned `{}`, every other cache
parquet stayed stuck at its older date, and the backfill regression preflight
rejected the write because planned (5/6) < existing-in-ArcticDB (5/8) across
SPY/VIX/XL*/sampled-universe.

Switched to `min(valid_dates)` so the delta load always covers the oldest
ticker; overlapping rows from any freshly-refreshed ticker dedupe via the
existing `keep="last"` step. Updated the preflight error message so the
recommended recovery actually addresses the underlying cause. New regression
test pins both the multi-mtime case and the all-tickers-current early-return.
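
As an illustration of why the overlap is harmless, the following sketch shows a `keep="last"` dedupe on a date index with pandas. The frame contents and the exact combine idiom (`pd.concat` followed by `Index.duplicated`) are assumptions for illustration; the repository's combine step may be written differently.

```python
import pandas as pd

# Cached rows for a ticker that was already refreshed through 5/8 (made-up prices).
cached = pd.DataFrame(
    {"close": [100.0, 101.0, 102.0]},
    index=pd.to_datetime(["2026-05-06", "2026-05-07", "2026-05-08"]),
)

# Delta rows loaded from min(valid_dates) + 1, so they overlap 5/7 and 5/8.
delta = pd.DataFrame(
    {"close": [101.5, 102.5]},
    index=pd.to_datetime(["2026-05-07", "2026-05-08"]),
)

# Append the delta after the cache, then keep the last row seen for each date.
combined = pd.concat([cached, delta])
combined = combined[~combined.index.duplicated(keep="last")].sort_index()

print(combined)
```

Because the delta is concatenated after the cached rows, `keep="last"` resolves each overlapping date in favour of the delta row and leaves exactly one row per date.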

Origin: 2026-05-09 weekly SF DataPhase1 PARTIAL — VEEV got refreshed to 5/8,
every other parquet still ended at 5/6, max picked 5/8, today=5/9 Sat,
empty bdate_range, 38 symbols flagged for regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit 9f8b679 into main May 9, 2026
1 check passed
@cipher813 cipher813 deleted the feat/apply-daily-delta-min-last-date branch May 9, 2026 13:04