fix(backfill): unified short-history path + scope macro rewrite to full-universe runs#85

Merged
cipher813 merged 1 commit into main from fix/backfill-unified-shorthistory-and-macro-scoping on Apr 22, 2026

@cipher813
Owner

Summary

Two ROADMAP P1 fixes to builders/backfill.py:

  1. Unified short-history path. The if ticker in tickers_with_features: <features> else: <OHLCV-only> fork is removed. Every ticker now goes through compute_features, which returns partial-NaN rows for features whose warmup exceeds available history (the PR #78 contract, "fix(features): per-feature graceful degrade, no whole-row dropna"). Left in place, the fork would regress the schema migration from PR #79 ("feat(data): one-shot migration for OHLCV-only ArcticDB symbols") on next Saturday's weekly backfill by writing stripped-column frames that daily_append.update() rejects.

  2. Macro writes gated by --ticker. A per-ticker backfill now skips the macro library rewrite by default. New --rebuild-macro flag is the explicit opt-in for operators who actually want macro rewritten during a ticker-scoped run. Default full-universe backfill still rewrites macro as before.
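
The macro gating in item 2 reduces to a single predicate. A minimal sketch, assuming the names ticker_filter, rebuild_macro, and skip_macro that appear in the test plan; the helper should_skip_macro is hypothetical and is not taken from builders/backfill.py:

```python
from typing import Optional


def should_skip_macro(ticker_filter: Optional[str], rebuild_macro: bool) -> bool:
    """Return True when the macro library rewrite should be skipped.

    Hypothetical helper: a ticker-scoped run skips macro unless the
    operator opted in with --rebuild-macro; a full-universe run never skips.
    """
    return ticker_filter is not None and not rebuild_macro


# The three modes described in the summary.
modes = {
    "ticker-scoped, default": should_skip_macro("SOLS", False),
    "ticker-scoped, --rebuild-macro": should_skip_macro("SOLS", True),
    "full universe": should_skip_macro(None, False),
}
```

Only the first mode skips the rewrite, so a ticker-scoped patch can no longer overwrite fresher macro data unless the operator asks for it.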

Why

The SOLS patch on 2026-04-22 ran --ticker SOLS, which triggered backfill's full macro rewrite from parquet as a side effect. Parquet macro was stale (4/17), so this regressed ArcticDB macro SPY/VIX/XL* from 4/20 → 4/17 and broke the predictor's macro-freshness preflight. Same-shaped gap as the 2026-04-20 SOLS / Q universe-library gap that just got cleaned up.

Next Saturday's weekly backfill run would also regress PR #79's schema migration — the OHLCV-only fork writes column sets that daily_append.update() rejects with schema mismatch (same class as the PR #76/#77/#79 chain we just finished).
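
To make the schema point concrete, here is a rough sketch of per-feature graceful degrade in plain Python. It is not the project's compute_features: the function body, the 252-row window, the close / high - 1 formula, and the row layout are all assumptions for illustration. The idea it shows is that a short-history ticker keeps the feature column, filled with NaN, instead of losing it.

```python
import math
from typing import Dict, List

WINDOW = 252  # assumed number of trading days in 52 weeks


def dist_from_52w_high(closes: List[float], window: int = WINDOW) -> List[float]:
    """Distance of each close from the trailing-window high, NaN during warmup."""
    out = []
    for i, close in enumerate(closes):
        if i + 1 < window:
            out.append(math.nan)  # not enough history yet: keep the row, mark NaN
        else:
            high = max(closes[i + 1 - window : i + 1])
            out.append(close / high - 1.0)
    return out


def build_rows(closes: List[float]) -> List[Dict[str, float]]:
    """Every ticker gets the same keys, regardless of history length."""
    feature = dist_from_52w_high(closes)
    return [{"close": c, "dist_from_52w_high": f} for c, f in zip(closes, feature)]


short = build_rows([10.0, 11.0, 12.0])                  # 3 rows, far below the window
long_ = build_rows([float(i) for i in range(1, 301)])   # 300 rows, past the window

# Same column set for both, so an append-style update sees one schema.
same_schema = set(short[-1]) == set(long_[-1])
```

With an OHLCV-only fork, the short-history rows would lack the dist_from_52w_high key entirely, which is the column-set mismatch described above.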

Observability

An INFO log line, partial-features ticker=X rows=N nan_last_row=M/total features=[...], is emitted per ticker whenever the last row has NaN features. The completion log reports n_partial in addition to n_ok / n_skip / n_err.
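
A sketch of how such a log line could be built, assuming the last row's feature values are available as a dict. The helper name partial_features_message, the example feature rsi_14, and the exact format string are illustrative guesses modeled on the fields listed above, not the code in builders/backfill.py.

```python
import logging
import math
from typing import Dict, Optional

logger = logging.getLogger("backfill")


def partial_features_message(ticker: str, n_rows: int, last_row: Dict[str, float]) -> Optional[str]:
    """Return the log message if the last row has NaN features, else None."""
    nan_features = sorted(
        name for name, value in last_row.items()
        if isinstance(value, float) and math.isnan(value)
    )
    if not nan_features:
        return None
    return (
        f"partial-features ticker={ticker} rows={n_rows} "
        f"nan_last_row={len(nan_features)}/{len(last_row)} features={nan_features}"
    )


# rsi_14 is a hypothetical feature name used only for this example.
msg = partial_features_message("SOLS", 12, {"rsi_14": 55.0, "dist_from_52w_high": math.nan})
if msg is not None:
    logger.info(msg)
```

Returning None for fully populated rows keeps the log quiet for the common case and makes an n_partial count a simple tally of non-None results.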

Test plan

  • Source-text assertions: n_ok_ohlcv_only + tickers_with_features = { + "write raw OHLCV" all absent; MIN_ROWS_FOR_FEATURES not present inside the write loop; skip_macro + ticker_filter is not None + rebuild_macro present; --rebuild-macro CLI flag present
  • Functional: ticker_filter="AAPL", rebuild_macro=False → macro_lib.write NOT called
  • Functional: ticker_filter="AAPL", rebuild_macro=True → macro_lib.write IS called
  • Functional: ticker_filter=None → macro_lib.write IS called (default preserved)
  • Functional: short-history ticker writes include dist_from_52w_high column with NaN preserved (unified schema)
  • Full suite: 151/151 pass
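
The three macro-scoping cases can be exercised with a mock library. The sketch below is not the PR's test code: run_backfill is a hypothetical stand-in for the real entry point, reduced to the macro decision only, and macro_lib is a unittest.mock.MagicMock rather than an ArcticDB library.

```python
from typing import Optional
from unittest.mock import MagicMock


def run_backfill(macro_lib, ticker_filter: Optional[str] = None, rebuild_macro: bool = False) -> None:
    """Hypothetical stand-in: only models whether macro gets rewritten."""
    skip_macro = ticker_filter is not None and not rebuild_macro
    if not skip_macro:
        macro_lib.write("SPY", "macro-frame-placeholder")


def macro_written(ticker_filter: Optional[str], rebuild_macro: bool) -> bool:
    """Run the stand-in against a fresh mock and report whether write was called."""
    macro_lib = MagicMock()
    run_backfill(macro_lib, ticker_filter=ticker_filter, rebuild_macro=rebuild_macro)
    return macro_lib.write.called


results = [
    macro_written("AAPL", False),  # ticker-scoped, default
    macro_written("AAPL", True),   # ticker-scoped, explicit opt-in
    macro_written(None, False),    # full universe
]
```

Using a fresh mock per case keeps the assertions independent, so a write in one mode cannot mask a missing write in another.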

🤖 Generated with Claude Code

…ll-universe runs

Two ROADMAP P1 fixes landing together because they both live in
builders/backfill.py and share regression-test scaffolding.

1. **Unified short-history path.** Dropped the `if ticker in
   tickers_with_features: <feature path> else: <OHLCV-only path>` fork.
   Post-PR-#78 `compute_features` returns rows with NaN for features
   whose rolling-window warmup exceeds available history, so the fork
   is unnecessary — every ticker writes the full OHLCV+FEATURE schema.
   Left in: an `n_short_history_in_scope` observability counter and a
   per-ticker `partial-features` log line when the last row has NaN
   features. The fork was a time-bomb for next Saturday's weekly
   backfill: it would regress PR #79's schema migration by writing
   stripped-column frames that daily_append's `lib.update()` then
   rejects with schema mismatch.

2. **Macro writes gated by --ticker.** A per-ticker backfill
   (`--ticker X`) now skips the macro library rewrite by default. The
   parquet price cache's macro series may be stale relative to what
   daily_append has been appending; rewriting from parquet during a
   per-ticker patch silently regresses SPY/VIX/XL* last_date. The
   2026-04-22 SOLS patch knocked macro back from 4/20 to 4/17 by
   exactly this path and broke the predictor's macro-freshness
   preflight. Operators who genuinely want the macro rewrite during a
   ticker-scoped run can pass `--rebuild-macro` (explicit opt-in).
   Default full-universe backfill still rewrites macro as before.

8 regression tests added — source-text invariants + functional tests
covering the three macro-scoping modes and the short-history schema
contract. 151/151 full suite passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 merged commit 64653af into main on Apr 22, 2026
1 check passed
cipher813 deleted the fix/backfill-unified-shorthistory-and-macro-scoping branch on April 22, 2026 16:24