fix(data): short-history tickers are first-class, no silent skip #76

Merged
cipher813 merged 1 commit into main from fix/short-history-first-class on Apr 21, 2026
@cipher813
Owner

Summary

  • `daily_append.py` stops silently skipping short-history tickers (new listings / IPOs / spinoffs). Below the `MIN_ROWS_FOR_FEATURES=265` warmup threshold, we now write an OHLCV-only row with NaN for every feature column and log loudly (`short-history ticker=X rows=N`).
  • New `n_partial` counter distinguishes short-history writes from legit skips (dry run / NaN close) and read errors.
  • Regression test locks the write path + structured log.
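
A minimal sketch of the branch described above, using plain dicts instead of the real DataFrame/ArcticDB path. The `Counters` class, the `build_row` helper, the `n_ok` counter, the feature names, and every OHLCV column name other than `Volume` are assumptions for illustration; only `MIN_ROWS_FOR_FEATURES`, `n_partial`, `n_skip`, `n_err`, and the log format come from this PR.

```python
import logging
import math
from dataclasses import dataclass

logger = logging.getLogger("daily_append")

MIN_ROWS_FOR_FEATURES = 265  # warmup threshold named in this PR
OHLCV_COLUMNS = ("Open", "High", "Low", "Close", "Volume")  # assumed names
FEATURE_COLUMNS = ("rsi_14", "sma_200")  # hypothetical feature names


@dataclass
class Counters:
    n_ok: int = 0       # hypothetical: full-feature writes
    n_partial: int = 0  # short-history, OHLCV-only writes
    n_skip: int = 0     # dry run / NaN close
    n_err: int = 0      # read failures (not exercised in this sketch)


def build_row(ticker, n_hist_rows, bar, features, counters, dry_run=False):
    """Return the row to write for today, or None for a legitimate skip."""
    if dry_run or math.isnan(bar["Close"]):
        counters.n_skip += 1
        return None
    row = {col: bar[col] for col in OHLCV_COLUMNS}
    if n_hist_rows < MIN_ROWS_FOR_FEATURES:
        # Short history: keep the authoritative close, NaN every feature,
        # and log loudly so the coverage gap is visible.
        row.update({col: math.nan for col in FEATURE_COLUMNS})
        counters.n_partial += 1
        logger.warning("short-history ticker=%s rows=%d", ticker, n_hist_rows)
        return row
    row.update({col: features[col] for col in FEATURE_COLUMNS})
    counters.n_ok += 1
    return row
```

The point of the sketch is that the short-history case returns a row rather than `None`, so the close always reaches storage and only the feature columns carry the gap.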

Why

This is the root cause of the 2026-04-21 EOD failure. Held short-history tickers (SNDK after the 2026 WDC flash-memory spinoff, ~44 rows) had no ArcticDB row for today because daily_append silently incremented n_skip and wrote nothing. EOD reconcile then hard-failed looking up the authoritative close.

New listings are a normal, recurring market event: the S&P 500+400 sees 20-40 constituent changes/year, and every spinoff creates one. Treating them as an edge case guarantees we re-hit this bug on every universe rotation. Silently skipping them violates three standing preferences:

  • `feedback_no_silent_fails.md` — skipping a held ticker silently
  • `feedback_no_unscoreable_labels.md` — writing nothing is the same sentinel-escape-hatch pattern
  • `feedback_hard_fail_until_stable.md` — partial outcome should be loud

What is NOT in this PR (follow-ups)

  • Per-feature `min_rows` so short-history tickers get some features (today: all features NaN below threshold)
  • Predictor NaN-handling hardening + short-history subsample validation with named baseline (per `feedback_component_baseline_validation.md`)
  • Executor position sizer coverage derate
  • `sndk_patch.py` deletion (obsolete once this ships and a daily run has landed the row)

Test plan

  • `pytest tests/test_daily_append_semantics.py -v` — 7/7 pass (new test + 6 prior regressions)
  • Full repo suite: `pytest tests/` — 110/110 pass
  • End-to-end validation (this session): deploy branch to ae-trading, run daily-data for 2026-04-21, confirm SNDK row lands in ArcticDB, run EOD, confirm email sends with SNDK attributed

🤖 Generated with Claude Code

Replaces the len(hist) < MIN_ROWS_FOR_FEATURES silent skip in
daily_append with an OHLCV-only write. When a ticker has insufficient
history for feature warmup (new listings, IPOs, spinoffs — e.g. SNDK
post the 2026 WDC flash-memory spinoff) we still write the
authoritative close, with NaN for every feature column, and log
"short-history ticker=X rows=N" so coverage gaps surface.

A dedicated n_partial counter separates this state from n_skip (dry
run / NaN close) and n_err (ArcticDB read failures). The 5% err_rate
guard is unchanged — partial writes don't count as errors.
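
To illustrate why partial writes leave the guard untouched, here is a hedged sketch of a 5% error-rate check. The function name, the `n_ok` counter, and the choice of denominator (all processed tickers) are assumptions; the commit message only states the 5% limit and that `n_partial` is not counted as an error.

```python
ERR_RATE_LIMIT = 0.05  # the 5% guard named in the commit message


def check_err_rate(n_ok, n_partial, n_skip, n_err, limit=ERR_RATE_LIMIT):
    """Raise if the read-error share exceeds the limit; return the rate."""
    # Assumed denominator: every ticker processed in this run.
    total = n_ok + n_partial + n_skip + n_err
    err_rate = n_err / total if total else 0.0
    if err_rate > limit:
        raise RuntimeError(
            f"daily_append err_rate={err_rate:.1%} exceeds {limit:.0%}"
        )
    return err_rate
```

Under these assumptions, a run with many short-history tickers raises `n_partial` but cannot trip the guard on its own, because only `n_err` appears in the numerator.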

Root cause traced 2026-04-21: EOD reconcile hard-failed on every held
short-history ticker because authoritative close was missing from
ArcticDB. New listings are a normal, recurring market event; silently
dropping them violates the no-silent-fails, no-unscoreable-labels, and
hard-fail-until-stable preferences.

Regression test locks the write path + structured log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 marked this pull request as ready for review April 21, 2026 21:17
@cipher813 cipher813 merged commit b9b07ad into main Apr 21, 2026
1 check passed
@cipher813 cipher813 deleted the fix/short-history-first-class branch April 21, 2026 21:17
cipher813 added a commit that referenced this pull request Apr 21, 2026
PR #76 hardcoded Volume→int64 in the short-history write path. ArcticDB
rejects updates whose column dtypes don't match the existing version,
and stored Volume dtype varies across tickers — some int64, some
float64, depending on when the symbol was first backfilled. Three
short-history tickers (SOLS, ULS, +1) failed the update on 2026-04-21
with FLOAT64/INT64 mismatch, surfacing as n_err=4 in the daily_append
summary.

Fix: astype every column to hist.dtypes[col] — authoritative by
construction, compatible with every ticker regardless of stored
schema. No dtype hardcoding anywhere in the branch.
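
A small pandas sketch of the dtype dispatch described above. The helper name `cast_to_stored_dtypes` and the sample frames are hypothetical; the idea taken from the commit message is that each column is cast to `hist.dtypes[col]` instead of a hardcoded type.

```python
import pandas as pd


def cast_to_stored_dtypes(new_row: pd.DataFrame, hist: pd.DataFrame) -> pd.DataFrame:
    """Cast each column of new_row to the dtype of the same column in hist."""
    out = new_row.copy()
    for col in out.columns:
        # hist is what was read back from storage, so its dtypes are
        # the ones the stored symbol already uses.
        out[col] = out[col].astype(hist.dtypes[col])
    return out


# Two symbols whose stored Volume dtype differs (illustrative data only).
hist_float = pd.DataFrame({"Close": [1.0, 2.0], "Volume": [100.0, 200.0]})
hist_int = pd.DataFrame({"Close": [1.0, 2.0], "Volume": [100, 200]})
new_row = pd.DataFrame({"Close": [3.0], "Volume": [300]})

row_for_float = cast_to_stored_dtypes(new_row, hist_float)
row_for_int = cast_to_stored_dtypes(new_row, hist_int)
```

The same incoming row ends up with a float `Volume` for the first symbol and an integer `Volume` for the second, which is the property the update path needs.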

Regression test locks the hist.dtypes dispatch and forbids hardcoded
int64 casts.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request Apr 21, 2026
Adds builders/promote_ohlcv_only_schema.py — scans every symbol in the
universe library and rewrites any whose stored schema lacks feature
columns to the full OHLCV + FEATURE schema.

Context: PR #76 (short-history first-class) persisted some symbols as
OHLCV-only. PR #78 (per-feature graceful degrade) now writes full
schema on every daily_append pass. ArcticDB update() enforces schema
match, so the transitional OHLCV-only symbols fail today's daily_append
with n_err and their row never lands. 2026-04-21 post-#78 run reported
n_err=2.

The migration reads each candidate's OHLCV history, runs
compute_features (partial-feature semantics per PR #78), and calls
lib.write() to replace the symbol. write() is authoritative for schema;
update() is incremental and cannot widen columns.

One-shot, idempotent — symbols already at full schema are skipped.
Supports --dry-run for plan review and --ticker X for targeted retries.
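
The detection and idempotency logic can be sketched without ArcticDB. In the sketch below, stored schemas are passed in as a plain dict, and both function names and the feature names are hypothetical; the actual script reads schemas from the universe library and rewrites each candidate with lib.write().

```python
def missing_feature_columns(stored_columns, feature_columns):
    """Return every expected feature column absent from the stored schema."""
    stored = set(stored_columns)
    # Exhaustive check: every feature column is tested, not a sample.
    return [col for col in feature_columns if col not in stored]


def plan_promotion(schemas, feature_columns, only_ticker=None):
    """Map each symbol that needs a rewrite to its missing columns."""
    plan = {}
    for symbol, columns in schemas.items():
        if only_ticker is not None and symbol != only_ticker:
            continue  # mirrors a targeted --ticker X retry
        missing = missing_feature_columns(columns, feature_columns)
        if missing:  # symbols already at full schema are skipped
            plan[symbol] = missing
    return plan


FEATURES = ["rsi_14", "sma_200"]  # hypothetical feature names
schemas = {
    "SNDK": ["Open", "High", "Low", "Close", "Volume"],
    "AAPL": ["Open", "High", "Low", "Close", "Volume", "rsi_14", "sma_200"],
}
plan = plan_promotion(schemas, FEATURES)
```

Running the planner again after the listed symbols have been rewritten yields an empty plan, which is what makes the migration safe to re-run.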

Regression tests lock:
- write() (not update()) for the rewrite
- exhaustive FEATURE-column detection (no heuristic subsets)
- explicit error reason on empty compute_features (no silent skip)
- --dry-run guards the write() call
- partial-features structured log matches daily_append convention

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>