fix(data): short-history branch matches stored dtype per-column #77

Merged
cipher813 merged 1 commit into main from fix/daily-append-volume-dtype
Apr 21, 2026

@cipher813
Owner

Summary

Follow-up to PR #76. The short-history write path hardcoded `Volume → int64`, but the stored ArcticDB schema varies per ticker (some int64, some float64, depending on backfill vintage). Fix: cast every column with `astype` to `hist.dtypes[col]`, so the written dtypes match the stored schema by construction.

Why

PR #76 shipped and was validated today: the EOD email went out because SNDK's stored Volume was already int64 (matching the hardcode). Three other short-history tickers (SOLS, ULS, +1) failed the update with a `FLOAT64/INT64` mismatch and contributed to n_err=4 in the daily_append summary. Without this fix, the unattended 13:05 PT daily_append would repeat the same failures every day.

Change

```python
# before (PR #76)
for col in new_row.columns:
    if col in OHLCV_COLS:
        if col == "Volume":
            new_row[col] = new_row[col].astype("int64")
        else:
            new_row[col] = new_row[col].astype("float64")
    else:
        new_row[col] = new_row[col].astype("float32")

# after
for col in new_row.columns:
    new_row[col] = new_row[col].astype(hist.dtypes[col])
```
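
To illustrate why the per-column cast works for every ticker, here is a minimal, self-contained sketch. The helper name `cast_to_stored_dtypes` and the two toy `hist` frames are hypothetical stand-ins, not code from this PR; they only mimic two tickers whose stored Volume dtype differs.

```python
import pandas as pd


def cast_to_stored_dtypes(new_row: pd.DataFrame, hist: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of new_row with every column cast to hist's dtype (hypothetical helper)."""
    out = new_row.copy()
    for col in out.columns:
        # hist.dtypes is a Series indexed by column name.
        out[col] = out[col].astype(hist.dtypes[col])
    return out


# Toy stored histories: same columns, different Volume dtype.
hist_int = pd.DataFrame({"Close": [10.0], "Volume": pd.Series([100], dtype="int64")})
hist_float = pd.DataFrame({"Close": [10.0], "Volume": pd.Series([100.0], dtype="float64")})

# The same incoming row is cast differently depending on the stored schema.
new_row = pd.DataFrame({"Close": [10.5], "Volume": [120]})
row_for_int = cast_to_stored_dtypes(new_row, hist_int)
row_for_float = cast_to_stored_dtypes(new_row, hist_float)
```

One property of this approach worth noting: if `new_row` carries a column that `hist` lacks, `hist.dtypes[col]` raises a `KeyError` instead of silently picking a dtype.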

Test plan

  • `pytest tests/test_daily_append_semantics.py -v` — 8/8 pass (new dtype test + 7 prior)
  • Full repo suite: `pytest tests/` — 111/111 pass
  • Tomorrow's unattended 13:05 PT daily_append: expect n_partial count to increase + n_err to drop by 3 (SOLS, ULS, +1 previously failing)

🤖 Generated with Claude Code

PR #76 hardcoded Volume→int64 in the short-history write path. ArcticDB
rejects updates whose column dtypes don't match the existing version,
and stored Volume dtype varies across tickers — some int64, some
float64, depending on when the symbol was first backfilled. Three
short-history tickers (SOLS, ULS, +1) failed the update on 2026-04-21
with FLOAT64/INT64 mismatch, surfacing as n_err=4 in the daily_append
summary.

Fix: astype every column to hist.dtypes[col] — authoritative by
construction, compatible with every ticker regardless of stored
schema. No dtype hardcoding anywhere in the branch.

Regression test locks the hist.dtypes dispatch and forbids hardcoded
int64 casts.
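
One way such a "no hardcoded int64 cast" guard could look is a plain source scan. This is only a sketch under assumptions: the regex, the helper `find_hardcoded_int64_casts`, and the two source snippets are illustrative and are not the actual test in `tests/test_daily_append_semantics.py`.

```python
import re

# Matches .astype("int64"), .astype('int64') and .astype(np.int64).
HARDCODED_INT64 = re.compile(r"""\.astype\(\s*(?:["']int64["']|np\.int64)\s*\)""")


def find_hardcoded_int64_casts(source: str) -> list[str]:
    """Return the stripped source lines that contain a hardcoded int64 cast."""
    return [line.strip() for line in source.splitlines() if HARDCODED_INT64.search(line)]


# Illustrative snippets of the branch before and after the fix (scanned as text, not executed).
OLD_BRANCH = '''
for col in new_row.columns:
    if col == "Volume":
        new_row[col] = new_row[col].astype("int64")
'''
NEW_BRANCH = '''
for col in new_row.columns:
    new_row[col] = new_row[col].astype(hist.dtypes[col])
'''

old_hits = find_hardcoded_int64_casts(OLD_BRANCH)
new_hits = find_hardcoded_int64_casts(NEW_BRANCH)
```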

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit ff752bc into main Apr 21, 2026
1 check passed
@cipher813 cipher813 deleted the fix/daily-append-volume-dtype branch April 21, 2026 21:46
cipher813 added a commit that referenced this pull request May 2, 2026
…sion budget) (#137)

Today's incident chain proved the existing preflight catches structural
issues but is by-design blind to runtime LLM behavior. Specifically:

  PR #77 (PriceCardLookupError): runtime model 'claude-haiku-4-5-20251001'
    didn't normalize to any price card → Research SF crash.
  PR #78 (GraphRecursionError): ReAct sites used response_format= but
    recursion_limit was bare MAX_ITERATIONS * 2 → Research SF crash.

Both are catchable by static config-walk preflight (zero LLM cost):

## check_price_cards_cover_all_models

Walks every runtime model name (universe.yaml's per_stock_model +
strategic_model + research_graph.py's _FALLBACK_AGENT_MODEL_NAMES dict),
normalizes via the same snapshot-suffix strip the production cost
tracker uses (PR #77's _normalize_model_for_pricing — duplicated here
to avoid heavy imports), and asserts each maps to a card in
alpha-engine-config/cost/model_pricing.yaml.
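
As a rough sketch of the coverage check described above: the suffix pattern (a trailing `-YYYYMMDD` stamp), the function names, and the card names below are assumptions for illustration, not the contents of `model_pricing.yaml` or the real `_normalize_model_for_pricing`.

```python
import re

# Assumption: a snapshot suffix is a trailing "-YYYYMMDD" date stamp.
_SNAPSHOT_SUFFIX = re.compile(r"-\d{8}$")


def normalize_model_for_pricing(model: str) -> str:
    """Strip the snapshot suffix so dated model names map to one card."""
    return _SNAPSHOT_SUFFIX.sub("", model)


def uncovered_models(runtime_models, price_cards):
    """Return the sorted runtime models whose normalized name has no price card."""
    return sorted(
        m for m in set(runtime_models)
        if normalize_model_for_pricing(m) not in price_cards
    )


# Hypothetical inputs: card names and runtime model names.
price_cards = {"claude-haiku-4-5"}
runtime_models = ["claude-haiku-4-5-20251001", "some-unpriced-model"]

missing = uncovered_models(runtime_models, price_cards)
```

A preflight built this way would report a failure whenever `missing` is non-empty, without making any LLM call.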

## check_recursion_budget_for_response_format

Static regex scan of agents/sector_teams/{quant,qual}_analyst.py.
For every file using response_format= in create_react_agent, asserts
recursion_limit is NOT bare 'MAX_ITERATIONS * 2' (must include +N
buffer for the post-loop structured-extraction call). Catches PR #78's
exact failure mode at config-walk time.
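
A static scan of this kind might be sketched as follows. The regexes and the sample sources are assumptions about how `recursion_limit` is written in the analyst files; the real check may match a different shape.

```python
import re

USES_RESPONSE_FORMAT = re.compile(r"response_format\s*=")
# "recursion_limit" (keyword or dict key) set to MAX_ITERATIONS * 2 with no "+ N" after it.
BARE_LIMIT = re.compile(
    r'recursion_limit["\']?\s*[:=]\s*MAX_ITERATIONS\s*\*\s*2\b(?!\s*\+)'
)


def has_unbuffered_budget(source: str) -> bool:
    """True when a file uses response_format= but keeps a bare recursion budget."""
    return bool(USES_RESPONSE_FORMAT.search(source)) and bool(BARE_LIMIT.search(source))


# Illustrative file contents (scanned as text, never executed).
BAD = (
    "agent = create_react_agent(llm, tools, response_format=Out)\n"
    'config = {"recursion_limit": MAX_ITERATIONS * 2}\n'
)
GOOD = (
    "agent = create_react_agent(llm, tools, response_format=Out)\n"
    'config = {"recursion_limit": MAX_ITERATIONS * 2 + 2}\n'
)
NO_RESPONSE_FORMAT = (
    "agent = create_react_agent(llm, tools)\n"
    'config = {"recursion_limit": MAX_ITERATIONS * 2}\n'
)

results = [has_unbuffered_budget(s) for s in (BAD, GOOD, NO_RESPONSE_FORMAT)]
```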

Both checks WARN (don't FAIL) when sibling repos aren't checked out
(CI / restricted environments) — preserves the preflight's "useful
even when partial" property.
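
The WARN-instead-of-FAIL behaviour could be wired roughly like this. `Status`, `run_sibling_check`, and the callback signature are hypothetical names chosen for the sketch, not the preflight's actual API.

```python
import tempfile
from enum import Enum
from pathlib import Path


class Status(Enum):
    OK = "OK"
    WARN = "WARN"
    FAIL = "FAIL"


def run_sibling_check(sibling_root: Path, check) -> tuple[Status, str]:
    """Run check(sibling_root) -> list of problems; WARN if the repo is absent."""
    if not sibling_root.is_dir():
        # Absent sibling repo (e.g. CI): skip the check without failing the run.
        return Status.WARN, f"{sibling_root.name} not checked out; skipped"
    problems = check(sibling_root)
    if problems:
        return Status.FAIL, "; ".join(problems)
    return Status.OK, "all checks passed"


with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp)
    absent = present / "missing-sibling"
    warn_result = run_sibling_check(absent, lambda root: [])
    ok_result = run_sibling_check(present, lambda root: [])
    fail_result = run_sibling_check(present, lambda root: ["model X has no price card"])
```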

Validation against current state (post PR #77 + #78):
  [OK]   price_cards_cover_all_models   3 runtime models map to cards
  [OK]   recursion_budget_for_response_format   2 ReAct sites buffered

8 new tests in test_sf_preflight.py covering happy path, failure path
(reproducing today's exact incidents in tmp sibling layout), absent-
sibling skip, and the snapshot-suffix-normalization round-trip.

403 tests pass.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>