fix(data): short-history branch matches stored dtype per-column #77
Merged
PR #76 hardcoded Volume→int64 in the short-history write path. ArcticDB rejects updates whose column dtypes don't match the existing version, and stored Volume dtype varies across tickers: some int64, some float64, depending on when the symbol was first backfilled. Three short-history tickers (SOLS, ULS, +1) failed the update on 2026-04-21 with FLOAT64/INT64 mismatch, surfacing as n_err=4 in the daily_append summary.

Fix: astype every column to hist.dtypes[col], which is authoritative by construction and compatible with every ticker regardless of stored schema. No dtype hardcoding anywhere in the branch.

Regression test locks the hist.dtypes dispatch and forbids hardcoded int64 casts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request May 2, 2026
…sion budget) (#137)

Today's incident chain proved the existing preflight catches structural issues but is by-design blind to runtime LLM behavior. Specifically:

  PR #77 (PriceCardLookupError): runtime model 'claude-haiku-4-5-20251001' didn't normalize to any price card → Research SF crash.
  PR #78 (GraphRecursionError): ReAct sites used response_format= but recursion_limit was bare MAX_ITERATIONS * 2 → Research SF crash.

Both are catchable by static config-walk preflight (zero LLM cost):

## check_price_cards_cover_all_models

Walks every runtime model name (universe.yaml's per_stock_model + strategic_model + research_graph.py's _FALLBACK_AGENT_MODEL_NAMES dict), normalizes via the same snapshot-suffix strip the production cost tracker uses (PR #77's _normalize_model_for_pricing, duplicated here to avoid heavy imports), and asserts each maps to a card in alpha-engine-config/cost/model_pricing.yaml.

## check_recursion_budget_for_response_format

Static regex scan of agents/sector_teams/{quant,qual}_analyst.py. For every file using response_format= in create_react_agent, asserts recursion_limit is NOT bare 'MAX_ITERATIONS * 2' (must include +N buffer for the post-loop structured-extraction call). Catches PR #78's exact failure mode at config-walk time.

Both checks WARN (don't FAIL) when sibling repos aren't checked out (CI / restricted environments), which preserves the preflight's "useful even when partial" property.

Validation against current state (post PR #77 + #78):

  [OK] price_cards_cover_all_models 3 runtime models map to cards
  [OK] recursion_budget_for_response_format 2 ReAct sites buffered

8 new tests in test_sf_preflight.py covering happy path, failure path (reproducing today's exact incidents in tmp sibling layout), absent-sibling skip, and the snapshot-suffix-normalization round-trip.

403 tests pass.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Follow-up to PR #76. The short-history write path hardcoded `Volume → int64`, but the stored ArcticDB schema varies per ticker (some int64, some float64, depending on backfill vintage). Fix: cast every column with `astype(hist.dtypes[col])`, which is authoritative by construction.
Why
PR #76 shipped and was validated today. Today's EOD email was sent successfully because SNDK's stored Volume was already int64 (matching the hardcode). Three other short-history tickers (SOLS, ULS, +1) failed the update with a `FLOAT64/INT64` mismatch and contributed to n_err=4 in the daily_append summary. Left unfixed, the unattended 13:05 PT daily_append would repeat the same failures every day, starting tomorrow.
Change
```python
# before (PR #76): dtypes hardcoded per column group
for col in new_row.columns:
    if col in OHLCV_COLS:
        if col == "Volume":
            new_row[col] = new_row[col].astype("int64")
        else:
            new_row[col] = new_row[col].astype("float64")
    else:
        new_row[col] = new_row[col].astype("float32")

# after: cast each column to the dtype already stored for it
for col in new_row.columns:
    new_row[col] = new_row[col].astype(hist.dtypes[col])
```
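As a rough illustration of why the per-column cast works for both stored schemas, here is a minimal, self-contained sketch using plain pandas (no ArcticDB). The helper name `align_to_stored_dtypes` and the two toy histories are hypothetical and not part of the PR; only the `astype(hist.dtypes[col])` loop comes from the change above.

```python
import pandas as pd


def align_to_stored_dtypes(new_row: pd.DataFrame, hist: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of new_row with every column cast to hist's dtype for it.

    Hypothetical helper for illustration only; it wraps the loop from the PR.
    """
    out = new_row.copy()
    for col in out.columns:
        # hist.dtypes[col] raises KeyError if the stored history lacks the column.
        out[col] = out[col].astype(hist.dtypes[col])
    return out


# Two toy stored histories (assumed data): one with int64 Volume, one with float64.
hist_int = pd.DataFrame({"Close": [10.0, 10.5], "Volume": [100, 120]}).astype(
    {"Close": "float64", "Volume": "int64"}
)
hist_float = hist_int.astype({"Volume": "float64"})

# The same incoming row is aligned differently depending on the stored schema.
new_row = pd.DataFrame({"Close": [11.0], "Volume": [130]})
aligned_int = align_to_stored_dtypes(new_row, hist_int)
aligned_float = align_to_stored_dtypes(new_row, hist_float)
```

Because the target dtype is read from `hist` rather than written into the branch, the same code path serves tickers of either backfill vintage. One behavior worth noting: a column present in `new_row` but absent from `hist` fails loudly with a `KeyError` instead of being cast silently.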
Test plan
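The commit message mentions a regression test that locks the `hist.dtypes` dispatch and forbids hardcoded int64 casts. The actual test is not shown here; the following is only a sketch of how such a guard might be written as a static scan. The regex, the `find_hardcoded_casts` name, and the decision to also flag float literals are assumptions, not the PR's implementation.

```python
import re

# Assumed pattern: .astype("int64"), .astype('float32'), etc. with a string literal.
HARDCODED_CAST = re.compile(r"""\.astype\(\s*["'](?:int|float)\d+["']\s*\)""")


def find_hardcoded_casts(source: str) -> list[str]:
    """Return every literal-dtype astype call found in the given source text."""
    return HARDCODED_CAST.findall(source)


# Snippets taken from the before/after code in this PR description.
BEFORE = 'new_row[col] = new_row[col].astype("int64")'
AFTER = "new_row[col] = new_row[col].astype(hist.dtypes[col])"

before_hits = find_hardcoded_casts(BEFORE)
after_hits = find_hardcoded_casts(AFTER)
```

A real test would presumably read the source of the short-history branch and assert that the scan returns nothing while `hist.dtypes[col]` still appears in it.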
🤖 Generated with Claude Code