feat(preflight): add Research-side static checks (price cards + recursion budget) #137
Merged
…sion budget)

Today's incident chain proved the existing preflight catches structural issues but is by-design blind to runtime LLM behavior. Specifically:

- PR #77 (PriceCardLookupError): runtime model 'claude-haiku-4-5-20251001' didn't normalize to any price card → Research SF crash.
- PR #78 (GraphRecursionError): ReAct sites used response_format= but recursion_limit was bare MAX_ITERATIONS * 2 → Research SF crash.

Both are catchable by static config-walk preflight (zero LLM cost):

## check_price_cards_cover_all_models

Walks every runtime model name (universe.yaml's per_stock_model + strategic_model + research_graph.py's _FALLBACK_AGENT_MODEL_NAMES dict), normalizes via the same snapshot-suffix strip the production cost tracker uses (PR #77's _normalize_model_for_pricing, duplicated here to avoid heavy imports), and asserts each maps to a card in alpha-engine-config/cost/model_pricing.yaml.

## check_recursion_budget_for_response_format

Static regex scan of agents/sector_teams/{quant,qual}_analyst.py. For every file using response_format= in create_react_agent, asserts recursion_limit is NOT bare 'MAX_ITERATIONS * 2' (must include +N buffer for the post-loop structured-extraction call). Catches PR #78's exact failure mode at config-walk time.

Both checks WARN (don't FAIL) when sibling repos aren't checked out (CI / restricted environments); this preserves the preflight's "useful even when partial" property.

Validation against current state (post PR #77 + #78):

    [OK] price_cards_cover_all_models          3 runtime models map to cards
    [OK] recursion_budget_for_response_format  2 ReAct sites buffered

8 new tests in test_sf_preflight.py covering happy path, failure path (reproducing today's exact incidents in tmp sibling layout), absent-sibling skip, and the snapshot-suffix-normalization round-trip. 403 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
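As a rough illustration of the price-card check described in the commit message, here is a minimal sketch. It assumes the snapshot suffix is a trailing `-YYYYMMDD` date stamp and that price cards are keyed by the normalized model name; neither detail is confirmed by the PR. The function names and the `(status, detail)` return shape are hypothetical, and the real check reads `universe.yaml` and `model_pricing.yaml` rather than taking in-memory arguments.

```python
import re
from typing import Iterable, Optional, Set, Tuple

# Assumption: a snapshot suffix is a trailing "-YYYYMMDD" date stamp.
_SNAPSHOT_SUFFIX = re.compile(r"-\d{8}$")


def normalize_model_for_pricing(model: str) -> str:
    """Strip a trailing -YYYYMMDD snapshot suffix, if one is present."""
    return _SNAPSHOT_SUFFIX.sub("", model)


def check_price_cards(
    runtime_models: Iterable[str],
    price_cards: Optional[Set[str]],
) -> Tuple[str, str]:
    """Return (status, detail), where status is 'OK', 'WARN', or 'FAIL'.

    price_cards is None when the pricing config is not available
    (for example, the sibling repo is not checked out).
    """
    models = list(runtime_models)
    if price_cards is None:
        return "WARN", "pricing config not found; check skipped"
    missing = sorted(
        m for m in models if normalize_model_for_pricing(m) not in price_cards
    )
    if missing:
        return "FAIL", "no price card for: " + ", ".join(missing)
    return "OK", f"{len(models)} runtime models map to cards"


if __name__ == "__main__":
    # Hypothetical card key; the real keys live in model_pricing.yaml.
    cards = {"claude-haiku-4-5"}
    print(check_price_cards(["claude-haiku-4-5-20251001"], cards))
```

Returning a status instead of raising lets a caller downgrade a missing sibling repo to a warning while still failing hard on an uncovered model.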
cipher813 added a commit that referenced this pull request May 6, 2026
* refactor(iam): take ownership of shared orchestration roles

Step Functions execution role + EventBridge cron role move here from alpha-engine/infrastructure/iam/. Their grants are derived from code that lives in this repo (SF JSON Lambda invoke targets, EC2 instances the SF SSMs, EventBridge SFN target ARNs); co-locating the codified IAM with the source of those grants tightens the coupling so a single PR can change SF behavior + matching IAM atomically.

Files added:
- infrastructure/iam/alpha-engine-step-functions-role.json
- infrastructure/iam/alpha-engine-eventbridge-sfn-role.json
- infrastructure/iam/check-drift.py (flat-layout variant of alpha-engine's directory-per-role drift checker)
- infrastructure/iam/README.md (documents which roles this repo owns + the single-writer rule)
- .github/workflows/iam-drift-check.yml (PR + daily 09:30 UTC + manual)

Files updated:
- infrastructure/deploy_step_function.sh: drops the surviving inline put-role-policy block against alpha-engine-step-functions-role (PR #170 dropped the EB-SFN twin; PR #151 dropped the daily-script twin; this completes the trio). The script kept a stale narrower policy that clobbered ssm:DescribeInstanceInformation + ec2:StopInstances + the trading-instance SSM ARN every Saturday deploy. Trust policy + create-role bootstrap stay (one-time setup).
- infrastructure/add-ssm-policy.sh: drops alpha-engine-executor-role + alpha-engine-predictor-role from the ROLES list. Both now have alpha-engine-ssm-read codified in their home repos (executor: already; predictor: covered by separate PR codifying its existing live grant). Script remains the writer for non-codified Lambda execution roles only.

OIDC trust policy on github-actions-iam-drift-check widened live to also permit repo:cipher813/alpha-engine-data so the new drift-check workflow here can authenticate with the existing OIDC role.

Companion PRs:
- alpha-engine #137: removes the codified directories, updates the cross-repo foreign-writer guard.
- alpha-engine-predictor (separate): codifies existing live alpha-engine-ssm-read grant on predictor-role.

Supersedes alpha-engine-data #171 (which only addressed the Saturday script's inline write).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: re-trigger after OIDC trust + scope widening

* ci: re-trigger pull_request workflow after OIDC widening

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Today's incident chain proved the existing dry-run/stub-LLM gate catches structural issues but is by-design blind to runtime LLM behavior — bugs in cost-telemetry lookup and recursion budget only surface when real LLM calls happen.
Two new static checks (zero LLM cost) catch today's exact failure modes at preflight time:
`check_price_cards_cover_all_models`
Walks runtime model names from research config (`per_stock_model`, `strategic_model`, `_FALLBACK_AGENT_MODEL_NAMES`), normalizes via the same snapshot-suffix strip PR #77 added, and asserts each maps to a card in `alpha-engine-config/cost/model_pricing.yaml`. Pre-empts `PriceCardLookupError`.

`check_recursion_budget_for_response_format`
Static regex scan of `agents/sector_teams/{quant,qual}_analyst.py`. For every site using `response_format=`, asserts `recursion_limit` includes a `+N` buffer (not bare `MAX_ITERATIONS * 2`). Pre-empts `GraphRecursionError` from the structured-extraction call.

Both WARN (don't FAIL) when sibling repos aren't checked out, which preserves the "useful even when partial" property.
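The recursion-budget scan could look roughly like the sketch below. The regexes, the function name, and the `(status, detail)` return shape are illustrative assumptions, not the PR's actual implementation. For self-containment it takes a mapping of file name to source text, whereas the real check reads the analyst files from a sibling checkout.

```python
import re
from typing import Dict, Optional, Tuple

# Any use of the response_format= keyword marks a structured-output site.
_RESPONSE_FORMAT = re.compile(r"\bresponse_format\s*=")

# Bare form: recursion_limit set to MAX_ITERATIONS * 2 with no "+ N" after it.
# Handles both dict style ("recursion_limit": ...) and kwarg style (=).
_BARE_LIMIT = re.compile(
    r"recursion_limit[\"']?\s*[:=]\s*MAX_ITERATIONS\s*\*\s*2\b(?!\s*\+\s*\d)"
)


def check_recursion_budget(
    sources: Optional[Dict[str, str]],
) -> Tuple[str, str]:
    """Return (status, detail), where status is 'OK', 'WARN', or 'FAIL'.

    sources maps file name to file text; None means the files could not
    be located (for example, the sibling repo is not checked out).
    """
    if sources is None:
        return "WARN", "analyst sources not found; check skipped"
    sites = 0
    offenders = []
    for name, text in sources.items():
        if not _RESPONSE_FORMAT.search(text):
            continue  # no structured-extraction call, so no buffer needed
        sites += 1
        if _BARE_LIMIT.search(text):
            offenders.append(name)
    if offenders:
        return "FAIL", "bare MAX_ITERATIONS * 2 in: " + ", ".join(sorted(offenders))
    return "OK", f"{sites} ReAct sites buffered"


if __name__ == "__main__":
    snippet = (
        "agent = create_react_agent(llm, tools, response_format=Output)\n"
        'agent.invoke(state, config={"recursion_limit": MAX_ITERATIONS * 2})\n'
    )
    print(check_recursion_budget({"quant_analyst.py": snippet}))
```

A regex scan like this is deliberately coarse: it works per file rather than per call site, so an equivalent limit written another way (say, a named constant) would go unnoticed. That trade-off keeps the check free of heavy imports.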
Validation against current state
Both pass after PR #77 + #78 merged.
Test plan
test_sf_preflight.py: happy path, failure path (reproducing today's exact incidents in tmp sibling layout), absent-sibling skip, snapshot-suffix normalization round-trip

🤖 Generated with Claude Code