feat(preflight): add Research-side static checks (price cards + recursion budget)#137

Merged
cipher813 merged 1 commit into main from feat/sf-preflight-research-checks
May 2, 2026
@cipher813 (Owner)

Summary

Today's incident chain showed that the existing dry-run/stub-LLM gate catches structural issues but is blind by design to runtime LLM behavior: bugs in the cost-telemetry lookup and the recursion budget only surface when real LLM calls happen.

Two new static checks (zero LLM cost) catch today's exact failure modes at preflight time:

check_price_cards_cover_all_models

Walks runtime model names from research config (per_stock_model, strategic_model, _FALLBACK_AGENT_MODEL_NAMES), normalizes via the same snapshot-suffix strip PR #77 added, and asserts each maps to a card in alpha-engine-config/cost/model_pricing.yaml. Pre-empts PriceCardLookupError.
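
The coverage check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names, the assumption that a snapshot suffix is a trailing `-YYYYMMDD`, and the assumption that price cards are keyed by the stripped name are all hypothetical. The real check reads the cards from model_pricing.yaml; a plain set stands in for them here.

```python
import re

# Assumption: a snapshot suffix is a trailing "-YYYYMMDD", as in
# "claude-haiku-4-5-20251001". The production normalizer may differ.
_SNAPSHOT_SUFFIX = re.compile(r"-\d{8}$")


def normalize_model_name(name: str) -> str:
    """Strip a trailing date-style snapshot suffix, if there is one."""
    return _SNAPSHOT_SUFFIX.sub("", name)


def find_uncovered_models(runtime_models, price_cards) -> list[str]:
    """Return the runtime model names whose normalized form has no price card."""
    cards = set(price_cards)
    return sorted(m for m in runtime_models if normalize_model_name(m) not in cards)


# Hypothetical inputs: one model from the incident, one with no card at all.
runtime = ["claude-haiku-4-5-20251001", "hypothetical-model-without-card"]
cards = {"claude-haiku-4-5"}
print(find_uncovered_models(runtime, cards))  # ['hypothetical-model-without-card']
```

A non-empty result is what the preflight would report as a failure, before any real LLM call has a chance to raise PriceCardLookupError.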

check_recursion_budget_for_response_format

Static regex scan of agents/sector_teams/{quant,qual}_analyst.py. For every site using response_format=, asserts recursion_limit includes a +N buffer (not bare MAX_ITERATIONS * 2). Pre-empts GraphRecursionError from the structured-extraction call.
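
A static scan of this kind might look like the sketch below. The regular expressions and the function name are assumptions made for illustration; the patterns in the actual check may be stricter or looser, and a regex scan in general can miss limits that are computed elsewhere and passed in by variable.

```python
import re

# Does the source pass response_format= anywhere (e.g. to create_react_agent)?
_USES_RESPONSE_FORMAT = re.compile(r"\bresponse_format\s*=")
# Capture the expression given for recursion_limit, written either as a
# keyword argument (recursion_limit=...) or a dict entry ("recursion_limit": ...).
_RECURSION_LIMIT = re.compile(r"""["']?recursion_limit["']?\s*[:=]\s*([^,\n})]+)""")
# The unbuffered form that caused the incident.
_BARE_BUDGET = re.compile(r"^MAX_ITERATIONS\s*\*\s*2$")


def find_unbuffered_limits(source: str) -> list[str]:
    """Return recursion_limit expressions that lack a +N buffer.

    Files that never use response_format= are ignored, since they make no
    post-loop structured-extraction call.
    """
    if not _USES_RESPONSE_FORMAT.search(source):
        return []
    exprs = [m.group(1).strip() for m in _RECURSION_LIMIT.finditer(source)]
    return [e for e in exprs if _BARE_BUDGET.match(e)]


# Hypothetical source snippets, not copied from the analyst modules.
unbuffered = (
    "agent = create_react_agent(llm, tools, response_format=Report)\n"
    'config = {"recursion_limit": MAX_ITERATIONS * 2}\n'
)
buffered = unbuffered.replace("* 2}", "* 2 + 3}")
print(find_unbuffered_limits(unbuffered))  # ['MAX_ITERATIONS * 2']
print(find_unbuffered_limits(buffered))    # []
```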

Both checks WARN rather than FAIL when sibling repos aren't checked out, which preserves the preflight's "useful even when partial" property.
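
One way to get this WARN-on-absent behavior is to wrap each check in a small helper that looks for the sibling file first. The `CheckResult` type, the `check_with_sibling` helper, and the status strings below are hypothetical and only illustrate the pattern; the preflight's real result type is not shown in this PR description.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class CheckResult:
    name: str
    status: str  # "OK", "WARN", or "FAIL"
    detail: str


def check_with_sibling(
    name: str,
    sibling_file: Path,
    validate: Callable[[str], list],
) -> CheckResult:
    """Run validate() on a sibling-repo file, downgrading absence to WARN."""
    if not sibling_file.is_file():
        # Sibling repo not checked out (CI, restricted environment):
        # report it, but do not block the rest of the preflight.
        return CheckResult(name, "WARN", f"{sibling_file} not found; check skipped")
    problems = validate(sibling_file.read_text())
    if problems:
        return CheckResult(name, "FAIL", "; ".join(problems))
    return CheckResult(name, "OK", "no problems found")
```

With this shape, a missing checkout and a genuine violation stay distinguishable in the report, so a partial environment still yields results for every check it can run.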

Validation against current state

[OK]   price_cards_cover_all_models     3 runtime models map to cards
[OK]   recursion_budget_for_response_format   2 ReAct sites buffered

Both pass after PR #77 + #78 merged.

Test plan

  • 8 new tests in test_sf_preflight.py: happy path, failure path (reproducing today's exact incidents in tmp sibling layout), absent-sibling skip, snapshot-suffix normalization round-trip
  • Full suite: 403 passed
  • Live run on Friday before next Saturday SF — should catch any drift before launch

🤖 Generated with Claude Code

…sion budget)

Today's incident chain proved the existing preflight catches structural
issues but is by-design blind to runtime LLM behavior. Specifically:

  PR #77 (PriceCardLookupError): runtime model 'claude-haiku-4-5-20251001'
    didn't normalize to any price card → Research SF crash.
  PR #78 (GraphRecursionError): ReAct sites used response_format= but
    recursion_limit was bare MAX_ITERATIONS * 2 → Research SF crash.

Both are catchable by static config-walk preflight (zero LLM cost):

## check_price_cards_cover_all_models

Walks every runtime model name (universe.yaml's per_stock_model +
strategic_model + research_graph.py's _FALLBACK_AGENT_MODEL_NAMES dict),
normalizes via the same snapshot-suffix strip the production cost
tracker uses (PR #77's _normalize_model_for_pricing — duplicated here
to avoid heavy imports), and asserts each maps to a card in
alpha-engine-config/cost/model_pricing.yaml.

## check_recursion_budget_for_response_format

Static regex scan of agents/sector_teams/{quant,qual}_analyst.py.
For every file using response_format= in create_react_agent, asserts
recursion_limit is NOT bare 'MAX_ITERATIONS * 2' (must include +N
buffer for the post-loop structured-extraction call). Catches PR #78's
exact failure mode at config-walk time.

Both checks WARN (don't FAIL) when sibling repos aren't checked out
(CI / restricted environments) — preserves the preflight's "useful
even when partial" property.

Validation against current state (post PR #77 + #78):
  [OK]   price_cards_cover_all_models   3 runtime models map to cards
  [OK]   recursion_budget_for_response_format   2 ReAct sites buffered

8 new tests in test_sf_preflight.py covering happy path, failure path
(reproducing today's exact incidents in tmp sibling layout), absent-
sibling skip, and the snapshot-suffix-normalization round-trip.

403 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit 79d95e1 into main May 2, 2026
1 check passed
@cipher813 cipher813 deleted the feat/sf-preflight-research-checks branch May 2, 2026 17:55
cipher813 added a commit that referenced this pull request May 6, 2026
* refactor(iam): take ownership of shared orchestration roles

Step Functions execution role + EventBridge cron role move here from
alpha-engine/infrastructure/iam/. Their grants are derived from code
that lives in this repo (SF JSON Lambda invoke targets, EC2 instances
the SF SSMs, EventBridge SFN target ARNs) — co-locating the codified
IAM with the source of those grants tightens the coupling so a single
PR can change SF behavior + matching IAM atomically.

Files added:
- infrastructure/iam/alpha-engine-step-functions-role.json
- infrastructure/iam/alpha-engine-eventbridge-sfn-role.json
- infrastructure/iam/check-drift.py (flat-layout variant of
  alpha-engine's directory-per-role drift checker)
- infrastructure/iam/README.md (documents which roles this repo owns
  + the single-writer rule)
- .github/workflows/iam-drift-check.yml (PR + daily 09:30 UTC + manual)

Files updated:
- infrastructure/deploy_step_function.sh — drops the surviving inline
  put-role-policy block against alpha-engine-step-functions-role
  (PR #170 dropped the EB-SFN twin; PR #151 dropped the daily-script
  twin; this completes the trio). The script kept a stale narrower
  policy that clobbered ssm:DescribeInstanceInformation +
  ec2:StopInstances + the trading-instance SSM ARN every Saturday
  deploy. Trust policy + create-role bootstrap stay (one-time setup).
- infrastructure/add-ssm-policy.sh — drops alpha-engine-executor-role
  + alpha-engine-predictor-role from the ROLES list. Both now have
  alpha-engine-ssm-read codified in their home repos (executor:
  already; predictor: covered by separate PR codifying its existing
  live grant). Script remains the writer for non-codified Lambda
  execution roles only.

OIDC trust policy on github-actions-iam-drift-check widened live to
also permit repo:cipher813/alpha-engine-data so the new drift-check
workflow here can authenticate with the existing OIDC role.

Companion PRs:
- alpha-engine #137 — removes the codified directories, updates the
  cross-repo foreign-writer guard.
- alpha-engine-predictor (separate) — codifies existing live
  alpha-engine-ssm-read grant on predictor-role.

Supersedes alpha-engine-data #171 (which only addressed the Saturday
script's inline write).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: re-trigger after OIDC trust + scope widening

* ci: re-trigger pull_request workflow after OIDC widening

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>