💡 Complexity-Aware Dynamic Model Routing for Cost-Optimized PR Review #564

github-actions[bot] (Bot) · 2026-06-11T00:59:48Z

Summary

Implement a PR complexity scorer in the review pipeline that analyzes PR characteristics (lines changed, file count, language diversity, security-label presence, test coverage delta) before model selection. Route simple PRs to cheaper model tiers and reserve the expensive models (Opus, Fable) for genuinely complex reviews. Industry reports cite cost reductions of 50-80% from complexity-based routing; the figure for this pipeline still has to be measured.

Market Signal

LLM model routing is a rapidly maturing practice in 2026. Industry research reports that well-designed routing systems can outperform even the strongest individual models while reducing costs by 50-80% (Model Routing Strategies 2026). Anthropic's adaptive thinking in Opus 4.8 calibrates reasoning effort per turn internally. Multiple frameworks (MindStudio, Burnwise) now offer production-grade routing. A typical implementation is described as 50-100 lines of code: a cost-aware router that picks the cheapest model likely to succeed and escalates only on low-confidence results.
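A minimal sketch of the "cheapest model first, escalate on low confidence" pattern described above, written in bash to match the pipeline. Everything here is an assumption for illustration: `route_with_escalation` and `run_review` are hypothetical names, the tier labels are not real model IDs, and the 0-100 confidence score stands in for whatever signal a real review step would return.

```shell
#!/usr/bin/env bash

# Hypothetical stub for illustration only: prints an integer confidence (0-100)
# for a given tier label. A real hook would run the review and score it.
run_review() {
  case "$1" in
    haiku)  echo 40 ;;
    sonnet) echo 85 ;;
    *)      echo 95 ;;
  esac
}

# route_with_escalation <threshold> <model>...
# Tries models in the order given (cheapest first) and prints the first one
# whose confidence meets the threshold. If none does, prints the last model
# tried and returns non-zero so the caller can flag the result.
route_with_escalation() {
  local threshold=$1; shift
  local model confidence
  for model in "$@"; do
    confidence=$(run_review "$model")
    if (( confidence >= threshold )); then
      echo "$model"
      return 0
    fi
  done
  echo "$model"
  return 1
}

route_with_escalation 80 haiku sonnet opus   # prints: sonnet
```

The order of the arguments is the cost ordering, so the same function works with any chain as long as it is listed cheapest first.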

User Signal

engine.sh implements basic tier routing (triage→deep→audit) but the initial tier selection is static — every PR gets the same model at each stage regardless of complexity. Issue #553 requests "capability-aware selection" specifically. The Token Cost Observatory (#332, #464) shows cost is actively monitored, suggesting appetite for optimization. The engine's model chain architecture (CLAUDE_TRIAGE_MODEL_CHAIN, etc.) was designed for exactly this kind of dynamic selection.

Technical Opportunity

engine.sh's set_engine_config() defines per-tier model chains but tier assignment in review-one-pr.sh is fixed. A complexity scorer in the preflight step could set environment variables (e.g., PR_COMPLEXITY=simple|medium|complex|critical) that engine.sh uses to adjust model selection:

simple (docs-only, 1-2 files, <50 lines): Haiku triage only, skip deep/audit
medium (standard code changes, <10 files): Sonnet deep, skip audit
complex (multi-language, >10 files, infrastructure): full pipeline with Opus
critical (security-labeled, CI config, agent self-modification): Fable 5 apex tier
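One way the four classes above could be turned into a routing decision, as a rough sketch only. `select_route` is a hypothetical helper, the tier labels are placeholders rather than real model IDs, and the stage list for `critical` is an assumption (the proposal names the model tier but not the stages, so the full pipeline is used here). How engine.sh would actually consume `PR_COMPLEXITY` is not specified in this proposal.

```shell
#!/usr/bin/env bash

# select_route <complexity>
# Prints "<tier-label> <comma-separated stages>" for a PR_COMPLEXITY value.
select_route() {
  case "$1" in
    simple)   echo "haiku triage" ;;               # triage only, skip deep/audit
    medium)   echo "sonnet triage,deep" ;;         # skip audit
    complex)  echo "opus triage,deep,audit" ;;     # full pipeline
    critical) echo "fable triage,deep,audit" ;;    # apex tier; stages assumed
    *)        echo "opus triage,deep,audit" ;;     # unknown value: fail safe
  esac
}

# Example: read the value set by the preflight step, defaulting to "complex".
read -r model stages <<< "$(select_route "${PR_COMPLEXITY:-complex}")"
echo "model=$model stages=$stages"
```

Defaulting an unset or unrecognized value to the full pipeline keeps a scorer bug from silently downgrading a review.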

The scoring function itself is deterministic shell — ~50-100 lines, no LLM needed. Explicit safety floors prevent dangerous under-routing.
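A minimal sketch of what such a deterministic scorer might look like, under stated assumptions. `classify_pr_complexity` and its positional arguments are hypothetical; the thresholds are taken from the list above, and a PR with exactly 10 files (which the list leaves unspecified) is treated as medium here. Gathering the inputs from the PR itself is out of scope for this sketch.

```shell
#!/usr/bin/env bash

# classify_pr_complexity <lines> <files> <langs> <docs_only> <security> <ci> <agent> <infra>
# The last five arguments are 0/1 flags. Prints simple|medium|complex|critical.
classify_pr_complexity() {
  local lines=$1 files=$2 langs=$3 docs_only=$4
  local security=$5 ci=$6 agent=$7 infra=$8

  # Safety floor first: these signals force the top tier regardless of size.
  if (( security || ci || agent )); then
    echo critical
    return
  fi
  # Multi-language, large, or infrastructure changes get the full pipeline.
  if (( langs > 1 || files > 10 || infra )); then
    echo complex
    return
  fi
  # Only small docs-only changes are eligible for the cheapest route.
  if (( docs_only && files <= 2 && lines < 50 )); then
    echo simple
    return
  fi
  echo medium
}

# Example: a 12-line docs-only change to one file.
PR_COMPLEXITY=$(classify_pr_complexity 12 1 1 1 0 0 0 0)
echo "PR_COMPLEXITY=$PR_COMPLEXITY"   # prints: PR_COMPLEXITY=simple
```

Checking the critical signals before any size test means a 3-line change with a security label can never be classified as simple.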

Assessment

Dimension	Score	Rationale
Feasibility	med	Requires defining complexity heuristics and validating against historical data; shell implementation is straightforward but calibration needs iteration
Impact	high	50-80% cost reduction potential on the review pipeline's largest expense; industry-validated pattern
Urgency	med	Cost optimization, not blocking; value increases as review volume grows across org repos

Adversarial Review

Strongest objection: Simple heuristics (line count, file count) may not capture actual review difficulty. A subtle security vulnerability in a 3-line change would be routed to Haiku and missed, creating a false sense of security.

Rebuttal: The multi-tier review pipeline already guards against this — triage (Haiku) examines every PR regardless and flags complexity signals for the deep tier. The routing optimizer would adjust the deep/audit tier model selection, not skip tiers entirely. Adding explicit floors ("never route security-labeled PRs below Sonnet", "never skip audit for infrastructure files") prevents the worst case. The existing pipeline preserves defense-in-depth; complexity routing optimizes within it, not around it.
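The two floors quoted in the rebuttal could be enforced as a final step after any routing decision, roughly as follows. This is a sketch under assumptions: `tier_rank` and `apply_floors` are hypothetical helpers, the tier labels are placeholders rather than real model IDs, and the `skip_audit` flag is an invented way of representing whether the audit stage runs.

```shell
#!/usr/bin/env bash

# Rank of each tier label; unknown labels rank lowest so floors still apply.
tier_rank() {
  case "$1" in
    haiku)  echo 1 ;;
    sonnet) echo 2 ;;
    opus)   echo 3 ;;
    fable)  echo 4 ;;
    *)      echo 0 ;;
  esac
}

# apply_floors <proposed_tier> <security_labeled 0|1> <touches_infra 0|1> <skip_audit 0|1>
# Prints "<tier> <skip_audit>" after applying both floors.
apply_floors() {
  local tier=$1 security=$2 infra=$3 skip_audit=$4

  # Floor 1: never route security-labeled PRs below Sonnet.
  if (( security )) && (( $(tier_rank "$tier") < $(tier_rank sonnet) )); then
    tier=sonnet
  fi
  # Floor 2: never skip audit for infrastructure files.
  if (( infra )); then
    skip_audit=0
  fi
  echo "$tier $skip_audit"
}

apply_floors haiku 1 0 1   # prints: sonnet 1
apply_floors haiku 0 1 1   # prints: haiku 0
```

Keeping the floors in a separate function from the scorer means the heuristics can be recalibrated without touching the guarantees.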

Suggested Next Step

Define a PR complexity heuristic in shell that scores PRs on six dimensions (lines changed, file count, language count, security labels, CI config touched, agent self-modification). Prototype it in a branch, then test it against recent PR data from merged_prs_30d to validate that the score correlates with actual review effort and token consumption.
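A first-pass validation might simply compare mean token consumption per complexity class, as sketched below. The input format is an assumption: the structure of merged_prs_30d is not described here, so this sketch expects a hypothetical tab-separated export with the columns PR number, assigned complexity class, and tokens used. `summarize_by_class` is likewise a hypothetical name.

```shell
#!/usr/bin/env bash

# summarize_by_class <tsv-file>
# Assumed input columns: pr_number <TAB> complexity <TAB> tokens_used
# Prints one line per class present: class <TAB> PR count <TAB> mean tokens.
summarize_by_class() {
  awk -F'\t' '
    { n[$2]++; sum[$2] += $3 }
    END {
      split("simple medium complex critical", order, " ")
      for (i = 1; i <= 4; i++) {
        c = order[i]
        if (n[c] > 0) printf "%s\t%d\t%.0f\n", c, n[c], sum[c] / n[c]
      }
    }' "$1"
}
```

If the scoring is useful, mean tokens should rise from simple to critical; classes that overlap heavily would suggest the thresholds need recalibration.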

don-petry (Maintainer) · 2026-06-11T01:48:58Z

Cross-link. Related to canonical model-selection discussion #413 and tracking issue #553 (capability/cost-aware routing across dev-lead + pr-review, gated on #195). Keep this proposal's routing direction consolidated with that work.
