perf(review): bump max_concurrent_reviewers 3 → 10 to unblock review_dimension fan-out #19
Merged
…dimension fan-out

Production data on a recent PR showed 8 review_dimensions running ~25 min each but throttled to a concurrency=3 semaphore. Wall-clock time was ~3× per-dimension cost (≥75 min for the review phase alone) even though we were paying nothing during the wait. This was the single biggest reason `review` runs were hitting the 7200s caller timeout in github-buddy.

Two changes:

1. Default raised 3 → 10. With 6–8 dimensions per typical PR, all of them can now run in parallel and the phase is bounded by the slowest single dimension instead of the semaphore.
2. Surfaced the value as `PR_AF_MAX_CONCURRENT_REVIEWERS` so a deployment can dial it back if its provider hits rate limits.

The stagger_delay_seconds (2s) is unchanged and remains the first line of defense against burst-rate-limit hits.

The previous comment on this field cited "OpenRouter or other rate-limited providers". At 10 concurrent on Kimi K2.5 we're well within OpenRouter's per-key limits, and any provider that can't sustain that for a single review session would need configuration anyway.

Tests: test_budget_config_defaults updated to pin the new default and to clear PR_AF_MAX_CONCURRENT_REVIEWERS so the assertion reflects the in-code default rather than the runtime env override. Full suite green except for one pre-existing test_cost_tracker failure on main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why
Production data on a recent review run showed 8 `review_dimension` phases each taking ~25 min of LLM work but throttled by `max_concurrent_reviewers = 3`. Wall-clock time on the review phase was ~3× per-dimension cost (≥75 min), with several dimensions blocked behind the semaphore doing nothing. This was the dominant reason `pr-af.review` runs were hitting the 7200s `agent_call_timeout` in github-buddy and burning the parent reasoner.
Per-dimension LLM time is what it is until the meta-selectors get refactored. But we were leaving free wall-clock time on the table by serializing dimensions that can already run in parallel.
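The ~3× figure follows from simple wave arithmetic. The sketch below is a back-of-the-envelope estimate, not code from the repository: the function name is hypothetical, and it assumes every dimension takes the same time and ignores the stagger delay.

```python
import math


def review_phase_wall_minutes(dimensions: int, minutes_per_dimension: float, concurrency: int) -> float:
    """Estimate wall-clock minutes when equal-length dimensions run in full waves.

    Simplifying assumption: all dimensions take the same time, so the phase
    runs in ceil(dimensions / concurrency) back-to-back waves.
    """
    waves = math.ceil(dimensions / concurrency)
    return waves * minutes_per_dimension


# 8 dimensions at ~25 min each, as reported in the PR description
print(review_phase_wall_minutes(8, 25, 3))   # 3 waves -> 75
print(review_phase_wall_minutes(8, 25, 10))  # 1 wave  -> 25
```

With a cap of 3, eight dimensions need three waves; any cap of 8 or more collapses that to one.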
What
- Default `max_concurrent_reviewers` raised 3 → 10.
- Surfaced as `PR_AF_MAX_CONCURRENT_REVIEWERS` so a deployment can dial it back if its provider has tighter limits.
- `stagger_delay_seconds = 2.0` is unchanged. The 2-second stagger remains the first-line defense against burst rate-limit hits; the semaphore is just no longer the bottleneck.
The original comment cited "OpenRouter or other rate-limited providers" as the reason for the cap. At 10 concurrent on Kimi K2.5 we're well within OpenRouter's per-key limits; a deployment running against a provider with lower rate limits can lower the value via the env var without a code change.
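One way an env-overridable default like this could be wired up is sketched below. Only the env var name, the field names, and the values 10 and 2.0 come from this PR; the class name `BudgetConfigSketch`, the helper, and the validation are assumptions for illustration and may not match the real config code.

```python
import os
from dataclasses import dataclass, field

# In-code default introduced by this PR
DEFAULT_MAX_CONCURRENT_REVIEWERS = 10


def _max_concurrent_from_env() -> int:
    """Hypothetical helper: use the env override if set, else the in-code default."""
    raw = os.environ.get("PR_AF_MAX_CONCURRENT_REVIEWERS")
    if raw is None or not raw.strip():
        return DEFAULT_MAX_CONCURRENT_REVIEWERS
    value = int(raw)
    if value < 1:
        # Assumed guard: a semaphore of size 0 would block every reviewer
        raise ValueError("PR_AF_MAX_CONCURRENT_REVIEWERS must be >= 1")
    return value


@dataclass
class BudgetConfigSketch:
    """Illustrative stand-in for the real config object (name is hypothetical)."""

    # Resolved at construction time so each new config sees the current environment
    max_concurrent_reviewers: int = field(default_factory=_max_concurrent_from_env)
    stagger_delay_seconds: float = 2.0
```

A deployment on a tighter provider would then set, for example, `PR_AF_MAX_CONCURRENT_REVIEWERS=4` in its environment and leave the code untouched.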
Expected impact
Worst case observed today: review phase 75–100 min wall, total run 110+ min, hitting the 7200s caller timeout.
After this change: review phase ≈ slowest single dimension (~25–40 min), total run ~50–80 min. Should stop hitting the timeout on normal-sized PRs.
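The "bounded by the slowest single dimension" claim can be checked with a small simulation. This is a toy model, not the project's scheduler: `run_dimensions` is a hypothetical name, sleeps stand in for LLM calls, and the stagger is assumed to delay each launch by `index * stagger_delay`.

```python
import asyncio
import time


async def run_dimensions(durations, max_concurrent, stagger_delay):
    """Run simulated dimensions behind a semaphore and return elapsed seconds."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(index, duration):
        # Assumed stagger model: the i-th dimension launches i * stagger_delay late
        await asyncio.sleep(index * stagger_delay)
        async with semaphore:
            # Stand-in for the dimension's LLM work
            await asyncio.sleep(duration)

    start = time.perf_counter()
    await asyncio.gather(*(run_one(i, d) for i, d in enumerate(durations)))
    return time.perf_counter() - start


if __name__ == "__main__":
    # Scaled down: 0.1 s stands in for one ~25 min dimension
    durations = [0.1] * 8
    capped = asyncio.run(run_dimensions(durations, max_concurrent=3, stagger_delay=0.001))
    uncapped = asyncio.run(run_dimensions(durations, max_concurrent=10, stagger_delay=0.001))
    print(f"cap=3: {capped:.2f}s  cap=10: {uncapped:.2f}s")
```

With the cap at 3 the run takes roughly three dimension-lengths; with the cap at 10 it takes roughly one dimension-length plus the accumulated stagger.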
This is the highest-leverage single-line speedup, but not the last one: `meta_semantic` / `meta_mechanical` / `meta_systemic` are each one monolithic 14–38 min LLM call and would benefit from a separate fast-proposal + parallel-validation refactor. That's a bigger change and lands later.
Test plan
- `test_budget_config_defaults` updated to pin the new default and clear `PR_AF_MAX_CONCURRENT_REVIEWERS`, so it asserts the in-code default rather than the runtime override
- Full suite green (one pre-existing `test_cost_tracker` failure on `main`, unrelated to this PR)

🤖 Generated with Claude Code
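For reference, the env-clearing pattern described in the test plan might look like the sketch below. It is not the actual `test_budget_config_defaults`: the loader and the test names are hypothetical, and it uses `unittest.mock.patch.dict` from the standard library to isolate the environment.

```python
import os
from unittest import mock

# In-code default pinned by this PR
DEFAULT_MAX_CONCURRENT_REVIEWERS = 10
ENV_VAR = "PR_AF_MAX_CONCURRENT_REVIEWERS"


def load_max_concurrent_reviewers() -> int:
    """Hypothetical loader: env override if present, else the in-code default."""
    return int(os.environ.get(ENV_VAR, DEFAULT_MAX_CONCURRENT_REVIEWERS))


def test_default_ignores_runtime_override() -> None:
    # Rebuild the environment without the override so a value exported in CI
    # or a developer shell cannot leak into the assertion
    env_without_override = {k: v for k, v in os.environ.items() if k != ENV_VAR}
    with mock.patch.dict(os.environ, env_without_override, clear=True):
        assert load_max_concurrent_reviewers() == 10


def test_env_override_is_honoured() -> None:
    # patch.dict restores the original environment when the block exits
    with mock.patch.dict(os.environ, {ENV_VAR: "4"}):
        assert load_max_concurrent_reviewers() == 4
```

Without the clearing step, the first test would pass or fail depending on whatever the host environment happens to export.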