fix(skill/review): enforce parallel agent dispatch for weaker models #3276
wenshao wants to merge 4 commits into QwenLM:main
Strengthen the /review skill's Step 4 instructions so models that previously serialized the review agents now reliably launch all of them in a single assistant turn. Adds a prominent callout with CORRECT/WRONG ASCII examples, an explicit self-check, and surfaces the rule in the top-level "Critical rules" list so it is seen before Step 4.
Two small changes that together make /review-style parallel multi-agent flows more reliable on OpenAI-compatible endpoints (primarily DashScope Qwen code models):

1. Set parallel_tool_calls=true whenever tools are present. Some providers and models, notably Qwen code models, default to sequential tool dispatch when this flag is unset, which serializes multi-agent flows (e.g. the /review skill's Step 4 launching 5 review agents in one turn) into one-agent-per-turn round trips. Providers that do not recognize this parameter simply ignore it.
2. Retry on DashScope's server-side "function.arguments must be in JSON format" 400 error. This rejection happens when the model generates a tool call whose arguments string cannot be parsed as JSON. That is a non-deterministic model output defect, not a user error, so the same request usually succeeds on retry. Previously 400s were never retried; now this specific class flows through the existing retryWithBackoff path while all other 400s still fail fast.
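The two changes described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `ChatRequest` shape and the helper names `withParallelToolCalls` and `isRetryable400` are hypothetical, and treating an empty `tools` array as "no tools" is an assumption. Only the `parallel_tool_calls` parameter and the quoted error message come from the description.

```typescript
// Hypothetical request shape; only `tools` and `parallel_tool_calls` matter here.
interface ChatRequest {
  model: string;
  messages: unknown[];
  tools?: unknown[];
  parallel_tool_calls?: boolean;
}

// Change 1: opt in to parallel tool dispatch whenever tools are present.
// Assumption: an empty tools array counts as "no tools".
function withParallelToolCalls(req: ChatRequest): ChatRequest {
  if (!req.tools || req.tools.length === 0) {
    return req;
  }
  return { ...req, parallel_tool_calls: true };
}

// Change 2: only the malformed-arguments rejection is treated as retryable;
// every other 400 still fails fast.
const MALFORMED_ARGS_MESSAGE = 'function.arguments must be in JSON format';

function isRetryable400(status: number, message: string): boolean {
  return status === 400 && message.includes(MALFORMED_ARGS_MESSAGE);
}
```

A caller would presumably consult a predicate like `isRetryable400` before handing the request to the existing retryWithBackoff path, so that unrelated 400s are surfaced immediately.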
Summary
Strengthen the /review skill's Step 4 instructions so that qwen3.6-plus (and other models that were previously serializing the review agents) now reliably launches all of them in a single assistant turn.

Why
On qwen3.6-plus, Step 4 of /review was launching the 5 review agents sequentially (one per assistant turn) instead of dispatching them all in a single turn. The original phrasing ("invoking all tools in a single response") was too abstract for the model to consistently follow. Sequential dispatch multiplies review latency ~5× for no benefit, since none of the agents depend on another's output.

The new phrasing uses the concrete-example + anti-pattern + self-check style, which weaker models respond to more reliably, without changing any actual review logic.
Scope
Prompt-only change. This PR edits a single Markdown file (packages/core/src/skills/bundled/review/SKILL.md) that is loaded as a runtime prompt for the /review skill. No TypeScript, no build changes, no tests to run; the change takes effect the next time the skill is loaded.

Test plan
Run /review <pr-number> on qwen3.6-plus after this lands and confirm all 5 agents launch in the same assistant turn
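
One way to make the "same assistant turn" check less manual is to inspect a recorded transcript. The sketch below assumes the transcript is available as OpenAI-compatible chat messages, where an assistant message carries a `tool_calls` array. The helper name `maxAgentCallsInOneTurn` and the tool name `'agent'` are hypothetical; the actual name of the agent-launch tool is not stated in this PR.

```typescript
// Minimal OpenAI-compatible message shapes, reduced to the fields used here.
interface ToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

interface Message {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content?: string | null;
  tool_calls?: ToolCall[];
}

// Returns the largest number of agent-launch tool calls found in any single
// assistant message. A value of 5 means all five review agents were
// dispatched in one turn; a value of 1 indicates serialized dispatch.
function maxAgentCallsInOneTurn(messages: Message[], agentToolName: string): number {
  let max = 0;
  for (const message of messages) {
    if (message.role !== 'assistant' || !message.tool_calls) {
      continue;
    }
    const count = message.tool_calls.filter(
      (call) => call.function.name === agentToolName,
    ).length;
    if (count > max) {
      max = count;
    }
  }
  return max;
}
```

For a passing run, `maxAgentCallsInOneTurn(transcript, 'agent')` would be expected to equal 5, assuming the tool name matches the one the skill actually uses.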