[research] Two-loop transformer beats single-loop by 21pts on SWE-bench #151

2026-06-18T06:18:44Z

github-actions[bot]
Bot Jun 18, 2026

🔬 The Finding

LoopCoder-v2 (Jian Yang et al., June 17 2026) trains 7B "Parallel Loop Transformers", models that repeatedly apply shared blocks, and reports a sharply non-monotonic effect: the two-loop variant lifts SWE-bench Verified from 43.0 to 64.4 (+21.4 pts) and Multi-SWE from 14.0 to 31.0 (+17.0 pts) over the non-looped baseline. Three or more loops regress: later loops produce diminishing, oscillatory updates, so positional-mismatch costs come to dominate.
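To make "repeatedly apply shared blocks" concrete, here is a minimal toy sketch of weight sharing across loops. It is not the paper's Parallel Loop Transformer: `looped_forward`, `toy_block`, and the residual-update form are hypothetical stand-ins chosen for illustration, and the "parallel" aspect is not modeled at all. In this sketch's convention (an assumption), `n_loops=1` is a single ordinary pass and `n_loops=2` reuses the same block a second time.

```python
from typing import Callable, List

Vector = List[float]


def looped_forward(
    h: Vector,
    shared_block: Callable[[Vector], Vector],
    n_loops: int,
) -> Vector:
    """Apply the SAME block n_loops times, adding its output residually.

    No new parameters are introduced per loop; only compute grows.
    """
    for _ in range(n_loops):
        update = shared_block(h)
        h = [x + u for x, u in zip(h, update)]
    return h


def toy_block(h: Vector) -> Vector:
    # Hypothetical stand-in for a transformer block:
    # proposes an update that moves each value halfway toward 1.0.
    return [0.5 * (1.0 - x) for x in h]


one_pass = looped_forward([0.0], toy_block, n_loops=1)
two_pass = looped_forward([0.0], toy_block, n_loops=2)
```

In this toy each extra loop simply shrinks the update; it does not reproduce the oscillation or positional-mismatch effects the post describes, which depend on details of the real architecture.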

⚙️ What It Means for Agentic Workflows

One extra reasoning pass is the sweet spot. For teams evaluating code agents on real repos, LoopCoder-v2 (7B, two loops) now matches or exceeds models several times its size on agentic SE tasks — worth benchmarking against your current agent before reaching for a larger, costlier model.
"More compute at inference" ≠ better agents. The non-monotonic loop result is a concrete reminder that blindly stacking self-refinement rounds or self-critique loops in your workflow can hurt rather than help; measure at each iteration.
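The "measure at each iteration" advice can be sketched as a small harness that scores the candidate after every refinement round and keeps the best one, rather than assuming the last round wins. Everything here is a hypothetical illustration: `best_round`, `refine`, and `score` are placeholder names, and in practice `score` would be whatever evaluation you already trust (for example, a test-suite pass rate).

```python
from typing import Callable, List, Tuple


def best_round(
    draft: str,
    refine: Callable[[str], str],
    score: Callable[[str], float],
    max_rounds: int = 4,
) -> Tuple[int, str, List[float]]:
    """Run up to max_rounds refinement passes, scoring after each one.

    Returns (index of best round, best candidate, score per round).
    Round 0 is the unrefined draft. Ties keep the earlier round.
    """
    candidate = draft
    scores = [score(candidate)]
    best_idx, best_candidate = 0, candidate
    for round_idx in range(1, max_rounds + 1):
        candidate = refine(candidate)
        scores.append(score(candidate))
        if scores[-1] > scores[best_idx]:
            best_idx, best_candidate = round_idx, candidate
    return best_idx, best_candidate, scores


# Toy stand-ins (assumptions): each "refinement" appends a character,
# and the score peaks when the text is exactly 3 characters long.
def toy_refine(text: str) -> str:
    return text + "!"


def toy_score(text: str) -> float:
    return -abs(len(text) - 3)


idx, best, per_round = best_round("a", toy_refine, toy_score)
```

With these toy functions the score rises for two rounds and then falls, so the harness returns round 2 instead of the final round. Logging `per_round` for a real agent shows at which round extra refinement stops paying off.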

🔗 Source

LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling — June 17, 2026

Generated by Daily Agentic AI Research Digest · 148.6 AIC · ⌖ 12.5 AIC · ⊞ 24.2K

expires on Jun 26, 2026, 6:18 AM UTC
