fix(agents): make loop breaker session-global by TurboTheTurtle · Pull Request #79267 · openclaw/openclaw

TurboTheTurtle · 2026-05-08T05:26:53Z

Summary

Make the global circuit breaker count stable no-progress outcomes across repeated tool-call patterns in the session, instead of only the current tool signature.
Keep changing-output calls and unrelated one-off tools from contributing to the global breaker.
Add detector-level and before-tool-call wrapper coverage proving a different tool is blocked once the session-global threshold is reached.
Add an Unreleased changelog entry for the agent safety fix.

Real behavior proof

Behavior or issue addressed: a session that repeatedly produces stable no-progress tool outcomes should trip the global breaker and block a different next tool, instead of continuing the loop because the next tool has a different name/argument signature.
Real environment tested: local macOS 26.4.1 (25E253) arm64 OpenClaw checkout, Node v25.8.2, pnpm 10.33.2, branch codex/global-loop-breaker at commit 9e76446e98d08d6ccb8963329bca5a09ce8db06e.
Exact steps or command run after this patch:
- pnpm install --frozen-lockfile --offline
- pnpm exec vitest run src/agents/tool-loop-detection.test.ts --reporter=dot
- pnpm test:e2e src/agents/pi-tools.before-tool-call.e2e.test.ts --reporter=dot
- pnpm exec oxfmt --check --threads=1 CHANGELOG.md src/agents/tool-loop-detection.ts src/agents/tool-loop-detection.test.ts src/agents/pi-tools.before-tool-call.e2e.test.ts
- git diff --check
- pnpm check:changelog-attributions
- pnpm tsgo:core
- pnpm tsgo:core:test
- pnpm exec tsx -e '<production before-tool-call smoke: run 30 stable read tool outcomes in one session, then invoke memory_search through the wrapped tool path>'
Evidence after fix: console output after fix from the production before-tool-call smoke:

threshold=30
readExecutions=30
searchExecutions=0
blockedDeniedReason=tool-loop
blockedStatus=blocked
blockedContent=CRITICAL: Session tool calls have repeated identical no-progress outcomes 30 times across tools. Session execution blocked by global circuit breaker to prevent runaway loops.

Detector tests also passed with Test Files 1 passed (1) and Tests 45 passed (45). Before-tool-call e2e tests passed with Test Files 1 passed (1) and Tests 38 passed (38). Formatting, whitespace, changelog attribution, core typecheck, and core test typecheck all passed.

Observed result after fix: after 30 stable no-progress read outcomes in the same session, the next memory_search call was blocked by global_circuit_breaker before its execute function ran (searchExecutions=0), matching the intended cross-tool session-global behavior.
What was not tested: a live model-driven Gateway conversation with external services; the real-behavior smoke used production OpenClaw before-tool-call wrapping and diagnostic session state with local in-process tools.

clawsweeper · 2026-05-08T05:30:00Z

Codex review: needs maintainer review before merge.

Summary
The PR changes agent tool-loop detection to aggregate stable no-progress outcomes across a session/run, adds detector and before-tool-call regression tests, and adds a changelog entry.

Reproducibility: yes. Source inspection on current main gives a high-confidence path: record stable no-progress outcomes for one tool signature, then call a different tool and the existing global breaker reads only the different tool's current-signature streak.

Real behavior proof
Sufficient (terminal): The PR body includes terminal output from an after-fix production before-tool-call smoke showing the different next tool blocked before execution.

Next step before merge
No repair lane is needed because the PR already contains a focused implementation, changelog entry, supplied real behavior proof, and relevant green exact-head checks.

Security
Cleared: The diff is limited to agent loop-detection logic, tests, and changelog text, with no dependency, workflow, secret, permission, or package-resolution changes.

Review details

Best possible solution:

Land the session/run-scoped aggregate breaker with the regression tests and changelog after normal maintainer review.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection on current main gives a high-confidence path: record stable no-progress outcomes for one tool signature, then call a different tool and the existing global breaker reads only the different tool's current-signature streak.

Is this the best way to solve the issue?

Yes. The PR fixes the detector where the contract is implemented, keeps run scoping and same-pattern detectors intact, and covers both the pure detector and the production before-tool-call wrapper path.

What I checked:

Current main breaker is current-signature scoped: detectToolCallLoop computes noProgressStreak from getNoProgressStreak(history, toolName, currentHash) and compares that per-signature count to globalCircuitBreakerThreshold, so a different next tool does not inherit the prior stable no-progress streak. (src/agents/tool-loop-detection.ts:512, 7c31a9aafc1c)
Docs define global behavior: The public loop-detection docs describe globalCircuitBreakerThreshold as the global no-progress breaker threshold across all detectors, making the reported behavior a contract bug rather than a new feature request. Public docs: docs/tools/loop-detection.md. (docs/tools/loop-detection.md:82, 7c31a9aafc1c)
Current main coverage only proves same-pattern blocking: The existing main test covers repeated no-progress calls for one tool pattern, but does not cover a different next tool after the session-level threshold is reached. (src/agents/tool-loop-detection.test.ts:518, 7c31a9aafc1c)
PR implements aggregate session counting: The PR adds getGlobalNoProgressStreak and switches the global breaker branch to compare the aggregate session/run no-progress count, while preserving the existing same-pattern count for poll warnings and critical polling loops. (src/agents/tool-loop-detection.ts:388, 11667aafe5e9)
PR adds focused regression coverage: The detector test proves a different tool is blocked after the session-global threshold, and the before-tool-call e2e test proves the wrapped memory_search execute function is not called after repeated stable read outcomes. (src/agents/pi-tools.before-tool-call.e2e.test.ts:273, 11667aafe5e9)
Real behavior proof and exact-head checks: The PR body includes terminal output from a production before-tool-call smoke showing readExecutions=30, searchExecutions=0, blockedDeniedReason=tool-loop, and exact-head GitHub check runs include a successful Real behavior proof check plus successful lint/type/test/security shards. (11667aafe5e9)

Likely related people:

steipete: Peter Steinberger introduced configurable tool-loop detection and recently maintained exec outcome hashing and loop behavior in the affected detector files. (role: feature-history owner and recent maintainer; confidence: high; commits: 076df941a326, e9bce3f81c3a, 9b2f10dcf8d5; files: src/agents/tool-loop-detection.ts, src/agents/tool-loop-detection.test.ts, docs/tools/loop-detection.md)
vincentkoc: Vincent Koc recently worked on diagnostic event and before-tool-call surfaces that this fix exercises through the wrapped tool path. (role: adjacent diagnostics and hook-path maintainer; confidence: medium; commits: cead2ea4b106, bcdacfa1b3df, 8bff73cfb07f; files: src/agents/pi-tools.before-tool-call.ts, src/agents/pi-tools.before-tool-call.e2e.test.ts, src/infra/diagnostic-events.ts)
akramcodez: Earlier stuck-loop detection and backoff infrastructure history touched the current loop-detection and before-tool-call files. (role: earlier loop/backoff infrastructure contributor; confidence: medium; commits: e5eb5b3e43d7; files: src/agents/tool-loop-detection.ts, src/agents/tool-loop-detection.test.ts, src/agents/pi-tools.before-tool-call.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 7c31a9aafc1c.

TurboTheTurtle · 2026-05-08T06:14:56Z

Addressed the remaining review item in 9e76446e by adding the missing changelog entry.

I also updated the PR body with structured real behavior proof from a production before-tool-call smoke: after 30 stable no-progress read outcomes in one session, the next memory_search call was blocked by the global circuit breaker before its execute function ran (searchExecutions=0). Detector tests, before-tool-call e2e, formatting, diff check, changelog attribution, and tsgo core/core:test all passed. The Real behavior proof check is now passing.

Remaining red check: check-additional-runtime-topology-architecture is failing in pnpm check:architecture on the existing madge cycle through src/agents/agent-scope.ts -> src/agents/model-ref-shared.ts -> ... -> src/plugins/types.ts. That cycle is outside this loop-breaker diff.

TurboTheTurtle · 2026-05-08T08:08:48Z

Rebased this branch onto current upstream/main (29689c62d0) to pick up the upstream import-cycle cleanup that was making check-additional-runtime-topology-architecture fail.

Local verification after the rebase:

pnpm check:architecture

That check now passes locally with 0 runtime value cycle(s) and 0 Madge cycles.

TurboTheTurtle · 2026-05-08T14:09:30Z

Rebased this branch onto current upstream/main (2008873be6) to clear the stale old-base CI state; new head is c67ae76e44.

Post-rebase validation:

pnpm vitest run src/agents/tool-loop-detection.test.ts src/agents/pi-tools.before-tool-call.e2e.test.ts --reporter=dot — 45 tests passing
pnpm exec oxfmt --check --threads=1 CHANGELOG.md src/agents/pi-tools.before-tool-call.e2e.test.ts src/agents/tool-loop-detection.test.ts src/agents/tool-loop-detection.ts
git diff --check upstream/main

TurboTheTurtle · 2026-05-08T14:20:06Z

Follow-up: main advanced again with an upstream UTC timestamp-label test fix, so I rebased this branch once more onto upstream/main (f6476140d2). I also pushed an empty rerun commit after the OpenGrep changed-path scan failed while fetching OpenGrep versions from GitHub and GitHub denied a direct failed-job rerun. New head is 11667aafe5.

Validation after the final rebase: git diff --check upstream/main. CI has restarted on the new head.

martingarramon · 2026-05-08T21:47:41Z

The session-global scope change is clean. getGlobalNoProgressStreak accumulates stable no-progress counts across different argsHash patterns and fires when the session total crosses globalCircuitBreakerThreshold. The break on a result-change for any prior-seen pattern is intentionally conservative — a single tool showing progress stops the backward count for all tools. Both edge cases are covered in the new tests (blocks a different tool after the session-global threshold + does not count unrelated one-off tools). LGTM.

openclaw-barnacle Bot added agents Agent runtime and tooling size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 8, 2026

TurboTheTurtle changed the title ~~[codex] fix(agents): make loop breaker session-global~~ fix(agents): make loop breaker session-global May 8, 2026

openclaw-barnacle Bot added triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 8, 2026

openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels May 8, 2026

clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026

TurboTheTurtle force-pushed the codex/global-loop-breaker branch from 9e76446 to 8fb3478 Compare May 8, 2026 08:06

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026

clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026

TurboTheTurtle force-pushed the codex/global-loop-breaker branch from 8fb3478 to c67ae76 Compare May 8, 2026 14:06

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026

clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026

Andy Ye added 2 commits May 8, 2026 07:16

fix(agents): make loop breaker session-global

533c892

docs(changelog): note global loop breaker

aa69920

TurboTheTurtle force-pushed the codex/global-loop-breaker branch from c67ae76 to aa69920 Compare May 8, 2026 14:16

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026

ci: rerun opengrep scan

11667aa

clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(agents): make loop breaker session-global#79267

fix(agents): make loop breaker session-global#79267
TurboTheTurtle wants to merge 3 commits intoopenclaw:mainfrom
TurboTheTurtle:codex/global-loop-breaker

TurboTheTurtle commented May 8, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented May 8, 2026 •

edited

Loading

Uh oh!

TurboTheTurtle commented May 8, 2026 •

edited

Loading

Uh oh!

TurboTheTurtle commented May 8, 2026

Uh oh!

TurboTheTurtle commented May 8, 2026

Uh oh!

TurboTheTurtle commented May 8, 2026

Uh oh!

martingarramon commented May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

TurboTheTurtle commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Real behavior proof

Uh oh!

clawsweeper Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TurboTheTurtle commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TurboTheTurtle commented May 8, 2026

Uh oh!

TurboTheTurtle commented May 8, 2026

Uh oh!

TurboTheTurtle commented May 8, 2026

Uh oh!

martingarramon commented May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

TurboTheTurtle commented May 8, 2026 •

edited

Loading

clawsweeper Bot commented May 8, 2026 •

edited

Loading

TurboTheTurtle commented May 8, 2026 •

edited

Loading