Skip to content

fix(tools): append_messages() inside the tool runner loop no longer infinite-loops or drops the assistant turn#1538

Open
Zelys-DFKH wants to merge 1 commit into
anthropics:mainfrom
Zelys-DFKH:fix/tool-runner-append-messages-loop
Open

fix(tools): append_messages() inside the tool runner loop no longer infinite-loops or drops the assistant turn#1538
Zelys-DFKH wants to merge 1 commit into
anthropics:mainfrom
Zelys-DFKH:fix/tool-runner-append-messages-loop

Conversation

@Zelys-DFKH
Copy link
Copy Markdown

Fixes #1536.

What was broken

Calling append_messages() inside a tool runner loop had two failure modes, depending on what you appended.

If you added a plain context message like {"role": "user", "content": "Remember: be concise."}, the loop ran forever. The runner used a _messages_modified flag to decide whether to skip the auto-generated (assistant, tool_result) pair. The problem was that both user appends and the auto-append itself set the flag to True, so once you called append_messages(), the flag never came back down and the loop never stopped.

If you manually provided a tool_result block (for cases where you want full control over what gets returned), the runner sent it to the API without the preceding assistant turn. That produced a 400 unexpected tool_use_id error every time.

How the fix works

Before yielding in the loop, the runner now snapshots the current message count. After yield returns, anything between that index and the end of the list is what the user appended. The runner checks whether any of those messages contain a tool_result block.

If yes: the user is handling tool results themselves. The runner inserts the assistant turn before their result only if they didn't include it already.

If no: the runner generates tool responses as usual, then splices the auto-generated (assistant, tool_result) pair in before any extra messages the user added.

The logic is the same in both the sync (BaseSyncToolRunner) and async (BaseAsyncToolRunner) paths. BetaStreamingToolRunner and BetaAsyncStreamingToolRunner both inherit __run__ from the base classes, so they get the fix for free.

The four call sites that set the full message list now use a private _set_messages_list helper instead of inline lambdas. Capturing a loop variable in a lambda triggers ruff B023, and the default-arg workaround (msgs=new_messages) confuses mypy's callable-type inference. A method parameter solves both at once.

Tests

test_custom_message_handling was previously marked xfail with a # TODO: fix the append_messages method comment. The xfail is removed and the HTTP fixture updated to replay a proper 200 response instead of the 400 the bug used to produce.

Three new regression tests cover the scenarios from the issue: appending a plain user message (must not loop, must place the tool result correctly), providing a manual tool result (runner must prepend the assistant turn), and providing both the assistant turn and the tool result manually (runner must not insert a second assistant message).

…ite-loops or drops the assistant turn

Fixes anthropics#1536.

The runner used a _messages_modified boolean that could not distinguish
between two distinct append behaviors: a user supplying their own
tool_result (skip auto-append) vs. a user appending extra context
(auto-append still needed). Both set the flag True, so calling
append_messages() inside the loop caused either an infinite loop or a
400 API error depending on what was appended.

Replace the flag with a message-count snapshot taken before each yield.
After yield returns, slice the list to get what the user appended and
branch on whether those messages contain a tool_result block. If yes,
insert the assistant turn before the user's result only when they did
not include it themselves. If no, generate tool responses as normal and
splice the (assistant, tool_result) pair in before any extra messages.

Add _set_messages_list helper to BaseToolRunner so the four call sites
that replace the full message list can capture the list as a method
parameter rather than a loop variable, satisfying both ruff B023 and
mypy's callable-type inference.

Remove the xfail from test_custom_message_handling and update its HTTP
fixture to return a valid 200 response. Add three regression tests
covering: plain user message append (no infinite loop), manual
tool_result (assistant turn auto-inserted), and user-supplied assistant
+ tool_result (no duplicate assistant message).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] append_messages() inside tool runner loop causes infinite loop (advanced usage docs example)

1 participant