When using Anthropic models with extended thinking enabled (reasoning.effort set), conversation histories restored from sessions can contain malformed message sequences in which tool results appear before their corresponding tool calls. The Anthropic API then rejects the request with errors like:
AnthropicException - messages.4.content.1: unexpected `tool_use_id` found in `tool_result` blocks: toolu_017j99WJ1GYnFYsqLQjT3c4x. Each `tool_result` block must have a corresponding `tool_use` block in the previous message.
The root cause is in how the items_to_messages converter handles reasoning blocks and tool calls when preserve_thinking_blocks=True. The converter can flush an in-progress assistant message prematurely, breaking the tool_use → tool_result pairing that Anthropic's API expects.
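For context, every tool_result block in a user message must reference a tool_use block in the immediately preceding assistant message. A minimal sketch of the shape the converter must preserve (illustrative Anthropic-format message dicts reusing the ids from the repro below, not actual converter output):

# Each tool_result pairs with a tool_use in the immediately preceding
# assistant message; thinking blocks ride along in the assistant turn
# when extended thinking is enabled.
valid_messages = [
    {
        "role": "assistant",
        "content": [
            {"type": "thinking", "thinking": "Reasoning about the task",
             "signature": "REDACTED_SIGNATURE"},
            {"type": "tool_use", "id": "toolu_01WY8EwF6DxJNh2EvaumYCtu",
             "name": "second_tool", "input": {"param": "value"}},
        ],
    },
    {
        "role": "user",
        "content": [
            {"type": "tool_result",
             "tool_use_id": "toolu_01WY8EwF6DxJNh2EvaumYCtu",
             "content": "{\"result\": \"success\"}"},
        ],
    },
]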
In the repro script below, CONVERSATION_HISTORY was pulled from a session:
import asyncio
import os
import litellm
from dotenv import load_dotenv
from openai.types.shared.reasoning import Reasoning
from agents import Agent, Runner, TResponseInputItem, set_tracing_disabled
from agents.model_settings import ModelSettings
load_dotenv()
set_tracing_disabled(True)
litellm.modify_params = True
CONVERSATION_HISTORY: list[TResponseInputItem] = [
    {
        "content": "User request",
        "role": "user"
    },
    {
        "arguments": "{\"param\": \"value\"}",
        "call_id": "toolu_019LhVmM8SYJpTeVT3k2xrAc",
        "name": "first_tool",
        "type": "function_call",
        "id": "__fake_id__"
    },
    {
        "call_id": "toolu_019LhVmM8SYJpTeVT3k2xrAc",
        "output": "{\"result\": \"success\"}",
        "type": "function_call_output"
    },
    {
        "id": "__fake_id__",
        "content": [
            {
                "annotations": [],
                "text": "Assistant response",
                "type": "output_text"
            }
        ],
        "role": "assistant",
        "status": "completed",
        "type": "message"
    },
    {
        "arguments": "{\"param\": \"value\"}",
        "call_id": "toolu_01WY8EwF6DxJNh2EvaumYCtu",
        "name": "second_tool",
        "type": "function_call",
        "id": "__fake_id__"
    },
    {
        "id": "__fake_id__",
        "summary": [
            {
                "text": "Reasoning about the task",
                "type": "summary_text"
            }
        ],
        "type": "reasoning",
        "content": [
            {
                "text": "Reasoning about the task",
                "type": "reasoning_text"
            }
        ],
        "encrypted_content": "REDACTED_SIGNATURE"
    },
    {
        "call_id": "toolu_01WY8EwF6DxJNh2EvaumYCtu",
        "output": "{\"result\": \"success\"}",
        "type": "function_call_output"
    },
    # As recorded by the session: this output precedes its
    # function_call (third_tool) below.
    {
        "call_id": "toolu_017j99WJ1GYnFYsqLQjT3c4x",
        "output": "{\"result\": \"success\"}",
        "type": "function_call_output"
    },
    {
        "arguments": "{\"param\": \"value\"}",
        "call_id": "toolu_017j99WJ1GYnFYsqLQjT3c4x",
        "name": "third_tool",
        "type": "function_call",
        "id": "__fake_id__"
    }
]
model_settings = ModelSettings(
    reasoning=Reasoning(effort="high"),
    extra_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"},
)
agent = Agent(
    name="Test Agent",
    model="litellm/anthropic/claude-sonnet-4-20250514",
    instructions="You are a test agent.",
    model_settings=model_settings,
)
async def test_bug():
    """Test that reproduces the interleaved thinking bug."""
    result = await Runner.run(
        agent,
        CONVERSATION_HISTORY + [{"role": "user", "content": "hi"}],
    )
    print(result.final_output)

if __name__ == "__main__":
    if not os.getenv("ANTHROPIC_API_KEY"):
        print("ANTHROPIC_API_KEY not found in environment")
        raise SystemExit(1)
    asyncio.run(test_bug())
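As a stopgap diagnostic, a pre-flight scan over the items list can flag outputs that precede their calls before the request ever reaches Anthropic. This is a hypothetical helper (find_orphaned_tool_outputs is not part of the SDK), assuming the item shapes used above:

def find_orphaned_tool_outputs(items: list[TResponseInputItem]) -> list[str]:
    """Return call_ids of function_call_output items that appear
    before their matching function_call in the items list."""
    seen: set[str] = set()
    orphaned: list[str] = []
    for item in items:
        if item.get("type") == "function_call":
            seen.add(item["call_id"])
        elif item.get("type") == "function_call_output" and item["call_id"] not in seen:
            orphaned.append(item["call_id"])
    return orphaned

# For the history above this prints ['toolu_017j99WJ1GYnFYsqLQjT3c4x'],
# the same call_id Anthropic rejects in the error message.
print(find_orphaned_tool_outputs(CONVERSATION_HISTORY))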