Anthropic Extended Thinking bug: tool_result blocks appear before corresponding tool_use blocks #1797

@habema

Description

When using Anthropic models with extended thinking enabled (reasoning.effort set), conversation histories from sessions can contain malformed message sequences in which tool results appear before their corresponding tool calls. This causes the Anthropic API to reject requests with errors like:

AnthropicException - messages.4.content.1: unexpected `tool_use_id` found in `tool_result` blocks: toolu_017j99WJ1GYnFYsqLQjT3c4x. Each `tool_result` block must have a corresponding `tool_use` block in the previous message.
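
To make the rule in that error concrete, here is a minimal sketch of a check over Anthropic-style messages: every `tool_result` block must reference a `tool_use` block in the immediately preceding message. The helper name `unmatched_tool_results` and the sample messages are hypothetical and not part of the SDK or LiteLLM. Only the block shapes (`tool_use` with `id`, `tool_result` with `tool_use_id`) follow Anthropic's Messages format.

```python
from typing import Any


def unmatched_tool_results(messages: list[dict[str, Any]]) -> list[tuple[int, str]]:
    """Return (message_index, tool_use_id) for each tool_result whose
    tool_use is not in the immediately preceding message."""
    problems: list[tuple[int, str]] = []
    for index, message in enumerate(messages):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content cannot hold tool blocks
        previous = messages[index - 1].get("content") if index > 0 else None
        previous_ids = {
            block["id"]
            for block in (previous if isinstance(previous, list) else [])
            if block.get("type") == "tool_use"
        }
        for block in content:
            if block.get("type") == "tool_result" and block["tool_use_id"] not in previous_ids:
                problems.append((index, block["tool_use_id"]))
    return problems


# Hypothetical messages: the last tool_result has no tool_use before it.
messages = [
    {"role": "user", "content": "User request"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_A", "name": "first_tool", "input": {"param": "value"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_A", "content": "ok"},
    ]},
    {"role": "assistant", "content": [{"type": "text", "text": "Assistant response"}]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_B", "content": "ok"},
    ]},
]

print(unmatched_tool_results(messages))  # [(4, 'toolu_B')]
```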

The root cause is in how the items_to_messages converter handles reasoning blocks and tool calls when preserve_thinking_blocks=True. The converter can flush assistant messages prematurely, breaking the required tool_use → tool_result pairing that Anthropic's API expects.
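
As a rough pre-flight check on the input items themselves (before any conversion), something like the following could flag a `function_call_output` that appears before, or without, its `function_call`. This is only a sketch: `find_orphaned_tool_outputs` is a hypothetical helper, not the converter's logic, and it tests a necessary condition only. Anthropic's actual rule is stricter, since the result must be in the message directly after the call.

```python
from typing import Any


def find_orphaned_tool_outputs(items: list[dict[str, Any]]) -> list[str]:
    """Return call_ids whose function_call_output is not preceded by a
    function_call with the same call_id."""
    seen_calls: set[str] = set()
    orphaned: list[str] = []
    for item in items:
        item_type = item.get("type")
        if item_type == "function_call":
            seen_calls.add(item["call_id"])
        elif item_type == "function_call_output" and item["call_id"] not in seen_calls:
            orphaned.append(item["call_id"])
    return orphaned


# Hypothetical history: the output for "toolu_third" precedes its call.
history = [
    {"role": "user", "content": "User request"},
    {"type": "function_call", "call_id": "toolu_first", "name": "first_tool", "arguments": "{}"},
    {"type": "function_call_output", "call_id": "toolu_first", "output": "{}"},
    {"type": "function_call_output", "call_id": "toolu_third", "output": "{}"},
    {"type": "function_call", "call_id": "toolu_third", "name": "third_tool", "arguments": "{}"},
]

print(find_orphaned_tool_outputs(history))  # ['toolu_third']
```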

In the repro script below, CONVERSATION_HISTORY was pulled from a session:

import asyncio
import os

import litellm
from dotenv import load_dotenv
from openai.types.shared.reasoning import Reasoning

from agents import Agent, Runner, TResponseInputItem, set_tracing_disabled
from agents.model_settings import ModelSettings

load_dotenv()
set_tracing_disabled(True)
litellm.modify_params = True

CONVERSATION_HISTORY: list[TResponseInputItem] = [
    {
        "content": "User request",
        "role": "user"
    },
    {
        "arguments": "{\"param\": \"value\"}",
        "call_id": "toolu_019LhVmM8SYJpTeVT3k2xrAc",
        "name": "first_tool",
        "type": "function_call",
        "id": "__fake_id__"
    },
    {
        "call_id": "toolu_019LhVmM8SYJpTeVT3k2xrAc",
        "output": "{\"result\": \"success\"}",
        "type": "function_call_output"
    },
    {
        "id": "__fake_id__",
        "content": [
            {
                "annotations": [],
                "text": "Assistant response",
                "type": "output_text"
            }
        ],
        "role": "assistant",
        "status": "completed",
        "type": "message"
    },
    {
        "arguments": "{\"param\": \"value\"}",
        "call_id": "toolu_01WY8EwF6DxJNh2EvaumYCtu",
        "name": "second_tool",
        "type": "function_call",
        "id": "__fake_id__"
    },
    {
        "id": "__fake_id__",
        "summary": [
            {
                "text": "Reasoning about the task",
                "type": "summary_text"
            }
        ],
        "type": "reasoning",
        "content": [
            {
                "text": "Reasoning about the task",
                "type": "reasoning_text"
            }
        ],
        "encrypted_content": "REDACTED_SIGNATURE"
    },
    {
        "call_id": "toolu_01WY8EwF6DxJNh2EvaumYCtu",
        "output": "{\"result\": \"success\"}",
        "type": "function_call_output"
    },
    {
        "call_id": "toolu_017j99WJ1GYnFYsqLQjT3c4x",
        "output": "{\"result\": \"success\"}",
        "type": "function_call_output"
    },
    {
        "arguments": "{\"param\": \"value\"}",
        "call_id": "toolu_017j99WJ1GYnFYsqLQjT3c4x",
        "name": "third_tool",
        "type": "function_call",
        "id": "__fake_id__"
    }
]

model_settings = ModelSettings(
    reasoning=Reasoning(effort="high"),
    extra_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"},
)

agent = Agent(
    name="Test Agent",
    model="litellm/anthropic/claude-sonnet-4-20250514",
    instructions="You are a test agent.",
    model_settings=model_settings,
)


async def test_bug():
    """Test that reproduces the interleaved thinking bug."""

    result = await Runner.run(
        agent,
        CONVERSATION_HISTORY + [{"role": "user", "content": "hi"}],
    )

    print(result.final_output)


if __name__ == "__main__":
    # Exit early instead of sending a request that cannot authenticate.
    if not os.getenv("ANTHROPIC_API_KEY"):
        raise SystemExit("ANTHROPIC_API_KEY not found in environment")

    asyncio.run(test_bug())
