
Conversation

@Danipulok Danipulok commented Nov 22, 2025

Closes #3485

Changes:

  • Changed the logic for handling output tools with end_strategy='exhaustive';
  • Added tests;
  • Updated docstrings;
  • Updated docs.

How I verified:

MRE:

import asyncio
import os

from pydantic import BaseModel
from pydantic_ai import Agent, ModelSettings, ToolOutput
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider


class TextMessage(BaseModel):
    text: str | None = None


async def send_text_message(
    message: TextMessage,
) -> None:
    """Send a text message."""
    print(f"\nText Message: {message}")


class QuickRepliesMessage(BaseModel):
    text: str | None = None
    quick_replies: list[str] | None = None


async def send_quick_replies_message(
    message: QuickRepliesMessage,
) -> None:
    """Send a quick replies message."""
    print(f"\nQuick Replies Message: {message}")


class TestDate(BaseModel):
    days_of_sunshine: int
    info: str


async def main() -> None:
    api_key = os.environ["OPENAI_API_KEY"]
    model = OpenAIChatModel(
        "gpt-4o",
        provider=OpenAIProvider(api_key=api_key),
        settings=ModelSettings(
            temperature=0.1,
        ),
    )
    output_type = [
        ToolOutput(send_text_message, name="send_text_message"),
        ToolOutput(send_quick_replies_message, name="send_quick_replies_message"),
    ]
    agent = Agent(
        model,
        output_type=output_type,
        instructions="For response, call at first `send_quick_replies_message` and then `send_text_message`, both in parallel",
        end_strategy="exhaustive",  # comment to use the default "early" strategy
    )

    user_prompt = "Tell me about Python"

    async with agent.run_stream(user_prompt) as run:
        async for output in run.stream_responses():
            model_response, is_last_message = output
            print(model_response, is_last_message, end="\n\n")


if __name__ == "__main__":
    asyncio.run(main())

Old output:

ModelResponse(parts=[ToolCallPart(tool_name='send_text_message', args='{"text": "Python is a high-level, interpreted programming language known for its readability and simplicity. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python is widely used in web development, data analysis, artificial intelligence, scientific computing, and more."}', tool_call_id='call_iqyle7PEgwRrKPIB9uhMaEQ9'), ToolCallPart(tool_name='send_quick_replies_message', args='{"text": "Would you like to know more about Python?", "quick_replies": ["History of Python", "Python Features", "Python Applications", "Learning Resources"]}', tool_call_id='call_50K33vhEJ7lyfk6cjulek3hZ')], usage=RequestUsage(input_tokens=124, output_tokens=127, details={'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}), model_name='gpt-4o-2024-08-06', timestamp=datetime.datetime(2025, 11, 20, 16, 42, 7, tzinfo=TzInfo(0)), provider_name='openai', provider_details={'finish_reason': 'tool_calls'}, provider_response_id='chatcmpl-Ce21PzXA1xj1yLIcXxGBJyZR5KrEy', finish_reason='tool_call') True

Text Message: text='Python is a high-level, interpreted programming language known for its readability and simplicity. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python is widely used in web development, data analysis, artificial intelligence, scientific computing, and more.'

Process finished with exit code 0

New output:

ModelResponse(parts=[ToolCallPart(tool_name='send_quick_replies_message', args='{"text": "What would you like to know about Python?", "quick_replies": ["History", "Features", "Applications", "Learning Resources"]}', tool_call_id='call_mmWS2iYDasr3rqosyxozy1UG'), ToolCallPart(tool_name='send_text_message', args='{"text": "Python is a high-level, interpreted programming language known for its readability and versatility. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python is widely used in web development, data analysis, artificial intelligence, scientific computing, and more."}', tool_call_id='call_IEmqlH3CxF8vgGDrEhfU6fpQ')], usage=RequestUsage(input_tokens=126, output_tokens=123, details={'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}), model_name='gpt-4o-2024-08-06', timestamp=datetime.datetime(2025, 11, 22, 22, 12, 54, tzinfo=TzInfo(0)), provider_name='openai', provider_details={'finish_reason': 'tool_calls'}, provider_response_id='chatcmpl-Ceq8cWafgeCMWija35zEkUqeHmK7b', finish_reason='tool_call') True


Quick Replies Message: text='What would you like to know about Python?' quick_replies=['History', 'Features', 'Applications', 'Learning Resources']

Text Message: text='Python is a high-level, interpreted programming language known for its readability and versatility. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python is widely used in web development, data analysis, artificial intelligence, scientific computing, and more.'

Process finished with exit code 0

@Danipulok
Author

@DouweM, hey!
I've finished the PR we discussed in #3485, please take a look.
I'm not sure about the new logic; it may need some changes.

Collaborator

@DouweM DouweM left a comment


@Danipulok Thank you! I'll likely make some tweaks to the docs before merging, but first please take a look at my code comments.

output_parts.append(part)
output_parts.append(part)
# With exhaustive strategy, execute all output tools
elif ctx.deps.end_strategy == 'exhaustive':
Collaborator


There's too much duplication here with the else branch below; can you clean that up somehow please?

Author


You're completely right, I missed that part.

I think there are two options to handle this (if we keep the current logic):

  1. A utility function like _call_output_tool
  2. Merging the two strategy conditions to avoid duplicating the code

Here's some pseudocode example of the logic:

# In case we got two tool calls with the same ID
if final_result and tool_call.tool_call_id == final_result.tool_call_id:
    # Final result processed.

# Early strategy is chosen and final result is already set
elif ctx.deps.end_strategy == 'early' and final_result:
    # Output tool not used - a final result was already processed

# Early strategy is chosen and final result is not yet set
# Or exhaustive strategy is chosen
elif (ctx.deps.end_strategy == 'early' and not final_result) or ctx.deps.end_strategy == 'exhaustive':
    # Final result processed
    if not final_result:
        final_result = result.FinalResult(result_data, call.tool_name, call.tool_call_id)

# This should never happen
else:
    assert_never(ctx.deps.end_strategy)
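For illustration, here's a self-contained, runnable sketch of the merged condition from option 2. The `FinalResult` dataclass and `process_output_calls` helper below are hypothetical stand-ins, not pydantic-ai APIs:

```python
from dataclasses import dataclass


@dataclass
class FinalResult:
    """Hypothetical stand-in for pydantic_ai's result.FinalResult."""
    tool_name: str
    tool_call_id: str


def process_output_calls(tool_calls: list[tuple[str, str]], end_strategy: str) -> list[str]:
    """Return the names of the output tools that actually run.

    `tool_calls` is a list of (tool_name, tool_call_id) pairs, mirroring
    the tool call parts of a single model response.
    """
    final_result: FinalResult | None = None
    executed: list[str] = []
    for tool_name, tool_call_id in tool_calls:
        if final_result and tool_call_id == final_result.tool_call_id:
            continue  # duplicate call ID: this final result was already processed
        if end_strategy == 'early' and final_result:
            continue  # early strategy: skip remaining output tools once a final result is set
        # No final result yet, or strategy is 'exhaustive': run the tool.
        executed.append(tool_name)
        if final_result is None:
            final_result = FinalResult(tool_name, tool_call_id)
    return executed


calls = [('send_quick_replies_message', 'id1'), ('send_text_message', 'id2')]
print(process_output_calls(calls, 'early'))       # ['send_quick_replies_message']
print(process_output_calls(calls, 'exhaustive'))  # ['send_quick_replies_message', 'send_text_message']
```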

Which approach is better in your opinion?

Author


I'm not sure about the resulting code complexity, and whether the second option would stay easy to read in the future

Author


Updated with option 2

assert result.output.value == 'first'

# Verify both output tools were called
assert output_tools_called == ['first', 'second']
Collaborator


Can we verify that result.all_messages() looks as expected, as we did above?

Author


Updated

assert result.output.value == 'first'

# Verify only the first output tool was called
assert output_tools_called == ['first']
Collaborator


Can we verify that result.all_messages() looks as expected, as we did above?

Author


Updated

]
assert len(retry_parts) >= 1
# The retry should mention validation error
assert any('value' in str(p.content).lower() for p in retry_parts)
Collaborator


Can we show the entire all_messages() as above?

Author


Updated

return ModelResponse(
    parts=[
        ToolCallPart('first_output', {'value': 'first'}),
        ToolCallPart('second_output', {'value': 'second'}),
Collaborator


Can we also test here what happens if the second call is invalid? It should be consistent with exhaustive execution of non-output tool calls, in terms of whether it sends us back to the model to try again, or whether we ignore the tool call failure because we already have valid final output.

Author


try:
    if tool_call_result is None:
        tool_result = await tool_manager.handle_call(tool_call)
    elif isinstance(tool_call_result, ToolApproved):
        if tool_call_result.override_args is not None:
            tool_call = dataclasses.replace(tool_call, args=tool_call_result.override_args)
        tool_result = await tool_manager.handle_call(tool_call, approved=True)
    elif isinstance(tool_call_result, ToolDenied):
        return _messages.ToolReturnPart(
            tool_name=tool_call.tool_name,
            content=tool_call_result.message,
            tool_call_id=tool_call.tool_call_id,
        ), None
    elif isinstance(tool_call_result, exceptions.ModelRetry):
        m = _messages.RetryPromptPart(
            content=tool_call_result.message,
            tool_name=tool_call.tool_name,
            tool_call_id=tool_call.tool_call_id,
        )
        raise ToolRetryError(m)
    elif isinstance(tool_call_result, _messages.RetryPromptPart):
        tool_call_result.tool_name = tool_call.tool_name
        tool_call_result.tool_call_id = tool_call.tool_call_id
        raise ToolRetryError(tool_call_result)
    else:
        tool_result = tool_call_result
except ToolRetryError as e:
    return e.tool_retry, None

Judging by those lines, it seems that under the exhaustive strategy, retries are sent back to the model if any (non-output) tool fails.

Here's a small gist that confirms this behavior:
https://gist.github.com/Danipulok/0897bf27c1214adb7d4a401a684b0c39
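To make that branch concrete, here's a minimal, self-contained sketch of the retry path. The classes below are simplified stand-ins for pydantic-ai's `ModelRetry`, `RetryPromptPart`, and `ToolRetryError`, not the real implementations:

```python
from dataclasses import dataclass


class ModelRetry(Exception):
    """Simplified stand-in for pydantic_ai.exceptions.ModelRetry."""
    def __init__(self, message: str):
        self.message = message
        super().__init__(message)


@dataclass
class RetryPromptPart:
    """Simplified stand-in for pydantic_ai's RetryPromptPart message part."""
    content: str
    tool_name: str
    tool_call_id: str


class ToolRetryError(Exception):
    def __init__(self, tool_retry: RetryPromptPart):
        self.tool_retry = tool_retry


def handle_tool_result(tool_call_result, tool_name: str, tool_call_id: str):
    """Mirror the branch above: a ModelRetry result becomes a RetryPromptPart
    that is returned so it can be sent back to the model for another attempt."""
    try:
        if isinstance(tool_call_result, ModelRetry):
            raise ToolRetryError(
                RetryPromptPart(tool_call_result.message, tool_name, tool_call_id)
            )
        return tool_call_result, None
    except ToolRetryError as e:
        return e.tool_retry, None


part, _ = handle_tool_result(ModelRetry('invalid args'), 'my_tool', 'call_1')
print(type(part).__name__)  # RetryPromptPart
```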

FunctionModel(stream_function=sf), output_type=CustomOutputType, end_strategy='exhaustive', output_retries=0
)

# Should raise because the second final_result has invalid JSON
Collaborator


Is this really the desired behavior? If we have valid output from the first output tool call, shouldn't we finish on that?

Author


I think this is a point for discussion, and we should consider the use cases where exhaustive is actually used.

IMHO, the main point of the exhaustive strategy is to execute all tools, because the only time it's used is when tools and output tools have side effects. For example, each output tool might send a message to the user, or a human operator might be called together with a text message (two different output tools).

In those cases you want to be sure all output tools are executed; otherwise the default early strategy is enough. So I think the current behavior is intuitive and correct.



Successfully merging this pull request may close these issues.

Parallel output tool calls with end_strategy='exhaustive' should call all output functions
