-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Make end_strategy also work for output tools, not just tools
#3523
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
DouweM
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Danipulok Thank you! I'll likely make some tweaks to the docs before merging, but first please have a look at my code comments
| output_parts.append(part) | ||
| output_parts.append(part) | ||
| # With exhaustive strategy, execute all output tools | ||
| elif ctx.deps.end_strategy == 'exhaustive': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's too much duplication here with the else branch below; can you clean that up somehow please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are completely right, I've missed that part
I think there are two options to handle this (if we leave current logic):
- Util function like
_call_output_tool - Merge two strategy conditions to not duplicate the code
Here's some pseudocode example of the logic:
# In case we got two tool calls with the same ID
if final_result and tool_call.tool_call_id == final_result.tool_call_id:
# Final result processed.
# Early strategy is chosen and final result is already set
elif ctx.deps.end_strategy == 'early' and final_result:
# Output tool not used - a final result was already processed
# Early strategy is chosen and final result is not yet set
# Or exhaustive strategy is chosen
elif (ctx.deps.end_strategy == 'early' and not final_result) or ctx.deps.end_strategy == 'exhaustive':
# Final result processed
if not final_result:
final_result = result.FinalResult(result_data, call.tool_name, call.tool_call_id)
# This should never happen
else:
assert_never(ctx.deps.end_strategy)Which approach is better in your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the usual code complexity and if the second option would be easy to read in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with option2
| assert result.output.value == 'first' | ||
|
|
||
| # Verify both output tools were called | ||
| assert output_tools_called == ['first', 'second'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we verify that result.all_messages() looks as expected, as we did above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
| assert result.output.value == 'first' | ||
|
|
||
| # Verify only the first output tool was called | ||
| assert output_tools_called == ['first'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we verify that result.all_messages() looks as expected, as we did above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
tests/test_streaming.py
Outdated
| ] | ||
| assert len(retry_parts) >= 1 | ||
| # The retry should mention validation error | ||
| assert any('value' in str(p.content).lower() for p in retry_parts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we show the entire all_messages() as above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
| return ModelResponse( | ||
| parts=[ | ||
| ToolCallPart('first_output', {'value': 'first'}), | ||
| ToolCallPart('second_output', {'value': 'second'}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also test here what happens if the second call is invalid? It should be consistent with exhaustive execution of non-output tool calls in terms of whether it causes us to go back to the model to make it try again, or whether we ignore the tool call failure because we already have valid final output
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pydantic-ai/pydantic_ai_slim/pydantic_ai/_agent_graph.py
Lines 1079 to 1106 in 5afc2d6
| try: | |
| if tool_call_result is None: | |
| tool_result = await tool_manager.handle_call(tool_call) | |
| elif isinstance(tool_call_result, ToolApproved): | |
| if tool_call_result.override_args is not None: | |
| tool_call = dataclasses.replace(tool_call, args=tool_call_result.override_args) | |
| tool_result = await tool_manager.handle_call(tool_call, approved=True) | |
| elif isinstance(tool_call_result, ToolDenied): | |
| return _messages.ToolReturnPart( | |
| tool_name=tool_call.tool_name, | |
| content=tool_call_result.message, | |
| tool_call_id=tool_call.tool_call_id, | |
| ), None | |
| elif isinstance(tool_call_result, exceptions.ModelRetry): | |
| m = _messages.RetryPromptPart( | |
| content=tool_call_result.message, | |
| tool_name=tool_call.tool_name, | |
| tool_call_id=tool_call.tool_call_id, | |
| ) | |
| raise ToolRetryError(m) | |
| elif isinstance(tool_call_result, _messages.RetryPromptPart): | |
| tool_call_result.tool_name = tool_call.tool_name | |
| tool_call_result.tool_call_id = tool_call.tool_call_id | |
| raise ToolRetryError(tool_call_result) | |
| else: | |
| tool_result = tool_call_result | |
| except ToolRetryError as e: | |
| return e.tool_retry, None |
Judging by those lines, it seems retries are done to model in
exhaustive strategy, if any tool (not output) is failed
Here's a small gift that confirms this behavior:
https://gist.github.com/Danipulok/0897bf27c1214adb7d4a401a684b0c39
| FunctionModel(stream_function=sf), output_type=CustomOutputType, end_strategy='exhaustive', output_retries=0 | ||
| ) | ||
|
|
||
| # Should raise because the second final_result has invalid JSON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really the desired behavior? If we have valid output from the first output tool call, shouldn't we finish on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a point for discussion, and we should think about the use-cases when exhaustive is used
IMHO, the main point of exhaustive strategy is to execute all tools
Because the only time exhaustive strategy is used, is when tools and output tools have side effects
For example, each output tool should send a message to the user, or maybe the human operator is called together with some text message (two different output tools)
And in those cases it would be wanted to be sure all output tools are executed, otherwise the default early strategy is enough
So I think the current behavior is intuitive and correct
Closes #3485
Changes:
end_strategy='exhaustive';How I verified:
end_strategy='exhaustive'should call all output functions #3485 now successfully calls both output tools;MRE:
Old output:
New output: