Make Tool Output dict conversion stricter to improve backward compatibility #1965
Resolves #1930
Background
Before v0.4.0, all function tool outputs were converted to string.
Starting from v0.4.0 (#1898), function tools can return a dict, and the SDK attempts to convert it into one of the ValidToolOutputPydanticModels = Union[ToolOutputText, ToolOutputImage, ToolOutputFileContent].
Problem
However, the current dict conversion is too permissive: returning any dict or list can trigger breaking behavior such as #1930.
For example, the following function output has no intention of returning an image:
    { "msg": "foobar" }
but it gets converted into:
    { "type": "input_image", "image_url": None, "file_id": None }
Similarly, this list:
    [ { "product_id": 1 }, { "product_id": 2 } ]
is also converted into:
    [ { "type": "input_image", "image_url": None, "file_id": None }, { "type": "input_image", "image_url": None, "file_id": None } ]
All of these cases cause an API error.
Solution
This PR refines the dict conversion logic to make _maybe_get_output_as_structured_function_output more accurate and stricter.
A dict will now only be converted into a structured output if both conditions are met:
- The dict must include a valid type field
- The dict must also contain the required content fields:
  - ToolOutputText: requires text
  - ToolOutputImage: requires at least one of image_url or file_id
  - ToolOutputFileContent: requires at least one of file_data, file_url, or file_id
If a dict does not meet these criteria, it will no longer be implicitly converted.
Instead, it will fall back to the previous behavior and be converted to a string (str()), consistent with pre-v0.4.0 behavior.
This aligns with the intended logic described in the original code comments.
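The two conditions can be illustrated with a small standalone sketch. This is not the SDK's implementation: the helper names (as_structured_output, convert_tool_output), the REQUIRED_FIELDS table, and the type strings "text", "image", and "file" are assumptions made for illustration. Only the required field names come from the description above. The sketch also treats a field whose value is None as missing, and sends lists to str(), which are further assumptions.

```python
from typing import Any, Optional

# Hypothetical mapping from a dict's "type" value to its content fields.
# At least one listed field must be present (and not None) for a match.
# The type strings are illustrative assumptions, not the SDK's values.
REQUIRED_FIELDS: dict[str, tuple[str, ...]] = {
    "text": ("text",),
    "image": ("image_url", "file_id"),
    "file": ("file_data", "file_url", "file_id"),
}


def as_structured_output(output: Any) -> Optional[dict[str, Any]]:
    """Return the dict if it satisfies both conditions, otherwise None."""
    if not isinstance(output, dict):
        return None

    # Condition 1: the dict includes a valid type field.
    type_value = output.get("type")
    if not isinstance(type_value, str):
        return None
    required = REQUIRED_FIELDS.get(type_value)
    if required is None:
        return None

    # Condition 2: at least one required content field carries a value.
    if not any(output.get(field) is not None for field in required):
        return None

    return output


def convert_tool_output(output: Any) -> Any:
    """Keep qualifying dicts as-is; fall back to str() for everything else."""
    structured = as_structured_output(output)
    return structured if structured is not None else str(output)
```

Under these assumptions, convert_tool_output({"msg": "foobar"}) falls back to the string form of the dict, while a dict such as {"type": "text", "text": "hi"} is passed through unchanged.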
Additionally, comprehensive unit tests have been added to verify which dict combinations should and should not be automatically converted.
Benefits
Only dicts that explicitly match a structured output (a valid type and the required fields) will be converted automatically.
All other dicts fall back to str().