
fix: #3310 avoid empty chat tool outputs #3312

Open
Aphroq wants to merge 1 commit into openai:main from Aphroq:fix/chatcmpl-nontext-tool-output-empty

Conversation

Contributor

Aphroq commented May 9, 2026

Summary

  • Preserve the default Chat Completions behavior for non-text-only function tool outputs by converting them to a non-empty placeholder string instead of raising.
  • Gate the explicit UserError behind strict_feature_validation=True so callers can opt in to fail-fast validation (see the sketch after this list).
  • Keep preserve_tool_output_all_content=True behavior unchanged for compatible third-party providers, and keep the default behavior for mixed text/non-text outputs of preserving only the text parts.
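
A minimal sketch of the opt-in surface, assuming the constructor parameter exercised by the new tests and the import path of the modified module (the client setup here is illustrative):

from openai import AsyncOpenAI

from agents.models.openai_chatcompletions import OpenAIChatCompletionsModel

# Default (strict_feature_validation=False): non-text-only tool outputs are
# converted to a placeholder string and the run continues.
compat_model = OpenAIChatCompletionsModel(
    model="gpt-4",
    openai_client=AsyncOpenAI(),
)

# Opt-in fail-fast: the same input raises UserError during conversion instead.
strict_model = OpenAIChatCompletionsModel(
    model="gpt-4",
    openai_client=AsyncOpenAI(),
    strict_feature_validation=True,
)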

Test plan

  • uv run pytest tests/models/test_openai_chatcompletions_converter.py -k "non_text_function_output or function_output_item"
  • bash .agents/skills/code-change-verification/scripts/run.sh

Issue number

Closes #3310

Checks

  • I've added new tests (if relevant)
  • I've added/updated the relevant documentation
  • I've run make lint and make format
  • I've made sure tests pass

github-actions bot added the bug (Something isn't working) and feature:chat-completions labels May 9, 2026
Aphroq changed the title from "fix(models): reject empty chat tool outputs" to "fix: #3310 reject empty chat tool outputs" May 9, 2026
Member

seratch left a comment


One concern: do we actually need to reject here?

I agree the current content: [] conversion is lossy and misleading for non-text-only tool outputs, but changing it to raise UserError is still a behavior change. Existing Chat Completions users who return ToolOutputImage / ToolOutputFileContent currently get a successful run, even if the model receives an empty tool message. With this PR, the same workflow starts failing.

Could we consider a less breaking fallback instead? For example, preserve the existing "run continues" behavior but avoid an empty tool message by converting the non-text-only result to a textual placeholder/serialized representation, or gate the hard error behind an explicit strict mode.

If we do keep the hard error, I think the PR should call out that this is an intentional behavior change and explain why fail-fast is preferable to preserving compatibility here.
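
For concreteness, the shape under discussion looks roughly like this (a hand-written sketch of the resulting Chat Completions message, not literal SDK output; the call id is hypothetical):

# Pre-fix default conversion of an image-only tool result: the run "succeeds",
# but the model receives a tool message with no content at all.
empty_tool_msg = {
    "role": "tool",
    "tool_call_id": "call_image",  # hypothetical call id
    "content": [],  # lossy and misleading: no usable representation of the result
}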

seratch marked this pull request as draft May 9, 2026 23:57
Contributor Author

Aphroq commented May 10, 2026

Thanks, that compatibility concern makes sense. We probably should not make this a default UserError, especially because existing users can currently continue the run even if the model receives content: [].

From a compatibility and consistency perspective, we could consider gating the hard error behind strict_feature_validation=True: keep the default path compatible, and raise explicitly in strict mode instead of silently producing an empty tool message. This also matches the recent custom tool call handling.

My main concern with a default textual placeholder / serialized fallback is that it may make the model think it received a meaningful representation of image/file/audio output, even though the default Chat Completions path still cannot really consume that non-text content.

If we did keep the default hard error, I agree the PR should call that out as an intentional behavior change. The fail-fast rationale would be that the existing default conversion is not just lossy, but silently misleading: a non-text-only tool output becomes content: [], so the run appears to succeed even though the model receives no usable representation of the tool result. That can hide integration bugs and lead to follow-up model behavior based on an empty tool message.

Which direction would you prefer we take here: strict-mode gating, a default placeholder/serialized fallback, or keeping the default hard error while explicitly documenting it as an intentional behavior change?

Aphroq force-pushed the fix/chatcmpl-nontext-tool-output-empty branch from c0032ed to 7e89518 May 11, 2026 05:00
Contributor Author

Aphroq commented May 11, 2026

Updated this in the compatibility-friendly direction.

The default Chat Completions path no longer raises for non-text-only tool output. It now keeps the run moving with a non-empty placeholder string instead of producing content: []. The hard error is gated behind strict_feature_validation=True, and preserve_tool_output_all_content=True still preserves the full non-text content for compatible providers.

I also added coverage for the default placeholder path, strict-mode failure, mixed text/non-text output, and full-content preservation.
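
For reference, a rough sketch of the three conversion paths, mirroring the new converter tests (import paths are assumed from the modified module locations, and the exact placeholder text follows the reviewer's suggested diff later in this thread):

from agents.exceptions import UserError
from agents.models.chatcmpl_converter import Converter

image_output = {
    "type": "function_call_output",
    "call_id": "somecall",
    "output": [{"type": "input_image", "image_url": "https://example.com/image.png"}],
}

# 1. Default: keep the run moving with a non-empty placeholder instead of content: [].
messages = Converter.items_to_messages([image_output])
assert messages[0]["content"] == "[tool output omitted]"

# 2. Strict mode: fail fast with an explicit UserError.
try:
    Converter.items_to_messages([image_output], strict_feature_validation=True)
except UserError:
    pass

# 3. Compatible providers: opt in to preserving the full non-text content.
preserved = Converter.items_to_messages([image_output], preserve_tool_output_all_content=True)
assert preserved[0]["content"][0]["type"] == "image_url"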

Aphroq changed the title from "fix: #3310 reject empty chat tool outputs" to "fix: #3310 avoid empty chat tool outputs" May 11, 2026
Aphroq marked this pull request as ready for review May 11, 2026 05:13
Member

seratch left a comment


Can you add warn logging when strict validation is off? Here is an example I've explored locally:

diff --git a/src/agents/models/chatcmpl_converter.py b/src/agents/models/chatcmpl_converter.py
index cf39f4aa..821a7d42 100644
--- a/src/agents/models/chatcmpl_converter.py
+++ b/src/agents/models/chatcmpl_converter.py
@@ -47,6 +47,7 @@ from ..agent_output import AgentOutputSchemaBase
 from ..exceptions import AgentsException, UserError
 from ..handoffs import Handoff
 from ..items import TResponseInputItem, TResponseOutputItem
+from ..logger import logger
 from ..model_settings import MCPToolChoice
 from ..tool import (
     FunctionTool,
@@ -66,6 +67,8 @@ ResponseInputContentWithAudioParam = (
     ResponseInputContentParam | ResponseInputAudioParam | dict[str, Any]
 )
 
+_OMITTED_TOOL_OUTPUT_PLACEHOLDER = "[tool output omitted]"
+
 
 class Converter:
     @classmethod
@@ -468,6 +471,7 @@ class Converter:
         preserve_tool_output_all_content: bool = False,
         base_url: str | None = None,
         should_replay_reasoning_content: ShouldReplayReasoningContent | None = None,
+        strict_feature_validation: bool = False,
     ) -> list[ChatCompletionMessageParam]:
         """
         Convert a sequence of 'Item' objects into a list of ChatCompletionMessageParam.
@@ -493,6 +497,8 @@ class Converter:
             should_replay_reasoning_content: Optional hook that decides whether a
                 reasoning item should be replayed into the next assistant message as
                 `reasoning_content`.
+            strict_feature_validation: Whether to raise a UserError for Responses-only
+                features that Chat Completions cannot faithfully represent.
 
         Rules:
         - EasyInputMessage or InputMessage (role=user) => ChatCompletionUserMessageParam
@@ -748,6 +754,19 @@ class Converter:
                             for c in all_output_content
                             if c.get("type") == "text"
                         ]
+                        if not tool_result_content:
+                            message = (
+                                "Chat Completions tool outputs cannot be empty or contain only "
+                                "non-text content unless preserve_tool_output_all_content=True."
+                            )
+                            if strict_feature_validation:
+                                raise UserError(message)
+                            logger.warning(
+                                "%s Replacing the tool output with a placeholder; enable strict "
+                                "feature validation to raise an error instead.",
+                                message,
+                            )
+                            tool_result_content = _OMITTED_TOOL_OUTPUT_PLACEHOLDER
                 msg: ChatCompletionToolMessageParam = {
                     "role": "tool",
                     "tool_call_id": func_output["call_id"],
diff --git a/src/agents/models/multi_provider.py b/src/agents/models/multi_provider.py
index fb609778..2dd5af20 100644
--- a/src/agents/models/multi_provider.py
+++ b/src/agents/models/multi_provider.py
@@ -109,8 +109,8 @@ class MultiProvider(ModelProvider):
                 responses API.
             openai_strict_feature_validation: Whether OpenAI Chat Completions models should raise
                 a UserError when callers pass Responses-only features such as previous_response_id,
-                conversation_id, or prompt. Defaults to False, which preserves the previous
-                ignore-and-warn behavior.
+                conversation_id, prompt, or non-text-only tool outputs. Defaults to False, which
+                preserves the default compatibility behavior.
             openai_websocket_base_url: The websocket base URL to use for the OpenAI provider.
                 If not provided, the provider will use `OPENAI_WEBSOCKET_BASE_URL` when set.
             openai_prefix_mode: Controls how ``openai/...`` model strings are interpreted.
diff --git a/src/agents/models/openai_chatcompletions.py b/src/agents/models/openai_chatcompletions.py
index 95b9b64d..dbf4045b 100644
--- a/src/agents/models/openai_chatcompletions.py
+++ b/src/agents/models/openai_chatcompletions.py
@@ -412,6 +412,7 @@ class OpenAIChatCompletionsModel(Model):
             model=self.model,
             base_url=str(self._client.base_url),
             should_replay_reasoning_content=self.should_replay_reasoning_content,
+            strict_feature_validation=self._strict_feature_validation,
         )
 
         if system_instructions:
diff --git a/src/agents/models/openai_provider.py b/src/agents/models/openai_provider.py
index 45b3acbc..4153e659 100644
--- a/src/agents/models/openai_provider.py
+++ b/src/agents/models/openai_provider.py
@@ -74,8 +74,8 @@ class OpenAIProvider(ModelProvider):
                 API.
             strict_feature_validation: Whether Chat Completions models should raise a UserError
                 when callers pass Responses-only features such as previous_response_id,
-                conversation_id, or prompt. Defaults to False, which preserves the previous
-                ignore-and-warn behavior.
+                conversation_id, prompt, or non-text-only tool outputs. Defaults to False, which
+                preserves the default compatibility behavior.
             agent_registration: Optional agent registration configuration.
             responses_websocket_options: Optional low-level websocket keepalive options for the
                 OpenAI Responses websocket transport.
diff --git a/tests/models/test_openai_chatcompletions.py b/tests/models/test_openai_chatcompletions.py
index 399acf22..c18b29f5 100644
--- a/tests/models/test_openai_chatcompletions.py
+++ b/tests/models/test_openai_chatcompletions.py
@@ -320,6 +320,50 @@ async def test_get_response_rejects_prompt_in_strict_mode(monkeypatch) -> None:
         )
 
 
+@pytest.mark.allow_call_model_methods
+@pytest.mark.asyncio
+async def test_get_response_rejects_non_text_tool_output_in_strict_mode() -> None:
+    class DummyCompletions:
+        async def create(self, **kwargs: Any) -> Any:
+            raise AssertionError("chat.completions.create should not run")
+
+    class DummyClient:
+        def __init__(self) -> None:
+            self.chat = type("_Chat", (), {"completions": DummyCompletions()})()
+            self.base_url = httpx.URL("http://fake")
+
+    model = OpenAIChatCompletionsModel(
+        model="gpt-4",
+        openai_client=DummyClient(),  # type: ignore[arg-type]
+        strict_feature_validation=True,
+    )
+
+    with pytest.raises(UserError, match="cannot be empty or contain only non-text content"):
+        await model.get_response(
+            system_instructions=None,
+            input=[
+                {
+                    "type": "function_call_output",
+                    "call_id": "call_image",
+                    "output": [
+                        {
+                            "type": "input_image",
+                            "image_url": "https://example.com/image.png",
+                        }
+                    ],
+                }
+            ],
+            model_settings=ModelSettings(),
+            tools=[],
+            output_schema=None,
+            handoffs=[],
+            tracing=ModelTracing.DISABLED,
+            previous_response_id=None,
+            conversation_id=None,
+            prompt=None,
+        )
+
+
 @pytest.mark.allow_call_model_methods
 @pytest.mark.asyncio
 async def test_get_response_attaches_logprobs(monkeypatch) -> None:
diff --git a/tests/models/test_openai_chatcompletions_converter.py b/tests/models/test_openai_chatcompletions_converter.py
index c0bb95c4..14174c77 100644
--- a/tests/models/test_openai_chatcompletions_converter.py
+++ b/tests/models/test_openai_chatcompletions_converter.py
@@ -23,6 +23,7 @@ These tests exercise both conversion directions:
 
 from __future__ import annotations
 
+import logging
 from typing import Literal, cast
 
 import pytest
@@ -356,6 +357,139 @@ def test_items_to_messages_with_function_output_item():
     assert tool_msg["content"] == func_output_item["output"]
 
 
+def test_items_to_messages_with_non_text_only_function_output_uses_placeholder_by_default(
+    caplog: pytest.LogCaptureFixture,
+):
+    """Default conversion should keep running without sending an empty tool message."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [
+            {
+                "type": "input_image",
+                "image_url": "https://example.com/image.png",
+            }
+        ],
+    }
+
+    with caplog.at_level(logging.WARNING, logger="openai.agents"):
+        messages = Converter.items_to_messages([func_output_item])
+
+    assert len(messages) == 1
+    tool_msg = messages[0]
+    assert tool_msg["role"] == "tool"
+    assert tool_msg["tool_call_id"] == func_output_item["call_id"]
+    assert tool_msg["content"] == "[tool output omitted]"
+    assert "Replacing the tool output with a placeholder" in caplog.text
+
+
+def test_items_to_messages_with_non_text_only_function_output_raises_in_strict_mode():
+    """Strict validation should fail explicitly instead of silently losing the output."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [
+            {
+                "type": "input_image",
+                "image_url": "https://example.com/image.png",
+            }
+        ],
+    }
+
+    with pytest.raises(UserError, match="cannot be empty or contain only non-text content"):
+        Converter.items_to_messages([func_output_item], strict_feature_validation=True)
+
+
+def test_items_to_messages_with_empty_function_output_uses_placeholder_by_default(
+    caplog: pytest.LogCaptureFixture,
+):
+    """Default conversion should not send an empty tool message."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [],
+    }
+
+    with caplog.at_level(logging.WARNING, logger="openai.agents"):
+        messages = Converter.items_to_messages([func_output_item])
+
+    assert len(messages) == 1
+    tool_msg = messages[0]
+    assert tool_msg["role"] == "tool"
+    assert tool_msg["tool_call_id"] == func_output_item["call_id"]
+    assert tool_msg["content"] == "[tool output omitted]"
+    assert "Replacing the tool output with a placeholder" in caplog.text
+
+
+def test_items_to_messages_with_empty_function_output_raises_in_strict_mode():
+    """Strict validation should fail explicitly instead of sending empty output."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [],
+    }
+
+    with pytest.raises(UserError, match="cannot be empty or contain only non-text content"):
+        Converter.items_to_messages([func_output_item], strict_feature_validation=True)
+
+
+def test_items_to_messages_with_mixed_function_output_keeps_text_by_default(
+    caplog: pytest.LogCaptureFixture,
+):
+    """Default conversion should preserve text parts and omit unsupported non-text parts."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [
+            {"type": "input_text", "text": "visible text"},
+            {
+                "type": "input_image",
+                "image_url": "https://example.com/image.png",
+            },
+        ],
+    }
+
+    with caplog.at_level(logging.WARNING, logger="openai.agents"):
+        messages = Converter.items_to_messages([func_output_item])
+
+    assert len(messages) == 1
+    tool_msg = messages[0]
+    assert tool_msg["role"] == "tool"
+    assert tool_msg["tool_call_id"] == func_output_item["call_id"]
+    assert tool_msg["content"] == [{"type": "text", "text": "visible text"}]
+    assert "tool output omitted" not in caplog.text
+
+
+def test_items_to_messages_can_preserve_non_text_function_output() -> None:
+    """Compatible providers can opt in to preserving non-text tool output."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [
+            {
+                "type": "input_image",
+                "image_url": "https://example.com/image.png",
+            }
+        ],
+    }
+
+    messages = Converter.items_to_messages(
+        [func_output_item],
+        preserve_tool_output_all_content=True,
+    )
+
+    assert len(messages) == 1
+    tool_msg = messages[0]
+    assert tool_msg["role"] == "tool"
+    assert tool_msg["tool_call_id"] == func_output_item["call_id"]
+    assert tool_msg["content"] == [
+        {
+            "type": "image_url",
+            "image_url": {"url": "https://example.com/image.png", "detail": "auto"},
+        }
+    ]
+
+
 def test_extract_all_and_text_content_for_strings_and_lists():
     """
     The converter provides helpers for extracting user-supplied message content

seratch added this to the 0.17.x milestone May 11, 2026
Development

Successfully merging this pull request may close these issues.

Chat Completions converter can send empty tool output for non-text results
