
fix: #3310 avoid empty chat tool outputs #3312

Open
Aphroq wants to merge 1 commit into openai:main from Aphroq:fix/chatcmpl-nontext-tool-output-empty

Conversation

Contributor

Aphroq commented May 9, 2026

Summary

  • Preserve the default Chat Completions behavior for non-text-only function tool outputs by converting them to a non-empty placeholder string instead of raising.
  • Gate the explicit UserError behind strict_feature_validation=True so callers can opt in to fail-fast validation (see the sketch after this list).
  • Keep preserve_tool_output_all_content=True behavior unchanged for compatible third-party providers, and keep the default behavior for mixed text/non-text outputs of preserving only the text parts.
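
A minimal sketch of the opt-in surface, assuming the constructor parameter exercised by the new tests and the import path of the modified module (the client setup here is illustrative):

from openai import AsyncOpenAI

from agents.models.openai_chatcompletions import OpenAIChatCompletionsModel

# Default (strict_feature_validation=False): non-text-only tool outputs are
# converted to a placeholder string and the run continues.
compat_model = OpenAIChatCompletionsModel(
    model="gpt-4",
    openai_client=AsyncOpenAI(),
)

# Opt-in fail-fast: the same input raises UserError during conversion instead.
strict_model = OpenAIChatCompletionsModel(
    model="gpt-4",
    openai_client=AsyncOpenAI(),
    strict_feature_validation=True,
)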

Test plan

  • uv run pytest tests/models/test_openai_chatcompletions_converter.py -k "non_text_function_output or function_output_item"
  • bash .agents/skills/code-change-verification/scripts/run.sh

Issue number

Closes #3310

Checks

  • I've added new tests (if relevant)
  • I've added/updated the relevant documentation
  • I've run make lint and make format
  • I've made sure tests pass

github-actions bot added the bug (Something isn't working) and feature:chat-completions labels May 9, 2026
Aphroq changed the title from "fix(models): reject empty chat tool outputs" to "fix: #3310 reject empty chat tool outputs" May 9, 2026
Member

seratch left a comment


One concern: do we actually need to reject here?

I agree the current content: [] conversion is lossy and misleading for non-text-only tool outputs, but changing it to raise UserError is still a behavior change. Existing Chat Completions users who return ToolOutputImage / ToolOutputFileContent currently get a successful run, even if the model receives an empty tool message. With this PR, the same workflow starts failing.

Could we consider a less breaking fallback instead? For example, preserve the existing "run continues" behavior but avoid an empty tool message by converting the non-text-only result to a textual placeholder/serialized representation, or gate the hard error behind an explicit strict mode.

If we do keep the hard error, I think the PR should call out that this is an intentional behavior change and explain why fail-fast is preferable to preserving compatibility here.
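
For concreteness, the shape under discussion looks roughly like this (a hand-written sketch of the resulting Chat Completions message, not literal SDK output; the call id is hypothetical):

# Pre-fix default conversion of an image-only tool result: the run "succeeds",
# but the model receives a tool message with no content at all.
empty_tool_msg = {
    "role": "tool",
    "tool_call_id": "call_image",  # hypothetical call id
    "content": [],  # lossy and misleading: no usable representation of the result
}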

seratch marked this pull request as draft May 9, 2026 23:57
Contributor Author

Aphroq commented May 10, 2026

Thanks, that compatibility concern makes sense. We probably should not make this a default UserError, especially because existing users can currently continue the run even if the model receives content: [].

From a compatibility and consistency perspective, we could consider gating the hard error behind strict_feature_validation=True: keep the default path compatible, and raise explicitly in strict mode instead of silently producing an empty tool message. This also matches the recent custom tool call handling.

My main concern with a default textual placeholder / serialized fallback is that it may make the model think it received a meaningful representation of image/file/audio output, even though the default Chat Completions path still cannot really consume that non-text content.

If we did keep the default hard error, I agree the PR should call that out as an intentional behavior change. The fail-fast rationale would be that the existing default conversion is not just lossy, but silently misleading: a non-text-only tool output becomes content: [], so the run appears to succeed even though the model receives no usable representation of the tool result. That can hide integration bugs and lead to follow-up model behavior based on an empty tool message.

Which direction would you prefer we take here: strict-mode gating, a default placeholder/serialized fallback, or keeping the default hard error while explicitly documenting it as an intentional behavior change?

Aphroq force-pushed the fix/chatcmpl-nontext-tool-output-empty branch from c0032ed to 7e89518 May 11, 2026 05:00
Contributor Author

Aphroq commented May 11, 2026

Updated this in the compatibility-friendly direction.

The default Chat Completions path no longer raises for non-text-only tool output. It now keeps the run moving with a non-empty placeholder string instead of producing content: []. The hard error is gated behind strict_feature_validation=True, and preserve_tool_output_all_content=True still preserves the full non-text content for compatible providers.

I also added coverage for the default placeholder path, strict-mode failure, mixed text/non-text output, and full-content preservation.
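
For reference, a rough sketch of the three conversion paths, mirroring the new converter tests (import paths are assumed from the modified module locations, and the exact placeholder text follows the reviewer's suggested diff later in this thread):

from agents.exceptions import UserError
from agents.models.chatcmpl_converter import Converter

image_output = {
    "type": "function_call_output",
    "call_id": "somecall",
    "output": [{"type": "input_image", "image_url": "https://example.com/image.png"}],
}

# 1. Default: keep the run moving with a non-empty placeholder instead of content: [].
messages = Converter.items_to_messages([image_output])
assert messages[0]["content"] == "[tool output omitted]"

# 2. Strict mode: fail fast with an explicit UserError.
try:
    Converter.items_to_messages([image_output], strict_feature_validation=True)
except UserError:
    pass

# 3. Compatible providers: opt in to preserving the full non-text content.
preserved = Converter.items_to_messages([image_output], preserve_tool_output_all_content=True)
assert preserved[0]["content"][0]["type"] == "image_url"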

Aphroq changed the title from "fix: #3310 reject empty chat tool outputs" to "fix: #3310 avoid empty chat tool outputs" May 11, 2026
Aphroq marked this pull request as ready for review May 11, 2026 05:13
Member

seratch left a comment


Can you add warn logging when strict validation is off? Here is an example I've explored locally:

diff --git a/src/agents/models/chatcmpl_converter.py b/src/agents/models/chatcmpl_converter.py
index cf39f4aa..821a7d42 100644
--- a/src/agents/models/chatcmpl_converter.py
+++ b/src/agents/models/chatcmpl_converter.py
@@ -47,6 +47,7 @@ from ..agent_output import AgentOutputSchemaBase
 from ..exceptions import AgentsException, UserError
 from ..handoffs import Handoff
 from ..items import TResponseInputItem, TResponseOutputItem
+from ..logger import logger
 from ..model_settings import MCPToolChoice
 from ..tool import (
     FunctionTool,
@@ -66,6 +67,8 @@ ResponseInputContentWithAudioParam = (
     ResponseInputContentParam | ResponseInputAudioParam | dict[str, Any]
 )
 
+_OMITTED_TOOL_OUTPUT_PLACEHOLDER = "[tool output omitted]"
+
 
 class Converter:
     @classmethod
@@ -468,6 +471,7 @@ class Converter:
         preserve_tool_output_all_content: bool = False,
         base_url: str | None = None,
         should_replay_reasoning_content: ShouldReplayReasoningContent | None = None,
+        strict_feature_validation: bool = False,
     ) -> list[ChatCompletionMessageParam]:
         """
         Convert a sequence of 'Item' objects into a list of ChatCompletionMessageParam.
@@ -493,6 +497,8 @@ class Converter:
             should_replay_reasoning_content: Optional hook that decides whether a
                 reasoning item should be replayed into the next assistant message as
                 `reasoning_content`.
+            strict_feature_validation: Whether to raise a UserError for Responses-only
+                features that Chat Completions cannot faithfully represent.
 
         Rules:
         - EasyInputMessage or InputMessage (role=user) => ChatCompletionUserMessageParam
@@ -748,6 +754,19 @@ class Converter:
                             for c in all_output_content
                             if c.get("type") == "text"
                         ]
+                        if not tool_result_content:
+                            message = (
+                                "Chat Completions tool outputs cannot be empty or contain only "
+                                "non-text content unless preserve_tool_output_all_content=True."
+                            )
+                            if strict_feature_validation:
+                                raise UserError(message)
+                            logger.warning(
+                                "%s Replacing the tool output with a placeholder; enable strict "
+                                "feature validation to raise an error instead.",
+                                message,
+                            )
+                            tool_result_content = _OMITTED_TOOL_OUTPUT_PLACEHOLDER
                 msg: ChatCompletionToolMessageParam = {
                     "role": "tool",
                     "tool_call_id": func_output["call_id"],
diff --git a/src/agents/models/multi_provider.py b/src/agents/models/multi_provider.py
index fb609778..2dd5af20 100644
--- a/src/agents/models/multi_provider.py
+++ b/src/agents/models/multi_provider.py
@@ -109,8 +109,8 @@ class MultiProvider(ModelProvider):
                 responses API.
             openai_strict_feature_validation: Whether OpenAI Chat Completions models should raise
                 a UserError when callers pass Responses-only features such as previous_response_id,
-                conversation_id, or prompt. Defaults to False, which preserves the previous
-                ignore-and-warn behavior.
+                conversation_id, prompt, or non-text-only tool outputs. Defaults to False, which
+                preserves the default compatibility behavior.
             openai_websocket_base_url: The websocket base URL to use for the OpenAI provider.
                 If not provided, the provider will use `OPENAI_WEBSOCKET_BASE_URL` when set.
             openai_prefix_mode: Controls how ``openai/...`` model strings are interpreted.
diff --git a/src/agents/models/openai_chatcompletions.py b/src/agents/models/openai_chatcompletions.py
index 95b9b64d..dbf4045b 100644
--- a/src/agents/models/openai_chatcompletions.py
+++ b/src/agents/models/openai_chatcompletions.py
@@ -412,6 +412,7 @@ class OpenAIChatCompletionsModel(Model):
             model=self.model,
             base_url=str(self._client.base_url),
             should_replay_reasoning_content=self.should_replay_reasoning_content,
+            strict_feature_validation=self._strict_feature_validation,
         )
 
         if system_instructions:
diff --git a/src/agents/models/openai_provider.py b/src/agents/models/openai_provider.py
index 45b3acbc..4153e659 100644
--- a/src/agents/models/openai_provider.py
+++ b/src/agents/models/openai_provider.py
@@ -74,8 +74,8 @@ class OpenAIProvider(ModelProvider):
                 API.
             strict_feature_validation: Whether Chat Completions models should raise a UserError
                 when callers pass Responses-only features such as previous_response_id,
-                conversation_id, or prompt. Defaults to False, which preserves the previous
-                ignore-and-warn behavior.
+                conversation_id, prompt, or non-text-only tool outputs. Defaults to False, which
+                preserves the default compatibility behavior.
             agent_registration: Optional agent registration configuration.
             responses_websocket_options: Optional low-level websocket keepalive options for the
                 OpenAI Responses websocket transport.
diff --git a/tests/models/test_openai_chatcompletions.py b/tests/models/test_openai_chatcompletions.py
index 399acf22..c18b29f5 100644
--- a/tests/models/test_openai_chatcompletions.py
+++ b/tests/models/test_openai_chatcompletions.py
@@ -320,6 +320,50 @@ async def test_get_response_rejects_prompt_in_strict_mode(monkeypatch) -> None:
         )
 
 
+@pytest.mark.allow_call_model_methods
+@pytest.mark.asyncio
+async def test_get_response_rejects_non_text_tool_output_in_strict_mode() -> None:
+    class DummyCompletions:
+        async def create(self, **kwargs: Any) -> Any:
+            raise AssertionError("chat.completions.create should not run")
+
+    class DummyClient:
+        def __init__(self) -> None:
+            self.chat = type("_Chat", (), {"completions": DummyCompletions()})()
+            self.base_url = httpx.URL("http://fake")
+
+    model = OpenAIChatCompletionsModel(
+        model="gpt-4",
+        openai_client=DummyClient(),  # type: ignore[arg-type]
+        strict_feature_validation=True,
+    )
+
+    with pytest.raises(UserError, match="cannot be empty or contain only non-text content"):
+        await model.get_response(
+            system_instructions=None,
+            input=[
+                {
+                    "type": "function_call_output",
+                    "call_id": "call_image",
+                    "output": [
+                        {
+                            "type": "input_image",
+                            "image_url": "https://example.com/image.png",
+                        }
+                    ],
+                }
+            ],
+            model_settings=ModelSettings(),
+            tools=[],
+            output_schema=None,
+            handoffs=[],
+            tracing=ModelTracing.DISABLED,
+            previous_response_id=None,
+            conversation_id=None,
+            prompt=None,
+        )
+
+
 @pytest.mark.allow_call_model_methods
 @pytest.mark.asyncio
 async def test_get_response_attaches_logprobs(monkeypatch) -> None:
diff --git a/tests/models/test_openai_chatcompletions_converter.py b/tests/models/test_openai_chatcompletions_converter.py
index c0bb95c4..14174c77 100644
--- a/tests/models/test_openai_chatcompletions_converter.py
+++ b/tests/models/test_openai_chatcompletions_converter.py
@@ -23,6 +23,7 @@ These tests exercise both conversion directions:
 
 from __future__ import annotations
 
+import logging
 from typing import Literal, cast
 
 import pytest
@@ -356,6 +357,139 @@ def test_items_to_messages_with_function_output_item():
     assert tool_msg["content"] == func_output_item["output"]
 
 
+def test_items_to_messages_with_non_text_only_function_output_uses_placeholder_by_default(
+    caplog: pytest.LogCaptureFixture,
+):
+    """Default conversion should keep running without sending an empty tool message."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [
+            {
+                "type": "input_image",
+                "image_url": "https://example.com/image.png",
+            }
+        ],
+    }
+
+    with caplog.at_level(logging.WARNING, logger="openai.agents"):
+        messages = Converter.items_to_messages([func_output_item])
+
+    assert len(messages) == 1
+    tool_msg = messages[0]
+    assert tool_msg["role"] == "tool"
+    assert tool_msg["tool_call_id"] == func_output_item["call_id"]
+    assert tool_msg["content"] == "[tool output omitted]"
+    assert "Replacing the tool output with a placeholder" in caplog.text
+
+
+def test_items_to_messages_with_non_text_only_function_output_raises_in_strict_mode():
+    """Strict validation should fail explicitly instead of silently losing the output."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [
+            {
+                "type": "input_image",
+                "image_url": "https://example.com/image.png",
+            }
+        ],
+    }
+
+    with pytest.raises(UserError, match="cannot be empty or contain only non-text content"):
+        Converter.items_to_messages([func_output_item], strict_feature_validation=True)
+
+
+def test_items_to_messages_with_empty_function_output_uses_placeholder_by_default(
+    caplog: pytest.LogCaptureFixture,
+):
+    """Default conversion should not send an empty tool message."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [],
+    }
+
+    with caplog.at_level(logging.WARNING, logger="openai.agents"):
+        messages = Converter.items_to_messages([func_output_item])
+
+    assert len(messages) == 1
+    tool_msg = messages[0]
+    assert tool_msg["role"] == "tool"
+    assert tool_msg["tool_call_id"] == func_output_item["call_id"]
+    assert tool_msg["content"] == "[tool output omitted]"
+    assert "Replacing the tool output with a placeholder" in caplog.text
+
+
+def test_items_to_messages_with_empty_function_output_raises_in_strict_mode():
+    """Strict validation should fail explicitly instead of sending empty output."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [],
+    }
+
+    with pytest.raises(UserError, match="cannot be empty or contain only non-text content"):
+        Converter.items_to_messages([func_output_item], strict_feature_validation=True)
+
+
+def test_items_to_messages_with_mixed_function_output_keeps_text_by_default(
+    caplog: pytest.LogCaptureFixture,
+):
+    """Default conversion should preserve text parts and omit unsupported non-text parts."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [
+            {"type": "input_text", "text": "visible text"},
+            {
+                "type": "input_image",
+                "image_url": "https://example.com/image.png",
+            },
+        ],
+    }
+
+    with caplog.at_level(logging.WARNING, logger="openai.agents"):
+        messages = Converter.items_to_messages([func_output_item])
+
+    assert len(messages) == 1
+    tool_msg = messages[0]
+    assert tool_msg["role"] == "tool"
+    assert tool_msg["tool_call_id"] == func_output_item["call_id"]
+    assert tool_msg["content"] == [{"type": "text", "text": "visible text"}]
+    assert "tool output omitted" not in caplog.text
+
+
+def test_items_to_messages_can_preserve_non_text_function_output() -> None:
+    """Compatible providers can opt in to preserving non-text tool output."""
+    func_output_item: FunctionCallOutput = {
+        "type": "function_call_output",
+        "call_id": "somecall",
+        "output": [
+            {
+                "type": "input_image",
+                "image_url": "https://example.com/image.png",
+            }
+        ],
+    }
+
+    messages = Converter.items_to_messages(
+        [func_output_item],
+        preserve_tool_output_all_content=True,
+    )
+
+    assert len(messages) == 1
+    tool_msg = messages[0]
+    assert tool_msg["role"] == "tool"
+    assert tool_msg["tool_call_id"] == func_output_item["call_id"]
+    assert tool_msg["content"] == [
+        {
+            "type": "image_url",
+            "image_url": {"url": "https://example.com/image.png", "detail": "auto"},
+        }
+    ]
+
+
 def test_extract_all_and_text_content_for_strings_and_lists():
     """
     The converter provides helpers for extracting user-supplied message content

seratch added this to the 0.17.x milestone May 11, 2026
Development

Successfully merging this pull request may close these issues.

Chat Completions converter can send empty tool output for non-text results
