Python: Bug: Streaming replies error – final ResponseOutputMessage missing .delta on large runs #12296

Closed
@ltwlf

Description

Describe the bug
Invoking an AzureResponseAgent that produces a function‑calling answer whose total context grows to many tokens (the exact threshold is still unclear) causes the run to terminate with:

AttributeError: 'ResponseOutputMessage' object has no attribute 'delta'

To Reproduce

  1. Use the latest main branch of Semantic Kernel.
  2. Configure an Azure OpenAI GPT‑4 model that supports tool/function calling (e.g. gpt‑4o).
  3. Create an AzureResponseAgent with at least one function/tool; keep all defaults (see the sketch after this list).
  4. Prompt the agent so that the response invokes the tool and the combined context becomes very large (e.g. ask for a deep recursive JSON structure).
  5. Observe the run end with the AttributeError above.
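
A minimal reproduction sketch is shown below. The class name AzureResponseAgent (as used in this report), its import path, constructor arguments, and invoke_stream are assumptions based on Semantic Kernel's agent patterns and may not match main exactly; the JsonPlugin tool exists only to inflate the response.

```python
# Hypothetical repro sketch; AzureResponseAgent (as named above), its import
# path, constructor arguments, and invoke_stream are assumptions and may not
# match the exact API on main. Azure endpoint/deployment are expected to come
# from the usual environment variables.
import asyncio
import json

from semantic_kernel.functions import kernel_function


class JsonPlugin:
    """Illustrative tool that returns a large, deeply nested JSON payload."""

    @kernel_function(description="Return a deep recursive JSON structure.")
    def deep_json(self, depth: int = 12) -> str:
        node: dict = {}
        for i in range(depth):
            node = {"level": i, "padding": "x" * 200, "child": node}
        return json.dumps(node)


async def main() -> None:
    from semantic_kernel.agents import AzureResponseAgent  # assumed import path

    agent = AzureResponseAgent(
        name="repro-agent",
        instructions="Always call deep_json and include its full output.",
        plugins=[JsonPlugin()],
    )

    # Streaming the large tool-calling reply is what eventually raises:
    # AttributeError: 'ResponseOutputMessage' object has no attribute 'delta'
    async for chunk in agent.invoke_stream(
        messages="Call deep_json with depth=12 and echo the entire result."
    ):
        print(chunk, end="")


if __name__ == "__main__":
    asyncio.run(main())
```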

Expected behavior
The agent run should finish cleanly and deliver the complete assistant message (including tool_calls) without raising an exception, regardless of response length.

Screenshots
N/A – console traceback available upon request.

Platform

  • Language: Python 3.13.3
  • Source: main branch
  • AI model: Azure OpenAI gpt‑4.1
  • IDE: VS Code
  • OS: Windows 11 

Additional context

  • Bug has surfaced in two separate apps under active development.
  • The root cause appears to be that, for very large tool‑calling completions, the stream switches from the usual …MessageDeltaEvent objects to a final ResponseOutputMessage (which has no .delta attribute); the precise size boundary still needs investigation. A defensive handling sketch follows this list.
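
To illustrate the suspected failure mode, here is a minimal sketch of consuming the stream defensively so that items without a .delta attribute (such as a final ResponseOutputMessage) are collected instead of raising. The handling is an assumption based on the traceback above, not the actual Semantic Kernel code path.

```python
# Illustrative sketch only: consume a Responses API stream without assuming
# every event has `.delta`. This is an assumption based on the traceback; the
# actual Semantic Kernel call site may differ.
from typing import Any, AsyncIterable


async def collect_stream(stream: AsyncIterable[Any]) -> tuple[str, list[Any]]:
    """Accumulate text deltas and keep non-delta items (e.g. a final
    ResponseOutputMessage or tool-call items) instead of raising."""
    text_chunks: list[str] = []
    final_items: list[Any] = []
    async for event in stream:
        delta = getattr(event, "delta", None)
        if isinstance(delta, str):
            # Normal streaming path: a text delta event.
            text_chunks.append(delta)
        else:
            # No usable `.delta` (e.g. a completed output message); keep it so
            # the caller can assemble the final assistant message / tool_calls.
            final_items.append(event)
    return "".join(text_chunks), final_items
```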

Labels

agents, bug (Something isn't working), python (Pull requests for the Python Semantic Kernel)
