Python: [Bug]: AG-UI history replay can send invalid assistant/tool sequence to OpenAI (tool_calls without matching tool messages) #5855

@adityanile

Description

When using agent_framework.ag_ui with persisted thread history, later requests can intermittently fail with OpenAI/Azure validation errors:

An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'.

The issue appears to originate in the replay/history reconstruction logic: assistant tool-call messages and their tool-result messages become mispaired when the outbound payload is generated.

Code Sample

Error Messages / Stack Traces

{
  "error": {
    "message": "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_xxx",
    "type": "invalid_request_error",
    "param": "messages.[N].role"
  }
}

Package Versions

agent_framework_ag_ui: 1.0.0b260507

Python Version

No response

Additional Context

Observed History Behavior

Persisted history files may contain both:

  • assistant tool-call messages
  • matching tool-result messages

However, during replay/reconstruction, outbound model payloads can still contain:

  • unmatched assistant tool_calls
  • reordered tool messages
  • duplicated call entries
  • missing tool-result messages

This eventually causes provider-side validation failures.
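
To help narrow down which of these cases a given thread hits, the outbound payload can be checked before it is sent. The following is a minimal diagnostic sketch, assuming messages are plain dicts in the OpenAI chat-completions shape (`role`, `tool_calls`, `tool_call_id`). The helper name `find_contract_violations` is hypothetical and is not part of agent_framework.

```python
from typing import Any

Message = dict[str, Any]


def find_contract_violations(messages: list[Message]) -> list[str]:
    """Return readable descriptions of assistant/tool pairing problems.

    Hypothetical helper for debugging; assumes OpenAI-style message dicts.
    """
    violations: list[str] = []
    seen_call_ids: set[str] = set()
    i = 0
    while i < len(messages):
        msg = messages[i]
        role = msg.get("role")
        if role == "assistant" and msg.get("tool_calls"):
            call_ids = [call["id"] for call in msg["tool_calls"]]
            for call_id in call_ids:
                if call_id in seen_call_ids:
                    violations.append(f"duplicate tool call {call_id} at index {i}")
                seen_call_ids.add(call_id)
            # Only the run of tool messages directly after the assistant
            # message counts as a response to its tool calls.
            pending = set(call_ids)
            j = i + 1
            while j < len(messages) and messages[j].get("role") == "tool":
                result_id = messages[j].get("tool_call_id")
                if result_id in pending:
                    pending.discard(result_id)
                else:
                    violations.append(f"unexpected tool message {result_id} at index {j}")
                j += 1
            for call_id in call_ids:
                if call_id in pending:
                    violations.append(f"unanswered tool call {call_id} at index {i}")
            i = j
        elif role == "tool":
            # A tool message that does not directly follow its assistant message.
            violations.append(f"unexpected tool message {msg.get('tool_call_id')} at index {i}")
            i += 1
        else:
            i += 1
    return violations
```

Logging the result of this check next to the persisted history for a failing thread should show whether the pairing is already broken on disk or only breaks during reconstruction.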


Expected Behavior

Framework-generated outbound payloads should always satisfy provider message contracts:

  1. Every assistant tool_call must have a matching tool-result message.
  2. Tool messages must appear immediately after their originating assistant tool-call message.
  3. No orphan tool messages should exist.
  4. History replay and deduplication should remain deterministic under:
    • retries
    • interruptions
    • resumed execution flows
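
One way these guarantees could be enforced is a normalization pass over the payload just before it is sent. The sketch below is illustrative only and is not the framework's implementation. It assumes OpenAI-style message dicts, the name `normalize_tool_pairs` is hypothetical, and the policy of dropping tool calls that have no recorded result is an assumption (raising an error instead would be an equally valid choice).

```python
from typing import Any

Message = dict[str, Any]


def normalize_tool_pairs(messages: list[Message]) -> list[Message]:
    """Rebuild the list so each tool result directly follows its tool call.

    Hypothetical sketch; assumes OpenAI-style message dicts.
    """
    # First occurrence of each tool result wins, so duplicated replays
    # always resolve to the same message.
    results: dict[str, Message] = {}
    for msg in messages:
        call_id = msg.get("tool_call_id")
        if msg.get("role") == "tool" and call_id is not None:
            results.setdefault(call_id, msg)

    normalized: list[Message] = []
    emitted: set[str] = set()
    for msg in messages:
        role = msg.get("role")
        if role == "tool":
            # Re-attached below, right after the originating assistant message.
            continue
        if role == "assistant" and msg.get("tool_calls"):
            # Keep only calls that have a result and were not already emitted
            # (assumed policy: unanswered and duplicate calls are dropped).
            calls = []
            for call in msg["tool_calls"]:
                if call["id"] in results and call["id"] not in emitted:
                    emitted.add(call["id"])
                    calls.append(call)
            if calls:
                normalized.append({**msg, "tool_calls": calls})
                normalized.extend(results[call["id"]] for call in calls)
            elif msg.get("content"):
                # No usable calls left: keep the text, drop the tool_calls key.
                normalized.append({k: v for k, v in msg.items() if k != "tool_calls"})
            continue
        normalized.append(msg)
    return normalized
```

Because the pass depends only on message order and first occurrences, running it twice yields the same output, which is the kind of determinism that retries and resumed runs need.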

Actual Behavior

Under some replay/history scenarios, invalid message sequences are sent to the model provider, resulting in hard 400 failures and unusable conversation threads.

