fix: Chat Completions streaming output indexes#3108

Open
Aphroq wants to merge 3 commits into openai:main from Aphroq:fix/chatcmpl-fallback-output-index

Conversation

@Aphroq (Contributor) commented May 4, 2026

Summary

  • ensure Chat Completions fallback function calls receive distinct stream output_index values
  • compute assistant message output_index explicitly instead of passing a bool expression into event constructors
  • add regression coverage for multiple fallback tool calls and reasoning-before-text stream events
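
The bool-expression issue in the second bullet can be sketched in a few lines (function names here are hypothetical, not the SDK's): `bool` is an `int` subclass, so the arithmetic still works, but the event field then carries `True` instead of `1`, which is visible when the event is serialized.

```python
import json

def message_output_index_buggy(reasoning_item_emitted: bool):
    # Buggy shape: the bool expression is passed straight through,
    # so output_index serializes as `true` rather than `1`.
    return reasoning_item_emitted

def message_output_index_fixed(reasoning_item_emitted: bool) -> int:
    # Fixed shape: compute an explicit integer before constructing the event.
    return 1 if reasoning_item_emitted else 0

assert json.dumps({"output_index": message_output_index_buggy(True)}) == '{"output_index": true}'
assert json.dumps({"output_index": message_output_index_fixed(True)}) == '{"output_index": 1}'
```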

Test plan

  • uv run pytest tests/models/test_openai_chatcompletions_stream.py -k 'fallback_tool_calls_use_distinct_output_indexes'
  • uv run pytest tests/models/test_reasoning_content.py -k 'stream_response_yields_events_for_reasoning_content'
  • bash .agents/skills/code-change-verification/scripts/run.sh

Issue number

Closes #3104
Closes #3109

Checks

  • I've added new tests (if relevant)
  • I've added/updated the relevant documentation
  • I've run make lint and make format
  • I've made sure tests pass

@Aphroq changed the title from "Fix Chat Completions fallback output indexes" to "Fix Chat Completions streaming output indexes" on May 4, 2026
@seratch (Member) commented May 4, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Another round soon, please!


@seratch (Member) left a comment

Thanks for the PR. I think this is directionally valuable: the fallback-only duplicate output_index bug is real, and the explicit integer helper for assistant message indexes also fixes the True vs 1 issue after reasoning output.

Before merging, could you cover and fix one mixed case as well? If one tool call is first seen without enough data to enter the real-time streaming path, and a later tool call does enter the real-time streaming path, the stream event indexes can still disagree with the final response.output order.

A minimal shape is:

  1. tool call index 0 arrives without id, so it remains fallback-only
  2. tool call index 1 arrives with id and name, so it starts streaming immediately
  3. the completed response output order is [fallback_first, streamed_second]

With the current PR, the streamed call can emit response.output_item.added at output_index=0, while the fallback call is emitted later at output_index=1. That makes the event indexes map to [streamed_second, fallback_first], which does not match the completed response output order.

Could you add a regression test for mixed fallback + streamed tool calls and adjust the index assignment so all tool calls use stable indexes that match the final output order?
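
That index drift can be shown with a minimal two-pass simulation of the adapter's behavior as described above (all names hypothetical; the real logic lives in the SDK's Chat Completions streaming code):

```python
# Streamed calls take the next index immediately; fallback-only calls are
# indexed later, at completion time, after every streamed call.
def emitted_event_indexes(calls: list[tuple[str, bool]]) -> list[tuple[str, int]]:
    """calls: (label, enters_real_time_streaming) in arrival order."""
    events: list[tuple[str, int]] = []
    next_index = 0
    for label, streams in calls:        # real-time streaming path
        if streams:
            events.append((label, next_index))
            next_index += 1
    for label, streams in calls:        # fallback path, at completion
        if not streams:
            events.append((label, next_index))
            next_index += 1
    return events

# Arrival order matches the final response.output order,
# [fallback_first, streamed_second], but the emitted indexes do not:
assert emitted_event_indexes(
    [("fallback_first", False), ("streamed_second", True)]
) == [("streamed_second", 0), ("fallback_first", 1)]
```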

@seratch added this to the 0.15.x milestone on May 4, 2026
@seratch changed the title from "Fix Chat Completions streaming output indexes" to "fix: Chat Completions streaming output indexes" on May 4, 2026
@Aphroq (Contributor, Author) commented May 4, 2026

Thanks for the detailed catch. Agreed; I'll add regression coverage for this path and update the implementation accordingly.

@github-actions bot added the "bug (Something isn't working)" label on May 4, 2026
@Aphroq force-pushed the fix/chatcmpl-fallback-output-index branch from ba02386 to 49142b5 on May 4, 2026 12:02
@Aphroq (Contributor, Author) commented May 4, 2026

Updated based on the feedback.

I kept the fix focused on the Chat Completions streaming adapter index semantics: each function call now gets a stable output_index when it is first tracked, and both the streamed and fallback paths reuse that same index. This keeps stream event output_index values aligned with the final response.completed.response.output positions.

Added/expanded regression coverage for:

  • fallback-only multiple tool calls using distinct indexes
  • mixed fallback + streamed tool calls matching final output order
  • tool call indexes offset correctly after assistant text
  • tool call indexes offset correctly after reasoning output

Full local verification passed.
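
The stable-index scheme described above might look roughly like this (a sketch with hypothetical names, not the actual adapter code): each tool call reserves an output_index the first time it is tracked, and later streamed or fallback events for the same call look that index up rather than computing a new one.

```python
class OutputIndexTracker:
    def __init__(self, base: int = 0) -> None:
        self._next = base                    # indexes already used by prior output items
        self._by_call: dict[int, int] = {}   # chat tool_call index -> stream output_index

    def index_for(self, tool_call_index: int) -> int:
        # First sighting reserves a stable index; every later event reuses it.
        if tool_call_index not in self._by_call:
            self._by_call[tool_call_index] = self._next
            self._next += 1
        return self._by_call[tool_call_index]

# With assistant text already emitted at output_index 0, tool calls start at 1:
tracker = OutputIndexTracker(base=1)
assert tracker.index_for(0) == 1   # first sighting reserves the index
assert tracker.index_for(1) == 2
assert tracker.index_for(0) == 1   # fallback emission reuses the same index
```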

@seratch (Member) commented May 4, 2026

Thanks for the update. The mixed fallback + streamed tool call case from the previous review is fixed now, and the added coverage looks good for fallback-only, mixed tool calls, text-before-tools, and reasoning-before-tools.

I found one remaining ordering regression before merging: when a tool call is first seen before any assistant text, and assistant text is streamed later, the new "assign the function call output_index when first tracked" approach can reserve 0 for the function call. The final completed response still orders the assistant message before function calls, so the message also uses output_index=0 and the function call should be shifted to 1.

Minimal shape:

  1. chunk 1 has tool_calls=[index=0] without id, so it stays fallback-only
  2. chunk 2 has content="answer"
  3. response.completed.response.output is [message, function_call]

On the PR head, the function call response.output_item.added is emitted with output_index=0, colliding with the assistant message. The same probe passes on the PR base because the fallback output index is computed after text exists.

Could you add a regression test for tool-call-before-text and adjust the stable index assignment so late-created assistant messages can still occupy their final output position?
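
The collision can be sketched with a toy model of eager index reservation (names hypothetical; this is an illustration of the failure shape, not the SDK's code):

```python
def naive_event_indexes(chunks: list[str]) -> list[tuple[str, int]]:
    """chunks: 'tool' or 'text' markers in arrival order."""
    events: list[tuple[str, int]] = []
    next_index = 0
    tool_index: int | None = None
    for kind in chunks:
        if kind == "tool":
            if tool_index is None:
                tool_index = next_index      # eagerly reserves index 0
                next_index += 1
            events.append(("function_call.added", tool_index))
        else:
            # The completed response orders the assistant message before
            # function calls, so the message must use output_index 0.
            events.append(("message.added", 0))
    return events

# Tool call arrives before any text: both items end up claiming output_index 0.
assert naive_event_indexes(["tool", "text"]) == [
    ("function_call.added", 0),
    ("message.added", 0),
]
```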


Labels

bug (Something isn't working), feature:chat-completions


2 participants