
fix(realtime): preserve output_audio content parts in output_item events #3230

Merged
seratch merged 1 commit into openai:main from adityasingh2400:fix/rtoai-output-audio-content
May 9, 2026

@adityasingh2400
Contributor

Summary

The dict-based fast path in _handle_ws_event for response.output_item.added / response.output_item.done only matched part type "audio". GA-style assistant messages emit "output_audio" (matching the official literal Literal["output_text", "output_audio"] on openai.types.realtime.realtime_conversation_item_assistant_message.Content), so audio content + transcript were silently dropped from listeners on these events.

The other early conversion site (_ConversionHelper.conversation_item_to_realtime_message_item at L1588) already maps output_audio → audio; the dict path was inconsistent with it.

This PR accepts both "audio" and "output_audio" and normalizes to the SDK's "audio" type, mirroring how output_text/text are already handled in the same block.
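
For readers who want to see the shape of the normalization, here is a minimal standalone sketch. It is not the code in openai_realtime.py: the function name `normalize_content_parts`, the alias table, and the plain-dict part shapes are assumptions made for illustration, and dropping unknown part types is assumed only because the summary describes unmatched parts as silently dropped.

```python
from typing import Any

# Hypothetical alias table: wire-format part types mapped to SDK-side names.
_PART_TYPE_ALIASES = {
    "output_text": "text",
    "text": "text",
    "output_audio": "audio",
    "audio": "audio",
}


def normalize_content_parts(parts: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of known parts with their type normalized.

    Parts whose type is not in the alias table are skipped (an assumption
    of this sketch, mirroring the "silently dropped" behavior described).
    """
    normalized = []
    for part in parts:
        sdk_type = _PART_TYPE_ALIASES.get(part.get("type"))
        if sdk_type is None:
            continue
        # Copy the part so the caller's dict is not mutated.
        normalized.append({**part, "type": sdk_type})
    return normalized
```

With a table like this, treating "output_audio" the same as "audio" is a one-entry change, which is roughly the scale of the fix described here.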

Changes

  • src/agents/realtime/openai_realtime.py: 1-line tuple change in _handle_ws_event's message-content normalization.
  • tests/realtime/test_openai_realtime.py: focused regression test asserting output_audio parts reach listeners with type="audio" and the transcript preserved.

Diff: 2 files, +35/-1.

Test plan

  • New test test_output_audio_content_type_normalized fails on main, passes after fix.
  • Full tests/realtime/ suite (233 tests) passes.
  • ruff check clean.
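
The "fails on main, passes after fix" behavior can be illustrated with a small self-contained test. This is a sketch, not the test added in tests/realtime/test_openai_realtime.py: `extract_audio_parts` is a hypothetical stand-in for the matching step in _handle_ws_event, and the accepted-types tuple is passed in explicitly so both the old and new behavior can be shown side by side.

```python
from typing import Any


def extract_audio_parts(
    content: list[dict[str, Any]], accepted_types: tuple[str, ...]
) -> list[dict[str, Any]]:
    """Keep parts whose type is accepted, reporting each as type "audio"."""
    return [
        {"type": "audio", "transcript": part.get("transcript")}
        for part in content
        if part.get("type") in accepted_types
    ]


def test_output_audio_part_is_kept_sketch() -> None:
    content = [{"type": "output_audio", "transcript": "hello"}]

    # Old matching: only "audio" is accepted, so the part is dropped.
    assert extract_audio_parts(content, ("audio",)) == []

    # New matching: both spellings are accepted and reported as "audio",
    # with the transcript preserved.
    assert extract_audio_parts(content, ("audio", "output_audio")) == [
        {"type": "audio", "transcript": "hello"}
    ]
```

The real regression test presumably drives the model's event handler and inspects what listeners receive; the sketch only isolates the tuple-membership check that the change widens.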

The dict-based fast path in _handle_ws_event for response.output_item.added
and response.output_item.done only matched part type "audio". GA-style
assistant messages emit "output_audio" (matching the official Content type
literal in openai.types.realtime), so audio content + transcript were
silently dropped from listeners.

Accept both "audio" and "output_audio" and normalize to the SDK's "audio"
type, mirroring how output_text/text are already handled in the same block.
github-actions bot added the bug and feature:realtime labels on May 8, 2026
seratch added this to the 0.17.x milestone on May 8, 2026
@seratch
Member

seratch commented May 9, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Another round soon, please!

seratch merged commit 7e9089f into openai:main on May 9, 2026
10 checks passed