fix(realtime): handle typed/Mapping g711 formats in audio length calc by adityasingh2400 · Pull Request #3196 · openai/openai-agents-python

adityasingh2400 · 2026-05-08T09:29:25Z

Summary

calculate_audio_length_ms only special-cased format strings starting with g711, so g711 audio passed as

a typed AudioPCMU / AudioPCMA pydantic model,
a Mapping like {\"type\": \"audio/pcmu\"}, or
the API-style \"audio/pcmu\" / \"audio/pcma\" string

silently fell back to PCM-24kHz math. That yields a ~6x wrong duration (10ms of g711 reads as ~1.67ms) and miscalculates truncation offsets on interrupt for SIP/Twilio sessions whose callers wired RealtimePlaybackTracker.set_audio_format / ModelAudioTracker.set_audio_format with the typed shape that RealtimeAudioFormat already advertises.

RealtimeAudioFormat is declared as str | Mapping[str, Any] | OpenAIRealtimeAudioFormats, but the length helper only honored the first form. Internal callers (OpenAIRealtimeWebSocketModel._extract_audio_format) always normalize to one of pcm16/g711_ulaw/g711_alaw, so the bug only surfaces for user-facing entry points (custom playback trackers, library users mirroring the public type).

This PR normalizes the format to a lower-cased type identifier up front and treats audio/pcmu / audio/pcma as g711 alongside the existing g711_* prefix. PCM behavior and the existing None fallback are unchanged.

In the spirit of recently merged #3141 (audio deltas before format negotiation) and #3140 — keep the realtime audio path defensive against the input shapes the public types already permit.

Test

tests/realtime/test_playback_tracker.py::test_audio_length_calculation_handles_typed_and_mapping_g711_formats exercises typed AudioPCM/AudioPCMU/AudioPCMA, Mapping {type: audio/pcm*}, and string audio/pcmu / audio/pcma inputs and asserts the g711 forms resolve to the 8 kHz sample rate while PCM stays on 24 kHz. The test fails on main (g711 forms read as ~1.67ms instead of 10ms) and passes with this fix. Full tests/realtime/ suite (233 tests) passes.

calculate_audio_length_ms only special-cased strings starting with 'g711', so g711 audio passed as a typed AudioPCMU/AudioPCMA pydantic model, a Mapping like {'type': 'audio/pcmu'}, or the API-style 'audio/pcmu' / 'audio/pcma' string silently fell back to PCM-24kHz math. That yielded a ~6x wrong duration and miscalculated truncation offsets on interrupt for SIP/Twilio sessions whose users wired set_audio_format with the typed shape that RealtimeAudioFormat already advertises. Normalize the format to a lower-cased type string up front and treat 'audio/pcmu' / 'audio/pcma' as g711 alongside the existing 'g711_*' prefix. Internal callers always pass normalized strings already, so behavior there is unchanged.

seratch · 2026-05-08T15:58:18Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 878201205b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-08T17:24:56Z

+        type_value = getattr(format, "type", None)
+    return type_value.lower() if isinstance(type_value, str) else None


Classify typed G.711 models even when type is omitted

AudioPCMU/AudioPCMA are accepted typed formats, but their generated type field is optional and defaults to None. Passing AudioPCMU() or AudioPCMA() still falls through to PCM math here because only the attribute value is inspected, so typed G.711 models are not reliably fixed unless callers redundantly set type.

Useful? React with 👍 / 👎.

github-actions Bot added bug Something isn't working feature:realtime labels May 8, 2026

adityasingh2400 mentioned this pull request May 8, 2026

fix(realtime): normalize non-string audio formats in length calc #3242

Closed

3 tasks

chatgpt-codex-connector Bot reviewed May 8, 2026

View reviewed changes

seratch marked this pull request as draft May 8, 2026 17:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(realtime): handle typed/Mapping g711 formats in audio length calc#3196

fix(realtime): handle typed/Mapping g711 formats in audio length calc#3196
adityasingh2400 wants to merge 1 commit intoopenai:mainfrom
adityasingh2400:fix/realtime-audio-length-typed-formats

adityasingh2400 commented May 8, 2026

Uh oh!

seratch commented May 8, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		type_value = getattr(format, "type", None)
		return type_value.lower() if isinstance(type_value, str) else None

Conversation

adityasingh2400 commented May 8, 2026

Summary

Test

Uh oh!

seratch commented May 8, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants