Make realtime sideband startup async #20715
Merged
This was referenced May 2, 2026
Move the WebRTC sideband websocket connect out of the start critical path. The call-create request still returns the SDP answer synchronously, while the sideband input task connects in the background and uses the existing input channels to queue text, handoff output, and audio until the websocket is ready. Add coverage that a delayed sideband accepts queued text after the SDP answer has already been emitted. Co-authored-by: Codex <noreply@openai.com>
Force-pushed from c5fd3a8 to df4a907.
Force-pushed from 6a7768e to e7207ca.
kmeelu-oai commented May 4, 2026
.await
.map_err(map_api_error)?;
(connection, Some(call.sdp))
let task = spawn_webrtc_sideband_input_task(RealtimeWebrtcSidebandInputTask {
This now also runs run_realtime_input_task, as you'll see below. Alternatively, we could make that clearer by introducing a spawn_webrtc_sideband_and_realtime_input_tasks function and chaining the two (we could even put that logic here, if desired).
kmeelu-oai commented May 4, 2026
event_parser,
} = input;

tokio::spawn(async move {
The big diff hides the fact that we're just putting this within spawn_realtime_input_task, which calls tokio::spawn.
kmeelu-oai commented May 4, 2026
}

let connection = match client
.connect_webrtc_sideband(
FYI: this defaults to 4 retries.
Wait for the sideband outbound request before asserting on the recorded handshake so async sideband startup cannot race the test on musl. Co-authored-by: Codex <noreply@openai.com>
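The race this commit guards against can be illustrated with a minimal, std-only sketch. It is not the test from this PR: `Handshake`, `spawn_sideband`, `wait_for_handshake`, and the example URL are all hypothetical names, and plain threads stand in for the async runtime. The idea is that the test blocks until the background task has actually issued its request, and only then asserts on what was recorded.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical record of an outbound sideband handshake request.
#[derive(Debug, PartialEq)]
struct Handshake {
    url: String,
}

/// Starts a background "sideband" that reports its handshake after `delay`.
fn spawn_sideband(delay: Duration) -> Receiver<Handshake> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Simulates the time the real connection attempt would take.
        thread::sleep(delay);
        // Ignore send errors: the waiter may already have given up.
        let _ = tx.send(Handshake {
            url: "wss://example.invalid/sideband".to_string(),
        });
    });
    rx
}

/// Blocks until the handshake is recorded or `timeout` elapses.
fn wait_for_handshake(
    rx: &Receiver<Handshake>,
    timeout: Duration,
) -> Result<Handshake, RecvTimeoutError> {
    rx.recv_timeout(timeout)
}
```

A test would call `wait_for_handshake` with a generous timeout before inspecting the handshake, rather than asserting immediately after startup returns.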
aibrahim-oai approved these changes May 4, 2026
Summary
Moves the WebRTC realtime sideband websocket join out of the voice start critical path. Call creation still posts the SDP offer and session config synchronously so the client gets the SDP answer, but the sideband websocket now connects asynchronously in the input task and no longer blocks conversation state installation.
This lets the normal realtime input channels buffer text, handoff output, and audio while the WebRTC sideband websocket is connecting. If the sideband join fails while the conversation is still active, the task sends a RealtimeEvent::Error through the existing events_tx / fanout path.
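The buffering behavior described above can be sketched with a std-only analog. This is an illustration under stated assumptions, not the code in this PR: `FakeSideband`, `connect_sideband`, and `start_call` are hypothetical names, threads and `std::sync::mpsc` stand in for tokio tasks and the real input channels, and the error is returned from the task rather than sent as a `RealtimeEvent::Error`.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for the sideband websocket; it only records sends.
struct FakeSideband {
    sent: Vec<String>,
}

/// Simulates a slow websocket handshake; fails when `fail` is set.
fn connect_sideband(delay: Duration, fail: bool) -> Result<FakeSideband, String> {
    thread::sleep(delay);
    if fail {
        Err("sideband connect failed".to_string())
    } else {
        Ok(FakeSideband { sent: Vec::new() })
    }
}

/// Returns the (fake) SDP answer immediately; the sideband connects in the
/// background while callers can already queue input on the returned sender.
fn start_call(
    delay: Duration,
    fail: bool,
) -> (String, Sender<String>, JoinHandle<Result<Vec<String>, String>>) {
    let (input_tx, input_rx) = mpsc::channel::<String>();
    let task = thread::spawn(move || -> Result<Vec<String>, String> {
        // Input sent while we are still connecting waits in the channel.
        let mut sideband = connect_sideband(delay, fail)?;
        // Forward queued and later input until every sender is dropped.
        for msg in input_rx {
            sideband.sent.push(msg);
        }
        Ok(sideband.sent)
    });
    ("fake-sdp-answer".to_string(), input_tx, task)
}
```

Calling `start_call` hands back the answer and the sender before the simulated connect finishes. Messages sent in the meantime are forwarded in order once the connection is up, and a failed connect surfaces as an `Err` from the joined task.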
To rephrase this:
Validation
env CODEX_SKIP_VENDORED_BWRAP=1 cargo test --manifest-path codex-rs/Cargo.toml -p codex-core --test all conversation_webrtc_start_posts_generated_session
CODEX_SKIP_VENDORED_BWRAP=1 is needed in this local environment because libcap.pc is not installed for the vendored bubblewrap build.
Testing
I tested this locally by running
cargo run -p codex-cli --bin codex -- --enable realtime_conversation
and invoking /realtime. Then, we get logs emitted in ~/.codex/log/codex-tui.log.
Before the Change
Logging commit (c0299e6)
After the Change
Logging commit (c8b00ac)
Conclusion
Here we see that we saved about half a second in conversation startup (1532 ms -> 969 ms, a reduction of 563 ms). This is consistent with my sanity tests, in which I saw at most a second of savings.