feat(openai): stream input_audio_transcription delta events#5859

Merged: longcw merged 2 commits into main from longc/realtime-input-transcription-delta on May 27, 2026
@longcw longcw commented May 27, 2026

Summary

Ports livekit/agents-js#1581 — wires conversation.item.input_audio_transcription.delta from the OpenAI Realtime API so user transcripts surface word-by-word as InputTranscriptionCompleted(is_final=False) partials, instead of only firing once on .completed.

Enables streaming user transcripts with gpt-realtime-whisper (and any future delta-emitting transcription model). Previously the .delta branch was a no-op (pass) because partials from the legacy transcription pipeline weren't useful; now that OpenAI streams them in real time, we accumulate and emit them.

Changes

  • Add _input_transcript_accumulators: dict[str, dict[int, str]], keyed by item_id and then by content_index.
  • New _handle_conversion_item_input_audio_transcription_delta handler accumulates and emits is_final=False.
  • _handle_..._completed clears the matching accumulator before emitting the final, so a subsequent delta on the same item_id starts fresh.
  • _handle_..._failed emits a closing is_final=True with the last accumulated partial so consumers waiting on a final don't hang. No-op when no partials had streamed.
  • Accumulators are also cleared on conversation.item.deleted and session reconnect.
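
The accumulator lifecycle described in the list above can be sketched roughly as follows. This is a minimal, standalone illustration and not the plugin's code: TranscriptEvent and TranscriptTracker are hypothetical names, the event is a simplified stand-in for InputTranscriptionCompleted, and emitting the full accumulated text on each partial (rather than only the new delta) is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TranscriptEvent:
    """Hypothetical, simplified stand-in for InputTranscriptionCompleted."""

    item_id: str
    transcript: str
    is_final: bool


@dataclass
class TranscriptTracker:
    """Illustrative sketch: accumulates deltas per item_id, then content_index."""

    accumulators: dict[str, dict[int, str]] = field(default_factory=dict)
    emitted: list[TranscriptEvent] = field(default_factory=list)

    def on_delta(self, item_id: str, content_index: int, delta: str) -> None:
        # Append the delta and emit the text accumulated so far as a partial
        # (assumption: partials carry the accumulated text, not just the delta).
        per_item = self.accumulators.setdefault(item_id, {})
        per_item[content_index] = per_item.get(content_index, "") + delta
        self.emitted.append(TranscriptEvent(item_id, per_item[content_index], False))

    def _pop(self, item_id: str, content_index: int) -> str | None:
        # Remove and return the accumulated partial, or None if nothing streamed.
        per_item = self.accumulators.get(item_id)
        if per_item is None:
            return None
        text = per_item.pop(content_index, None)
        if not per_item:
            del self.accumulators[item_id]
        return text

    def on_completed(self, item_id: str, content_index: int, transcript: str) -> None:
        # Clear first so a later delta on the same item_id starts fresh.
        self._pop(item_id, content_index)
        self.emitted.append(TranscriptEvent(item_id, transcript, True))

    def on_failed(self, item_id: str, content_index: int) -> None:
        # Close with the last partial so consumers waiting on a final don't hang;
        # no-op when no partials had streamed.
        partial = self._pop(item_id, content_index)
        if partial is not None:
            self.emitted.append(TranscriptEvent(item_id, partial, True))

    def on_item_deleted(self, item_id: str) -> None:
        self.accumulators.pop(item_id, None)

    def on_reconnect(self) -> None:
        self.accumulators.clear()
```

In this sketch, two deltas followed by a completion yield two is_final=False events and one is_final=True event carrying the transcript from the .completed payload, after which the accumulator for that item is empty.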

Wire conversation.item.input_audio_transcription.delta from the OpenAI
Realtime API as InputTranscriptionCompleted(is_final=False) partials.
Accumulators are keyed per (item_id, content_index) and cleared on
.completed, .deleted, session reconnect, and on .failed (which now emits
a closing is_final=True when partials had streamed so consumers don't
hang).
@chenghao-mou chenghao-mou requested a review from a team May 27, 2026 02:34

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

@longcw longcw merged commit 541d844 into main May 27, 2026
25 checks passed
@longcw longcw deleted the longc/realtime-input-transcription-delta branch May 27, 2026 03:03

2 participants