RealtimeSession.interrupt() is a no-op when called without response_id under concurrent in-flight responses (OpenAI plugin)
Summary
RealtimeSession.interrupt() in the OpenAI Realtime plugin sends a ResponseCancelEvent(type="response.cancel") without specifying response_id. Per empirical OpenAI Realtime API behavior, a cancel event without response_id is a no-op when more than one response is in flight — it does not cancel "all" responses; it cancels nothing.
Today this is harmless because the plugin's single-slot _current_generation model only allows one in-flight response at a time, so "the only in-flight response" is unambiguous and the substrate accepts the cancel.
This breaks under any future change that enables concurrent in-flight responses on a single session, e.g. a refactor of _current_generation into a multi-instance state container, which is needed to correctly represent the substrate's existing support for concurrent out-of-band (OOB) responses (see Empirical evidence below).
Reproduction (current state — no concurrency, so this is latent)
    import asyncio

    from livekit.plugins.openai.realtime import RealtimeModel

    session = RealtimeModel(model="gpt-4o-realtime-preview-2025-06-03").session()

    # Today: only one response.create can be in flight at a time
    # (single-slot _current_generation), so interrupt() works correctly.
    fut = session.generate_reply(instructions="Speak slowly: aaaaaa")
    await asyncio.sleep(0.5)
    session.interrupt()  # cancels the only in-flight response; works fine
Reproduction under hypothetical concurrent-response refactor (the latent bug)
    # Hypothetical: dict-based _generations replacing single-slot
    fut_a = await session._fire_oob("Long response A " + " ".join(["a"] * 100))
    fut_b = await session._fire_oob("Long response B " + " ".join(["b"] * 100))
    await asyncio.sleep(0.5)

    # Both responses now in flight (substrate supports this for OOB)
    session.interrupt()  # NO-OP: cancels neither response
    # Both responses continue to completion; user hears unwanted audio
Empirical evidence
I ran probes against the OpenAI Realtime API (gpt-4o-realtime-preview-2025-06-03) to characterize cancel semantics:
Cancel WITH response_id while concurrent responses are in flight: cancels the targeted response cleanly. Other in-flight responses continue.
Cancel WITHOUT response_id while concurrent responses are in flight: NO-OP. Neither response cancelled.
Cancel WITHOUT response_id when only one response is in flight: cancels that response (current single-slot behavior).
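The three observations above can be captured in a small toy model, which may be handy as a fake server in unit tests. This is only a sketch of the observed behavior, not an OpenAI API: the class name FakeRealtimeServer and its methods are hypothetical, and the zero-in-flight case is assumed to be a no-op as well.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class FakeRealtimeServer:
    """Toy model of the cancel semantics observed in the probes (hypothetical)."""

    in_flight: Set[str] = field(default_factory=set)

    def create(self, response_id: str) -> None:
        """Register a response as in flight."""
        self.in_flight.add(response_id)

    def cancel(self, response_id: Optional[str] = None) -> Set[str]:
        """Apply a response.cancel and return the IDs that were cancelled."""
        if response_id is not None:
            # Targeted cancel: only the named response stops.
            cancelled = {response_id} & self.in_flight
        elif len(self.in_flight) == 1:
            # Untargeted cancel with exactly one response in flight: unambiguous.
            cancelled = set(self.in_flight)
        else:
            # Untargeted cancel with several (or, by assumption, zero) in flight: no-op.
            cancelled = set()
        self.in_flight -= cancelled
        return cancelled
```

With two responses registered, cancel() returns an empty set and leaves both in flight, while cancel("a") removes only "a".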
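The relevant source code is at livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py lines 1542-1545:

    def interrupt(self) -> None:
        self.send_event(
            ResponseCancelEvent(type="response.cancel")  # no response_id
        )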
Proposed fix
interrupt() should iterate the in-flight response IDs and emit one ResponseCancelEvent per in-flight response, with each event carrying its specific response_id:
    def interrupt(self) -> None:
        """Cancel all in-flight responses on this session."""
        # Today, _current_generation is a single slot; iterate accordingly.
        # If the slot model is later refactored to a dict, iterate dict keys.
        if self._current_generation is not None:
            response_id = self._current_generation.response_id  # if available
            if response_id:
                self.send_event(
                    ResponseCancelEvent(
                        type="response.cancel",
                        response_id=response_id,
                    )
                )
            else:
                # Fallback to no-response_id cancel for backward compat
                self.send_event(ResponseCancelEvent(type="response.cancel"))
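Under a future dict-based refactor, the same idea would extend to one targeted cancel per tracked response. The following is a minimal sketch under stated assumptions, not the plugin's code: MultiResponseSession, the _generations dict keyed by response_id, and the CancelEvent stand-in for ResponseCancelEvent are all hypothetical, introduced so the example runs without the SDK.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CancelEvent:
    """Stand-in for ResponseCancelEvent so the sketch runs without the SDK."""

    type: str
    response_id: Optional[str] = None


class MultiResponseSession:
    """Hypothetical session that tracks several in-flight responses."""

    def __init__(self) -> None:
        # Hypothetical replacement for the single-slot _current_generation,
        # keyed by response_id.
        self._generations: Dict[str, object] = {}
        # Records the events that would be sent over the wire.
        self.sent: List[CancelEvent] = []

    def send_event(self, event: CancelEvent) -> None:
        self.sent.append(event)

    def interrupt(self) -> None:
        """Emit one targeted cancel per in-flight response."""
        # Snapshot the keys so a concurrent change to the dict cannot
        # disturb the iteration.
        for response_id in list(self._generations):
            self.send_event(
                CancelEvent(type="response.cancel", response_id=response_id)
            )
```

Because every event names its target, the result does not depend on how the server treats an untargeted cancel when several responses are in flight.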
Why file this now even though the bug is latent
This issue is filed proactively because:
The fix is small (~10 LOC) and orthogonal to other concerns; it can ship either in the dict refactor PR or as a tiny standalone PR.
Documenting the empirical cancel semantics here helps future contributors who find the source-code race comment at realtime_model.py:1870 (which documents an adjacent race: response.done without a prior response.created) and wonder about cancel semantics more broadly.
Acceptance criteria
RealtimeSession.interrupt() cancels all in-flight responses correctly under both single-slot AND any future multi-instance state model.
A test verifies the cancel-with-response_id path explicitly.
Substrate behavior documented in source-code comment near the implementation.
Related
Source-code race comment at livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py:1870 (adjacent race, different mechanism)
cc @longcw (author of this file)