Summary
The Soniox STT plugin doesn't emit PREFLIGHT_TRANSCRIPT events, which means AgentSession's preemptive LLM generation can't fire when using Soniox for STT-driven end-of-turn detection. The Deepgram v2 plugin already supports this via EagerEndOfTurn, so the pattern is established in the repo.
Context
We're using Soniox's latest model with STT-driven endpointing, and the end-of-turn detection itself is fast and accurate. But because the plugin doesn't emit preflight events, the LLM can only start generating after the final transcript lands, so we pay the full LLM time to first token (TTFT) in wall-clock time after every turn. That added latency is too high for real-time voice use cases.
With Deepgram's v2 plugin, preemptive generation "just works" — the LLM starts speculatively on EagerEndOfTurn and is adopted on final transcript match. We'd love the same experience with Soniox.
Request
Add PREFLIGHT_TRANSCRIPT emission to the Soniox plugin, following the same pattern as Deepgram v2. A natural trigger would be on stable interim transcripts (or a Soniox-native signal if one exists that maps to the same idea).
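To make the "stable interim" idea concrete, here is a rough sketch of a time-based stability check. It is plain Python and does not touch the plugin or the Soniox API. `InterimStabilityTracker`, `stable_for`, and the 0.3 s default are hypothetical names and values chosen for illustration, not anything that exists in the repo today.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InterimStabilityTracker:
    """Hypothetical helper: reports an interim transcript as stable once its
    text has stayed unchanged for `stable_for` seconds."""

    stable_for: float = 0.3  # assumed threshold; would need tuning
    _text: str = ""
    _since: float = 0.0
    _emitted: bool = False

    def update(self, text: str, now: float) -> Optional[str]:
        """Feed the latest interim text; returns it once when it becomes stable."""
        text = text.strip()
        if text != self._text:
            # Text changed: restart the stability window.
            self._text, self._since, self._emitted = text, now, False
            return None
        if text and not self._emitted and now - self._since >= self.stable_for:
            self._emitted = True
            # This is where the plugin would emit PREFLIGHT_TRANSCRIPT.
            return text
        return None

    def reset(self) -> None:
        """Call after the final transcript so the next turn starts clean."""
        self._text, self._since, self._emitted = "", 0.0, False
```

The tracker only fires when `update` is called, so a real implementation would likely also need a timer for the case where interims stop arriving. A Soniox-native signal, if one exists, would avoid that and is probably the better trigger.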
Questions
- Is this something the LiveKit team is planning to add, or would you accept a community PR?
- Any preference on how stability should be determined — time-based on interims, or tied to a specific Soniox token/signal?
Happy to help however is useful — feedback, testing, or contributing the PR itself.
Thanks!