Small follow-up to #5672.
In livekit-plugins-soniox==1.5.14, Soniox FINAL_TRANSCRIPT / PREFLIGHT_TRANSCRIPT events include SpeechData.end_time, but the following END_OF_SPEECH event is emitted without alternatives.
In AudioRecognition._on_stt_event, stt_last_speaking_time falls back to time.time() when the current STT event has no alternatives[0].end_time. In no-VAD / STT-driven user-state mode, the Soniox END_OF_SPEECH can therefore overwrite the previous timestamped anchor with event arrival time.
Effect: end_of_turn_delay can still collapse to roughly the post-EOS endpointing delay for Soniox no-VAD sessions, even though earlier transcript events had valid STT timestamps.
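To make the overwrite concrete, here is a toy model of the fallback described above. It is a sketch, not the actual livekit-agents code: `SpeechData`, `SpeechEvent`, and `resolve_last_speaking_time` are simplified stand-ins written for this report, and timestamps are treated as plain wall-clock floats.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpeechData:
    # Hypothetical stand-in: only the field relevant to this report.
    end_time: float = 0.0


@dataclass
class SpeechEvent:
    # Hypothetical stand-in for an STT event with optional alternatives.
    type: str
    alternatives: List[SpeechData] = field(default_factory=list)


def resolve_last_speaking_time(ev: SpeechEvent, now: float) -> float:
    """Use the STT timestamp if the event carries one, else fall back to `now`."""
    if ev.alternatives and ev.alternatives[0].end_time > 0:
        return ev.alternatives[0].end_time
    return now  # arrival time; this is the fallback that EOS ends up hitting


# FINAL_TRANSCRIPT carries a timestamp, so the anchor is the STT end_time.
final = SpeechEvent("FINAL_TRANSCRIPT", [SpeechData(end_time=100.0)])
anchor_after_final = resolve_last_speaking_time(final, now=100.4)

# END_OF_SPEECH arrives without alternatives, so the anchor becomes arrival time.
eos = SpeechEvent("END_OF_SPEECH")
anchor_after_eos = resolve_last_speaking_time(eos, now=100.9)
```

In this model the anchor moves from the STT timestamp to the EOS arrival time, which is the shift that shortens the measured delay.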
Possible fixes:
- include a timestamped alternative on Soniox END_OF_SPEECH, or
- avoid overwriting _last_speaking_time on EOS when the EOS event has no timestamp but a previous STT timestamp anchor exists.
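The second option could look roughly like the guard below. This is only an illustration of the idea: `next_speaking_anchor` and its parameters are hypothetical names, not existing livekit-agents APIs, and it assumes the previous anchor is cleared at the start of each user turn so a stale value is never reused.

```python
from typing import Optional


def next_speaking_anchor(
    event_end_time: Optional[float],
    previous_anchor: Optional[float],
    now: float,
) -> float:
    """Pick the speaking-time anchor for an incoming STT event.

    Preference order:
      1. the timestamp on the current event, if it has one;
      2. the previous STT-timestamped anchor, if one exists;
      3. arrival time, as a last resort.
    """
    if event_end_time is not None and event_end_time > 0:
        return event_end_time
    if previous_anchor is not None:
        # Untimestamped EOS: keep the earlier anchor instead of overwriting it.
        return previous_anchor
    return now


# EOS without a timestamp keeps the anchor set by the earlier transcript event.
kept = next_speaking_anchor(event_end_time=None, previous_anchor=100.0, now=100.9)

# With no earlier anchor, behavior matches the current arrival-time fallback.
fallback = next_speaking_anchor(event_end_time=None, previous_anchor=None, now=100.9)
```

With this ordering, sessions that never receive STT timestamps behave as they do today, and only the untimestamped-EOS case changes.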
We kept a local correction for now, but wanted to report the remaining edge case.