Live captioning - incremental cues review #320
To address REQ2 of #318, we are after an extension of the WebVTT file format.
The principal idea is to map the TextTrack API calls from #319 to how we would archive them in a WebVTT file, so that the same functionality can be replicated from the file.
The approach we try out here is to use the 00:00:00 cue timestamps as the means to split cues into smaller incremental cues. <now()> stands for a cue timestamp carrying the now() time.
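To make the idea concrete, here is a minimal Python sketch of how a cue could be assembled from increments separated by inline cue timestamps. The helper names (`format_timestamp`, `build_incremental_cue`) are hypothetical and only for illustration; the sketch assumes the standard WebVTT `<hh:mm:ss.ttt>` inline timestamp syntax and is not a proposal for the final file format.

```python
from typing import List, Tuple


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as a WebVTT hh:mm:ss.ttt timestamp."""
    ms = round(seconds * 1000)
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def build_incremental_cue(
    start: float,
    end: float,
    first_text: str,
    later: List[Tuple[float, str]],
) -> str:
    """Build one cue whose text is revealed in steps.

    first_text is shown from the cue start; each (time, text) pair in
    `later` is prefixed with an inline cue timestamp marking when that
    increment arrived (the <now()> of the increment).
    """
    payload = first_text
    for time, text in later:
        payload += f"<{format_timestamp(time)}>{text}"
    return f"{format_timestamp(start)} --> {format_timestamp(end)}\n{payload}"


cue = build_incremental_cue(1.0, 5.0, "Hello", [(2.5, " world")])
print(cue)
```

Each later increment only appends text here; changing settings or cutting the cue short mid-way is not covered by this sketch.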
Looks like what we need is a way to change cue settings and cut cue length half-way through cues, as well as an undefined end time that can be set at a later stage.
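As a rough illustration of those two needs, the sketch below models a cue whose end time is unknown until set later, and a split operation that cuts a cue at a given time and continues it with new settings. `LiveCue` and `split_cue` are hypothetical names invented for this example, not part of WebVTT or the TextTrack API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LiveCue:
    start: float
    text: str
    settings: str = ""
    end: Optional[float] = None  # None means "end time not known yet"

    def close(self, at: float) -> None:
        """Set the end time once it becomes known."""
        if at < self.start:
            raise ValueError("end time must not precede start time")
        self.end = at


def split_cue(cue: LiveCue, at: float, new_settings: str) -> LiveCue:
    """Cut `cue` short at `at` and return a follow-on cue with new settings.

    The follow-on cue keeps the text and inherits the original end time,
    which may still be undefined.
    """
    if at <= cue.start or (cue.end is not None and at >= cue.end):
        raise ValueError("split point must lie strictly inside the cue")
    tail = LiveCue(start=at, text=cue.text, settings=new_settings, end=cue.end)
    cue.end = at
    return tail


cue = LiveCue(start=0.0, text="Hello world")
tail = split_cue(cue, 2.0, "align:start")
tail.close(4.0)
```

How an open end time and a mid-cue settings change would actually be serialized in the file is exactly the open question; the sketch only pins down the behaviour we would want to replicate.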
Discussion at FOMS: distinguish between two use cases:
1/ live broadcasting
This has near-realtime requirements and focuses on the use of WebVTT with MSE/HLS.
2/ real-time video/audio communication (also called Realtime Captioning RTC)
This has realtime requirements with an ability to support the "editing"-type functionality of 608/708.
I think we may need VTT file-level support for the concept "I am updating this cue" so that clients can work out what's new/changed, and so on.
Another case is where a cue has to be sent immediately but can then be edited and fixed.
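One way a client might work out what is new or changed is to key cues by identifier and compare incoming text with what it already holds. The sketch below assumes that an update reuses the identifier of the cue it replaces; that is an assumption for illustration only, and `apply_cue_update` is a hypothetical helper, not an existing API.

```python
from typing import Dict


def apply_cue_update(store: Dict[str, str], cue_id: str, text: str) -> str:
    """Record an incoming cue and report how it relates to known cues.

    Returns "added" for a cue id not seen before, "changed" when the
    text for a known id differs, and "unchanged" otherwise.
    """
    if cue_id not in store:
        store[cue_id] = text
        return "added"
    if store[cue_id] == text:
        return "unchanged"
    store[cue_id] = text
    return "changed"


cues: Dict[str, str] = {}
apply_cue_update(cues, "c1", "Helo world")   # sent immediately, with a typo
apply_cue_update(cues, "c1", "Hello world")  # later correction of the same cue
```

This covers the "send now, fix later" case: the first call adds the cue and the second replaces its text in place rather than creating a second cue.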
I would very much like to understand best current practices in captioning of video telephony and conferences (if any).