Set accurate seekMap duration in SubtitleExtractor after parsing #3106
|
This change makes sense - but I'm curious about this part of your explanation:
Can you give more details about these consumers? If they are handling the data after it's been transcoded to |
|
Hi, @icbaker. The consumer I had in mind is I initially had the same thought about end-of-cue samples, but it turns out

```java
if (cuesForThisStartTime.isEmpty()) {
  // An empty cue list has already been implicitly encoded in the duration
  // of the previous sample.
  return;
}
```

For example, a single WebVTT cue spanning `00:10.000 --> 05:05.000` results in two event times [10s, 305s] inside `WebvttSubtitle`. But Because of this, The existing tests also confirm this behavior. For instance, |
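To make the example above concrete, here is a minimal sketch of why the last sample timestamp can fall well short of the real end of the content. It is a simplified model, not the Media3 code: `CueSampleSketch`, `Sample`, and `toSamples` are hypothetical names, and it assumes cues are sorted and non-overlapping, so each cue becomes exactly one sample whose duration carries the end time.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of cue-to-sample mapping (not the Media3 implementation). */
public class CueSampleSketch {

  /** A sample with a start time and a duration, both in microseconds. */
  public static final class Sample {
    public final long startTimeUs;
    public final long durationUs;

    public Sample(long startTimeUs, long durationUs) {
      this.startTimeUs = startTimeUs;
      this.durationUs = durationUs;
    }
  }

  /**
   * Converts cues to samples. Each cue is {startTimeUs, endTimeUs}.
   * Assumption: cues are sorted by start time and do not overlap.
   */
  public static List<Sample> toSamples(long[][] cues) {
    List<Sample> samples = new ArrayList<>();
    for (long[] cue : cues) {
      // One sample per cue start. The cue end is encoded in the duration,
      // so no separate "empty" sample is emitted at endTimeUs.
      samples.add(new Sample(cue[0], cue[1] - cue[0]));
    }
    return samples;
  }

  /** Returns the largest sample start time, or Long.MIN_VALUE if there are no samples. */
  public static long lastSampleStartUs(List<Sample> samples) {
    long last = Long.MIN_VALUE;
    for (Sample sample : samples) {
      last = Math.max(last, sample.startTimeUs);
    }
    return last;
  }

  public static void main(String[] args) {
    // A single cue spanning 00:10.000 --> 05:05.000.
    long[][] cues = {{10_000_000L, 305_000_000L}};
    List<Sample> samples = toSamples(cues);
    System.out.println("samples: " + samples.size());
    System.out.println("last sample start (us): " + lastSampleStartUs(samples));
    System.out.println("cue end (us): " + cues[0][1]);
  }
}
```

In this model, a duration estimated from sample start times alone would be 10s, while the cue actually ends at 305s.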
|
Ah right, yes I'd misremembered how things work - thanks for clarifying. |
After parsing all subtitle cues, the maximum `endTimeUs` across all cues is known. Update the seekMap's `durationUs` with this value instead of leaving it as `C.TIME_UNSET`. This follows the same pattern used by `Mp3Extractor`, which also re-emits the seekMap with an exact duration after reading all samples.
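As a rough illustration of the duration calculation described in this commit message, the sketch below takes the maximum end time over all cues. `DurationSketch` and `maxEndTimeUs` are hypothetical names, the local `TIME_UNSET` constant is only a stand-in for `C.TIME_UNSET`, and returning it for an empty cue list is an assumption rather than something taken from the actual change.

```java
/** Hypothetical sketch of deriving a duration from parsed cues (not the Media3 code). */
public class DurationSketch {

  /** Stand-in for C.TIME_UNSET; defined locally so the sketch is self-contained. */
  public static final long TIME_UNSET = Long.MIN_VALUE + 1;

  /**
   * Returns the maximum end time across all cues, or TIME_UNSET if there are none.
   * Each cue is {startTimeUs, endTimeUs}. Cues may be in any order and may overlap.
   */
  public static long maxEndTimeUs(long[][] cues) {
    long max = TIME_UNSET;
    for (long[] cue : cues) {
      long endTimeUs = cue[1];
      if (max == TIME_UNSET || endTimeUs > max) {
        max = endTimeUs;
      }
    }
    return max;
  }

  public static void main(String[] args) {
    long[][] cues = {
      {0L, 3_600_000_000L}, // 00:00.000 --> 01:00:00.000
      {5_000_000L, 10_000_000L} // a short cue that starts later but ends earlier
    };
    System.out.println("durationUs = " + maxEndTimeUs(cues));
  }
}
```

Taking the maximum rather than the end time of the last cue matters when a long cue starts early and a shorter cue starts after it.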
f5b3b1a to 58cbdc4
|
I'm going to send this for internal review now. You may see some more commits being added as I make changes in response to review feedback. Please refrain from pushing any more substantive changes as it will complicate the internal review - thanks! |
`SubtitleExtractor` creates its `IndexSeekMap` with `durationUs = C.TIME_UNSET` in `init()`, but never updates it after parsing, even though the exact duration is knowable from the parsed cues' `endTimeUs` values.

This causes problems for WebVTT files with long-duration cues (e.g. a cue spanning `00:00.000 --> 01:00:00.000`). Since the seekMap reports `TIME_UNSET` as its duration, consumers may fall back to estimating duration from sample timestamps (`startTimeUs`), which can be significantly shorter than the actual content end time. This can result in subtitle samples not being loaded after seeking to a position beyond the estimated duration.

This change updates the seekMap's `durationUs` to the maximum `endTimeUs` across all parsed cues, following the same pattern used by `Mp3Extractor` (which calls `IndexSeekMap.setDurationUs()` and re-emits the seekMap after reading all samples).

Also added tests to verify this behavior. Happy to adjust if anything needs to be changed :)
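For readers unfamiliar with the re-emit pattern mentioned above, here is a minimal sketch of its shape: emit a seek map with an unset duration up front, then update the duration and emit the same seek map again once parsing is done. All types here (`SimpleSeekMap`, `RecordingOutput`, `extract`) are hypothetical stand-ins, not the Media3 classes, and skipping the second emit when no cues were parsed is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the "emit, parse, update duration, re-emit" pattern. */
public class SeekMapReemitSketch {

  /** Stand-in for C.TIME_UNSET; defined locally so the sketch is self-contained. */
  public static final long TIME_UNSET = Long.MIN_VALUE + 1;

  /** Hypothetical seek map with a mutable duration. */
  public static final class SimpleSeekMap {
    private long durationUs;

    public SimpleSeekMap(long durationUs) {
      this.durationUs = durationUs;
    }

    public long getDurationUs() {
      return durationUs;
    }

    public void setDurationUs(long durationUs) {
      this.durationUs = durationUs;
    }
  }

  /** Hypothetical output that records the duration seen on each seek map emission. */
  public static final class RecordingOutput {
    public final List<Long> emittedDurationsUs = new ArrayList<>();

    public void seekMap(SimpleSeekMap seekMap) {
      emittedDurationsUs.add(seekMap.getDurationUs());
    }
  }

  /** Each cue is {startTimeUs, endTimeUs}. */
  public static void extract(long[][] cues, RecordingOutput output) {
    // Phase 1: the duration is not known before parsing.
    SimpleSeekMap seekMap = new SimpleSeekMap(TIME_UNSET);
    output.seekMap(seekMap);

    // Phase 2: "parse" the cues and track the maximum end time.
    long maxEndTimeUs = TIME_UNSET;
    for (long[] cue : cues) {
      if (maxEndTimeUs == TIME_UNSET || cue[1] > maxEndTimeUs) {
        maxEndTimeUs = cue[1];
      }
    }

    // Phase 3: re-emit with the exact duration (assumption: only if a cue was parsed).
    if (maxEndTimeUs != TIME_UNSET) {
      seekMap.setDurationUs(maxEndTimeUs);
      output.seekMap(seekMap);
    }
  }

  public static void main(String[] args) {
    RecordingOutput output = new RecordingOutput();
    extract(new long[][] {{0L, 3_600_000_000L}}, output); // 00:00.000 --> 01:00:00.000
    System.out.println("emitted durations (us): " + output.emittedDurationsUs);
  }
}
```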