`WebvttParser` creates duplicate `CuesWithTiming` when handling cues sharing same start/end timestamps #1177

JunkFood02 · 2024-03-12T05:15:23Z

Version

Media3 1.3.0

More version details

media3-extractor

Devices that reproduce the issue

Android Studio

Devices that do not reproduce the issue

No response

Reproducible in the demo app?

Not tested

Reproduction steps

Use `WebvttParser.parse()` to parse the WebVTT captions below:

private const val vttSample = """WEBVTT
Kind: captions
Language: en

00:00:00.000 --> 00:00:01.712
[MUSIC PLAYING]

00:00:02.533 --> 00:00:03.950
SPEAKER: As an
Android engineer, I

00:00:03.950 --> 00:00:06.350
have seen firsthand how
the ecosystem is growing--
"""

val cuesWithTimingList = buildList {
    WebvttParser().parse(
        vttSample.toByteArray(),
        SubtitleParser.OutputOptions.allCues()
    ) {
        add(it)
    }
}

Expected result

`cuesWithTimingList` contains 3 items, one per cue in the file.

Actual result

The list contains 4 items (see screenshot).

Media

WEBVTT
Kind: captions
Language: en

00:00:00.000 --> 00:00:01.712
[MUSIC PLAYING]

00:00:02.533 --> 00:00:03.950
SPEAKER: As an
Android engineer, I

00:00:03.950 --> 00:00:06.350
have seen firsthand how
the ecosystem is growing--

Bug Report

You will email the zip file produced by adb bugreport to android-media-github@google.com after filing this issue.


icbaker · 2024-03-13T10:07:14Z

The duplication is expected for cues that actually overlap; that's checked in a test case here:

media/libraries/extractor/src/test/java/androidx/media3/extractor/text/webvtt/WebvttParserTest.java

Line 338 in 12aa637

public void parseWithOverlappingTimestamps() throws Exception {

This duplication is currently required due to the way we implement WebVTT's simultaneous cues handling (by creating synthetic CuesWithTiming objects representing the overlapping state) - which is mostly done inside WebvttSubtitle:

media/libraries/extractor/src/main/java/androidx/media3/extractor/text/webvtt/WebvttSubtitle.java

Line 81 in 12aa637

// Steps 4 - 10 of https://www.w3.org/TR/webvtt1/#cue-computed-line

If we didn't create these synthetic cues here, we would have lost too much information to correctly implement the layout rules later in the playback pipeline. It's still not perfect (see e.g. google/ExoPlayer#10980), and it would be better to implement this layout logic later, but it's not likely something we're going to change soon.

However, this duplication is not expected for consecutive cues where one cue ends at the same time the next cue starts. I can reproduce the duplication in this case, and I'll look into fixing it.
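To illustrate why overlapping cues are expected to produce extra entries, here is a minimal sketch of the idea. It is a simplified model, not the actual `WebvttSubtitle` code: `SimpleCue`, `Segment` and `splitIntoSegments` are hypothetical names introduced only for this example. The timeline is cut at every cue boundary, and each slice lists the cues active in it, so a cue that overlaps another shows up in more than one slice.

```kotlin
// Hypothetical, simplified model of "synthetic" entries for overlapping cues.
// None of these types exist in Media3; they are assumptions for illustration.
data class SimpleCue(val startUs: Long, val endUs: Long, val text: String)

data class Segment(val startUs: Long, val endUs: Long, val texts: List<String>)

/** Cuts the timeline at every distinct cue boundary and lists the cues active in each slice. */
fun splitIntoSegments(cues: List<SimpleCue>): List<Segment> {
    val boundaries = cues.flatMap { listOf(it.startUs, it.endUs) }.distinct().sorted()
    return boundaries.zipWithNext().mapNotNull { (start, end) ->
        // A cue is active in a slice if it has started and has not yet ended.
        val active = cues.filter { it.startUs <= start && start < it.endUs }.map { it.text }
        if (active.isEmpty()) null else Segment(start, end, active)
    }
}

fun main() {
    val overlapping = listOf(
        SimpleCue(0L, 4_000_000L, "A"),
        SimpleCue(2_000_000L, 6_000_000L, "B"),
    )
    // Two input cues become three slices: A alone, A and B together, then B alone.
    splitIntoSegments(overlapping).forEach(::println)
}
```

In this model the two overlapping cues yield three segments, which is the kind of duplication the test case above treats as expected. Cues that merely touch at a boundary share no slice, so they should not be duplicated.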

JunkFood02 · 2024-03-13T11:03:03Z

Thanks for the quick and detailed response! I've edited the title, since I didn't intend to work with subtitles that have overlapping timestamps.

It's a bit arguable whether the `Subtitle` implementation supports zero-duration events, since `getEventTimeCount` is documented as effectively "the number of times the cues returned by `getCues(long)` changes", and zero-duration events violate that. However, the current `WebvttSubtitle` impl **does** produce zero-duration events, so it seems safer to handle them gracefully here and then, as a possible follow-up, fix the `WebvttSubtitle` impl (or remove it completely). Issue: #1177 #minor-release PiperOrigin-RevId: 616095798
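The zero-duration event described in the commit message above can be sketched with a small self-contained model. This is an assumption-laden illustration, not the Media3 implementation: `TimedText`, `Emitted`, `emitEvents` and `sampleCues` are hypothetical names. The model keeps one event time per cue start and end (duplicates included), so a boundary shared by two consecutive cues appears twice and produces an entry whose duration is zero unless it is skipped.

```kotlin
// Hypothetical model of event-time iteration; not the actual Media3 code.
data class TimedText(val startUs: Long, val endUs: Long, val text: String)

data class Emitted(val startUs: Long, val durationUs: Long, val texts: List<String>)

// The three cues from the report, with times converted to microseconds.
val sampleCues = listOf(
    TimedText(0L, 1_712_000L, "[MUSIC PLAYING]"),
    TimedText(2_533_000L, 3_950_000L, "SPEAKER: As an Android engineer, I"),
    TimedText(3_950_000L, 6_350_000L, "have seen firsthand how the ecosystem is growing--"),
)

fun emitEvents(cues: List<TimedText>, skipZeroDuration: Boolean): List<Emitted> {
    // Duplicates are kept on purpose: 3_950_000 is both an end and a start time.
    val eventTimes = cues.flatMap { listOf(it.startUs, it.endUs) }.sorted()
    val out = mutableListOf<Emitted>()
    for (i in 0 until eventTimes.size - 1) {
        val start = eventTimes[i]
        val duration = eventTimes[i + 1] - start
        val active = cues.filter { it.startUs <= start && start < it.endUs }.map { it.text }
        if (active.isEmpty()) continue // gap between cues, nothing to show
        if (skipZeroDuration && duration == 0L) continue // the graceful handling
        out += Emitted(start, duration, active)
    }
    return out
}

fun main() {
    println(emitEvents(sampleCues, skipZeroDuration = false).size)
    println(emitEvents(sampleCues, skipZeroDuration = true).size)
}
```

With the sample from this report, the model emits four entries when zero-duration events are kept and three when they are skipped, matching the actual and expected results above.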


szaboa · 2024-04-02T13:32:20Z

@icbaker As I understand it, this was just preparation for handling overlapping VTT cues?

> If we didn't create these synthetic cues here, we would have lost too much information to correctly implement the layout rules later in the playback pipeline. It's still not perfect (see e.g. google/ExoPlayer#10980), and it would be better to implement this layout logic later, but it's not likely something we're going to change soon.

Do you have suggestions on the "layout logic"?

icbaker · 2024-04-04T08:15:53Z

@szaboa I'm not sure I completely understand the question - do you mind filing a new question issue with a bit more detail? (I'm likely to lose track of this closed one)

It's a bit arguable whether the `Subtitle` implementation supports zero-duration events, since `getEventTimeCount` is documented as effectively "the number of times the cues returned by `getCues(long)` changes", and zero-duration events violate that. However, the current `WebvttSubtitle` impl **does** produce zero-duration events, so it seems safer to handle them gracefully here and then, as a possible follow-up, fix the `WebvttSubtitle` impl (or remove it completely). Issue: #1177 PiperOrigin-RevId: 616095798 (cherry picked from commit e9ed874)


JunkFood02 added bug needs triage labels Mar 12, 2024

icbaker self-assigned this Mar 12, 2024

JunkFood02 changed the title ~~WebvttParser generates duplicated CuesWithTiming when handling overlapped timestamps~~ WebvttParser creates duplicate CuesWithTiming when handling cues sharing same start/end timestamps Mar 13, 2024

icbaker closed this as completed Mar 15, 2024

icbaker removed the needs triage label Apr 4, 2024

szaboa mentioned this issue Apr 4, 2024

WebVTT multi-line subtitles overlapping google/ExoPlayer#10980

Open

1 task
