feat: Allow VTT files with erroneous linebreaks #2394

Merged
joeyparrish merged 6 commits into shaka-project:main on Feb 9, 2023

Conversation

bloomtom
Contributor

@bloomtom bloomtom commented Feb 13, 2020

Bad linebreaks will now cause cues to be skipped (with a warning), rather than throwing an error.

Closes #2358

The VTT spec does not allow more than one consecutive linebreak within the same cue. It's desirable, however, to still allow most of the VTT file to work even if the file is technically out of spec. This commit allows such files by removing the exception thrown in the case of bad time codes and returning null instead, which skips the erroneous cue. The text_parser also needed to be patched, as it previously threw a perhaps unexpected exception when data_.length was accessed on a null or undefined data object.
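
The behavior described in this commit message can be illustrated with a minimal sketch. This is not the code from the PR: `parseCueBlock`, `parseVtt`, and the `TIME_RANGE` pattern are hypothetical names invented for illustration, and the sketch makes no attempt to handle NOTE, STYLE, or REGION blocks (it would warn about them as if they were malformed cues). It only shows the idea of returning null for a bad time range and skipping that cue with a warning instead of throwing.

```javascript
// Hypothetical sketch; none of these names are Shaka Player's real API.
const TIME_RANGE =
    /^(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})/;

function toSeconds(h, m, s, ms) {
  return Number(h || 0) * 3600 + Number(m) * 60 + Number(s) + Number(ms) / 1000;
}

// Returns a cue object, or null when the block has no valid time range.
function parseCueBlock(lines) {
  // An optional cue identifier may precede the timing line.
  const timingIndex = lines[0].includes('-->') ? 0 : 1;
  const match = TIME_RANGE.exec(lines[timingIndex] || '');
  if (!match) {
    return null;  // A strict parser would throw here and abort the whole file.
  }
  return {
    startTime: toSeconds(match[1], match[2], match[3], match[4]),
    endTime: toSeconds(match[5], match[6], match[7], match[8]),
    payload: lines.slice(timingIndex + 1).join('\n'),
  };
}

// Parses every block it can and reports the rest through `warn`.
function parseVtt(text, warn = console.warn) {
  const blocks = text.replace(/\r\n?/g, '\n').split(/\n{2,}/);
  const cues = [];
  for (const block of blocks.slice(1)) {  // blocks[0] is the WEBVTT header
    const lines = block.split('\n').filter((line) => line.length > 0);
    if (lines.length === 0) {
      continue;
    }
    const cue = parseCueBlock(lines);
    if (cue) {
      cues.push(cue);
    } else {
      warn('Skipping malformed WebVTT cue:', lines[0]);
    }
  }
  return cues;
}
```

With this shape, one bad block costs the viewer a single cue rather than the entire caption track, and the caller still learns about the problem through the warning callback.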
],
'WEBVTT\n\n' +
'00:00:20.000 --> 00:00:40.000\n' +
'\nTest\n\nExtra line\n\n' +
Member

Did you really want this text ignored instead of represented in the output?

Contributor Author

I'd love to display it, but after a lot of noodling I can't see how to do so without using cowboy hijinks counter to other implementations I've seen. For instance, dash.js gives:
[9751][VTTParser] Skipping cue due to empty/malformed cue text

The current parser has good adherence to the spec in the implemented areas. I'd like to change that as little as possible while also giving a best effort towards displaying the good parts of a VTT file.
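
To make the trade-off in this exchange concrete, the sketch below splits the visible portion of the test input quoted in this thread on blank lines and labels each resulting block. `classifyBlock` is a hypothetical helper written for this illustration, not part of any parser. It assumes, as the WebVTT format does, that a blank line ends a block.

```javascript
// The visible portion of the test input quoted in the review thread.
const input =
    'WEBVTT\n\n' +
    '00:00:20.000 --> 00:00:40.000\n' +
    '\nTest\n\nExtra line\n\n';

// Hypothetical helper: describe what a blank-line-delimited block contains.
function classifyBlock(block) {
  const lines = block.split('\n');
  if (lines[0].startsWith('WEBVTT')) {
    return 'header';
  }
  const timingIndex = lines.findIndex((line) => line.includes('-->'));
  if (timingIndex === -1) {
    return 'no timing line';
  }
  if (timingIndex === lines.length - 1) {
    return 'timing line without text';
  }
  return 'cue';
}

const blocks = input.split(/\n{2,}/).filter((block) => block.length > 0);
const kinds = blocks.map(classifyBlock);
// kinds holds: 'header', 'timing line without text',
// 'no timing line', 'no timing line'
console.log(kinds);
```

None of the four blocks is a complete cue: the timing line has no text attached, and neither "Test" nor "Extra line" has a timing line. Displaying that text would require guessing which timing it belongs to, which is the kind of out-of-spec recovery the comment above is reluctant to add.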

Extra newlines before the cue body and within the cue body are both VTT errors which can be ignored. Previously an exception was thrown here, which caused the entire VTT stream to fail parsing. Some notification of failure is still warranted, so we issue a warning. This is in line with the docstring for warnings: "[If] we work around unusual or bad content".
@avelad
Collaborator

avelad commented May 23, 2022

@bloomtom can you rebase the PR? Thanks!

@avelad avelad added type: enhancement New feature or request status: waiting on response Waiting on a response from the reporter(s) of the issue component: WebVTT The issue involves WebVTT subtitles specifically labels May 23, 2022
@avelad
Collaborator

avelad commented May 30, 2022

Closing due to inactivity. If you need to reopen this issue, just put @shaka-bot reopen in a comment. Thanks!

@avelad avelad closed this May 30, 2022
@bloomtom
Contributor Author

Going to work on this again.
@shaka-bot reopen

@joeyparrish joeyparrish reopened this Jan 26, 2023
@joeyparrish
Member

I'm reopening this PR as requested, but I am still not convinced of the value of it. I don't think errors should be ignored or malformed inputs casually skipped.

But if you still want to make a case for it, I'm listening.

@bloomtom bloomtom changed the title Allows VTT files with erroneous linebreaks fix: Allows VTT files with erroneous linebreaks Jan 27, 2023
@bloomtom
Contributor Author

My case is that a single error within a VTT file shouldn't prevent the user from seeing any of the caption data. It doesn't help anyone to do so when you can indicate a partial failure with a warning and move on.

If you try to play bad VTT on the current day demo page, you still get the lovely "this.data_ is undefined" error, which is the worst of both worlds.
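
The "this.data_ is undefined" failure mentioned here is the kind of error a small guard can prevent. The following is a sketch under assumptions, not the patched text_parser from this PR: `SimpleTextParser`, `atEnd`, and `readLine` are hypothetical names, and only the `data_` field name is borrowed from the error message. It shows one way a line reader can treat null or undefined input as empty instead of failing when `data_.length` is read.

```javascript
// Hypothetical line reader; not the real lib/util/text_parser.js.
class SimpleTextParser {
  constructor(data) {
    // Treat null or undefined as empty input so later reads of
    // this.data_.length cannot throw.
    this.data_ = data == null ? '' : data;
    this.position_ = 0;
  }

  // True once every character has been consumed.
  atEnd() {
    return this.position_ >= this.data_.length;
  }

  // Returns the next line without its terminator, or null at end of input.
  readLine() {
    if (this.atEnd()) {
      return null;
    }
    const newline = this.data_.indexOf('\n', this.position_);
    const stop = newline === -1 ? this.data_.length : newline;
    const line = this.data_.slice(this.position_, stop);
    this.position_ = stop + 1;
    return line;
  }
}
```

With the guard in place, `new SimpleTextParser(undefined).readLine()` returns null rather than raising a TypeError, so a caller can report a meaningful warning of its own.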

@avelad avelad added type: bug Something isn't working correctly type: enhancement New feature or request and removed type: enhancement New feature or request type: bug Something isn't working correctly labels Jan 27, 2023
@avelad avelad changed the title fix: Allows VTT files with erroneous linebreaks feat: Allows VTT files with erroneous linebreaks Jan 27, 2023
@avelad avelad added priority: P3 Useful but not urgent and removed status: waiting on response Waiting on a response from the reporter(s) of the issue labels Jan 27, 2023
@github-actions
Contributor

Incremental code coverage: 100.00%

@avelad
Collaborator

avelad commented Jan 31, 2023

@joeyparrish can you review it? Thanks!

avelad
avelad previously approved these changes Feb 3, 2023
shaka.util.Error.Category.TEXT,
shaka.util.Error.Code.INVALID_TEXT_CUE,
'Could not parse cue time range in WebVTT');
shaka.log.warning(
Member

Please make this shaka.log.alwaysWarn, so the warning is kept in production builds.

Collaborator

Done!
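
The distinction requested in this review thread, a warning that survives production builds versus one that does not, can be sketched with a toy logger. This is not the implementation of shaka.log: `createLogger` and its `isDebugBuild` parameter are hypothetical, and the only detail taken from the thread is that `alwaysWarn` keeps the warning in production builds.

```javascript
// Hypothetical logger factory; not Shaka Player's shaka.log implementation.
function createLogger(isDebugBuild, sink = console.warn) {
  return {
    // Dropped entirely unless this is a debug build.
    warning: isDebugBuild ? (...args) => sink(...args) : () => {},
    // Always forwarded to the sink, regardless of build type.
    alwaysWarn: (...args) => sink(...args),
  };
}
```

In this model, a skipped cue reported through `warning` would be invisible to anyone running a production build, which is why the reviewer asks for the always-on variant.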

@joeyparrish joeyparrish changed the title feat: Allows VTT files with erroneous linebreaks feat: Allow VTT files with erroneous linebreaks Feb 9, 2023
@joeyparrish joeyparrish merged commit 9b1c614 into shaka-project:main Feb 9, 2023
@github-actions github-actions bot added the status: archived Archived and locked; will not be updated label Jul 25, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 25, 2023
Labels
component: WebVTT The issue involves WebVTT subtitles specifically priority: P3 Useful but not urgent status: archived Archived and locked; will not be updated type: enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Be more tolerant of bad VTT
4 participants