Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regex backtracking issues #1312

Closed
joeyparrish opened this issue Feb 20, 2018 · 7 comments
Closed

Regex backtracking issues #1312

joeyparrish opened this issue Feb 20, 2018 · 7 comments
Assignees
Labels
status: archived Archived and locked; will not be updated type: bug Something isn't working correctly
Milestone

Comments

@joeyparrish
Copy link
Member

@davisjam reported some potential regex backtracking vulnerabilities to us via email. In such a vulnerability, extremely long inputs could cause a regex to block for a very long time while parsing.

We believe there is no significant risk to these particular issues. Four of six of them are in the jsdoc template, and therefore do not affect Shaka Player itself. The other two are in the TTML text parser.

Application developers generally have some control or trust in their content catalogs and are not subject to malicious TTML content. Such content, if encountered, would only cause individual browser tabs to lock up. Shaka Player does not run in nodejs or other such environments, and we do not expect this could be used for any kind of DOS attack.

The affected regex should be refactored to avoid this. @davisjam recommends these tools to assess progress:

Here are the details of the reported vulnerabilities:

Vuln 1:

  • file: "docs/jsdoc-template/static/scripts/prettify/lang-css.js"
  • pattern: "^(-?(?:[_a-z]|\\[\da-f]+ ?)(?:[\w-]|\\\\[\da-f]+ ?))\s:"

Vuln 2:

  • file: "docs/jsdoc-template/static/scripts/prettify/lang-css.js"
  • pattern: "^"(?:[^\\n\\f\\r\"\\\\]|\\(?:\r\n?|\n|\f)|\\[\S\s])*""

Vuln 3:

  • file: "docs/jsdoc-template/static/scripts/prettify/lang-css.js"
  • pattern: "^'(?:[^\\n\\f\\r'\\\\]|\\(?:\r\n?|\n|\f)|\\[\S\s])*'"

Vuln 4:

  • file: "docs/jsdoc-template/static/scripts/prettify/prettify.js"
  • pattern: "^<(?:(?:(?:\.\.\/)|\/?)(?:[\w-]+(?:\/[\w-]+)+)?[\w-]+\.h|[a-z]\w)>"

Vuln 5:

  • file: "lib/text/ttml_text_parser.js"
  • pattern: "^(\d*\.?\d*)t$",

Vuln 6:

  • file: "lib/text/ttml_text_parser.js"
  • pattern: "^(?:(\d*\.?\d*)h)?(?:(\d*\.?\d*)m)?(?:(\d*\.?\d*)s)?(?:(\d*\.?\d*)ms)?$",
@joeyparrish joeyparrish added the type: bug Something isn't working correctly label Feb 20, 2018
@joeyparrish joeyparrish added this to the Backlog milestone Feb 20, 2018
@TheModMaker
Copy link
Contributor

For those unfamiliar with catastrophic backtracking, here is a short explanation.

Consider the following regex: /a*a*b/

So if you have an input string of aaax, it obviously doesn't match. First it tries to match the first a* with the longest string of a's (since it's greedy). The 'x' doesn't match 'a', so it tries the next regex part; the second a* is optional, so it tries the next regex part; the 'x' doesn't match the expected 'b' so it fails to match.

But the * matches any number of something, so what if we chose less of them? So the regex engine backs up and selects one fewer 'a' from the first a* and moves forward. This time, the remaining string is 'ax', so the second a* can match the 'a'. But the 'x' still doesn't match 'b'.

But what if the second a* didn't use all the a's? See the problem? Here is a list of all the matches the regex engine will try on the above string:

a* a* b
aaa x
aa a x
aa ax
a aa x
a a ax
a aax
aaa x
aa ax
a aax
aaax

You see there are 10 attempts to match the given string, which is really just a series of a's. The regex /a*b/ will only make 4 matches as it tries less values for a*. This can become exponentially worse as there are more characters in the input string.

In the TTML parser we use the regex /\d*\.?\d*/ to match numbers. Since the . is optional, it is similar to the above case.

shaka-bot pushed a commit that referenced this issue Feb 21, 2018
Issue #1312

Change-Id: I0aed14068776a800eee35f03d2f878db0dd565b6
joeyparrish pushed a commit that referenced this issue Feb 21, 2018
Issue #1312

Change-Id: I0aed14068776a800eee35f03d2f878db0dd565b6
@joeyparrish
Copy link
Member Author

TTML parser fix cherry-picked for v2.3.3. The remaining issues are in the documentation template, not Shaka Player.

@joeyparrish
Copy link
Member Author

The vulnerable expressions in jsdoc are all from the "prettify" module, which enables syntax highlighting for CSS. https://github.com/jsdoc3/jsdoc/tree/master/templates/default/static/scripts/prettify

That code seems to have been forked in 2012 and never updated. The original code has been updated a few times since then: https://github.com/google/code-prettify/commits/master/src/lang-css.js

I will put together a bug report against jsdoc for this, and a pull request to update, if I can.

@davisjam
Copy link

Sounds good. Link to the bug report and I can also take a peek at a fix.

@davisjam
Copy link

@joeyparrish I have contacted the code-prettify devs by email. I CCd you in case you want to weigh in or point out the prettify fork.

@ismena
Copy link
Contributor

ismena commented Mar 21, 2018

@joeyparrish Is there something else we should with regards to this issue or are we good to close it?

@joeyparrish
Copy link
Member Author

We have fixed the regex issues that affect our library, but there are others in our jsdoc template. We're leaving it open until we fix those, too.

Our template is forked from the jsdoc default template, and the relevant code in jsdoc was forked from prettify (and never updated). If we do fix these issues in the template ourselves before jsdoc and prettify do, we can upstream the fix.

@joeyparrish joeyparrish modified the milestones: Backlog, v2.6 May 16, 2019
@shaka-project shaka-project locked and limited conversation to collaborators Jul 14, 2019
@shaka-bot shaka-bot added the status: archived Archived and locked; will not be updated label Apr 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
status: archived Archived and locked; will not be updated type: bug Something isn't working correctly
Projects
None yet
Development

No branches or pull requests

5 participants