Improve worst-case performance of inline.text regex #1460

Merged: @UziTech merged 1 commit into markedjs:master from the inline-text-quadratic branch on Apr 5, 2019

@andersk (Contributor) commented Apr 3, 2019

The old regex may take quadratic time to scan for potential email addresses starting at every point. Fix it to avoid scanning from points that would have been in the middle of a previous scan.

Marked version: 0.1.3 and later (problem introduced by commit 00f1f7a)

Markdown flavor: GitHub Flavored Markdown

Description

  • Fixes DoS issue reported privately.

Contributor

  • Test(s) exist to ensure functionality and minimize regression (if no tests added, list tests covering this PR); or,
  • no tests required for this PR.
  • If submitting new feature, it has been documented in the appropriate places.

Committer

In most cases, this should be a different person than the contributor.

  • Draft GitHub release notes have been updated.
  • CI is green (no forced merge required).
  • Merge PR

@UziTech (Member) commented Apr 3, 2019

Actual diff of the GFM inline.text regex:

- /^(`+|[^`])[\s\S]*?(?=[\\<!\[`*~]|\b_| {2,}\n|https?:\/\/|ftp:\/\/|www\.|[a-zA-Z0-9.!#$%&'*+\/=?^_`{\|}~-]+@|$)/
+ /^(`+|[^`])(?:[\s\S]*?(?:(?=[\\<!\[`*~]|\b_| {2,}\n|https?:\/\/|ftp:\/\/|www\.|$)|[^a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-](?=[a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-]+@))|(?=[a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-]+@))/
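
To make the difference concrete, here is a minimal sketch that runs both versions of the regex from the diff above on the same inputs. The two regex literals are copied from the diff; the `timeExec` helper, the sample sentence, and the input sizes are assumptions made for illustration and are not part of the PR. On a long run of email-like characters with no `@`, the old lookahead rescans the rest of the string from every position, so its time should grow much faster than the new version's as the input doubles. The snippet only compares these specific inputs and does not claim the two regexes match identically everywhere.

```javascript
// Regexes copied from the diff above (GFM inline.text, before and after).
const oldText = /^(`+|[^`])[\s\S]*?(?=[\\<!\[`*~]|\b_| {2,}\n|https?:\/\/|ftp:\/\/|www\.|[a-zA-Z0-9.!#$%&'*+\/=?^_`{\|}~-]+@|$)/;
const newText = /^(`+|[^`])(?:[\s\S]*?(?:(?=[\\<!\[`*~]|\b_| {2,}\n|https?:\/\/|ftp:\/\/|www\.|$)|[^a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-](?=[a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-]+@))|(?=[a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-]+@))/;

// Hypothetical helper: wall-clock time of one exec call, plus the match length.
function timeExec(re, input) {
  const start = Date.now();
  const match = re.exec(input);
  return { ms: Date.now() - start, length: match ? match[0].length : -1 };
}

// On this sample, both versions stop just before the email address.
const sample = 'contact me at foo@example.com please';
console.log(JSON.stringify(oldText.exec(sample)[0]));
console.log(JSON.stringify(newText.exec(sample)[0]));

// Email-like characters with no "@": the case described in the PR.
// Sizes are arbitrary; timings depend on the machine and the engine.
for (const n of [2500, 5000, 10000]) {
  const input = 'a'.repeat(n);
  const before = timeExec(oldText, input);
  const after = timeExec(newText, input);
  console.log(`n=${n} old=${before.ms}ms new=${after.ms}ms`);
}
```

The new version only attempts the email lookahead right after a character that cannot be part of an email local part, so a scan never restarts from the middle of a previous scan.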

@UziTech (Member) left a comment

Could you add a redos test in /test/redos/ that will fail before this change and pass after?

@andersk force-pushed the inline-text-quadratic branch from 175fae6 to 18dbc0b on Apr 3, 2019

@andersk (Contributor, Author) commented Apr 3, 2019

(Updated to address a separate quadratic slowdown in the same regex.)

@UziTech Do you want me to literally drop in a gigantic .md file consisting of aaaaaaaaaaaaa…, or should we find a way to test this more intelligently?

@UziTech (Member) commented Apr 3, 2019

Tests that take longer than 1 second are marked as failed. Maybe slim it down to an input that takes about 2 seconds before this fix.
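
A time-budget check of this kind could look roughly like the sketch below. This is not marked's actual test runner: `finishesWithin` is a hypothetical helper and the input size is an arbitrary assumption. The 1000 ms budget comes from the comment above, and the regex is the fixed non-GFM `inline.text` copied from the review diff later in this thread. The input targets the line-break branch: many spaces that are never followed by a newline.

```javascript
// Fixed non-GFM inline.text, copied from the review diff in this thread.
const text = /^(`+|[^`])(?:[\s\S]*?(?:(?=[\\<!\[`*]|\b_|$)|[^ ](?= {2,}\n))|(?= {2,}\n))/;

// Hypothetical helper, not marked's test harness: runs fn once and reports
// whether it finished within budgetMs. A synchronous regex cannot be
// interrupted, so the check happens only after fn returns.
function finishesWithin(fn, budgetMs) {
  const start = Date.now();
  fn();
  const elapsed = Date.now() - start;
  return { ok: elapsed <= budgetMs, elapsed };
}

// Adversarial input for the line-break branch. The size is an assumption;
// the thread suggests tuning it so the unfixed regex takes about 2 seconds.
const input = 'a' + ' '.repeat(50000);

const result = finishesWithin(() => text.exec(input), 1000);
console.log(result.ok ? 'pass' : 'fail', result.elapsed + ' ms');
```

Sizing the input so the unfixed regex lands near 2 seconds keeps the fixture small while leaving a clear margin on both sides of the 1 second cutoff.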

@andersk force-pushed the inline-text-quadratic branch from 18dbc0b to bd789b3 on Apr 4, 2019

@andersk (Contributor, Author) commented Apr 4, 2019

@UziTech Done.

@UziTech (Member) approved these changes on Apr 4, 2019

Thanks for working on this 💯 🏅

The old regex may take quadratic time to scan for potential line
breaks or email addresses starting at every point.  Fix it to avoid
scanning from points that would have been in the middle of a previous
scan.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>

@andersk force-pushed the inline-text-quadratic branch from 830413b to be27472 on Apr 4, 2019
@UziTech requested a review from @davisjam on Apr 4, 2019

@UziTech (Member) commented Apr 4, 2019

@davisjam do you want to look at this and make sure no redos vectors are added?

@@ -546,7 +546,7 @@ var inline = {
  code: /^(`+)([^`]|[^`][\s\S]*?[^`])\1(?!`)/,
  br: /^( {2,}|\\)\n(?!\s*$)/,
  del: noop,
- text: /^(`+|[^`])[\s\S]*?(?=[\\<!\[`*]|\b_| {2,}\n|$)/
+ text: /^(`+|[^`])(?:[\s\S]*?(?:(?=[\\<!\[`*]|\b_|$)|[^ ](?= {2,}\n))|(?= {2,}\n))/

@davisjam (Contributor) commented Apr 5, 2019

I believe this is safe

- .replace(']|', '~]|')
- .replace('|$', '|https?://|ftp://|www\\.|[a-zA-Z0-9.!#$%&\'*+/=?^_`{\\|}~-]+@|$')
- .getRegex()
+ text: /^(`+|[^`])(?:[\s\S]*?(?:(?=[\\<!\[`*~]|\b_|https?:\/\/|ftp:\/\/|www\.|$)|[^ ](?= {2,}\n)|[^a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-](?=[a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-]+@))|(?= {2,}\n|[a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-]+@))/

@davisjam (Contributor) commented Apr 5, 2019

I believe this is safe

@davisjam (Contributor) left a comment

LGTM

@UziTech merged commit b1ddd3c into markedjs:master on Apr 5, 2019
1 check passed

@UziTech (Member) commented Apr 5, 2019

This will be released in v0.6.2 🎉 #1441


3 participants