Inefficient regular expression #2138

Closed
1 task
sebix opened this issue Jan 14, 2022 · 2 comments · Fixed by #2148
Labels
bug, component: bots, security

Comments

@sebix
Member

sebix commented Jan 14, 2022

Found by CodeQL:

remove_comments = re.compile(r"<!--(.|\s|\n)*?-->")

This part of the regular expression may cause exponential backtracking on strings starting with '<!--' and containing many repetitions of '\n'.

Some regular expressions take a long time to match certain input strings, to the point where the time it takes to match a string of length n is proportional to n^k or even 2^n. Such regular expressions can negatively affect performance, or even allow a malicious user to perform a Denial of Service ("DoS") attack by crafting an expensive input string for the regular expression to match.
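To make the warning concrete, here is a minimal sketch (not part of the CodeQL report) that checks which alternatives of the flagged group can consume a newline, and how many alternative choices that leaves for a run of newlines. The helper names `matching_alternatives` and `ways_to_consume` are made up for illustration. The count is an upper bound on the paths a backtracking engine may have to try when no closing `-->` follows.

```python
import re

# The three alternatives inside the group of the flagged pattern
# r"<!--(.|\s|\n)*?-->" (no flags, so "." does not match a newline).
ALTERNATIVES = [r".", r"\s", r"\n"]


def matching_alternatives(char):
    """Return the alternatives that can consume the single character `char`."""
    return [alt for alt in ALTERNATIVES if re.fullmatch(alt, char)]


def ways_to_consume(char, repetitions):
    """Count the distinct sequences of alternative choices for a run of `char`."""
    return len(matching_alternatives(char)) ** repetitions


# A newline is matched by both \s and \n, so each newline doubles the choices.
print(matching_alternatives("\n"))
print(ways_to_consume("\n", 20))  # 2**20 = 1048576
```

A plain space overlaps as well (`.` and `\s` both match it), which suggests the newline case in the report is only one of the ambiguous inputs.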

Tracking issue for:

@sebix sebix added the bug, component: bots, and security labels Jan 14, 2022
@monoidic
Contributor

Would this work?

remove_comments = re.compile(r"<!--.*?-->", re.DOTALL)
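One way to check this is to run both patterns over a few sample strings and compare what is left after removing the comments. This is only a sketch: the names `original`, `proposed`, `same_result` and the sample strings are assumptions for illustration, not taken from the bot's code or tests.

```python
import re

# Pattern flagged by CodeQL and the proposed replacement.
original = re.compile(r"<!--(.|\s|\n)*?-->")
proposed = re.compile(r"<!--.*?-->", re.DOTALL)

# Hypothetical inputs: single-line, multi-line, several comments,
# no comment at all, and an unterminated comment.
samples = [
    "<p>a</p><!-- one line --><p>b</p>",
    "<!--\nmulti\nline\n--><p>c</p>",
    "<!-- a --> keep <!-- b -->",
    "no comments here",
    "<!-- unterminated",
]


def same_result(text):
    """True if both patterns strip exactly the same comments from `text`."""
    return original.sub("", text) == proposed.sub("", text)


for text in samples:
    print(repr(text), same_result(text))
```

With `re.DOTALL`, `.` matches every character including newlines, so the group with three overlapping alternatives is no longer needed and each character can be consumed in only one way.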

@sebix
Member Author

sebix commented Jan 27, 2022

Maybe even "<!--([^\s\n]|\s|\n)*?-->" works by preventing the ambiguity of the first and the other alternative matches, but the re.DOTALL approach definitely is smarter :)
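A quick way to test whether an alternation is unambiguous is to look for characters that more than one alternative can match. The sketch below does that for the alternatives suggested above; `overlapping_chars` is a hypothetical helper and the scan only covers `string.printable`. Under these assumptions a newline still appears to be matched by both `\s` and `\n`, so this variant may not remove the ambiguity on the input CodeQL flagged, whereas dropping the redundant `\n` alternative would.

```python
import re
import string


def overlapping_chars(alternatives, alphabet=string.printable):
    """Map each character matched by more than one alternative to those alternatives."""
    overlaps = {}
    for char in alphabet:
        hits = [alt for alt in alternatives if re.fullmatch(alt, char)]
        if len(hits) > 1:
            overlaps[char] = hits
    return overlaps


# Alternatives from the suggested pattern "<!--([^\s\n]|\s|\n)*?-->".
suggested = [r"[^\s\n]", r"\s", r"\n"]
print(overlapping_chars(suggested))

# The same alternation without the redundant \n branch.
print(overlapping_chars([r"[^\s]", r"\s"]))
```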

waldbauer-certat added a commit that referenced this issue Feb 1, 2022
As suggested in #2138, credits @monoidic

Fixes #2138

Signed-off-by: Sebastian Waldbauer <waldbauer@cert.at>