Skip to content

ReDoS refactorizations #8522

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 14 commits into from
Closed

ReDoS refactorizations #8522

wants to merge 14 commits into from

Conversation

erik-krogh
Copy link
Contributor

@erik-krogh erik-krogh commented Mar 22, 2022

Don't look at this PR, it's just for me right now.

I'll do the implementation for JS first, and then get Python/Ruby along by running the sync script.

TODO:


Many things are happening in this PR, all of which are outlined below.

Commit-by-commit review is highly recommended.
All the tests should pass on every commit.

1: Disable parts of QL-for-QL

The consistency query and the ql/dead-code queries are very noisy until we have proper support for parameterized modules.
So I've disabled those.

2: Introduce a ReDoSPruning parameterized module

The class Config extends string pattern used previously was ugly.
This is just a straight-forward conversion of that pattern into a parameterized module.

3: Implement a more efficient algorithm for computing strings from chains

Based on Toms work: #8589

But I moved the implementation into a parameterized module, and I use 2 different strategies depending on the length of the chain. (I have no idea if that will work out).

4: add further normalization of char classses

Inspired by esbena: #4847 (comment)

Now \d and [\d] and [0-9] will literally be represented with a single value.
(an arbitrary one of them is chosen as the canonical char class).

The extra normalization has a real, but limited, impact.
The largest impact is on the size of the SuperlinearBackTracking::getAThreewayIntersect predicate, the size of that predicate drops by ~15% on large JS benchmarks.

I also drive-by fixed a bug related to case-normalization.
(The bug didn't affect anything before my changes).

5: make a parameterized module out of the RegexpMatching implementation

I didn't like the configuration pattern used previously, which was also why I hadn't made a real library out of it.
It can be better with parameterized modules, so I did that.

6: Rename ReDoSUtil to NfaUtils, and rename the performance folder to regexp.

The ReDoSUtil file was used for more than just ReDoS, it's a general library for constructing and reasoning about NFAs.
And the performance folder is also not a great name.

7: Improve performance.

Lots of states in the process() predicate could be eliminated.
Because we know that if a regexp ever "reaches" the accept-all state, then it can never be rejected.
This mostly benefits ruby/ruby / opal/opal in the Ruby evaluation.

Also ports this exponential-redos fix from Shack.


Evaluations.

Evaluation looks OK.

@erik-krogh erik-krogh force-pushed the redosNormal branch 3 times, most recently from a296098 to 6b2e68d Compare March 22, 2022 08:59
@erik-krogh erik-krogh force-pushed the redosNormal branch 6 times, most recently from 3989fdc to 7dda90a Compare March 22, 2022 22:20
@erik-krogh erik-krogh changed the title ReDoS refactorization ReDoS refactorizations Mar 22, 2022
@erik-krogh erik-krogh force-pushed the redosNormal branch 2 times, most recently from 3c4c532 to 2cac72a Compare March 28, 2022 06:41
@erik-krogh erik-krogh added the no-change-note-required This PR does not need a change note label Mar 28, 2022
@erik-krogh erik-krogh force-pushed the redosNormal branch 2 times, most recently from 3be640f to cd74e76 Compare April 18, 2022 19:03
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 100 vulnerabilities.

Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 37 vulnerabilities.

Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 3 vulnerabilities.

Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 3 vulnerabilities.

Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 3 vulnerabilities.

@erik-krogh erik-krogh force-pushed the redosNormal branch 2 times, most recently from c58647d to ab711c8 Compare April 29, 2022 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant