Protection against ReDoS #6163

Merged
merged 4 commits into from Nov 7, 2019


@stsewd stsewd commented Sep 10, 2019

The regex module is compatible with the re module when the VERSION0 flag is set.
It is also much faster on patterns that cause catastrophic backtracking, and it accepts a timeout:

>>> import re
>>> import regex
>>> import timeit
>>> pattern = "(a+)+b"
>>> input = "a" * 25
>>> timeit.timeit(lambda: re.search(pattern, input), number=10)
32.332445038000515
>>> timeit.timeit(lambda: regex.search(pattern, input, flags=regex.VERSION0), number=10)
0.003861578001306043
>>> input = "a" * 10000
>>> regex.search(pattern, input, flags=regex.VERSION0, timeout=5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/stsewd/.pyenv/versions/readthedocs.org/lib/python3.6/site-packages/regex/regex.py", line 266, in search
    concurrent, partial, timeout)
TimeoutError: regex timed out

I put the timeout to 15, maybe we can drop it to 5?
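
For illustration, the timeout shown above could be wrapped in a small helper so that callers never see the exception. This is only a sketch: `safe_search` is a hypothetical name, not something from this PR, and treating a timeout the same as "no match" is an assumption that may not suit every caller.

```python
import regex  # third-party package (pip install regex), not the stdlib re


def safe_search(pattern, text, timeout=1.0):
    """Search text for pattern, giving up after `timeout` seconds.

    Hypothetical helper: returns the match object, or None when nothing
    matched or when the search timed out.
    """
    try:
        # VERSION0 keeps the behaviour compatible with the stdlib re module.
        return regex.search(pattern, text, flags=regex.VERSION0, timeout=timeout)
    except TimeoutError:
        # The regex module raises the built-in TimeoutError on expiry.
        return None


if __name__ == "__main__":
    print(safe_search("(a+)+b", "aaab"))
    print(safe_search("(a+)+b", "a" * 10000, timeout=0.5))
```

A caller that needs to tell a timeout apart from a plain non-match would have to let `TimeoutError` propagate instead of swallowing it.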


@humitos humitos commented Oct 2, 2019

This PR is related to #5996.

@humitos humitos added the Needed: design decision label Oct 30, 2019

@humitos humitos commented Nov 4, 2019

We decided to ship with regex (#4641 (comment)) so we should merge this PR before that PR gets merged, or merge this PR into the other first.

@humitos humitos added Accepted and removed Needed: design decision labels Nov 4, 2019

@humitos humitos commented Nov 4, 2019

> I put the timeout to 15, maybe we can drop it to 5?

Even less would be better. Parsing a regex shouldn't take more than 1s.

@stsewd stsewd force-pushed the prevent-redos-attacks branch from 48e187f to 7cc0b47 Compare Nov 6, 2019

@stsewd stsewd commented Nov 6, 2019

OK, I've decreased the timeout to 1 second. An alternative is to use a regex engine based on finite state machines, but I wasn't able to find such a library for Python.
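
To show why a finite state machine avoids the problem, here is a minimal sketch that simulates a hand-built NFA for `(a+)+b` by tracking the set of active states. Everything in it (the transition table, `nfa_fullmatch`) is illustrative and assumed, not code from this PR, and it only does an anchored full match rather than a search.

```python
# Hand-built NFA for the pattern (a+)+b, anchored at both ends (illustrative only).
# Each entry maps (state, character) to the set of states reachable on it.
TRANSITIONS = {
    (0, "a"): {1},  # the first "a" enters the group
    (1, "a"): {1},  # another "a": stay in the inner a+ or restart the outer group
    (1, "b"): {2},  # "b" after at least one "a" reaches the accepting state
}
START, ACCEPT = 0, 2


def nfa_fullmatch(text):
    """Return True if the whole text matches (a+)+b."""
    current = {START}
    for ch in text:
        # Advance every active state at once; duplicate paths collapse in the set,
        # so the work per character is bounded by the number of states.
        current = {
            nxt
            for state in current
            for nxt in TRANSITIONS.get((state, ch), ())
        }
        if not current:
            return False  # no state can consume this character
    return ACCEPT in current


if __name__ == "__main__":
    print(nfa_fullmatch("aaab"))
    print(nfa_fullmatch("a" * 10000))
```

Because the many ways of splitting a run of "a" characters between the inner and outer `+` all land in the same state, the simulation takes one pass over the input instead of backtracking through each split.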

@stsewd stsewd requested a review from humitos Nov 6, 2019
humitos approved these changes Nov 7, 2019
@stsewd stsewd merged commit a8611aa into readthedocs:master Nov 7, 2019
2 checks passed
@stsewd stsewd deleted the prevent-redos-attacks branch Nov 7, 2019