Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SafeRe is vulnerable to ReDoS #2757

Open
gqgs opened this issue Jul 13, 2021 · 3 comments
Open

SafeRe is vulnerable to ReDoS #2757

gqgs opened this issue Jul 13, 2021 · 3 comments

Comments

@gqgs
Copy link
Contributor

gqgs commented Jul 13, 2021

Step 1: Please describe your environment

  • ZeroNet version: 0.7.2 (4555)

Step 2: Describe the problem:

"To avoid the ReDoS algorithmic complexity attack" the function bellow is used to validate user defined regular expressions.

def isSafePattern(pattern):
if len(pattern) > 255:
raise UnsafePatternError("Pattern too long: %s characters in %s" % (len(pattern), pattern))
unsafe_pattern_match = re.search(r"[^\.][\*\{\+]", pattern) # Always should be "." before "*{+" characters to avoid ReDoS
if unsafe_pattern_match:
raise UnsafePatternError("Potentially unsafe part of the pattern: %s in %s" % (unsafe_pattern_match.group(0), pattern))
repetitions = re.findall(r"\.[\*\{\+]", pattern)
if len(repetitions) >= 10:
raise UnsafePatternError("More than 10 repetitions of %s in %s" % (repetitions[0], pattern))
return True

This function fails to identify regular expressions that can require exponential time complexity to match user inputs.

Steps to reproduce:

>>> from SafeRe import isSafePattern, match
>>> p = "a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
>>> isSafePattern(p)
True
>>> match(p, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")

Observed Results:

match hangs and the execution never completes.

Expected Results:

isSafePattern should properly detect that the pattern is unsafe.
Alternatively, match should use an algorithm with guaranteed linear time complexity to compile and match inputs (e.g. Thompson NFA).

@rllola
Copy link
Contributor

rllola commented Jul 26, 2021

We could replace this by the RE2 (https://github.com/google/re2). There is python bindings available (https://pypi.org/project/google-re2/).

@wandrien
Copy link
Contributor

@rllola

Many zites make use of (?!...) and RE2 doesn't seem to support it. (https://github.com/google/re2/wiki/Syntax)
The problem is we neither check for formal allowed regexp syntax, nor have the formal definition at all. Our regexp syntax is implicitly python re syntax.

Not sure if it is possible to move to RE2 in a backward compatible way.

@wandrien
Copy link
Contributor

https://github.com/zeronet-enhanced/ZeroNet/commit/2a25d61b968a21aa98c6db2ca9d64f1bbdc54773

In my fork, I (temporarily) fixed this by treating ?s in the same ways as other "repetitions", so the total number of repetition markers cannot exceed 9.

Not sure if it is a proper or a complete solution. I'm not familiar with the ReDoS type of attack and regexp implementation details.

caryoscelus pushed a commit to caryoscelus/zeronet-conservancy that referenced this issue Jul 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants