Inexplicable error -8 while matching certain pattern #23

Closed
michaeltyson opened this issue Oct 6, 2021 · 5 comments

michaeltyson commented Oct 6, 2021

Hi!

There is an error when matching the following pattern:

a(.|\s)*?asdf

against:

a                b b bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbf

Specifically: a, followed by newline, followed by 16 spaces, followed by b, space, b, space, 35 b's, then an f.

pcregrep 'a(.|\s)*?asdf' returns:

pcregrep: pcre_exec() gave error -8 while matching this text:

a                b b bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbf


pcregrep: Error -8, -21 or -27 means that a resource limit was exceeded.
pcregrep: Check your regex for nested unlimited loops.

Using pcre2 10.37_1, as installed on macOS 11.6 via Homebrew.

Am I missing something obvious? This should work, right?

Collaborator

zherczeg commented Oct 6, 2021

If you are using pcre_exec, you are using PCRE1, which is no longer maintained and will never see another release. Anyway, the problem is not specific to PCRE1, because PCRE2 also returns PCRE2_ERROR_MATCHLIMIT. This pattern takes exponential (2^n) time on this particular input because both \s and the dot match a space, so every space doubles the number of paths the engine has to try. To avoid very long runs, the engine gives up once its match limit is reached.
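To make the 2^n growth concrete, here is a minimal Python sketch of a naive backtracker for this one pattern shape. It is a simplified model and not PCRE's actual implementation; the function name `count_attempts` is hypothetical, and the model ignores any optimizations a real engine applies. It only counts how often the tail `asdf` would be tried.

```python
# Simplified model of backtracking for the pattern  a(.|\s)*?asdf
# anchored at the start of the subject. NOT PCRE's real algorithm.

def count_attempts(subject: str) -> int:
    """Count how many times the tail 'asdf' is tried before giving up or matching."""
    attempts = 0

    def walk(pos: int) -> bool:
        nonlocal attempts
        attempts += 1  # lazy quantifier: try the tail before consuming more
        if subject.startswith("asdf", pos):
            return True
        if pos >= len(subject):
            return False
        ch = subject[pos]
        # First alternative: dot (any character except newline).
        if ch != "\n" and walk(pos + 1):
            return True
        # Second alternative: \s. A space reaches here after the dot branch
        # has already failed, so the rest of the subject is explored twice.
        if ch.isspace() and walk(pos + 1):
            return True
        return False

    if not subject.startswith("a"):
        return 0
    walk(1)
    return attempts


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        # 'a', then n spaces, then 'f': the tail 'asdf' never matches.
        print(n, count_attempts("a" + " " * n + "f"))
```

In this model the count for n spaces is 3 * 2^n - 1, so each extra space roughly doubles the work. The subject from the report has 18 spaces followed by a long run of b's that every path has to walk through, which is how a short input can reach a limit in the millions.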

Because '.|\s' matches essentially any character, you can simply enable dotall locally for the dot instead: a(?s:.)*?asdf.
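As a quick sanity check of the rewritten pattern, the sketch below uses Python's `re` module, which also accepts the scoped `(?s:...)` syntax (Python 3.6+). Python's engine is not PCRE, so this only illustrates the pattern's behaviour, not pcregrep's. The subject is rebuilt from the description in the report, and the helper name `find` is hypothetical.

```python
import re

# Workaround pattern from the comment above: dotall enabled only for the dot.
FIXED = re.compile(r"a(?s:.)*?asdf")

# Subject rebuilt from the report: a, newline, 16 spaces, "b b ", 35 b's, f.
subject = "a\n" + " " * 16 + "b b " + "b" * 35 + "f"


def find(text: str):
    """Return the matched text, or None if the pattern does not match."""
    m = FIXED.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    # No 'asdf' in the reported subject, so this is a fast non-match.
    print(find(subject))
    # With the tail present, the match spans the newline.
    print(repr(find("a\n  b asdf")))
```

With a single way to consume each character, the engine has nothing to retry per position, so the failing case finishes after one pass per candidate start.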

@michaeltyson
Author

🤨😄

Collaborator

zherczeg commented Oct 6, 2021

Let me know if you are OK with this answer, because then we can close this bug.

Author

michaeltyson commented Oct 6, 2021

I suspect that's more in your court - this still feels like incorrect behaviour to me, but I don't think it's my place 😊

And I'm fine with the workaround.

Thanks!

Collaborator

zherczeg commented Oct 6, 2021

Regex101 says: https://regex101.com/r/6XKrAV/1
Catastrophic backtracking has been detected and the execution of your expression has been halted. To find out more and what this is, please read the following article: Runaway Regular Expressions

Writing good patterns is important; the engine has limits to protect you. E.g. a bubble sort is much slower than a quicksort on large inputs, whatever the C compiler does, and that is not the fault of the compiler.
