fuzz: compiling '\P{any}' panics by tripping an assertion in the compiler #722

BurntSushi opened this issue Oct 19, 2020 · 0 comments



@BurntSushi BurntSushi commented Oct 19, 2020

Specifically, this one:


Normally, regexes like [^\w\W] with empty classes are banned at translation time. But it looks like \P{any} (which is empty) slipped through. So we should just improve the ban to cover that case.

However, empty character classes are occasionally useful constructs for injecting a "fail" sub-pattern into a regex, typically in the context of cases where regexes are generated. Indeed, the NFA compiler in regex-automata handles this case fine:

$ regex-cli debug nfa thompson '\P{any}' -B
      parse time:  48.809µs
  translate time:  17.48µs
compile nfa time:  18.638µs
   pattern count:  1

>000000: alt(2, 1)
 000001: \x00-\xff => 0
^000002: sparse()
 000003: MATCH(0)

Here, it's impossible to ever move past state 2, so the match state 3 is never reached. Arguably, it might be nicer if it were an explicit "fail" instruction, but an empty sparse instruction (a state with no outgoing transitions) serves the purpose just as well.
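To make the "empty sparse state acts as fail" point concrete, here is a minimal sketch of a toy NFA simulation using only the standard library. The `State` enum, `epsilon_closure`, `is_match`, and the two constructor functions are hypothetical names invented for illustration; they are not regex-automata's actual types or API. The state numbering mirrors the debug output above.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified NFA states (NOT regex-automata's real types).
enum State {
    /// Epsilon transitions to two alternatives, in priority order.
    Alt(usize, usize),
    /// A single byte range leading to `next`.
    Range { lo: u8, hi: u8, next: usize },
    /// Zero or more byte-range transitions. With no ranges, nothing can
    /// ever leave this state, so it behaves like an explicit "fail".
    Sparse(Vec<(u8, u8, usize)>),
    Match,
}

/// Adds `start` and everything reachable from it via `Alt` to `set`.
fn epsilon_closure(nfa: &[State], start: usize, set: &mut BTreeSet<usize>) {
    let mut stack = vec![start];
    while let Some(id) = stack.pop() {
        if !set.insert(id) {
            continue; // already visited
        }
        if let State::Alt(a, b) = &nfa[id] {
            stack.push(*a);
            stack.push(*b);
        }
    }
}

/// Returns true if any path from `start` reaches a `Match` state.
fn is_match(nfa: &[State], start: usize, haystack: &[u8]) -> bool {
    let mut current = BTreeSet::new();
    epsilon_closure(nfa, start, &mut current);
    for &byte in haystack {
        if current.iter().any(|&id| matches!(nfa[id], State::Match)) {
            return true;
        }
        let mut next = BTreeSet::new();
        for &id in &current {
            match &nfa[id] {
                State::Range { lo, hi, next: target } => {
                    if *lo <= byte && byte <= *hi {
                        epsilon_closure(nfa, *target, &mut next);
                    }
                }
                State::Sparse(ranges) => {
                    // An empty `ranges` means this loop never runs.
                    for &(lo, hi, target) in ranges {
                        if lo <= byte && byte <= hi {
                            epsilon_closure(nfa, target, &mut next);
                        }
                    }
                }
                State::Alt(..) | State::Match => {}
            }
        }
        current = next;
    }
    current.iter().any(|&id| matches!(nfa[id], State::Match))
}

/// The four states from the debug output: state 2 is an empty sparse state.
fn never_match_nfa() -> Vec<State> {
    vec![
        State::Alt(2, 1),
        State::Range { lo: 0x00, hi: 0xff, next: 0 },
        State::Sparse(vec![]),
        State::Match,
    ]
}

/// Same shape, but state 2 has one transition on `a`, for contrast.
fn single_a_nfa() -> Vec<State> {
    vec![
        State::Alt(2, 1),
        State::Range { lo: 0x00, hi: 0xff, next: 0 },
        State::Sparse(vec![(b'a', b'a', 3)]),
        State::Match,
    ]
}
```

Calling `is_match(&never_match_nfa(), 0, b"abc")` returns `false` for any haystack, because no transition ever leaves state 2. With `single_a_nfa()`, the same call on `b"xa"` returns `true`. No dedicated "fail" instruction is needed in this sketch; the empty transition list does the job.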

So once #656 is done, we should be able to relax this restriction.

This bug was found by OSS-Fuzz.

@BurntSushi BurntSushi added the bug label Oct 19, 2020
@BurntSushi BurntSushi closed this in 6fdb6e1 Nov 1, 2020
This was referenced Mar 12, 2021