Elpi lexer has pathological backtracking #2053
Comments
Thanks for pinning this down. What I'm trying to do is to make it so that, for
My lexer is probably silly, but I don't know how to improve it. So help is welcome. Examples:
The problem is that Can the outer
I would say it's legit, but I would put I read between the lines that these regexps are not compiled to a DFA. If they were, all these expressions would collapse to the same one. :-/
No; DFA regex engines don't support features like backreferences or look-behind assertions, which Python's re does.
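To make the point about `re` concrete, here is a minimal sketch of the two features mentioned. The patterns and the names `DOUBLED` and `ISSUE_REF` are made up for illustration and have nothing to do with the Elpi lexer itself.

```python
import re

# Backreference: \1 must repeat exactly the text captured by group 1.
# No finite automaton can remember an arbitrary-length capture like this.
DOUBLED = re.compile(r"([a-z]+)-\1")

# Look-behind: match digits only when they directly follow a "#".
ISSUE_REF = re.compile(r"(?<=#)\d+")

print(bool(DOUBLED.fullmatch("abc-abc")))  # same word twice
print(bool(DOUBLED.fullmatch("abc-abd")))  # second word differs
print(ISSUE_REF.search("see #2061"))
print(ISSUE_REF.search("see 2061"))
```

Supporting these is what forces a backtracking matcher, so equivalent-looking patterns can behave very differently in running time.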
idcharstarns_re = rf"({idchar_re}+|\.{lcase_re}+)" The relevant alternative of
This sounds like it's not matching what it intends. For example, it can match “
it should NOT match
OK, but should it match
no, that should be lexed as
As in most PLs, idents can contain numbers and some other special chars, but can't begin with a number.
I've opened #2061, but I'm not very sure about it. Could you take a look, please? Thank you.
The Elpi lexer has pathological backtracking that can be triggered with the following minimal input:
Only lowercase characters trigger pathological backtracking. Uppercase characters and digits will not.
The culprit is at line 62.
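For readers unfamiliar with the failure mode, here is a rough sketch of how a quantifier nested inside another quantifier can blow up in Python's `re`. The patterns `NESTED` and `FLAT` and the helper `time_match` are hypothetical stand-ins, not the actual expressions from the Elpi lexer.

```python
import re
import time

# Hypothetical patterns, NOT the ones from the Elpi lexer.
# NESTED puts a "+" inside a "*", so a run of lowercase letters can be
# split into groups in exponentially many ways.
NESTED = re.compile(r"(?:[a-z]+)*\d")
# FLAT accepts exactly the same strings with a single quantifier.
FLAT = re.compile(r"[a-z]*\d")


def time_match(pattern, text):
    """Return (match object or None, elapsed seconds) for pattern.match(text)."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


# The input has no digit, so the match must fail, and a backtracking
# engine tries every way of splitting the letters before giving up.
for n in (10, 14, 18):
    text = "a" * n
    _, nested_s = time_match(NESTED, text)
    _, flat_s = time_match(FLAT, text)
    print(f"n={n:2d}  nested={nested_s:.4f}s  flat={flat_s:.6f}s")
```

On a backtracking engine the nested timings should grow roughly exponentially with `n` while the flat ones stay negligible, so keep `n` small when trying this. Rewriting so that no quantified group itself contains an overlapping quantifier is the usual way out.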
@gares I'm not familiar with Elpi, so if you can jump in to address this, great! Otherwise I'll be able to take a crack at it after this weekend.