Crackfortran regular expression could use improvement #23338

Open
charris opened this issue Mar 4, 2023 · 4 comments

Comments

@charris
Member

charris commented Mar 4, 2023

In crackfortran.py, line 938:

nameargspattern = re.compile(
    r'\s*(?P<name>\b[\w$]+\b)\s*(@\(@\s*(?P<args>[\w\s,]*)\s*@\)@|)\s*((result(\s*@\(@\s*(?P<result>\b[\w$]+\b)\s*@\)@|))|(bind\s*@\(@\s*(?P<bind>.*)\s*@\)@))*\s*\Z', re.I)

This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '@)@bind@(@'.
(Flagged by CodeQL.)
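
A minimal sketch of how the warning might be reproduced, offered as an illustration rather than as part of the original report. The pattern is copied from above; the helper name `probe` and the probe string (a name, then n closed bind groups, then one unclosed `bind@(@`) are assumptions chosen so that the overall match has to fail. Because every `@)@` can either close a bind group or be swallowed by `.*`, the timings should grow roughly geometrically with n, though exact numbers depend on the machine.

```python
import re
import time

# Pattern copied verbatim from crackfortran.py as quoted in this issue.
nameargspattern = re.compile(
    r'\s*(?P<name>\b[\w$]+\b)\s*(@\(@\s*(?P<args>[\w\s,]*)\s*@\)@|)\s*((result(\s*@\(@\s*(?P<result>\b[\w$]+\b)\s*@\)@|))|(bind\s*@\(@\s*(?P<bind>.*)\s*@\)@))*\s*\Z', re.I)


def probe(n):
    """Hypothetical helper: match a line with n closed bind groups and one unclosed."""
    # The trailing 'bind@(@' is never closed, so no full match exists and the
    # engine has to try every way of splitting the line into bind groups.
    line = 'f bind@(@' + '@)@bind@(@' * n
    return nameargspattern.match(line)


# n is kept small on purpose; larger values can take a very long time.
for n in (8, 10, 12, 14):
    start = time.perf_counter()
    result = probe(n)
    elapsed = time.perf_counter() - start
    print(f'n={n:2d} matched={result is not None} time={elapsed:.4f}s')
```
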
@molsonkiko
Contributor

I have my eye on this one. I've looped in a regex wizard I know and hopefully I'll be able to make it better.

@charris
Member Author

charris commented Mar 26, 2023

That would be great.

@molsonkiko
Contributor

It occurs to me that the issue is probably the .* in (bind\s*@\(@\s*(?P<bind>.*)\s*@\)@).

I have essentially no knowledge of FORTRAN, so I feel pretty far out of my depth here, but I assume what's going on here is that the .* can reach forward arbitrarily far, including through subsequent @)@bind@(@. Probably there's some sort of negative lookahead that could be done here to prevent this problem.
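
A rough sketch of that idea, under stated assumptions: `original` is the pattern quoted in the issue, and `tempered` is a hypothetical variant (not a NumPy fix) in which the `.*` of the bind group is replaced by `(?:(?!@\)@).)*`, a dot guarded by a negative lookahead so it cannot step onto a closing `@)@`. The sample line is made up for illustration. Whether this variant accepts everything f2py needs, for example a bind body that itself contains `@)@`, has not been checked here.

```python
import re

# Original pattern, copied from the issue.
original = re.compile(
    r'\s*(?P<name>\b[\w$]+\b)\s*(@\(@\s*(?P<args>[\w\s,]*)\s*@\)@|)\s*((result(\s*@\(@\s*(?P<result>\b[\w$]+\b)\s*@\)@|))|(bind\s*@\(@\s*(?P<bind>.*)\s*@\)@))*\s*\Z', re.I)

# Hypothetical variant: only the body of the bind group differs. Each
# character is consumed only if a closing '@)@' does not start there.
tempered = re.compile(
    r'\s*(?P<name>\b[\w$]+\b)\s*(@\(@\s*(?P<args>[\w\s,]*)\s*@\)@|)\s*((result(\s*@\(@\s*(?P<result>\b[\w$]+\b)\s*@\)@|))|(bind\s*@\(@\s*(?P<bind>(?:(?!@\)@).)*)\s*@\)@))*\s*\Z', re.I)

# Made-up line with two consecutive bind groups.
line = 'foo bind@(@c@)@bind@(@d@)@'

# Greedy '.*' runs through the first '@)@' and the following 'bind@(@'.
print(original.match(line).group('bind'))   # c@)@bind@(@d

# The guarded dot stops at the first '@)@', so the group repeats and the
# named group holds the body of the last repetition.
print(tempered.match(line).group('bind'))   # d
```

With the guarded dot, each bind group can only end at the first `@)@` after it opens, which removes the particular ambiguity described above. Other sources of backtracking in the pattern are not addressed by this sketch.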

@molsonkiko
Contributor

Note for myself, and anyone else who's interested - looks like the source code that was used to flag regexes is probably here.


2 participants