Skip to content

Python: use the shared regex pack #11247

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Nov 14, 2022
Merged

Conversation

erik-krogh
Copy link
Contributor

@erik-krogh erik-krogh commented Nov 14, 2022

the commits should explain what's happening

Evaluation looks fine.

I'm pretty sure the QLDoc check failure is spurious, but I don't know that happens.
I've tried to add all the QLDoc I could in those files, and I've even tried to add extra modules into the file (just to have something to attach a QLDoc too), but it keeps failing.

@erik-krogh erik-krogh marked this pull request as ready for review November 14, 2022 10:22
@erik-krogh erik-krogh requested a review from a team as a code owner November 14, 2022 10:22
Copy link
Contributor

@yoff yoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a very minor typo.

You write that the dropped result is due to locations. I can see that a result in a nested location is retained. Is that what is going on, some stronger canonicalisation?

It looks like we are getting better alert messages? I remember we turned off prefix computation for Python for a period due to performance issues, perhaps it was never fully turned on again...in any case, the evaluation looks fine.

Unrelated to this PR: Is the construction in PolynimoalReDoS.ql, where we filter the sources to be PolynomialBackTrackingTerms after the dataflow computation a little silly? Should the constraint be part of the definition of Sink instead?

@erik-krogh
Copy link
Contributor Author

erik-krogh commented Nov 14, 2022

You write that the dropped result is due to locations. I can see that a result in a nested location is retained. Is that what is going on, some stronger canonicalisation?

I would call it some slightly better canonicalisation.
I drilled into it, and it is this de-duplication code that produces different results.
The old behavior is actually a bug, the result was supposed to be removed (because it's effectively a duplicate), but with the better canonicalisation we now properly remove that result.

You can add more terms to the order by clause in the old implementation to replicate the results seen in the new implementation.

I'll make the canonicalisation even more robust.

erik-krogh and others added 2 commits November 14, 2022 17:33
Co-authored-by: yoff <lerchedahl@gmail.com>
@erik-krogh
Copy link
Contributor Author

It looks like we are getting better alert messages? I remember we turned off prefix computation for Python for a period due to performance issues, perhaps it was never fully turned on again...in any case, the evaluation looks fine.

I seem to remember us fixing the root cause of that a while ago.
I'm quite sure this PR doesn't change the alert-messages? We had prefix as part of the alert-message before this PR.

Unrelated to this PR: Is the construction in PolynimoalReDoS.ql, where we filter the sources to be PolynomialBackTrackingTerms after the dataflow computation a little silly? Should the constraint be part of the definition of Sink instead?

Yes.
But could you wait with that until after this PR?

@erik-krogh erik-krogh requested a review from yoff November 14, 2022 16:37
@yoff
Copy link
Contributor

yoff commented Nov 14, 2022

I would call it some slightly better canonicalisation. I drilled into it, and it is this de-duplication code that produces different results. The old behavior is actually a bug, the result was supposed to be removed (because it's effectively a duplicate), but with the better canonicalisation we now properly remove that result.

Thanks for finding the precise code :-)

I seem to remember us fixing the root cause of that a while ago.

That is how I remember it also.

I'm quite sure this PR doesn't change the alert-messages? We had prefix as part of the alert-message before this PR.

The diff (of the .expected-files) has a lot of added starting with '*' and and similar, though.

But could you wait with that until after this PR?

Absolutely, that belongs in a fix of its own.

Copy link
Contributor

@yoff yoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran the tests in a Codespace to make sure everything could compile and find each other. I have no further issues. The QLdoc failure is annoying, is it because there is no longer a public import, so it gets confused about where the comment should be? In any case, the check looks to be in the wrong..

@erik-krogh
Copy link
Contributor Author

The QLdoc failure is annoying, is it because there is no longer a public import, so it gets confused about where the comment should be?

I tried to add a public module in the files to see if that made the errors go away, but that didn't happen.
So your theory doesn't seem to be what's happening.

@erik-krogh erik-krogh merged commit d285700 into github:main Nov 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants