-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fuzzer testing: less strict special-case regex match passthrough for multi-line EOF exceptions #1998
Fuzzer testing: less strict special-case regex match passthrough for multi-line EOF exceptions #1998
Conversation
@@ -52,7 +52,7 @@ def test_idempotent_any_syntatically_valid_python( | |||
except TokenError as e: | |||
if ( | |||
e.args[0] == "EOF in multi-line statement" | |||
and re.search(r"\\\r?\n", src_contents) is not None | |||
and re.search(r"\\($|\r?\n)", src_contents) is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Zac-HD NB: As it turns out, the existing regex hasn't been matching on "\\"
.
There's definitely a complexity trade-off coming up. If we end up requiring a third change to the regex, then I'd either consider adding test coverage to the script (which seems a little ridiculous, in a way), or taking your suggestion of simply checking for presence of backslash in the input at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, I think "raises TokenError("EOF in multi-line statement")
with backslash in input" is precise enough... and a lot easier to explain.
To put it another way, why would we care about a second bug with this more general symptom before we've fixed the first anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, I think "raises
TokenError("EOF in multi-line statement")
with backslash in input" is precise enough... and a lot easier to explain.
That is true, and alongside the simplicity argument is pretty compelling.
To put it another way, why would we care about a second bug with this more general symptom before we've fixed the first anyway?
That does make sense; the side-effect I'm aiming to avoid is one of warning fatigue. If people start ignoring fuzzer failures, then we might miss it if a change introduces a genuine problem. There's also risk in hiding 'too many' failures - i.e. allowing some true positives to go unnoticed - but it's a balance against retaining maintainer attention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does make sense; the side-effect I'm aiming to avoid is one of warning fatigue. If people start ignoring fuzzer failures, then we might miss it if a change introduces a genuine problem. There's also risk in hiding 'too many' failures - i.e. allowing some true positives to go unnoticed - but it's a balance against retaining maintainer attention.
Personally, I think warning fatigue has already set-in for me. Sometimes I'll just look at a red CI and be like: "d'oh, it's the fuzz workflow that's failing, it's probably fine", so +1 from me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, honestly I've been ignoring fuzzer failures for a little while since I assume it's just this backslash bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://patricegodefroid.github.io/public_psfiles/bugs2005.pdf - better to report fewer bugs but have them all fixed than have people ignore the test!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something makes me feel a bit wary here. Maybe there was no ideal resolution, but each option (alert fatigue vs complex special-case pass logic vs an overly-permissive pass logic) has downsides of the kind that could compound into larger problems over time.
If we think there are problems that won't be solved during day-to-day updates to and maintenance of the codebase, it'd be worth finding ways to raise and track those - GitHub issues and/or further discussion here likely being the simplest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we think there are problems that won't be solved during day-to-day updates to and maintenance of the codebase, it'd be worth finding ways to raise and track those - GitHub issues and/or further discussion here likely being the simplest.
Yeah I agree, I don't believe the core team (of which I'm a member of) actually has even a basic list of poorly defined TODOs written down anywhere. Creating and maintaining one sounds like a good idea... although I feel like it would never have anything crossed out on it.
Relates to #1012, and in particular aims to address #1012 (comment), #1012 (comment).