Fuzzer testing: less strict special-case regex match passthrough for multi-line EOF exceptions #1998

jayaddison · 2021-02-22T11:37:37Z

Relates to #1012, and in particular aims to address #1012 (comment), #1012 (comment).

jayaddison · 2021-02-22T11:42:31Z

fuzz.py

@@ -52,7 +52,7 @@ def test_idempotent_any_syntatically_valid_python(
    except TokenError as e:
        if (
            e.args[0] == "EOF in multi-line statement"
-            and re.search(r"\\\r?\n", src_contents) is not None
+            and re.search(r"\\($|\r?\n)", src_contents) is not None


@Zac-HD NB: As it turns out, the existing regex hasn't been matching on "\\".

There's definitely a complexity trade-off coming up. If we end up requiring a third change to the regex, then I'd either consider adding test coverage to the script (which seems a little ridiculous, in a way), or taking your suggestion of simply checking for presence of backslash in the input at all.
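For readers following the diff, here is a minimal sketch of how the two patterns differ. The patterns are copied from the diff above; the `matches` helper and the sample inputs are made up for illustration and are not part of `fuzz.py`.

```python
import re

# Patterns copied from the diff above.
OLD_PATTERN = r"\\\r?\n"      # backslash must be followed by a newline
NEW_PATTERN = r"\\($|\r?\n)"  # backslash may also be the last character of the input


def matches(pattern: str, src: str) -> bool:
    """Hypothetical helper: True if the pattern is found anywhere in src."""
    return re.search(pattern, src) is not None


# Illustrative inputs only; the fuzzer generates its own source strings.
samples = {
    "lone backslash": "\\",
    "backslash at end of input": "x = 1 + \\",
    "backslash then newline": "x = 1 + \\\n2\n",
    "backslash then CRLF": "x = 1 + \\\r\n2\r\n",
    "backslash mid-line": "'a\\b'\n",
    "no backslash": "x = 1\n",
}

for label, src in samples.items():
    old = matches(OLD_PATTERN, src)
    new = matches(NEW_PATTERN, src)
    print(f"{label:28} old={old} new={new}")
```

The only inputs whose result changes are those where the backslash is the final character, which is the case the old pattern missed.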

Eh, I think "raises TokenError("EOF in multi-line statement") with backslash in input" is precise enough... and a lot easier to explain.

To put it another way, why would we care about a second bug with this more general symptom before we've fixed the first anyway?
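The simpler check suggested here could look roughly like the sketch below. It is an assumption-laden illustration, not the code that was merged: the helper name `is_known_backslash_eof_failure` is hypothetical, and the standard library's `tokenize.TokenError` stands in for the `TokenError` class that `fuzz.py` imports.

```python
import tokenize

EOF_MESSAGE = "EOF in multi-line statement"


def is_known_backslash_eof_failure(exc: Exception, src_contents: str) -> bool:
    """Hypothetical helper: treat the failure as the known bug when the
    tokenizer reports EOF in a multi-line statement and the input contains
    a backslash anywhere at all."""
    return (
        isinstance(exc, tokenize.TokenError)
        and bool(exc.args)
        and exc.args[0] == EOF_MESSAGE
        and "\\" in src_contents
    )


# The exception is constructed by hand here so the example does not depend
# on how any particular tokenizer version reports the error.
exc = tokenize.TokenError(EOF_MESSAGE, (1, 0))

print(is_known_backslash_eof_failure(exc, "x = 1 + \\"))  # backslash present
print(is_known_backslash_eof_failure(exc, "x = (1 +\n"))  # no backslash
```

The trade-off is that a backslash anywhere in the input, including inside a string literal, is enough to pass the failure through, so this check is easier to explain but more permissive than either regex.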

> Eh, I think "raises TokenError("EOF in multi-line statement") with backslash in input" is precise enough... and a lot easier to explain.

That is true, and alongside the simplicity argument is pretty compelling.

> To put it another way, why would we care about a second bug with this more general symptom before we've fixed the first anyway?

That does make sense; the side-effect I'm aiming to avoid is one of warning fatigue. If people start ignoring fuzzer failures, then we might miss it if a change introduces a genuine problem. There's also risk in hiding 'too many' failures - i.e. allowing some true positives to go unnoticed - but it's a balance against retaining maintainer attention.

> That does make sense; the side-effect I'm aiming to avoid is one of warning fatigue. If people start ignoring fuzzer failures, then we might miss it if a change introduces a genuine problem. There's also risk in hiding 'too many' failures - i.e. allowing some true positives to go unnoticed - but it's a balance against retaining maintainer attention.

Personally, I think warning fatigue has already set in for me. Sometimes I'll just look at a red CI and be like: "d'oh, it's the fuzz workflow that's failing, it's probably fine", so +1 from me.

Yes, honestly I've been ignoring fuzzer failures for a little while since I assume it's just this backslash bug.

See https://patricegodefroid.github.io/public_psfiles/bugs2005.pdf - better to report fewer bugs but have them all fixed than have people ignore the test!

Something makes me feel a bit wary here. Maybe there was no ideal resolution, but each option (alert fatigue vs complex special-case pass logic vs an overly-permissive pass logic) has downsides of the kind that could compound into larger problems over time.

If we think there are problems that won't be solved during day-to-day updates to and maintenance of the codebase, it'd be worth finding ways to raise and track those - GitHub issues and/or further discussion here likely being the simplest.

> If we think there are problems that won't be solved during day-to-day updates to and maintenance of the codebase, it'd be worth finding ways to raise and track those - GitHub issues and/or further discussion here likely being the simplest.

Yeah, I agree. I don't believe the core team (of which I'm a member) actually has even a basic list of poorly defined TODOs written down anywhere. Creating and maintaining one sounds like a good idea... although I feel like it would never have anything crossed out on it.

jayaddison added 3 commits February 22, 2021 11:31

Regex tweak: remove leading newline requirement

959766a

Regex tweak: permit match on end-of-input after backslash

b04f79a

Attach explanatory comment to conditional check

0fb96d6


JelleZijlstra merged commit e1c86f9 into psf:master Feb 22, 2021

jayaddison deleted the tests/less-strict-eof-exception-passthrough branch February 22, 2021 17:58

jayaddison mentioned this pull request Mar 13, 2021

Tokenization: improve EOF and multi-line statement interaction #1961

Closed


jayaddison Feb 22, 2021

Zac-HD Feb 22, 2021

jayaddison Feb 22, 2021

ichard26 Feb 22, 2021

JelleZijlstra Feb 22, 2021

Zac-HD Feb 22, 2021

jayaddison Feb 23, 2021

ichard26 Feb 23, 2021
