-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check stability for both preview and non-preview styles #3423
Conversation
This idea can be generalized to other aspects of the mode: run the stability/equivalence tests only while setting some part of the mode to the opposite of what the test normally does. I tried locally and found that every new configuration adds about 4 s to the test runtime. The most interesting configuration was setting line-length to 1, which found #3424, #3425, #3427, and #3428. Setting I'm going to add a case that does line length = 1, preview = False, because that configuration seems to be good at catching bugs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! This is a clever way to discover bugs :)
Is it clear when the secondary format runs are the ones erroring out? I just don't want this to confuse contributors unfamiliar with our testing setup. I'm not happy that this quadruples the runtime of the test_format.py
tests (12s -> 45s), but c'est la vie.
# It's not useful to run safety checks if we're expecting no changes anyway. The | ||
# assertion right above will raise if reality does actually make changes. This just | ||
# avoids wasted CPU cycles. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# It's not useful to run safety checks if we're expecting no changes anyway. The | |
# assertion right above will raise if reality does actually make changes. This just | |
# avoids wasted CPU cycles. |
Unfortunately this optimization is now gone and I can't find a way to bring it back :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why? The optimization still applies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you are right. My bad.
I pushed a change that wraps failures from these checks to hopefully make it clear.
I don't think it's going to be quite so bad, where did you get that number from? The current version only adds two extra checks, and the test does more than just calling this function, so it shouldn't 3x the overall runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should've clarified this was on my machine (which is probably underpowered at this point). On GHA CI, it's negligible.
And yes when you take the runtime of the rest of the test suite, it's not that bad.
Co-authored-by: Richard Si <63936253+ichard26@users.noreply.github.com>
For all format tests, check that both preview and non-preview succeed without crashes. This makes sure we catch bugs early where preview mode introduces a new crash.
Fixes #3419. Fixes #3420. Closes #3422 (this PR is a more general solution).