Stack overflows are often reported as flaky. #2494

Closed
pschanely opened this issue Jul 18, 2020 · 5 comments · Fixed by #2506
Labels: enhancement (it's not broken, but we want it to be better)

Comments

@pschanely
Contributor

As mentioned here, deterministic tests that can hit Python's maximum recursion depth may be reported as flaky by Hypothesis, because the test is re-run at varying initial stack depths.

A simple example that Hypothesis claims is flaky:

from hypothesis import given, strategies

def slow(i):
    return 0 if i <= 0 else slow(i - 1) + 1

@given(strategies.integers(min_value=0))
def test_slow(i) -> None:
    assert slow(i) == i

test_slow()

Something of a question, rather than a bug report: would we consider detecting RecursionError and giving it some kind of special treatment?
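(A minimal, standalone demonstration of the underlying effect, not from the thread: the same deterministic call can pass or overflow depending on how many frames are already on the stack when the recursion starts. The helper `call_at_depth` is made up for illustration.)

```python
import inspect
import sys

def slow(i):
    return 0 if i <= 0 else slow(i - 1) + 1

def call_at_depth(extra, fn):
    """Burn `extra` stack frames before invoking fn()."""
    return fn() if extra == 0 else call_at_depth(extra - 1, fn)

# Give ourselves roughly 1000 frames of headroom above the current depth.
sys.setrecursionlimit(len(inspect.stack()) + 1000)

results = {}
for extra in (0, 500):
    try:
        call_at_depth(extra, lambda: slow(900))
        results[extra] = "ok"
    except RecursionError:
        results[extra] = "RecursionError"

print(results)  # the deep start overflows; the shallow one does not
```

The function itself is deterministic; only the starting depth differs, which is exactly why Hypothesis can see a failure on one attempt and a pass on another.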

@DRMacIver
Member

Sure, I'd be happy to consider that as a feature.

I'm not sure what the best special handling would be, though. The easiest option might be to immediately re-raise RecursionError as soon as it's thrown, rather than attempting to shrink and report it, but that's not necessarily a great user experience either. We could potentially fiddle with getting and setting the recursion limit, but that sounds a bit yikes. Any ideas?

@pschanely
Contributor Author

Yeah, I considered the recursion limit fiddling too and agree it sounds scary. I guess I'd say that although re-raising RecursionError doesn't sound like the optimal behavior, it feels incrementally better?
When I saw this organically, I spent a fair amount of time trying to figure out whether my code was actually nondeterministic, so just suppressing that seems pretty helpful.

@DRMacIver
Member

> Yeah, I considered the recursion limit fiddling too and agree it sounds scary.

I wonder if there's something relatively unintrusive we can do here, like setting the recursion limit so that inside the test function we are always effectively at the top level. Something like setting it to the default recursion limit plus the current stack depth?

> When I saw this organically, I spent a fair amount of time trying to figure out whether my code was actually nondeterministic, so just suppressing that seems pretty helpful.

One light touch option might be to improve the error message for flakiness for RecursionError?

Hmm. Potentially another problem here is that the way we differentiate exceptions is by type and line number. We might need to stop doing that for RecursionError because if the stack depth changes at all it might raise RecursionError on a different line number, which will cause Hypothesis to think it's a different exception.

@pschanely
Contributor Author

Interesting. Indeed, perhaps the tools we need to reset the stack are already there, and would work?:

import inspect
import sys

cur_depth = len(inspect.stack())  # number of frames currently on the stack
sys.setrecursionlimit(sys.getrecursionlimit() + cur_depth)

... wrapped in a context manager or something.
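(A sketch of wrapping that in a context manager; the name `ensure_recursion_headroom` is illustrative, not Hypothesis's actual API.)

```python
import contextlib
import inspect
import sys

@contextlib.contextmanager
def ensure_recursion_headroom():
    """Temporarily raise the recursion limit by the number of frames already
    on the stack, so the wrapped code effectively starts from a fresh stack."""
    old_limit = sys.getrecursionlimit()
    cur_depth = len(inspect.stack())  # note: stack(), not stack()[0]
    sys.setrecursionlimit(old_limit + cur_depth)
    try:
        yield
    finally:
        sys.setrecursionlimit(old_limit)

before = sys.getrecursionlimit()
with ensure_recursion_headroom():
    inside = sys.getrecursionlimit()
after = sys.getrecursionlimit()
print(inside > before, after == before)  # True True
```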

@Zac-HD added the "enhancement (it's not broken, but we want it to be better)" label on Jul 19, 2020
@Zac-HD
Member

Zac-HD commented Jul 19, 2020

Sounds good to me - it's a terrible hack to fix a rare UX problem, but that's most of our internals 😂

I'd want to add a check on __exit__ that the recursion limit is still where we left it, and fail loudly if we'd be undoing a change made by user code... hopefully this will never actually trigger, but it'll save someone a lot of debugging time if it does.
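(A sketch of that __exit__ check; the class name and the +100 headroom are made up for the demo, this is not the actual Hypothesis internals.)

```python
import sys

class RecursionLimitGuard:
    def __enter__(self):
        self.old_limit = sys.getrecursionlimit()
        self.new_limit = self.old_limit + 100  # arbitrary headroom for the demo
        sys.setrecursionlimit(self.new_limit)
        return self

    def __exit__(self, exc_type, exc_value, tb):
        # Fail loudly if user code moved the limit while we held it,
        # rather than silently clobbering their change.
        if sys.getrecursionlimit() != self.new_limit:
            raise RuntimeError(
                "recursion limit was changed inside the guarded block"
            )
        sys.setrecursionlimit(self.old_limit)
        return False  # never swallow exceptions from the block

orig = sys.getrecursionlimit()
tripped = False
try:
    with RecursionLimitGuard():
        sys.setrecursionlimit(sys.getrecursionlimit() + 1)  # naughty user code
except RuntimeError:
    tripped = True
finally:
    sys.setrecursionlimit(orig)

print(tripped)  # True
```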
