Busy loop randomly runs 6x slower causing flaky DeadlineExceeded errors #4009
Comments
After the first failure, the test is re-run under tracing to try to give more information about the failure. The busy loop is probably close to the worst case for tracing's performance impact. To verify, you can add
at the top of the test. This will effectively disable hypothesis' tracing, since only one tracer can be attached. Note that tracing is much faster in Python >= 3.12.
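The snippet referenced above was not captured in this transcript. A minimal sketch of the idea it describes, assuming a no-op global tracer installed via sys.settrace (the function name here is illustrative, not from the original comment):

```python
import sys

# Hypothetical illustration: sys.settrace allows only one global tracer
# per thread, so installing our own prevents Hypothesis from attaching
# its explain-phase tracer to the test.
def noop_tracer(frame, event, arg):
    return None  # returning None disables local (per-line) tracing

sys.settrace(noop_tracer)
assert sys.gettrace() is noop_tracer  # our tracer occupies the single slot
sys.settrace(None)  # detach again when done
```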
And thanks for the detailed investigation! It supplied all the clues 😁
Ahh yeah, this is it. That does indeed make the issue go away. As does doing: @settings(phases=[p for p in Phase if p != Phase.explain])
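In context, the workaround above can be sketched as a full test; the test body and strategy here are hypothetical stand-ins, not the reporter's actual code:

```python
from hypothesis import Phase, given, settings, strategies as st

# Skip only the `explain` phase, which re-runs failing examples under
# tracing (the source of the slowdown reported in this issue).
no_explain = [p for p in Phase if p is not Phase.explain]

@settings(phases=no_explain)
@given(st.integers())
def test_busy_loop(n):
    # Stand-in for the real CPU-bound test body.
    [x * x for x in range(10_000)]

test_busy_loop()  # a @given-decorated test is directly callable
```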
This is an unfortunate experience, but I don't think we can really improve on it - any solution would also make the logic more complicated, and therefore make Hypothesis harder to understand. Only a few years until all supported Python versions are fast, though...
Is it worth adding something to the error message, like this message? Or adding a note in the docs? Maybe here: https://hypothesis.readthedocs.io/en/latest/settings.html#controlling-what-runs
Running a busy loop takes significantly more time for some runs, leading to flaky DeadlineExceeded errors. Here is an example that is significantly reduced from the code in my actual project.
Output:
Running this multiple times gives fairly consistent results.
239.56ms, which exceeded the deadline of 100.00ms, but on a subsequent run it took 31.85 ms
241.02ms, which exceeded the deadline of 100.00ms, but on a subsequent run it took 33.00 ms
238.56ms, which exceeded the deadline of 100.00ms, but on a subsequent run it took 32.03 ms
241.47ms, which exceeded the deadline of 100.00ms, but on a subsequent run it took 33.48 ms
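The reduced example itself was not captured in this transcript. A minimal sketch of the kind of measurement described, with a stand-in busy loop (names and loop size are illustrative, not the reporter's code):

```python
import time

def busy_loop(n=200_000):
    # Stand-in for the test body: a CPU-bound list comprehension.
    return [x * x for x in range(n)]

# Time several runs; per the report, runs re-executed under Hypothesis'
# explain-phase tracing were roughly 6x slower than untraced runs.
timings = []
for _ in range(3):
    start = time.perf_counter()
    busy_loop()
    timings.append((time.perf_counter() - start) * 1000)  # milliseconds

print([f"{t:.2f}ms" for t in timings])
```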
Instrumenting with cProfile like this:
We get (line breaks removed):
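The instrumentation snippet and its output were elided from this transcript. A self-contained sketch of profiling a busy loop with cProfile, assuming the same kind of stand-in function as above:

```python
import cProfile
import io
import pstats

def busy_loop(n=200_000):
    # Stand-in for the test body being profiled.
    return [x * x for x in range(n)]

# Profile one call and capture the stats, so attention can be narrowed
# to the loop itself (as the report does).
profiler = cProfile.Profile()
profiler.enable()
busy_loop()
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```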
The "lag spike" can vary in the number of runs; sometimes it's just a single run of the test. Not 100% sure, but I think the "lag spike" starts on the example immediately after the first error (whether or not the next example also fails).
I tried to reproduce this without hypothesis, while sticking fairly close to what hypothesis is doing, and came up with this:
But the "lag spike" doesn't occur. All the timings are 35-36ms.
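The reporter's standalone repro was elided here. For comparison, a hedged sketch (not the author's actual repro) of what explicit line tracing, as used by Hypothesis' explain phase, does to the same kind of busy loop; all names are illustrative:

```python
import sys
import time

def busy_loop(n=200_000):
    return [x * x for x in range(n)]

def time_ms(fn):
    start = time.perf_counter()
    fn()
    return (time.perf_counter() - start) * 1000

def line_tracer(frame, event, arg):
    # Returning the tracer requests per-line events, roughly what
    # Hypothesis' explain phase relies on.
    return line_tracer

untraced = time_ms(busy_loop)
sys.settrace(line_tracer)
traced = time_ms(busy_loop)
sys.settrace(None)
print(f"untraced: {untraced:.2f}ms, traced: {traced:.2f}ms")
```

On Python < 3.12 the traced run should be dramatically slower, which matches the explanation given in the comments.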
Details
Python 3.10
Hypothesis 6.103.1 (tried a couple older versions too, 6.103.0 and 6.100.0)
I'm on a fairly old laptop (Core i5) so you might need to drop the deadline or increase the number of runs in the busy loop to reproduce this.
In my actual project I'm trying to test a predicate logic solver and seeing an even starker difference:
295.24ms, which exceeded the deadline of 200.00ms, but on a subsequent run it took 21.73 ms (about 14x slower).
Possible Causes
Garbage Collection
Doesn't seem like it. Adding gc.disable() at the start of the test and gc.enable() at the end has no effect. The "lag spike" is still visible in the profile stats even when the profiler is enabled between the gc.disable() and gc.enable() calls.
CPU Instruction cache
Maybe, but then you would expect only the first couple of runs of the list comprehension in the busy loop to be slow, with later runs getting faster.
Hypothesis test harness code being included in the measured time
Doesn't really make sense, because we see the "lag spike" in the profiler timings, which are narrowed to just the busy loop.