Hypothesis Seed not reproducing "max allowable size" HealthCheck #3446
|
Hmm. The seed should reproduce it, even for a healthcheck that analyses multiple examples - we're careful to always use our internal PRNG with that known seed. Instead of turning up the number of examples (if the healthcheck doesn't fire in 100 it's probably not going to), I'd try just scanning through seeds upwards from zero looking for one that does reproduce. Your test does look like it's got firmly-bounded size, so I'd guess that you do indeed have some weird state-pollution thing going on. That said, I'd be pretty surprised if you can trigger such problems via our public API, and of course if that is the case we'll consider it a serious bug. (practically, once I'm back from no-internet-access vacation around Sep 7) https://github.com/pytest-dev/pytest-randomly might also be helpful for understanding which tests might interact, if you wrap some tooling around to bisect. |
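The seed scan suggested above could be scripted roughly as follows. This is only a sketch: the helper names (`pytest_command`, `run_with_seed`, `first_failing_seed`) are made up for illustration, and it treats any non-zero pytest exit code as a "failure", so the output still needs checking to confirm that the health check, rather than something else, is what fired.

```python
import subprocess
import sys
from typing import Callable, Iterable, List, Optional


def pytest_command(test_path: str, seed: int) -> List[str]:
    """Build a pytest invocation pinned to a single Hypothesis seed."""
    return [sys.executable, "-m", "pytest", test_path, f"--hypothesis-seed={seed}", "-q"]


def run_with_seed(test_path: str, seed: int) -> bool:
    """Return True if the pytest run exited with a non-zero status for this seed."""
    result = subprocess.run(
        pytest_command(test_path, seed), capture_output=True, text=True
    )
    return result.returncode != 0


def first_failing_seed(
    fails: Callable[[int], bool], seeds: Iterable[int]
) -> Optional[int]:
    """Scan seeds in order and return the first one for which `fails` is true."""
    for seed in seeds:
        if fails(seed):
            return seed
    return None
```

For example, `first_failing_seed(lambda s: run_with_seed("test_hypothesis.py", s), range(1000))` would scan seeds 0 to 999. If no seed reproduces the failure, that is some evidence for state outside the seeded PRNG being involved.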
This sounds like a good idea. I'll dig in and try this out.
Good to know -- I'll continue working on how to reproduce + shrink this test failure.
No worries -- I'm about to go on leave for about a month, so I'm not in a rush here. I'll let you know if/when I get more information here. |
|
Ok -- I was finally able to reproduce this and have a somewhat minimal example. If you use the following file https://github.com/alanhdu/hypothesis_3446/blob/591e2efc1e3241f72c2a01352cc3e7e7a116f7ab/test_hypothesis.py and run it with

set -ex
for i in {1..100}
do
echo $i
pytest test_hypothesis.py
done

I can reliably reproduce the error. (I put up a repository at https://github.com/alanhdu/hypothesis_3446 with exact reproducibility instructions using a Docker image if that helps). I will continue to try to minimize the error, but this error seems weird in a couple of ways:
I will try to pin down the exact problem on my side as well, but hopefully this is enough breadcrumbs to confirm there is a problem here! |
|
Hi @Zac-HD -- I was wondering whether you had time to check out https://github.com/alanhdu/hypothesis_3446? I tried taking a look at hypothesis and admittedly did not get very far (the error seems to be fairly deep in the guts of conjecture, which I'm not very familiar with). |
|
Not yet, sorry! Getting https://github.com/Zac-HD/hypofuzz off to a good start has taken most of my free time lately 😅 |
|
No worries! That seems like a big, exciting project. I got some more time at work to look into this and have a couple of (hopefully helpful) observations:
or in the then I can reproduce this 100% of the time. But if I use this does not reproduce! I've also checked that To me, this implies that
so there's some kind of on-disk state that is maybe causing the actual error as well.
|
|
I've done a little more digging and think there are two (or three) potential interrelated problems here:
hypothesis/hypothesis-python/src/_hypothesis_pytestplugin.py Lines 266 to 268 in 47b35ce
In the same way that we add the
@Zac-HD -- do you have thoughts on what to do next on any of these issues? And in particular, would you prefer if I split this into multiple issues to discuss them separately? |
|
Hi @alanhdu, my analysis: I think you're correct that the problem stems from database key collisions. In particular, that the tests effectively draw different amounts and when replaying an example with stored values (from another instance which draws less), it is reported as OVERRUN. There are probably other problems (non-reproducibility), but those are not as visible. I don't think the proposed change (adding Also, it's a special case of something that will always be problematic: dynamic behaviour outside of hypothesis' control, here controlled by the test runner directly - e.g., which subclass that calls Example: So yes, I think that database key collisions are problematic for db-example replays, if the colliding entries have different behaviour. |
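To make the collision mechanism concrete, here is a toy model of it. It is not Hypothesis's implementation: `ReplayData`, `db_key`, `replay`, and the `n_draws` attribute are all hypothetical, and the key scheme assumes (for illustration only) that the database key depends on the underlying test function alone. Under that assumption an inherited test shares its key with the base class, and an example saved by the variant that draws less overruns when replayed by the variant that draws more.

```python
from typing import Dict, List


class Overrun(Exception):
    """Raised when a replayed buffer runs out before the test stops drawing."""


class ReplayData:
    """Serves values from a stored buffer, one draw at a time."""

    def __init__(self, buffer: List[int]) -> None:
        self.buffer = buffer
        self.index = 0

    def draw(self) -> int:
        if self.index >= len(self.buffer):
            raise Overrun("stored example is too short for this test")
        value = self.buffer[self.index]
        self.index += 1
        return value


class Base:
    n_draws = 2  # hypothetical: how much data this variant consumes

    def test_contract(self, data: ReplayData) -> List[int]:
        return [data.draw() for _ in range(self.n_draws)]


class Child(Base):
    n_draws = 5  # same inherited test function, but it draws more


def db_key(test) -> str:
    # Toy assumption: the key depends only on the underlying function,
    # so Base and Child end up sharing one database entry.
    func = getattr(test, "__func__", test)
    return f"{func.__module__}.{func.__qualname__}"


def replay(test, database: Dict[str, List[int]]) -> str:
    """Replay the stored example for `test` and report what happened."""
    try:
        test(ReplayData(database[db_key(test)]))
        return "ok"
    except Overrun:
        return "OVERRUN"


base, child = Base(), Child()
# The Base run saves an example that is exactly as long as Base needs.
database = {db_key(base.test_contract): [1, 2]}
```

Here `replay(base.test_contract, database)` succeeds, while `replay(child.test_contract, database)` finds the same entry under the shared key and runs out of data after two draws.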
|
To summarize, and some action points for this issue: Database key collisions are possible, and lead to examples being replayed for the wrong test. This is usually not a big problem, just wasted work; but in this case it leads to a spurious health check failure.
Item (2) above will make the error go away for the present issue by hiding the underlying db-key collisions. That's ok since they are fairly benign. But we could add a new health check to warn in at least some of these cases. I can try to create a PR for this, if @Zac-HD can point me to a clean way to achieve (2), disable-health-check. ...it also occurs to me, is it possible at present to set a stable test id (db-key) explicitly through |
|
@Zac-HD Turns out that fixing the overrun issue takes more thought than just disabling health checks during replay. Disabling the replay does fix it, so I think much of the analysis above is on the right track, but the actual overrun happens later. I won't be able to follow up on this for the moment. |
Class-based tests that inherit a Hypothesis test case emit a Hypothesis health check warning starting from hypothesis-6.83.2 [0][1]. This is due to inherited tests being run by different Hypothesis executors and may cause issues when replaying examples [2]. Inheriting Hypothesis tests in subclasses is clearly not wanted, so it makes sense to remove the pytest-asyncio test that tests for this feature. [0] https://hypothesis.readthedocs.io/en/latest/changes.html#v6-83-2 [1] HypothesisWorks/hypothesis#3720 [2] HypothesisWorks/hypothesis#3446 Signed-off-by: Michael Seifert <m.seifert@digitalernachschub.de>
|
I'm pretty sure this is fixed by #3862, at last! |
I have a test that is intermittently failing with

hypothesis.errors.FailedHealthCheck: Examples routinely exceeded the max allowable size.

(see https://gist.github.com/alanhdu/dc2248ffc5697e91c84944ce85e63a16 as an example). Unfortunately, I have not been able to figure out how to reproduce this reliably -- when I use the provided --hypothesis-seed from the failure message to invoke my tests, all the tests pass and the health check is not triggered!

This is on hypothesis 6.47.1 and python 3.8 (installed via conda). Unfortunately, this is in a fairly large internal codebase that I can't share, and I haven't been able to find a minimal reproducing example, so I'm not able to give much tangible information. Some things that might be helpful though:

--hypothesis-seed does not reproduce the test failure (either by running the single test alone or by running the entire test suite)

py.test selecting the individual test 10000 times, and none of those times triggered the health check failure. I wonder whether reproducing this somehow requires running the full test suite (as opposed to just the single test).

This was a way for us to easily test multiple aspects of our interface contract given a single construct_object implementation, but I wonder whether the inheritance is somehow "entangling" some hypothesis internal state causing this unreproducible health check failure.

Any advice on how to debug this would be much appreciated! I will (of course) continue pursuing this on our side as well.