
tests/cover/test_lookup.py::test_resolving_mutually_recursive_types{,_with_defaults} fail randomly when running the test suite w/ xdist #3671

Closed
mgorny opened this issue Jun 6, 2023 · 10 comments · Fixed by #3683
Labels: flaky-tests (for when our tests only sometimes pass)

Comments

mgorny (Contributor) commented Jun 6, 2023

Well, you're not going to like this, but I'm getting quasi-random failures from the following two tests when running a large subset of the test suite in parallel:

    FAILED tests/cover/test_lookup.py::test_resolving_mutually_recursive_types - RecursionError: maximum recursion depth exceeded
    FAILED tests/cover/test_lookup.py::test_resolving_mutually_recursive_types_with_defaults - RecursionError: maximum recursion depth exceeded

This seems to be very flaky. It had been happening to us with PyPy3 before, but I assumed it was PyPy-specific. With 6.76.0, it suddenly started happening with CPython 3.11 as well. Curiously enough, I can reproduce it reliably when running a very specific set of tests:

python -m pytest -n 12 tests/cover tests/pytest tests/quality

but if I change tested files even a little, it stops happening.
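
For context, the two failing tests resolve mutually recursive types via st.from_type(); the pattern is roughly this (a minimal sketch of the shape involved, not the actual test code):

    from dataclasses import dataclass
    from typing import Optional

    from hypothesis import given, strategies as st

    @dataclass
    class A:
        b: Optional["B"]  # forward reference, resolved lazily

    @dataclass
    class B:
        a: Optional[A]

    # Resolving A needs a strategy for B, which needs A again, so
    # from_type() has to break the cycle internally; the Optional
    # fields give it a non-recursive base case (None).
    @given(st.from_type(A))
    def test_mutually_recursive(value):
        assert isinstance(value, A)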

Full log with tracebacks: dev-python:hypothesis-6.76.0:20230606-074652.log

I realize this is going to be very hard to reproduce, and I'll be glad to try anything you may suggest while I still can.

jobh (Contributor) commented Jun 6, 2023

I'm not able to reproduce it, but that's hardly surprising given your description.

I wonder whether it's related to borderline recursion limits: perhaps the tests execute at different stack depths depending on the specific pytest-xdist workload. One way to rule this out would be to add, e.g.,

    import sys

    sys.setrecursionlimit(5000)

near the top of tests/cover/test_lookup.py.

mgorny (Contributor, Author) commented Jun 6, 2023

@jobh, indeed that seems to help. Could you add it, or should I make a pull request?

mgorny (Contributor, Author) commented Jun 6, 2023

(That said, I'm not 100% sure it isn't just perturbing some other random factor, but I'm happy enough that the issue disappears for now.)

jobh (Contributor) commented Jun 6, 2023

@mgorny I'm not sure this is the right fix, but it certainly helps narrow down the problem.

On my local runs, a limit of 300 is sufficient (which is already a lot), and the default is 1000. I wonder what makes it grow so much.
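
For reference, here's roughly how one could probe that; a throwaway script, with the test path and -k filter as assumptions rather than anything from this thread:

    import subprocess
    import sys

    for limit in (200, 300, 500, 1000):
        # Run the suspect tests in a fresh interpreter with a lowered limit.
        # Note the limit also constrains pytest's own startup, so very low
        # values may fail for reasons unrelated to the tests themselves.
        code = (
            f"import sys; sys.setrecursionlimit({limit}); "
            "import pytest; raise SystemExit(pytest.main("
            "['tests/cover/test_lookup.py', '-k', 'mutually_recursive']))"
        )
        if subprocess.run([sys.executable, "-c", code]).returncode == 0:
            print(f"tests pass with recursion limit {limit}")
            break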

Zac-HD (Member) commented Jun 6, 2023

Hmm, we do try to handle recursion-depth problems internally, but maybe we need to repeat this trick somewhere else to handle strategy resolution?

    try:
        sys.setrecursionlimit(depth + self.__recursion_limit)
        self._test_function(data)
    except StopTest as e:
        if e.testcounter == data.testcounter:
            # This StopTest has successfully stopped its test, and can now
            # be discarded.
            pass
        else:
            # This StopTest was raised by a different ConjectureData. We
            # need to re-raise it so that it will eventually reach the
            # correct engine.
            raise
    finally:
        sys.setrecursionlimit(self.__recursion_limit)
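
If we did want to reuse it, the trick could be factored into a small guard; a sketch along these lines (illustrative names, not existing Hypothesis API):

    import sys
    from contextlib import contextmanager

    def current_stack_depth() -> int:
        # Count frames by walking the chain from here to the bottom.
        frame, depth = sys._getframe(0), 0
        while frame is not None:
            depth += 1
            frame = frame.f_back
        return depth

    @contextmanager
    def recursion_headroom(extra: int = 1000):
        # Raise the limit relative to the *current* depth, so the guarded
        # code gets consistent headroom no matter how deep its caller
        # already is, then restore the original limit on exit.
        old_limit = sys.getrecursionlimit()
        sys.setrecursionlimit(current_stack_depth() + extra)
        try:
            yield
        finally:
            sys.setrecursionlimit(old_limit)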

jobh (Contributor) commented Jun 7, 2023

@Zac-HD That was my concern too: that the problem was in strategy resolution, with the new recursion guard not doing its job. But it seems it's "just" that the new tests have deeper effective nesting than any of the old ones. That is, it could happen without recursion at roughly 30-deep strategies (though it takes >100-deep to fail consistently).
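
To make "30-deep" concrete, a contrived sketch (not the actual failing test):

    from hypothesis import strategies as st

    # Each nested strategy adds several Python stack frames during
    # validation and drawing, so ~30 levels of nesting already consumes
    # a large slice of the default 1000-frame recursion limit, before
    # pytest's and xdist's own frames are counted.
    strategy = st.integers()
    for _ in range(30):
        strategy = st.lists(strategy, max_size=1)
    strategy.validate()  # triggers the recursive validation walk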

The error happens in strategy validation, starting here:

...but I don't know whether bumping the limit there would just move the problem elsewhere (to draw time)?

To answer your previous question, @mgorny: I'm afraid I don't have time to follow up further right now.

jobh (Contributor) commented Jun 7, 2023

... except possibly to add a skip to the two failing tests as a short-term measure.

mgorny (Contributor, Author) commented Jun 7, 2023

... except possibly to add a skip to the two failing tests as a short-term measure.

We're already skipping them on our end via --deselect, so there's no need to do that.

mgorny (Contributor, Author) commented Jun 30, 2023

Thanks! I've re-enabled the tests in Gentoo for 6.80.0 and have seen no problems so far.

Zac-HD (Member) commented Jun 30, 2023

Thanks for confirming! Here's hoping that everything keeps working 😁
