All tests fail on one node and run never completes, around 20% of the time #1203

@dgolombek

We have an annoyingly high rate (~20%) of pytest-xdist test suite runs in which every test on one worker fails and the run then never completes:

$ poetry run pytest --durations=5 -v -W ignore::DeprecationWarning tests/integration/mycor -n auto
============================= test session starts ==============================
platform linux -- Python 3.11.11, pytest-8.3.4, pluggy-1.5.0 -- /home/runner/work/dave-energy/dave-energy/.venv/bin/python
cachedir: .pytest_cache
rootdir: /home/runner/work/dave-energy/dave-energy
configfile: pytest.ini
plugins: xdist-3.6.1, asyncio-0.26.0, pytest_httpx-0.29.0, Faker-35.0.0, respx-0.20.2, ddtrace-2.21.7, timeout-2.3.1, typeguard-2.13.3, anyio-3.7.1, time-machine-2.16.0
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
created: 4/4 workers
4 workers [2215 items]

scheduling tests via LoadScheduling

tests/integration/foo.py::test_bar
[gw2] [  0%] ERROR tests/integration/foo.py::test_bar
tests/integration/foo.py::test_baz
[gw2] [  0%] ERROR tests/integration/foo.py::test_baz
[gw3] [  1%] PASSED tests/integration/quux.py::test_quux
tests/integration/quux.py::test_quux

The tests run until 85-92% completion, then hang forever.

Oddly (and almost certainly relatedly), I only see logs for gw2 and gw3 in this run -- I never see gw0 or gw1. In the runs that do pass, I see all 4 workers. I've added psutil, but that didn't help. I tried dropping to pytest-xdist 3.4.0, to see if an older version would help, but no luck. Using --dist loadfile did not help either. I'm using execnet 2.1.1.
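
To confirm which workers actually reported results in a given run, the verbose output can be tallied per worker. This is only a minimal sketch, assuming the `-v` log has been saved as text and that result lines look like the ones above; `tally_by_worker` and `silent_workers` are hypothetical helper names, not part of pytest or pytest-xdist.

```python
import re
from collections import Counter, defaultdict

# Matches pytest-xdist verbose result lines such as:
#   [gw2] [  0%] ERROR tests/integration/foo.py::test_bar
RESULT_LINE = re.compile(r"^\[(gw\d+)\]\s+\[\s*\d+%\]\s+(\w+)\s+(\S+)")


def tally_by_worker(log_text):
    """Return {worker_id: Counter({outcome: count})} for every result line."""
    tally = defaultdict(Counter)
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            worker, outcome, _test_id = match.groups()
            tally[worker][outcome] += 1
    return dict(tally)


def silent_workers(tally, worker_count):
    """List workers gw0..gw{worker_count - 1} that never reported a result."""
    return [f"gw{i}" for i in range(worker_count) if f"gw{i}" not in tally]


# Excerpt copied from the failing run shown earlier in this report.
sample = """\
tests/integration/foo.py::test_bar
[gw2] [  0%] ERROR tests/integration/foo.py::test_bar
tests/integration/foo.py::test_baz
[gw2] [  0%] ERROR tests/integration/foo.py::test_baz
[gw3] [  1%] PASSED tests/integration/quux.py::test_quux
tests/integration/quux.py::test_quux
"""

tally = tally_by_worker(sample)
print(tally)
print(silent_workers(tally, worker_count=4))
```

On the excerpt above, this reports errors only from gw2, a pass from gw3, and nothing from gw0 or gw1, which matches what I see in the failing runs.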

Please let me know if there's other information I can provide, or flags I can add to provide more details. Unfortunately I can't reproduce this locally, only in CI, which runs Linux on x86, whereas my local machine is a Mac on ARM.
