pytest-xdist workers crashing #404
[gw2] node down: Not properly terminated
[gw2] [ 4%] FAILED tests/benchmarks/test_csv.py::test_csv_basic
replacing crashed worker gw2
[gw3] node down: Not properly terminated
[gw3] [ 4%] FAILED tests/benchmarks/test_dataframe.py::test_dataframe_align
replacing crashed worker gw3
We saw something similar in #412 with:
[gw0] node down: Not properly terminated
[gw0] [ 4%] FAILED tests/benchmarks/test_array.py::test_anom_mean
replacing crashed worker gw0
We saw something similar in #413 with:
[gw1] node down: Not properly terminated
[gw1] [ 4%] FAILED tests/benchmarks/test_zarr.py::test_select_scalar
replacing crashed worker gw1
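To compare crashes across runs like the ones quoted above, it may help to pull the worker and test ID out of the xdist output automatically. The following is a minimal sketch, assuming the log lines keep the shape shown in this thread; the function names `crashed_tests` and `crashes_per_module` are hypothetical helpers, not part of pytest-xdist.

```python
import re
from collections import Counter

# Patterns modeled on the log lines quoted in this thread (an assumption:
# other xdist versions may format these lines differently).
NODE_DOWN = re.compile(r"^\[(gw\d+)\] node down: (.+)$")
FAILED = re.compile(r"^\[(gw\d+)\] \[\s*\d+%\] FAILED (\S+)$")


def crashed_tests(log_text):
    """Return (worker, test_id) pairs for FAILED lines that follow a node-down line."""
    down = set()      # workers reported down, still waiting for their FAILED line
    crashes = []
    for raw in log_text.splitlines():
        line = raw.strip()
        m = NODE_DOWN.match(line)
        if m:
            down.add(m.group(1))
            continue
        m = FAILED.match(line)
        if m and m.group(1) in down:
            crashes.append((m.group(1), m.group(2)))
            down.discard(m.group(1))
    return crashes


def crashes_per_module(crashes):
    """Count crashes per test module (the part of the test ID before '::')."""
    return Counter(test_id.split("::")[0] for _, test_id in crashes)


SAMPLE = """\
[gw2] node down: Not properly terminated
[gw2] [ 4%] FAILED tests/benchmarks/test_csv.py::test_csv_basic
replacing crashed worker gw2
[gw3] node down: Not properly terminated
[gw3] [ 4%] FAILED tests/benchmarks/test_dataframe.py::test_dataframe_align
replacing crashed worker gw3
"""

if __name__ == "__main__":
    for worker, test_id in crashed_tests(SAMPLE):
        print(worker, test_id)
```

Running this over the logs of several workflow runs would show whether the same modules or tests keep appearing next to a crashed worker.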
Looking at it quickly, this seems to be a known issue that is still open: pytest-dev/pytest-xdist#466. I'm not quite sure how to proceed here. There are other related issues, some open and some closed, with no answer.
It may just be that we have too many concurrent xdist workers. The theory was that not much work is done on the client, so 8 workers should be fine. But that theory might not be correct. In particular, I think that package_sync might be fairly expensive for the client. Some support for this is that in #429 all of the worker crashes happen on the first test of the given module, which is when the cluster is being spun up. Two possible ways to alleviate this:
Thoughts?
We can test this on a branch and run it every day for a few days and see if it fixes it. |
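One way such a branch experiment could be set up is to run the suite with fewer xdist workers than the 8 mentioned above. This is only a sketch under assumptions: the worker count of 4 is an arbitrary illustrative value, `build_pytest_cmd` is a hypothetical helper, and the `tests/benchmarks` path is taken from the log lines in this thread. The `-n` option is pytest-xdist's worker-count flag.

```python
import subprocess
import sys


def build_pytest_cmd(num_workers, test_path="tests/benchmarks"):
    """Build a pytest command line that runs test_path with num_workers xdist workers."""
    return [sys.executable, "-m", "pytest", "-n", str(num_workers), test_path]


if __name__ == "__main__":
    # 4 is an illustrative value, chosen only because it is lower than 8.
    cmd = build_pytest_cmd(4)
    print("Running:", " ".join(cmd))
    # Propagate pytest's exit code so CI still fails when tests fail.
    sys.exit(subprocess.run(cmd).returncode)
```

If the crashes stop at the lower worker count over several daily runs, that would support the theory that the client is overloaded; if they continue, the cause is more likely elsewhere.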
Workflow Run URL