Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix inconsistent hashing for Nanny-spawned workers #8400

Merged
merged 4 commits into from Dec 18, 2023

Conversation

cisaacstern
Copy link
Contributor

@cisaacstern cisaacstern commented Dec 12, 2023

Towards #4141

  • Tests added / passed
  • Passes pre-commit run --all-files

This PR implements @mrocklin's suggestion in #4141 (comment) to replicate dask/dask#6660, but for Nanny-spawned workers. As noted there, this is not a complete solution for #4141, as the Nanny is not required for spawning workers, but nonetheless should unblock a large percentage of uses cases currently blocked by #4141.

First commit is just a failing test, will push the fix momentarily.

@GPUtester
Copy link
Collaborator

Can one of the admins verify this patch?

Admins can comment ok to test to allow this one PR to run or add to allowlist to allow all future PRs from the same author to run.

Copy link
Contributor

github-actions bot commented Dec 12, 2023

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

       27 files  ±  0         27 suites  ±0   11h 45m 14s ⏱️ - 2m 44s
  3 939 tests +  1    3 823 ✔️  -   3     112 💤 +  2  4 +2 
49 547 runs  +11  47 224 ✔️  - 11  2 317 💤 +27  6  - 5 

For more details on these failures, see this check.

Results for commit acc5075. ± Comparison against base commit 87576ae.

This pull request removes 3 and adds 4 tests. Note that renamed tests count towards both.
distributed.diagnostics.tests.test_scheduler_plugin ‑ test_register_plugin_pickle_disabled
distributed.tests.test_scheduler ‑ test_run_on_scheduler_disabled
distributed.tests.test_worker ‑ test_gpu_executor
distributed.deploy.tests.test_subprocess ‑ test_subprocess_cluster_does_not_depend_on_logging
distributed.tests.test_dask_collections ‑ test_bag_groupby_key_hashing
distributed.tests.test_scheduler ‑ test_allow_pickle_false
distributed.tests.test_scheduler ‑ test_override_is_rootish
This pull request skips 2 tests.
distributed.tests.test_scheduler ‑ test_decide_worker_coschedule_order_neighbors[nthreads0-1]
distributed.tests.test_scheduler ‑ test_decide_worker_coschedule_order_neighbors[nthreads1-1]

♻️ This comment has been updated with latest results.

@fjetter
Copy link
Member

fjetter commented Dec 13, 2023

There is one related test failure in distributed/tests/test_nanny.py::test_environment_variable_pre_post_spawn which h ard codes what variables we set in the pre spawn config

You're also affected by #8397

Otherwise this LGTM. If you can adjust the failing test we can merge

@cisaacstern
Copy link
Contributor Author

Thanks so much for the prompt review @fjetter! The last commit fixes the broken test.

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing up the tests @cisaacstern. Failures we see now look unrelated tot his change.

@jacobtomlinson jacobtomlinson merged commit 8c3eb6f into dask:main Dec 18, 2023
28 of 33 checks passed
@cisaacstern
Copy link
Contributor Author

Thanks @jacobtomlinson and @fjetter !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants