Ensure workers do not kill on restart #8611

Open: wants to merge 7 commits into main

fjetter (Member) commented Apr 10, 2024

Closes #7312
Supersedes #7323

Closes #7321

xref #7320

github-actions bot (Contributor) commented Apr 10, 2024

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    29 files  +    10      29 suites  +10   12h 10m 14s ⏱️ + 5h 47m 38s
 4 084 tests +     1   3 957 ✅  -      9    112 💤  -   2  15 ❌ +12 
55 244 runs  +21 365  52 775 ✅ +20 398  2 431 💤 +932  38 ❌ +35 

For more details on these failures, see this check.

Results for commit 014ed07. ± Comparison against base commit eab58be.

This pull request removes 7 and adds 8 tests. Note that renamed tests count towards both.

Removed tests:
distributed.diagnostics.tests.test_memory_sampler ‑ test_async
distributed.tests.test_client ‑ test_restart_workers_kill_timeout[False]
distributed.tests.test_client ‑ test_restart_workers_kill_timeout[True]
distributed.tests.test_failed_workers ‑ test_worker_ttl_restarts_worker[False]
distributed.tests.test_failed_workers ‑ test_worker_ttl_restarts_worker[True]
distributed.tests.test_scheduler ‑ test_restart_nanny_timeout_exceeded
distributed.tests.test_scheduler ‑ test_restart_worker_rejoins_after_timeout_expired

Added tests:
distributed.deploy.tests.test_local ‑ test_localcluster_restart
distributed.tests.test_failed_workers ‑ test_worker_ttl_restarts_worker[None]
distributed.tests.test_failed_workers ‑ test_worker_ttl_restarts_worker[event_loop]
distributed.tests.test_failed_workers ‑ test_worker_ttl_restarts_worker[threadpool]
distributed.tests.test_nanny ‑ test_restart_stress[kill]
distributed.tests.test_nanny ‑ test_restart_stress[restart]
distributed.tests.test_nanny ‑ test_worker_start_exception_after_restart[kill]
distributed.tests.test_nanny ‑ test_worker_start_exception_after_restart[restart]

♻️ This comment has been updated with latest results.

fjetter force-pushed the nanny_dont_kill_worker_restart branch from b345ae1 to 6fa046b on April 12, 2024 07:01
hendrikmakait (Member) commented

What's the status of this PR? Do you need help resolving conflicts?

Successfully merging this pull request may close these issues.

Confusing Nanny on_exit callback structure
Restart can kill a worker