Do not rely on logging for SubprocessCluster
#8398
Unit Test Results
See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.
27 files ±0, 27 suites ±0, 11h 52m 36s (+14m 41s)
For more details on these failures, see this check.
Results for commit 2c49f6b. ± Comparison against base commit c408b6a.
The new test is very flaky:
with new_config_file(
    {"distributed": {"logging": {"distributed": logging.CRITICAL + 1}}}
):
    await asyncio.wait_for(_start(), timeout=2)
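As a minimal sketch of why the test sets the level to `logging.CRITICAL + 1`: a level above `CRITICAL` silences even critical messages, so nothing that waits on log output can succeed. The logger name `example.silenced` is hypothetical and unrelated to the `distributed` loggers; only the standard library is used.

```python
import io
import logging

# Capture log output in memory so it can be inspected afterwards.
stream = io.StringIO()
handler = logging.StreamHandler(stream)

logger = logging.getLogger("example.silenced")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False  # keep the root logger out of the picture

# At CRITICAL, a critical message is still emitted.
logger.setLevel(logging.CRITICAL)
logger.critical("visible")

# One step above CRITICAL, nothing gets through at all.
logger.setLevel(logging.CRITICAL + 1)
logger.critical("suppressed")

captured = stream.getvalue()
```

Here `captured` contains only the first message, which is the situation the test reproduces for the cluster startup.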
I don't understand the purpose of this super-short timeout. Why isn't the timeout integrated in gen_test what we want in this case?
I wrote the test before the fix and didn't want to wait the entire 30 seconds while fixing.
Attempt fix on #8413
Partially addresses #8392, #8393
This PR relies on the `scheduler_file` to propagate the scheduler address to the worker.

Note: Conceptually, the same fix can be applied to the `SSHCluster`, but things get mildly more complicated because one might have to deal with different file systems for the client and the scheduler.

pre-commit run --all-files
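The idea of passing an address through a file instead of parsing it from log output can be sketched roughly as follows. This is not the code from this PR: `write_scheduler_file` and `read_scheduler_address` are hypothetical helpers, and the file layout (a JSON object with an `address` key) is an assumption made for illustration.

```python
import json
import os
import tempfile
import time


def write_scheduler_file(path, address):
    """Hypothetical helper: publish the address atomically so that a
    reader never observes a partially written file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"address": address}, f)
    os.replace(tmp, path)


def read_scheduler_address(path, timeout=5.0, interval=0.05):
    """Hypothetical helper: poll until the file exists, then return the
    address stored in it. Raises TimeoutError if it never appears."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(path) as f:
                return json.load(f)["address"]
        except FileNotFoundError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"no scheduler file at {path}")
            time.sleep(interval)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "scheduler.json")
    # The "scheduler" side publishes its address...
    write_scheduler_file(path, "tcp://127.0.0.1:8786")
    # ...and the "worker" side picks it up without reading any logs.
    address = read_scheduler_address(path)
```

The sketch only works when both sides see the same path, which is the complication mentioned above for `SSHCluster`.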