Add customizable worker_threads #353

Merged 5 commits on Jan 19, 2021

Contributor

@AndreaGiardini AndreaGiardini commented Nov 26, 2020

Hey

I added the possibility to customize the number of threads per worker.
Until now the number of worker threads always equaled the number of cores, but that's not always the best combination.

If worker_threads is not set, it falls back to the default.
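
The fallback described above can be sketched in plain Python. This is only a minimal illustration, not the PR's code: `resolve_worker_threads` is a hypothetical helper, and the assumption that the default is the core count rounded down (with a floor of one thread) is inferred from the "worker_threads = cores" behavior mentioned here.

```python
from typing import Optional


def resolve_worker_threads(worker_cores: float, worker_threads: Optional[int] = None) -> int:
    """Return the thread count a worker would start with (hypothetical helper)."""
    if worker_threads is not None:
        # An explicit setting always wins.
        return worker_threads
    # Not set: fall back to the previous behavior, threads == cores.
    # Assumption: fractional cores round down, with a floor of one thread.
    return max(int(worker_cores), 1)


print(resolve_worker_threads(4))                    # 4 (default: one thread per core)
print(resolve_worker_threads(4, worker_threads=1))  # 1 (single-threaded worker on 4 cores)
```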

Let me know if you have any feedback

Fixes #364
Fixes #362

@lorenzori lorenzori left a comment

Nice, thanks! So can we use it as an argument in gateway.new_cluster()?

@AndreaGiardini
Contributor Author

Hey @jcrist
I had a look at the failing CI, and it looks like dask/distributed does not communicate the number of cores but only nthreads, which is why the widget tests fail:

https://github.com/dask/distributed/blob/b4dfc925bac32a488be2016a5930a9b7dd95cec5/distributed/scheduler.py#L1707-L1728

Do you have any preference on how to tackle this?

@AndreaGiardini
Contributor Author

Friendly ping to @consideRatio. Maybe you can give me some feedback on this?

@erl987
erl987 commented Dec 11, 2020

Very useful initiative, relates to issue #362.

Collaborator

@consideRatio consideRatio left a comment

Thanks for your work @AndreaGiardini! I'm not representing this project but hopefully my review can be helpful.

First of all, I think it's sensible to allow worker_threads to be customizable, so the feature idea is sound.

What I would suggest iterating on is mainly the default value of worker_threads in the normal ClusterConfig and in the KubeClusterConfig.

In this PR, ClusterConfig gives worker_threads a default value of 1, but in the past it matched the number of CPU cores. KubeClusterConfig, which derives from ClusterConfig, has a dynamic default for worker_threads set to worker_cores, but in the past int(worker_cores_limit) was used.

Since neither the PR summary nor the title indicates an intent to change the default behavior, I recommend updating the PR to implement a dynamic _default_worker_threads function with the @default decorator in both ClusterConfig and KubeClusterConfig, each keeping the default it previously used.
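
To make the suggestion concrete, here is a rough sketch of what dynamic defaults with the traitlets `@default` decorator could look like. The class bodies are heavily simplified stand-ins for the real `ClusterConfig` and `KubeClusterConfig`, and the exact default expressions (including the floor of one thread and the default for `worker_cores_limit`) are assumptions based on the previous behavior described in this review, not the code that was merged.

```python
from traitlets import Float, Integer, default
from traitlets.config import Configurable


class ClusterConfig(Configurable):
    """Simplified stand-in; the real class has many more traits."""

    worker_cores = Float(1, help="Number of cpu-cores available to a worker.")
    worker_threads = Integer(help="Number of threads for a worker.")

    @default("worker_threads")
    def _default_worker_threads(self):
        # Previous behavior: the thread count matched the number of cores.
        # Assumption: round down, but never go below one thread.
        return max(int(self.worker_cores), 1)


class KubeClusterConfig(ClusterConfig):
    """Simplified stand-in for the Kubernetes-specific config."""

    worker_cores_limit = Float(help="Maximum number of cpu-cores for a worker.")

    @default("worker_cores_limit")
    def _default_worker_cores_limit(self):
        # Assumption for this sketch: the limit defaults to worker_cores.
        return self.worker_cores

    @default("worker_threads")
    def _default_worker_threads(self):
        # Previous Kubernetes behavior: int(worker_cores_limit).
        return max(int(self.worker_cores_limit), 1)
```

With this shape, an explicitly configured `worker_threads` is used as-is, and the `@default` method only runs when the trait was never set.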

@@ -1052,7 +1052,7 @@ def get_worker_command(self, namespace, cluster_name, config):
             "--name",
             "$(DASK_GATEWAY_WORKER_NAME)",
             "--nthreads",
-            str(int(config.worker_cores_limit)),
+            str(int(config.worker_threads)),

I think worker_threads is already an Integer traitlet, while worker_cores_limit was a Float traitlet and needed the cast to an integer first.

Suggested change
- str(int(config.worker_threads)),
+ str(config.worker_threads),
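
The difference comes down to how Python renders floats and integers as strings, since `--nthreads` expects a whole number. A small illustration follows; the helper `nthreads_args` is hypothetical and only mimics the two command-line entries from the diff.

```python
def nthreads_args(worker_threads: int) -> list:
    """Build the --nthreads portion of a worker command (hypothetical helper)."""
    # worker_threads is already an integer, so str() alone gives a clean value.
    return ["--nthreads", str(worker_threads)]


# A Float-typed value such as worker_cores_limit needs the int() cast first:
print(str(2.0))          # 2.0  (not a valid integer argument)
print(str(int(2.0)))     # 2
print(nthreads_args(2))  # ['--nthreads', '2']
```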

@AndreaGiardini
Contributor Author

@consideRatio Thank you for the valuable input.

I fixed the PR as you suggested. I can always squash as a final step before merging if needed.
Still, the problem with the widget remains, unfortunately... How do you think we should tackle this?

@AndreaGiardini
Contributor Author

Hey @consideRatio and @jcrist

I took some time to rebase the PR and make CI green without upgrading the distributed dependency.
I also found an issue with the docs generation, which should be fixed as well.

Since there seems to be quite some interest in this feature, can you tell me if something else is needed to get this merged? Thank you.

@martindurant
Member

@consideRatio, are you happy with the changes here?

This all looks fair as far as I can tell.

@jcrist, are you likely to have time for a quick look?

@jacobtomlinson, this is one of the few repos still on Travis, which still appears to work for now. I expect it might be more complex than most to transform.

@jcrist
Member

jcrist commented Jan 19, 2021

Apologies for letting this linger; this looks good to me. Merging. Thanks for the contribution!

@jcrist jcrist merged commit dbcb726 into dask:master Jan 19, 2021
@martindurant
Member

Thanks, @jcrist

@AndreaGiardini AndreaGiardini deleted the worker_threads branch January 19, 2021 15:27
@AndreaGiardini
Contributor Author

Thank you to all of you! :) Would you mind tagging a new release whenever possible? We could really benefit from this.

@consideRatio
Collaborator

Wieee! 🎉

@consideRatio consideRatio added the "new" label and removed the "enhancement" (New feature or request) label Apr 20, 2022
Successfully merging this pull request may close these issues.

[kubernetes] Cannot specify threads per worker
Allow single-threaded running of workers
6 participants