Add deployment update post #58
This includes a summary and links to the various deployment efforts that have occurred in the last few months.
Dask-Cloudprovider and Dask-CUDA libraries place them
all under the same `dask.distributed.SpecCluster` superclass. So we can expect a high degree of
uniformity from them. Additionally, all of the classes now inherit from the
`dask.distributed.Cluster` class, which standardizes things like adaptivity,
Note: dask-gateway and dask-yarn don't inherit from `dask.distributed.Cluster`, but do match the API.
Looks great to me! Thanks for putting this together.
For cloud deployments we generally recommend using a hosted Kubernetes or Yarn
service, and then using Dask-Kubernetes or Dask-Yarn on top of these.

However in some institutions these hosted services aren't yet accessible, and
I would maybe reword this as something like "in some institutions they have made decisions or commitments to use certain vendor-specific technologies."
I think a lot of the use cases for dask-cloudprovider are going to be folks who have gone "all in" on a certain cloud provider or set of technologies. I've seen a few organisations do this, sometimes for a reduced price, sometimes to encourage a constrained palette of tools, etc.
found in HPC centers and Dask-Kubernetes. These now share a common codebase
along with Dask SSH, and so are much more consistent and hopefully bug free.

Hopefully users shouldn't notice much difference with existing workloads,
You've got two `hopefully`s four words apart here. May want to swap one for something else.
In some cases users may not have access to the cluster manager. For example
the institution may not give all of their data science users access to the Yarn
or Kubernetes cluster. In this the [Dask-Gateway](https://gateway.dask.org)
Suggested change:
- or Kubernetes cluster. In this the [Dask-Gateway](https://gateway.dask.org)
+ or Kubernetes cluster. In this case the [Dask-Gateway](https://gateway.dask.org)
1. One Dask-worker per GPU on a machine
2. Specify the `CUDA_VISIBLE_DEVICES` environment variable to pin that worker
   to that GPU
3. If your machine has multiple network interfaces then choose the network interface closest to that GPU
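A minimal sketch of steps 1 and 2, in plain Python with no Dask imports. The helper names `visible_devices_for_worker` and `worker_environments` are hypothetical and are not part of Dask or Dask-CUDA. The rotation scheme, where each worker lists its own GPU first and the remaining GPUs after it, is an assumption of this sketch; it relies only on the CUDA behaviour that the first entry in `CUDA_VISIBLE_DEVICES` becomes device 0 for that process.

```python
def visible_devices_for_worker(worker_index, n_gpus):
    """Build a CUDA_VISIBLE_DEVICES value whose first entry is this worker's GPU.

    Hypothetical helper: the remaining GPUs follow in rotated order so that
    every worker can still see all devices, but defaults to its own.
    """
    if not 0 <= worker_index < n_gpus:
        raise ValueError("worker_index must be in range(n_gpus)")
    order = list(range(worker_index, n_gpus)) + list(range(worker_index))
    return ",".join(str(i) for i in order)


def worker_environments(n_gpus):
    """Return one environment mapping per worker, one worker per GPU."""
    return [
        {"CUDA_VISIBLE_DEVICES": visible_devices_for_worker(i, n_gpus)}
        for i in range(n_gpus)
    ]


if __name__ == "__main__":
    # Each mapping would be merged into the environment of one worker process.
    for index, env in enumerate(worker_environments(4)):
        print(index, env["CUDA_VISIBLE_DEVICES"])
```

Step 3, choosing a network interface, depends on the machine's topology and is not covered by this sketch.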
Maybe we should define what "closest" means in this context, which is relative to the system's topology. Ideally, we would provide samples showing how users can find that information, but I think this may be out of scope for this blog.
Thanks for this @mrocklin, sorry I did not notice it among all the other notifications 🙂.
Made a few suggestions, take or leave, this is already very nice!
the institution may not give all of their data science users access to the Yarn
or Kubernetes cluster. In this case the [Dask-Gateway](https://gateway.dask.org)
project may be useful.
It can launch and manage Dask jobs,
Dask jobs or Dask clusters?
Co-Authored-By: Guillaume Eynard-Bontemps <g.eynard.bontemps@gmail.com>
Thanks all. This is in.
So far this is pretty rough. Any help, including direct edits, would be welcome.
cc @lesteve @guillaumeeb @jhamman @andersy005 @jacobtomlinson @jcrist @pentschev