Update cpu and memory requests after cluster creation #88
cc @yuvipanda . I suspect that he may have counter-examples where this
might be problematic? I'm not sure.
On Fri, Jul 20, 2018 at 11:42 AM, Jacob Tomlinson wrote:
On our Pangeo deployment users get a default worker-template.yaml which
allows them to create clusters by simply running cluster = KubeCluster()
without having to worry about what a kubernetes even is.
However, on some occasions people want to be able to update the memory and
cpu ratios of their workers depending on what they are running. The current
workflow for this is to either specify the whole template as a dict or to
copy the default worker-template.yaml, understand it, update the values
and then use KubeCluster.from_yaml().
Personally I ended up writing a couple of helper functions in my notebook
which look like this:
```python
def update_worker_memory(cluster, new_limit):
    cluster.pod_template.spec.containers[0].resources.limits["memory"] = new_limit
    cluster.pod_template.spec.containers[0].resources.requests["memory"] = new_limit
    if '--memory-limit' in cluster.pod_template.spec.containers[0].args:
        index = cluster.pod_template.spec.containers[0].args.index('--memory-limit')
        cluster.pod_template.spec.containers[0].args[index + 1] = new_limit
    return cluster


def update_worker_cpu(cluster, new_limit):
    cluster.pod_template.spec.containers[0].resources.limits["cpu"] = new_limit
    cluster.pod_template.spec.containers[0].resources.requests["cpu"] = new_limit
    if '--nthreads' in cluster.pod_template.spec.containers[0].args:
        index = cluster.pod_template.spec.containers[0].args.index('--nthreads')
        cluster.pod_template.spec.containers[0].args[index + 1] = new_limit
    return cluster
```
This allows me to adjust the worker template after the cluster has been
created and all new workers will follow the updated values.
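[Editor's note: the flag-rewriting step those helpers share can be tried in isolation. This is only a sketch of the logic, assuming the container args look like a typical dask-worker command line; `update_flag` is a made-up name, not part of dask-kubernetes.]

```python
# Illustrative sketch of the flag-rewriting logic used by the helpers above.
# `update_flag` is a hypothetical name, not a dask-kubernetes API.
def update_flag(args, flag, new_value):
    """Return a copy of `args` with the value following `flag` replaced."""
    args = list(args)
    if flag in args:
        args[args.index(flag) + 1] = new_value
    return args

# Example dask-worker style argument list
args = ["dask-worker", "--nthreads", "1", "--memory-limit", "6GB"]
args = update_flag(args, "--memory-limit", "8GB")
print(args)  # the value after --memory-limit is now "8GB"
```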
I'm considering how to add this functionality into the core project. I'm
inspired by the dask-jobqueue SLURMCluster
<http://dask-jobqueue.readthedocs.io/en/latest/generated/dask_jobqueue.SLURMCluster.html#dask_jobqueue.SLURMCluster>
which allows you to specify cores and memory as kwargs. Therefore perhaps
@mrocklin, @jhamman or @guillaumeeb have thoughts.
Before I go charging in to raise a PR I would like to discuss options.
- Would it be useful to add methods to the KubeCluster object to update sizes after creation as I am above?
- Should we add kwargs to the cluster init and if so should they create the cluster and use the helpers or update the config before creation?
- Are there any other ways of specifying memory and cpu that I haven't captured in the examples above?
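[Editor's note: for the kwargs route, one possible shape is to overlay the user-supplied values onto the default template before the cluster is created. A minimal sketch with plain dicts standing in for the real pod template objects; the name `apply_resource_kwargs` and the dict layout are illustrative, not the dask-kubernetes API.]

```python
# Hypothetical sketch: overlay resource kwargs onto a default worker
# template (plain dicts standing in for the real pod template objects).
def apply_resource_kwargs(template, memory=None, cores=None):
    container = template["spec"]["containers"][0]
    resources = container.setdefault("resources", {})
    for key, value in (("memory", memory), ("cpu", cores)):
        if value is not None:
            # Mirror the helpers above: keep limits and requests in sync.
            resources.setdefault("limits", {})[key] = value
            resources.setdefault("requests", {})[key] = value
    return template

# The image (and anything else unspecified) comes from the default template.
template = {"spec": {"containers": [{"image": "daskdev/dask:latest"}]}}
template = apply_resource_kwargs(template, memory="6G", cores=2)
```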
I'd be happy to help here, but I don't really know how to; I've never used Kubernetes. One thought: could this also affect the Cloud VM flavor? Are these hard coded in another place when defining the Kubernetes cluster? It would be good if, by specifying cpu, memory, and perhaps flavor, the automatically created VMs were adapted to match.
@guillaumeeb you currently specify the whole worker config when you initialise the cluster and can configure everything from resources to docker image. The admin of the cluster can also specify a default config to use if the user doesn't specify one. This issue is discussing how to make tweaks to resources in the default config on the fly without having to specify a whole new config.
@jacobtomlinson does Kubernetes choose by itself the machine type to use according to what is specified in the pod spec? Looking at http://dask-kubernetes.readthedocs.io/en/latest/#quickstart, I don't see anything like n1-standard-2 or another flavor. Sorry if this is a dumb question, I still need to learn how Kubernetes works.
That kind of thing is down to kubernetes to decide; you don't really need
to worry about it. The way you set up your kubernetes cluster will decide
which instance types are used, and kubernetes will try to pack pods as
efficiently as possible into the available space.
For example, we used kops to create a kubernetes cluster on AWS which is
made up of a mix of m5.2xlarge and m5.4xlarge instances. But when using
dask-kubernetes we ask for pods with 1 cpu and 6GB of memory and they get
packed into those instances.
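[Editor's note: as a back-of-envelope illustration of that packing, using the published m5.2xlarge figures of 8 vCPU and 32 GiB, and ignoring the CPU and memory that Kubernetes reserves for system pods.]

```python
# Rough bin-packing arithmetic for the example above.
# m5.2xlarge: 8 vCPU, 32 GiB; each worker pod asks for 1 CPU and 6 GB.
node_cpu, node_mem_gb = 8, 32
pod_cpu, pod_mem_gb = 1, 6

# The scheduler can only fit pods while both resources have room.
pods_per_node = min(node_cpu // pod_cpu, node_mem_gb // pod_mem_gb)
print(pods_per_node)  # memory is the binding constraint here: 5 pods fit
```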
So looking at core.py, currently the easiest way to do what you say is to use the following method:

```python
pod_spec = make_pod_spec(image='daskdev/dask:latest',
                         memory_limit='4G', memory_request='4G',
                         cpu_limit=1, cpu_request=1,
                         env={'EXTRA_PIP_PACKAGES': 'fastparquet git+https://github.com/dask/distributed'})
```

Which implies that you understand all the rest. Again I've only a limited view of all this, but it sounds like it would be a welcome functionality here. The way I would do that is to add your update methods or an equivalent, and call that in the cluster init. Updating after creation looks like an edge case to me; this is not what we are doing in dask-jobqueue. But I can understand that in some situations you might want to do this.
Yes, I agree that updating it after creation is an edge case. Perhaps a better way would be to allow users to call `make_pod_spec` with only the options they want to override. E.g.

```python
pod_spec = make_pod_spec(memory_limit='4G', memory_request='4G',
                         cpu_limit=1, cpu_request=1)
```

In this example the image and extra packages would come from the default. There is no reason why a scientist should even know what a docker image is.
Not sure it's such an edge case. It's possible to think about running a single dask scheduler for multiple tasks, each of which may have a slightly different set of arguments/variables.
@mturok interesting point. However the example I put above would modify the scheduler, which would not be great for multi-use clusters.
As long as the pods get created with matching values, there should be no pathological cases.
Quick question that may be related to this issue.
The main use case is to have some instances with GPUs and others without, and use them as needed. I guess this would require another scheduling layer to fit the work in the smallest pool possible, plus some logic for when a pool is maxed out but the other one is not.
That is an interesting idea but would involve upstream changes in dask and distributed. For now I would just create multiple clusters. @mrocklin I'm sure you are very busy at the moment, but do you have any thoughts on this?
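[Editor's note: a sketch of the multiple-clusters workaround mentioned here — keep a second worker template that requests a GPU and build a separate cluster from it with KubeCluster.from_yaml. The file name, image, and GPU count below are illustrative; nvidia.com/gpu is the standard Kubernetes resource name for NVIDIA devices.]

```yaml
# gpu-worker-template.yaml (illustrative) -- a second template requesting a
# GPU alongside cpu/memory, for use with
# KubeCluster.from_yaml('gpu-worker-template.yaml')
kind: Pod
spec:
  restartPolicy: Never
  containers:
    - name: dask-worker
      image: daskdev/dask:latest   # illustrative image
      args: [dask-worker, --nthreads, '1', --memory-limit, 6GB]
      resources:
        limits:
          cpu: "1"
          memory: 6G
          nvidia.com/gpu: 1        # standard k8s resource name for NVIDIA GPUs
        requests:
          cpu: "1"
          memory: 6G
          nvidia.com/gpu: 1
```

Each cluster would then scale its own pool independently, at the cost of running a scheduler per pool.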
We are thinking about this in dask/distributed#2118 and dask/distributed#2208 (comment). But we haven't talked about the adaptive part yet ("scale them according to what the client requests"), which would probably need modifications too.
I think that this seems like a reasonable request. I think that it would require additional logic to the …
As work is ongoing in the issues that @guillaumeeb mentioned, and much of the scaling logic in dask-kubernetes has been replaced with …