The current implementation of `KubeCluster` requires users to specify a pod spec for their workers and scheduler; it provides some helper functions to do this and does some heavy customisation under the hood.
```python
from dask_kubernetes import KubeCluster, make_pod_spec

pod_spec = make_pod_spec(image="ghcr.io/dask/dask:latest")
cluster = KubeCluster(pod_template=pod_spec)
```

In the new operator-based implementation of `KubeCluster` we have exposed a few pod config options as kwargs, such as `image`, `env`, `resources`, etc.
```python
from dask_kubernetes.experimental import KubeCluster

cluster = KubeCluster(name="foo", image="ghcr.io/dask/dask:latest")
```

I like the simplicity of this, but it doesn't scale well, and we could easily end up with kwarg creep as folks casually contribute more and more options that they want to configure.
In #452 the new CRDs have been refactored to contain a pod spec for both the worker and scheduler and a service spec for the scheduler. This opens things up so that we could also allow users to pass pod and service specs to the new cluster manager.
```python
from dask_kubernetes import make_pod_spec
from dask_kubernetes.experimental import KubeCluster

pod_spec = make_pod_spec(image="ghcr.io/dask/dask:latest")
cluster = KubeCluster(name="foo", pod_template=pod_spec)
```

To do this we would probably need to align `make_pod_spec` from `common` with `KubeCluster._build_worker_spec` from `experimental` to ensure they return compatible specs.
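To illustrate what "compatible specs" could mean here, below is a rough sketch of a `make_pod_spec`-style helper that builds a worker pod spec as a plain dict in the shape of a standard Kubernetes pod template. The `make_worker_pod_spec` name, the field layout, and the default worker args are assumptions for this example, not the actual dask-kubernetes API.

```python
# Hypothetical sketch only: a helper returning a plain-dict pod spec that a
# worker template in the new CRDs could consume. Names and defaults here are
# illustrative assumptions, not the real dask-kubernetes implementation.


def make_worker_pod_spec(image, env=None, resources=None):
    """Build a minimal worker pod spec as a plain dict."""
    env = env or {}
    container = {
        "name": "worker",
        "image": image,
        "args": ["dask-worker"],  # assumed default entrypoint
        # Convert a {name: value} mapping into the Kubernetes env list form
        "env": [{"name": k, "value": v} for k, v in env.items()],
    }
    if resources:
        container["resources"] = resources
    return {"spec": {"containers": [container]}}


pod_spec = make_worker_pod_spec(
    image="ghcr.io/dask/dask:latest",
    env={"EXTRA_PIP_PACKAGES": "s3fs"},
)
```

The point of aligning the two helpers would be that a dict like this works unchanged whether it is handed to the classic `KubeCluster` or embedded in the new CRD.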
The Scheduler pod and service specs could also be supported.
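To make the scheduler-service side concrete, a minimal service spec might look like the plain dict below. Ports 8786 and 8787 are Dask's conventional scheduler-comm and dashboard ports; the selector label keys are illustrative assumptions, not necessarily the labels the operator applies.

```python
# Hypothetical sketch: a plain-dict Kubernetes Service spec that could be
# passed for the scheduler. Label keys below are assumed for illustration.
scheduler_service_spec = {
    "type": "ClusterIP",
    "selector": {
        "dask.org/cluster-name": "foo",  # assumed label key
        "dask.org/component": "scheduler",  # assumed label key
    },
    "ports": [
        # 8786 is Dask's conventional scheduler comm port,
        # 8787 the dashboard port
        {"name": "comm", "port": 8786, "targetPort": 8786, "protocol": "TCP"},
        {"name": "dashboard", "port": 8787, "targetPort": 8787, "protocol": "TCP"},
    ],
}
```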
The benefit of this is that it eases migration for folks transitioning from the classic `KubeCluster` to the new experimental one. It also makes it easy for us to say no to PRs that extend the supported kwargs and instead point folks to creating their own specs.