Description
KubeRay currently uses the BatchingNodeProvider to manage clusters externally (through the KubeRay operator), which lets users integrate with external cluster management systems. However, to use the BatchingNodeProvider with a custom provider, users must implement their own module and plug it in through the external provider type, which is inconvenient.
The LocalNodeProvider, on the other hand, offers the CoordinatorSenderNodeProvider to manage clusters externally through a coordinator server, but the local provider type currently does not support cluster updates.
To simplify custom cluster management, it would help to add the BatchingNodeProvider and a BatchingSenderNodeProvider as a provider type. This would benefit users who want to manage clusters with their own providers, for example in on-premises or multi-cloud environments.
For example, the following configuration could select the BatchingNodeProvider through the provider type:
    provider:
      type: batch
      coordinator_address: "127.0.0.1:8000"

This would allow users to easily configure external cluster management with the BatchingNodeProvider, enhancing the flexibility and usability of the system.
Use case
ray/python/ray/autoscaler/_private/providers.py
Lines 184 to 197 in 8773682
    _NODE_PROVIDERS = {
        "local": _import_local,
        "fake_multinode": _import_fake_multinode,
        "fake_multinode_docker": _import_fake_multinode_docker,
        "readonly": _import_readonly,
        "aws": _import_aws,
        "gcp": _import_gcp,
        "vsphere": _import_vsphere,
        "azure": _import_azure,
        "kuberay": _import_kuberay,
        "aliyun": _import_aliyun,
        "external": _import_external,  # Import an external module
        "spark": _import_spark,
    }
If the 'batch' type were also supported in the provider configuration, users would be able to manage the creation and deletion of cluster nodes externally through the coordinator server.
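As a rough illustration of the request, the sketch below shows how a 'batch' entry might be looked up in a registry shaped like `_NODE_PROVIDERS`. It is a standalone toy and does not import Ray: `HypotheticalBatchProvider`, `_import_batch`, `NODE_PROVIDERS`, and `get_provider_class` are names invented for this example, and it assumes each importer takes the provider config and returns a provider class.

```python
from typing import Any, Callable, Dict


class HypotheticalBatchProvider:
    """Stand-in for the proposed provider class; not part of Ray."""

    def __init__(self, provider_config: Dict[str, Any], cluster_name: str) -> None:
        # The coordinator address comes straight from the 'provider' config section.
        self.coordinator_address = provider_config["coordinator_address"]
        self.cluster_name = cluster_name


def _import_batch(provider_config: Dict[str, Any]) -> type:
    # Hypothetical importer, assumed to mirror the other _import_* entries:
    # it receives the provider config and returns a provider class.
    return HypotheticalBatchProvider


# Toy registry that copies only the shape of _NODE_PROVIDERS.
NODE_PROVIDERS: Dict[str, Callable[[Dict[str, Any]], type]] = {
    "batch": _import_batch,
}


def get_provider_class(provider_config: Dict[str, Any]) -> type:
    """Resolve the provider class from the 'type' key of the config."""
    importer = NODE_PROVIDERS.get(provider_config["type"])
    if importer is None:
        raise NotImplementedError(
            "Unsupported node provider: {}".format(provider_config["type"])
        )
    return importer(provider_config)


config = {"type": "batch", "coordinator_address": "127.0.0.1:8000"}
provider = get_provider_class(config)(config, "demo-cluster")
```

With such an entry in place, selecting the provider would only require `type: batch` in the configuration, instead of packaging a separate module for the 'external' type.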