[Autoscaler] Add Support for BatchingNodeProvider in Autoscaler Config Option #51514

@nadongjun

Description

KubeRay currently uses the BatchingNodeProvider to let an external system (the KubeRay operator) manage clusters, which enables users to interact with external cluster management systems. However, to use the BatchingNodeProvider with a custom provider, users must implement their own module and plug it in through the external provider type, which is inconvenient.

On the other hand, LocalNodeProvider offers the CoordinatorSenderNodeProvider to manage clusters externally through a coordinator server, but the local provider type currently does not support cluster updates.

To simplify custom cluster management, adding the BatchingNodeProvider and a BatchingSenderNodeProvider as a built-in provider type would be highly beneficial. It would help users who want to manage clusters with their own providers, for example in on-premises or multi-cloud environments.

For example, the following configuration could be used to select the BatchingNodeProvider through the provider type:

provider:
    type: batch
    coordinator_address: "127.0.0.1:8000"

This would allow users to easily configure external cluster management with the BatchingNodeProvider, enhancing the flexibility and usability of the system.
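To make the proposal more concrete, here is a minimal sketch of what a batching sender might transmit to the coordinator. Everything in it is an assumption for illustration: the `ScaleRequest` dataclass is a stand-in and not Ray's class, the message type `submit_scale_request` and the JSON layout are hypothetical, and `parse_coordinator_address` is a helper invented for this example. The idea is only that all pending changes are collected into one request per autoscaler update rather than sent node by node.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class ScaleRequest:
    """Hypothetical stand-in for one batched scaling update."""

    # Desired worker count per node type, e.g. {"worker": 3}.
    desired_num_workers: Dict[str, int] = field(default_factory=dict)
    # IDs of nodes the coordinator should terminate.
    workers_to_delete: Set[str] = field(default_factory=set)


def build_coordinator_payload(request: ScaleRequest) -> bytes:
    """Serialize one batched scale request as JSON for the coordinator."""
    body = {
        "type": "submit_scale_request",  # hypothetical message type
        "args": {
            "desired_num_workers": request.desired_num_workers,
            # Sets are not JSON-serializable, so send a sorted list.
            "workers_to_delete": sorted(request.workers_to_delete),
        },
    }
    return json.dumps(body, sort_keys=True).encode("utf-8")


def parse_coordinator_address(address: str) -> Tuple[str, int]:
    """Split a "host:port" string such as the coordinator_address value."""
    host, _, port = address.rpartition(":")
    return host, int(port)
```

A real provider would POST this payload to the address given in `coordinator_address`; the transport is left out here because the issue does not specify it.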

Use case

_NODE_PROVIDERS = {
    "local": _import_local,
    "fake_multinode": _import_fake_multinode,
    "fake_multinode_docker": _import_fake_multinode_docker,
    "readonly": _import_readonly,
    "aws": _import_aws,
    "gcp": _import_gcp,
    "vsphere": _import_vsphere,
    "azure": _import_azure,
    "kuberay": _import_kuberay,
    "aliyun": _import_aliyun,
    "external": _import_external,  # Import an external module
    "spark": _import_spark,
}

If the 'batch' type is additionally supported in the provider configuration, users will be able to manage the creation and deletion of cluster nodes externally through a coordinator server.
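As a rough illustration of the requested change, the sketch below registers a 'batch' entry in a provider registry and resolves it from a config dict. It is self-contained and does not import Ray: `_import_batch`, `get_provider_cls`, and the `BatchingSenderNodeProvider` placeholder class are hypothetical names chosen for this example, and the registry here holds only the new entry rather than the full mapping shown above.

```python
from typing import Any, Callable, Dict


class BatchingSenderNodeProvider:
    """Hypothetical placeholder for the provider proposed in this issue."""

    def __init__(self, provider_config: Dict[str, Any], cluster_name: str):
        self.coordinator_address = provider_config["coordinator_address"]
        self.cluster_name = cluster_name


def _import_batch(provider_config: Dict[str, Any]) -> type:
    # Hypothetical importer: a real one would lazily import the provider
    # module; this sketch just returns the placeholder class.
    return BatchingSenderNodeProvider


# Only the proposed entry is shown; the existing entries are omitted.
_NODE_PROVIDERS: Dict[str, Callable[[Dict[str, Any]], type]] = {
    "batch": _import_batch,
}


def get_provider_cls(provider_config: Dict[str, Any]) -> type:
    """Look up the provider class for the configured provider type."""
    importer = _NODE_PROVIDERS.get(provider_config["type"])
    if importer is None:
        raise NotImplementedError(
            "Unsupported node provider: {}".format(provider_config["type"])
        )
    return importer(provider_config)


config = {"type": "batch", "coordinator_address": "127.0.0.1:8000"}
provider = get_provider_cls(config)(config, cluster_name="demo")
```

With an entry like this, the `type: batch` configuration shown earlier would resolve to the batching provider without requiring users to write and reference an external module.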

Metadata

Assignees

No one assigned

    Labels

    P2: Important issue, but not time-critical
    community-backlog
    core: Issues that should be addressed in Ray Core
    enhancement: Request for new feature and/or capability