[Autoscaler] Add idleTerminationSeconds for cluster-level idle termination#63465
Draft
win5923 wants to merge 1 commit into
…ation Signed-off-by: win5923 <ken89@kimo.com>
cad0af4 to
3298172
Compare
Description
Terminate an idle cluster when the cluster runs with the autoscaler.
When autoscalerOptions.idleTerminationSeconds is set, the V2 autoscaler evaluates a cluster-level idle predicate on every reconcile loop and, when it fires, patches a single annotation on the RayCluster CR.
The KubeRay operator observes the condition and decides the terminal action (deleting the RayCluster).
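The flow described above (evaluate the predicate once per reconcile loop, patch one annotation when it fires) could be sketched roughly as follows. This is only an illustration, not the code in this PR: the annotation key and value are hypothetical placeholders because the real ones are not shown in this description, and `reconcile_once`, `build_annotation_patch`, and the `apply_patch` callback are made-up names. The only Kubernetes fact relied on is that a JSON merge patch sets annotations under `metadata.annotations`.

```python
from typing import Callable, Dict, Optional

# Hypothetical placeholders: the real annotation key and value used by the
# PR are not visible in this description.
ANNOTATION_KEY = "example.com/cluster-idle"
ANNOTATION_VALUE = "true"


def build_annotation_patch(key: str, value: str) -> Dict:
    """Build a JSON merge patch body that sets a single metadata annotation."""
    return {"metadata": {"annotations": {key: value}}}


def reconcile_once(
    is_cluster_idle: Callable[[], bool],
    apply_patch: Callable[[Dict], None],
) -> Optional[Dict]:
    """Run one reconcile step.

    Evaluates the idle predicate and, only if it fires, hands a single
    annotation patch to `apply_patch` (a stand-in for the call that patches
    the RayCluster CR). Returns the patch body, or None if nothing was sent.
    """
    if not is_cluster_idle():
        return None
    patch = build_annotation_patch(ANNOTATION_KEY, ANNOTATION_VALUE)
    apply_patch(patch)
    return patch
```

Keeping the autoscaler's side to a single patch mirrors the split stated above: the autoscaler only signals, and the operator decides what to do with the signal.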
Changes
This PR adds autoscalerOptions.idleTerminationSeconds and a four-layer predicate that decides when the cluster is truly idle. The autoscaler emits an annotation; KubeRay owns the lifecycle action.
1. New autoscalerOptions.idleTerminationSeconds field, V2 + KubeRay only
Background: per-node idleTimeoutSeconds only scales worker pods.
This PR introduces the field with the following semantics:
- spec.autoscalerOptions.idleTerminationSeconds.
- Must be strictly greater than idleTimeoutSeconds (default 60 when unset). The strict > avoids the race where worker scale-down and cluster termination fire on the same reconcile loop, and keeps the predicate's Gate 0 well-behaved.
- … KubeRayProvider) ensure no leakage.
2. Four-layer idle predicate in the V2 scheduler + reconciler
Background: a naive min(idle_duration_ms across alive nodes) > threshold is unsound. Drivers register through WorkerPool::RegisterDriver and never enter leased_workers_, so a pure-Python driver on the head keeps idle_duration_ms growing while status is IDLE. Per-node and cluster-level idle also race during scale-down, and the scheduler does not see pending demand placed mid-reconcile.
The predicate composes four layers, each addressing a distinct failure mode:
- … minReplicas. Defers the cluster predicate until per-node idle termination has finished its work.
- idle_duration_ms must exceed idleTerminationSeconds. Aligned with the existing _enforce_idle_termination definition of "alive" = SCHEDULABLE.
- request.resource_requests, gang_resource_requests, and cluster_resource_constraints must all be empty. Catches the moment between "user submitted task" and "worker assigned".
Related issues
Closes #63452
Additional information
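As a rough illustration of the validation rule and the idle checks listed under Changes, here is a minimal Python sketch. It is not the code in this PR. `NodeView`, `RequestView`, `validate_idle_termination_seconds`, and `cluster_is_idle` are hypothetical names, and the minReplicas check is an assumed reading (every worker group already at or below its minReplicas) of a gate whose full wording is not visible in the description. Only the three checks that can be read from the description are modeled, and treating a cluster with no alive nodes as not idle is a conservative assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Default for idleTimeoutSeconds when unset, as stated in the description.
DEFAULT_IDLE_TIMEOUT_SECONDS = 60


def validate_idle_termination_seconds(
    idle_termination_seconds: int,
    idle_timeout_seconds: Optional[int] = None,
) -> int:
    """Require idleTerminationSeconds to be strictly greater than idleTimeoutSeconds."""
    effective = (
        DEFAULT_IDLE_TIMEOUT_SECONDS
        if idle_timeout_seconds is None
        else idle_timeout_seconds
    )
    if idle_termination_seconds <= effective:
        raise ValueError(
            f"idleTerminationSeconds ({idle_termination_seconds}) must be "
            f"strictly greater than idleTimeoutSeconds ({effective})"
        )
    return idle_termination_seconds


@dataclass
class NodeView:
    """Hypothetical stand-in for a scheduler node record."""
    scheduling_status: str
    idle_duration_ms: int


@dataclass
class RequestView:
    """Hypothetical stand-in for the scheduling request; field names follow the description."""
    resource_requests: List = field(default_factory=list)
    gang_resource_requests: List = field(default_factory=list)
    cluster_resource_constraints: List = field(default_factory=list)


def cluster_is_idle(
    nodes: List[NodeView],
    request: RequestView,
    worker_counts: Dict[str, int],
    min_replicas: Dict[str, int],
    idle_termination_seconds: int,
) -> bool:
    # Assumed minReplicas gate: wait until per-node idle termination has
    # scaled every worker group down to its minReplicas.
    if any(n > min_replicas.get(group, 0) for group, n in worker_counts.items()):
        return False

    # Every alive (SCHEDULABLE) node must have been idle longer than the threshold.
    alive = [n for n in nodes if n.scheduling_status == "SCHEDULABLE"]
    threshold_ms = idle_termination_seconds * 1000
    if not alive or any(n.idle_duration_ms <= threshold_ms for n in alive):
        return False

    # No pending demand of any kind may be present in the request.
    if (
        request.resource_requests
        or request.gang_resource_requests
        or request.cluster_resource_constraints
    ):
        return False
    return True
```

With the strict comparison in the validator, a per-node timeout always elapses before the cluster-level threshold can, which is the ordering the minReplicas gate relies on.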