[Core] report cluster config to gcs #63012

Open
wxwmd wants to merge 1 commit into ray-project:master from wxwmd:report_cluster_config_in_autoscaler_v2

Contributor

@wxwmd wxwmd commented Apr 29, 2026

I am using Ray with autoscaler v2 and found that my program had issues when scaling up (related PR: #62984). After investigating, I found the cause: the cluster_config retrieved from GCS is None:

cluster_config = ray._private.state.state.get_cluster_config()

So I looked into it and found that this function is responsible for writing the cluster_config to GCS:

def report_cluster_config(

But this function was never actually called.

Therefore, in this PR, I modified autoscaler v2 to make it write the cluster_config to GCS.
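
To illustrate the failure mode described above, here is a minimal sketch that does not use Ray at all. `FakeGcsStore` is a hypothetical in-memory stand-in for wherever GCS keeps the cluster config; only the method names `report_cluster_config` and `get_cluster_config` are borrowed from the description, and their real signatures may differ. The point is simply that a getter returns None until something actually calls the writer.

```python
from typing import Optional


class FakeGcsStore:
    """Hypothetical in-memory stand-in for the GCS-side cluster config storage."""

    def __init__(self) -> None:
        # Nothing has been reported yet, so the stored config starts as None.
        self._cluster_config: Optional[dict] = None

    def report_cluster_config(self, config: dict) -> None:
        # Store a shallow copy so later changes by the caller do not leak in.
        self._cluster_config = dict(config)

    def get_cluster_config(self) -> Optional[dict]:
        return self._cluster_config


store = FakeGcsStore()

# Reader runs before any writer: this mirrors the None seen in the report.
before = store.get_cluster_config()

# Once the writer is actually invoked, readers see the config.
store.report_cluster_config({"max_workers": 4})  # illustrative value only
after = store.get_cluster_config()
```

Until the writer is wired into the autoscaler, a consumer would presumably need to treat None as "config not reported yet" rather than as an empty config.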

@wxwmd wxwmd requested a review from a team as a code owner April 29, 2026 03:18
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements reporting of the cluster configuration to the GCS in Ray autoscaler v2, allowing downstream consumers to access the latest node group configurations. The changes include a new conversion utility that transforms internal autoscaling configurations into protobuf format, and a reporting mechanism triggered during initialization and state updates. Review feedback highlights a critical issue: fractional resources lose precision because of integer casting. It also suggests caching the last reported configuration so that unchanged configurations do not cause redundant GCS writes and log flooding.
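
The two review points above can be sketched in plain Python. This is not the PR's code: `lossy_resources`, `exact_resources`, and `CachedConfigReporter` are hypothetical names, the config is modeled as a plain dict rather than a protobuf message, and `write_fn` stands in for whatever call actually writes to GCS.

```python
import copy
from typing import Callable, Dict, Optional


def lossy_resources(resources: Dict[str, float]) -> Dict[str, int]:
    # Integer casting truncates toward zero, so a 0.5 GPU request becomes 0.
    return {name: int(amount) for name, amount in resources.items()}


def exact_resources(resources: Dict[str, float]) -> Dict[str, float]:
    # Keeping floats preserves fractional amounts.
    return {name: float(amount) for name, amount in resources.items()}


class CachedConfigReporter:
    """Skips the write when the config equals the last one reported."""

    def __init__(self, write_fn: Callable[[dict], None]) -> None:
        self._write_fn = write_fn  # hypothetical stand-in for the GCS write
        self._last_reported: Optional[dict] = None

    def report(self, config: dict) -> bool:
        """Return True if a write happened, False if it was skipped."""
        if config == self._last_reported:
            return False
        self._write_fn(config)
        # Deep-copy so later in-place edits by the caller cannot silently
        # make the cache look equal to a config that was never written.
        self._last_reported = copy.deepcopy(config)
        return True
```

A caller on a periodic update loop could invoke `report` on every iteration and log only when it returns True, which would address the log-flooding concern under these assumptions.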

Comment thread python/ray/autoscaler/v2/autoscaler.py
Comment thread python/ray/autoscaler/v2/autoscaler.py
Comment thread python/ray/autoscaler/v2/autoscaler.py
@wxwmd wxwmd changed the title from "[core] to #81611970 {report cluster config to gcs}" to "[core] report cluster config to gcs" Apr 29, 2026

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit de3c64a.

Comment thread python/ray/autoscaler/v2/tests/test_report_cluster_config.py
@wxwmd wxwmd force-pushed the report_cluster_config_in_autoscaler_v2 branch from 1b368db to a23973b Compare April 29, 2026 06:15
@wxwmd wxwmd changed the title from "[core] report cluster config to gcs" to "[Core] report cluster config to gcs" Apr 29, 2026
@ray-gardener ray-gardener Bot added the core (Issues that should be addressed in Ray Core), data (Ray Data-related issues), and community-contribution (Contributed by the community) labels Apr 29, 2026
Signed-off-by: xiaowen.wxw <wxw403883@alibaba-inc.com>
@wxwmd wxwmd force-pushed the report_cluster_config_in_autoscaler_v2 branch from a23973b to 3134150 Compare May 2, 2026 06:06