[Data] Track peak object store memory usage in benchmarks by yuhuan130 · Pull Request #63418 · ray-project/ray

yuhuan130 · 2026-05-18T04:17:42Z

Why are these changes needed?

Ray Data benchmarks already report runtime and object store spilling, but they do not show how much object store memory was used at peak during a benchmark run.

This metric is useful for backpressure tuning. We want to know not only whether a run spilled or got slower, but also how much object store memory it used.

What changes were made?

This PR adds a lightweight sampler to the benchmark utility.

While each benchmark case is running, the sampler periodically reads aggregate object store memory stats and keeps the highest value it sees.
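A minimal sketch of the sampling idea described above, not the PR's actual implementation. `PeakSampler`, `read_value`, and `interval_s` are hypothetical names; the reader is injected as a plain callable so the sketch runs without a Ray cluster, whereas the real sampler reads aggregate object store stats.

```python
import threading
from typing import Callable


class PeakSampler:
    """Hypothetical sketch: poll a reader in the background, keep the max."""

    def __init__(self, read_value: Callable[[], float], interval_s: float = 0.05):
        self._read_value = read_value      # assumed: returns current usage in bytes
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.peak = 0.0

    def _sample_once(self) -> None:
        # Keep the highest reading observed so far.
        self.peak = max(self.peak, self._read_value())

    def _run(self) -> None:
        # Event.wait acts as an interruptible sleep: it returns True once
        # stop() sets the event, which ends the loop.
        while not self._stop.wait(self._interval_s):
            self._sample_once()

    def start(self) -> None:
        self._sample_once()                # reading taken before the run starts
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()
        self._sample_once()                # final reading after the run ends
```

Because the peak is only as accurate as the polling interval, a spike shorter than `interval_s` can be missed; the interval trades overhead against resolution.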

This adds two new benchmark metrics:

object_store_memory_used_peak_gb
object_store_memory_utilization_peak
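As a rough illustration of how the two values might be derived from raw byte counts: the rounding expression matches the helper visible in the diff context of this PR, while the function names and the definition of utilization as used bytes divided by capacity are assumptions made for this sketch.

```python
def bytes_to_gb(b: float) -> float:
    # Same expression as the helper shown in the diff context: bytes -> GiB, 4 decimals.
    return round(b / (1024**3), 4)


def utilization(used_bytes: float, capacity_bytes: float) -> float:
    # Assumption: utilization is the fraction of object store capacity in use.
    if capacity_bytes <= 0:
        return 0.0
    return used_bytes / capacity_bytes


# Hypothetical readings: 64 MiB used out of a 100 MiB object store.
peak_used = 64 * 1024**2
capacity = 100 * 1024**2
metrics = {
    "object_store_memory_used_peak_gb": bytes_to_gb(peak_used),
    "object_store_memory_utilization_peak": utilization(peak_used, capacity),
}
```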

Why sample during the benchmark?

Spilled bytes are cumulative, so they can be measured with an end-minus-start delta.

Object store memory usage is different: it can go up during the benchmark and drop back down before the end. Sampling during the run lets us capture that peak.
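The difference between the two kinds of metric can be shown with made-up numbers. The series below are hypothetical readings, not benchmark output: a cumulative counter is summarized correctly by its end-minus-start delta, while a gauge that rises and falls needs its maximum.

```python
from typing import Sequence


def counter_delta(samples: Sequence[int]) -> int:
    # Valid for cumulative counters such as spilled bytes.
    return samples[-1] - samples[0]


def gauge_peak(samples: Sequence[int]) -> int:
    # Needed for gauges such as object store memory in use.
    return max(samples)


# Hypothetical readings (arbitrary units) taken at five points during a run.
spilled = [0, 0, 50, 120, 120]   # only ever grows
used = [10, 80, 95, 40, 12]      # rises, then drops back before the end

spilled_total = counter_delta(spilled)   # captures all spilling during the run
used_delta = counter_delta(used)         # near zero: hides the spike entirely
used_peak = gauge_peak(used)             # captures the high-water mark
```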

Test

Smoke test with a 100 MB object store allocation:

Closes #63417

gemini-code-assist

Code Review

This pull request introduces an ObjectStoreMemorySampler to track peak object store usage and utilization during benchmarks using a background thread. It also adds new metrics to the BenchmarkMetric enum and integrates the sampler into the run_fn execution flow. Reviewer feedback identifies performance improvements, specifically regarding the caching of GlobalState to avoid expensive re-initializations and the removal of a redundant gRPC call by reusing stats already collected by the sampler.

yuhuan130 · 2026-05-18T04:31:16Z

Will follow up and rebuild Ray locally for some additional testing once I’m off the plane 🙌

Signed-off-by: Alex Chien <alexchien130@gmail.com>

bveeramani · 2026-05-20T00:18:56Z

+def _get_object_store_stats(state):
+    """Get aggregate object store stats across the cluster."""
+    memory_info = get_memory_info_reply(state)
+    return memory_info.store_stats


This seems like a really thin abstraction (2 LOC). Given that this is only called in two places, I'm not sure it's worth abstracting.

bveeramani · 2026-05-20T00:19:41Z

    return round(b / (1024**3), 4)


+class ObjectStoreMemorySampler:


Make this a context manager like MemoryProfiler? More canonical than try-finally for cleanup.
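One way the suggestion could look, as a sketch only. `SamplerContext` and its injected `read_value` callable are hypothetical stand-ins, and nothing here reflects the actual `MemoryProfiler` or `ObjectStoreMemorySampler` code. The point is that `__exit__` runs even when the benchmark body raises, so the background thread is always stopped without an explicit try-finally at the call site.

```python
import threading
from typing import Callable


class SamplerContext:
    """Hypothetical sketch of a sampler usable in a `with` block."""

    def __init__(self, read_value: Callable[[], float], interval_s: float = 0.05):
        self._read_value = read_value
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.peak = 0.0

    def _run(self) -> None:
        while not self._stop.wait(self._interval_s):
            self.peak = max(self.peak, self._read_value())

    def __enter__(self) -> "SamplerContext":
        self.peak = max(self.peak, self._read_value())  # initial reading
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Cleanup runs on both normal exit and exceptions.
        self._stop.set()
        self._thread.join()
        return False  # do not swallow exceptions from the body


# Usage: the sampler is stopped when the block exits.
with SamplerContext(lambda: 5.0, interval_s=0.01) as sampler:
    pass  # the benchmark body would run here
```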

bveeramani · 2026-05-20T00:19:55Z

+        self.peak_used_bytes = 0
+        self.peak_utilization = 0.0


Nit: Make these properties to make it more explicit that they are part of the public interface?
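A sketch of what the nit might amount to, assuming the values are stored in private attributes and exposed read-only. `PeakStats` and `_record` are hypothetical names used only for illustration.

```python
class PeakStats:
    """Hypothetical sketch: peaks exposed as read-only properties."""

    def __init__(self) -> None:
        self._peak_used_bytes = 0
        self._peak_utilization = 0.0

    @property
    def peak_used_bytes(self) -> int:
        """Highest object store usage observed, in bytes."""
        return self._peak_used_bytes

    @property
    def peak_utilization(self) -> float:
        """Highest observed fraction of object store capacity in use."""
        return self._peak_utilization

    def _record(self, used_bytes: int, capacity_bytes: int) -> None:
        # Internal update path; callers only read the properties above.
        self._peak_used_bytes = max(self._peak_used_bytes, used_bytes)
        if capacity_bytes > 0:
            self._peak_utilization = max(
                self._peak_utilization, used_bytes / capacity_bytes
            )
```

With no setter defined, assigning to `peak_used_bytes` from outside raises `AttributeError`, which signals that the attribute is meant to be read, not written.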

gemini-code-assist Bot reviewed May 18, 2026

Comment thread release/nightly_tests/dataset/benchmark.py Outdated

Comment thread release/nightly_tests/dataset/benchmark.py Outdated

ray-gardener Bot added the data (Ray Data-related issues), observability (Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling), and community-contribution (Contributed by the community) labels May 18, 2026

feat: track peak object store memory in data benchmarks

761f092

Signed-off-by: Alex Chien <alexchien130@gmail.com>

yuhuan130 force-pushed the feat/add-object-store-high-watermark branch from 0986c2c to 761f092 Compare May 18, 2026 17:45

bveeramani self-assigned this May 19, 2026

yuhuan130 mentioned this pull request May 19, 2026

[Data] Add backpressure release benchmarks #63527

Open

bveeramani approved these changes May 20, 2026

refactor: clean up object store memory sampler API

b69f345

Signed-off-by: Alex Chien <alexchien130@gmail.com>

bveeramani enabled auto-merge (squash) May 20, 2026 22:03

github-actions Bot added the go (add ONLY when ready to merge, run all tests) label May 20, 2026

bveeramani merged commit 7cd206a into ray-project:master May 20, 2026
8 checks passed

bveeramani merged 2 commits into ray-project:master from yuhuan130:feat/add-object-store-high-watermark

2 participants
