feat: ReplayBuffer example (#441)
Conversation
Signed-off-by: Klaus Ma <klausm@nvidia.com>
Code Review
This pull request introduces a distributed reinforcement learning replay buffer example that leverages Flame's patch_object API for efficient data synchronization. It includes a collector service, a replay buffer implementation with custom deserialization, and updated gRPC configurations in both the Rust cache server and Python SDK to support larger message sizes. Review feedback identified potential security risks in setting unlimited message sizes, suggesting a 2GB cap instead to prevent resource exhaustion, and pointed out a mathematical inaccuracy in the average reward calculation within the example script.
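As a rough orientation, here is a minimal, framework-agnostic sketch of the collector → replay buffer flow the example revolves around. The class and method names (`ReplayBuffer`, `add`, `sample`, `state`) are illustrative assumptions and not Flame's `patch_object` API or the code in this PR.

```python
# Minimal replay buffer sketch (illustrative only, not Flame's API).
import random
from dataclasses import dataclass, field


@dataclass
class ReplayBuffer:
    capacity: int = 10_000
    items: list = field(default_factory=list)
    total_added: int = 0

    def add(self, transition: dict) -> None:
        # Drop the oldest transition once capacity is reached (FIFO eviction).
        if len(self.items) >= self.capacity:
            self.items.pop(0)
        self.items.append(transition)
        self.total_added += 1

    def sample(self, batch_size: int) -> list:
        # Uniform random sampling over whatever is currently stored.
        return random.sample(self.items, min(batch_size, len(self.items)))

    def state(self) -> dict:
        return {"size": len(self.items), "total_added": self.total_added}


# A collector pushes transitions gathered from the environment:
buffer = ReplayBuffer(capacity=1_000)
for step in range(5):
    buffer.add({"obs": step, "action": 0, "reward": 1.0})
print(buffer.state())  # {'size': 5, 'total_added': 5}
```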
```rust
.add_service(
    FlightServiceServer::new(server)
        .max_decoding_message_size(usize::MAX)
        .max_encoding_message_size(usize::MAX),
)
```
Setting the maximum message size to usize::MAX effectively disables all size limits. While necessary for large reinforcement learning buffers, using an unlimited value can expose the server to Out-Of-Memory (OOM) issues or denial-of-service attacks if malformed or excessively large packets are received. It is safer to use a very large but finite limit (e.g., 2GB).
Suggested change:

```diff
 .add_service(
     FlightServiceServer::new(server)
-        .max_decoding_message_size(usize::MAX)
-        .max_encoding_message_size(usize::MAX),
+        .max_decoding_message_size(2 * 1024 * 1024 * 1024)
+        .max_encoding_message_size(2 * 1024 * 1024 * 1024),
 )
```
```python
GRPC_OPTIONS = [
    ("grpc.max_send_message_length", -1),
    ("grpc.max_receive_message_length", -1),
]
```
Using -1 for gRPC message length options disables the limits entirely. Similar to the server-side change, it is recommended to use a large but bounded value to prevent potential stability issues or memory exhaustion.
Suggested change:

```diff
 GRPC_OPTIONS = [
-    ("grpc.max_send_message_length", -1),
-    ("grpc.max_receive_message_length", -1),
+    ("grpc.max_send_message_length", 2 * 1024 * 1024 * 1024),
+    ("grpc.max_receive_message_length", 2 * 1024 * 1024 * 1024),
 ]
```
```python
stats = buffer_svc.state().get()
total_size = stats["size"]
total_added = stats["total_added"]
avg_reward = sum(r["avg_reward"] for r in collect_results) / num_collections
```
This line calculates an average of averages, which can be mathematically incorrect if the number of episodes completed by each collector varies. A more accurate approach is to calculate a weighted average based on the total episodes seen by each collector.
| avg_reward = sum(r["avg_reward"] for r in collect_results) / num_collections | |
| total_episodes = sum(r["episode_count"] for r in collect_results) | |
| avg_reward = sum(r["avg_reward"] * r["episode_count"] for r in collect_results) / max(1, total_episodes) |
Codecov Report: ❌ Patch coverage is …
- Use weighted average for avg_reward calculation based on episode count
- Add codecov.yml to disable patch coverage check while keeping project coverage