[ISSUE #4539] ✨ Add performance optimizations and new benchmarks for ScheduleMessageService #4540

Merged
rocketmq-rust-bot merged 1 commit into main from enh-4539 on Dec 9, 2025

Conversation

Owner

@mxsm mxsm commented Dec 9, 2025

Which Issue(s) This PR Fixes (Closes)

Fixes #4539

Brief Description

How Did You Test This Change?

Summary by CodeRabbit

  • Tests

    • Added comprehensive unit test coverage for scheduled message service components.
  • Chores

    • Introduced performance benchmark configurations and suite for message scheduling operations.
    • Optimized scheduled message delivery with batch processing and backpressure mechanisms to improve throughput under load.


Copilot AI review requested due to automatic review settings December 9, 2025 07:55
Contributor

coderabbitai bot commented Dec 9, 2025

Walkthrough

This change introduces performance benchmarks and optimizations for ScheduleMessageService, including pre-allocation constants, batch processing capabilities, backpressure mechanisms, and refactored task handle management from Arc<Mutex> to Option. A new Criterion-based benchmark module measures VecDeque allocation, lock contention, and batch processing performance.

Changes

  • Benchmark Configuration (rocketmq-broker/Cargo.toml): Adds two bench entries with harness = false: one for the existing subscription_group_manager_benchmark and one for the new schedule_message_service_performance benchmark.
  • Performance Benchmarks (rocketmq-broker/benches/schedule_message_service_performance.rs): New Criterion benchmark module testing VecDeque pre-allocation, Mutex vs RwLock read/write performance, batch vs single-item processing, and atomic operations. Includes Tokio runtime setup and multiple benchmark groups with helper processing functions.
  • Service Optimizations (rocketmq-broker/src/schedule/schedule_message_service.rs): Introduces performance constants (MAX_BATCH_SIZE, INITIAL_QUEUE_CAPACITY, MAX_PENDING_QUEUE_SIZE), refactors task_handles from Arc<Mutex<Vec>> to Option<Vec>, modifies start() to accept a mutable ArcMut<Self>, changes shutdown() to take a mutable borrow, implements pre-allocation and backpressure logic, and adds comprehensive test coverage.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45–60 minutes

Areas requiring extra attention:

  • API signature changes: ScheduleMessageService::start() and shutdown() now take mutable self; verify all call sites are updated and ownership/lifetime implications are correct
  • task_handles refactoring: Change from Arc<Mutex<Vec<JoinHandle>>> to Option<Vec<JoinHandle>> alters synchronization strategy; confirm thread-safety and initialization logic in new(), start(), and shutdown()
  • Backpressure and pre-allocation logic: New constants and batch processing thresholds in async_deliver and queue management; validate capacity calculations and batch size limits under various load scenarios
  • Benchmark validation: Ensure benchmark scenarios (lock contention, batch sizes, queue allocations) are representative and the Criterion setup correctly isolates performance characteristics

Poem

🐰 Optimized queues now leap and bound,
Pre-allocated, no slowdown found!
Benchmarks measure every race,
Batch by batch, at faster pace. ✨🚀

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and specifically identifies the main changes: performance optimizations and new benchmarks for ScheduleMessageService, directly matching the changeset.
  • Linked Issues Check: ✅ Passed. The pull request implements both primary objectives from issue #4539: it introduces performance optimizations (constants, pre-allocation, backpressure) and adds new benchmarks (schedule_message_service_performance.rs).
  • Out of Scope Changes Check: ✅ Passed. All code changes are directly aligned with the performance optimization and benchmarking objectives. The Cargo.toml modifications enable benchmarks, the new benchmark file implements performance testing, and the service changes are performance-focused optimizations.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which exceeds the required threshold of 80.00%.


@rocketmq-rust-bot
Collaborator

🔊 @mxsm 🚀 Thanks for your contribution 🎉!

💡 CodeRabbit (AI) will review your code first 🔥!

Note

🚨The code review suggestions from CodeRabbit are to be used as a reference only, and the PR submitter can decide whether to make changes based on their own judgment. Ultimately, the project management personnel will conduct the final code review💥.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
rocketmq-broker/src/schedule/schedule_message_service.rs (1)

1130-1147: Bug: Backpressure check is performed after message delivery, causing delivered messages to be lost.

The new MAX_PENDING_QUEUE_SIZE check at lines 1136-1144 is performed after deliver_message() has already been called (lines 1131-1133). When backpressure triggers and returns Ok(false), the message has been delivered to the store but the result_process is not added to processes_queue. This means:

  1. The message is delivered but the result is not tracked
  2. The offset won't be updated for that message
  3. On service restart, the message may be re-delivered

The backpressure check should be performed before calling deliver_message(), similar to the existing flow control check.

+        // Performance optimization: Check queue size before delivering
+        if processes_queue.len() >= MAX_PENDING_QUEUE_SIZE {
+            warn!(
+                "Pending queue for delay level {} is full (size={}), applying backpressure",
+                self.delay_level,
+                processes_queue.len()
+            );
+            // Return false to signal backpressure
+            return Ok(false);
+        }
+
         // Deliver the message
         let result_process = self
             .deliver_message(msg_inner, msg_id, offset, offset_py, size_py, true)
             .await?;

-        // Performance optimization: Check queue size before adding
-        if processes_queue.len() >= MAX_PENDING_QUEUE_SIZE {
-            warn!(
-                "Pending queue for delay level {} is full (size={}), applying backpressure",
-                self.delay_level,
-                processes_queue.len()
-            );
-            // Return false to signal backpressure
-            return Ok(false);
-        }
-
         // Add to pending queue
         processes_queue.push_back(result_process);
🧹 Nitpick comments (3)
rocketmq-broker/benches/schedule_message_service_performance.rs (2)

77-105: Consider adding concurrent reader benchmarks for a more meaningful RwLock comparison.

The current benchmark tests sequential reads, but RwLock's advantage over Mutex is primarily in concurrent read scenarios where multiple readers can acquire the lock simultaneously. The current sequential benchmark may not show significant differences between the two.

Consider adding a concurrent variant:

+    group.bench_function("rwlock_concurrent_read", |b| {
+        b.iter(|| {
+            rt.block_on(async {
+                let handles: Vec<_> = (0..4)
+                    .map(|_| {
+                        let data = Arc::clone(&rwlock_data);
+                        tokio::spawn(async move {
+                            let guard = data.read().await;
+                            black_box(guard.len())
+                        })
+                    })
+                    .collect();
+                for h in handles {
+                    let _ = h.await;
+                }
+            })
+        });
+    });

112-134: Unbounded vector growth may skew benchmark results.

The mutex_data and rwlock_data vectors grow continuously across all benchmark iterations without being reset. This causes:

  1. Increasing memory usage throughout the benchmark
  2. Periodic Vec reallocations that add noise
  3. Measurements that change based on iteration count
     group.bench_function("mutex_sequential_write", |b| {
+        rt.block_on(async { mutex_data.lock().await.clear() });
         b.iter(|| {
             rt.block_on(async {
                 let mut data = mutex_data.lock().await;
                 data.push(black_box(42));
+                if data.len() > 1000 {
+                    data.clear();
+                }
             })
         });
     });

     group.bench_function("rwlock_sequential_write", |b| {
+        rt.block_on(async { rwlock_data.write().await.clear() });
         b.iter(|| {
             rt.block_on(async {
                 let mut data = rwlock_data.write().await;
                 data.push(black_box(42));
+                if data.len() > 1000 {
+                    data.clear();
+                }
             })
         });
     });
rocketmq-broker/src/schedule/schedule_message_service.rs (1)

258-259: Minor: Capacity calculation may over-allocate when async delivery is disabled.

The capacity len * 2 + 1 assumes both delivery and async handler tasks per level, but async handler tasks are only spawned when enable_async_deliver is true. This is a minor inefficiency.

-            // Pre-allocate task_handles vector for all delay levels + persist task
-            let mut task_handles = Vec::with_capacity(this.delay_level_table.len() * 2 + 1);
+            // Pre-allocate task_handles vector for all delay levels + persist task
+            let multiplier = if this.enable_async_deliver { 2 } else { 1 };
+            let mut task_handles = Vec::with_capacity(this.delay_level_table.len() * multiplier + 1);
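The suggested calculation can be pulled out into a small helper; the function below is hypothetical, written only to show the arithmetic (one delivery task per delay level, an async handler task per level only when async delivery is enabled, plus one persist task):

```rust
/// Hypothetical helper mirroring the suggested capacity calculation.
fn task_handle_capacity(delay_levels: usize, enable_async_deliver: bool) -> usize {
    // Two tasks per level when async delivery is on, otherwise one,
    // plus a single persist task.
    let multiplier = if enable_async_deliver { 2 } else { 1 };
    delay_levels * multiplier + 1
}
```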
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 098a09a and 23f0203.

📒 Files selected for processing (3)
  • rocketmq-broker/Cargo.toml (1 hunks)
  • rocketmq-broker/benches/schedule_message_service_performance.rs (1 hunks)
  • rocketmq-broker/src/schedule/schedule_message_service.rs (11 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Agent
  • GitHub Check: Check (fmt + clippy)
  • GitHub Check: auto-approve
🔇 Additional comments (8)
rocketmq-broker/Cargo.toml (1)

77-83: LGTM!

The benchmark configuration is correct. Using harness = false is required for Criterion-based benchmarks, and the naming follows the existing convention in the project.

rocketmq-broker/benches/schedule_message_service_performance.rs (1)

171-211: LGTM!

The atomic operations benchmark and helper functions are well-structured. The benchmark covers the key atomic operations used in the service (load, store, fetch_add), and the helper functions provide appropriate workload simulation for batch processing comparisons.
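A std-only sketch of the three atomic operations the benchmark covers (load, store, fetch_add); the function name and memory orderings are illustrative, not taken from the PR:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

fn exercise_atomics(iters: u64) -> u64 {
    // Stand-in for the offset counter the service tracks atomically.
    let offset = AtomicU64::new(0);
    for _ in 0..iters {
        let current = offset.load(Ordering::Acquire); // read current offset
        offset.store(current, Ordering::Release);     // write it back unchanged
        offset.fetch_add(1, Ordering::AcqRel);        // advance by one
    }
    offset.load(Ordering::Acquire)
}
```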

rocketmq-broker/src/schedule/schedule_message_service.rs (6)

70-76: LGTM!

The performance constants are well-chosen and documented. The values align with common practices: MAX_BATCH_SIZE = 32 is a reasonable batch size for message processing, INITIAL_QUEUE_CAPACITY = 128 reduces early reallocations, and MAX_PENDING_QUEUE_SIZE = 10000 provides a sensible upper bound for backpressure.


135-136: Good simplification of the task handle storage.

Changing from Arc<Mutex<Vec<...>>> to Option<Vec<...>> is appropriate since task handles are only written during start() and consumed during shutdown(). This eliminates unnecessary synchronization overhead.

Also applies to: 155-156


344-355: LGTM!

The shutdown implementation correctly uses take() for one-time consumption of task handles. The early return with a warning handles the edge case where shutdown is called without a prior start (or called twice).


588-594: LGTM!

Pre-allocating the VecDeque with INITIAL_QUEUE_CAPACITY is a good optimization that reduces reallocations during normal message processing operations.
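In isolation, the pre-allocation looks like this (the constant value 128 is from the PR; the element type is a placeholder for the real result type):

```rust
use std::collections::VecDeque;

const INITIAL_QUEUE_CAPACITY: usize = 128;

fn preallocated_queue() -> VecDeque<u64> {
    // with_capacity reserves space up front, so the first
    // INITIAL_QUEUE_CAPACITY push_back calls never reallocate.
    VecDeque::with_capacity(INITIAL_QUEUE_CAPACITY)
}
```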


1656-1709: LGTM with observation.

The batch processing optimization is well-implemented. The early exit on Running status and breaks on Exception/Skip preserve the original semantics while allowing batch processing for successful completions. The queue size logging at >1000 items provides useful operational visibility.

Note: When Exception or Skip status is encountered, the batch effectively processes only one item before breaking. This is intentional and correct for maintaining proper error handling order.
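The drain semantics described here can be sketched with a reduced three-state status enum (the real ProcessStatus and queue contents differ; this only models the batch-size cap, the early exit on a running head, and the break on failure):

```rust
use std::collections::VecDeque;

const MAX_BATCH_SIZE: usize = 32;

#[derive(Clone, Copy)]
enum Status {
    Running,
    Success,
    Exception,
}

// Drain up to MAX_BATCH_SIZE completed results per cycle, stopping
// early when the head is still running and breaking on a failure so
// error handling keeps its original ordering.
fn drain_batch(queue: &mut VecDeque<Status>) -> usize {
    let mut drained = 0;
    while drained < MAX_BATCH_SIZE {
        match queue.front().copied() {
            None | Some(Status::Running) => break, // head not finished yet
            Some(Status::Success) => {
                queue.pop_front();
                drained += 1;
            }
            Some(Status::Exception) => {
                queue.pop_front();
                drained += 1;
                break; // handle the failure before draining further
            }
        }
    }
    drained
}
```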


1742-1884: LGTM!

The test coverage is comprehensive, covering:

  • DelayOffsetSerializeWrapper creation and accessors
  • Bidirectional queue_id/delay_level conversions
  • ProcessStatus enum traits
  • Edge cases with negative and large values
  • JSON serialization format verification

The tests appropriately focus on the public API and helper functions without requiring a full service instantiation.

Contributor

Copilot AI left a comment


Pull request overview

This PR adds performance optimizations and comprehensive benchmarks for the ScheduleMessageService component, focusing on reducing memory allocations, improving batch processing, and implementing backpressure mechanisms.

Key Changes

  • Refactored task handle management from Arc<Mutex<Vec>> to Option<Vec> for one-time initialization/consumption pattern
  • Added VecDeque capacity pre-allocation to reduce reallocations during normal operation
  • Implemented batch processing optimization with MAX_BATCH_SIZE constant (32 items per cycle)
  • Added backpressure mechanism when pending queue exceeds MAX_PENDING_QUEUE_SIZE (10,000 items)
  • Created comprehensive benchmark suite covering VecDeque allocation, lock performance, batch processing, and atomic operations
  • Added unit tests for utility functions and data structures
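The backpressure gate can be sketched as a check-before-enqueue helper (the constant is from the PR; the function name and item type are illustrative):

```rust
use std::collections::VecDeque;

const MAX_PENDING_QUEUE_SIZE: usize = 10_000;

// Returns false (backpressure) without touching the queue when it is
// already at capacity; checking before enqueueing means a full queue
// never discards a result that was already produced.
fn try_push_pending(queue: &mut VecDeque<u64>, result: u64) -> bool {
    if queue.len() >= MAX_PENDING_QUEUE_SIZE {
        return false;
    }
    queue.push_back(result);
    true
}
```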

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
rocketmq-broker/src/schedule/schedule_message_service.rs Core optimizations including task handle refactoring, VecDeque capacity pre-allocation, backpressure logic, batch processing improvements, and new unit tests
rocketmq-broker/benches/schedule_message_service_performance.rs New comprehensive benchmark suite measuring memory allocation overhead, lock performance, and batch processing efficiency
rocketmq-broker/Cargo.toml Added benchmark configuration for the new performance test suite


Comment on lines +1135 to +1144
// Performance optimization: Check queue size before adding
if processes_queue.len() >= MAX_PENDING_QUEUE_SIZE {
warn!(
"Pending queue for delay level {} is full (size={}), applying backpressure",
self.delay_level,
processes_queue.len()
);
// Return false to signal backpressure
return Ok(false);
}

Copilot AI Dec 9, 2025


The backpressure check occurs after deliver_message has already been called (lines 1131-1133), which means the message has been delivered and processed but the result is discarded if the queue is full. This could lead to message loss or duplicate processing. The backpressure check should be performed before calling deliver_message, similar to the existing flow control check at lines 1113-1120.

Comment on lines +1845 to +1855
/// Test compute_deliver_timestamp with known delay level
#[test]
fn test_compute_deliver_timestamp() {
// This test requires a ScheduleMessageService instance with delay_level_table populated
// We'll test the logic by understanding the function behavior
let store_timestamp = 1000i64;
let delay_time = 5000i64; // 5 seconds
let expected = store_timestamp + delay_time;

assert_eq!(expected, 6000);
}

Copilot AI Dec 9, 2025


This test doesn't actually test the compute_deliver_timestamp function - it only performs arithmetic verification (1000 + 5000 = 6000) without calling any actual method. Consider either removing this test or implementing it to actually test the compute_deliver_timestamp method with a properly configured ScheduleMessageService instance.

@codecov

codecov bot commented Dec 9, 2025

Codecov Report

❌ Patch coverage is 70.70707% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 30.36%. Comparing base (098a09a) to head (23f0203).
⚠️ Report is 3 commits behind head on main.

Files with missing lines:
  • ...mq-broker/src/schedule/schedule_message_service.rs: 70.70% patch coverage, 29 lines missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4540      +/-   ##
==========================================
+ Coverage   30.29%   30.36%   +0.06%     
==========================================
  Files         673      673              
  Lines       97651    97736      +85     
==========================================
+ Hits        29588    29676      +88     
+ Misses      68063    68060       -3     


Collaborator

@rocketmq-rust-bot rocketmq-rust-bot left a comment


LGTM - All CI checks passed ✅

@rocketmq-rust-bot rocketmq-rust-bot merged commit 1345639 into main Dec 9, 2025
27 checks passed
@rocketmq-rust-bot rocketmq-rust-bot added the approved label and removed the ready to review and waiting-review labels Dec 9, 2025
@mxsm mxsm deleted the enh-4539 branch December 9, 2025 14:44


Successfully merging this pull request may close these issues.

[Enhancement✨] Add performance optimizations and new benchmarks for ScheduleMessageService
