Modify strictReplicaGroup rebalance batching to categorize on current+target instances to partitionId to currentAssignment #15838
Merged
somandal merged 2 commits into apache:master on May 20, 2025
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #15838 +/- ##
============================================
+ Coverage 62.90% 63.36% +0.46%
+ Complexity 1386 1353 -33
============================================
Files 2867 2898 +31
Lines 163354 165797 +2443
Branches 24952 25360 +408
============================================
+ Hits 102755 105055 +2300
+ Misses 52847 52807 -40
- Partials 7752 7935 +183
...ntroller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java
…rget instances to partitionId to currentAssignment
8438509 to 9f8c4bc
Jackie-Jiang approved these changes May 20, 2025
songwdfu pushed a commit to songwdfu/pinot that referenced this pull request Jun 3, 2025
…+target instances to partitionId to currentAssignment (apache#15838)
* Fix strictReplicaGroup rebalance batching to categorize on current+target instances to partitionId to currentAssignment
* Address review comment: short-circuit return when rebalance batching is disabled
This PR modifies strictReplicaGroup rebalance batching to categorize on Pair(current instances, target instances) to partitionId to currentAssignment, instead of partitionId -> current instances -> currentAssignment. It also modifies the code to move a full Pair(current instances, target instances) to partitionId at a time, rather than a full partitionId at a time, to provide more granular batching.

This is to more closely mimic how strictReplicaGroup instance selection is done, which is based on the assignments and knows nothing about the partitions, or whether partitioning is even enabled. The main objective is to assess segment availability and mark an instance as unavailable as a whole if even one segment is unavailable on that instance.

For StrictRealtimeSegmentAssignment, the full partitionId is guaranteed to belong to a single Pair(current instances, target instances), since the IdealState is used to update the segments that are newly assigned. Thus the consistency invariant that the full partition must move as a whole is still maintained, since the full partition will be part of the same pair. Even when choosing whether a segment can be moved based on availability, the partition will be chosen as a whole or skipped as a whole, since assignedInstances is used for the availability tracking and all segments of a partition will have the same assignedInstances and available instances (which is the same as in the older rebalance code without batching).

For other segment assignment strategies, the code will now move a full Pair(current instances, target instances) to partitionId as a whole, but a partitionId can be split across multiple Pair(current instances, target instances). It is okay to move them separately, since there is no mandate in these segment assignment strategies that the partition must be moved as a whole or assigned based on IdealState (it is instead assigned based on instance partitions if they exist, or on the tagged servers). It is enough to ensure that minAvailableReplicas is honored based on the strictReplicaGroup instance selector, which the original rebalance code already does (and which is used even with batching).

Testing done:
Used HybridQuickstart to try the following with the strictReplicaGroup instance selector enabled:
* StrictRealtimeSegmentAssignment for REALTIME tables
* RealtimeSegmentAssignment for REALTIME tables
* OfflineSegmentAssignment for OFFLINE tables

Verified the grouping happens as expected, and the full Pair(current instances, target instances) to partitionId is moved as a whole. Non-batching works as expected.
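The grouping described above can be illustrated with a minimal Java sketch. It is not the TableRebalancer code from this PR: the class name BatchGroupingSketch, the method groupByInstancePair, and the partitionIdOf function are hypothetical names chosen for illustration, java.util.Map.Entry stands in for Pair, and assignments are assumed to be maps of segment name to (instance name to state).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.ToIntFunction;

/** Illustrative sketch only; not the TableRebalancer implementation. */
final class BatchGroupingSketch {
  private BatchGroupingSketch() {
  }

  /**
   * Groups segments by Pair(current instances, target instances), then by partitionId.
   * Both assignments map segment name -> (instance name -> state).
   * partitionIdOf is a hypothetical stand-in for however the partition id is resolved.
   */
  static Map<Map.Entry<Set<String>, Set<String>>, Map<Integer, Map<String, Map<String, String>>>>
      groupByInstancePair(
          Map<String, Map<String, String>> currentAssignment,
          Map<String, Map<String, String>> targetAssignment,
          ToIntFunction<String> partitionIdOf) {
    Map<Map.Entry<Set<String>, Set<String>>, Map<Integer, Map<String, Map<String, String>>>> grouped =
        new LinkedHashMap<>();
    for (Map.Entry<String, Map<String, String>> entry : currentAssignment.entrySet()) {
      String segment = entry.getKey();
      Map<String, String> targetStates = targetAssignment.get(segment);
      if (targetStates == null) {
        // Assumption for this sketch: segments absent from the target are skipped.
        continue;
      }
      // Sorted copies so that equal instance sets always produce equal keys.
      Set<String> currentInstances = new TreeSet<>(entry.getValue().keySet());
      Set<String> targetInstances = new TreeSet<>(targetStates.keySet());
      Map.Entry<Set<String>, Set<String>> pair = Map.entry(currentInstances, targetInstances);
      grouped.computeIfAbsent(pair, k -> new TreeMap<>())
          .computeIfAbsent(partitionIdOf.applyAsInt(segment), k -> new TreeMap<>())
          .put(segment, entry.getValue());
    }
    return grouped;
  }
}
```

In this sketch, two segments of the same partition land in different outer groups when their target instances differ, which corresponds to the case above where a partitionId is split across multiple pairs. When every segment of a partition shares the same current and target instances, the whole partition stays inside a single pair.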