Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support bootstrap mode for table rebalance #5224

Merged
merged 2 commits into from
Apr 9, 2020

Conversation

Jackie-Jiang
Copy link
Contributor

Add bootstrap mode where the rebalancer starts with an empty table and reassign all segments.
This mode can be used to rebalance table with hotspot servers.
All segments will be sorted alphabetically and assigned in a round-robin fashion, no minimum movement is guaranteed.

Other related changes:

  • Extract some pieces of code for sharing between offline and real-time assignment
  • Extract assign segment logic without logging to reduce the log for bootstrapping
  • Add tests to cover the new feature

SegmentAssignmentUtils.getInstancesForNonReplicaGroupBasedAssignment(instancePartitions, _replication);
newAssignment = SegmentAssignmentUtils
.rebalanceTableWithHelixAutoRebalanceStrategy(currentAssignment, instances, _replication);
// When bootstrap is enabled, start with an empty table add reassign all segments
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you adjust the comment here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the suggestion? I feel this comment is clear?

if (instancePartitions.getNumReplicaGroups() == 1) {
List<String> instancesAssigned = assignSegment(segmentName, currentAssignment, instancePartitions);

LOGGER
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it print too many messages if there are many segments in a table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is exactly the reason why the assignSegment is extracted into a helper method (line 128)

Copy link
Contributor

@snleee snleee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

.getInstancesForNonReplicaGroupBasedAssignment(completedInstancePartitions, _replication);
newAssignment = SegmentAssignmentUtils
.rebalanceTableWithHelixAutoRebalanceStrategy(completedSegmentAssignment, instances, _replication);
// When bootstrap is enabled, start with an empty table add reassign all segments
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty table -> empty assignment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more comments

SegmentAssignmentUtils.getInstancesForNonReplicaGroupBasedAssignment(instancePartitions, replication);
int[] numSegmentsAssignedPerInstance = getNumSegmentsAssignedPerInstance(currentAssignment, instances);
int numInstances = numSegmentsAssignedPerInstance.length;
PriorityQueue<Pairs.IntPair> heap = new PriorityQueue<>(numInstances, Pairs.intPairComparator());
Copy link
Contributor

@snleee snleee Apr 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that Pairs.intPairComparator will be an ascending order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is sorted on the left value firstly, then right value secondly.

Add bootstrap mode where the rebalancer starts with an empty table and reassign all segments.
This mode can be used to rebalance table with hotspot servers.
All segments will be sorted alphabetically and assigned in a round-robin fashion, no minimum movement is guaranteed.

Other related changes:
- Extract some pieces of code for sharing between offline and real-time assignment
- Extract assign segment logic without logging to reduce the log for bootstrapping
- Add tests to cover the new feature
@Jackie-Jiang Jackie-Jiang merged commit 8f0d059 into apache:master Apr 9, 2020
@Jackie-Jiang Jackie-Jiang deleted the rebalance_bootstrap branch April 9, 2020 00:38
}

/**
* Helper method to check whether the number of replica-groups matches the table replication for replica-group based
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When will this be the case? Can you also add that in the comments?

if (instancePartitions.getNumReplicaGroups() == 1) {
// Non-replica-group based assignment
if (bootstrap) {
LOGGER.info("Bootstrapping the table: {}", _offlineTableName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest wording : "Bootstrapping segment assignment for table ..." Otherwise it may throw an admin off

if (completedInstancePartitions.getNumReplicaGroups() == 1) {
// Non-replica-group based assignment
if (bootstrap) {
LOGGER.info("Bootstrapping the COMPLETED segments for table: {}", _realtimeTableName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Bootstrapping segment assignment for COMPLETED segments..."

Jackie-Jiang added a commit to Jackie-Jiang/pinot that referenced this pull request Apr 9, 2020
Jackie-Jiang added a commit that referenced this pull request Apr 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants