-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Add unthrottled path for replicas in ThrottlingAllocationDecider #138545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add unthrottled path for replicas in ThrottlingAllocationDecider #138545
Conversation
aef7911 to
bd3d442
Compare
bd3d442 to
4ade9fa
Compare
4ade9fa to
4b5ef05
Compare
|
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
| // During simulation, this supports early publishing DesiredBalance, with all unassigned shards assigned. | ||
| // Notably, this bypass is only in simulation decisions. Reconciliation will continue to obey throttling, in particular the | ||
| // requirement to assign a primary before allowing its replicas to begin initializing. | ||
| return allocation.decision(Decision.YES, NAME, "replica allocation is not throttled when simulating"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change possibly redundant since we implemented #134786. If I understand correctly the ThrottlingAllocationDecider won't ever kick in now that we do a single move per balancing round?
I thought that ES-12942 was more similar to this change #115511
Never mind I see now we'll need this with the subsequent changes
| ShardRouting shardRouting1Primary, | ||
| ShardRouting shardRouting1Replica, | ||
| ShardRouting shardRouting2Primary, | ||
| ShardRouting shardRouting2Replica |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should these have unassigned in the name (e.g. unassignedShard1Primary) or similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that'd be clearer, done 👍
| String[] indices, | ||
| int numberOfShards, | ||
| List<ShardRouting.Role> replicaRoles | ||
| ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we validate that numberOfShards == replicaRoles.size() ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that has to be true, AFAICT. The number of replicaRoles is equal to the number of replicas, which is not limited to the number of shards.
| } else if (i == numberOfDataNodes) { | ||
| discoBuilder.masterNodeId(node.getId()); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm reading it right, the actual number of data nodes ends up being numberOfDataNodes + 1, this seems counter-intuitive, could we change numberOfDataNodes to have the actual number of data nodes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is pretty convoluted. I never have found a reason for it in the other helpers. Perhaps some historical testing need that no longer exists.
Fixing 👍
| ShardRouting shardRouting1Primary = TestShardRouting.newShardRouting(testShardId1, null, null, true, ShardRoutingState.UNASSIGNED); | ||
| ShardRouting shardRouting2Primary = TestShardRouting.newShardRouting(testShardId2, null, null, true, ShardRoutingState.UNASSIGNED); | ||
| ShardRouting shardRouting1Replica = TestShardRouting.newShardRouting(testShardId1, null, null, false, ShardRoutingState.UNASSIGNED); | ||
| ShardRouting shardRouting2Replica = TestShardRouting.newShardRouting(testShardId2, null, null, false, ShardRoutingState.UNASSIGNED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's not important, but I think we should be able to pull these out of the routing table with
RoutingTable routingTable = clusterState.routingTable(ProjectId.DEFAULT);
routingTable.shardRoutingTable(shardId).primaryShard();
routingTable.shardRoutingTable(shardId).replicaShards().get(0);Then perhaps we can avoid the change to TestShardRouting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, routing table is probably better. I just copy-pasted.
The TestShardRouting change is label tidying, other callers provide null.
nicktindall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some minor queries
Adds a new cluster setting to allow the ThrottlingAllocationDecider
to bypass replica shard throttling during balancer simulation. Primary
shards already always bypass throttling during simulation so that new
index shards are assigned (and made available) as quickly as possible.
Replicas need the same quick availability in some environments.
Relates ES-12942
I'm splitting the work for ES-12942 into pieces. Next I'll need to make sure the BalancedShardsAllocator#allocate() can assign all the replicas in one call, and then change the DesiredBalanceComputer's early return logic to pick up new assignment of unassigned replicas.