Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up shard balancer by reusing shard model while moving shards that can no longer be allocated to a node #16926

Merged
merged 1 commit into from Mar 4, 2016

Conversation

Projects
None yet
5 participants
@ywelsch
Copy link
Contributor

commented Mar 3, 2016

Decommissioning a node or applying a filter inclusion / exclusion can potentially lead to many shards that need to be moved to other nodes. The BalancedShardsAllocator deals very poorly with this situation if many shards are affected: For every shard that has to leave a node due to some filter exclusion or other constraint, the whole model that represents currently allocated shards is reconstructed. If 1000 shards need to leave a node, we currently calculate the model a 1000 times.

This PR reuses the model across all shard movements in an allocation round: It calculates the shard model once and simulates the application of all shards that can be moved on this model.

Additional notes:

  • I marked the PR as breaking as it makes changes to the ShardsAllocator interface, which might be implemented by other balancers. I still think we should backport this to v2.3.0, however.
  • For maximum reuse of the model, we should fold allocateUnassigned, (the newly introduced) moveShards and rebalance into a single method as they are only used in AllocationService by being called in succession. I see this as a follow-up PR.
@s1monw

View changes

...ain/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java Outdated
currentNode.addShard(targetRelocatingShard, decision);
if (logger.isTraceEnabled()) {
logger.trace("Moved shard [{}] to node [{}]", shard, currentNode.getNodeId());
if (changed) {

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 3, 2016

Contributor

can we have only one return statement in this method and make this if a if (changed == false) {

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Mar 3, 2016

I marked the PR as breaking as it makes changes to the ShardsAllocator interface, which might be implemented by other balancers. I still think we should backport this to v2.3.0, however.

++ new versions need to recompile it anyway

@s1monw

View changes

...ain/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java Outdated
int index = 0;
boolean found = true;
final RoutingNodes routingNodes = allocation.routingNodes();
while (found) {

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 3, 2016

Contributor

hmm this look confuses me a lot, can't we somehow concat all the RoutingNodes since they implement Iterable<ShardRouting> and use a filter predicate to remove shards that are not started? Otherwise we iterate that stuff twice and an I don't think the case where all shards are non-active is representative?

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 3, 2016

Contributor

btw. I know you just move thsi stuff around and if you feel like it you can also do it in a followup to keep this change smallish?

This comment has been minimized.

Copy link
@ywelsch

ywelsch Mar 3, 2016

Author Contributor

The iteration order is a very interesting one currently: We take one shard from each node in succession (and start again on the first node as long as there are still shards). I don't know exactly why, but I guess it is to make it fairer so that every node gets a chance to get rid of some shards (i.e. a more "balanced" riddance of shards)? I'm not sure whether we really need this and if it has any effect at all. It is not even truly fair as it creates the order independently on whether a shard is later moved away.

I was a bit hesitant on changing the semantics in this PR though.

@kimchy Can you clarify why this interleaving between nodes was done and whether you believe it to be still useful?

This comment has been minimized.

Copy link
@kimchy

kimchy Mar 3, 2016

Member

Yea, the idea was to create a somehow fair movement. That was before we had things like throttled allocation and even the balanced algo, so it might not be relevant anymore.

* @param allocation current node allocation
* @return <code>true</code> if the allocation has changed, otherwise <code>false</code>
*/
boolean move(ShardRouting shardRouting, RoutingNode node, RoutingAllocation allocation);
boolean moveShards(RoutingAllocation allocation);

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 3, 2016

Contributor

I like this way more anyway

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Mar 3, 2016

left some comments - this change looks great!

@ywelsch

This comment has been minimized.

Copy link
Contributor Author

commented Mar 4, 2016

@s1monw I pushed a new commit 975ef83 removing the early return. I won't change the node interleaving iteration order for now as I believe it to be still useful. I might revisit this though in a future PR.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Mar 4, 2016

@ywelsch can we at least have a comment on that why we have this odd iteration?

@ywelsch

This comment has been minimized.

Copy link
Contributor Author

commented Mar 4, 2016

Added a comment in 6356e08

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Mar 4, 2016

LGTM thanks

@ywelsch ywelsch force-pushed the ywelsch:fix/balancer-move branch Mar 4, 2016

Speed up shard balancer by reusing shard model while moving shards th…
…at can no longer be allocated to a node

Decommissioning a node or applying a filter inclusion / exclusion can potentially lead to many shards that need to be moved to other nodes. This commit reuses the model across all
shard movements in an allocation round: It calculates the shard model once and simulates the application of all shards that can be moved on this model.

Closes #16926

@ywelsch ywelsch force-pushed the ywelsch:fix/balancer-move branch to 250db49 Mar 4, 2016

ywelsch added a commit that referenced this pull request Mar 4, 2016

Merge pull request #16926 from ywelsch/fix/balancer-move
Speed up shard balancer by reusing shard model while moving shards that can no longer be allocated to a node

@ywelsch ywelsch merged commit 675d940 into elastic:master Mar 4, 2016

1 check passed

CLA Commit author is a member of Elasticsearch
Details

ywelsch added a commit that referenced this pull request Mar 4, 2016

Speed up shard balancer by reusing shard model while moving shards th…
…at can no longer be allocated to a node

Decommissioning a node or applying a filter inclusion / exclusion can potentially lead to many shards that need to be moved to other nodes. This commit reuses the model across all
shard movements in an allocation round: It calculates the shard model once and simulates the application of all shards that can be moved on this model.

Closes #16926
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.