Balancer changes to use Decision#NOT_PREFERRED #134160
```diff
@@ -503,9 +503,10 @@ private void moveShards() {
 
             final var routingNode = routingNodes.node(shardRouting.currentNodeId());
             final var canRemainDecision = allocation.deciders().canRemain(shardRouting, routingNode, allocation);
-            if (canRemainDecision.type() != Decision.Type.NO) {
-                // it's desired elsewhere but technically it can remain on its current node. Defer its movement until later on to give
-                // priority to shards that _must_ move.
+            if (canRemainDecision.type() != Decision.Type.NO && canRemainDecision.type() != Decision.Type.NOT_PREFERRED) {
+                // If movement is throttled, a future reconciliation round will see a resolution. For now, leave it alone.
+                // Reconciliation treats canRemain NOT_PREFERRED answers as YES because the DesiredBalance computation already decided
+                // how to handle the situation.
                 continue;
             }
```
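To make the changed control flow concrete, here is a minimal, self-contained sketch (hypothetical simplified types, not the actual Elasticsearch classes) of which `canRemain` answers now fall through to the move logic:

```java
// Minimal sketch of the changed early-exit check. DecisionType is a stand-in
// for Decision.Type in the real code; everything else here is invented.
enum DecisionType { YES, THROTTLE, NO, NOT_PREFERRED }

class MoveCheckSketch {
    // Before this PR, only NO fell through to the move logic past the early exit.
    // After this PR, NOT_PREFERRED falls through as well: reconciliation treats a
    // NOT_PREFERRED canRemain answer like a NO, because the DesiredBalance
    // computation has already decided how to handle the situation.
    static boolean shouldAttemptMove(DecisionType canRemain) {
        return canRemain == DecisionType.NO || canRemain == DecisionType.NOT_PREFERRED;
    }

    public static void main(String[] args) {
        for (DecisionType type : DecisionType.values()) {
            System.out.println(type + " -> attempt move: " + shouldAttemptMove(type));
        }
    }
}
```

YES and THROTTLE still hit the early `continue`, deferring those shards to a later reconciliation round.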
Review comments on this hunk:

Comment: This will mean we move …

Comment: I was actually thinking the other way: prioritizing moves that address hot-spots first would be more ideal, since that addresses a performance problem. Though actually, that would deprioritize shutdown moves. But then again, a timeout during shutdown is often because something other than allocation is going on. This bit of code, though, doesn't control ordering -- that would have to be a new feature in the code to organize shard selection based on NO vs NOT_PREFERRED, probably hard -- rather, it's an early exit if `canRemain` says YES or THROTTLE. But yeah, perhaps we'll see a motivation later for something fancier.

Comment: It does control ordering in that shards moved in this phase will consume limited incoming/outgoing recovery slots, right? So shards eligible for movement in this phase will be prioritised before undesired allocations eligible only for movement in the … In saying that, it probably makes sense to prioritise …
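The ordering question raised in this thread could, as one possibility, be addressed by partitioning shards before moving them. The following is a purely hypothetical sketch of that idea (all names invented; no such feature exists in this PR): shards that must move (`canRemain` == NO, e.g. node shutdown) claim the limited recovery slots before shards that merely should move (NOT_PREFERRED, e.g. hot-spotting).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of ordering moves by urgency: must-move (NO) shards first,
// then should-move (NOT_PREFERRED) shards. Nothing here mirrors real Elasticsearch
// code; it only illustrates the idea discussed in the review thread.
class MoveOrderingSketch {
    enum DecisionType { YES, THROTTLE, NO, NOT_PREFERRED }

    record Shard(String id, DecisionType canRemain) {}

    static List<Shard> orderForMovement(List<Shard> shards) {
        List<Shard> mustMove = new ArrayList<>();   // e.g. current node is shutting down
        List<Shard> preferMove = new ArrayList<>(); // e.g. current node is hot-spotting
        for (Shard shard : shards) {
            switch (shard.canRemain()) {
                case NO -> mustMove.add(shard);
                case NOT_PREFERRED -> preferMove.add(shard);
                default -> { } // YES / THROTTLE: leave in place for now
            }
        }
        mustMove.addAll(preferMove); // must-move shards get recovery slots first
        return mustMove;
    }

    public static void main(String[] args) {
        var ordered = orderForMovement(List.of(
            new Shard("a", DecisionType.NOT_PREFERRED),
            new Shard("b", DecisionType.NO),
            new Shard("c", DecisionType.YES)));
        ordered.forEach(s -> System.out.println(s.id())); // prints b, then a
    }
}
```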
```diff
@@ -650,6 +651,7 @@ private DiscoveryNode findRelocationTarget(
         Set<String> desiredNodeIds,
         BiFunction<ShardRouting, RoutingNode, Decision> canAllocateDecider
     ) {
+        DiscoveryNode chosenNode = null;
         for (final var nodeId : desiredNodeIds) {
             // TODO consider ignored nodes here too?
             if (nodeId.equals(shardRouting.currentNodeId())) {
```
```diff
@@ -661,12 +663,24 @@
             }
             final var decision = canAllocateDecider.apply(shardRouting, node);
             logger.trace("relocate {} to {}: {}", shardRouting, nodeId, decision);
 
+            // Assign shards to the YES nodes first. This way we might delay moving shards to NOT_PREFERRED nodes until after
+            // shards are first moved away. The DesiredBalance could be moving shards away from a hot node as well as moving
+            // shards to it, and it's better to offload shards first.
             if (decision.type() == Decision.Type.YES) {
-                return node.node();
+                chosenNode = node.node();
+                // As soon as we get any YES, we return it.
+                break;
+            } else if (decision.type() == Decision.Type.NOT_PREFERRED && chosenNode == null) {
+                // If the best answer is not-preferred, then the shard will still be assigned. It is okay to assign to a
+                // not-preferred node because the desired balance computation had a reason to override it: when there aren't
+                // any better nodes to choose and the shard cannot remain where it is, we accept not-preferred. NOT_PREFERRED
+                // is essentially a YES for reconciliation.
+                chosenNode = node.node();
             }
         }
 
-        return null;
+        return chosenNode;
     }
 
     private Decision decideCanAllocate(ShardRouting shardRouting, RoutingNode target) {
```
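As a standalone illustration of the selection loop above, here is a self-contained sketch with simplified, hypothetical types (the real method works with ShardRouting, RoutingNode, and the allocation deciders): the first YES node wins immediately, while the first NOT_PREFERRED node is kept as a fallback in case no node says YES.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Simplified sketch of the fallback selection in findRelocationTarget: return the
// first node answering YES; failing that, the first node that answered NOT_PREFERRED;
// failing that, null. Types and names here are illustrative, not the real API.
class RelocationTargetSketch {
    enum DecisionType { YES, THROTTLE, NO, NOT_PREFERRED }

    static String findTarget(List<String> nodeIds, Function<String, DecisionType> canAllocate) {
        String chosenNode = null;
        for (String nodeId : nodeIds) {
            DecisionType decision = canAllocate.apply(nodeId);
            if (decision == DecisionType.YES) {
                return nodeId; // any YES is taken immediately
            } else if (decision == DecisionType.NOT_PREFERRED && chosenNode == null) {
                // Remember the first NOT_PREFERRED node, but keep scanning for a YES.
                chosenNode = nodeId;
            }
        }
        return chosenNode; // may be null if every node said NO or THROTTLE
    }

    public static void main(String[] args) {
        Map<String, DecisionType> answers = Map.of(
            "node-1", DecisionType.NO,
            "node-2", DecisionType.NOT_PREFERRED,
            "node-3", DecisionType.YES);
        System.out.println(findTarget(List.of("node-1", "node-2", "node-3"), answers::get)); // node-3
    }
}
```

Since NOT_PREFERRED is essentially a YES for reconciliation, a null return here means the shard genuinely has nowhere acceptable to go.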
Review comments on this hunk:
Comment: Originally this was an early return if `canRemain` was anything but NO. Now `NOT_PREFERRED` is also excluded from the early return, so we'll go on to try to move the shard.
Comment: I wonder if we want to keep this as-is, pending the introduction of the proposed `moveNotPreferred` phase. For example, if a node is hot-spotting and all its shards are returning `NOT_PREFERRED`, we probably want to delay dealing with those until `moveNotPreferred`, when we'll move them in preferential order. I have a PR for `moveNotPreferred` which I'll put up for review shortly to get feedback.
Comment: All of our work is feature-gated, so in that respect I'm not worried about waiting for other code first. I can't test without this change: I've got the `canRemain` work done and am waiting on this PR so I can rebase before publishing that work. You will also be able to take advantage of the testing/functionality once your feature is in place, however it turns out, so that might be appealing. This logic can be changed easily since it's a line of code.

If you're comfortable with that, I'd like to go ahead with getting the dumb case (of picking any shard) working, so we don't bottleneck work. I was actually expecting `moveNotPreferred` to run before `moveShards`. In that case, though, we would not actually exercise this check. We might even turn this into an assert that not-preferred never occurs.