-
Notifications
You must be signed in to change notification settings - Fork 25.6k
WriteLoadConstraintDecider: Don't require shard write load to return canRemain=NOT_PREFERRED #137416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WriteLoadConstraintDecider: Don't require shard write load to return canRemain=NOT_PREFERRED #137416
Conversation
|
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
ywangd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| assertEquals( | ||
| "A shard without write load should remain on a node with queuing above the threshold", | ||
| Decision.Type.YES, | ||
| writeLoadDecider.canRemain( | ||
| testHarness.clusterState.metadata().getProject().index(indexName), | ||
| testHarness.shardRoutingNoWriteLoad, | ||
| testHarness.aboveQueuingThresholdRoutingNode, | ||
| testHarness.routingAllocation | ||
| ).type() | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a test to show a shard without write load can still trigger NOT_PREFERRED?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in 6fc953f
|
I assume we have a protection against non-existent shard write load in |
Yep: Lines 1110 to 1122 in 09f5349
|
|
I don't understand why we'd want to remove this? Logically, if a shard has no write load, there's no benefit in moving it away. The decider seems more logically complete with the check. The Allocator may not pass a null/0 value write load today, but there's no testing to protect against it in future, and then the behavior will be strange. |
My reasoning is that deciding whether the shard can remain and picking exactly which shard to move are two things. The later is done elsewhere which compares shard write loads already. So we don't need it for the can remain check. If for some reason, all shards reporting 0 write load incorrectly, we will still be able to move one of them with this change instead of taking no action.
There is some protection as Nick commented here. We could also add a test for it. |
This might have been a hangover from before we did the work to prioritise shard movement. It's redundant now.