Log NOT_PREFERRED shard movements #138069
Conversation
```diff
     """
     Node [%s] has a queue latency of [%d] millis that exceeds the queue latency threshold of [%s]. This node is \
-    hot-spotting. Current thread pool utilization [%f]. Moving shard(s) away.""",
+    hot-spotting. Current thread pool utilization [%f]. Shard write load [%s]. Moving shard(s) away.""",
```
Add shard write load into explanation, so we can see it when we log the movement
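As a plain-Java illustration (a hypothetical helper, not the decider's actual code), the expanded explanation could be built like this:

```java
// Sketch of building the hot-spot explanation, now including the shard write
// load. All parameter names and values are hypothetical stand-ins for the
// decider's real inputs.
public class HotSpotExplanation {
    static String explain(String nodeId, long queueLatencyMillis, String threshold,
                          double utilization, String shardWriteLoad) {
        return String.format(
            "Node [%s] has a queue latency of [%d] millis that exceeds the queue latency "
                + "threshold of [%s]. This node is hot-spotting. Current thread pool "
                + "utilization [%f]. Shard write load [%s]. Moving shard(s) away.",
            nodeId, queueLatencyMillis, threshold, utilization, shardWriteLoad);
    }

    public static void main(String[] args) {
        System.out.println(explain("node-1", 5000L, "2s", 0.95, "12.3"));
    }
}
```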
```java
        shardRouting,
        moveDecision.getCanRemainDecision().getExplanation()
    );
}
```
If we have debug logging turned on for the WriteLoadConstraintDecider the explanation will include the shard write load and the node utilisation.
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
...ain/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
```java
final var moveDecision = shardMoved ? decideMove(index, shardRouting) : storedShardMovement.moveDecision();
if (moveDecision.isDecisionTaken() && moveDecision.cannotRemainAndCanMove()) {
    if (notPreferredLogger.isDebugEnabled()) {
        logMoveNotPreferred.maybeExecute(
```
Can we do away with the throttled logging? Rather, just log everything.
Conceptually, we should only be picking the best shard to move away when a node is hot-spotting. We fix the hot-spot with one move, and then we don't have a hot-spot and this code doesn't run. So logging once per node per 30 seconds. I don't think it needs throttling?
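For context, the throttling under discussion behaves roughly like this minimal sketch (a hypothetical class, not Elasticsearch's actual helper): `maybeExecute` runs the wrapped action at most once per interval, so removing the cap means the action simply runs on every invocation.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of a frequency-capped action (hypothetical, illustrative
// only): the action runs at most once per configured interval; calls inside
// the interval are silently skipped.
public class FrequencyCapped {
    private final long intervalNanos;
    private final AtomicLong nextAllowed = new AtomicLong(Long.MIN_VALUE);

    FrequencyCapped(long interval, TimeUnit unit) {
        this.intervalNanos = unit.toNanos(interval);
    }

    boolean maybeExecute(Runnable action) {
        long now = System.nanoTime();
        long next = nextAllowed.get();
        if (now >= next && nextAllowed.compareAndSet(next, now + intervalNanos)) {
            action.run();
            return true;
        }
        return false; // throttled: this invocation was skipped
    }

    public static void main(String[] args) {
        FrequencyCapped cap = new FrequencyCapped(30, TimeUnit.SECONDS);
        System.out.println(cap.maybeExecute(() -> System.out.println("logged")));
        System.out.println(cap.maybeExecute(() -> System.out.println("suppressed")));
    }
}
```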
OK, removed in 3ba31a0 🤞
```java
    "Moving shard [{}] from a NOT_PREFERRED allocation, explanation is [{}]",
    shardRouting,
    moveDecision.getCanRemainDecision().getExplanation()
)
```
I think it would be helpful to see where the shard is being moved (moveDecision), not only why it cannot remain. Then we can check whether the shard moved where intended, or got derailed later (either not moved at all, or moved to a different node, perhaps, than the original target).
A canAllocate YES will also give us information about the target node utilization, which might be interesting: "Shard [%s] in index [%s] can be assigned to node [%s]. The node's utilization would become [%s]"
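A sketch of what such a message could look like once the target node is included (hypothetical helper, plain Java):

```java
// Hypothetical sketch of including the move target in the log line, so a
// later log scan can check whether the shard actually landed on that node.
public class MoveLog {
    static String moveMessage(String shard, String targetNode, String canRemainExplanation) {
        return String.format(
            "Moving shard [%s] from a NOT_PREFERRED allocation to node [%s], explanation is [%s]",
            shard, targetNode, canRemainExplanation);
    }

    public static void main(String[] args) {
        System.out.println(moveMessage("[idx][0]", "node-2", "queue latency exceeded threshold"));
    }
}
```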
Good call, added in 7dfedbe
I don't think we can print the allocate decisions because nodeDecisions won't be populated under normal circumstances (unless we debugDecision)
> I don't think we can print the allocate decisions because nodeDecisions won't be populated under normal circumstances (unless we debugDecision)
Is this something to do with Multi vs Single Decision types? nodeDecisions looks like something in the explain path, yes. But the Decision returned from canAllocate for the chosen target node should have an explanation string. The Multi type I recall obfuscating some things, though.
Oh, maybe this is the problem (line 1019 in e0fcab7):

```java
AllocationDecision.fromDecisionType(bestDecision),
```

We lose that information when converting the Decision into an AllocationDecision.
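The information loss can be sketched with stand-in types (hypothetical, only loosely mirroring the shape of the real Decision and AllocationDecision classes):

```java
// Sketch of the information loss being discussed: a rich Decision carries an
// explanation string alongside its type, but converting it to a bare enum
// keeps only the type. Types here are hypothetical stand-ins.
public class DecisionLoss {
    enum AllocationDecision { YES, NO, THROTTLE }

    static final class Decision {
        final AllocationDecision type;
        final String explanation;
        Decision(AllocationDecision type, String explanation) {
            this.type = type;
            this.explanation = explanation;
        }
    }

    // Mirrors the fromDecisionType conversion: only the enum survives; the
    // explanation string is dropped here.
    static AllocationDecision fromDecisionType(Decision d) {
        return d.type;
    }

    public static void main(String[] args) {
        Decision best = new Decision(AllocationDecision.YES,
            "node's utilization would become [0.42]");
        System.out.println(fromDecisionType(best)); // explanation no longer reachable
    }
}
```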
Ooph. Okay cool 👍
Yeah, if allocation.debugDecisions() was turned on we'd preserve them in nodeResults, but it won't be for normal allocation (lines 991 to 993 in e0fcab7):

```java
if (explain) {
    nodeResults.add(new NodeAllocationResult(currentNode.getRoutingNode().node(), allocationDecision, ++weightRanking));
}
```
DiannaHohensee left a comment:
lgtm 👍
* main: (135 commits)
Mute org.elasticsearch.upgrades.IndexSortUpgradeIT testIndexSortForNumericTypes {upgradedNodes=1} elastic#138130
Mute org.elasticsearch.upgrades.IndexSortUpgradeIT testIndexSortForNumericTypes {upgradedNodes=2} elastic#138129
Mute org.elasticsearch.search.basic.SearchWithRandomDisconnectsIT testSearchWithRandomDisconnects elastic#138128
[DiskBBQ] avoid EsAcceptDocs bug by calling cost before building iterator (elastic#138127)
Log NOT_PREFERRED shard movements (elastic#138069)
Improve bulk loading of binary doc values (elastic#137995)
Add internal action for getting inference fields and inference results for those fields (elastic#137680)
Address issue with DateFieldMapper#isFieldWithinQuery(...) (elastic#138032)
WriteLoadConstraintDecider: Have separate rate limiting for canRemain and canAllocate decisions (elastic#138067)
Adding NodeContext to TransportBroadcastByNodeAction (elastic#138057)
Mute org.elasticsearch.simdvec.ESVectorUtilTests testSoarDistanceBulk elastic#138117
Mute org.elasticsearch.xpack.esql.qa.single_node.GenerativeIT test elastic#137909
Backport batched_response_might_include_reduction_failure version to 8.19 (elastic#138046)
Add summary metrics for tdigest fields (elastic#137982)
Add gp-llm-v2 model ID and inference endpoint (elastic#138045)
Various tracing fixes (elastic#137908)
[ML] Fixing KDE evaluate() to return correct ValueAndMagnitude object (elastic#128602)
Mute org.elasticsearch.xpack.shutdown.NodeShutdownIT testStalledShardMigrationProperlyDetected elastic#115697
[ML] Fix Flaky Audit Message Assertion in testWithDatastream for RegressionIT and ClassificationIT (elastic#138065)
[ML] Fix Non-Deterministic Training Set Selection in RegressionIT testTwoJobsWithSameRandomizeSeedUseSameTrainingSet (elastic#138063)
...
# Conflicts:
# rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/200_dense_vector_docvalue_fields.yml
It is interesting to see which shards the write-load constraint decider is nominating for movement, and what their write load is. I made this a separate logger from the BalancedShardsAllocator's because turning debug on for that would be very noisy.

Relates: ES-13491
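The design choice can be sketched with java.util.logging standing in for Elasticsearch's logging (names illustrative, not the real logger names): a dedicated logger name lets operators raise verbosity for just these movement logs without enabling debug for the whole allocator.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of a dedicated logger for NOT_PREFERRED shard movements. The logger
// name is hypothetical; the point is that it is distinct from the allocator's
// own logger, so its level can be raised independently.
public class SeparateLogger {
    static final Logger NOT_PREFERRED_LOGGER =
        Logger.getLogger("allocator.not_preferred_movements");

    public static void main(String[] args) {
        // Enable verbose output for just this logger, leaving the (noisier)
        // allocator logger at its default level.
        NOT_PREFERRED_LOGGER.setLevel(Level.FINE);
        NOT_PREFERRED_LOGGER.fine("Moving shard [shard-0] from a NOT_PREFERRED allocation");
    }
}
```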