
Conversation

@nicktindall (Contributor) commented Nov 14, 2025

It is interesting to see which shards the write-load constraint decider is nominating for movement, and what their write load is. I made this a separate logger from the BalancedShardsAllocator's, because turning on debug for that logger would be very noisy.

Relates: ES-13491
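
A minimal sketch of the kind of dedicated logger the description refers to; the class name, logger name, and helper method below are assumptions for illustration, not the PR's actual code:

```java
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

// Hypothetical sketch, not the PR's actual code: a logger that can be switched to
// DEBUG on its own, so these messages can be enabled without also enabling
// BalancedShardsAllocator's very noisy debug output. The logger name is assumed.
public final class NotPreferredMovementLogging {
    private static final Logger notPreferredLogger = LogManager.getLogger(
        "org.elasticsearch.cluster.routing.allocation.allocator.NotPreferredMovements");

    static void logMove(Object shardRouting, String explanation) {
        notPreferredLogger.debug(
            "Moving shard [{}] from a NOT_PREFERRED allocation, explanation is [{}]",
            shardRouting,
            explanation);
    }
}
```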

"""
Node [%s] has a queue latency of [%d] millis that exceeds the queue latency threshold of [%s]. This node is \
hot-spotting. Current thread pool utilization [%f]. Moving shard(s) away.""",
hot-spotting. Current thread pool utilization [%f]. Shard write load [%s]. Moving shard(s) away.""",
@nicktindall (Author):

Add the shard write load to the explanation, so we can see it when we log the movement.
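
To make the amended template concrete, here is a small self-contained demo; the class name and all argument values are invented for illustration:

```java
import java.util.Locale;

// Sketch only: invented values, formatted with the amended message template above.
public class ExplanationFormatDemo {
    public static void main(String[] args) {
        String explanation = String.format(Locale.ROOT, """
            Node [%s] has a queue latency of [%d] millis that exceeds the queue latency threshold of [%s]. This node is \
            hot-spotting. Current thread pool utilization [%f]. Shard write load [%s]. Moving shard(s) away.""",
            "node-1", 1500L, "1s", 0.95d, "2.3");
        System.out.println(explanation);
    }
}
```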

```java
    shardRouting,
    moveDecision.getCanRemainDecision().getExplanation()
);
}
```
@nicktindall (Author):
If we have debug logging turned on for the WriteLoadConstraintDecider, the explanation will include the shard write load and the node utilisation.
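
A toy sketch of the gating pattern being described, i.e. building the detailed explanation only when the decider's logger has debug enabled; all names and values here are invented, not the PR's code:

```java
import java.util.Locale;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

// Toy sketch of the gating pattern: compute the detailed explanation only when
// debug is enabled for the decider's logger. The logger name is assumed.
public final class DebugGatedExplanation {
    private static final Logger logger = LogManager.getLogger("WriteLoadConstraintDecider");

    static String explain(String nodeId, double utilization, double shardWriteLoad) {
        if (logger.isDebugEnabled() == false) {
            return "node is hot-spotting"; // cheap explanation when nobody is listening
        }
        return String.format(Locale.ROOT,
            "node [%s] is hot-spotting: thread pool utilization [%f], shard write load [%f]",
            nodeId, utilization, shardWriteLoad);
    }
}
```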

@elasticsearchmachine added the serverless-linked label Nov 14, 2025
@nicktindall added the >non-issue and :Distributed Coordination/Allocation labels Nov 14, 2025
@nicktindall marked this pull request as ready for review November 14, 2025 04:37
@nicktindall requested a review from a team as a code owner November 14, 2025 04:37
@elasticsearchmachine added the Team:Distributed Coordination label Nov 14, 2025
@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

```java
final var moveDecision = shardMoved ? decideMove(index, shardRouting) : storedShardMovement.moveDecision();
if (moveDecision.isDecisionTaken() && moveDecision.cannotRemainAndCanMove()) {
    if (notPreferredLogger.isDebugEnabled()) {
        logMoveNotPreferred.maybeExecute(
```
A reviewer (Contributor):
Can we do away with the throttled logging? Rather, just log everything.

Conceptually, we should only be picking the best shard to move away when a node is hot-spotting. We fix the hot-spot with one move, and then we no longer have a hot-spot and this code doesn't run. So we'd be logging at most once per node per 30 seconds; I don't think it needs throttling?
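
For context on what is being removed, a rough sketch of what a maybeExecute-style throttle typically does; this is an assumed shape, not the actual Elasticsearch helper:

```java
import java.time.Duration;

// Assumed shape of a maybeExecute-style throttle, not the actual Elasticsearch
// helper: run the action at most once per interval and drop calls in between.
final class ThrottledRunner {
    private final long intervalNanos;
    private long lastRunNanos;

    ThrottledRunner(Duration interval) {
        this.intervalNanos = interval.toNanos();
        this.lastRunNanos = System.nanoTime() - intervalNanos; // let the first call through
    }

    synchronized void maybeExecute(Runnable action) {
        long now = System.nanoTime();
        if (now - lastRunNanos >= intervalNanos) {
            lastRunNanos = now;
            action.run();
        }
    }
}
```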

@nicktindall (Author):
OK, removed in 3ba31a0 🤞

"Moving shard [{}] from a NOT_PREFERRED allocation, explanation is [{}]",
shardRouting,
moveDecision.getCanRemainDecision().getExplanation()
)
A reviewer (Contributor):
I think it would be helpful to see where the shard is being moved (moveDecision), not only why it cannot remain. Then we can check whether the shard moved where intended, or got derailed later (either not moved at all, or moved to a different node than the original target).

A canAllocate YES will also give us information about the target node's utilization, which might be interesting: "Shard [%s] in index [%s] can be assigned to node [%s]. The node's utilization would become [%s]"
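
A hedged sketch of the suggestion, extending the log call quoted above; getTargetNode() is my assumption for the MoveDecision accessor that exposes the chosen destination:

```java
// Sketch of the suggested log line: include the intended target so we can later
// check whether the shard actually landed there. getTargetNode() is an assumed
// accessor on MoveDecision.
logger.debug(
    "Moving shard [{}] from a NOT_PREFERRED allocation to node [{}], explanation is [{}]",
    shardRouting,
    moveDecision.getTargetNode(),
    moveDecision.getCanRemainDecision().getExplanation()
);
```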

@nicktindall (Author):
Good call, added in 7dfedbe

I don't think we can print the allocate decisions because nodeDecisions won't be populated under normal circumstances (unless we enable debugDecision)

A reviewer (Contributor):
> I don't think we can print the allocate decisions because nodeDecisions won't be populated under normal circumstances (unless we enable debugDecision)

Is this something to do with Multi vs Single Decision types? nodeDecisions looks like something in the explain path, yes. But the Decision returned from canAllocate for the chosen target node should have an explanation string. I recall the Multi type obfuscating some things, though.

A reviewer (Contributor):
Oh, maybe this is the problem:

> We lose that information when converting Decision into an AllocationDecision.

Ooph. Okay cool 👍

@nicktindall (Author):
Yeah, if allocation.debugDecisions() was turned on we'd preserve them in nodeResults, but it won't be for normal allocation:

```java
if (explain) {
    nodeResults.add(new NodeAllocationResult(currentNode.getRoutingNode().node(), allocationDecision, ++weightRanking));
}
```
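
For reference, the mechanism behind this, paraphrased from memory rather than quoted (so treat the exact shape as an assumption): RoutingAllocation#decision only attaches the label and explanation when debugging is enabled, otherwise it returns the bare decision.

```java
// Paraphrased from memory, not the exact Elasticsearch source: when debugDecision()
// is off, the label/explanation are dropped and only the bare decision is returned.
public Decision decision(Decision decision, String deciderLabel, String reason, Object... params) {
    if (debugDecision()) {
        return Decision.single(decision.type(), deciderLabel, reason, params);
    }
    return decision;
}
```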

@nicktindall removed the serverless-linked label Nov 14, 2025
@DiannaHohensee (Contributor) left a comment:

lgtm 👍

@nicktindall enabled auto-merge (squash) November 15, 2025 01:34
@nicktindall merged commit 93410cf into elastic:main Nov 15, 2025
34 checks passed
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Nov 16, 2025

* main: (135 commits)
  ...
  Log NOT_PREFERRED shard movements (elastic#138069)
  ...
@nicktindall deleted the log_not_preferred_movements branch November 16, 2025 22:06

Labels

:Distributed Coordination/Allocation · >non-issue · Team:Distributed Coordination · v9.3.0
