Disk decider prevents allocation/fast recovery #56578

Open
henningandersen opened this issue May 12, 2020 · 2 comments
Labels
>bug, :Distributed/Allocation, Team:Distributed

Comments

@henningandersen (Contributor) commented May 12, 2020

If a node is above the low disk watermark when it is restarted (rolling restart, network disruption, or crash), the disk threshold decider prevents reusing the shard content on the restarted node.

This seems unfortunate, in particular in the good case where we can do a no-op recovery or an operations-based recovery with only a few operations (since that disk usage is already accounted for).
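
While a shard is held back this way, the cluster allocation explain API should report the disk threshold decider as the blocking decider. A minimal way to check, assuming a single-shard index named my-index (the index name is just an example):

    # Ask the master to explain why this shard is not allocated where expected
    curl -s -XGET 'localhost:9200/_cluster/allocation/explain?pretty' \
      -H 'Content-Type: application/json' -d '
    {
      "index": "my-index",
      "shard": 0,
      "primary": true
    }'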

Notice that being above the low watermark is not a bad state for a node in itself. The cluster may have plenty of space available, but an imbalance with respect to disk usage can still push some nodes above the low watermark. We only start moving shards off a node when the high watermark is reached.
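
For reference, the thresholds involved are the standard disk watermark settings. The values below are the documented defaults, shown here as a dynamic settings update:

    # Low watermark gates allocation of new shards to a node;
    # high watermark triggers moving shards off the node
    curl -s -XPUT 'localhost:9200/_cluster/settings' \
      -H 'Content-Type: application/json' -d '
    {
      "persistent": {
        "cluster.routing.allocation.disk.watermark.low": "85%",
        "cluster.routing.allocation.disk.watermark.high": "90%"
      }
    }'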

The test case here demonstrates this. It also demonstrates that even having other nodes with enough space for the shard does not help, because delayed allocation holds the shard back.
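
(Delayed allocation here is the standard per-index setting that postpones reassigning shards after a node leaves the cluster; the index name below is illustrative and the value shown is the default:)

    # Shards of a departed node stay unassigned for this long before reallocation starts
    curl -s -XPUT 'localhost:9200/my-index/_settings' \
      -H 'Content-Type: application/json' -d '
    {
      "settings": {
        "index.unassigned.node_left.delayed_timeout": "1m"
      }
    }'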

This might partially repair itself (not demonstrated): once enough shards have been recovered elsewhere, the node could drop below the low watermark, making the rest of its shard contents available for faster recoveries.

@henningandersen added the >bug, :Distributed/Allocation, and team-discuss labels May 12, 2020
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed (:Distributed/Allocation)

@elasticmachine added the Team:Distributed label May 12, 2020
@henningandersen (Contributor, Author)

We discussed this at our weekly sync and, while we did not reach a conclusion, we covered the following:

  • There was agreement that this is a problem today.
  • The watermarks may exist primarily to provide hysteresis when moving shards off nodes.
  • We may want to consider adding a test framework for clusters in high disk usage situations (see the sketch below).
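
As a minimal sketch of what such a framework could assert on, per-node disk usage relative to the watermarks is visible through the cat allocation API (an illustration, not an agreed design):

    # Show disk usage per node; a test could drive nodes above the 85%/90%
    # watermarks and assert on these values
    curl -s 'localhost:9200/_cat/allocation?v&h=node,disk.percent,disk.used,disk.avail'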

Leaf-Lin added a commit that referenced this issue Aug 2, 2022
As per #49972 and #56578, if a node is above low disk threshold when being restarted (rolling restart, network disruption or crash), the disk threshold decider prevents reusing the shard content on the restarted node.

The consequence of the event is the node may take a long time to start.
lockewritesdocs pushed a commit that referenced this issue Aug 29, 2022
…89018)

* Create restart-cluster.asciidoc

As per #49972 and #56578, if a node is above low disk threshold when being restarted (rolling restart, network disruption or crash), the disk threshold decider prevents reusing the shard content on the restarted node.

The consequence of the event is the node may take a long time to start.

* Update docs/reference/setup/restart-cluster.asciidoc

LGTM! Thanks!

Co-authored-by: Adam Locke <adam.locke@elastic.co>

Leaf-Lin added a commit to Leaf-Lin/elasticsearch that referenced this issue Aug 29, 2022: …lastic#89018 (same commit message as above)

elasticsearchmachine pushed a commit that referenced this issue Aug 29, 2022: …89018) (#89702) (same commit message as above)

albertzaharovits pushed a commit to albertzaharovits/elasticsearch that referenced this issue Aug 31, 2022: …lastic#89018 (same commit message as above)