Auto-expand-replicas causes problems in a small cluster rolling restart #95104

DaveCTurner · 2023-04-08T08:35:43Z

Many indices (particularly system indices) use auto_expand_replicas: 0-1 to avoid unassigned shards in a one-node cluster. However, when restarting a node in a cluster with just two nodes in a data tier this setting causes undesirable behaviour:

- the cluster may remain in green health even though really it has unassigned shards and is running with less resilience than intended
- replicas on the restarting node are effectively destroyed (at least, their retention leases are dropped) which forces a full file-based recovery
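
To illustrate why health can stay green, here is a minimal sketch of how an auto-expand range maps to a replica count. It is a simplified model rather than Elasticsearch's actual implementation: the function name `auto_expanded_replicas` and the plain clamping rule are assumptions made for illustration.

```python
def auto_expanded_replicas(setting: str, eligible_nodes: int) -> int:
    """Simplified model of an auto_expand_replicas range such as "0-1" or "0-all"."""
    low_text, high_text = setting.split("-", 1)
    low = int(low_text)
    # "all" is modelled as "one replica on every other eligible node".
    high = eligible_nodes - 1 if high_text == "all" else int(high_text)
    # One copy is the primary, so at most eligible_nodes - 1 replicas can be placed.
    return max(low, min(high, eligible_nodes - 1))


# Two-node tier, then the same tier while one node is restarting.
for nodes in (2, 1):
    print(nodes, "node(s) ->", auto_expanded_replicas("0-1", nodes), "replica(s)")
```

In this model, the replica count for `0-1` drops from 1 to 0 as soon as the tier has only one node, so no replica is expected and none is reported as unassigned.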

There's room for improvement here. Could we use the node-shutdown or desired-nodes features to report accurate health, and avoid file-based recovery, for auto-expand replicas indices while a node is restarting?

Workaround

If you have a small cluster but never intend to shrink its hot/warm/content tiers to a single node, set number_of_replicas: 1 instead of using auto_expand_replicas.
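
As a rough sketch of applying this workaround, the snippet below builds (but does not send) an update-index-settings request that disables auto-expansion and pins one replica. The host `http://localhost:9200`, the index name `my-index`, and the helper names are placeholders assumed for illustration. Whether a particular system index accepts this change is not covered here.

```python
import json
import urllib.request


def fixed_replica_settings() -> dict:
    # Disable auto-expansion and request exactly one replica instead.
    return {"index": {"auto_expand_replicas": "false", "number_of_replicas": 1}}


def build_settings_request(base_url: str, index: str) -> urllib.request.Request:
    # Prepare a PUT to the index settings endpoint; nothing is sent here.
    body = json.dumps(fixed_replica_settings()).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical host and index name, used only to show the request shape.
request = build_settings_request("http://localhost:9200", "my-index")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen(request)`) is left out so the sketch can be inspected without a running cluster.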


elasticsearchmachine · 2023-04-08T08:36:12Z

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner added the >bug and :Distributed/Allocation labels Apr 8, 2023

elasticsearchmachine added the Team:Distributed label Apr 8, 2023

This comment was marked as off-topic.
