Clarify the difference between cluster.routing.allocation.total_shards_per_node
and cluster.max_shards_per_node
#51839
Labels: :Core/Infra/Core, >docs, Team:Core/Infra, Team:Docs
I've seen some confusion about the differences between `cluster.routing.allocation.total_shards_per_node` and `cluster.max_shards_per_node`, which is understandable given the similarity of their names and the brevity of their documentation. I think the docs could be clarified fairly easily with just a few additions for each setting. Below are some notes I wrote up elsewhere about the difference between these two settings; they should probably be condensed before being included in the docs rather than pasted in directly.

`cluster.routing.allocation.total_shards_per_node` is checked at allocation time and controls how many shards may be allocated to any given node. To give an example: if this setting is 100 and there are 3 data nodes, A with 100 shards, B with 98 shards, and C with 1 shard, and node C dies, its shard can only be reallocated to node B, because node A already has 100 shards.

`cluster.max_shards_per_node` controls how many shards are allowed to exist in the cluster as a whole. It is checked at shard creation time and pays no attention to how many shards any individual node has. To give an example: if this setting is 100 and there are 3 data nodes, A with 100 shards, B with 98 shards, and C with 98 shards, there are 296 shards total in the cluster. The limit is `number_of_data_nodes * cluster.max_shards_per_node = 3 * 100 = 300`, so up to 4 new shards can be created in the cluster. If you try to create an index with 3 shards and 1 replica, that would create 6 new shards (3 primaries and 3 replicas), which would take the cluster over the limit of 300, and the request will be rejected.

To summarize:
`cluster.routing.allocation.total_shards_per_node` controls how many shards each individual node can have allocated to it at any given time, while `cluster.max_shards_per_node` controls how many shards the cluster can hold as a whole, scaled by the number of data nodes, regardless of how many shards any particular node has allocated to it.

Some notes:
- Hitting the `cluster.max_shards_per_node` limit causes requests that would create more shards to be rejected. This is to prevent the most extreme cases of oversharding.
- `cluster.max_shards_per_node` counts both primary and replica shards for open indices. Closed indices don't count, so if you see a cluster hitting this limit, closing some indices will help. Frozen indices count as open at the time of writing, although if we get feedback that counting them as closed would be useful, that's open to change.
- `cluster.max_shards_per_node` uses the number of data nodes to determine the limit. Non-data nodes (dedicated master, ingest-only, etc.) don't count. A cluster with no data nodes has no limit; apparently some users set up master nodes first, create indices and such, and then add data nodes later, and this enables that use case.
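To make the two checks concrete, here is a small Python sketch of the bookkeeping described above. The function names are mine, and this is illustrative logic, not Elasticsearch's actual implementation:

```python
def allocation_targets(shards_per_node, total_shards_per_node):
    """Allocation-time check (cluster.routing.allocation.total_shards_per_node):
    which nodes can still accept one more shard?"""
    return [node for node, count in shards_per_node.items()
            if count < total_shards_per_node]

def creation_allowed(shards_per_data_node, max_shards_per_node, new_shards):
    """Creation-time check (cluster.max_shards_per_node): would creating
    new_shards more shards exceed number_of_data_nodes * max_shards_per_node?
    Only shards on data nodes are counted."""
    if not shards_per_data_node:
        return True  # no data nodes: no limit is enforced
    limit = len(shards_per_data_node) * max_shards_per_node
    return sum(shards_per_data_node) + new_shards <= limit

# First example: node C died; its shard can only go to B, since A is at 100.
print(allocation_targets({"A": 100, "B": 98}, 100))   # ['B']

# Second example: 100 + 98 + 98 = 296 shards, limit 3 * 100 = 300.
print(creation_allowed([100, 98, 98], 100, 4))        # True: 4 new shards fit
print(creation_allowed([100, 98, 98], 100, 6))        # False: 3 primaries + 3 replicas
```

Note how the first check looks at each node individually while the second only looks at cluster-wide totals, which is exactly the distinction the two settings' docs should draw.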