Skip zone/host awareness with auto-expand replicas #69334

DaveCTurner
Contributor

Today if an index is set to `auto_expand_replicas: N-all` then we will
try to create a shard copy on every node that matches the applicable
allocation filters. This conflicts with shard allocation awareness and
the same-host allocation decider if there is an uneven distribution of
nodes across zones or hosts, since these deciders prevent shard copies
from being allocated unevenly and may therefore leave some shards
unassigned.

The point of these two deciders is to improve resilience given a limited
number of shard copies, but there is no need for this behaviour when the
number of shard copies is not limited, so this commit suppresses them in
that case.

Closes #54151
Closes #2869
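
The conflict described above can be illustrated with a toy model. This is only a sketch, not the Elasticsearch implementation: `Node`, `awareness_allows`, `allocate` and the `suppress_for_expand_all` flag are hypothetical names, and the per-zone cap of `ceil(total_copies / number_of_zones)` is an assumed simplification of the awareness rule.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Node:
    name: str
    zone: str


def expands_to_all(auto_expand_replicas: str) -> bool:
    """True for settings of the form 'N-all', e.g. '0-all'."""
    return auto_expand_replicas.endswith("-all")


def awareness_allows(candidate, holders, nodes, total_copies, skip_check=False):
    """Toy awareness check: cap the copies per zone unless the check is skipped."""
    if skip_check:
        return True
    zones = {n.zone for n in nodes}
    max_per_zone = ceil(total_copies / len(zones))  # assumed simplification
    in_zone = sum(1 for n in holders if n.zone == candidate.zone)
    return in_zone + 1 <= max_per_zone


def allocate(nodes, auto_expand_replicas, suppress_for_expand_all=True):
    """Try to place one copy on every node; return (holders, unassigned count)."""
    total_copies = len(nodes)  # 'N-all' wants a copy on each node
    skip = suppress_for_expand_all and expands_to_all(auto_expand_replicas)
    holders = []
    for node in nodes:
        if awareness_allows(node, holders, nodes, total_copies, skip):
            holders.append(node)
    return holders, total_copies - len(holders)


# Uneven cluster: three nodes in zone "a", one node in zone "b".
cluster = [Node("a1", "a"), Node("a2", "a"), Node("a3", "a"), Node("b1", "b")]
```

In this model, calling `allocate(cluster, "0-all", suppress_for_expand_all=False)` leaves one copy unassigned because zone `a` is capped at two copies, whereas the default call places a copy on all four nodes.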

@DaveCTurner DaveCTurner added >enhancement :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) v8.0.0 v7.13.0 labels Feb 22, 2021
@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team label Feb 22, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner
Contributor Author

DaveCTurner commented Feb 22, 2021

I'm saying that this closes #2869 because I could not see anything in the comments or issues linked to that issue that would still be worth addressing after this change. In general I don't see a use for setting auto_expand_replicas to something other than 0-1 or 0-all on internal indices, depending on whether we want copies everywhere or just one replica, and these two cases should now work quite well. There may be other corner cases on user-controlled indices that can't be addressed by appropriate combinations of settings, but I'd prefer to hear about them in new issues rather than add to that rather long and complicated story.
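
For readers unfamiliar with the two settings mentioned here, the following sketch shows roughly how a `MIN-MAX` value of `auto_expand_replicas` could translate into a replica count. The function names are hypothetical and the clamping rule is an assumption made for illustration; the real implementation may treat edge cases differently.

```python
def parse_auto_expand(setting: str):
    """Split 'MIN-MAX' into (min, max); max is None when it is 'all'."""
    low, high = setting.split("-", 1)
    return int(low), None if high == "all" else int(high)


def replica_count(setting: str, data_nodes: int) -> int:
    """Assumed rule: one copy per data node, clamped to the [min, max] range."""
    low, high = parse_auto_expand(setting)
    wanted = data_nodes - 1  # every node holds a copy; one of them is the primary
    if high is not None:
        wanted = min(wanted, high)
    return max(wanted, low)
```

Under these assumptions `0-1` yields at most one replica however large the cluster is, while `0-all` grows with the number of data nodes.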

Contributor

@ywelsch ywelsch left a comment

LGTM (one Q)

and <<allocation-total-shards,total shards per node>>, and this can lead to the
cluster health becoming `YELLOW` if the applicable rules prevent all the replicas
from being allocated.
Auto-expand the number of replicas based on the number of data nodes in the
Contributor

Why remove the indentation? Does this still display properly?

Contributor Author

Yep, see https://elasticsearch_69334.docs-preview.app.elstc.co/guide/en/elasticsearch/reference/master/index-modules.html#dynamic-index-settings

I don't know if it's possible to add paragraph breaks (the + on its own line) to one of these blocks of text otherwise.

@DaveCTurner DaveCTurner merged commit bb3ea99 into elastic:master Feb 22, 2021
@DaveCTurner DaveCTurner deleted the 2021-02-22-auto-expand-replicas-vs-allocation-awareness branch February 22, 2021 16:54
DaveCTurner added a commit that referenced this pull request Feb 22, 2021
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Feb 22, 2021
Today we count `null` (i.e. missing) as a valid attribute value in
allocation awareness, even though allocation awareness forbids the
allocation of shards to such a node. Prior to elastic#69334 this didn't matter
because a data node without allocation attributes was pointless.

However, elastic#69334 means we now can allocate shards to such a node: for
instance, there is no need for nodes holding only enrich indices to have
allocation attributes. Therefore we should stop counting `null` as one
of the attribute values.
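
The effect of counting `null` can be seen in a small sketch. This is a toy model rather than the actual decider: the function names are hypothetical, and the cap of `ceil(total_copies / number_of_values)` copies per attribute value is an assumed simplification.

```python
from math import ceil


def attribute_values(node_zones, count_missing):
    """Distinct awareness attribute values; None stands for a missing attribute."""
    values = set(node_zones)
    if not count_missing:
        values.discard(None)
    return values


def max_copies_per_value(total_copies, node_zones, count_missing):
    """Assumed cap on copies per attribute value in this toy model."""
    values = attribute_values(node_zones, count_missing)
    if not values:
        return total_copies  # no attribute values present, so nothing to balance
    return ceil(total_copies / len(values))


# Two nodes in zone "a", one in zone "b", one node without the attribute.
zones = ["a", "a", "b", None]
```

With three copies to place, counting the missing value gives a cap of one copy per zone, so only two copies fit on the nodes that have the attribute. Ignoring the missing value raises the cap to two, and all three copies fit.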
DaveCTurner added a commit that referenced this pull request Feb 23, 2021
DaveCTurner added a commit that referenced this pull request Feb 23, 2021
easyice pushed a commit to easyice/elasticsearch that referenced this pull request Mar 25, 2021
easyice pushed a commit to easyice/elasticsearch that referenced this pull request Mar 25, 2021
Labels
:Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >enhancement Team:Distributed Meta label for distributed team v7.13.0 v8.0.0-alpha1
4 participants