-
Notifications
You must be signed in to change notification settings - Fork 24.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Index creation does not cause the cluster health to go RED #18737
Index creation does not cause the cluster health to go RED #18737
Conversation
623a4c7
to
e982a0a
Compare
How will this affect client programmers. For example after creating an index, as a developer it was common to code in logic to check the cluster health and make sure it is not If I'm misinterpreting the implementation of the change, then my apologies. If not then I believe we need to make sure we have a |
@djschny you are correct, the work is split into two parts that will go into a feature branch before going into master. The second part is to wait until at least the primaries are initialized before returning from the index creation call, unless a primary shard allocation actually failed. See the comment here: #9126 (comment) |
Cool thanks @abeyad did not realize this was part of larger picture goal. All 👍 as my previous comments are addressed in the parent issue you referenced. Thanks. |
@@ -59,6 +62,9 @@ public ClusterShardHealth(final int shardId, final IndexShardRoutingTable shardR | |||
} else if (shardRouting.unassigned()) { | |||
computeUnassignedShards++; | |||
} | |||
if (shardRouting.primary()) { | |||
primaryRouting = shardRouting; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simpler just to write ShardRouting primaryRouting = shardRoutingTable.primaryShard();
which is guaranteed non-null.
Let's also remove computePrimaryActive
and directly write primaryRouting.active()
instead.
Left some comments. As @bleskes noted, we should ensure that cluster health goes red if we cannot allocate newly created/recovered index to any node. I would suggest adding a boolean to |
I think it will be clearer and easier to debug if we store the last allocation decision on the UnassignedInfo, instead of a boolean. |
62f81b5
to
1a480ae
Compare
@@ -116,6 +117,7 @@ | |||
private final String message; | |||
private final Throwable failure; | |||
private final int failedAllocations; | |||
private final Optional<Decision> lastAllocationDecision; // the last allocation decision take for this shard |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are only interested in Decision.Type, so let's only store that one in UnassignedInfo. Putting the whole Decision object into the cluster state (with explanations and all) seems wasteful to me.
* NB: this method should *not* be called on active shards nor on non-primary shards. | ||
*/ | ||
public static ClusterHealthStatus getInactivePrimaryHealth(final ShardRouting shardRouting, final IndexMetaData indexMetaData) { | ||
assert shardRouting.primary() : "cannot invoke on a replica shard: " + shardRouting.shardId() + "[R]]"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for simplicity, just put shardRouting
there instead of shardRouting.shardId() + "[R]]"
return NO_ATTEMPT; | ||
default: | ||
throw new IllegalArgumentException("Unknown AllocationStatus value [" + v + "]"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for toXContent
and toString()
, we can also provide a method here that gives a nice description string .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have a precedent for this with other enums? it seems like in toXContent
for the cluster state, we just use the enum's string value, e.g. in UnassignedInfo.Reason
Left a few very minor comments. LGTM otherwise. Thanks @abeyad |
8de6b41
to
a6ee9ff
Compare
Previously, index creation would momentarily cause the cluster health to go RED, because the primaries were still being assigned and activated. This commit ensures that when an index is created or an index is being recovered during cluster recovery and it does not have any active allocation ids, then the cluster health status will not go RED, but instead be YELLOW. Relates elastic#9126
If the allocation decision for a primary shard was NO, this should cause the cluster health for the shard to go RED, even if the shard belongs to a newly created index or is part of cluster recovery. Relates elastic#9126
a6ee9ff
to
f61f2e7
Compare
@ywelsch thank you for your review! |
Previously, index creation would momentarily cause the cluster health to
go RED, because the primaries were still being assigned and activated.
This commit ensures that when an index is created or an index is being
recovered during cluster recovery and it does not have any active
allocation ids, then the cluster health status will not go RED, but
instead be YELLOW.
Relates #9126
Closes #9106