Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Index creation does not cause the cluster health to go RED #18737

Conversation

abeyad
Copy link

@abeyad abeyad commented Jun 4, 2016

Previously, index creation would momentarily cause the cluster health to
go RED, because the primaries were still being assigned and activated.
This commit ensures that when an index is created or an index is being
recovered during cluster recovery and it does not have any active
allocation ids, then the cluster health status will not go RED, but
instead be YELLOW.

Relates #9126

Closes #9106

@abeyad
Copy link
Author

abeyad commented Jun 4, 2016

@bleskes @ywelsch FYI, most of the PR is tests.

@abeyad abeyad force-pushed the index-creation-cluster-health branch from 623a4c7 to e982a0a Compare June 4, 2016 03:05
@djschny
Copy link
Contributor

djschny commented Jun 5, 2016

How will this affect client programmers. For example after creating an index, as a developer it was common to code in logic to check the cluster health and make sure it is not red prior to starting to index data. With this change it looks like there might be a false state of affairs as indexing could start even though the primaries have not been fully allocated yet, and hence indexing errors.

If I'm misinterpreting the implementation of the change, then my apologies. If not then I believe we need to make sure we have a wait_for_completion style option on the index creation request (I thought we had issue for this, by searching is failing me).

@abeyad
Copy link
Author

abeyad commented Jun 5, 2016

@djschny you are correct, the work is split into two parts that will go into a feature branch before going into master. The second part is to wait until at least the primaries are initialized before returning from the index creation call, unless a primary shard allocation actually failed. See the comment here: #9126 (comment)

@djschny
Copy link
Contributor

djschny commented Jun 5, 2016

Cool thanks @abeyad did not realize this was part of larger picture goal. All 👍 as my previous comments are addressed in the parent issue you referenced. Thanks.

@@ -59,6 +62,9 @@ public ClusterShardHealth(final int shardId, final IndexShardRoutingTable shardR
} else if (shardRouting.unassigned()) {
computeUnassignedShards++;
}
if (shardRouting.primary()) {
primaryRouting = shardRouting;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simpler just to write ShardRouting primaryRouting = shardRoutingTable.primaryShard(); which is guaranteed non-null.

Let's also remove computePrimaryActive and directly write primaryRouting.active() instead.

@ywelsch
Copy link
Contributor

ywelsch commented Jun 8, 2016

Left some comments. As @bleskes noted, we should ensure that cluster health goes red if we cannot allocate newly created/recovered index to any node. I would suggest adding a boolean to UnassignedInfo that captures the information if shard could not be assigned on first (or subsequent) tries due to allocation deciders saying NO.

@abeyad abeyad closed this Jun 8, 2016
@abeyad abeyad reopened this Jun 8, 2016
@bleskes
Copy link
Contributor

bleskes commented Jun 9, 2016

I think it will be clearer and easier to debug if we store the last allocation decision on the UnassignedInfo, instead of a boolean.

@abeyad abeyad force-pushed the index-creation-cluster-health branch from 62f81b5 to 1a480ae Compare June 15, 2016 05:01
@@ -116,6 +117,7 @@
private final String message;
private final Throwable failure;
private final int failedAllocations;
private final Optional<Decision> lastAllocationDecision; // the last allocation decision take for this shard
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are only interested in Decision.Type, so let's only store that one in UnassignedInfo. Putting the whole Decision object into the cluster state (with explanations and all) seems wasteful to me.

* NB: this method should *not* be called on active shards nor on non-primary shards.
*/
public static ClusterHealthStatus getInactivePrimaryHealth(final ShardRouting shardRouting, final IndexMetaData indexMetaData) {
assert shardRouting.primary() : "cannot invoke on a replica shard: " + shardRouting.shardId() + "[R]]";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for simplicity, just put shardRouting there instead of shardRouting.shardId() + "[R]]"

@abeyad
Copy link
Author

abeyad commented Jun 17, 2016

@ywelsch @bleskes aee01c4 addresses the latest review comments

return NO_ATTEMPT;
default:
throw new IllegalArgumentException("Unknown AllocationStatus value [" + v + "]");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for toXContent and toString(), we can also provide a method here that gives a nice description string .

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a precedent for this with other enums? it seems like in toXContent for the cluster state, we just use the enum's string value, e.g. in UnassignedInfo.Reason

@ywelsch
Copy link
Contributor

ywelsch commented Jun 17, 2016

Left a few very minor comments. LGTM otherwise. Thanks @abeyad

@abeyad abeyad force-pushed the index-creation-cluster-health branch from 8de6b41 to a6ee9ff Compare June 20, 2016 14:48
Ali Beyad added 3 commits June 20, 2016 10:53
Previously, index creation would momentarily cause the cluster health to
go RED, because the primaries were still being assigned and activated.
This commit ensures that when an index is created or an index is being
recovered during cluster recovery and it does not have any active
allocation ids, then the cluster health status will not go RED, but
instead be YELLOW.

Relates elastic#9126
If the allocation decision for a primary shard was NO, this should
cause the cluster health for the shard to go RED, even if the shard
belongs to a newly created index or is part of cluster recovery.

Relates elastic#9126
@abeyad abeyad force-pushed the index-creation-cluster-health branch from a6ee9ff to f61f2e7 Compare June 20, 2016 14:54
@abeyad abeyad merged commit 1c209d9 into elastic:feature/friendly-index-creation Jun 20, 2016
@abeyad
Copy link
Author

abeyad commented Jun 20, 2016

@ywelsch thank you for your review!

@clintongormley clintongormley added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. and removed :Cluster labels Feb 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. >enhancement resiliency v5.0.0-alpha4
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants