Make index creation more user-friendly #9126

s1monw · 2015-01-02T14:26:27Z

Today, when we create an index, we return immediately after running sanity checks and adding the metadata to the cluster state; we don't wait for any shard allocation. As a result, an index can be created with more replicas than there are nodes in the cluster, and once it is closed it can't be reopened, because reopening an index requires a quorum of the copies of each shard. If such an index is reopened, the shards for which no quorum of copies is found in the cluster will simply not be allocated at all.

Unfortunately, not even waiting for yellow helps here, since yellow only means that the primary has been allocated, which might not be enough when #replicas > 1.

There are a couple of things we can do here to improve the situation:

Add another wait to the cluster health to wait for quorum
By default wait for quorum for the index when an index is created
By default reject closing an index if less than the quorum of shards is allocated
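
To make the replicas-versus-nodes problem concrete, here is a minimal sketch. It assumes that "quorum" means a simple majority of all copies of a shard (one primary plus its replicas) and that a node holds at most one copy of a given shard. The function names are hypothetical and are not part of any Elasticsearch API.

```python
def quorum(replicas: int) -> int:
    """Majority of all shard copies (1 primary + replicas). Assumed definition."""
    copies = replicas + 1
    return copies // 2 + 1


def can_reach_quorum(nodes: int, replicas: int) -> bool:
    """Assumes at most one copy of a shard per node, so only
    min(nodes, copies) copies can ever be allocated."""
    copies = replicas + 1
    allocatable = min(nodes, copies)
    return allocatable >= quorum(replicas)


# One node, three replicas: 4 copies, quorum is 3, only 1 copy can be allocated.
print(can_reach_quorum(nodes=1, replicas=3))  # False
# Three nodes, three replicas: 3 copies can be allocated, which meets the quorum.
print(can_reach_quorum(nodes=3, replicas=3))  # True
```

Under these assumptions, the first index could be created and closed without complaint, but a later reopen would never find enough copies.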

jpountz · 2015-01-02T14:31:20Z

+1 This will especially make integration testing less trappy!

bsandvik · 2015-01-20T22:55:10Z

+1 this would help us out a lot

Mpdreamz · 2015-07-07T13:03:18Z

+1

synhershko · 2015-07-07T13:08:24Z

Reject sealing an index if no quorum, too?

nik9000 · 2015-07-07T13:43:16Z

Reject sealing an index if no quorum, too?

I don't think that's a good idea. The worst thing that happens if you do a synced_flush when one of the shards isn't around is that it doesn't affect that shard, meaning that shard can't be restored quickly. There isn't really anything that can be done to get that shard to restore quickly regardless, because it's already offline. I wouldn't want to prevent speeding up recovery of the other copies of the shard.

And I think it's safe because if, by some nasty turn of events, one of the down shards ends up coming back and becoming the master shard, then the synced flush won't have any effect because it won't be on the master.

s1monw · 2015-07-09T09:28:22Z

@kimchy promised to work on this today

clintongormley · 2015-08-24T10:29:01Z

This wait-for-quorum should be extended to the open-index API, e.g. see #12987

ywelsch · 2016-02-14T11:27:55Z

@s1monw In v3.0 we allocate primary shards based on allocation IDs (#14739). This means that reopening an index only requires one good copy (not a quorum anymore). With allocation IDs, is this now a usability issue (and no longer resiliency-related)?

#14252 , #7572 , #15900, #12573, #14671, #15281 and #9126 have all been closed/merged and will be part of 5.0.0.

bleskes · 2016-05-12T13:59:19Z

@clintongormley and I discussed this again and came up with a plan.

There are two issues still left with index creation. The first is that creating an index moves the cluster health status to RED even if everything is OK. The second is that when an index creation call returns successfully, there is no guarantee that a follow-up indexing operation will not have to wait (maybe not so bad), nor that all operations on that index (for example _analyze, which needs a shard copy) will succeed.

Index creation puts status to RED:
Currently RED means that a shard has a non-active primary. The idea is to change the semantics to exclude primary shards that were never successfully assigned and also didn't experience any shard failure during assignment. If the allocation deciders block the allocation of a primary (not merely throttle it), we will treat that as a failure and make the shard RED as well. In other cases the shard is YELLOW.
Index creation should wait for enough shard copies to reach started:
The index creation call should add the index to the cluster metadata and wait for enough shard copies (typically only primaries, but this should be based on action.write_consistency) to be started. It will return immediately if the status of one of those shards becomes RED (allocation failure, or it can't be assigned to any node), reporting the failure.
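
The proposed RED/YELLOW rule can be sketched as a small decision function. This is only one reading of the plan above, not the actual Elasticsearch implementation. The `ShardState` fields and `shard_health` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Health(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


@dataclass
class ShardState:
    """Hypothetical summary of one shard's allocation state."""
    primary_active: bool
    all_replicas_active: bool
    ever_assigned: bool       # primary was successfully assigned at some point
    assignment_failed: bool   # a shard failure occurred during assignment
    deciders_say_no: bool     # allocation deciders block (not throttle) the primary


def shard_health(s: ShardState) -> Health:
    if s.primary_active:
        return Health.GREEN if s.all_replicas_active else Health.YELLOW
    # Primary is not active: RED only if it was assigned before,
    # failed during assignment, or is blocked by the deciders.
    if s.ever_assigned or s.assignment_failed or s.deciders_say_no:
        return Health.RED
    # Never assigned and nothing has gone wrong yet (e.g. a brand-new index).
    return Health.YELLOW


fresh = ShardState(False, False, False, False, False)
blocked = ShardState(False, False, False, False, True)
print(shard_health(fresh).value)    # yellow
print(shard_health(blocked).value)  # red
```

The point of the distinction is that a freshly created index, whose primaries are simply still being assigned, no longer looks like a cluster in trouble.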

nik9000 · 2016-05-19T14:30:50Z

So now that we're testing snippets in the documentation this user-unfriendliness is leaking into the documentation. Which makes the issue pretty obvious. So I'd be pretty excited to have this fixed/make time to do it myself.

Previously, index creation would momentarily cause the cluster health to go RED, because the primaries were still being assigned and activated. This commit ensures that when an index is created or an index is being recovered during cluster recovery and it does not have any active allocation ids, then the cluster health status will not go RED, but instead be YELLOW. Relates elastic#9126

If the allocation decision for a primary shard was NO, this should cause the cluster health for the shard to go RED, even if the shard belongs to a newly created index or is part of cluster recovery. Relates elastic#9126

Before returning, index creation now waits for the configured number of shard copies to be started. In the past, a client would create an index and then potentially have to check the cluster health to wait to execute write operations. With the cluster health semantics changing so that index creation does not cause the cluster health to go RED, this change enables waiting for the desired number of active shards to be active before returning from index creation. Relates elastic#9126
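
As a rough illustration of the waiting behaviour described above, the following sketch polls until enough shard copies are started, gives up on timeout, and fails fast if a shard goes RED. It is a client-side analogy only. `wait_for_active_shards`, the `poll` callback and its return shape are assumptions made for this example and do not reflect the real server-side code.

```python
import time
from typing import Callable, Tuple


def wait_for_active_shards(
    poll: Callable[[], Tuple[int, bool]],
    required: int,
    timeout_s: float = 30.0,
    interval_s: float = 0.1,
) -> bool:
    """Wait until `required` shard copies are started.

    `poll` is a hypothetical callback returning (active_copies, any_red).
    Returns True once enough copies are active, False on timeout, and
    raises immediately if any of the awaited shards is RED.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        active, any_red = poll()
        if any_red:
            raise RuntimeError("shard allocation failed")
        if active >= required:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Simulated cluster: the primary becomes active on the second poll.
states = iter([(0, False), (1, False)])
print(wait_for_active_shards(lambda: next(states), required=1, interval_s=0))  # True
```

Returning a boolean on timeout rather than raising keeps "not ready yet" distinct from "allocation failed", which mirrors the early-return-on-RED idea in the plan.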

abeyad · 2016-07-15T15:48:37Z

Closed by #19450

robinst · 2016-11-23T05:58:21Z

Just to check, this was released with 5.0.0, right?

ywelsch · 2016-11-23T08:25:37Z

Yes, the version label is on the linked PR #19450 that was used to close this issue.

s1monw added v2.0.0-beta1 help wanted adoptme resiliency :Core/Infra/Core Core issues without another label labels Jan 2, 2015

jpountz mentioned this issue Jan 9, 2015

Index creation causes cluster health to turn red momentarily #9106

Closed

s1monw mentioned this issue Apr 8, 2015

Automatically wait for green / yellow if possible on index creation #10473

Closed

clintongormley mentioned this issue Apr 25, 2015

Indices exists returns false during recovery #8105

Closed

clintongormley added the >enhancement label Jun 8, 2015

clintongormley added v2.0.0 v2.1.0 and removed v2.0.0-beta1 v2.0.0 labels Aug 13, 2015

clintongormley mentioned this issue Aug 24, 2015

Opening an index with a missing analyzer stopword file result in a node unavailability #12987

Closed

Mpdreamz mentioned this issue Oct 17, 2015

Running Count on a newly created index throws an exception with "Could not parse server exception" elastic/elasticsearch-net#1596

Closed

clintongormley added v2.2.0 and removed v2.1.0 labels Nov 20, 2015

spinscale added v2.3.0 and removed v2.2.0 labels Dec 23, 2015

ywelsch mentioned this issue Feb 14, 2016

Updates to resiliency documentation #16658

Merged

clintongormley added v2.4.0 and removed v2.3.0 labels Mar 16, 2016

bleskes added a commit that referenced this issue Apr 7, 2016

Update resliency page

557a3d1

bleskes mentioned this issue Apr 7, 2016

Update resliency page #17586

Merged

bleskes added a commit that referenced this issue Apr 7, 2016

Update resiliency page (#17586)

8eee28e

ywelsch mentioned this issue May 12, 2016

Add wait_for_health=yellow to reindex snippets #18295

Merged

nik9000 mentioned this issue May 13, 2016

Test docs for plugins #18337

Merged

bleskes removed the v2.4.0 label May 19, 2016

abeyad self-assigned this May 20, 2016

nik9000 mentioned this issue Jun 3, 2016

[TEST] wait for yellow after setup doc tests #18726

Merged

This was referenced Jun 5, 2016

Index creation does not cause the cluster health to go RED #18737

Merged

WIP: Index creation waits for write consistency shards #18759

Closed

abeyad mentioned this issue Jul 15, 2016

Makes index creation more friendly #19450

Merged

abeyad closed this as completed in #19450 Jul 15, 2016
