Index creation causes cluster health to turn red momentarily #9106

ppf2 · 2014-12-30T17:54:37Z

It is not uncommon for admins in the field to set up alerts against the cluster health (red/yellow/green). Currently, index creation can cause the cluster health to go red momentarily until the new index's primary shards are allocated (expected). It would be a nice enhancement to have a way to create an index without causing the cluster health to go red (even for a short subsecond duration).

jpountz · 2015-01-09T09:31:49Z

Could this be related to #9126? If the index creation API starts waiting for yellow by default then maybe the health status could only take into account the newly created index once the index creation request terminates (including timeouts)?
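For context, the cluster health API accepts `wait_for_status` and `timeout` parameters, so a client can already block on the health of the index it just created. A minimal sketch that only builds the request URL; the helper name `health_wait_url`, the base URL, and the index name are illustrative assumptions, not something taken from this thread:

```python
from urllib.parse import quote, urlencode


def health_wait_url(base_url, index, status="yellow", timeout="30s"):
    """Build a cluster health URL that blocks until `index` reaches `status`.

    `wait_for_status` and `timeout` are cluster health API parameters;
    the base URL and index name are placeholders supplied by the caller.
    """
    query = urlencode({"wait_for_status": status, "timeout": timeout})
    return f"{base_url.rstrip('/')}/_cluster/health/{quote(index, safe='')}?{query}"


# Hypothetical node address and index name, for illustration only.
print(health_wait_url("http://localhost:9200", "logstash-2015.10.19"))
```

This only helps the client that creates the index. An independent monitor polling the cluster-wide health would still see the momentary red status.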

barakcoh · 2015-08-20T09:20:15Z

+1

Marvel is causing us a sub-second red status every day at midnight and it's quite annoying to constantly see it in the Shard Allocation section. We also have the above alert in place. If it happens to query the cluster at that exact time, people will get a Twilio call in the middle of the night, which is less than ideal.

jhansen-tt · 2015-10-05T15:24:41Z

+1
What can I do to help fix this?

ppf2 · 2015-10-19T22:44:48Z

+1. Use case: indexing a ton of data into hourly Logstash indices and seeing red every hour.

mikemccand · 2015-10-20T17:34:06Z

+1, the cluster should never go red unless data loss has occurred ... this is a nasty bug in our cluster health.

It's like the smoke alarms that go off in my house when it's too dusty or we are cooking something "unusual".

#9126 seems very much related.

bleskes · 2015-10-20T18:27:30Z

@mikemccand it is related, though slightly different. Even if we hold back the create index response until the index is green/yellow etc., an independent monitor of cluster health will still report its status.

I'm +100 on solving this but I couldn't come up, to date, with a proper solution. When we create an index we add unassigned primaries + replicas to the routing table. We try to assign the primaries immediately (which may fail because of throttling) and publish the cluster state to the nodes for the primaries to initialize. Here lies the problem: a cluster state with initializing primaries is technically red. Only once the shards are started do we move to yellow. One could say that cluster health should ignore initializing/unassigned shards which are guaranteed to not contain data, but then what happens when those primaries cannot be assigned (because of allocation filtering or whatever)? We should still communicate that somehow, as the situation is wrong. I'd love to hear an elegant suggestion here...
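The rule described above can be illustrated with a small model. This is a simplified sketch of the behavior as stated in this comment, not Elasticsearch's actual implementation; `ShardState`, `index_health`, and `cluster_health` are hypothetical names:

```python
from enum import Enum


class ShardState(Enum):
    UNASSIGNED = "unassigned"
    INITIALIZING = "initializing"
    STARTED = "started"


def index_health(shards):
    """Health of one index, given a list of (is_primary, ShardState) pairs."""
    # Any primary that is not started makes the index red, even if the
    # shard is brand new and cannot contain data yet.
    if not all(state is ShardState.STARTED for is_primary, state in shards if is_primary):
        return "red"
    # All primaries started: green only if every replica is started too.
    if all(state is ShardState.STARTED for _, state in shards):
        return "green"
    return "yellow"


def cluster_health(indices):
    """Cluster health is the worst health among all indices."""
    severity = {"green": 0, "yellow": 1, "red": 2}
    return max((index_health(s) for s in indices.values()), key=severity.get, default="green")


indices = {
    "existing": [(True, ShardState.STARTED), (False, ShardState.STARTED)],
    # A freshly created index: primary still initializing, replica unassigned.
    "new": [(True, ShardState.INITIALIZING), (False, ShardState.UNASSIGNED)],
}
print(cluster_health(indices))  # red

# Once the primary has started, the cluster moves to yellow.
indices["new"] = [(True, ShardState.STARTED), (False, ShardState.UNASSIGNED)]
print(cluster_health(indices))  # yellow
```

In this model a single new index is enough to turn an otherwise green cluster red until its primary starts, which is the momentary flap reported in this issue.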

nik9000 · 2015-10-20T19:06:14Z

I had this trouble too: every time I did an online mapping change I had to rebuild the index and stream one index to another, and I had 1600 indexes to do. Icinga generally thought Elasticsearch was flapping at that time because it was.

Maybe ignore indexes less than 60 seconds old in overall cluster state. The index itself should be red, but maybe not the whole cluster.
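A minimal sketch of that grace-period idea, assuming per-index health and creation time are already known. The function name, the input shape, and the 60-second default are illustrative assumptions, not an existing Elasticsearch API:

```python
def cluster_health_with_grace(indices, now, grace_seconds=60):
    """Worst health across indices, skipping indices younger than the grace period.

    `indices` maps index name -> (health, created_at), where health is
    "green", "yellow" or "red" and created_at is in epoch seconds.
    """
    severity = {"green": 0, "yellow": 1, "red": 2}
    worst = "green"
    for health, created_at in indices.values():
        if now - created_at < grace_seconds:
            # Too young to count towards the cluster-wide status; the
            # index itself would still be reported as red on its own.
            continue
        if severity[health] > severity[worst]:
            worst = health
    return worst


indices = {"old": ("green", 0), "new": ("red", 1000)}
print(cluster_health_with_grace(indices, now=1010))  # green: "new" is 10s old
print(cluster_health_with_grace(indices, now=1061))  # red: "new" is 61s old
```

An index whose primaries still cannot be assigned after the grace period does turn the cluster red, so a genuinely stuck allocation is reported late rather than hidden.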

Any solution to this is going to break a whole lot of tests somewhere but is probably worth it.

felipegs · 2015-11-24T19:50:42Z

+1

bashok001 · 2015-12-14T20:49:58Z

+1 Very much needed. Our alarms are going off every couple of days. I worry that continuing the practice of waiting it out will cost us dearly one day when there is a real problem.

jeffkirk1 · 2016-03-22T18:02:31Z

+1 I'm experiencing this issue daily as well, coinciding with Marvel index reloads. Elasticsearch 2.2.0. I've temporarily disabled Marvel refreshes to compensate, but obviously that's not a great long-term solution.

majormoses · 2016-04-15T01:54:14Z

@nik9000 wrote: "Maybe ignore indexes less than 60 seconds old in overall cluster state. The index itself should be red, but maybe not the whole cluster."

This makes sense to me.

clintongormley · 2016-06-27T13:36:24Z

Fixed by #18737

clintongormley added the discuss label Dec 31, 2014

clintongormley added the :Data Management/Stats label Aug 24, 2015

clintongormley added >enhancement :Cluster and removed :Data Management/Stats labels Dec 3, 2015

abeyad mentioned this issue Jun 27, 2016

Index creation does not cause the cluster health to go RED #18737

Merged

clintongormley closed this as completed Jun 27, 2016

clintongormley added :Distributed/Distributed and removed :Cluster labels Feb 13, 2018
