A cluster with no data nodes or indices reports YELLOW health when an index is created. #41073

DaveCTurner · 2019-04-10T12:27:49Z

If you start up a new cluster without any data nodes then it will report green health, because there are no unassigned shards. If you subsequently create an index it will report yellow health because there are unassigned newly-created primaries. However we never attempt to assign any shards if there are no data nodes so the health will continue to report yellow and will never move to red:

elasticsearch/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java

Lines 117 to 120 in abd861b

    
    if (allocation.routingNodes().size() == 0) {
        /* with no nodes this is pointless */
        return;
    }

I think in this case we should move the unassigned status for any new shards from AllocationStatus.NO_ATTEMPT to AllocationStatus.DECIDERS_NO before bailing out, which would result in red health.
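A minimal sketch of that idea, using a self-contained toy model rather than the real Elasticsearch classes. `UnassignedShard`, `allocate`, and the `dataNodeCount` parameter are hypothetical names invented for illustration, and only the two `AllocationStatus` values named above are modeled. With no nodes, the early return first marks every shard still at `NO_ATTEMPT` as `DECIDERS_NO`:

```java
import java.util.ArrayList;
import java.util.List;

final class EmptyClusterAllocationSketch {

    // Only the two statuses discussed in this issue; the real enum has more values.
    enum AllocationStatus { NO_ATTEMPT, DECIDERS_NO }

    // Hypothetical stand-in for an unassigned shard entry.
    static final class UnassignedShard {
        final String index;
        final boolean primary;
        AllocationStatus status = AllocationStatus.NO_ATTEMPT;

        UnassignedShard(String index, boolean primary) {
            this.index = index;
            this.primary = primary;
        }
    }

    /**
     * Toy allocator entry point. Returns how many shards had their status
     * changed because no node was available.
     */
    static int allocate(int dataNodeCount, List<UnassignedShard> unassigned) {
        if (dataNodeCount == 0) {
            // With no nodes, allocation is still pointless, but record that it
            // failed instead of leaving the shards looking "not yet attempted".
            int changed = 0;
            for (UnassignedShard shard : unassigned) {
                if (shard.status == AllocationStatus.NO_ATTEMPT) {
                    shard.status = AllocationStatus.DECIDERS_NO;
                    changed++;
                }
            }
            return changed;
        }
        // Real balancing logic is out of scope for this sketch.
        return 0;
    }

    public static void main(String[] args) {
        List<UnassignedShard> unassigned = new ArrayList<>();
        unassigned.add(new UnassignedShard("new-index", true));
        allocate(0, unassigned);
        System.out.println(unassigned.get(0).index + " primary -> " + unassigned.get(0).status);
    }
}
```

Whether the real change should cover all new shards or only new primaries is a design choice this sketch does not settle; it simply updates every shard that has not yet had an allocation attempt.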


elasticmachine · 2019-04-10T12:27:50Z

Pinging @elastic/es-distributed

Gaurav614 · 2019-04-10T12:37:31Z

https://discuss.elastic.co/t/primary-shards-unassigned-but-still-cluster-state-health-yellow/176189/12
This issue was first reported by me.

DaveCTurner · 2019-04-10T12:38:59Z

Hi @Gaurav614, thanks for reporting this. If you'd like to open a PR to propose a fix then that'd be very welcome.

Gaurav614 · 2019-04-10T12:40:44Z

Will work upon it soon.

atris · 2019-04-10T13:02:54Z

I wonder if we should let the status quo prevail here. RED cluster status is typically interpreted as a critical state by users, including data loss scenarios, so we should avoid unnecessary transition into that state.

Note that in this case, ES does not even attempt to allocate shards. So it is not really an allocation failure, more of a strange scenario (and likely a user error). This raises the question of whether we should go into the RED state for a customer induced scenario especially since the cluster is still very much functional.

Gaurav614 · 2019-04-10T13:13:29Z

@atris This scenario can occur in rare cases if all the data nodes went down and the user doesn't know about it. When they then try to create an index, they will see the index health as yellow.
Secondly, per the ES definition the cluster state is RED when primary shards are unassigned, but this scenario violates that definition.

atris · 2019-04-10T13:19:46Z

@Gaurav614 I doubt if that would be the case, since IIRC, if an index creation attempt happens when there are no data nodes present and none of the data nodes was safely shut down, we do go to RED state (not sure though, would be good to confirm)

Gaurav614 · 2019-04-10T13:32:01Z

@atris

> ES does not even attempt to allocate shards.

This could be another issue as well, since ES is deviating from its `index.allocation.max_retries` behavior: ES should at least attempt the allocation. But that's a different case.

> @Gaurav614 I doubt if that would be the case, since IIRC, if an index creation attempt happens when there are no data nodes present and none of the data nodes was safely shut down, we do go to RED state (not sure though, would be good to confirm)

The cluster health (`_cluster/health`) will be yellow if no index was already red before the data nodes went down, and the yellow status will be due to the newly created index. This rare scenario can even occur when a first-time user sets up an ES cluster and its data node goes down before any index has been created. A first-time user might not be aware that the data node went down, so they will try to create an index, which results in yellow cluster health because the index health is yellow for that newly created index. The problem gets worse if they go on creating new indices: each index's health will be yellow, so the cluster health stays yellow. All of these are rare corner cases. They won't impact the business of any organization as such, but fixing them would help make Elasticsearch cleaner and more robust.

DaveCTurner · 2019-04-10T13:48:36Z

> RED cluster status is typically interpreted as a critical state by users, including data loss scenarios, so we should avoid unnecessary transition into that state.

On the contrary I think it's appropriate to treat this situation as critical: you've created an index but you won't be able to write to it, which is what RED health indicates.

> it is not really an allocation failure, more of a strange scenario (and likely a user error)

On the contrary I think it's appropriate to consider this to be an allocation failure: it doesn't make sense to distinguish this case from the case where there are data nodes present but none is suitable for allocation, perhaps due to an allocation filter.

> if an index creation attempt happens when there are no data nodes present and none of the data nodes was safely shut down, we do go to RED state

This is not the case, although it is exactly what I expected too until @Gaurav614 raised this issue.

atris · 2019-04-10T14:10:48Z

> it is not really an allocation failure, more of a strange scenario (and likely a user error)

> On the contrary I think it's appropriate to consider this to be an allocation failure: it doesn't make sense to distinguish this case from the case where there are data nodes present but none is suitable for allocation, perhaps due to an allocation filter.

I would consider these two as different cases since the lack of data nodes may be a user triggered scenario. However, the line is too thin to attempt to disambiguate the behaviour for the two cases.

> if an index creation attempt happens when there are no data nodes present and none of the data nodes was safely shut down, we do go to RED state

> This is not the case, although it is exactly what I expected too until @Gaurav614 raised this issue.

Ah, ok. That changes my opinion then, since we definitely run the risk of having dead data nodes without the user being aware, as @Gaurav614 highlighted upstream.

In all, +1 from me.

vigyasharma · 2019-04-11T05:50:50Z

+1. Also makes sense for cases where master nodes came up but data nodes failed, and users start indexing data assuming cluster is up and healthy. With no former indices to go red, it is misleading to see the new index yellow.

sawyna · 2019-05-14T12:17:04Z

If no-one is working on this at the moment, I'd like to take this up as a first issue to get started with es codebase.

DaveCTurner · 2019-05-14T13:53:10Z

@Gaurav614 are you still planning to work on this?

Gaurav614 · 2019-05-14T16:30:03Z

@DaveCTurner I have requested to be added to the CLA.

Gaurav614 · 2019-05-27T05:00:00Z

@DaveCTurner Hey. I have asked Baird Garrett to add me to the CLA, but I didn't receive any response from him. Is there anything you can do from your end?

DaveCTurner · 2019-05-28T10:42:54Z

Signing the CLA is an automatic process that doesn't need the involvement of Baird or the rest of the legal team. You should sign up here: https://www.elastic.co/contributor-agreement

Adds a test case that reproduces the scenario in which there are no data nodes in the cluster and the user tries to create an index. Changing the status of unassigned primary shards to `AllocationStatus.DECIDERS_NO` when there are no data nodes resolves this issue.

Today if you create an index in a cluster without any data nodes then it will report yellow health because it never attempts to assign any shards if there are no data nodes, so the new shards remain at `AllocationStatus.NO_ATTEMPT`. This commit moves the new primaries to `AllocationStatus.DECIDERS_NO` in this situation, causing the cluster health to move to red. Fixes #41073
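The effect on health can be sketched with a simplified model. This is an illustration under the assumption described in this thread (a new primary still at `NO_ATTEMPT` counts as yellow, one at `DECIDERS_NO` counts as red), not the actual health code; `Health`, `unassignedNewPrimaryHealth`, and `clusterHealth` are hypothetical names:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class NewPrimaryHealthSketch {

    // Only the two statuses discussed in this issue; the real enum has more values.
    enum AllocationStatus { NO_ATTEMPT, DECIDERS_NO }

    // Declared from best to worst so ordinal() can be used to pick the worst.
    enum Health { GREEN, YELLOW, RED }

    /** Health contribution of one unassigned primary of a newly created index (simplified). */
    static Health unassignedNewPrimaryHealth(AllocationStatus lastStatus) {
        return lastStatus == AllocationStatus.DECIDERS_NO ? Health.RED : Health.YELLOW;
    }

    /** Overall health is the worst health of any shard, or GREEN when nothing is unassigned. */
    static Health clusterHealth(List<AllocationStatus> unassignedNewPrimaries) {
        Health worst = Health.GREEN;
        for (AllocationStatus status : unassignedNewPrimaries) {
            Health health = unassignedNewPrimaryHealth(status);
            if (health.ordinal() > worst.ordinal()) {
                worst = health;
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        // Empty cluster, no indices: nothing is unassigned.
        System.out.println(clusterHealth(Collections.<AllocationStatus>emptyList()));
        // Before the change: the new primary is never attempted.
        System.out.println(clusterHealth(Arrays.asList(AllocationStatus.NO_ATTEMPT)));
        // After the change: the new primary is marked as failed by the allocator.
        System.out.println(clusterHealth(Arrays.asList(AllocationStatus.DECIDERS_NO)));
    }
}
```

Run as written, the three cases print `GREEN`, `YELLOW`, and `RED` respectively.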

DaveCTurner added the >bug, good first issue, low hanging fruit, help wanted, adoptme, and :Distributed/Allocation labels Apr 10, 2019

Gaurav614 mentioned this issue Jun 17, 2019

Fail allocation of new primaries in empty cluster #43284

Merged

DaveCTurner closed this as completed in #43284 Sep 30, 2019
