A cluster with no data nodes or indices reports YELLOW health when an index is created. #41073
Pinging @elastic/es-distributed |
https://discuss.elastic.co/t/primary-shards-unassigned-but-still-cluster-state-health-yellow/176189/12 |
Hi @Gaurav614, thanks for reporting this. If you'd like to open a PR to propose a fix then that'd be very welcome. |
I will work on it soon. |
I wonder if we should let the status quo prevail here. RED cluster status is typically interpreted as a critical state by users, including data loss scenarios, so we should avoid unnecessary transitions into that state. Note that in this case, ES does not even attempt to allocate shards. So it is not really an allocation failure, more of a strange scenario (and likely a user error). This raises the question of whether we should go into the RED state for a user-induced scenario, especially since the cluster is still very much functional. |
@atris This scenario can occur in rare cases if all the data nodes go down without the user knowing about it. When they then try to create an index, they will see the index health as yellow. |
@Gaurav614 I doubt if that would be the case, since IIRC, if an index creation attempt happens when there are no data nodes present and none of the data nodes was safely shut down, we do go to RED state (not sure though, would be good to confirm) |
This could be another issue as well, since ES is deviating from its behavior of
The cluster health (`_cluster/health`) will be yellow if no index on the data nodes was previously red, and the yellow status will be due to the newly created index. This rare scenario can even occur when a first-time user sets up an ES cluster and its data node goes down before any index is created. Being new, the user might not be aware that the data node went down, so they will try to create the index, which results in the cluster state being yellow because the index health is yellow for that newly created index. The problem is aggravated if the user goes on creating new indices: each index health will be yellow, so the cluster state stays yellow. These are all rare corner cases. They won't impact the business of any organization as such, but fixing them would help make Elasticsearch cleaner and more robust. |
On the contrary I think it's appropriate to treat this situation as critical: you've created an index but you won't be able to write to it, which is what RED health indicates.
On the contrary I think it's appropriate to consider this to be an allocation failure: it doesn't make sense to distinguish this case from the case where there are data nodes present but none is suitable for allocation, perhaps due to an allocation filter.
This is not the case, although it is exactly what I expected too until @Gaurav614 raised this issue. |
I would consider these two as different cases since the lack of data nodes may be a user triggered scenario. However, the line is too thin to attempt to disambiguate the behaviour for the two cases.
Ah, ok. That changes my opinion then, since we definitely run the risk of having dead data nodes without the user being aware, as @Gaurav614 highlighted upstream. In all, +1 from me. |
+1. Also makes sense for cases where master nodes came up but data nodes failed, and users start indexing data assuming cluster is up and healthy. With no former indices to go red, it is misleading to see the new index yellow. |
If no-one is working on this at the moment, I'd like to take this up as a first issue to get started with the ES codebase. |
@Gaurav614 are you still planning to work on this? |
@DaveCTurner I have requested to be added to the CLA. |
@DaveCTurner Hey. I have asked Baird Garrett to add me to the CLA, but I haven't received any response from him. Is there anything you can do from your end? |
Signing the CLA is an automatic process that doesn't need the involvement of Baird or the rest of the legal team. You should sign up here: https://www.elastic.co/contributor-agreement |
Adds a test case that reproduces the scenario in which there are no data nodes in the cluster and the user tries to create an index. Changing the status of unassigned primary shards to `AllocationStatus.DECIDERS_NO` when there are no data nodes resolves this issue.
Today if you create an index in a cluster without any data nodes then it will report yellow health because it never attempts to assign any shards if there are no data nodes, so the new shards remain at `AllocationStatus.NO_ATTEMPT`. This commit moves the new primaries to `AllocationStatus.DECIDERS_NO` in this situation, causing the cluster health to move to red. Fixes #41073
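To make the effect of that status change concrete, here is a minimal sketch of how health could be derived from the last allocation status of a newly created primary. It is a simplified model for illustration only: `ShardHealthSketch`, `Primary`, `primaryHealth` and `clusterHealth` are hypothetical names, not the Elasticsearch classes, and the real health calculation considers more than this single field.

```java
import java.util.List;

// Simplified, hypothetical model of the behaviour described above.
// These are NOT the real Elasticsearch classes.
final class ShardHealthSketch {
    enum Health { GREEN, YELLOW, RED }
    enum AllocationStatus { NO_ATTEMPT, DECIDERS_NO }

    // A newly created primary: assigned, or unassigned with a last allocation status.
    static final class Primary {
        final boolean assigned;
        final AllocationStatus lastStatus;

        Primary(boolean assigned, AllocationStatus lastStatus) {
            this.assigned = assigned;
            this.lastStatus = lastStatus;
        }
    }

    // Assumption: an unassigned new primary is YELLOW until the allocator
    // records DECIDERS_NO, at which point it is RED.
    static Health primaryHealth(Primary p) {
        if (p.assigned) {
            return Health.GREEN;
        }
        return p.lastStatus == AllocationStatus.DECIDERS_NO ? Health.RED : Health.YELLOW;
    }

    // Cluster health is the worst health over all primaries; no shards means GREEN.
    static Health clusterHealth(List<Primary> primaries) {
        Health worst = Health.GREEN;
        for (Primary p : primaries) {
            Health h = primaryHealth(p);
            if (h.ordinal() > worst.ordinal()) {
                worst = h;
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        Primary before = new Primary(false, AllocationStatus.NO_ATTEMPT);
        Primary after = new Primary(false, AllocationStatus.DECIDERS_NO);
        System.out.println(clusterHealth(List.of()));       // GREEN
        System.out.println(clusterHealth(List.of(before))); // YELLOW
        System.out.println(clusterHealth(List.of(after)));  // RED
    }
}
```

In this model the only thing that separates yellow from red for a new index is whether the allocator ever recorded a decision, which is why a cluster with no data nodes stays yellow until the status is moved.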
If you start up a new cluster without any data nodes then it will report `green` health, because there are no unassigned shards. If you subsequently create an index it will report `yellow` health because there are unassigned newly-created primaries. However we never attempt to assign any shards if there are no data nodes so the health will continue to report `yellow` and will never move to `red`:

elasticsearch/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
Lines 117 to 120 in abd861b

I think in this case we should move the unassigned status for any new shards from `AllocationStatus.NO_ATTEMPT` to `AllocationStatus.DECIDERS_NO` before bailing out, which would result in `red` health.
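The proposed change to the bail-out path could look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of `BalancedShardsAllocator`: `AllocatorBailOutSketch`, `UnassignedShard` and `allocate` are hypothetical names, and the normal allocation path is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed bail-out; NOT the real allocator code.
final class AllocatorBailOutSketch {
    enum AllocationStatus { NO_ATTEMPT, DECIDERS_NO }

    // An unassigned shard that remembers the outcome of the last allocation round.
    static final class UnassignedShard {
        AllocationStatus lastStatus = AllocationStatus.NO_ATTEMPT;
    }

    // Assumed allocation entry point: with no data nodes there is nothing to
    // allocate to, so record DECIDERS_NO on untouched shards before returning.
    static void allocate(int dataNodeCount, List<UnassignedShard> unassigned) {
        if (dataNodeCount == 0) {
            for (UnassignedShard shard : unassigned) {
                if (shard.lastStatus == AllocationStatus.NO_ATTEMPT) {
                    shard.lastStatus = AllocationStatus.DECIDERS_NO;
                }
            }
            return; // bail out as before, but with the status updated
        }
        // Normal balancing and allocation would happen here (omitted in this sketch).
    }

    public static void main(String[] args) {
        List<UnassignedShard> unassigned = new ArrayList<>();
        unassigned.add(new UnassignedShard());
        allocate(0, unassigned);
        System.out.println(unassigned.get(0).lastStatus); // DECIDERS_NO
    }
}
```

The point of updating the status before the early return is that anything reading the shard afterwards can tell that allocation was impossible, as opposed to simply not yet attempted.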