Node shut down during the last phase of recovery needlessly fails shard #9496

Closed
bleskes opened this issue Jan 30, 2015 · 3 comments
Labels: :Distributed/Recovery (anything around constructing a new shard, either from a local or a remote source), >enhancement, v1.5.0, v2.0.0-beta1

Comments

@bleskes
Contributor

bleskes commented Jan 30, 2015

During the final stage of recovery, the target shard is moved to POST_RECOVERY and the master is sent a request to activate the shard. At that point the master reports the cluster as green, i.e., it is safe to shut down a node without losing data (though the cluster may go yellow). However, if the source node is shut down quickly enough, before the recovery code cleanly finishes, we may fail the new copy, resulting in a red index.

This is an issue with a non-released refactoring done on 1.x and master.

See:
http://build-us-00.elasticsearch.org/job/es_g1gc_1x_metal/3366/testReport/junit/org.elasticsearch.recovery/FullRollingRestartTests/testFullRollingRestart/

bleskes added the v2.0.0-beta1, v1.5.0, and :Distributed/Recovery labels Jan 30, 2015
@brwe
Contributor

brwe commented Jan 30, 2015

I think this failure of RelocationTests.testMoveShardsWhileRelocation could be caused by the same effect: http://build-us-00.elasticsearch.org/job/es_core_1x_small/1474/

@s1monw
Contributor

s1monw commented Mar 17, 2015

@bleskes I moved this out to 1.6

s1monw added the v1.6.0 label and removed the v1.5.0 label Mar 17, 2015
bleskes added the v1.5.0 label and removed the v1.6.0 label Mar 20, 2015
@bleskes
Contributor Author

bleskes commented Mar 20, 2015

This should be fixed with #9902

bleskes closed this as completed Mar 20, 2015
clintongormley changed the title from "Recovery: node shut down during the last phase of recovery needlessly fails shard" to "Node shut down during the last phase of recovery needlessly fails shard" Jun 8, 2015