Recovery: Quick cluster state processing can cause relocation finalization to fail and delete both copies #9503

Closed
bleskes opened this Issue Jan 30, 2015 · 0 comments

@bleskes
Member
bleskes commented Jan 30, 2015

#8570 added some extra protection for the case where a source shard is closed during recovery. However, it introduces a race condition: if the target shard has moved to POST_RECOVERY, the master can process the shard started action and activate the shard before the source node completes the recovery. In that case the source node closes the source shard, which causes the recovery to be cancelled. The target node then receives the cancellation notification and deletes its local copy (still in POST_RECOVERY), so both copies are lost.

The extra close listener is not yet released but is part of the 1.5 push.

See: http://build-us-00.elasticsearch.org/job/es_core_1x_debian/3474/
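
To make the ordering problem concrete, here is a minimal sketch of the race described above. It is a simplified, hypothetical model and not Elasticsearch code: the class `RelocationRaceSketch`, its `simulate` method, and the event strings are all invented for illustration. The only input is whether the master activates the target before the source node has completed the recovery.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the event ordering; not Elasticsearch code.
final class RelocationRaceSketch {

    enum TargetState { POST_RECOVERY, STARTED, DELETED }

    static final class Outcome {
        final boolean sourceCopyExists;
        final TargetState target;
        final List<String> events;

        Outcome(boolean sourceCopyExists, TargetState target, List<String> events) {
            this.sourceCopyExists = sourceCopyExists;
            this.target = target;
            this.events = events;
        }

        // Number of shard copies left once the relocation has been finalized.
        int survivingCopies() {
            return (sourceCopyExists ? 1 : 0) + (target == TargetState.DELETED ? 0 : 1);
        }
    }

    static Outcome simulate(boolean masterActsBeforeSourceCompletes) {
        List<String> events = new ArrayList<>();

        // The target has already left the recovering state and notified the master.
        TargetState target = TargetState.POST_RECOVERY;
        events.add("target: moved to POST_RECOVERY, sent shard started");

        boolean recoveryStillOpenOnSource = masterActsBeforeSourceCompletes;
        if (!recoveryStillOpenOnSource) {
            events.add("source: recovery completed");
        }

        // The master activates the target, so the source node closes its copy.
        events.add("master: activated target shard");
        boolean sourceCopyExists = false;
        events.add("source: closed source shard");

        if (recoveryStillOpenOnSource) {
            // The close listener cancels the recovery that is still in flight,
            // and the target reacts by deleting its local copy.
            events.add("source: close listener cancelled recovery");
            target = TargetState.DELETED;
            events.add("target: deleted local copy after cancellation");
        } else {
            target = TargetState.STARTED;
            events.add("target: shard started");
        }
        return new Outcome(sourceCopyExists, target, events);
    }

    public static void main(String[] args) {
        for (boolean racy : new boolean[] {false, true}) {
            Outcome o = simulate(racy);
            System.out.println("master acts first = " + racy
                    + ", surviving copies = " + o.survivingCopies());
            o.events.forEach(e -> System.out.println("  " + e));
        }
    }
}
```

In this model the harmless ordering leaves one copy (the target), while the racy ordering leaves none, which matches the failure in the title.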

@s1monw s1monw added the blocker label Feb 9, 2015
@bleskes bleskes added a commit that referenced this issue Feb 27, 2015
@bleskes bleskes Recovery: unify RecoveryState management to IndexShard and clean up semantics

We keep track of the current stage of recovery using an instance of RecoveryState which is stored on the relevant IndexShard. At the moment, changes to this object are made in many places in the code, each of which is responsible for doing things in the right order, keeping track of timers, and more. Also, changes to shard state are decoupled from the recovery stages, which caused #9503.

This PR refactors this and brings all of the changes into IndexShard. It also makes all recoveries follow the exact same stages instead of shortcutting some. This is in order to keep things simple and always the same (those shortcuts didn't add anything, we ended up doing it all anyway).

Also, all timer management is now folded into RecoveryState and unit tests are added.

This closes #9503 by moving the shard to post recovery only once the recovery is done (previously the two were decoupled), meaning that master promotion of the target shard to started cannot cancel the recovery.

Closes #9902
0cec37f
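
The coupling described in the commit message above can be sketched roughly as follows. This is a hypothetical illustration, not the actual IndexShard or RecoveryState API: the class `CoupledRecoverySketch`, the `advance` method, and the stage names are assumptions chosen for the example. The point it shows is that the shard state only changes to post recovery as a side effect of the recovery reaching its final stage, and that every recovery walks the same sequence of stages.

```java
// Hypothetical sketch; names are illustrative and not the real IndexShard API.
final class CoupledRecoverySketch {

    // Assumed stage names, listed in the single order every recovery follows.
    enum Stage { INIT, INDEX, TRANSLOG, FINALIZE, DONE }

    enum ShardState { RECOVERING, POST_RECOVERY }

    private Stage stage = Stage.INIT;
    private ShardState shardState = ShardState.RECOVERING;

    Stage stage() {
        return stage;
    }

    ShardState shardState() {
        return shardState;
    }

    // Advance exactly one stage; no stage can be skipped.
    void advance() {
        if (stage == Stage.DONE) {
            throw new IllegalStateException("recovery already done");
        }
        stage = Stage.values()[stage.ordinal() + 1];
        // The shard state changes only when the recovery itself is done.
        if (stage == Stage.DONE) {
            shardState = ShardState.POST_RECOVERY;
        }
    }

    public static void main(String[] args) {
        CoupledRecoverySketch shard = new CoupledRecoverySketch();
        while (shard.stage() != Stage.DONE) {
            shard.advance();
            System.out.println(shard.stage() + " -> " + shard.shardState());
        }
    }
}
```

Because there is no separate call that moves the shard to post recovery, nothing can report the shard as ready to the master while the recovery is still in progress in this model.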
@bleskes bleskes added a commit that closed this issue Feb 27, 2015
0cec37f
@bleskes bleskes closed this in 0cec37f Feb 27, 2015
@bleskes bleskes added a commit to bleskes/elasticsearch that referenced this issue Feb 27, 2015
3e7d94a