Recovery: Quick cluster state processing can cause relocation finalization to fail and delete both copies #9503

https://github.com/elasticsearch/elasticsearch/pull/8570 added extra protection for the case where a source shard is closed during recovery. However, this introduces a race condition: if the target shard has moved to POST_RECOVERY, and the master processes the shard started action and activates the shard before the source node completes the recovery, the source node closes the source shard, which cancels the recovery. The target node then receives the cancellation notification and deletes its local copy (still in POST_RECOVERY), so both copies are gone.
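
To make the ordering dependence concrete, here is a toy model of the event sequence described above. It is not Elasticsearch code: the function `simulate` and the event names are hypothetical, invented only for this illustration, and the model assumes the target node has not yet applied the new cluster state when the cancellation arrives.

```python
def simulate(events):
    """Toy model of relocation finalization; returns which copies survive.

    Event names are hypothetical labels for the steps in the description,
    not Elasticsearch identifiers.
    """
    source_alive = True
    target_alive = True
    target_state = "POST_RECOVERY"  # target finished copying, not yet started
    recovery_done = False           # has the source completed the recovery?

    for event in events:
        if event == "source_completes_recovery":
            if source_alive:
                recovery_done = True
        elif event == "master_activates_target":
            # The source node applies the new cluster state and closes
            # its shard, since the target is now the active copy.
            source_alive = False
            if not recovery_done:
                # The close listener cancels the in-flight recovery. The
                # target is notified and deletes its copy if it is still
                # in POST_RECOVERY locally.
                if target_state == "POST_RECOVERY":
                    target_alive = False
        elif event == "target_applies_cluster_state":
            if target_alive:
                target_state = "STARTED"

    return {"source": source_alive, "target": target_alive}


# Expected ordering: recovery completes before the master activates the target.
ok = simulate(["source_completes_recovery", "master_activates_target"])

# Racy ordering: the master activates the target first.
racy = simulate(["master_activates_target", "source_completes_recovery"])
```

In this model the first ordering leaves the target copy in place, while the second leaves neither copy, which is the failure this issue reports.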

The extra close listener is not yet released but is part of the 1.5 push.

See: http://build-us-00.elasticsearch.org/job/es_core_1x_debian/3474/

