Remove unneeded waits on recovery cancellation #7717

bleskes · 2014-09-15T07:49:22Z

When cancelling recoveries, we wait for up to 10s for the source node to be notified before continuing. This is not needed in two cases:

1) The source node has been disconnected due to node shutdown (the recovery is canceled in response to cluster state processing).
2) The current thread is the one that will be notifying the source node (happens when one of the calls from the source node discovers that the local index is closed).

The first one is especially important, as it may delay cluster state update processing by up to 10s.
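To illustrate the idea, here is a minimal sketch of a cancel path that skips the wait in the two cases above. It is not the actual `RecoveryTarget` code: the class `RecoveryCancelSketch`, its methods, and the two boolean flags are hypothetical names chosen for this example, and the wait is modelled with a plain `CountDownLatch`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch only; names and structure are assumptions, not the Elasticsearch implementation.
class RecoveryCancelSketch {
    // Released once the source node has been told about the cancellation.
    private final CountDownLatch sourceNotified = new CountDownLatch(1);
    private final long maxWaitMillis;
    private volatile boolean canceled;

    RecoveryCancelSketch(long maxWaitMillis) {
        this.maxWaitMillis = maxWaitMillis;
    }

    // Called by whichever thread delivers the cancellation to the source node.
    void markSourceNotified() {
        sourceNotified.countDown();
    }

    boolean isCanceled() {
        return canceled;
    }

    // Marks the recovery as canceled and returns true only if the caller blocked on the source.
    boolean cancel(boolean sourceDisconnected, boolean callerNotifiesSource) {
        canceled = true;
        // Case 1: the source node is gone, so nobody will ever release the latch.
        // Case 2: this thread is the one that notifies the source, so waiting here only delays it.
        if (sourceDisconnected || callerNotifiesSource) {
            return false;
        }
        try {
            // Otherwise wait, bounded by the timeout, for the source to be notified.
            sourceNotified.await(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller instead of swallowing it.
            Thread.currentThread().interrupt();
        }
        return true;
    }
}
```

With a 10s timeout, `cancel(true, false)` and `cancel(false, true)` return immediately, while `cancel(false, false)` still blocks until `markSourceNotified()` is called or the timeout expires.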

kimchy · 2014-09-15T13:18:44Z

src/main/java/org/elasticsearch/indices/recovery/RecoveryTarget.java

@@ -634,7 +636,8 @@ private void validateRecoveryStatus(RecoveryStatus onGoingRecovery, ShardId shar
            throw new IndexShardClosedException(shardId);
        }
        if (onGoingRecovery.indexShard.state() == IndexShardState.CLOSED) {
-            cancelRecovery(onGoingRecovery.indexShard);
+            // mark sentCanceledToSource after cancel recovery, o.w. cancelRecovery will do nothing


comment not relevant anymore, right?

kimchy · 2014-09-15T13:18:53Z

LGTM, minor comment

bleskes added the v1.4.0.Beta1, review, v2.0.0-beta1, and >enhancement labels Sep 15, 2014

remove extra cancel recovery overload and fold the logic into validat…

e487e5a

…eRecovery

kimchy reviewed Sep 15, 2014

remove comment

a49a82f

bleskes closed this in d228606 Sep 15, 2014

bleskes deleted the recovery_no_wait_on_cancel branch September 15, 2014 13:34

jpountz removed the review label Oct 21, 2014

bleskes added the resiliency label Feb 2, 2015

clintongormley added the :Distributed/Recovery label Jun 7, 2015

clintongormley changed the title ~~Recovery: remove unneeded waits on recovery cancellation~~ Remove unneeded waits on recovery cancellation Jun 7, 2015
