Throw IndexShardClosedException if shard is closed #8648

s1monw · 2014-11-25T12:10:28Z

Today we throw a generic ElasticsearchException when a recovery is canceled. This
causes verbose logging, sent shard failures, and additional unnecessary cluster state
events. We can just throw IndexShardClosedException instead, which prevents the shard failures from being sent.

We see lots of these exceptions in the logs, which is misleading. This was introduced recently and never released:

1> [2014-11-25 11:07:40,742][DEBUG][index.service            ] [node_0] [test2] [3] closed (reason: [recovery failure [RecoveryFailedException[[test2][3]: Recovery failed from [node_1][9pORZ5dyT6C0Y8oZxZ9q-w][ip-10-255-15-175][local[3]]{mode=local} into [node_0][_ZbAkLDCSCa1GB_dEv7nUQ][ip-10-255-15-175][local[2]]{mode=local}]; nested: RemoteTransportException[[node_0][local[2]][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[test2][3] Phase[3] Execution failed]; nested: ElasticsearchException[recovery was canceled reason [shard is closed]]; ]])
  1> [2014-11-25 11:07:40,742][WARN ][indices.cluster          ] [node_0] [test2][3] sending failed shard after recovery failure
  1> org.elasticsearch.indices.recovery.RecoveryFailedException: [test2][3]: Recovery failed from [node_1][9pORZ5dyT6C0Y8oZxZ9q-w][ip-10-255-15-175][local[3]]{mode=local} into [node_0][_ZbAkLDCSCa1GB_dEv7nUQ][ip-10-255-15-175][local[2]]{mode=local}
  1>         at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:245)
  1>         at org.elasticsearch.indices.recovery.RecoveryTarget.access$500(RecoveryTarget.java:64)
  1>         at org.elasticsearch.indices.recovery.RecoveryTarget$RecoveryRunner.doRun(RecoveryTarget.java:485)
  1>         at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
  1>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  1>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  1>         at java.lang.Thread.run(Thread.java:744)
  1> Caused by: org.elasticsearch.transport.RemoteTransportException: [node_0][local[2]][internal:index/shard/recovery/start_recovery]
  1> Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [test2][3] Phase[3] Execution failed
  1>         at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1182)
  1>         at org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:675)
  1>         at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:127)
  1>         at org.elasticsearch.indices.recovery.RecoverySource.access$200(RecoverySource.java:51)
  1>         at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:148)
  1>         at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:134)
  1>         at org.elasticsearch.transport.local.LocalTransport$2.doRun(LocalTransport.java:267)
  1>         at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
  1>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  1>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  1>         at java.lang.Thread.run(Thread.java:744)
  1> Caused by: org.elasticsearch.ElasticsearchException: recovery was canceled reason [shard is closed]
  1>         at org.elasticsearch.indices.recovery.ShardRecoveryHandler$CancelableThreads.failIfCanceled(ShardRecoveryHandler.java:640)
  1>         at org.elasticsearch.indices.recovery.ShardRecoveryHandler$CancelableThreads.remove(ShardRecoveryHandler.java:661)
  1>         at org.elasticsearch.indices.recovery.ShardRecoveryHandler$CancelableThreads.run(ShardRecoveryHandler.java:655)
  1>         at org.elasticsearch.indices.recovery.ShardRecoveryHandler.phase3(ShardRecoveryHandler.java:430)
  1>         at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1178)
  1>         ... 10 more
  1> [2014-11-25 11:07:40,743][WARN ][cluster.action.shard     ] [node_0] [test2][3] sending failed shard for [test2][3], node[_ZbAkLDCSCa1GB_dEv7nUQ], [R], s[INITIALIZING], indexUUID [sUuEvRUORCeBRQgKKtg4UQ], reason [Failed to start shard, message [RecoveryFailedException[[test2][3]: Recovery failed from [node_1][9pORZ5dyT6C0Y8oZxZ9q-w][ip-10-255-15-175][local[3]]{mode=local} into [node_0][_ZbAkLDCSCa1GB_dEv7nUQ][ip-10-255-15-175][local[2]]{mode=local}]; nested: RemoteTransportException[[node_0][local[2]][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[test2][3] Phase[3] Execution failed]; nested: ElasticsearchException[recovery was canceled reason [shard is closed]]; ]]
  1> [2014-11-25 11:07:40,743][WARN ][cluster.action.shard     ] [node_0] [test2][3] received shard failed for [test2][3], node[_ZbAkLDCSCa1GB_dEv7nUQ], [R], s[INITIALIZING], indexUUID [sUuEvRUORCeBRQgKKtg4UQ], reason [Failed to start shard, message [RecoveryFailedException[[test2][3]: Recovery failed from [node_1][9pORZ5dyT6C0Y8oZxZ9q-w][ip-10-255-15-175][local[3]]{mode=local} into [node_0][_ZbAkLDCSCa1GB_dEv7nUQ][ip-10-255-15-175][local[2]]{mode=local}]; nested: RemoteTransportException[[node_0][local[2]][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[test2][3] Phase[3] Execution failed]; nested: ElasticsearchException[recovery was canceled reason [shard is closed]]; ]]

dakrone · 2014-11-25T12:48:28Z

src/main/java/org/elasticsearch/indices/recovery/ShardRecoveryHandler.java

@@ -89,7 +89,16 @@
    private final MappingUpdatedAction mappingUpdatedAction;

    private final RecoveryResponse response;
-    private final CancelableThreads cancelableThreads = new CancelableThreads();
+    private final CancelableThreads cancelableThreads = new CancelableThreads() {
+        @Override


Why not add this directly to the CancelableThreads class since there are no other uses of it? That way we can keep it as a final class and don't need to create an anonymous class here.

I wanted to factor this out into a utils class and use it on the other side of the recovery too.

Makes sense, though even on the other side wouldn't you want to check if the index were closed and if so throw the appropriate response?

There are several things we want to be able to interrupt safely in the future, so I'd like to keep it generic.

Sounds good.
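
For readers following the thread, here is a minimal, self-contained sketch of the pattern being discussed: a generic cancelable-threads helper with an overridable hook that decides which exception signals cancellation, plus an anonymous subclass that swaps in a shard-closed exception. Everything in it is an assumption made for illustration. `CancelableThreadsSketch`, `onCancel`, `GenericRecoveryException`, `ShardClosedException`, and `RecoveryCancelDemo` are hypothetical names, and the exception classes are plain stand-ins rather than the real Elasticsearch types. The diff above is truncated after `@Override`, so this does not claim to reproduce the merged change.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins for the two exception types named in this PR.
// The real Elasticsearch classes have different constructors and hierarchies.
class GenericRecoveryException extends RuntimeException {
    GenericRecoveryException(String message) { super(message); }
}

class ShardClosedException extends RuntimeException {
    ShardClosedException(String message) { super(message); }
}

// Generic helper (hypothetical): tracks threads doing interruptible work
// so that all of them can be interrupted when the work is cancelled.
class CancelableThreadsSketch {
    interface Interruptable {
        void run() throws InterruptedException;
    }

    private final Set<Thread> threads = new HashSet<>();
    private boolean cancelled = false;
    private String reason;

    // Hook: subclasses choose which exception reports the cancellation.
    protected void onCancel(String reason) {
        throw new GenericRecoveryException("recovery was canceled reason [" + reason + "]");
    }

    public synchronized void failIfCanceled() {
        if (cancelled) {
            onCancel(reason);
        }
    }

    public void run(Interruptable work) {
        synchronized (this) {
            failIfCanceled();
            threads.add(Thread.currentThread());
        }
        try {
            work.run();
        } catch (InterruptedException e) {
            synchronized (this) {
                if (!cancelled) {
                    // Not our cancellation: preserve the interrupt for the caller.
                    Thread.currentThread().interrupt();
                }
            }
        } finally {
            synchronized (this) {
                threads.remove(Thread.currentThread());
            }
        }
        failIfCanceled();
    }

    public synchronized void cancel(String reason) {
        if (cancelled) {
            return;
        }
        cancelled = true;
        this.reason = reason;
        for (Thread thread : threads) {
            thread.interrupt();
        }
        threads.clear();
    }
}

class RecoveryCancelDemo {
    // Anonymous subclass, as in the diff: only the exception type changes.
    static CancelableThreadsSketch forShard() {
        return new CancelableThreadsSketch() {
            @Override
            protected void onCancel(String reason) {
                throw new ShardClosedException("shard closed, reason [" + reason + "]");
            }
        };
    }

    public static void main(String[] args) {
        CancelableThreadsSketch threads = forShard();
        threads.cancel("shard is closed");
        try {
            threads.run(() -> Thread.sleep(10));
        } catch (ShardClosedException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Keeping the hook on the generic helper lets callers elsewhere reuse the interruption logic while each call site picks the exception that its own error handling treats as benign.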

dakrone · 2014-11-25T12:54:49Z

Left one comment; otherwise LGTM.

s1monw · 2014-11-25T13:16:26Z

@dakrone replied to your comment


s1monw added v1.5.0 v2.0.0-beta1 review labels Nov 25, 2014

dakrone reviewed Nov 25, 2014

s1monw force-pushed the cancle_with_correct_exception branch from 900e37d to 85a3971 Compare November 25, 2014 13:34

s1monw merged commit 85a3971 into elastic:1.x Nov 25, 2014

s1monw deleted the cancle_with_correct_exception branch November 25, 2014 13:35

clintongormley added :Distributed Indexing/Recovery >enhancement and removed review labels Mar 19, 2015

clintongormley changed the title ~~[RECOVERY] Throw IndexShardClosedException if shard is closed~~ Throw IndexShardClosedException if shard is closed Jun 7, 2015
