Throw IndexShardClosedException if shard is closed #8648

Merged
s1monw merged 1 commit into elastic:1.x from cancle_with_correct_exception on Nov 25, 2014

s1monw
Contributor

@s1monw s1monw commented Nov 25, 2014

Today we throw a generic ElasticsearchException when a recovery is canceled. This
causes verbose logging, sends shard failures, and triggers additional unnecessary cluster state
events. We can instead throw IndexShardClosedException, which prevents the shard failures from being sent.

We see lots of these exceptions in the logs, and they are misleading. This behavior was introduced recently and never released:

1> [2014-11-25 11:07:40,742][DEBUG][index.service            ] [node_0] [test2] [3] closed (reason: [recovery failure [RecoveryFailedException[[test2][3]: Recovery failed from [node_1][9pORZ5dyT6C0Y8oZxZ9q-w][ip-10-255-15-175][local[3]]{mode=local} into [node_0][_ZbAkLDCSCa1GB_dEv7nUQ][ip-10-255-15-175][local[2]]{mode=local}]; nested: RemoteTransportException[[node_0][local[2]][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[test2][3] Phase[3] Execution failed]; nested: ElasticsearchException[recovery was canceled reason [shard is closed]]; ]])
  1> [2014-11-25 11:07:40,742][WARN ][indices.cluster          ] [node_0] [test2][3] sending failed shard after recovery failure
  1> org.elasticsearch.indices.recovery.RecoveryFailedException: [test2][3]: Recovery failed from [node_1][9pORZ5dyT6C0Y8oZxZ9q-w][ip-10-255-15-175][local[3]]{mode=local} into [node_0][_ZbAkLDCSCa1GB_dEv7nUQ][ip-10-255-15-175][local[2]]{mode=local}
  1>         at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:245)
  1>         at org.elasticsearch.indices.recovery.RecoveryTarget.access$500(RecoveryTarget.java:64)
  1>         at org.elasticsearch.indices.recovery.RecoveryTarget$RecoveryRunner.doRun(RecoveryTarget.java:485)
  1>         at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
  1>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  1>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  1>         at java.lang.Thread.run(Thread.java:744)
  1> Caused by: org.elasticsearch.transport.RemoteTransportException: [node_0][local[2]][internal:index/shard/recovery/start_recovery]
  1> Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [test2][3] Phase[3] Execution failed
  1>         at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1182)
  1>         at org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:675)
  1>         at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:127)
  1>         at org.elasticsearch.indices.recovery.RecoverySource.access$200(RecoverySource.java:51)
  1>         at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:148)
  1>         at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:134)
  1>         at org.elasticsearch.transport.local.LocalTransport$2.doRun(LocalTransport.java:267)
  1>         at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
  1>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  1>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  1>         at java.lang.Thread.run(Thread.java:744)
  1> Caused by: org.elasticsearch.ElasticsearchException: recovery was canceled reason [shard is closed]
  1>         at org.elasticsearch.indices.recovery.ShardRecoveryHandler$CancelableThreads.failIfCanceled(ShardRecoveryHandler.java:640)
  1>         at org.elasticsearch.indices.recovery.ShardRecoveryHandler$CancelableThreads.remove(ShardRecoveryHandler.java:661)
  1>         at org.elasticsearch.indices.recovery.ShardRecoveryHandler$CancelableThreads.run(ShardRecoveryHandler.java:655)
  1>         at org.elasticsearch.indices.recovery.ShardRecoveryHandler.phase3(ShardRecoveryHandler.java:430)
  1>         at org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1178)
  1>         ... 10 more
  1> [2014-11-25 11:07:40,743][WARN ][cluster.action.shard     ] [node_0] [test2][3] sending failed shard for [test2][3], node[_ZbAkLDCSCa1GB_dEv7nUQ], [R], s[INITIALIZING], indexUUID [sUuEvRUORCeBRQgKKtg4UQ], reason [Failed to start shard, message [RecoveryFailedException[[test2][3]: Recovery failed from [node_1][9pORZ5dyT6C0Y8oZxZ9q-w][ip-10-255-15-175][local[3]]{mode=local} into [node_0][_ZbAkLDCSCa1GB_dEv7nUQ][ip-10-255-15-175][local[2]]{mode=local}]; nested: RemoteTransportException[[node_0][local[2]][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[test2][3] Phase[3] Execution failed]; nested: ElasticsearchException[recovery was canceled reason [shard is closed]]; ]]
  1> [2014-11-25 11:07:40,743][WARN ][cluster.action.shard     ] [node_0] [test2][3] received shard failed for [test2][3], node[_ZbAkLDCSCa1GB_dEv7nUQ], [R], s[INITIALIZING], indexUUID [sUuEvRUORCeBRQgKKtg4UQ], reason [Failed to start shard, message [RecoveryFailedException[[test2][3]: Recovery failed from [node_1][9pORZ5dyT6C0Y8oZxZ9q-w][ip-10-255-15-175][local[3]]{mode=local} into [node_0][_ZbAkLDCSCa1GB_dEv7nUQ][ip-10-255-15-175][local[2]]{mode=local}]; nested: RemoteTransportException[[node_0][local[2]][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[test2][3] Phase[3] Execution failed]; nested: ElasticsearchException[recovery was canceled reason [shard is closed]]; ]]
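
In the log above, the cancellation surfaces as a plain ElasticsearchException nested inside RecoveryFailedException, so a handler cannot tell it apart from a real failure. The following is a minimal sketch of why a dedicated exception type helps. It assumes the failure handler inspects the cause chain; all class and method names are hypothetical stand-ins, not Elasticsearch's actual classes, and the PR does not show how Elasticsearch performs this check.

```java
// Hypothetical stand-in for a dedicated "shard is closed" exception (not an Elasticsearch class).
class DemoShardClosedException extends RuntimeException {
    DemoShardClosedException(String message) {
        super(message);
    }
}

class RecoveryFailureDemo {
    /** Walks the cause chain and reports whether any link is a DemoShardClosedException. */
    static boolean causedByShardClosed(Throwable failure) {
        for (Throwable cur = failure; cur != null; cur = cur.getCause()) {
            if (cur instanceof DemoShardClosedException) {
                return true;
            }
        }
        return false;
    }

    /** Decides how a recovery failure is reported (assumed policy, for illustration only). */
    static String handleRecoveryFailure(Throwable failure) {
        if (causedByShardClosed(failure)) {
            // The shard went away on purpose: nothing to report to the cluster.
            return "ignore";
        }
        // Anything else is treated as a real failure.
        return "send-shard-failed";
    }

    public static void main(String[] args) {
        Throwable generic = new RuntimeException("Recovery failed",
                new RuntimeException("recovery was canceled reason [shard is closed]"));
        Throwable typed = new RuntimeException("Recovery failed",
                new DemoShardClosedException("shard is closed"));
        System.out.println(handleRecoveryFailure(generic));
        System.out.println(handleRecoveryFailure(typed));
    }
}
```

With the generic exception, the only way to recognize a cancellation would be to parse the message text; with a dedicated type, a single `instanceof` check is enough.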

@@ -89,7 +89,16 @@
private final MappingUpdatedAction mappingUpdatedAction;

private final RecoveryResponse response;
- private final CancelableThreads cancelableThreads = new CancelableThreads();
+ private final CancelableThreads cancelableThreads = new CancelableThreads() {
+ @Override
Member

Why not add this directly to the CancelableThreads class since there are no other uses of it? That way we can keep it as a final class and don't need to create an anonymous class here.

Contributor Author

I wanted to factor this out into a utility class and use it on the other side of the recovery too.

Member

Makes sense, though even on the other side wouldn't you want to check if the index were closed and if so throw the appropriate response?

Contributor Author

There are several things we want to interrupt safely in the future, so I'd prefer to keep it generic.

Member

Sounds good.
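
The design discussed in this thread (a generic cancelable helper whose "canceled" behavior is customized per call site) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: `CancelableTasks`, `onCancel`, and `SketchShardClosedException` are hypothetical names, and the real CancelableThreads class may differ in structure and behavior.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical dedicated exception, standing in for a "shard is closed" signal.
class SketchShardClosedException extends RuntimeException {
    SketchShardClosedException(String reason) {
        super("shard closed, reason [" + reason + "]");
    }
}

// Hypothetical generic helper: tracks running threads and lets a caller cancel them.
class CancelableTasks {
    private final Set<Thread> running = new HashSet<>();
    private boolean canceled = false;
    private String reason;

    /** Hook for subclasses: the default throws a generic exception. */
    protected void onCancel(String reason) {
        throw new IllegalStateException("operation was canceled, reason [" + reason + "]");
    }

    private synchronized void failIfCanceled() {
        if (canceled) {
            onCancel(reason);
        }
    }

    private synchronized void register() {
        failIfCanceled(); // refuse to start work after a cancel
        running.add(Thread.currentThread());
    }

    private synchronized void unregister() {
        running.remove(Thread.currentThread());
    }

    /** Runs the task, failing before or after it if a cancel was requested. */
    public void run(Runnable task) {
        register();
        try {
            task.run();
        } finally {
            unregister();
        }
        failIfCanceled(); // a cancel may have arrived while the task ran
    }

    /** Marks the helper as canceled and interrupts every registered thread. */
    public synchronized void cancel(String reason) {
        if (canceled) {
            return;
        }
        canceled = true;
        this.reason = reason;
        for (Thread t : running) {
            t.interrupt();
        }
    }
}

class CancelDemo {
    /** Call-site customization via an anonymous subclass, as debated in the review. */
    static CancelableTasks shardAware() {
        return new CancelableTasks() {
            @Override
            protected void onCancel(String reason) {
                throw new SketchShardClosedException(reason);
            }
        };
    }

    public static void main(String[] args) {
        CancelableTasks tasks = shardAware();
        tasks.cancel("shard is closed");
        try {
            tasks.run(() -> System.out.println("never runs"));
        } catch (SketchShardClosedException e) {
            System.out.println("caught " + e.getClass().getSimpleName());
        }
    }
}
```

The trade-off raised in the review is visible here: the helper cannot be a final class because of the overridable hook, but it stays free of shard-specific knowledge and can be reused for other interruptible work.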

@dakrone
Member

dakrone commented Nov 25, 2014

Left one comment; otherwise LGTM.

@s1monw
Contributor Author

s1monw commented Nov 25, 2014

@dakrone replied to your comment

@s1monw s1monw force-pushed the cancle_with_correct_exception branch from 900e37d to 85a3971 Compare November 25, 2014 13:34
@s1monw s1monw merged commit 85a3971 into elastic:1.x Nov 25, 2014
@s1monw s1monw deleted the cancle_with_correct_exception branch November 25, 2014 13:35
@clintongormley clintongormley added :Distributed/Recovery Anything around constructing a new shard, either from a local or a remote source. >enhancement and removed review labels Mar 19, 2015
@clintongormley clintongormley changed the title [RECOVERY] Throw IndexShardClosedException if shard is closed Throw IndexShardClosedException if shard is closed Jun 7, 2015
Labels
:Distributed/Recovery Anything around constructing a new shard, either from a local or a remote source. >enhancement v1.5.0 v2.0.0-beta1
3 participants