Fail recovery if retry recovery if resetRecovery fails #11149

s1monw · 2015-05-13T14:51:58Z

`resetRecovery` might fail, for instance if the shard is already closed. If that
failure is not handled, the recovery is never failed and its shard lock leaks,
leaving the shard locked on this node forever.

  1> [2015-05-13 16:08:52,118][DEBUG][indices.recovery         ] [node_s1] unexpected error during recovery, but recovery id [420] is finished
  1> [test_index2][0] CurrentState[CLOSED] Shard not in recovering state
  1>    at org.elasticsearch.index.shard.IndexShard.performRecoveryRestart(IndexShard.java:870)
  1>    at org.elasticsearch.indices.recovery.RecoveryStatus.resetRecovery(RecoveryStatus.java:233)
  1>    at org.elasticsearch.indices.recovery.RecoveryTarget.retryRecovery(RecoveryTarget.java:151)
  1>    at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:237)
  1>    at org.elasticsearch.indices.recovery.RecoveryTarget.access$700(RecoveryTarget.java:72)
  1>    at org.elasticsearch.indices.recovery.RecoveryTarget$RecoveryRunner.doRun(RecoveryTarget.java:462)
  1>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  1>    at java.lang.Thread.run(Thread.java:745)

followed by

  1> [2015-05-13 16:08:52,123][DEBUG][indices                  ] [node_s1] [test_index2] failed to delete index store - at least one shards is still locked
  1> org.apache.lucene.store.LockObtainFailedException: Can't lock shard [test_index2][0], timed out after 0ms
  1>    at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:520)
  1>    at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:448)
  1>    at org.elasticsearch.env.NodeEnvironment.lockAllForIndex(NodeEnvironment.java:392)
  1>    at org.elasticsearch.env.NodeEnvironment.deleteIndexDirectorySafe(NodeEnvironment.java:342)
  1>    at org.elasticsearch.indices.IndicesService.deleteIndexStore(IndicesService.java:496)
  1>    at org.elasticsearch.indices.IndicesService.removeIndex(IndicesService.java:403)
  1>    at org.elasticsearch.indices.IndicesService.deleteIndex(IndicesService.java:445)
  1>    at org.elasticsearch.indices.cluster.IndicesClusterStateService.deleteIndex(IndicesClusterStateService.java:844)
  1>    at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyDeletedIndices(IndicesClusterStateService.java:243)
  1>    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:180)
  1>    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:489)
  1>    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:188)
  1>    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:158)
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  1>    at java.lang.Thread.run(Thread.java:745)
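
The failure mode in the two traces above can be illustrated with a minimal sketch. This is not the code from commit 6a43fe3, which is not shown here; the classes `ShardLock` and `Recovery` and the method `fail` are hypothetical stand-ins, and only the names `resetRecovery` and `retryRecovery` are borrowed from the stack trace. The idea, under those assumptions: if resetting the recovery throws while a retry is being prepared, the recovery is failed explicitly so that the lock it holds is released instead of leaking.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class RetryRecoverySketch {

    /** Hypothetical stand-in for a per-shard lock held for the lifetime of a recovery. */
    static final class ShardLock {
        private final AtomicBoolean held = new AtomicBoolean(true);

        void release() { held.set(false); }

        boolean isHeld() { return held.get(); }
    }

    /** Hypothetical stand-in for the recovery status object that owns the shard lock. */
    static final class Recovery {
        final ShardLock lock = new ShardLock();
        private final boolean shardClosed;
        private boolean failed;

        Recovery(boolean shardClosed) { this.shardClosed = shardClosed; }

        /** Resets state before a retry; throws if the shard is no longer recovering. */
        void resetRecovery() {
            if (shardClosed) {
                throw new IllegalStateException("CurrentState[CLOSED] Shard not in recovering state");
            }
        }

        /** Marks the recovery as failed and releases the shard lock. */
        void fail(Exception cause) {
            failed = true;
            lock.release();
        }

        boolean isFailed() { return failed; }
    }

    /**
     * Prepares a retry. Returns true if the retry may be scheduled. If the reset
     * throws, the recovery is failed so the shard lock is released rather than leaked.
     */
    static boolean retryRecovery(Recovery recovery) {
        try {
            recovery.resetRecovery();
            return true;
        } catch (Exception e) {
            recovery.fail(e);
            return false;
        }
    }

    public static void main(String[] args) {
        Recovery closed = new Recovery(true);
        boolean retried = retryRecovery(closed);
        System.out.println("retried=" + retried
                + " failed=" + closed.isFailed()
                + " lockHeld=" + closed.lock.isHeld());
    }
}
```

Without the `catch` branch, the exception would propagate out of `retryRecovery` while the lock stays held, which matches the later `LockObtainFailedException` when the index store is deleted.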

dakrone · 2015-05-13T14:54:04Z

LGTM

Fail recovery if retry recovery if resetRecovery fails

6a43fe3

s1monw added >bug v2.0.0-beta1 v1.6.0 v1.5.3 labels May 13, 2015

s1monw merged commit 6a43fe3 into elastic:master May 13, 2015

s1monw deleted the fail_recovery_on_retry_error branch May 13, 2015 14:57

clintongormley added the :Distributed/Recovery label May 29, 2015
