We currently cancel a recovery when the shard is no longer assigned to the target node, or when the primary shard (the source of the copy) is moved to another node, among other scenarios. That cancel logic doesn't clean up any temporary files created during the recovery.
Normally that's not a problem, as the files will be cleaned up once the shard is safely recovered somewhere else (or locally). However, if a node runs into continuous failure cycles, the leftover files can fill up the disk, causing bigger problems like corrupting other shards on the node.
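To make the failure mode and the intended behaviour concrete, here is a minimal sketch of cancel logic that removes its own temporary files. It is not the Elasticsearch implementation: the `RecoveryTarget` class, the `recovery.` file prefix, and the `demo` helper are all hypothetical names chosen for illustration, and the sketch assumes that in-flight files can be recognised by a naming prefix.

```python
import tempfile
from pathlib import Path

# Hypothetical naming convention: files still being copied carry this prefix.
TEMP_PREFIX = "recovery."


class RecoveryTarget:
    """Hypothetical stand-in for the target side of one peer recovery."""

    def __init__(self, shard_dir):
        self.shard_dir = Path(shard_dir)
        self.cancelled = False

    def write_temp_file(self, name, data):
        # Files live under a temporary name until the recovery completes.
        path = self.shard_dir / (TEMP_PREFIX + name)
        path.write_bytes(data)
        return path

    def cancel(self):
        """Mark the recovery cancelled and delete its temporary files.

        Returns the number of files removed.
        """
        self.cancelled = True
        removed = 0
        for path in self.shard_dir.glob(TEMP_PREFIX + "*"):
            try:
                path.unlink()
                removed += 1
            except FileNotFoundError:
                pass  # already removed by someone else
        return removed


def demo():
    """Start a recovery, cancel it, and report what is left on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        shard_dir = Path(tmp)
        # A file that is not part of the in-flight recovery must survive.
        (shard_dir / "segments_1").write_bytes(b"committed")
        target = RecoveryTarget(shard_dir)
        target.write_temp_file("_0.cfs", b"partial")
        target.write_temp_file("_0.si", b"partial")
        removed = target.cancel()
        return removed, sorted(p.name for p in shard_dir.iterdir())
```

Calling `demo()` cancels a recovery that has written two temporary files and returns the number of files deleted together with the remaining directory contents. Without the cleanup loop in `cancel`, each cancelled attempt would leave its partial files behind, which is how repeated failure cycles can exhaust disk space.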
clintongormley changed the title from "Recovery: cancelling a recovery may leave temporary files behind" to "Resiliency: Cancelling a recovery may leave temporary files behind" on Sep 26, 2014
At the moment, we leave temporary files around if a peer (replica) recovery is canceled. Those files will normally be cleaned up once the shard is started elsewhere, but in case of errors this can lead to trouble. If recoveries are started and canceled often, we may cause nodes to run out of disk space.
Closes elastic#7893
bleskes added a commit to bleskes/elasticsearch that referenced this issue on Oct 14, 2014