Failed shard recovery can cause shard data to be deleted (replicas will still work) #1227

Closed
kimchy opened this issue Aug 10, 2011 · 2 comments

Comments

@kimchy
Member

kimchy commented Aug 10, 2011

A failed shard recovery (for example, because of an OOM error and the like) can cause the shard's data to be deleted from disk. Although replicas may still be there to recover from, this should not happen.

@kimchy kimchy closed this as completed in d25c939 Aug 10, 2011
@abh
Contributor

abh commented Aug 11, 2011

We're looking forward to a release with this.

We "lucked out" yesterday: enough of our nodes died with an out-of-heap error within a few minutes of each other that we lost 3/4 of our shards (we have been furiously rebuilding on a new cluster since then).

@bcoe

bcoe commented Aug 11, 2011

I believe I've run into this issue as well.

The behaviour I've observed:

  • A shard blows up (either out of memory or, in one case, out of file descriptors).
  • Upon restarting, the shard never recovers its state, and errors resembling (org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException) are thrown endlessly.

I will be eagerly waiting for this patch to make it into the next stable release :)

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
jev001 pushed a commit to jev001/elasticsearch that referenced this issue Dec 28, 2022