Snapshot failed in deleting missing files during cleanup #27061

Closed
saijone opened this issue Oct 20, 2017 · 3 comments
Labels
:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs feedback_needed

Comments

@saijone
saijone commented Oct 20, 2017

Describe the feature:

Elasticsearch version (bin/elasticsearch --version): 5.4.3

Plugins installed: [a custom repository plugin]

JVM version (java -version): 1.8.0_131

OS version (uname -a if on a Unix-like system): Oracle Enterprise Linux 6 with Redhat kernel

Description of the problem including expected versus actual behavior:
We are using a custom repository plugin that talks Swift (https://wiki.openstack.org/wiki/Swift) to the backend ObjectStore. Snapshots are taken hourly into a repository, and a new repository is used every month.

We found that a repository may have become corrupted: every time a snapshot is taken, the snapshot process fails with two errors while trying to delete two missing files during cleanup:

IndexShardSnapshotFailedException[error deleting index file [index-232] during cleanup]; nested: NoSuchFileException[Failed to delete 'indices/g0UOKeKXSCSL3rbNvfjq4Q/0/index-232

IndexShardSnapshotFailedException[error deleting index file [pending-index-427] during cleanup]; nested: NoSuchFileException[Failed to delete 'indices/g0UOKeKXSCSL3rbNvfjq4Q/1/pending-index-427

The custom repository plugin throws NoSuchFileException in BlobContainer.deleteBlob(String blobName) if the blob does not exist. This NoSuchFileException stops the snapshot process, which then fails during cleanup. If we modify the repository plugin to log a warning instead of throwing NoSuchFileException in BlobContainer.deleteBlob(String blobName), the snapshot succeeds (on every shard) and the index can be restored from it.
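The workaround described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's code and not the Elasticsearch `BlobContainer` interface: `SketchBlobContainer`, `writeBlob`, `deleteBlobIgnoringMissing`, and the in-memory map are hypothetical stand-ins for the Swift-backed container.

```java
import java.nio.file.NoSuchFileException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the plugin's container; NOT the Elasticsearch interface.
class SketchBlobContainer {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    void writeBlob(String blobName, byte[] data) {
        blobs.put(blobName, data);
    }

    // Strict delete: throws when the blob is absent, as the plugin originally did.
    void deleteBlob(String blobName) throws NoSuchFileException {
        if (blobs.remove(blobName) == null) {
            throw new NoSuchFileException("Failed to delete '" + blobName + "'");
        }
    }

    // Lenient delete (the workaround): warn and report false instead of throwing.
    boolean deleteBlobIgnoringMissing(String blobName) {
        try {
            deleteBlob(blobName);
            return true;
        } catch (NoSuchFileException e) {
            System.err.println("WARN: blob already missing, skipping delete: " + blobName);
            return false;
        }
    }
}

class LenientDeleteDemo {
    public static void main(String[] args) {
        SketchBlobContainer container = new SketchBlobContainer();
        container.writeBlob("index-231", new byte[0]);
        // First call removes an existing blob, second call hits a blob that was never stored.
        System.out.println(container.deleteBlobIgnoringMissing("index-231"));
        System.out.println(container.deleteBlobIgnoringMissing("index-232"));
    }
}
```

Keeping the strict method and layering the lenient one on top makes it easy to see which call sites tolerate a missing blob; whether swallowing the exception is safe for a given repository is a separate question.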

With this workaround in place, the missing-file warning from deleteBlob still shows up every time a snapshot is taken in that repository. We suspect the file names are referenced in some metadata that is not being cleaned up correctly.

We have also tried deleting the snapshots in which these errors first appeared. The snapshots are deleted with the same errors (warnings), but this does not 'fix' the repository: taking a snapshot in it still produces the same errors about deleting the same missing files.
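One way to narrow down where the stale names come from is to compare what the container listing reports against a per-blob existence check. The sketch below assumes the names being deleted originate from a container listing, which this report does not confirm. `ListingConsistencyCheck`, `listedButMissing`, and the `blobExists` predicate are hypothetical helpers, not part of Elasticsearch or the plugin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Predicate;

class ListingConsistencyCheck {
    // Returns the names that appear in the listing but fail the existence check, in sorted order.
    static List<String> listedButMissing(Collection<String> listedNames, Predicate<String> blobExists) {
        List<String> phantoms = new ArrayList<>();
        for (String name : new TreeSet<>(listedNames)) { // sorted and de-duplicated
            if (!blobExists.test(name)) {
                phantoms.add(name);
            }
        }
        return phantoms;
    }

    public static void main(String[] args) {
        // Hypothetical listing for one shard directory.
        List<String> listing = Arrays.asList("index-233", "index-232", "pending-index-427");
        // Pretend only index-233 is really stored in the backend.
        System.out.println(listedButMissing(listing, name -> name.equals("index-233")));
    }
}
```

In a real check, the predicate would be backed by whatever existence call the object store client offers. If the listing keeps returning names that the store cannot fetch, that points at the backend rather than at repository metadata.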

Steps to reproduce:

Please include a minimal but complete recreation of the problem, including
(e.g.) index creation, mappings, settings, query etc. The easier you make for
us to reproduce it, the more likely that somebody will take the time to look at it.

Provide logs (if relevant):

@ywelsch ywelsch added the :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs label Oct 22, 2017
@saijone
Author

saijone commented Nov 1, 2017

We are investigating a problem in the backend ObjectStore that the plugin talks to, which could potentially cause the delete failure.

@tlrx
Member

tlrx commented Nov 9, 2017

@saijone Do you have an update on this, or are you still digging into the backend issue?

@tlrx
Member

tlrx commented Nov 27, 2017

Closing as no feedback provided.
