Partial snapshots when deleted s3 files reappear in bucket #28322
Thanks for this very detailed report @TruthyBoolish! I've seen similar behavior from time to time on repositories with a lot of creation/deletion per hour. I created #30332, which relaxes the constraint when deleting files during the finalization of the snapshot. It should help reduce these errors.
When deleting or creating a snapshot for a given shard, Elasticsearch usually starts by listing all the existing snapshotted files in the repository. Then it computes a diff and deletes the snapshotted files that are no longer needed. During this deletion, an exception is thrown if the file to be deleted does not exist anymore. This behavior is challenging with cloud-based repository implementations like S3, where a deleted file can still appear in the bucket for a few seconds or minutes (because the deletion can take some time to be fully replicated on S3). If the deleted file appears in the listing of files, then the subsequent deletion will fail with a NoSuchFileException and the snapshot will be partially created/deleted. This pull request makes the deletion of these files a bit less strict, i.e. it does not fail if the file to be deleted no longer exists. It introduces a new BlobContainer.deleteIgnoringIfNotExists() method that can be used at specific places where not failing when deleting a file is considered harmless. Closes #28322
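The change described above can be illustrated with a minimal sketch. All names here are hypothetical and simplified, not Elasticsearch's actual BlobContainer API: the idea is a strict delete that throws when the blob is missing, plus a lenient wrapper that swallows only the not-found case.

```java
import java.io.IOException;
import java.nio.file.NoSuchFileException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the deleteIgnoringIfNotExists() idea: keep the strict
// delete, but add a lenient variant for call sites where a missing blob is harmless.
public class LenientDeleteSketch {

    interface BlobStore {
        // Strict delete: throws NoSuchFileException if the blob does not exist.
        void deleteBlob(String blobName) throws IOException;

        // Lenient delete: swallows only the "not found" case, so a blob that was
        // already removed (e.g. a stale S3 listing entry) does not abort snapshot
        // finalization; any other IOException still propagates.
        default void deleteIgnoringIfNotExists(String blobName) throws IOException {
            try {
                deleteBlob(blobName);
            } catch (NoSuchFileException e) {
                // Already gone: treat as success.
            }
        }
    }

    public static void main(String[] args) {
        Set<String> blobs = new HashSet<>();
        blobs.add("snap-1.dat");
        BlobStore store = name -> {
            if (!blobs.remove(name)) {
                throw new NoSuchFileException(name);
            }
        };
        try {
            store.deleteIgnoringIfNotExists("snap-1.dat"); // deletes the blob
            store.deleteIgnoringIfNotExists("snap-1.dat"); // already gone: no exception
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        System.out.println("remaining blobs: " + blobs.size());
    }
}
```

The second call would have thrown with a strict deleteBlob(); the lenient variant treats it as a no-op, which is exactly the behavior wanted when a stale listing schedules a delete for a blob that S3 has since finished removing.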
As far as I can find in the release notes, this is fixed in versions 6.3.0 and 6.4.0; correct me if I am wrong.
Elasticsearch version: 5.6.5 and 6.1.2
Plugins installed: [discovery-ec2, repository-s3]
JVM version: 1.8
OS version: Ubuntu 16.04
Description of the problem including expected versus actual behavior:
The s3 repository plugin occasionally produces partial snapshots due to throwing an exception while deleting a file from AWS s3 storage. This is a result of s3 only providing eventual consistency for read-after-delete operations. The full documentation on that is here:
https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel
but the key section says:
A process deletes an existing object and immediately lists keys within its bucket. Until the deletion is fully propagated, Amazon S3 might list the deleted object.
In more concrete terms, I'm using the s3 repository plugin in circumstances where the deleted snapshot files can still show up in the bucket for seconds/minutes after they've been deleted. When the s3 plugin then tries to delete one of these files for the second time, s3 then correctly shows it doesn't exist, and the s3 plugin considers that enough of an error condition to abort that particular part of the snapshot.
Here is a log of such an error (with some redactions). This log shows the problem happens to multiple indices and shards in the snapshot operation, and some of the failures involve a common file in s3.
This log was produced with elasticsearch and repository-s3 version 5.6.5, but I have recreated the problem in 6.1.2 as well. The logic involved has not changed, so this makes sense.
When listBlobs() is called at the start of a snapshot operation,
https://github.com/elastic/elasticsearch/blob/v6.1.2/core/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java#L1133
s3 can return the files which have already been deleted.
Later, during the finalize phase, it attempts to delete all of the blobs that were listed above. The deleteBlob() method verifies that each blob exists, but the blobs that were previously deleted no longer exist individually, so deleteBlob() throws an exception.
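The failure sequence above can be modeled with a toy blob store (all class and method names here are hypothetical) whose listing lags behind deletes, the way s3's read-after-delete consistency can: listBlobs() returns a stale key, and the subsequent strict delete throws NoSuchFileException.

```java
import java.nio.file.NoSuchFileException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the bug: 'actual' is what really exists, 'listing' is a possibly
// stale view, like an s3 bucket listing that has not yet seen a delete propagate.
public class StaleListingSketch {
    private final Set<String> actual = new HashSet<>();
    private final Set<String> listing = new HashSet<>();

    void put(String name) { actual.add(name); listing.add(name); }

    // A delete whose effect has not yet reached the listing.
    void deleteWithLag(String name) { actual.remove(name); }

    List<String> listBlobs() { return new ArrayList<>(listing); }

    // Mirrors the strict deleteBlob(): verify existence first, then delete.
    void deleteBlob(String name) throws NoSuchFileException {
        if (!actual.contains(name)) {
            throw new NoSuchFileException(name);
        }
        actual.remove(name);
        listing.remove(name);
    }

    public static void main(String[] args) {
        StaleListingSketch bucket = new StaleListingSketch();
        bucket.put("index-0");
        bucket.deleteWithLag("index-0");           // deleted, but listing is stale
        for (String stale : bucket.listBlobs()) {  // still returns "index-0"
            try {
                bucket.deleteBlob(stale);          // second delete: blob is gone
            } catch (NoSuchFileException e) {
                System.out.println("snapshot would go partial: " + e.getFile());
            }
        }
    }
}
```

In the real plugin the stale key comes from s3 rather than an in-memory set, but the shape of the failure is the same: list, diff, delete, NoSuchFileException, partial snapshot.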
https://github.com/elastic/elasticsearch/blob/v6.1.2/core/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java#L1238
Steps to reproduce:
This depends on unpredictable s3 consistency behaviors, and so may not be easy to reproduce naturally. I will include a patch against 6.1.2 which will allow you to simulate this scenario by sleeping at the key points in the snapshot operation. This will allow you to add and delete the files from s3, to simulate its eventual consistency.
All that's needed to reproduce is to take a snapshot and have an s3 deleted file reappear in the bucket contents at the right time. It will be a partial snapshot, and the above errors will be logged.
If using the patch which will be provided:
If using the patch which will be provided: add a file with an `index-` prefix to the s3 path noted in the logs, for instance `index-DUMMY.txt`. (This simulates a file which was previously deleted by the plugin showing up again in the bucket contents.)
Suggested fix:
Since this is unlikely to occur for the majority of s3 users, a fix which would not affect them also seems appropriate. One option that occurs to me is an optional configuration property that filters the listBlobs() result for any blobs the plugin knows it has already deleted. Keeping track of that state may present a challenge, so being able to configure more lenient deleteBlob() checks (and thus accepting the risk of the s3 repo being concurrently modified by other processes) would also make sense to me.
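The first idea above could look something like this sketch (purely illustrative; none of these names exist in the plugin): a small tracker that remembers which blobs this process has already deleted and filters them out of subsequent listings.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch of the suggested fix: track blobs we already deleted in
// this session and hide them from later listings, so a stale s3 listing does
// not schedule a second (failing) delete.
public class DeleteTrackingContainer {
    private final Set<String> deletedByUs = new HashSet<>();

    // Record a successful delete issued by this process.
    void recordDelete(String blobName) {
        deletedByUs.add(blobName);
    }

    // Filter a raw listing that may still contain stale, already-deleted keys.
    List<String> filterListing(List<String> rawListing) {
        return rawListing.stream()
                .filter(name -> !deletedByUs.contains(name))
                .collect(Collectors.toList());
    }
}
```

The hard part, as noted, is the state: `deletedByUs` only covers deletes made by this process, so a repository modified concurrently by another process is still at risk, which is why a more lenient deleteBlob() is the simpler alternative.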
Patch to simulate the issue: