
Add Bulk Delete Api to BlobStore #40322

Merged · 14 commits · Apr 2, 2019

Conversation

@original-brownbear (Member) commented Mar 21, 2019

  • Adds a bulk delete API to the blob container
  • Implements the bulk delete API for S3
    • I'd look into other implementations in a subsequent PR if the approach here is OK. This would also be easier once Async Snapshot Repository Deletes #40144 is merged, as it would probably be best to push the async logic down to the blob container to get more fine-grained control (since we will want different parallelism depending on whether or not bulk deletes are available in an implementation, I think)
  • Uses the bulk API instead of loops over single deletes
  • Adjusts S3Fixture to accept both path styles for bulk deletes, since the S3 SDK uses both during our ITs
  • Closes Add Bulk Delete API to Blob Store Interface #40250

For S3 this should give us close to a 1000x speedup for deletes of huge indices/shards, since a single bulk request can delete up to 1000 keys.
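To make that concrete, here's a minimal, hypothetical sketch of the underlying S3 call (illustration only, not code from this PR; the client setup, bucket, and key names are placeholder assumptions):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;

public class BulkDeleteSketch {
    public static void main(String[] args) {
        AmazonS3 client = AmazonS3ClientBuilder.defaultClient();
        // DeleteObjects accepts up to 1000 keys per request, so N single
        // DELETE round trips collapse into ceil(N / 1000) bulk requests --
        // the source of the "almost 1000x" figure above.
        client.deleteObjects(new DeleteObjectsRequest("my-bucket")
            .withKeys("indices/0/snap-1.dat", "indices/0/snap-2.dat"));
    }
}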

@original-brownbear added >enhancement, :Distributed/Snapshot/Restore, v8.0.0, and v7.2.0 labels on Mar 21, 2019
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed

@original-brownbear (Member Author)

Jenkins run elasticsearch-ci/bwc

@andrershov (Contributor) left a comment:

Thanks for the PR!
I've added a few comments. Also, I wonder whether we need to add some kind of test for the deleteBlobs method?

});
};
handlers.insert(nonAuthPath(HttpPost.METHOD_NAME, "/"), bulkDeleteHandler);
handlers.insert(nonAuthPath(HttpPost.METHOD_NAME, "/{bucket}"), bulkDeleteHandler);
Contributor:

Will this implementation work for both cases? It seems that {bucket} is not even used in the implementation.

Member Author:

Yeah, the unfortunate reality here is that the current fixture's logic for bulk delete doesn't even care about the bucket and simply tries to find the given blobs in any bucket.
I wouldn't really put any effort into this, tbh. I think we should probably rather look into just removing the fixture now that we have the Docker-based Minio tests and third-party tests.
The fixture seems completely redundant now ...

}
});
} catch (final AmazonClientException e) {
throw new IOException("Exception when deleting blobs [" + blobNames + "]", e);
Contributor:

If there is an IOException we do not proceed, even if we have more DeleteRequests to send.
Previously, if one delete failed, we still proceeded with the next delete requests.

Member Author:

Right, fixed this for S3 as well as the generic case now by catching and aggregating exceptions in the loop :)
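(For illustration, a hedged sketch of that aggregation pattern, assuming an AWS SDK v1 client; the helper below is made up for the example and is not the PR's exact code:)

import com.amazonaws.AmazonClientException;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;

import java.io.IOException;
import java.util.List;

class ExceptionAggregationSketch {
    private static final int MAX_BULK_DELETES = 1000; // S3's per-request limit

    static void deleteBlobs(AmazonS3 client, String bucket, List<String> blobNames) throws IOException {
        IOException aggregated = null;
        for (int i = 0; i < blobNames.size(); i += MAX_BULK_DELETES) {
            List<String> batch = blobNames.subList(i, Math.min(i + MAX_BULK_DELETES, blobNames.size()));
            try {
                client.deleteObjects(new DeleteObjectsRequest(bucket).withKeys(batch.toArray(new String[0])));
            } catch (AmazonClientException e) {
                // Keep going: later batches are still attempted, and every
                // failure ends up on a single aggregate exception.
                if (aggregated == null) {
                    aggregated = new IOException("Exception when deleting blobs " + batch, e);
                } else {
                    aggregated.addSuppressed(e);
                }
            }
        }
        if (aggregated != null) {
            throw aggregated;
        }
    }
}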

logger.warn(() ->
new ParameterizedMessage(
"[{}] indices [{}] are no longer part of any snapshots in the repository, " +
"but failed to clean up their index folders.", metadata.name(), indicesToCleanUp), ioe);
Contributor:

We no longer know which particular indices were not removed; we just log all indices, including the successful ones.
The same applies to the deleteBlobs usage below.
Perhaps we could add this kind of information to the IOException thrown by deleteBlobs?

Member Author:

Done in 9a13dadd0c1 :) We now aggregate all the exceptions. The IndexIds printed here contain the name and the index snapshot id, so with the information in the aggregate exception we can work out which index failed to clean up.

Contributor:

Do you know how AWS S3 behaves when you send a bulk request of 1000 entries and entry number 500 fails? Will it stop at entry 500, or will it try to delete all entries in the bulk?
Will S3 include information about each failed entry in the exception, or just the first one?

Member Author:

@andrershov it will try to delete all of the entries and give errors for all the ones that failed :)
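(A hedged illustration of that behavior: with the AWS SDK for Java v1, a partial failure surfaces as a MultiObjectDeleteException listing each failed key. The wrapper method and its parameters are made up for the example:)

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.MultiObjectDeleteException;

class PartialFailureSketch {
    // S3 attempts every key in the request, then reports per-key failures.
    static void deleteReportingFailures(AmazonS3 client, String bucket, String... keys) {
        try {
            client.deleteObjects(new DeleteObjectsRequest(bucket).withKeys(keys));
        } catch (MultiObjectDeleteException e) {
            for (MultiObjectDeleteException.DeleteError error : e.getErrors()) {
                System.err.println("failed to delete " + error.getKey() + ": " + error.getMessage());
            }
            // Keys that succeeded are reported alongside the failures.
            System.out.println(e.getDeletedObjects().size() + " keys were deleted anyway");
        }
    }
}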

@original-brownbear (Member Author)

@andrershov thanks for taking a look! All points addressed and tests added in
ba3d485 and 2c3fd34 :)

@andrershov (Contributor) left a comment:

@original-brownbear mostly looks good. But before approving I want to understand how S3 deals with bulk requests in terms of exceptions when several deletions have failed.
I want to be sure that S3 will try to delete all the elements in the bulk even if some of them have problems, and that information about all failed elements will be included in the exception.

* Adds Bulk delete API to blob container
* Implement bulk delete API for S3
* Adjust S3Fixture to accept both path styles for bulk deletes since the S3 SDK uses both during our ITs
* Closes elastic#40250
@original-brownbear (Member Author)

@andrershov fixed the docs on deleteBlob (sorry for the force push, I was a little clumsy there and didn't realize what computer I was on at the time :D) and answered your S3 question :)

@andrershov (Contributor) left a comment:

LGTM

@original-brownbear (Member Author)

@andrershov thanks
@ywelsch you're good with this one too?

@ywelsch (Contributor) left a comment:

Thanks @original-brownbear. Looking very good already. I've left some minor points.

@@ -56,6 +57,11 @@

class S3BlobContainer extends AbstractBlobContainer {

/**
* Maximum number of deletes in a {@link DeleteObjectsRequest}.
Contributor:

perhaps link to AWS docs here

@@ -118,6 +124,51 @@ public void deleteBlob(String blobName) throws IOException {
deleteBlobIgnoringIfNotExists(blobName);
}

@Override
public void deleteBlobs(List<String> blobNames) throws IOException {
Contributor:

I think we should call this deleteBlobsIgnoringIfNotExists?

"but failed to clean up its index folder.", metadata.name(), indexId), ioe);
logger.warn(() ->
new ParameterizedMessage(
"[{}] indices [{}] are no longer part of any snapshots in the repository, " +
Contributor:

The second placeholder can be just {} instead of [{}], since it's a list that will render with the brackets anyway. Same issue in other places in this PR.
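(A quick illustration of the difference, exercising log4j2's ParameterizedMessage directly; the index names are made up:)

import org.apache.logging.log4j.message.ParameterizedMessage;

import java.util.Arrays;
import java.util.List;

public class PlaceholderDemo {
    public static void main(String[] args) {
        List<String> indices = Arrays.asList("idx-1", "idx-2");
        // A list already renders with brackets, so "[{}]" doubles them up.
        System.out.println(new ParameterizedMessage("indices [{}]", indices).getFormattedMessage());
        // -> indices [[idx-1, idx-2]]
        System.out.println(new ParameterizedMessage("indices {}", indices).getFormattedMessage());
        // -> indices [idx-1, idx-2]
    }
}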

snapshotId, shardId, blobName), e);
}
}
final List<String> staleBlobs = blobs.keySet().stream()
Contributor:

perhaps call this orphanedBlobs?

@original-brownbear deleted the 40250 branch April 2, 2019 20:46
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Apr 3, 2019
…leniency

* elastic/master:
  SQL: Fix deserialisation issue of TimeProcessor (elastic#40776)
  Improve GCS docs for using keystore (elastic#40605)
  Add Restore Operation to SnapshotResiliencyTests (elastic#40634)
  Small refactorings to analysis components (elastic#40745)
  SQL: Fix display size for DATE/DATETIME (elastic#40669)
  add HLRC protocol tests for transform state and stats (elastic#40766)
  Inline TransportReplAction#registerRequestHandlers (elastic#40762)
  remove experimental label from search_as_you_type documentation (elastic#40744)
  Remove some abstractions from `TransportReplicationAction` (elastic#40706)
  Upgrade to latest build scan plugin (elastic#40702)
  Use default memory lock setting in testing (elastic#40730)
  Add Bulk Delete Api to BlobStore (elastic#40322)
  Remove yaml skips older than 7.0 (elastic#40183)
  Docs: Move id in the java-api (elastic#40748)
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 16, 2019
* Adds Bulk delete API to blob container
* Implement bulk delete API for S3
* Adjust S3Fixture to accept both path styles for bulk deletes since the S3 SDK uses both during our ITs
* Closes elastic#40250
original-brownbear added a commit that referenced this pull request Apr 16, 2019
* Adds Bulk delete API to blob container
* Implement bulk delete API for S3
* Adjust S3Fixture to accept both path styles for bulk deletes since the S3 SDK uses both during our ITs
* Closes #40250
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 19, 2019
* Just like elastic#40322 for AWS
* We already had a bulk delete API but weren't using it from the blob container implementation, now we are using it
  * Made the bulk delete API also compliant with our interface that only suppresses errors about non existent blobs by stating failed deletes (I didn't use any bulk stat action here since having to stat here should be the exception anyway and it would make error handling a lot more complex)
* Fixed bulk delete API to limit its batch size to 100 in line with GCS recommendations
original-brownbear added a commit that referenced this pull request Apr 30, 2019
* Implement Bulk Deletes for GCS Repository

* Just like #40322 for AWS
* We already had a bulk delete API but weren't using it from the blob container implementation, now we are using it
  * Made the bulk delete API also compliant with our interface that only suppresses errors about non existent blobs by stating failed deletes (I didn't use any bulk stat action here since having to stat here should be the exception anyway and it would make error handling a lot more complex)
* Fixed bulk delete API to limit its batch size to 100 in line with GCS recommendations
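
(For reference, a hedged sketch of batched GCS deletes capped at 100 per batch as described above, using the google-cloud-storage client; the bucket and blob names are placeholders and this is not the repository's actual code:)

import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageBatch;
import com.google.cloud.storage.StorageOptions;

import java.util.Arrays;
import java.util.List;

public class GcsBulkDeleteSketch {
    private static final int MAX_BATCH_SIZE = 100; // batch limit per the GCS recommendation

    public static void main(String[] args) {
        Storage storage = StorageOptions.getDefaultInstance().getService();
        List<String> blobs = Arrays.asList("indices/0/snap-1.dat", "indices/0/snap-2.dat");
        for (int i = 0; i < blobs.size(); i += MAX_BATCH_SIZE) {
            StorageBatch batch = storage.batch();
            for (String name : blobs.subList(i, Math.min(i + MAX_BATCH_SIZE, blobs.size()))) {
                batch.delete(BlobId.of("my-bucket", name));
            }
            batch.submit(); // one HTTP round trip for up to 100 deletes
        }
    }
}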
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 30, 2019
* Implement Bulk Deletes for GCS Repository

* Just like elastic#40322 for AWS
* We already had a bulk delete API but weren't using it from the blob container implementation, now we are using it
  * Made the bulk delete API also compliant with our interface that only suppresses errors about non existent blobs by stating failed deletes (I didn't use any bulk stat action here since having to stat here should be the exception anyway and it would make error handling a lot more complex)
* Fixed bulk delete API to limit its batch size to 100 in line with GCS recommendations
original-brownbear added a commit that referenced this pull request Apr 30, 2019
* Implement Bulk Deletes for GCS Repository (#41368)

* Just like #40322 for AWS
* We already had a bulk delete API but weren't using it from the blob container implementation, now we are using it
  * Made the bulk delete API also compliant with our interface that only suppresses errors about non existent blobs by stating failed deletes (I didn't use any bulk stat action here since having to stat here should be the exception anyway and it would make error handling a lot more complex)
* Fixed bulk delete API to limit its batch size to 100 in line with GCS recommendations

back port of #41368
akhil10x5 pushed a commit to akhil10x5/elasticsearch that referenced this pull request May 2, 2019
* Implement Bulk Deletes for GCS Repository

* Just like elastic#40322 for AWS
* We already had a bulk delete API but weren't using it from the blob container implementation, now we are using it
  * Made the bulk delete API also compliant with our interface that only suppresses errors about non existent blobs by stating failed deletes (I didn't use any bulk stat action here since having to stat here should be the exception anyway and it would make error handling a lot more complex)
* Fixed bulk delete API to limit its batch size to 100 in line with GCS recommendations
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
* Adds Bulk delete API to blob container
* Implement bulk delete API for S3
* Adjust S3Fixture to accept both path styles for bulk deletes since the S3 SDK uses both during our ITs
* Closes elastic#40250
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
* Implement Bulk Deletes for GCS Repository

* Just like elastic#40322 for AWS
* We already had a bulk delete API but weren't using it from the blob container implementation, now we are using it
  * Made the bulk delete API also compliant with our interface that only suppresses errors about non existent blobs by stating failed deletes (I didn't use any bulk stat action here since having to stat here should be the exception anyway and it would make error handling a lot more complex)
* Fixed bulk delete API to limit its batch size to 100 in line with GCS recommendations
mkleen added 8 commits to crate/crate that referenced this pull request May 26–27, 2020
mergify bot pushed a commit to crate/crate that referenced this pull request May 27, 2020