HADOOP-18656: [Backport to 3.4] [ABFS] Adding Support for Paginated Delete for Large Directories in HNS Account #6718
…tories in HNS Account (apache#6409) Contributed by Anuj Modi
🎊 +1 overall
This message was automatically generated.
:::: AGGREGATED TEST RESULT ::::
@steveloughran
@steveloughran, @mukund-thakur...
There are conflicts here after the test patch has been merged.
:::: AGGREGATED TEST RESULT ::::
Resolved conflicts. Thanks a lot.
🎊 +1 overall
This message was automatically generated.
Description of PR
Jira Ticket: https://issues.apache.org/jira/browse/HADOOP-18656
Commit from trunk: 6ed7389
Today, when a recursive delete is issued for a large directory in an ADLS Gen2 (HNS) account, the directory deletion itself happens in O(1), but the backend performs ACL checks recursively for each object inside that directory. For a large directory, this can cause the request to time out. To address this, pagination has been introduced in the Azure Storage backend for these ACL checks.
More information on how pagination works can be found in the public documentation of the Azure Delete Path API.
This PR contains the client-side changes to support this. To trigger pagination, the client needs to add a new query parameter, "paginated", and set it to true along with recursive set to true. In response, if the directory is large, the server may return a continuation token to the caller. If the caller receives a continuation token, it has to call the delete API again with that continuation token, keeping recursive and paginated set to true. This is similar to directory delete on an FNS account.
Pagination is available only from version "2023-08-03" onwards.
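The client-side flow described above can be sketched roughly as follows. This is an illustrative sketch only, not the ABFS driver's actual code: the class `PaginatedDeleteSketch`, the methods `buildQuery`, `deleteRecursively` and `supportsPagination`, and the `sendDelete` callback are hypothetical names, the `continuation` query parameter name is an assumption, and the version check assumes API versions are date strings that compare lexically.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PaginatedDeleteSketch {

    // First version that supports pagination, per the PR description.
    static final String MIN_PAGINATED_VERSION = "2023-08-03";

    /** Assumes versions are yyyy-MM-dd strings, so lexical order matches date order. */
    static boolean supportsPagination(String apiVersion) {
        return apiVersion.compareTo(MIN_PAGINATED_VERSION) >= 0;
    }

    /** Builds the query parameters for one delete call (hypothetical helper). */
    static Map<String, String> buildQuery(boolean recursive, boolean paginated, String continuation) {
        Map<String, String> query = new LinkedHashMap<>();
        query.put("recursive", String.valueOf(recursive));
        // "paginated" is only sent together with recursive=true.
        if (recursive && paginated) {
            query.put("paginated", "true");
        }
        // Assumed parameter name for passing the token back to the server.
        if (continuation != null && !continuation.isEmpty()) {
            query.put("continuation", continuation);
        }
        return query;
    }

    /**
     * Repeats the delete call until the server stops returning a continuation token.
     * sendDelete is a hypothetical transport: it takes the query parameters and
     * returns the continuation token from the response, or null if there is none.
     * Returns the number of delete calls issued.
     */
    static int deleteRecursively(Function<Map<String, String>, String> sendDelete) {
        String token = null;
        int calls = 0;
        do {
            token = sendDelete.apply(buildQuery(true, true, token));
            calls++;
        } while (token != null && !token.isEmpty());
        return calls;
    }

    public static void main(String[] args) {
        // Fake server: hands back two continuation tokens, then completes.
        List<String> tokens = new ArrayList<>(List.of("t1", "t2"));
        int calls = deleteRecursively(q -> tokens.isEmpty() ? null : tokens.remove(0));
        System.out.println("delete calls issued: " + calls);
    }
}
```

With the fake server in `main`, the loop issues three delete calls: the initial one plus one for each continuation token returned. A small directory that the server can handle in one request would return no token, and the loop would exit after a single call.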
The PR also contains functional tests to verify that the driver works correctly with different combinations of the recursive and pagination features for both HNS and FNS accounts.
Full E2E testing of pagination requires a large dataset to be created and hence has not been added to the driver test suite. However, extensive E2E testing has been performed.