Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore mistake in #1315 #1482

Conversation

phoebusm
Copy link
Collaborator

@phoebusm phoebusm commented Apr 9, 2024

Reference Issues/PRs

https://github.com/man-group/ArcticDB/pull/1315/files#r1557428013
Batch delete container is needed to be manually cleared after each batch run.
The operation has been incorrectly removed in #1315

What does this implement or fix?

Restore batch delete container clearing operation

Any other comments?

Checklist

Checklist for code changes...
  • Have you updated the relevant docstrings, documentation and copyright notice?
  • Is this contribution tested against all ArcticDB's features?
  • Do all exceptions introduced raise appropriate error messages?
  • Are API changes highlighted in the PR description?
  • Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

@phoebusm
Copy link
Collaborator Author

phoebusm commented Apr 9, 2024

@muhammadhamzasajjad

@phoebusm phoebusm marked this pull request as ready for review April 9, 2024 10:54
@phoebusm phoebusm requested a review from poodlewars April 9, 2024 10:54
@phoebusm phoebusm self-assigned this Apr 9, 2024
@phoebusm phoebusm added the bug Something isn't working label Apr 9, 2024
@poodlewars
Copy link
Collaborator

Test?

@phoebusm phoebusm force-pushed the fix/restore_emptying_azure_batch_delete_container_after_each_batch_run branch 2 times, most recently from d327ff2 to 4dce7ee Compare April 9, 2024 11:08
@phoebusm
Copy link
Collaborator Author

phoebusm commented Apr 9, 2024

Test?

Will come shortly. This PR is created urgently for fixing the problem as the mistake is critical as it affects all users use Azure as backend storage.
I won't merge the PR until the test is pushed, unless some user does flag it

@muhammadhamzasajjad
Copy link
Collaborator

@phoebusm, thanks for raising this. Not clearing to_delete vector makes it significantly slower if we have many batches. The bug must have been affecting the performance or was it also causing errors? AFAIK, azure does silent deletes if you attempt to remove an inexistant blob. How much slower was it?

@phoebusm phoebusm force-pushed the fix/restore_emptying_azure_batch_delete_container_after_each_batch_run branch from 6132896 to 44dc3ba Compare April 9, 2024 11:46
@phoebusm
Copy link
Collaborator Author

phoebusm commented Apr 9, 2024

@phoebusm, thanks for raising this. Not clearing to_delete vector makes it significantly slower if we have many batches. The bug must have been affecting the performance or was it also causing errors? AFAIK, azure does silent deletes if you attempt to remove an inexistant blob. How much slower was it?

@muhammadhamzasajjad It is more than just performance problem. Deletion can't be run as exception will be thrown. Or, if delete_library is being called, the exception will be swallowed, which could result in memory leak.

Explation:
If the vector don't get cleared, in the next cycle of the loop, extra blob_name will be added to the vector.
On

if (++batch_counter == delete_object_limit) {
, batch_counter will definitely larger than delete_object_limit so all batch delete requests will be run at
submit_batch(to_delete);
.
Because of the extra checking you added in
util::check(blob_names.size() <= BATCH_SUBREQUEST_LIMIT,
, exception arcticdb_ext.exceptions.InternalException: E_ASSERTION_FAILURE Azure delete batch size xxx exceeds maximum permitted batch size of 256 will be thrown before the batch requests being submitted to the SDK.

@phoebusm
Copy link
Collaborator Author

phoebusm commented Apr 9, 2024

@muhammadhamzasajjad Also I have remove the obsolete batch size counter as it becomes obsolete after your refactor.

@phoebusm
Copy link
Collaborator Author

phoebusm commented Apr 9, 2024

Test?

Just added.

@phoebusm phoebusm force-pushed the fix/restore_emptying_azure_batch_delete_container_after_each_batch_run branch from 44dc3ba to 51f029d Compare April 9, 2024 12:04
@phoebusm phoebusm merged commit 23b3b94 into master Apr 9, 2024
112 of 114 checks passed
@phoebusm phoebusm deleted the fix/restore_emptying_azure_batch_delete_container_after_each_batch_run branch April 9, 2024 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants