feat: add recursiveDelete() #1427
Could we move this to BulkWriter (kind of like a singleton pattern)? This could then be a shared instance that could be used by other operations in the future as well.
I also wonder whether the default ramp up makes sense here. It will slow down the delete (especially when compared to the CLI) and it also doesn't quite fit the use case for 5/5/5. We do not have to ramp up to give the database a chance to shard - the data is already sharded, we just want to delete it.
Do you mean that calling `firestore.bulkWriter()` would return the same instance each time, or having a Firestore-internal `BulkWriter.bulkWriter()` getter that lazy-loads?

Yeah, that's something I was thinking about as well to implement as part of the RST_STREAM retries. Deletes aren't subject to 5/5/5 throttling, which creates two possibilities:
1. We can add an internal override on BulkWriter to turn off throttling when recursive delete is called. However, this would mean that writes being performed at the same time won't get throttled.
2. We can add logic for BulkWriter to automatically ignore throttling limits for deletes, but then we'd have to batch deletes separately from writes (or think of another solution that still lets writes be throttled).
Firestore internal getter that lazy-loads. I want to keep it similar to what you have, but try to move these internals out of Firestore.
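For illustration, the lazy-loading getter pattern being discussed might look like the sketch below. The class and method names here are stand-ins, not the actual internal API:

```javascript
// Illustrative lazy-loading getter: a shared BulkWriter is created on first
// use and the same instance is returned afterwards. `FirestoreLike` and
// `_getBulkWriter` are hypothetical names for this sketch only.
class FirestoreLike {
  constructor() {
    this._bulkWriter = null; // not created until first needed
  }

  _getBulkWriter() {
    if (this._bulkWriter === null) {
      // Stand-in for `new BulkWriter(this)` in the real client.
      this._bulkWriter = {pendingOps: []};
    }
    return this._bulkWriter;
  }
}
```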
Added a static `BulkWriter.bulkWriter()` method.
You should add a code snippet here that shows how to collect the references for the failed delete.
Done. I included the default error handler to make it easier to paste. Do you think including it would confuse developers?
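A snippet along these lines might look like the following sketch. It assumes the public `onWriteError` callback and the `documentRef`/`failedAttempts` fields on the error; `MAX_RETRY_ATTEMPTS` and the collector wrapper are illustrative:

```javascript
// Sketch of collecting references for failed deletes via onWriteError.
// It mirrors the default handler (retry a bounded number of times), then
// records the reference once retries are exhausted. The error shape
// (documentRef, failedAttempts) follows BulkWriter's error callback;
// MAX_RETRY_ATTEMPTS and the wrapper itself are illustrative.
const MAX_RETRY_ATTEMPTS = 10;

function makeFailedDeleteCollector() {
  const failedDeletes = [];
  const onWriteError = (error) => {
    if (error.failedAttempts < MAX_RETRY_ATTEMPTS) {
      return true; // default behavior: retry the delete
    }
    failedDeletes.push(error.documentRef.path); // give up and record it
    return false;
  };
  return {failedDeletes, onWriteError};
}
```

The returned handler would be registered with `bulkWriter.onWriteError(...)` before kicking off the recursive delete, leaving `failedDeletes` with the paths that could not be deleted.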
Looks good.
You don't need `T` here. It can be `unknown`.
I had to change the signature to `firestore.DocumentRef` and add some casting logic from the firestore typed classes to our internal classes. Can you double check to see if there's a cleaner way? Thanks!
I also wonder if we should move this somewhere else (not that there is a good place for this). The class is getting a bit big.
Hmm, I'm not really sure where to move it. Are you suggesting we encapsulate all the behavior in a separate class?
Please take a look at https://developers.google.com/maps-booking/reference/grpc-api/status_codes
FAILED_PRECONDITION means that a simple retry won't work here. The developer has to change something about the state of the system before retrying the operation. This is likely not the case here. I would guess that we will have an error code here already when we hit this code path, but if we don't, I would suggest using UNAVAILABLE as the return code.
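To make the distinction concrete, a caller-side retry policy treats the two codes differently. The numeric values below come from the standard gRPC status-code enum; the predicate itself is only an illustration, not client code:

```javascript
// Standard gRPC status-code values: FAILED_PRECONDITION (9) tells the
// caller to change system state before retrying, so an automatic retry is
// pointless; UNAVAILABLE (14) signals a transient condition that is safe
// to retry.
const Status = {FAILED_PRECONDITION: 9, UNAVAILABLE: 14};

// Illustrative retry predicate a generic caller might apply.
function isRetryable(code) {
  return code === Status.UNAVAILABLE;
}
```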
done.
Is not deleting the provided reference in the case of a failed StructuredQuery the ideal, expected behavior? I need a quick sanity check. Here are the options:
I think we originally agreed on Option 2. It makes logical sense -- if Firestore can't fetch any of the descendants, that's a failed precondition, so Firestore shouldn't delete the provided reference. However, what's tripping me up is if the StructuredQuery stream errors out midway through the deletion process. In this case, an argument can be made both ways -- Firestore shouldn't delete the provided reference because it couldn't fetch the descendants (failed precondition argument) OR Firestore should delete the provided reference to be consistent in behavior (the provided reference is deleted even if BulkWriter fails to delete some references).
Trying to document all these edge cases led me to consider Option 3 -- if I'm a developer who simply wants the collection or document deleted, I don't care about the provided reference, and Firestore should just nuke whatever it can get a hold of.
The query can error at any point in time. I would suggest we make no guarantees whatsoever, which makes the overall behavior of the API much more predictable. If any operation fails, the database tree is in an undefined state, and the user needs to try again.
Done. Changed it to always delete the provided reference when possible, but kept the error message in for query stream failures. Not sure if we want the FAILED_PRECONDITION status code though.
I can't think of a more elegant way to differentiate between the final document reference delete failing vs. the BulkWriter deletes. Any suggestions?
I haven't looked at the rest of the PR, but if you want to work on it before I do - couldn't you just let the BulkWriter handle this delete as well?
I still have a few more changes/cleanup I need to push upstream, so reviewing now might result in some wasted work.
Thanks for the suggestion! I originally figured a normal delete is less verbose than enqueuing another operation to BulkWriter and calling flush(), but using BulkWriter allows for any user callbacks to also run.
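A sketch of routing the final delete through BulkWriter, with a stand-in `bulkWriter` object (only the enqueue-then-flush shape matters here; the function name and stand-in are hypothetical):

```javascript
// Sketch: route the final document delete through the same BulkWriter
// instead of a plain ref.delete(), so user-registered success/error
// callbacks also fire for it. `bulkWriter` is any object with the
// enqueue/flush shape of the real class.
async function deleteRootViaBulkWriter(bulkWriter, rootRef) {
  const pendingDelete = bulkWriter.delete(rootRef); // enqueued like any descendant
  await bulkWriter.flush(); // drains the queue, running user callbacks
  return pendingDelete;
}
```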
This should still be underscore prefixed to prevent "usage by code completion".
Renamed. Aren't private methods hidden from code completion? How do they show up? Also, what's the difference between the `_method()` and `method_()` patterns in this file?
"private" is only hidden from TS developers. JS developers will still see it in code completion.
Underscore suffix is the same as underscore prefix, but from about 10 years ago. I am surprised this client uses underscore suffix.
Can we also get a test that has "randomCollA/docA" and "randomCollB/docB" and only deletes randomCollA? There might be some nuances in how to delete parent collection (in other ports) and I want to make sure that all of our clients handle this case correctly.
good point, done.
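The semantics that test guards against can be illustrated with path matching: deleting `randomCollA` must cover `randomCollA/docA` but never `randomCollB/docB`, even though the collection IDs share the `randomColl` prefix. The helper below is illustrative only (the real matching happens server-side in the query), comparing whole path segments rather than raw string prefixes:

```javascript
// Illustrative segment-wise check for whether a document lies under a
// given collection path. A raw startsWith comparison would be unsafe for
// IDs like "coll" vs "coll2", so each segment is compared exactly.
function isDescendantOf(docPath, collectionPath) {
  const docSegments = docPath.split('/');
  const collSegments = collectionPath.split('/');
  return (
    docSegments.length > collSegments.length &&
    collSegments.every((seg, i) => seg === docSegments[i])
  );
}
```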