perf: replace causal scan in find_last_delete_op with peer-by-peer range scan #978
Merged
c14396e to 13e6ede
…nge scan

iter_changes_causally_rev walks every change in the oplog in DAG order, which is O(total changes). Any delete op covering `id` must have observed it, so start_vv (the vv at `id`) is a valid lower bound per peer. Switching to iter_changes_peer_by_peer with that bound skips all changes that predate `id` without needing causal ordering. The match with the highest lamport timestamp is the latest delete.
b19fbcd to 3c49ceb
Member
The original implementation actually has similar time complexity to the new one: it only scans the changes between start_vv and latest_vv. But the new implementation should be more accurate.
find_last_delete_op was flagged with FIXME: PERF because it iterates over changes in the oplog using iter_changes_causally_rev, which performs a DAG-ordered merge across all peers and is O(total changes).
Any delete op that covers id must have causally observed it, so the delete cannot be among the changes that id itself depends on. The version vector at id (start_vv) is therefore a valid per-peer lower bound: changes at or before start_vv[peer] predate id and cannot have deleted it.
Switching to iter_changes_peer_by_peer(&start_vv, oplog.vv()) skips those changes without needing causal ordering. Since a given ID can only be deleted once, there is typically exactly one match; the lamport comparison is a safety net for any edge case.
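For readers unfamiliar with the idea, here is a minimal sketch of a bounded peer-by-peer scan. It is not the code in this PR: `Change`, `VersionVector`, `covers`, and `find_last_delete` are hypothetical, simplified stand-ins, each change is assumed to hold a single op, and a version vector entry is assumed to mean "number of ops already seen from that peer" (so counters below it are skipped).

```rust
use std::collections::HashMap;

type PeerId = u64;
type Counter = i32;

/// Hypothetical version vector: peer -> number of ops already seen from that peer.
type VersionVector = HashMap<PeerId, Counter>;

/// Hypothetical, simplified change holding a single op.
#[derive(Debug, Clone, PartialEq)]
struct Change {
    peer: PeerId,
    counter: Counter,
    lamport: u32,
    /// Deleted span as (peer, start counter, length), or None if not a delete.
    deletes: Option<(PeerId, Counter, Counter)>,
}

impl Change {
    /// True if this change deletes the op identified by `target`.
    fn covers(&self, target: (PeerId, Counter)) -> bool {
        match self.deletes {
            Some((peer, start, len)) => {
                peer == target.0 && start <= target.1 && target.1 < start + len
            }
            None => false,
        }
    }
}

/// Scan each peer's changes, starting at the lower bound from `start_vv`,
/// and keep the matching delete with the highest lamport timestamp.
/// Assumes each peer's changes are sorted by counter.
fn find_last_delete<'a>(
    oplog: &'a HashMap<PeerId, Vec<Change>>,
    start_vv: &VersionVector,
    target: (PeerId, Counter),
) -> Option<&'a Change> {
    let mut best: Option<&'a Change> = None;
    for (peer, changes) in oplog {
        // Everything below the bound was already known when `target` was created.
        let lower = start_vv.get(peer).copied().unwrap_or(0);
        let first = changes.partition_point(|c| c.counter < lower);
        for change in &changes[first..] {
            if change.covers(target) && best.map_or(true, |b| change.lamport > b.lamport) {
                best = Some(change);
            }
        }
    }
    best
}
```

No ordering across peers is needed here: each peer's range is visited independently, and the lamport comparison alone decides which match wins.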
The existing test suite passes without changes.