[core] Make object directory robust to out-of-order updates #16314
Conversation
@@ -1012,8 +1017,11 @@ bool ReferenceCounter::RemoveObjectLocation(const ObjectID &object_id,
<< " that doesn't exist in the reference table";
return false;
}
it->second.locations.erase(node_id);
PushToLocationSubscribers(it);
it->second.locations[node_id]--;
What if it goes negative due to out-of-order updates?
It's definitely okay for this to be negative. The assumption is that eventually you will receive the corresponding Add request and then this will go back to 0.
A possible complication of counting here is if we have duplicate add locs, we could leak entries (count > 0 persistently if there were two adds and one remove). Would this be a concern?
Also, what would happen if the count goes below zero?
The assumption in this PR is that every add will have a corresponding remove (or a node failure). So a leak could definitely happen if there are duplicate adds, although it can happen today too just with message reordering. I think we should merge this as is since it will be more robust under heavy loads, but in the future we could have a method to handle duplicates, e.g., resetting the directory. I'll add a note about this.
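For reference, the duplicate-add leak and one possible reset could look roughly like the sketch below. `Counts`, `ApplyUpdate`, and `ResetNode` are hypothetical names, not functions from this PR, and the reset hook is only an assumption about what "resetting the directory" might mean for a single node.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical table: node id -> (number of adds) - (number of removes).
using Counts = std::unordered_map<std::string, int>;

// Apply one add (+1) or remove (-1) update, in whatever order it arrives.
void ApplyUpdate(Counts &counts, const std::string &node_id, int delta) {
  auto it = counts.emplace(node_id, 0).first;  // keeps any existing entry
  it->second += delta;
  // Assumption of this sketch: a balanced entry is dropped from the table.
  if (it->second == 0) counts.erase(it);
}

// Hypothetical reset hook: forget everything recorded for one node,
// for example after that node is reported as failed.
void ResetNode(Counts &counts, const std::string &node_id) {
  counts.erase(node_id);
}
```

With two adds and one remove for the same node, the entry stays at 1 until something like `ResetNode` clears it, which is the leak discussed above.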
What about adding sequencer blocks around OBOD calls instead? I'm assuming
add/remove for a particular loc only comes from the client at that loc, is
that correct?
The sequencer seems like a more robust way to guarantee this property
without introducing edge cases like negative ref counts.
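A rough sketch of what sender-side sequencing could look like, assuming each location's add/remove calls are posted under that location's key and each call reports completion through a callback. `KeyedSequencer` is a hypothetical class written for this comment, not Ray's own sequencer, and the callback shape is an assumption.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical per-key sequencer: operations posted under the same key run
// one at a time, in posting order.
class KeyedSequencer {
 public:
  using Done = std::function<void()>;
  using Operation = std::function<void(Done)>;

  // Queue `op` for `key`. It starts only after every earlier operation for
  // the same key has invoked its `done` callback.
  void Post(const std::string &key, Operation op) {
    auto &queue = pending_[key];
    queue.push_back(std::move(op));
    if (queue.size() == 1) RunFront(key);
  }

 private:
  void RunFront(const std::string &key) {
    // Copy the operation so it stays alive even if the queue changes
    // while it is running.
    Operation op = pending_[key].front();
    op([this, key]() { OnDone(key); });
  }

  void OnDone(const std::string &key) {
    auto it = pending_.find(key);
    it->second.pop_front();
    if (it->second.empty()) {
      pending_.erase(it);
    } else {
      RunFront(key);
    }
  }

  std::unordered_map<std::string, std::deque<Operation>> pending_;
};
```

In this sketch an operation would send the add or remove request and call `done` once the owner acknowledges it, so a remove for a location is never in flight at the same time as the add that preceded it.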
(In reply to Stephanie Wang's comment above, dated Tue, Jun 8, 2021, 11:52 AM.)
(Btw, if you merge the latest master, the build issue will be gone)
Oh, also, I think we should merge this regardless of the obod pubsub because it fixes the different path that obod pubsub handles. cc @clarkzinzow
Force-pushed: 5d6091a to 78dc958
Why are these changes needed?
The ownership-based object directory (OBOD) can lose updates if they arrive out of order. Under heavy load, and especially when there is thrashing, this can lead to memory leaks (a location that never gets deleted) and possibly hangs (the OBOD registers a location that doesn't actually exist). This PR fixes the issue by accumulating all the updates as a per-location count instead of adding and removing the location entry from a set.
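A minimal sketch of the difference, assuming a remove that arrives before its matching add. `SetLocations` and `CountedLocations` are hypothetical names for illustration rather than the classes touched by this PR, and dropping an entry once its count returns to zero is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Set-based tracking: a remove that arrives early is simply lost.
class SetLocations {
 public:
  void Add(const std::string &node_id) { nodes_.insert(node_id); }
  void Remove(const std::string &node_id) { nodes_.erase(node_id); }
  bool Has(const std::string &node_id) const { return nodes_.count(node_id) > 0; }

 private:
  std::unordered_set<std::string> nodes_;
};

// Count-based tracking: each node keeps (adds - removes), which may dip
// below zero until the delayed add arrives.
class CountedLocations {
 public:
  void Add(const std::string &node_id) { Apply(node_id, +1); }
  void Remove(const std::string &node_id) { Apply(node_id, -1); }

  // A node counts as a location only while its net count is positive.
  bool Has(const std::string &node_id) const {
    auto it = counts_.find(node_id);
    return it != counts_.end() && it->second > 0;
  }

  // Number of nodes whose net count is currently nonzero.
  std::size_t NumTracked() const { return counts_.size(); }

 private:
  void Apply(const std::string &node_id, int delta) {
    auto it = counts_.emplace(node_id, 0).first;
    it->second += delta;
    if (it->second == 0) counts_.erase(it);  // balanced, nothing to remember
  }

  std::unordered_map<std::string, int> counts_;
};
```

If `Remove("n1")` is applied before `Add("n1")`, the set version ends up holding a stale `n1`, while the counted version goes to -1 and then back to 0, leaving no entry behind.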
Checks
I've run scripts/format.sh to lint the changes in this PR.