New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
osd/PrimaryLogPG: send requests to primary on cache miss #16884
Conversation
if (!is_primary()) { | ||
dout(20) << __func__ << " cache miss; ask the primary" << dendl; | ||
osd->reply_op_error(op, -EAGAIN); | ||
return cache_result_t::REPLIED_WITH_EAGAIN; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we actually need this to go at the top or we might get weird ordering bugs. eg, if the primary is promoting objects while the listing is happening, or if we return out for NOOPs because we have incorrect metadata that we're looking at as a result of being non-primary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The replica will have a missing set, so it'll bail out already if the pool is out of sync for the object in question before this method is called in do_op(). And more practically we can't move it up any higher in this method anyway or else we're above the "cache hit" case. Do you have a specific case in mind? I can't think of anything..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I don't remember anything more specific now. I guess if we're a replica we're only serving read operations, so as long as we can't serve a read while the object is somehow invalid then we're okay.
I can't think of any way the primary could logically finish nuking or overwriting an object that we wouldn't know about as a replica, so we should be good. Can you?
While you're editing the function, I'm also a bit surprised we check if there's a cache enabled after we check if the CACHE_IGNORE flag is set. |
If a client has {BALANCE,LOCALIZE}_READS and sends a request to a replica, but the object isn't in the cache, send them back to the primary. Otherwise we might do something rash (like trigger a promotion from a replica). Fixes: http://tracker.ceph.com/issues/20919 Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Yeah, that was weird.. moved it up! |
retest this please |
@gregsfortytwo ping |
Approved this @liewegas. |
the failed one is caused by d7b29ac#r141996981, i believe |
If a client has {BALANCE,LOCALIZE}_READS and sends a request to a
replica, but the object isn't in the cache, send them back to the
primary. Otherwise we might do something rash (like trigger a
promotion from a replica).
Fixes: http://tracker.ceph.com/issues/20919
Signed-off-by: Sage Weil sage@redhat.com