rgw multisite: use a rados lock to coordinate data log trimming #10546
@cbodley see my comments, also need to rebase this
@yehudasa ok, will rebase - but i'm not seeing any comments?
RGWRados *store;
RGWHTTPManager *http;
const int num_shards;
const utime_t interval; //< polling interval
@cbodley should use ceph::real_time?
ah, just noticed that this was just code that was moved around
RGWHTTPManager *http;
const int num_shards;
const utime_t interval; //< polling interval
const std::string& zone; //< my zone id
s/zone/zone_id to be explicit
// prevent others from trimming for our entire wait interval
set_status("acquiring trim lock");
yield call(new RGWSimpleRadosLockCR(store->get_async_rados(), store,
@cbodley, maybe use RGWContinuousLeaseCR instead? and need to shut it down when done
this was by design - if there are multiple gateways in a zone, we only want one of them to attempt trimming each rgw_sync_log_trim_interval. this is accomplished by leaving the lock held for the duration
(we exchanged email about it with the subject line "lease for multisite lock trimming")
@cbodley right.. I see that discussion. I'm not sure I like the idea of just firing a lease without making the effort of releasing it. Is it a problem releasing it at the end here? Or is it just so that we don't trigger the trim code more frequently than originally planned?
> Is it a problem releasing it at the end here? Or is it just so that we don't trigger the trim code more frequently than originally planned?
right. if we release the lock then all of the gateways would try to trim, when we only want it to happen once per trim interval
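To make the trade-off above concrete, here is a minimal sketch of the behaviour being discussed. It is not the RGW or cls_lock API: `ExpiringLock`, `simulate_trims`, and `total_trims` are hypothetical names, time is a plain integer, and every gateway is assumed to wake at the same instant once per interval. It only illustrates that a lock left to expire yields one trim per interval, while a lock released after trimming lets every gateway trim.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for an exclusive lock with an expiration.
// Illustrative only; this is not how RADOS locks are implemented or called.
class ExpiringLock {
 public:
  // Take the lock at time `now` for `duration` seconds.
  // Fails while another holder's lock has not yet expired.
  bool try_lock(const std::string& owner, int64_t now, int64_t duration) {
    assert(duration > 0);
    if (!owner_.empty() && now < expires_at_) {
      return false;  // still held, even if the holder is only sleeping
    }
    owner_ = owner;
    expires_at_ = now + duration;
    return true;
  }

  // Explicit release; the approach in this PR deliberately skips this step.
  void unlock(const std::string& owner) {
    if (owner_ == owner) {
      owner_.clear();
    }
  }

 private:
  std::string owner_;
  int64_t expires_at_ = 0;
};

// Every gateway wakes once per interval and tries to trim.
// Returns how many times each gateway actually trimmed.
std::map<std::string, int> simulate_trims(
    const std::vector<std::string>& gateways, int64_t interval, int rounds,
    bool release_after_trim) {
  ExpiringLock lock;
  std::map<std::string, int> trims;
  for (int r = 0; r < rounds; ++r) {
    const int64_t now = static_cast<int64_t>(r) * interval;
    for (const auto& gw : gateways) {
      if (!lock.try_lock(gw, now, interval)) {
        continue;  // someone else already owns this interval
      }
      ++trims[gw];  // only the lock holder trims
      if (release_after_trim) {
        lock.unlock(gw);  // lets the next gateway trim in the same interval
      }
      // otherwise the lock is simply left to expire after `interval`
    }
  }
  return trims;
}

// Sum of trims across all gateways.
int total_trims(const std::map<std::string, int>& trims) {
  int total = 0;
  for (const auto& kv : trims) {
    total += kv.second;
  }
  return total;
}
```

With three gateways and four intervals, holding the lock until it expires gives four trims in total, whereas releasing it after each trim gives twelve, which is the extra work the comment above is trying to avoid.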
@yehudasa do you still object to this approach? any ideas for a better solution?
@cbodley I'm not happy with this one, but let's just comment it clearly
@cbodley maybe now?
7b5a49f to 7614443
rebased and renamed to zone_id
Signed-off-by: Casey Bodley <cbodley@redhat.com>
7614443 to 8e269aa
8e269aa to f8d9ac6
updated comments to clarify use of rados lock:
lgtm
jenkins test this please
This PR is based on top of #10372. I'll rebase once that merges.
You can review the delta here: cbodley/ceph@wip-rgw-data-log-trim...cbodley:wip-rgw-log-trim-lease