osd: Remove extra call to reg_next_scrub() during splits #11206
Add assert() to catch this in the future
Fixes: http://tracker.ceph.com/issues/16474
Signed-off-by: David Zafman <dzafman@redhat.com>
lgtm! adding needs-qa
http://pulpito.ceph.com/dzafman-2016-09-22_17:11:48-rados-wip-zafman-testing-distro-basic-smithi/
I'm looking at a failure in the run above. An example is job 432090. I'm rerunning that job with my wip-zafman-fixes ceph-qa-suite branch.
@@ -3411,6 +3411,7 @@ void PG::reg_next_scrub()
double scrub_min_interval = 0, scrub_max_interval = 0;
pool.info.opts.get(pool_opts_t::SCRUB_MIN_INTERVAL, &scrub_min_interval);
pool.info.opts.get(pool_opts_t::SCRUB_MAX_INTERVAL, &scrub_max_interval);
assert(scrubber.scrub_reg_stamp == utime_t());
Further up, utime_t() was changed to ceph_clock_now(cct). Should this be changed here as well?
@robbat2 No, this is to indicate that no time is set.
osd->unreg_pg_scrub(info.pgid, scrubber.scrub_reg_stamp);
scrubber.scrub_reg_stamp = utime_t();
Again, should this be ceph_clock_now(cct) instead of utime_t()?
@robbat2 Same here: this is supposed to indicate that no time is set.
@robbat2 I see the following at top of master for the utime_t ctor:
utime_t() { tv.tv_sec = 0; tv.tv_nsec = 0; }
We should add some docs that make it clearer why the unreg scrub uses zero as the value for scrub_reg_stamp. 5e44040 is what prompted me to ask whether ceph_clock_now was more correct.
scrub_reg_stamp holds a value that is stashed when a scrub is registered in reg_next_scrub(). Its only use is to let unreg_next_scrub() find and remove that registration later. It is not the same as last_scrub_stamp or anything else that can be seen outside of the OSD. I set it to utime_t() to indicate that the scrub isn't registered, so we can assert if you try to remove a scrub twice. That value is the same as the one initialized by the constructor of PG::Scrubber.
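For readers following the thread, here is a minimal, standalone sketch of the sentinel pattern described above. It is not the Ceph code: `Stamp`, `ScrubQueue`, `Pg`, and `is_registered()` are hypothetical stand-ins, and the unregister-side assert is an assumption drawn from the comment rather than from the diff. Only the idea is taken from the discussion: a default-constructed (zero) stamp means "not registered", and an assert catches a second registration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical stand-in for utime_t: default construction yields zero,
// which this sketch treats as "no time set".
struct Stamp {
  long sec;
  long nsec;
  Stamp() : sec(0), nsec(0) {}
  Stamp(long s, long ns) : sec(s), nsec(ns) {}
  bool operator==(const Stamp& o) const { return sec == o.sec && nsec == o.nsec; }
  bool operator<(const Stamp& o) const {
    return sec != o.sec ? sec < o.sec : nsec < o.nsec;
  }
};

// Hypothetical stand-in for the OSD-side schedule, keyed by (stamp, pg id).
struct ScrubQueue {
  std::set<std::pair<Stamp, int> > jobs;

  Stamp reg(int pgid, Stamp when) {
    jobs.insert(std::make_pair(when, pgid));
    return when;
  }
  void unreg(int pgid, Stamp when) {
    std::size_t erased = jobs.erase(std::make_pair(when, pgid));
    assert(erased == 1);  // the stashed stamp must match a live entry
    (void)erased;         // avoid an unused-variable warning under NDEBUG
  }
};

// Hypothetical stand-in for a PG and its scrubber state.
struct Pg {
  int pgid;
  ScrubQueue* queue;
  Stamp scrub_reg_stamp;  // Stamp() means "not registered"

  Pg(int id, ScrubQueue* q) : pgid(id), queue(q) {}

  bool is_registered() const { return !(scrub_reg_stamp == Stamp()); }

  void reg_next_scrub(Stamp when) {
    assert(!is_registered());  // catches an extra registration
    scrub_reg_stamp = queue->reg(pgid, when);
  }

  void unreg_next_scrub() {
    assert(is_registered());   // assumption: catches a double removal
    queue->unreg(pgid, scrub_reg_stamp);
    scrub_reg_stamp = Stamp(); // reset to the sentinel
  }
};
```

In this sketch, calling `reg_next_scrub()` twice without an intervening `unreg_next_scrub()` trips the first assert, which is the class of bug the PR title describes. The stashed stamp never needs to be a meaningful wall-clock time, so resetting it to zero rather than to the current time does not affect anything visible outside the object.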
Only one unrelated failure: