Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osd: Remove extra call to reg_next_scrub() during splits #11206

Merged
merged 1 commit into from Oct 10, 2016

Conversation

dzafman
Copy link
Contributor

@dzafman dzafman commented Sep 22, 2016

Add assert() to catch this in the future

Fixes: http://tracker.ceph.com/issues/16474

Signed-off-by: David Zafman dzafman@redhat.com

Add assert() to catch this in the future

Fixes: http://tracker.ceph.com/issues/16474

Signed-off-by: David Zafman <dzafman@redhat.com>
@athanatos
Copy link
Contributor

lgtm! adding needs-qa

@dzafman
Copy link
Contributor Author

dzafman commented Sep 28, 2016

http://pulpito.ceph.com/dzafman-2016-09-22_17:11:48-rados-wip-zafman-testing-distro-basic-smithi/

There is a failure I'm looking at in the run above. An example is job 432090. I'm rerunning that job with my wip-zafman-fixes ceph-qa-suite branch.

@@ -3411,6 +3411,7 @@ void PG::reg_next_scrub()
double scrub_min_interval = 0, scrub_max_interval = 0;
pool.info.opts.get(pool_opts_t::SCRUB_MIN_INTERVAL, &scrub_min_interval);
pool.info.opts.get(pool_opts_t::SCRUB_MAX_INTERVAL, &scrub_max_interval);
assert(scrubber.scrub_reg_stamp == utime_t());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further up, utime_t() was changing to ceph_clock_now(cct). Should this be changed here as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rabbat2 No, this was to indicate that no time is set.

osd->unreg_pg_scrub(info.pgid, scrubber.scrub_reg_stamp);
scrubber.scrub_reg_stamp = utime_t();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, should this be ceph_clock_now(cct) instead of utime_t()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robbat2 Same here this is supposed to indicate that no time is set.

Copy link
Contributor Author

@dzafman dzafman Sep 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robbat2 I see the following at top of master for the utime_t ctor:

utime_t() { tv.tv_sec = 0; tv.tv_nsec = 0; }

@robbat2
Copy link
Contributor

robbat2 commented Oct 1, 2016

We should add in some docs that make it more clear why the unreg scrub uses zero as the value for scrub_reg_stamp. 5e44040 is what prompted me to ask if ceph_clock_now was more correct.

@dzafman
Copy link
Contributor Author

dzafman commented Oct 6, 2016

The scrub_reg_stamp contents is a stashed value when a scrub is registered in reg_next_scrub() that is only used to allow unreg_next_scrub() to find and remove that registration later. It is not the same as last_scrub_stamp or anything that can be seen outside of the OSD. I set it to utime_t() to indicate that that scrub isn't registered, so we can assert if you try to remove a scrub twice. That value is the same as what is initialized by the constructor of PG::Scrubber.

@dzafman
Copy link
Contributor Author

dzafman commented Oct 10, 2016

Only one unrelated failure:
79/146 Test #86: unittest_erasure_code_plugin_jerasure ...***Exception: SegFault 0.13 sec

@dzafman dzafman merged commit 3a2be48 into ceph:master Oct 10, 2016
@dzafman dzafman deleted the wip-16474 branch October 10, 2016 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants