-
Notifications
You must be signed in to change notification settings - Fork 6.3k
osd/scrub: no shared scrub-job ownership between PGs and the scrub queue #58173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I think this seems like a reasonable direction, though it's not entirely clear from this PR what the end goal is. It might be easier to understand with the followup changes alluded to here, but on balance I don't think it's worth making this PR larger. Let me know when you need a review. |
Thanks a lot. Will do. |
b6b510d to
cfc0240
Compare
42004a6 to
f8b5d19
Compare
ee92ae4 to
fa665a9
Compare
|
fd0685c to
1604cd5
Compare
|
make check seems to fail |
src/osd/scrubber/osd_scrub.h
Outdated
| */ | ||
| void update_job( | ||
| Scrub::ScrubJobRef sjob, | ||
| Scrub::ScrubJob& sjob, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to have neither implementation nor callers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed...
Forgot to rebase with the '.h' changes when removing from the '.cc'.
Fixed.
src/osd/scrubber/osd_scrub.h
Outdated
| * To be used if we are no longer the PG's primary, or if the PG is removed. | ||
| */ | ||
| void remove_from_osd_queue(Scrub::ScrubJobRef sjob); | ||
| void remove_from_osd_queue(Scrub::ScrubJob& sjob); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove_from_osd_queue only needs the pgid, not a mutable reference to a whole ScrubJob.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/osd/scrubber/osd_scrub.h
Outdated
| */ | ||
| void delay_on_failure( | ||
| Scrub::ScrubJobRef sjob, | ||
| Scrub::ScrubJob& sjob, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't obviously make sense to me. There appears to be only one caller:
/*
* note: arbitrary delay used in this early version of the
* scheduler refactoring.
*/
void PgScrubber::penalize_next_scrub(Scrub::delay_cause_t cause)
{
m_osds->get_scrub_services().delay_on_failure(
*m_scrub_job, 5s, cause, ceph_clock_now());
}
The implementation
void OsdScrub::delay_on_failure(
Scrub::ScrubJob& sjob,
std::chrono::seconds delay,
Scrub::delay_cause_t delay_cause,
utime_t now_is)
{
m_queue.delay_on_failure(sjob, delay, delay_cause, now_is);
}
simply invokes ScrubQueue::delay_on_failure
void ScrubQueue::delay_on_failure(
Scrub::ScrubJob& sjob,
std::chrono::seconds delay,
Scrub::delay_cause_t delay_cause,
utime_t now_is)
{
dout(10) << fmt::format(
"pg[{}] delay_on_failure: delay:{} now:{:s}",
sjob.pgid, delay, now_is)
<< dendl;
sjob.delay_on_failure(delay, delay_cause, now_is);
}
which finally invokes ScrubJob::delay_on_failure. This seems like a holdover from the previous scheme. PgScrubber::requeue_penalized, by contrast, seems to be perfectly happy invoking ScrubJob::delay_on_failure directly:
void PgScrubber::requeue_penalized(Scrub::delay_cause_t cause)
{
/// \todo fix the 5s' to use a cause-specific delay parameter
m_scrub_job->delay_on_failure(5s, cause, ceph_clock_now());
m_osds->get_scrub_services().enqueue_target(*m_scrub_job);
}
Let's remove OsdScrub::delay_on_failure and ScrubQueue::delay_on_failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, and PgScrubber::penalize_next_scrub() actually has no further callers since PgScrubber::requeue_penalized calls ScrubJob::delay_on_failure and OsdScrub::enqueue_target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - the whole chain was irrelevant after the changes. Removed.
src/osd/scrubber/osd_scrub.h
Outdated
| void register_with_osd( | ||
| Scrub::ScrubJobRef sjob, | ||
| const Scrub::sched_params_t& suggested); | ||
| void enqueue_target(Scrub::ScrubJob& sjob); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This takes a mutable reference because the implementation updates target_queued --ScrubQueue::enqueue_target actually asserts !target_queued. The comment here does not mention this behavior or that the caller is responsible for ensuring that target_queued is cleared. This is particularly confusing as pg_scrubber appears to be responible for clearing this flag. I'd really prefer that components outside of pg simply queue the ScrubJob and ignore any fields they don't specifically need to access for that purpose. External interfaces like this one should take const references to ScrubJob.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified
src/osd/scrubber/osd_scrub_sched.cc
Outdated
| dout(10) << fmt::format("pg[{}] not scrub-able by this OSD", sjob.pgid) << dendl; | ||
| return; | ||
| } | ||
| ceph_assert(!sjob.target_queued); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above, let's not look at members irrelevant to this component.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove. But note: it did serve a purpose in detecting omissions during development.
I'll add the check to all callers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding these checks to the point in PgScrubber where it does the scheduling would make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a second thought: I see this differently. The ScrubQueue object maintains the queue, and should
be allowed to access what it queues.
Anyway - will remove it at this phase, as the element queued is changed in the next phase.
src/osd/scrubber/osd_scrub_sched.h
Outdated
| * To be used if we are no longer the PG's primary, or if the PG is removed. | ||
| */ | ||
| void remove_from_osd_queue(Scrub::ScrubJobRef sjob); | ||
| void remove_from_osd_queue(const Scrub::ScrubJob& sjob); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only pass pgid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/osd/scrubber/osd_scrub_sched.h
Outdated
|
|
||
| void delay_on_failure( | ||
| Scrub::ScrubJobRef sjob, | ||
| Scrub::ScrubJob& sjob, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above, remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
86754ad to
efabc04
Compare
e6757ea to
a01feb5
Compare
added a commit to remove the irrelevant offending unit-test |
|
jenkins test api |
|
jenkins test make check arm64 |
372ffd7 to
8311fb0
Compare
8311fb0 to
0479028
Compare
to convey the change in the scheduling data ownership: no longer a scrub-job object shared between the Scrubber and the OSD scrub queue. Instead - the scrub queue holds a copied snapshot of the scheduling data. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
... and not a shared pointer to it. The PG's entry in the scrub queue and the scrub-job object that is part of the scrubber are now separate entities, that can be modified separately - each with its own locking mechanism. the interaction between OsdScrub (the OSD's scrub scheduler object) and the scrub queue during scrub initiation is simplified: instead of fetching from the queue a list of scrub targets possibly ready to be scrubbed, the OsdScrub now picks only one candidate - the top of the queue. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
following the change to the queue into holding a copy of the ScrubJob, the registration process - initiated by schedule_scrub_with_osd() - can now be simplified. adjust_target_time() is relocated as is. It will be refactored in the next commit. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
... to use populate_config_params() output Also: fix determine_scrub_time() to use the same already-available data. determine_scrub_time() renamed to determine_initial_schedule(), to better reflect its purpose. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
moving the scrubbed sjob copy thru the scrubber - from scrub session initiation to its termination (or abort - where we use the handed "old version" of the sjob to update the new one). Note that in this version - not all the information that was used to determine the specifics of the initiated scrub is passed to the scrubber and back. In this half-baked stage of the refactoring, the resulting implementation handling of corner cases, still using the "planned scrub" flags, is less than optimal. The next step (dual targets, replacing the 'planned scrub' flags with specific attributes in the scheduling target) fixes this. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
...and on_scrub_schedule_input_change() by accepting a parameter to control whether the scheduled time for periodic scrubs that are already due should be updated. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
or after an aborted scrub. To note: one of the important changes in this commit: merging the functionality of adjust_target_time() & update_schedule() into a single function - adjust_schedule(). Regarding the handling of aborts: Most of the time - all that is required following a scrub abort is to requeue the scrub job - the one that triggered the aborted scrub - with just a delay added to its n.b.. But we must take into account scenarios where "something" caused the parameters prepared for the *next* scrub to show higher urgency or priority. "Something" - as in an operator command requiring immediate scrubbing, or a change in the pool/cluster configuration. In such cases - the current requested flags and the parameters of the aborted scrub must be merged. Note that the current implementation is a temporary solution, to be replaced by a per-level updating of the relevant target. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
following changes in scrub code Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
Removing the use of C++ ranges, as Clang versions below 16 are buggy when libstd++ ranges are used. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
removing this unit-test, as it must be rewritten to match the changes in the implementation of the scrub scheduling mechanism, and that implementation is still in flux. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
Previous commits handled the following two cases correctly: - requeueing a scrub job while the OSD is still the primary, and - not restoring the scrub job to the queue if the PG is not there; Here we handle the missed scenario: the PG is there (we were able to lock it), but is no longer the primary. Also - a configuration change must not cause a re-queue of a scrub-job for a PG that is in the middle of scrubbing. Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
0479028 to
699dd28
Compare
|
jenkins retest this please |
|
jenkins test make check |
|
jenkins test submodules |
|
Merging based on my Teuthology tests |
osd/scrub: no shared scrub-job ownership between PGs and the scrub queue Reviewed-by: Samuel Just <sjust@redhat.com>
This is a follow-up to the previous commits in this series of changes
to the scrub scheduling mechanism.
In this step:
and not a shared pointer to it.
The PG's entry in the scrub queue and the scrub-job object that
is part of the scrubber are now separate entities, that can be
modified separately - each with its own locking mechanism.
object) and the scrub queue during scrub initiation is simplified:
instead of fetching from the queue a list of scrub targets
possibly ready to be scrubbed, the OsdScrub now picks only one
candidate - the top of the queue.
schedule_scrub_with_osd() - is simplified.
or an abort is improved.
The next step (next PR):
one for the next shallow scrub, and one for the next deep scrub.
in the scheduling target.