Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mds: use regular dispatch for processing beacons #57469

Merged
merged 4 commits into from
May 23, 2024
Merged

Conversation

batrick
Copy link
Member

@batrick batrick commented May 15, 2024

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@batrick batrick requested a review from a team as a code owner May 15, 2024 01:45
@github-actions github-actions bot added cephfs Ceph File System core mon labels May 15, 2024
@batrick batrick requested a review from a team May 15, 2024 01:51
Copy link
Contributor

@leonid-s-usov leonid-s-usov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a flow where the dispatchers are added dynamically, which would require a priority queue? Until now I was under the impression that dispatchers are added in a predictable static initialization flow, meaning that simply ordering the add_dispatcher_* calls should provide enough control over the invocation order.

If we have to go for the new priority argument, I think we should define standard high/default/low priorities, evenly spaced in the priority range. The default should be in the middle of the priority range, not on the lowest side.

src/msg/Messenger.h Outdated Show resolved Hide resolved
src/msg/Messenger.h Outdated Show resolved Hide resolved
src/msg/Messenger.h Outdated Show resolved Hide resolved
@batrick
Copy link
Member Author

batrick commented May 15, 2024

Is there a flow where the dispatchers are added dynamically, which would require a priority queue? Until now I was under the impression that dispatchers are added in a predictable static initialization flow, meaning that simply ordering the add_dispatcher_* calls should provide enough control over the invocation order.

Dispatchers can be added dynamically but it is uncommon.

Adding dispatchers to the head/tail of the list is just too clumsy; there's no reason not to just define the order. It's rather important to get right because some dispatchers acquire highly contentious locks (mds_lock being one example).

If we have to go for the new priority argument, I think we should define standard high/default/low priorities, evenly spaced in the priority range. The default should be in the middle of the priority range, not on the lowest side.

Sure.

Copy link
Contributor

@leonid-s-usov leonid-s-usov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK then let's go with the priorities.

However, we should use the priority constants instead of literal magic numbers and try our best to leave calls unchanged where the default priority should apply.

src/mds/MDSDaemon.cc Outdated Show resolved Hide resolved
src/mds/MDSDaemon.cc Outdated Show resolved Hide resolved
src/mds/MDSRank.cc Outdated Show resolved Hide resolved
src/mds/MDSRank.cc Outdated Show resolved Hide resolved
src/mon/MonClient.cc Outdated Show resolved Hide resolved
src/mon/MonClient.cc Outdated Show resolved Hide resolved
src/mon/MonClient.cc Outdated Show resolved Hide resolved
@batrick batrick force-pushed the i66014 branch 2 times, most recently from d9794d4 to 8defa5d Compare May 15, 2024 15:11
@batrick
Copy link
Member Author

batrick commented May 15, 2024

I've switched Messenger::(fast_)dispatchers to a std::vector from std::deque:

  • std::vector ensures the elements are contiguous
  • optimizing for add_dispatcher_head is not worthwhile when the dispatchers list is rarely changed and usually only at startup

@@ -2137,7 +2137,7 @@ void MDSRank::active_start()

dout(10) << __func__ << ": initializing metrics handler" << dendl;
metrics_handler.init();
messenger->add_dispatcher_tail(&metrics_handler);
messenger->add_dispatcher_head(&metrics_handler, 20); // very high but behind Beacon
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still magic numbers. Now, one needs to go find where beacon handler is added and check that number. And if that number changes, how do they know to update this place?

If this should be high but behind beacon, this should be

  messenger->add_dispatcher_tail(&metrics_handler, ::priority_high);

and beacon should stay

  messenger->add_dispatcher_head(&beacon, ::priority_high);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

For order checking.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
So we can ensure that e.g. MDSRank::ms_dispatch is lowest priority so that we
do not acquire the mds_lock when looking at beacons.

This change maintains the current behavior when the priority is unset: the use
of std::stable_sort will ensure that the add_dispatcher_head and
add_dispatcher_tail calls will preserve order when dispatcher priorities are
equal.

Fixes: 7fc04be
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Similar to the issue with MClientMetrics, beacons should also not be handled
via fast dispatch because it's necessary to acquire Beacon::mutex. This is a
big no-no as it may block one of the Messenger threads leading to improbable
deadlocks or DoS.

Instead, use the normal dispatch where acquiring locks is okay to do.

Fixes: 7fc04be
See-also: linux.git/f7c2f4f6ce16fb58f7d024f3e1b40023c4b43ff9
Fixes: https://tracker.ceph.com/issues/65658

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This tries to preserve existing order but uses priorities to make it explicit
and robust to future dispatchers being added. Except:

- The beacon and metrics dispatcher have the highest priorities.  This is to
  ensure we process these messages before trying to acquire any expensive locks
  (like mds_lock).

- The monc dispatcher also has a relatively high priority for the same reasons.
  This change affects other daemons which may have ordered a dispatcher ahead
  of the monc but I cannot think of a legitimate reason to nor do I see an
  instance of it.

Fixes: 7fc04be
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Copy link
Contributor

@leonid-s-usov leonid-s-usov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love it!

@batrick
Copy link
Member Author

batrick commented May 17, 2024

This PR is under test in https://tracker.ceph.com/issues/66086.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants