client: improve the libcephfs when MDS is stopping #52336

Status: Open. Wants to merge 2 commits into base: main.
44 changes: 32 additions & 12 deletions src/client/Client.cc
@@ -3114,12 +3114,24 @@ void Client::handle_mds_map(const MConstRef<MMDSMap>& m)
       continue;
     }
     if (newstate >= MDSMap::STATE_ACTIVE) {
-      if (oldstate < MDSMap::STATE_ACTIVE) {
-        // kick new requests
-        kick_requests(session.get());
-        kick_flushing_caps(session.get());
-        signal_context_list(session->waiting_for_open);
-        wake_up_session_caps(session.get(), true);
+      if (oldstate <= MDSMap::STATE_ACTIVE && newstate != oldstate) {
+        // flush the delayed caps in case this MDS is stopping
+        for (auto p = delayed_list.begin(); p != delayed_list.end(); ) {
+          Inode *in = *p;
+          ++p;
+          if (!mount_aborted && in->auth_cap->session == session.get()) {
+            in->delay_cap_item.remove_myself();
+            check_caps(in, CHECK_CAPS_NODELAY);
+          }
+        }
Member (reviewer) commented:
I don't understand at all the purpose of flushing the caps here. We can never guarantee they will get flushed before the MDS stops, and something else should take over for it, so why bother doing something that adds load to a daemon right when it's trying to shed load?

Member (author) replied:
This can't guarantee that either; it just tries to flush the dirty caps to the MDS before the MDS stops.

I saw that after the MDS or client received the first mdsmap marking an MDS as up:stopping, the client kept sending client requests and cap update requests to that MDS for around 20 seconds. I think this could make the MDS stopping process take longer.

During those 20 seconds the dirty caps would possibly be flushed by the tick thread anyway, so I am thinking: why not trigger the flush as early as possible to speed it up?

Does that make sense?


+        if (oldstate < MDSMap::STATE_ACTIVE) {
+          // kick new requests
+          kick_requests(session.get());
+          kick_flushing_caps(session.get());
+          signal_context_list(session->waiting_for_open);
+          wake_up_session_caps(session.get(), true);
+        }
       }
       connect_mds_targets(mds);
     }
@@ -17317,14 +17329,22 @@ mds_rank_t Client::_get_random_up_mds() const
 {
   ceph_assert(ceph_mutex_is_locked_by_me(client_lock));
 
-  std::set<mds_rank_t> up;
+  std::set<mds_rank_t> up, stopping, valid;
   mdsmap->get_up_mds_set(up);
+  mdsmap->get_mds_set(stopping, MDSMap::STATE_STOPPING);
 
   if (up.empty())
     return MDS_RANK_NONE;
-  std::set<mds_rank_t>::const_iterator p = up.begin();
-  for (int n = rand() % up.size(); n; n--)
-    ++p;
+  // Try to skip the stopping MDSs
+  std::set_difference(up.begin(), up.end(), stopping.begin(), stopping.end(),
+                      std::inserter(valid, valid.end()));
+  if (valid.empty()) {
+    if (stopping.empty())
+      return MDS_RANK_NONE;
+
+    valid = std::move(stopping); // use any stopping mds (probably rank 0)
+  }
+
+  auto p = valid.begin();
+  std::advance(p, ceph::util::generate_random_number<uint64_t>(0, valid.size() - 1));
   return *p;
 }
