mds: reset heartbeat when fetching or committing dentries #45107
src/mds/CDir.cc
Outdated
@@ -2123,6 +2125,9 @@ void CDir::_omap_fetched(bufferlist& hdrbl, map<string, bufferlist>& omap,
last_name = std::string_view(k_it->c_str(), n_key.name.length());
null_keys.emplace_back(std::move(n_key));
++k_it;

if (!(++count % 1000))
Is there a particular reason to check against 1000 or 2000 (or multiples of those)?
Can we use a macro for these numbers? If we need to change the values in the future, we would only have to update the macro.
In some code each loop iteration may take longer, so it will use 1000, as other code in mds/ does; in other code each iteration is faster, so it will use 2000 instead.
Sounds good. Will add one macro for the whole of mds/.
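To illustrate the named-constant idea discussed above, here is a minimal sketch. It is not the code from this PR: `kHeartbeatResetInterval` and `process_entries` are hypothetical names, a `constexpr` is used in place of a macro, and `heartbeat_reset()` is a stand-in that only counts calls rather than the real MDS function.

```cpp
#include <cstdint>

// Hypothetical named constant standing in for the literal 1000 in the diff.
constexpr std::uint64_t kHeartbeatResetInterval = 1000;

// Stand-in for the MDS heartbeat reset; in this sketch it only counts calls.
static std::uint64_t g_heartbeat_resets = 0;
void heartbeat_reset() { ++g_heartbeat_resets; }

// Walk `n` entries and reset the heartbeat once every
// kHeartbeatResetInterval iterations, mirroring `if (!(++count % 1000))`.
// Returns how many times the heartbeat was reset during this call.
std::uint64_t process_entries(std::uint64_t n) {
  std::uint64_t count = 0;
  g_heartbeat_resets = 0;
  for (std::uint64_t i = 0; i < n; ++i) {
    // ... per-entry work would go here ...
    if (!(++count % kHeartbeatResetInterval))
      heartbeat_reset();
  }
  return g_heartbeat_resets;
}
```

With a single constant, switching a slow loop from 1000 to some other interval is a one-line change; loops that want 2000 could use `2 * kHeartbeatResetInterval` or a second constant.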
What is the user impact of this bug?
This fixes the bug in bz#2041660.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Fixes: https://tracker.ceph.com/issues/54345
Signed-off-by: Xiubo Li <xiubli@redhat.com>
@lxbsz PR looks good to me. Is it possible to add a test case for this?
Hi Kotresh, let me try, thanks.
@@ -206,6 +206,14 @@ options:
services:
- mds
with_legacy: true
- name: mds_heartbeat_reset_grace
Just a question: is this heartbeat also used to check whether the node is active or not, like a real heartbeat? :)
The heartbeat here only tracks the MDS daemon's liveness. The MDS should periodically tell the Monitors that it is alive and not stuck.
Is this grace a time period or a number of iterations?
Would mds_heartbeat_reset_grace_period or mds_heartbeat_reset_grace_iterations be the more accurate name?
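The two readings of "grace" in the question above behave differently in practice, which the following sketch tries to show. It makes no claim about which one the option in this PR uses; `IterationGrace` and `PeriodGrace` are hypothetical types written only for comparison, and the time-based variant takes the current time as a parameter so it can be exercised without sleeping.

```cpp
#include <chrono>
#include <cstdint>

// Count-based reading: fires once every `every` calls to tick().
// `every` must be non-zero.
struct IterationGrace {
  std::uint64_t every;
  std::uint64_t count = 0;
  bool tick() { return ++count % every == 0; }
};

// Time-based reading: fires when at least `period` has elapsed
// since the last time it fired.
struct PeriodGrace {
  std::chrono::milliseconds period;
  std::chrono::steady_clock::time_point last;
  bool tick(std::chrono::steady_clock::time_point now) {
    if (now - last < period)
      return false;
    last = now;
    return true;
  }
};
```

A count-based grace is cheap to check but fires at a rate that depends on how long each iteration takes; a time-based grace fires at a steady rate at the cost of reading the clock.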
This looks ok and fixes the warnings. However, we might want to come up with an alternate approach for this rather than spraying a bunch of heartbeat_reset() calls all over the mds.
Something to ponder...
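One possible shape for such an alternative, offered only as a sketch: move the counting into a single loop helper so call sites do not repeat the modulo check. `for_each_with_heartbeat` is a hypothetical name, nothing like it is proposed in this PR, and the `reset` callback stands in for the real heartbeat_reset().

```cpp
#include <cstddef>
#include <functional>

// Apply `fn` to every item in `range`, invoking `reset` once every
// `interval` items. An interval of 0 disables the resets.
template <typename Range, typename Fn>
void for_each_with_heartbeat(Range&& range, std::size_t interval,
                             const std::function<void()>& reset, Fn&& fn) {
  std::size_t count = 0;
  for (auto&& item : range) {
    fn(item);
    if (interval != 0 && ++count % interval == 0)
      reset();
  }
}
```

This keeps the interval policy in one place, though it only helps for loops that can be expressed as a simple range traversal.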
jenkins test windows
Fixes: https://tracker.ceph.com/issues/54345
Signed-off-by: Xiubo Li <xiubli@redhat.com>