Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mds: retry rdlock_path_pin_ref if needs to fwd request to auth #52453

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

lxbsz
Copy link
Member

@lxbsz lxbsz commented Jul 14, 2023

In case the _open() is called from _openc() the first
rdlock_path_pin_ref() will do nothing because the _openc() will
set the MutationImpl::PATH_LOCKED flag. And then we need to retry
it.

Signed-off-by: Xiubo Li xiubli@redhat.com

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@lxbsz lxbsz requested review from batrick and a team July 14, 2023 12:18
@github-actions github-actions bot added the cephfs Ceph File System label Jul 14, 2023
CInode *cur = rdlock_path_pin_ref(mdr, need_auth);
if (!cur)
return;

if (cur->is_frozen() || cur->state_test(CInode::STATE_EXPORTINGCAPS)) {
if ((cur->is_frozen() || cur->state_test(CInode::STATE_EXPORTINGCAPS)) && !need_auth) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the request should have been forwarded if it was not the auth mds and then we won't reach here. This looks fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but thinking more on this - it requires multiple MDSs to run into the deadlock? If yes, I'm not sure how this was hit in the setup where the deadlock was seen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this can be it because the inode dump does not show it as frozen or exporting caps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but thinking more on this - it requires multiple MDSs to run into the deadlock? If yes, I'm not sure how this was hit in the setup where the deadlock was seen.

There were 2 active MDSs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but thinking more on this - it requires multiple MDSs to run into the deadlock? If yes, I'm not sure how this was hit in the setup where the deadlock was seen.

There were 2 active MDSs.

Yeh, but the inode wasn't frozen, so it doesn't look to be this call path for the deadlock.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you found a genuine issue here Xiubo. Can you restructure this to break out !need_auth into a nested if which will assert if need_auth == true (as, in that case, the request should have been forwarded).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

There have two case:

1, _openc() --> _open(): the first call of rdlock_path_pin_ref() will do nothing even the need_auth == true, because in _openc() the rdlock_path_xlock_dentry() will set the MutationImpl::PATH_LOCKED flag.

So we should retry the rdlock_path_pin_ref() anyway if need_auth == true with:

-  if (cur->is_frozen() || cur->state_test(CInode::STATE_EXPORTINGCAPS)) {
-    ceph_assert(!need_auth);
+  if (need_auth || cur->is_frozen() || cur->state_test(CInode::STATE_EXPORTINGCAPS)) {
     mdr->locking_state &= ~(MutationImpl::PATH_LOCKED | MutationImpl::ALL_LOCKED);
     CInode *cur = rdlock_path_pin_ref(mdr, true);
     if (!cur)

2, _open() directly: the first call of rdlock_path_pin_ref() will forward to auth directly if need_auth == true, so it shouldn't be here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we should retry the rdlock_path_pin_ref() anyway if need_auth == true with:

That will always rerun rdlock_path_pin_ref twice if need_auth == true though, yes? I think we should try to avoid that except in the case of openc -> open. I wonder why not mdr->locking_state &= ~(MutationImpl::PATH_LOCKED | MutationImpl::ALL_LOCKED); before having openc call open.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we should retry the rdlock_path_pin_ref() anyway if need_auth == true with:

That will always rerun rdlock_path_pin_ref twice if need_auth == true though, yes? I think we should try to avoid that except in the case of openc -> open. I wonder why not mdr->locking_state &= ~(MutationImpl::PATH_LOCKED | MutationImpl::ALL_LOCKED); before having openc call open.

No, not always. Mostly the first one will do nothing in case of _openc() --> _open() and no need to do anything. Because the clients will always try to send the WRITE request to auth MDS and the _openc() has already done what we need here.

Just in case the open requests need be handled in auth MDS but the current MDS is not auth or the inode is under exporting will we need to retry it.

@batrick
Copy link
Member

batrick commented Jul 14, 2023

@lxbsz can you share your reproducer you mentioned in slack? Even if it's an ad-hoc script.

@lxbsz
Copy link
Member Author

lxbsz commented Jul 15, 2023

@lxbsz can you share your reproducer you mentioned in slack? Even if it's an ad-hoc script.

Locally I didn't reproduce it, but just added one test code try to acquire the rdlock for the same dentry after rdlock_path_xlock_dentry() in _openc():

 1616 2023-07-14T19:22:18.368+0800 7ff0aaa06640  0 mds.0.server  lxb 1 handle_client_openc get rdlock for [dentry #0x1/ffsb.sh [2,head] auth NULL (dn xlock x=1 by 0x55b564168400) (dversion lock w=1 last_client=4211) pv=0 v=1 ap=2 ino=(nil) state=1073741824 | request=1 lock=2 authpin=1 0x55b564125b80]                                                                                                                                                           
 1617 2023-07-14T19:22:18.368+0800 7ff0aaa06640  0 mds.0.server  lxb 2 handle_client_openc get rdlock for [dentry #0x1/ffsb.sh [2,head] auth NULL (dn xlock x=1 by 0x55b564168400) (dversion lock w=1 last_client=4211) pv=0 v=1 ap=2 ino=(nil) state=1073741824 | request=1 lock=2 authpin=1 0x55b564125b80]
 1618 2023-07-14T19:22:18.368+0800 7ff0aaa06640 10 mds.0.locker acquire_locks request(client.4211:2 nref=3 cr=0x55b5641b8f00)
 1619 2023-07-14T19:22:18.368+0800 7ff0aaa06640 20 mds.0.locker  must rdlock (dn xlock x=1 by 0x55b564168400) [dentry #0x1/ffsb.sh [2,head] auth NULL (dn xlock x=1 by 0x55b564168400) (dversion lock w=1 last_client=4211) pv=0 v=1
ap=2 ino=(nil) state=1073741824 | request=1 lock=2 authpin=1 0x55b564125b80]
 1620 2023-07-14T19:22:18.368+0800 7ff0aaa06640 10 mds.0.locker  must authpin [dentry #0x1/ffsb.sh [2,head] auth NULL (dn xlock x=1 by 0x55b564168400) (dversion lock w=1 last_client=4211) pv=0 v=1 ap=2 ino=(nil) state=1073741824 | request=1 lock=2 authpin=1 0x55b564125b80]
 1621 2023-07-14T19:22:18.368+0800 7ff0aaa06640 10 mds.0.locker  already auth_pinned [dentry #0x1/ffsb.sh [2,head] auth NULL (dn xlock x=1 by 0x55b564168400) (dversion lock w=1 last_client=4211) pv=0 v=1 ap=2 ino=(nil) state=1073741824 | request=1 lock=2 authpin=1 0x55b564125b80]
 1622 2023-07-14T19:22:18.368+0800 7ff0aaa06640  7 mds.0.locker rdlock_start  on (dn xlock x=1 by 0x55b564168400) on [dentry #0x1/ffsb.sh [2,head] auth NULL (dn xlock x=1 by 0x55b564168400) (dversion lock w=1 last_client=4211) pv=0 v=1 ap=2 ino=(nil) state=1073741824 | request=1 lock=2 authpin=1 0x55b564125b80]
 1623 2023-07-14T19:22:18.368+0800 7ff0aaa06640  7 mds.0.locker rdlock_start waiting on (dn xlock x=1 by 0x55b564168400) on [dentry #0x1/ffsb.sh [2,head] auth NULL (dn xlock x=1 by 0x55b564168400) (dversion lock w=1 last_client=4211) pv=0 v=1 ap=2 ino=(nil) state=1073741824 | request=1 lock=2 authpin=1 0x55b564125b80]
 1624 2023-07-14T19:22:18.368+0800 7ff0aaa06640 10 mds.0.locker nudge_log (dn xlock x=1 by 0x55b564168400) on [dentry #0x1/ffsb.sh [2,head] auth NULL (dn xlock x=1 by 0x55b564168400) (dversion lock w=1 last_client=4211) pv=0 v=1 ap=2 ino=(nil) state=1073741824 | request=1 lock=2 waiter=1 authpin=1 0x55b564125b80]
...
 2753 2023-07-14T19:22:48.984+0800 7ff0a9203640  0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 30.619722 secs                                                                             
 2754 2023-07-14T19:22:48.984+0800 7ff0a9203640  0 log_channel(cluster) log [WRN] : slow request 30.619721 seconds old, received at 2023-07-14T19:22:18.365929+0800: client_request(client.4211:2 create #0x1/ffsb.sh 2023-07-14T19:22:18.364275+0800 caller_uid=1000, caller_gid=1000{10,1000,}) currently failed to rdlock, waiting

This is what I had hit when I was coding the unlink related fixing last month.

@batrick
Copy link
Member

batrick commented Jul 15, 2023

I was able to reproduce the problem. I'll have more details Monday.

@lxbsz
Copy link
Member Author

lxbsz commented Jul 17, 2023

I was able to reproduce the problem. I'll have more details Monday.

Cool @batrick.

@lxbsz
Copy link
Member Author

lxbsz commented Jul 17, 2023

jenkins retest this please

@batrick
Copy link
Member

batrick commented Jul 18, 2023

I was able to reproduce the problem. I'll have more details Monday.

https://tracker.ceph.com/issues/62052

lov.add_xlock(&dn->lock);
} else {
lov.add_rdlock(&dn->lock);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since cf5fdcd, openc will not xlock an existing non-null dentry if O_EXCL is not set. Instead it will acquire a rdlock:

cf5fdcd#diff-277dc6e796ccecb6aa14c9357f7a86898d6ddcf4113c110a4e00e0a10f6fefa7R4476

So I don't think this commit will do anything?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right.

Because only when dnl->is_null() || !want_inode will true will the _openc() xlock the dentry. While if dnl->is_null() is true it should go to _open(). And the rdlock_path_xlock_dentry() in _openc() will always make want_inode to true. So the _openc() won't xlock the dentry.

CInode *cur = rdlock_path_pin_ref(mdr, need_auth);
if (!cur)
return;

if (cur->is_frozen() || cur->state_test(CInode::STATE_EXPORTINGCAPS)) {
if ((cur->is_frozen() || cur->state_test(CInode::STATE_EXPORTINGCAPS)) && !need_auth) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you found a genuine issue here Xiubo. Can you restructure this to break out !need_auth into a nested if which will assert if need_auth == true (as, in that case, the request should have been forwarded).

@lxbsz lxbsz changed the title mds: do not acquire rdlock if the dentry lock already xlocked mds: retry rdlock_path_pin_ref if needs to fwd request to auth Jul 20, 2023
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Nov 24, 2023
@lxbsz lxbsz removed the stale label Nov 30, 2023
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Jan 29, 2024
Copy link

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Feb 28, 2024
@lxbsz lxbsz reopened this Feb 29, 2024
@github-actions github-actions bot removed the stale label Feb 29, 2024
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Apr 29, 2024
@lxbsz lxbsz removed the stale label Apr 29, 2024
@@ -4377,13 +4377,14 @@ void Server::handle_client_open(MDRequestRef& mdr)
respond_to_request(mdr, -CEPHFS_EROFS);
return;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove unnecessary whitespace changes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed it.

if (cur->is_frozen() || cur->state_test(CInode::STATE_EXPORTINGCAPS)) {
ceph_assert(!need_auth);
// In case the _open() is called from _openc() the previous rdlock_path_pin_ref()
// will fail to forward the request to auth MDS, we need to retry it here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand in the comment here why this particular flow will fail to forward the request to auth?

I feel we could write a test for this:

  • make a config for the client to always choose a particular rank by default
  • make the necessary file, pin the directory its in to rank 1
  • set the config to have the client always choose rank 0 by default
  • write a python script to open the file O_CREATE|O_TRUNC|O_WRONLY.

yes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly, I will expand the comment and try to add a test case for it. Thanks @batrick

In openc() the 'PATH_LOCKED' will be set and if the file exists then it will
call open() dirrectly, but when the create requests need be handled in auth
MDS but the current MDS is not auth because of the inode is under exporting
and the auth is changed, we need to retry rdlock_path_pin_ref() and forward
the requests to the auth.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
There have some tests need multiple mds, such when pining directories.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
@lxbsz
Copy link
Member Author

lxbsz commented Apr 30, 2024

@batrick Added a test case for it. Please take a look. Thanks!

[Edit] When testing this I just added the ceph_assert(!need_auth); back and I can see the mds was crashing, which means the test worked as expected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants