Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mds: fix issuing redundant reintegrate/migrate_stray requests #53280

Merged
merged 2 commits into from Oct 20, 2023

Conversation

lxbsz
Copy link
Member

@lxbsz lxbsz commented Sep 5, 2023

Just in cause a CInode's nlink is 1, and then a unlink request comes and then early replies and submits to the MDLogs, but just before the MDlogs are flushed a link request comes, and the link request also succeeds and early replies to client.

Later when the unlink/link requests' MDLog events are flushed and the callbacks are called, which will fire a stray denty reintegration. But it will pick the new dentry, which is from the link's request and is a remote dentry, to do the reintegration. While in the 'rename' code when traversing the path it will trigger to call the 'dn->link_remote()', which later will fire a new stray dentry reintegration.

The problem is if the first 'rename' request is retried several times, and in each time it will fire a new reintegration, which makes no sense and maybe blocked for a very long time dues to some reasons and then will be reported as slow request warning.

Fixes: https://tracker.ceph.com/issues/62702

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@lxbsz lxbsz requested review from vshankar, batrick and a team September 5, 2023 05:27
@github-actions github-actions bot added the cephfs Ceph File System label Sep 5, 2023
Copy link
Contributor

@mchangir mchangir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: update commit message to start with Just in case

@lxbsz
Copy link
Member Author

lxbsz commented Sep 6, 2023

nit: update commit message to start with Just in case

Sure @mchangir I will fix them both. Thanks!

src/mds/StrayManager.cc Outdated Show resolved Hide resolved
src/mds/StrayManager.cc Outdated Show resolved Hide resolved
@dparmar18
Copy link
Contributor

once the new reintegration flag is set, it stays set throughout the lifetime of that request right? in case if the reintegration fails in the first try, can there be any serious problems since now it cannot perform reintegration?

@lxbsz
Copy link
Member Author

lxbsz commented Sep 8, 2023

@batrick I just suspect this is the same issue with mds: infinite rename recursion on itself tracker.

From the logs I can see the same reintegration/rename was triggered 2000+ time, but finally they all finished or failed, not infinitely stuck or recursing.

@lxbsz
Copy link
Member Author

lxbsz commented Sep 8, 2023

once the new reintegration flag is set, it stays set throughout the lifetime of that request right? in case if the reintegration fails in the first try, can there be any serious problems since now it cannot perform reintegration?

No, it shouldn't. I just set the flag in the reintegration or rename request, not in the CDentry or the CInode.

src/mds/StrayManager.cc Outdated Show resolved Hide resolved
@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@lxbsz
Copy link
Member Author

lxbsz commented Sep 18, 2023

Since #52199 had been merged and rebased this to the upstream.

Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM!

src/mds/StrayManager.cc Outdated Show resolved Hide resolved
src/mds/MDSRank.h Outdated Show resolved Hide resolved
@lxbsz lxbsz force-pushed the wip-62702 branch 2 times, most recently from 16d7ea6 to 0983390 Compare September 19, 2023 00:53
src/mds/MDSRank.h Outdated Show resolved Hide resolved
src/mds/MDSMetaRequest.h Outdated Show resolved Hide resolved
src/mds/MDSRank.h Outdated Show resolved Hide resolved
src/mds/StrayManager.cc Outdated Show resolved Hide resolved
src/mds/StrayManager.cc Outdated Show resolved Hide resolved
This will be used to avoid possible multiple reintegration issue
later.

Fixes: https://tracker.ceph.com/issues/62702
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Just in case a CInode's nlink is 1, and then a unlink request comes
and then early replies and submits to the MDLogs, but just before
the MDlogs are flushed a link request comes, and the link request
also succeeds and early replies to client.

Later when the unlink/link requests' MDLog events are flushed and
the callbacks are called, which will fire a stray denty reintegration.
But it will pick the new dentry, which is from the link's request
and is a remote dentry, to do the reintegration. While in the
'rename' code when traversing the path it will trigger to call the
'dn->link_remote()', which later will fire a new stray dentry
reintegration.

The problem is if the first 'rename' request is retried several
times, and in each time it will fire a new reintegration, which
makes no sense and maybe blocked for a very long time dues to some
reasons and then will be reported as slow request warning.

Fixes: https://tracker.ceph.com/issues/62702
Signed-off-by: Xiubo Li <xiubli@redhat.com>
@lxbsz
Copy link
Member Author

lxbsz commented Sep 21, 2023

@batrick Fixed them all and also run the pjd and ffstress tests locally. Thanks!

Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\o/

@vshankar
Copy link
Contributor

@vshankar
Copy link
Contributor

vshankar commented Oct 4, 2023

@lxbsz I see a couple of fsstress failures with the branch that has this change: https://pulpito.ceph.com/vshankar-2023-09-28_07:23:59-fs-wip-vshankar-testing-20230926.081818-testing-default-smithi/7405355

... and not in the baseline run for main branch: https://pulpito.ceph.com/yuriw-2023-09-29_19:46:13-fs-main-distro-default-smithi/

I suspect this change is contributing to the failure, so, not merging this till the failures is debugged.

vshankar added a commit to vshankar/ceph that referenced this pull request Oct 7, 2023
* refs/pull/53280/head:
	mds: fix issuing redundant reintegrate/migrate_stray requests
	mds: record the internal client request and receive client reply

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
@lxbsz
Copy link
Member Author

lxbsz commented Oct 9, 2023

@lxbsz I see a couple of fsstress failures with the branch that has this change: https://pulpito.ceph.com/vshankar-2023-09-28_07:23:59-fs-wip-vshankar-testing-20230926.081818-testing-default-smithi/7405355

... and not in the baseline run for main branch: https://pulpito.ceph.com/yuriw-2023-09-29_19:46:13-fs-main-distro-default-smithi/

I suspect this change is contributing to the failure, so, not merging this till the failures is debugged.

I will have a look today or tomorrow.

@lxbsz
Copy link
Member Author

lxbsz commented Oct 10, 2023

@lxbsz I see a couple of fsstress failures with the branch that has this change: https://pulpito.ceph.com/vshankar-2023-09-28_07:23:59-fs-wip-vshankar-testing-20230926.081818-testing-default-smithi/7405355

... and not in the baseline run for main branch: https://pulpito.ceph.com/yuriw-2023-09-29_19:46:13-fs-main-distro-default-smithi/

I suspect this change is contributing to the failure, so, not merging this till the failures is debugged.

@vshankar This is a bug from kernel's mm, not introduced by this PR:

From teuthology.log there were a lot of osd slow ops:

2023-09-28T12:07:29.937 INFO:journalctl@ceph.osd.4.smithi144.stdout:Sep 28 12:07:29 smithi144 ceph-daf19bca-5dd9-11ee-8db4-212e2dc638e7-osd-4[130215]: 2023-09-28T12:07:29.822+0000 7fda4fc7c700 -1 osd.4 46 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.14542.0:11063 3.1a 3:5ead09fd:::10000005bde.00000000:head [write 0~4194304 in=4194304b] snapc 1=[] ondisk+write+known_if_redirected e46)

And from the smith069 node it crashed:

[ 1450.001455] usercopy: Kernel memory exposure attempt detected from SLUB object 'kmalloc-512' (offset 274, size 479232)!^M
[ 1450.012479] ------------[ cut here ]------------^M
[ 1450.017335] kernel BUG at mm/usercopy.c:102!^M
^M
Entering kdb (current=0xffff888100feaa80, pid 173925) on processor 5 Oops: (null)^M
due to oops @ 0xffffffff8138616d^M
CPU: 5 PID: 173925 Comm: fsstress Tainted: G S                 6.5.0-rc7-gfef64f93da2b #1^M
Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 2.0 12/17/2015^M
RIP: 0010:usercopy_abort+0x6d/0x80^M
Code: 4c 0f 44 d0 41 53 48 c7 c0 18 36 13 82 48 c7 c6 5a b0 12 82 48 0f 45 f0 48 89 f9 48 c7 c7 a8 ae 1a 82 4c 89 d2 e8 33 79 df ff <0f> 0b 49 c7 c1 f2 13 14 82 4d 89 cb 4d 89 c8 eb a5 66 90 f3 0f 1e^M
RSP: 0018:ffffc90006947a88 EFLAGS: 00010246^M
RAX: 000000000000006b RBX: 0000000000075000 RCX: 0000000000000000^M
RDX: 0000000000000000 RSI: ffff88885fd5d800 RDI: ffff88885fd5d800^M
RBP: 0000000000075000 R08: 0000000000000000 R09: c0000000ffffdfff^M
R10: 0000000000000001 R11: ffffc90006947938 R12: 0000000000000001^M
R13: ffff888114257912 R14: ffff888140b667c0 R15: ffff8881141e2912^M
FS:  00007f94f6ac1500(0000) GS:ffff88885fd40000(0000) knlGS:0000000000000000^M
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
CR2: 000055917f531000 CR3: 0000000316f98002 CR4: 00000000003706e0^M
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000^M
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400^M
Call Trace:^M
 <TASK>^M
more>

Thanks

@vshankar
Copy link
Contributor

Thx @lxbsz - will have a look.

batrick added a commit to batrick/ceph that referenced this pull request Oct 20, 2023
* refs/pull/53280/head:
	mds: fix issuing redundant reintegrate/migrate_stray requests
	mds: record the internal client request and receive client reply
Copy link
Contributor

@vshankar vshankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vshankar vshankar merged commit 8400e77 into ceph:main Oct 20, 2023
10 of 11 checks passed
@batrick
Copy link
Member

batrick commented Oct 20, 2023

https://tracker.ceph.com/projects/cephfs/wiki/Main#16-Oct-2023

Also tested ^. This certainly seems to have resolved the issue thoroughly. Thanks Xiubo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants