mds: just wait the client flushes the snap and dirty buffer #53238
jenkins test windows
Fix makes sense.
jenkins test windows
jenkins test windows
@lxbsz could you please rebase and push? The jenkins tests are failing to pass :/
Done.
jenkins test make check arm64
jenkins test windows
@lxbsz please rebase and push - jenkins mess :/
Done.
jenkins test make check arm64
Windows failure looks real
LibCephFS.SnapDiffLib2 seems to hang, I'll try running it locally.
It's not a Windows issue; it hangs on Linux as well. It's just that
[...]
@petrutlucian94 BTW, did the above test include this PR? Or just the
jenkins test make check arm64
The above test fails if I apply this commit but passes if I revert it, so there's something wrong with the commit.
Okay, I will have a look later. Thanks @petrutlucian94
When truncating an inode, the MDS sets the ifile lock state to LOCK_XLOCKSNAP and then tries to revoke the 'Fb' caps. But if the client cannot release the 'Fb' cap in time and instead replies with a normal cap update request, the MDS will successfully transition the ifile lock state to LOCK_EXCL, which is stable. That means the MDS will wake up the truncating request and continue truncating the objects from RADOS without waiting for the clients to flush the dirty buffer.
Fixes: commit 9c65920 ("mds: force client flush snap data before truncating objects")
Fixes: https://tracker.ceph.com/issues/62580
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Fixed it. @vshankar Please review it again, thanks!
The old code couldn't correctly handle the corner cases and couldn't wake up the waiters. The new fix just makes it more precise and only does this just in case in
Test runs in ~2h - wip-vshankar-testing-20231127.102654
https://pulpito.ceph.com/?branch=wip-vshankar-testing-20231127.102654 (there are a bunch of rhel pkg install failures, so this would need a revalidation)
When truncating an inode, the MDS sets the ifile lock state to LOCK_XLOCKSNAP and then tries to revoke the 'Fb' caps. But if the client cannot release the 'Fb' cap in time and instead replies with a normal cap update request, the MDS will successfully transition the ifile lock state to LOCK_EXCL, which is stable.
That means the MDS will wake up the truncating request and continue truncating the objects from RADOS without waiting for the clients to flush the dirty buffer.
Fixes: commit 9c65920 ("mds: force client flush snap data before truncating objects")
Fixes: https://tracker.ceph.com/issues/62580
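The race described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyInode`, `LockState`, `start_truncate`, `handle_cap_update`, the `wait_for_flush` switch and `truncated_before_flush` are all hypothetical names made up for illustration, and none of this is the actual MDS Locker code or the actual patch. It only shows why reaching a stable lock state on an ordinary cap update wakes the truncate too early, and what "just wait for the client to flush" would look like in the model.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Toy model only. The names echo the PR text; none of this is the real MDS code.
enum class LockState { Sync, XlockSnap, Excl };

struct ToyInode {
  LockState filelock = LockState::Sync;
  bool client_holds_fb = true;     // client still holds the 'Fb' (buffer) cap
  bool objects_truncated = false;  // set once the truncate touches the objects
  std::vector<std::function<void()>> waiters;  // work parked on the lock
};

// Run and clear everything parked on the lock.
void wake_waiters(ToyInode& ino) {
  auto pending = std::move(ino.waiters);
  ino.waiters.clear();
  for (auto& fn : pending) fn();
}

// Truncate: move the lock to XlockSnap, ask for 'Fb' back, park the real work.
void start_truncate(ToyInode& ino) {
  assert(ino.filelock == LockState::Sync);
  ino.filelock = LockState::XlockSnap;
  ino.waiters.push_back([&ino] { ino.objects_truncated = true; });
}

// A cap message from the client. `released_fb` says whether it gave up 'Fb'.
// `wait_for_flush` is a hypothetical switch selecting the behaviour the PR asks for.
void handle_cap_update(ToyInode& ino, bool released_fb, bool wait_for_flush) {
  if (released_fb) ino.client_holds_fb = false;
  if (ino.filelock != LockState::XlockSnap) return;
  if (wait_for_flush && ino.client_holds_fb) return;  // stay unstable, keep waiting
  ino.filelock = LockState::Excl;  // stable state: waiters are woken
  wake_waiters(ino);
}

// Scenario from the description: an ordinary cap update arrives first.
// Returns true if objects were truncated while the client still held 'Fb'.
bool truncated_before_flush(bool wait_for_flush) {
  ToyInode ino;
  start_truncate(ino);
  handle_cap_update(ino, /*released_fb=*/false, wait_for_flush);
  return ino.objects_truncated && ino.client_holds_fb;
}
```

In this model, calling `truncated_before_flush(false)` reproduces the problem (the truncate runs while the client still holds 'Fb'), while `truncated_before_flush(true)` keeps the truncate parked until a later update actually releases the cap.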
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows