squid: mds/quiesce: let clients keep their buffered writes for a quiesced file #57013

Open
wants to merge 2 commits into
base: squid

leonid-s-usov
Contributor

Backport

Fixes: https://tracker.ceph.com/issues/65556
Original-Issue: https://tracker.ceph.com/issues/65472
Original-PR: #56755

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

With the quiesce protocol taking an `rdlock` on the file,
it also revokes the `Fb` capability, which the clients can't release
until they are done flushing, and that may take arbitrarily long:
evidently, more than 10 minutes.

We went for the `rdlock` to avoid affecting read-only clients,
but given the evidence above we should not optimize for those.
Ideally, we’d like to have a QUIESCE file lock mode where both rd
and buffer are allowed, but as of now it seems like our best
available option is to `xlock` the file, which will let the writing
clients keep their buffers for the duration of the quiesce.

We can only afford this change for a `splitauth` config,
i.e. where we drop the lock immediately after all `Fw`s are revoked.

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 8ac9842)
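
The trade-off described in this commit message can be sketched as a small lookup table. This is a toy model under stated assumptions, not Ceph's lock state machine: the names `KEPT_CAPS`, `revoked`, and `must_flush_before_quiesce` are hypothetical, and the table only restates what the commit text says (an `rdlock` revokes `Fb`, an `xlock` lets writers keep their buffers). The `Fc` entry under `rdlock` and the loss of `Fr` under `xlock` are assumptions inferred from the remark about read-only clients.

```python
# Toy model only: restates the commit message, not Ceph's actual state machine.
READ, WRITE, CACHE, BUFFER = "Fr", "Fw", "Fc", "Fb"

# Caps a client is assumed to keep while the MDS holds the given lock
# on a file for quiesce.
KEPT_CAPS = {
    # rdlock: readers are unaffected, but Fb is revoked (per the commit text).
    # Keeping Fc here is an assumption.
    "rdlock": {READ, CACHE},
    # xlock: writers keep their buffers (per the commit text).
    # Losing Fr here is an assumption drawn from the read-only remark.
    "xlock": {CACHE, BUFFER},
}


def revoked(held, lock_mode):
    """Return the caps a client holding `held` has to give up."""
    return set(held) - KEPT_CAPS[lock_mode]


def must_flush_before_quiesce(held, lock_mode):
    """True if the client has to flush buffered writes before quiesce completes."""
    return BUFFER in revoked(held, lock_mode)


writer = {READ, WRITE, CACHE, BUFFER}
print(sorted(revoked(writer, "rdlock")))  # Fb and Fw are revoked
print(sorted(revoked(writer, "xlock")))   # Fr and Fw are revoked
```

In both modes `Fw` is revoked, which is what quiesce needs; only the `rdlock` mode additionally forces the client to flush, and that flush is the step that was observed to take more than 10 minutes.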
For every mirrored lock, the auth will message the replica to ensure
the replicated lock state. When we take x/rdlock on the auth, it will
ensure the LOCK_LOCK state on the replica, which has the file caps we
want for quiesce: CACHE and BUFFER.

It should be sufficient to only hold the quiesce local lock
on the replica side.

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit eac482b)
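
The auth/replica split described in this commit message could look roughly like the sketch below. It is an illustration under assumptions only: the classes `Auth` and `Replica`, their methods, and the `"unlocked"` placeholder state are hypothetical and do not mirror Ceph's MDS classes. The only facts taken from the commit text are that the auth pushes the `LOCK_LOCK` state to replicas, that this state allows CACHE and BUFFER, and that the replica holds just the quiesce local lock.

```python
# Illustrative model only; all names are hypothetical.
LOCK_LOCK_CAPS = {"Fc", "Fb"}  # CACHE and BUFFER, as stated in the commit message


class Replica:
    def __init__(self):
        self.filelock_state = "unlocked"  # placeholder initial state (assumption)
        self.held = []

    def on_auth_message(self, state):
        # The auth tells the replica which replicated lock state to adopt.
        self.filelock_state = state

    def quiesce(self):
        # Only the quiesce local lock is taken; no file lock is requested.
        self.held.append("quiesce_local")

    def allowed_caps(self):
        # Only LOCK_LOCK is modelled; other states return None (not modelled).
        return LOCK_LOCK_CAPS if self.filelock_state == "LOCK_LOCK" else None


class Auth:
    def __init__(self, replicas):
        self.replicas = replicas
        self.held = []

    def quiesce(self, mode):
        if mode not in ("xlock", "rdlock"):
            raise ValueError("unsupported lock mode: " + mode)
        self.held += ["quiesce_local", "filelock:" + mode]
        # Taking x/rdlock on the auth puts every replica into LOCK_LOCK.
        for replica in self.replicas:
            replica.on_auth_message("LOCK_LOCK")


replica = Replica()
auth = Auth([replica])
auth.quiesce("xlock")
replica.quiesce()
print(replica.held, replica.filelock_state)
```

The point of the sketch is that the replica ends up in the desired cap state as a side effect of the auth's lock, so it never needs a file lock of its own.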
@leonid-s-usov leonid-s-usov requested a review from a team April 21, 2024 09:25
@github-actions github-actions bot added the cephfs Ceph File System label Apr 21, 2024
@leonid-s-usov
Contributor Author

jenkins test api

@batrick
Member

batrick commented May 7, 2024

jenkins test windows

@batrick batrick added this to the v19.1.0 milestone May 17, 2024
@batrick
Member

batrick commented May 17, 2024

This PR is under test in https://tracker.ceph.com/issues/66101.

@batrick batrick modified the milestones: v19.1.0, v19.1.1 May 17, 2024