New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ceph: allow xlock state to be LOCK_PREXLOCK when putting it #52111
Conversation
src/mds/SimpleLock.h
Outdated
@@ -417,6 +417,7 @@ class SimpleLock { | |||
ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || | |||
state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || | |||
state == LOCK_LOCK || /* if we are a leader of a peer */ | |||
state == LOCK_PREXLOCK || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lxbsz From your comment on the tracker https://tracker.ceph.com/issues/44565#note-18, How does the lock state moved to LOCK_PREXLOCK on first decrement of x from 2. Logically it make sense to move to LOCK_PREXLOCK on x = 0 ? I don't understand the lock state machine. Sorry if it's a naive question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it doesn't much matter with the x
and the state still could be moved even it's not 0
.
This issue happened when one request tried to acquire the xlock for ilink
locker:
Before moving the state to LOCK_PREXLOCK
the state was LOCK_LOCK_XLOCK
, which is a intermediate state before successfully setting it to LOCK_XLOCK
. MDS will set the locker to a intermediate state if it couldn't move to the next target state directly, in this case it needed to revoke some caps from clients first.
From mds/lock.cc
file we can see it's next state is LOCK_PREXLOCK
:
[LOCK_LOCK_XLOCK]= { LOCK_PREXLOCK,false,LOCK_LOCK,0, XCL, 0, 0, 0, 0, XCL, 0,0,0,0 },
And just after the caps was successfully revoked and when continued moving the state to LOCK_XLOCK
, the eval_gather()
just set it to LOCK_PREXLOCK
. And then later when the corresponding request was retried it should move the state to LOCK_XLOCK
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank @lxbsz for the explaination.
When the journal logs are successfully flushed the lockers' state could be LOCK_PREXLOCK if the inflight OP need to gather issued caps. Fixes: https://tracker.ceph.com/issues/44565 Signed-off-by: Xiubo Li <xiubli@redhat.com>
When the journal logs are successfully flushed the lockers' state could be LOCK_SYNC during the xlock count is non-zero. Fixes: https://tracker.ceph.com/issues/44565 Signed-off-by: Xiubo Li <xiubli@redhat.com>
@lxbsz I see your testing label in this PR. Are you still testing your changes before seeking further reviews? |
jenkins retest this please |
Test update: Tests passed, preparing run wiki - will merge soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the journal logs are successfully flushed the lockers' state could be LOCK_PREXLOCK if the inflight OP need to gather issued caps.
Fixes: https://tracker.ceph.com/issues/44565
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows