reef: os/bluestore/bluefs: fix dir_link might add link that already exists in compact log #51001
Merged
yuriw merged 2 commits into ceph:reef from ifed01:wip-ifed-bluefs-duplicate-dir-link-ree on May 25, 2023
…in compact log

After commit eac1807
os/bluestore/bluefs: Weaken locks in open_for_write

There's a race window between open_for_write and log compaction

Process A                            | Process B
open_for_write                       | _compact_log_async_LD_LNF_D
                                     | log.lock
node.lock                            |
...                                  |
update nodes.dir_map(add dirlink A)  | node.lock(wait for process A)
node.unlock                          |
...                                  |
log.lock(wait for Process B)         | <get lock>
...                                  | compact log(create log based on nodes.dir_map which has dirlink A)
...                                  | ...
...                                  | ...
...                                  | node.unlock()
...                                  | log.unlock
<get lock>                           |
log file create event(dirlink A)     |
log.unlock                           |

After the above case, bluefs log will have something like this

0x0: txn(seq 1 len 0x141ee crc 0x3e1c626f)
0x0: op_init
0x0: op_file_update file(ino 2524749 size 0x246b6 mtime 2023-02-08T03:07:19.950963+0800 allocated 30000 alloc_commit 30000 extents [1:0xa135e0000~30000])
0x0: op_file_update file(ino 2524746 size 0x175af mtime 2023-02-08T03:07:19.771584+0800 allocated 20000 alloc_commit 20000 extents [1:0xa13530000~20000])
...
0x0: op_dir_link db/2524749.sst to 2524751
0x0: op_dir_link db/2524750.sst to 2524752
0x0: op_dir_link db/CURRENT to 2491157
...
0x0: op_jump seq 18414993 offset 0x20000
0x20000: txn(seq 18414994 len 0x65 crc 0xc1f9ec5f)
0x20000: op_file_update file(ino 2524752 size 0x0 mtime 2023-02-08T03:07:20.205074+0800 allocated 0 alloc_commit 0 extents [])
0x20000: op_dir_link db/2524750.sst to 2524752

dir_link db/2524750.sst to 2524752 exists at both compacted log(txn seq 1) and log txn seq 18414994.
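A toy model of log replay may help show why the second `op_dir_link db/2524750.sst` record is a problem. This is only a sketch under stated assumptions: `DirLinkOp` and `first_duplicate_link` are hypothetical names invented for illustration, and the real `BlueFS::_replay` works on its own directory and file structures rather than a flat map.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for one op_dir_link log record.
struct DirLinkOp {
  std::string dir;
  std::string file;
  uint64_t ino;
};

// Toy replay: applies op_dir_link records in log order and returns the index
// of the first record whose (dir, file) name is already linked, or -1 if
// every name is linked at most once. This sketch only reports the position
// of the offending record instead of aborting.
int first_duplicate_link(const std::vector<DirLinkOp>& ops) {
  std::map<std::pair<std::string, std::string>, uint64_t> links;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    auto key = std::make_pair(ops[i].dir, ops[i].file);
    // emplace() refuses to overwrite, so .second is false on a duplicate.
    if (!links.emplace(key, ops[i].ino).second) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

Feeding it the three links from the compacted transaction followed by the extra `db/2524750.sst` link from txn seq 18414994 flags the fourth record (index 3) as the duplicate.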
If log compaction does not happen later, or an abnormal shutdown happens, the next bluefs mount replay will fail at the following assert

2023-02-10T11:05:09.826+0800 7f1f97b71280 10 bluefs _replay 0x20000: txn(seq 18414994 len 0x65 crc 0xc1f9ec5f)
2023-02-10T11:05:09.826+0800 7f1f97b71280 20 bluefs _replay 0x20000: op_file_update file(ino 2524752 size 0x0 mtime 2023-02-08T03:07:20.205074+0800 allocated 0 alloc_commit 0 extents [])
2023-02-10T11:05:09.826+0800 7f1f97b71280 20 bluefs _replay 0x20000: op_dir_link db/2524750.sst to 2524752
2023-02-10T11:05:09.832+0800 7f1f97b71280 -1 //source/ceph/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_replay(bool, bool)' thread 7f1f97b71280 time 2023-02-10T11:05:09.827662+0800
//source/ceph/src/os/bluestore/BlueFS.cc: 1419: FAILED ceph_assert(r == q->second->file_map.end())

The fix follows other operations that update the node and add a log entry at the same time, such as rename: take the log lock and the node lock at the beginning of the function (following lock ordering, so log lock first), i.e. N_LD -> LND

Fixes: https://tracker.ceph.com/issues/56210

Signed-off-by: ethanwu <ethanwu@synology.com>
(cherry picked from commit c55f737)
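The effect of moving from N_LD to LND can be sketched with a small, deterministic model. Everything here is an assumption made for illustration: `ToyFs` and its methods are hypothetical and are not the BlueFS API, and the racy interleaving is replayed step by step on one thread rather than produced by real concurrency.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical toy model, not the BlueFS implementation.
struct ToyFs {
  std::mutex log_lock;    // "L" in the lock naming
  std::mutex nodes_lock;  // "N" in the lock naming
  std::map<std::string, int> dir_map;  // name -> ino
  std::vector<std::string> log;        // names with a dir_link record

  // Old scheme (N_LD), split into its two halves so a caller can place a
  // compaction between them, as in the race described above.
  void link_racy_update_map(const std::string& name, int ino) {
    std::lock_guard<std::mutex> nl(nodes_lock);
    dir_map[name] = ino;
  }
  void link_racy_append_log(const std::string& name) {
    std::lock_guard<std::mutex> ll(log_lock);
    log.push_back(name);
  }

  // New scheme (LND): log lock first, then node lock, both held across the
  // map update and the log append, so compaction cannot run in between.
  void link_fixed(const std::string& name, int ino) {
    std::lock_guard<std::mutex> ll(log_lock);
    std::lock_guard<std::mutex> nl(nodes_lock);
    dir_map[name] = ino;
    log.push_back(name);
  }

  // Compaction rewrites the log from the current dir_map under both locks,
  // taken in the same order (log, then nodes).
  void compact() {
    std::lock_guard<std::mutex> ll(log_lock);
    std::lock_guard<std::mutex> nl(nodes_lock);
    log.clear();
    for (const auto& kv : dir_map) {
      log.push_back(kv.first);
    }
  }

  // Number of dir_link records in the log for the given name.
  int count_in_log(const std::string& name) const {
    return static_cast<int>(std::count(log.begin(), log.end(), name));
  }
};
```

Calling `link_racy_update_map`, then `compact`, then `link_racy_append_log` for the same name leaves two records for it in the toy log, while `link_fixed` followed by `compact` (in either order) leaves exactly one.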
… compaction

Test for https://tracker.ceph.com/issues/56210

Signed-off-by: ethanwu <ethanwu@synology.com>
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
(cherry picked from commit a74d02b)
aclamk approved these changes on May 23, 2023
jenkins test api
backport of #50185
backport tracker: https://tracker.ceph.com/issues/59391
parent tracker: https://tracker.ceph.com/issues/56210
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>