New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
librbd: avoid object map corruption in snapshots taken under I/O #52109
Conversation
By effectively moving capturing of the snap context to the API layer, commit 1d0a3b1 ("librbd: pass IOContext to image-extent IO dispatch methods") introduced a nasty regression. The snap context can be captured only after exclusive lock is safely held for the duration of dealing with the image request and even then must be refreshed if a snapshot creation request is accepted from a peer. This is needed to ensure correctness of the object map in general and fast-diff states in particular (OBJECT_EXISTS vs OBJECT_EXISTS_CLEAN) and object deltas computed based off of them. Otherwise the object map that is forked for the snapshot isn't guaranteed to accurately reflect the contents of the snapshot when the snapshot is taken under I/O (as in disabling the object map may lead to different results being returned for reads). The regression affects mainly differential backup and snapshot-based mirroring use cases with object-map and/or fast-diff enabled: since some object deltas may be incomplete, the destination image may get corrupted. This commit represents a reasonable minimal fix: IOContext passed through to ImageDispatch is effected only for reads and just gets ignored for writes. The next commit cleans up further by undoing the passing of IOContext through the image dispatch layers for writes. Fixes: https://tracker.ceph.com/issues/61616 Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
This is a major footgun since any value passed e.g. at the API layer may be stale by the time we get to object dispatch. All callers are passing the IOContext returned by get_data_io_context() for their ImageCtx anyway, highlighting that the parameter is fictitious. Only the read method can meaningfully take IOContext. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Before kernel: https://pulpito.ceph.com/dis-2023-06-18_15:35:51-krbd-main-testing-default-smithi/
After kernel: https://pulpito.ceph.com/dis-2023-06-18_16:12:28-krbd-main-wip-exclusive-lock-snapc-default-smithi/ |
Before: https://pulpito.ceph.com/dis-2023-06-18_17:52:23-rbd-main-distro-default-smithi/
After: https://pulpito.ceph.com/dis-2023-06-18_17:53:28-rbd-wip-61616-distro-default-smithi/ |
jenkins test api |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I added more validation to the reproducer and encountered a side issue which I'm punting on for now, see the comment in the script. Example failure:
https://pulpito.ceph.com/dis-2023-06-19_21:33:07-rbd-wip-61616-distro-default-smithi/ |
Reruns on the mergeable version (same set of jobs as above):
https://pulpito.ceph.com/dis-2023-06-20_00:10:40-krbd-main-wip-exclusive-lock-snapc-default-smithi/
https://pulpito.ceph.com/dis-2023-06-19_23:47:15-rbd-wip-61616-distro-default-smithi/ |
The current version is pretty useless: - "rbd bench" writes the same byte (0xff) over and over again, so almost all checksumming is in vain - snapshots are taken in a steady state (i.e. not under I/O), so no race conditions can get exposed - even with these caveats, it's not wired up into the suite Redo this workunit to be a reliable reproducer for the issue fixed in the previous commit and wire it up for both krbd and rbd-nbd. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
jenkins test make check |
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows