librbd: avoid object map corruption in snapshots taken under I/O #52109

Merged
merged 3 commits into from Jun 21, 2023

idryomov
Contributor

By effectively moving capturing of the snap context to the API layer,
commit 1d0a3b1 ("librbd: pass IOContext to image-extent IO
dispatch methods") introduced a nasty regression.  The snap context can
be captured only after exclusive lock is safely held for the duration
of dealing with the image request and even then must be refreshed if
a snapshot creation request is accepted from a peer.  This is needed to
ensure correctness of the object map in general and fast-diff states in
particular (OBJECT_EXISTS vs OBJECT_EXISTS_CLEAN) and object deltas
computed from them.  Otherwise the object map that is forked
for the snapshot isn't guaranteed to accurately reflect the contents of
the snapshot when the snapshot is taken under I/O (that is, reads may
return different results depending on whether the object map is enabled).
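
The failure mode described above can be illustrated with a toy Python model. This is not librbd code: `ToyImage` and all of its methods are hypothetical, and the copy-on-write rule is a deliberate simplification (a write preserves the old object for a snapshot only if the snap context it carries already includes that snapshot). The object map for a snapshot is assumed to be forked from the HEAD map at snapshot creation time.

```python
# Toy model of a write racing with snapshot creation. Hypothetical names,
# heavily simplified; it only illustrates why a stale snap context matters.

class ToyImage:
    def __init__(self):
        self.snap_seq = 0       # id of the most recent snapshot
        self.head = {}          # object number -> current data
        self.clones = {}        # (snap id, object number) -> preserved data
        self.head_map = set()   # HEAD object map: objects believed to exist
        self.snap_maps = {}     # snap id -> object map forked at creation

    def capture_snap_context(self):
        return self.snap_seq

    def create_snapshot(self):
        self.snap_seq += 1
        self.snap_maps[self.snap_seq] = set(self.head_map)  # fork the map
        return self.snap_seq

    def write(self, objno, data, snapc_seq):
        newest = self.snap_seq
        # Copy-on-write only if the writer's snap context knows the snapshot.
        if newest and snapc_seq >= newest and (newest, objno) not in self.clones:
            self.clones[(newest, objno)] = self.head.get(objno)
        self.head[objno] = data
        self.head_map.add(objno)

    def read_snap(self, snap_id, objno, use_object_map=True):
        if use_object_map and objno not in self.snap_maps[snap_id]:
            return None  # the map says the object is absent in this snapshot
        for sid in range(snap_id, self.snap_seq + 1):
            if (sid, objno) in self.clones:
                return self.clones[(sid, objno)]
        return self.head.get(objno)


def run(stale):
    img = ToyImage()
    snapc = img.capture_snap_context()      # captured early, "at the API layer"
    snap = img.create_snapshot()            # snapshot request accepted meanwhile
    if not stale:
        snapc = img.capture_snap_context()  # refreshed after the snapshot
    img.write(0, b"new", snapc)
    return (img.read_snap(snap, 0, use_object_map=True),
            img.read_snap(snap, 0, use_object_map=False))
```

In this model `run(stale=True)` returns `(None, b"new")`: the snapshot reads differently with and without the object map. `run(stale=False)` returns `(None, None)`, so both views agree.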

The regression affects mainly differential backup and snapshot-based
mirroring use cases with object-map and/or fast-diff enabled: since
some object deltas may be incomplete, the destination image may get
corrupted.
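
As a rough illustration of why an incomplete delta corrupts the destination, the sketch below derives a delta from per-object fast-diff states and applies it to a backup copy. The state values match librbd's object map states, but `delta_from_object_map` and `apply_delta` are hypothetical helpers, and the real fast-diff logic handles more cases (for example deleted objects) than this simplification does.

```python
# Hedged sketch of a fast-diff style delta; not the librbd algorithm.
OBJECT_NONEXISTENT = 0
OBJECT_EXISTS = 1        # exists and may have changed since the previous snapshot
OBJECT_EXISTS_CLEAN = 3  # exists and is unchanged since the previous snapshot


def delta_from_object_map(snap_map):
    """Return the object numbers a differential backup would copy."""
    return [objno for objno, state in enumerate(snap_map)
            if state == OBJECT_EXISTS]


def apply_delta(dst, src, delta):
    """Copy only the objects listed in the delta from src to a copy of dst."""
    out = list(dst)
    for objno in delta:
        out[objno] = src[objno]
    return out


src_snap1 = [b"a", b"b"]
src_snap2 = [b"a", b"B"]          # object 1 changed between the snapshots
dst = list(src_snap1)             # destination is in sync with snap1

good_map = [OBJECT_EXISTS_CLEAN, OBJECT_EXISTS]
bad_map = [OBJECT_EXISTS_CLEAN, OBJECT_EXISTS_CLEAN]  # object 1 wrongly clean

dst_good = apply_delta(dst, src_snap2, delta_from_object_map(good_map))
dst_bad = apply_delta(dst, src_snap2, delta_from_object_map(bad_map))
```

With the accurate map the destination matches `src_snap2`. With the inaccurate map the changed object is skipped, and the destination silently keeps the old data.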

This commit represents a reasonable minimal fix: the IOContext passed
through to ImageDispatch takes effect only for reads and is simply
ignored for writes.  The next commit cleans up further by no longer
passing IOContext through the image dispatch layers for writes.

Fixes: https://tracker.ceph.com/issues/61616
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Passing IOContext to the image dispatch write methods is a major
footgun since any value passed e.g. at the API layer may be stale by
the time we get to object dispatch.  All callers pass the IOContext
returned by get_data_io_context() for their ImageCtx anyway,
highlighting that the parameter is fictitious.

Only the read method can meaningfully take IOContext.
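
The shape of the resulting interface can be sketched in Python as follows. These classes are hypothetical stand-ins for the C++ dispatch layers; only the name `get_data_io_context()` comes from the commit message, and representing the context as a dict is an assumption made for brevity.

```python
# Hypothetical Python stand-ins for the C++ layers, for illustration only.

class ToyImageCtx:
    def __init__(self):
        self.snap_seq = 0

    def get_data_io_context(self):
        # Always reflects the state at the time of the call.
        return {"snap_seq": self.snap_seq}


class ToyImageDispatch:
    def __init__(self, image_ctx):
        self.image_ctx = image_ctx

    def read(self, io_context, extent):
        # Per the commit message, only reads can meaningfully take an IOContext.
        return ("read", extent, io_context["snap_seq"])

    def write(self, extent, data):
        # No IOContext parameter: the context is looked up as late as
        # possible, so a value captured earlier cannot go stale.
        io_context = self.image_ctx.get_data_io_context()
        return ("write", extent, io_context["snap_seq"])


ctx = ToyImageCtx()
dispatch = ToyImageDispatch(ctx)
early = ctx.get_data_io_context()   # captured before a snapshot is created
ctx.snap_seq = 1                    # snapshot created in the meantime
write_result = dispatch.write((0, 4096), b"x")
read_result = dispatch.read(early, (0, 4096))
```

Here the write observes `snap_seq == 1` even though a context captured earlier still says `0`. A caller can no longer hand the write path a stale value.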

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
@idryomov
Contributor Author

RBD_DEVICE_TYPE: "krbd" (only "qa/workunits/rbd: make continuous export-diff test actually work" commit is relevant, fix landed in kernel 6.4-rc6)

Before kernel:

https://pulpito.ceph.com/dis-2023-06-18_15:35:51-krbd-main-testing-default-smithi/

./7307030/teuthology.log:2023-06-18T16:28:19.914 INFO:tasks.workunit.client.0.smithi139.stdout:Mismatch at snap3: d358d561cf41350f369002850a42449f != 3b42e91d634ad87e912cc3627e55df76
./7307020/teuthology.log:2023-06-18T16:22:02.476 INFO:tasks.workunit.client.0.smithi089.stdout:Mismatch at snap3: 84603f18a29f41db39c752d5be78e373 != 4422a5c14e177981351721980e3c5b85
./7307018/teuthology.log:2023-06-18T16:23:36.792 INFO:tasks.workunit.client.0.smithi052.stdout:Mismatch at snap16: 4ee8986e4fa7a64c016ac10e5528f868 != 91e5473643c00de6ddc98137dfb1f3da
./7307025/teuthology.log:2023-06-18T16:17:07.346 INFO:tasks.workunit.client.0.smithi114.stdout:Mismatch at snap3: 73560e721484e2be4da6a64804074951 != f5e11cad05432702da30a8d41ab6955f
./7307019/teuthology.log:2023-06-18T16:26:45.025 INFO:tasks.workunit.client.0.smithi177.stdout:Mismatch at snap5: 0cb0c70f60a29f0a6b9c9ada0cdf2006 != 15bbc5bc9618698623aed1d74fb8f00d
./7307026/teuthology.log:2023-06-18T16:24:10.073 INFO:tasks.workunit.client.0.smithi164.stdout:Mismatch at snap5: c59214c610d8797e10c7b4e94a895d41 != bc5da97fc5767845c71c3765ee08f24b
./7307022/teuthology.log:2023-06-18T16:29:37.556 INFO:tasks.workunit.client.0.smithi192.stdout:Mismatch at snap3: 91d7c1a53090620ee6049496031b4117 != 5c5a5b6fb8de7c1ec6784053e39b801e
./7307013/teuthology.log:2023-06-18T16:22:32.147 INFO:tasks.workunit.client.0.smithi149.stdout:Mismatch at snap6: 8beb7da22059676a8a080ba5a6e238d9 != 6e7f3e57935b232879b230171af1faf9
./7307023/teuthology.log:2023-06-18T16:24:22.969 INFO:tasks.workunit.client.0.smithi158.stdout:Mismatch at snap3: 715e1dfc10addb39b735fe43ccd3041a != 42ab964775ffd14ac0fd11edaf5fcd0e
./7307029/teuthology.log:2023-06-18T16:31:28.989 INFO:tasks.workunit.client.0.smithi156.stdout:Mismatch at snap15: 557eaccb3aab0bb3ff807ccffe5a3d46 != 8e8e58c6ca90973f9af60e6e4335825a
./7307032/teuthology.log:2023-06-18T16:26:58.291 INFO:tasks.workunit.client.0.smithi186.stdout:Mismatch at snap3: ca176cc64a46b245a16bad8dd6e2ea4c != efc31805da66d71647d6a36332d1d8ef
./7307024/teuthology.log:2023-06-18T16:28:57.122 INFO:tasks.workunit.client.0.smithi083.stdout:Mismatch at snap7: ac3ff811fc0ff885a04871dac14563ef != 8aeb1c5aa73b1fedc26a08f964802f2d
./7307016/teuthology.log:2023-06-18T16:38:56.175 INFO:tasks.workunit.client.0.smithi153.stdout:Mismatch at snap51: 2b5ff0101b4329ffa4dcb9c47b256316 != 9eb2276a8f3b0324e656ada20040058c
./7307021/teuthology.log:2023-06-18T16:20:40.311 INFO:tasks.workunit.client.0.smithi184.stdout:Mismatch at snap20: 92a5f4d08659b2557e10e51c77162d2d != b2c08fd4734e76078043182297b56038
./7307015/teuthology.log:2023-06-18T16:18:18.540 INFO:tasks.workunit.client.0.smithi188.stdout:Mismatch at snap6: 57c60c5dba999d733a76a0efe383444b != 2b9d0f0b43965625a3a324bdbfd71e8b
./7307031/teuthology.log:2023-06-18T16:32:06.974 INFO:tasks.workunit.client.0.smithi191.stdout:Mismatch at snap7: 5280ccae31d72e5da9b33cd472cd64c9 != 6a8cf43c0cf79f94e33f1baa3e7211ba
./7307017/teuthology.log:2023-06-18T16:25:11.592 INFO:tasks.workunit.client.0.smithi160.stdout:Mismatch at snap3: dfba7cfcf40a574b01721cc95243a59b != 862f8985f3d1ce26960a1dedf5b4c8b3
./7307014/teuthology.log:2023-06-18T16:15:35.240 INFO:tasks.workunit.client.0.smithi169.stdout:Mismatch at snap3: e775bccc6d46cff555241d8d837913b3 != 77872c14f9b1ff1363ac7c684b2498cc
./7307027/teuthology.log:2023-06-18T16:29:43.804 INFO:tasks.workunit.client.0.smithi135.stdout:Mismatch at snap20: 803cf5dc88e7609c32dcfd79231cd2e0 != 4ea09f5f513d7011eeb14bbf848f3873
./7307028/teuthology.log:2023-06-18T16:28:57.533 INFO:tasks.workunit.client.0.smithi123.stdout:Mismatch at snap5: c148ba44917474da9c1b15ffa7eb4d4a != ab3e0398c81dd6b580469d7954be44e7
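
For anyone who wants to tabulate results like the ones above, here is a small Python sketch that parses lines in this grep output format. The regular expression is an assumption based only on the lines shown here. The two checksums are labelled `left` and `right` because the log does not say which side is the source.

```python
import re

# Pattern inferred from the log lines above; adjust if the format differs.
MISMATCH_RE = re.compile(
    r"^\./(?P<job>\d+)/teuthology\.log:.*"
    r"Mismatch at snap(?P<snap>\d+): "
    r"(?P<left>[0-9a-f]{32}) != (?P<right>[0-9a-f]{32})$"
)


def parse_mismatches(lines):
    """Return (job id, snapshot number, left md5, right md5) per matching line."""
    results = []
    for line in lines:
        m = MISMATCH_RE.match(line.strip())
        if m:
            results.append((m["job"], int(m["snap"]), m["left"], m["right"]))
    return results


sample = [
    "./7307030/teuthology.log:2023-06-18T16:28:19.914 INFO:tasks.workunit.client.0.smithi139.stdout:Mismatch at snap3: d358d561cf41350f369002850a42449f != 3b42e91d634ad87e912cc3627e55df76",
    "./7307016/teuthology.log:2023-06-18T16:38:56.175 INFO:tasks.workunit.client.0.smithi153.stdout:Mismatch at snap51: 2b5ff0101b4329ffa4dcb9c47b256316 != 9eb2276a8f3b0324e656ada20040058c",
]
parsed = parse_mismatches(sample)
earliest_snap = min(snap for _, snap, _, _ in parsed)
```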

After kernel:

https://pulpito.ceph.com/dis-2023-06-18_16:12:28-krbd-main-wip-exclusive-lock-snapc-default-smithi/

@idryomov
Contributor Author

RBD_DEVICE_TYPE: "nbd"

Before:

https://pulpito.ceph.com/dis-2023-06-18_17:52:23-rbd-main-distro-default-smithi/

./7307108/teuthology.log:2023-06-18T18:27:06.206 INFO:tasks.workunit.client.0.smithi089.stdout:Mismatch at snap3: 8ea764987266204c434702fd5bce915f != 6d4f074e4c78d770b35ec22300cb7129
./7307104/teuthology.log:2023-06-18T18:30:26.235 INFO:tasks.workunit.client.0.smithi119.stdout:Mismatch at snap6: 31d22dd282f1751db7142b8d92fb529f != 63ee0f7c1afd0aa42e723f22b1dca785
./7307102/teuthology.log:2023-06-18T18:30:12.247 INFO:tasks.workunit.client.0.smithi191.stdout:Mismatch at snap6: debc6860509f07b24e030627b4f9e7b6 != df23d5c121a27654017c15928fa07640
./7307111/teuthology.log:2023-06-18T18:31:56.439 INFO:tasks.workunit.client.0.smithi123.stdout:Mismatch at snap5: 8a708a20784efb4de1ffc6d640581ff6 != befcc5e4e0c9608641e6bde936d27841
./7307109/teuthology.log:2023-06-18T18:23:50.548 INFO:tasks.workunit.client.0.smithi098.stdout:Mismatch at snap4: 5ea4a3ea77aa3e8ebbd7bd719a1ce640 != cb0a219c222ead14fc0a8e7207434363
./7307101/teuthology.log:2023-06-18T18:28:56.184 INFO:tasks.workunit.client.0.smithi120.stdout:Mismatch at snap7: 2426aa01a2115e542f02ef1ba316f16c != 1bc68b682c10ff36dcf97a3d9ca1d0b0
./7307112/teuthology.log:2023-06-18T18:33:21.719 INFO:tasks.workunit.client.0.smithi196.stdout:Mismatch at snap11: 52bbfa9746b615b5d9970267c56fd70a != 12f2e15633fa65578dd4ef6e8b9219e1
./7307117/teuthology.log:2023-06-18T18:28:13.089 INFO:tasks.workunit.client.0.smithi178.stdout:Mismatch at snap6: dd14c8acb981f2263238cfdadbf75b16 != 2ca7daa4ea1c73634cf6a66f5f2ce643
./7307120/teuthology.log:2023-06-18T18:35:16.190 INFO:tasks.workunit.client.0.smithi202.stdout:Mismatch at snap5: 4ac9c5ab6355749f2d0e80c0c47a5ab5 != ef521bf00cb8a92c0d4cb4b919f50ef6
./7307115/teuthology.log:2023-06-18T18:32:59.943 INFO:tasks.workunit.client.0.smithi193.stdout:Mismatch at snap4: d67fde6fc32faa14cfe46b8bb247c480 != 775ff36d06dea054cf9c7da9ab3e7a56
./7307107/teuthology.log:2023-06-18T18:28:19.922 INFO:tasks.workunit.client.0.smithi184.stdout:Mismatch at snap4: 91def9a27d39125e4d92ea207914263d != 9d4a497c917714f2f4b031abc8b9e19e
./7307116/teuthology.log:2023-06-18T18:34:13.517 INFO:tasks.workunit.client.0.smithi170.stdout:Mismatch at snap3: cae20059878c32b3fcf325eb0c82319e != 4e108299ed1e9882d4ce8d759f546095
./7307105/teuthology.log:2023-06-18T18:24:31.136 INFO:tasks.workunit.client.0.smithi188.stdout:Mismatch at snap2: 0b90b92b59f87b6536f495d5905b2cd9 != f28bc8774b41872ed5bd952497945bf3
./7307106/teuthology.log:2023-06-18T18:22:32.126 INFO:tasks.workunit.client.0.smithi153.stdout:Mismatch at snap4: 5728052494473922ec28269a42a30284 != 30c055148c677ff33817cda9a5c26bf7
./7307110/teuthology.log:2023-06-18T18:32:13.375 INFO:tasks.workunit.client.0.smithi114.stdout:Mismatch at snap13: bde0e1e04ee57124942e1fe5294893ac != b944485e518ab4502710dfc685ff56e7
./7307118/teuthology.log:2023-06-18T18:34:11.124 INFO:tasks.workunit.client.0.smithi154.stdout:Mismatch at snap4: a54bbb2f4e2dbef8d8803ad418251ff9 != 18fd53bd2ba6490370e3b62e737bc71c
./7307103/teuthology.log:2023-06-18T18:30:33.334 INFO:tasks.workunit.client.0.smithi169.stdout:Mismatch at snap19: 55ec7381931a4d35ea5faaa9a95cf566 != aa22129d068f042dae8f95fea75069fe
./7307119/teuthology.log:2023-06-18T18:35:23.421 INFO:tasks.workunit.client.0.smithi190.stdout:Mismatch at snap11: aa5ed3bd1ca6c2d2ae6e07d26f78f7ef != d7a7edf9d2293604f69b7d17fc2b6921
./7307113/teuthology.log:2023-06-18T18:34:35.986 INFO:tasks.workunit.client.0.smithi155.stdout:Mismatch at snap6: d3afea6790f488ae48918f8d7b2e1cd4 != 7a75d5d653ab950a3f2da6ea69affe4f
./7307114/teuthology.log:2023-06-18T18:34:32.848 INFO:tasks.workunit.client.0.smithi151.stdout:Mismatch at snap4: 03243b67af0d017a45609d33ecd2b4ca != 4473e1841cd52b85f586fc101ef2a39a

After:

https://pulpito.ceph.com/dis-2023-06-18_17:53:28-rbd-wip-61616-distro-default-smithi/

@idryomov
Contributor Author

jenkins test api

Contributor

@trociny trociny left a comment

LGTM

@idryomov
Contributor Author

I added more validation to the reproducer and encountered a side issue which I'm punting on for now, see the comment in the script. Example failure:

2023-06-19T22:38:25.772 INFO:tasks.workunit.client.0.smithi101.stderr:+ rbd object-map check f51d1fde-471c-4cac-8869-e417c8e320d5-dst@snap11
2023-06-19T22:38:26.017 INFO:tasks.workunit.client.0.smithi101.stderr:2023-06-19T22:38:26.016+0000 7f5d25a5f700 -1 librbd::ObjectMapIterateRequest: object map error: object rbd_data.1212314ad85a.000000000000005b marked as 1, but should be 3
Object Map Check: 99% complete...2023-06-19T22:38:26.213+0000 7f5d25a5f700 -1 librbd::object_map::InvalidateRequest: 0x7f5d0401d5f0 invalidating object map in-memory
2023-06-19T22:38:26.215 INFO:tasks.workunit.client.0.smithi101.stderr:2023-06-19T22:38:26.213+0000 7f5d25a5f700 -1 librbd::object_map::InvalidateRequest: 0x7f5d0401d5f0 invalidating object map on-disk
2023-06-19T22:38:26.217 INFO:tasks.workunit.client.0.smithi101.stderr:2023-06-19T22:38:26.216+0000 7f5d26260700 -1 librbd::object_map::InvalidateRequest: 0x7f5d0401d5f0 should_complete: r=0
2023-06-19T22:38:26.222 INFO:tasks.workunit.client.0.smithi101.stderr:+ local flags
2023-06-19T22:38:26.223 INFO:tasks.workunit.client.0.smithi101.stderr:++ rbd info f51d1fde-471c-4cac-8869-e417c8e320d5-dst@snap11
2023-06-19T22:38:26.223 INFO:tasks.workunit.client.0.smithi101.stderr:++ grep 'flags: '
2023-06-19T22:38:26.266 INFO:tasks.workunit.client.0.smithi101.stderr:+ flags=' flags: object map invalid, fast diff invalid'
2023-06-19T22:38:26.266 INFO:tasks.workunit.client.0.smithi101.stderr:+ [[      flags: object map invalid, fast diff invalid =~ object map invalid ]]
2023-06-19T22:38:26.266 INFO:tasks.workunit.client.0.smithi101.stderr:+ echo 'Object map invalid at f51d1fde-471c-4cac-8869-e417c8e320d5-dst@snap11'
2023-06-19T22:38:26.267 INFO:tasks.workunit.client.0.smithi101.stdout:Object map invalid at f51d1fde-471c-4cac-8869-e417c8e320d5-dst@snap11
2023-06-19T22:38:26.268 INFO:tasks.workunit.client.0.smithi101.stderr:+ exit 1
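
The validation step visible in the trace above (run `rbd object-map check`, then look for `object map invalid` in the `flags:` line of `rbd info`) could be expressed in Python roughly as follows. This is a sketch, not the workunit's code: the function names are hypothetical, and `check_snapshot` assumes the `rbd` CLI is on `PATH` and a cluster is reachable.

```python
import subprocess


def parse_flags(rbd_info_output):
    """Return the flags listed on the 'flags:' line of `rbd info` text."""
    for line in rbd_info_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("flags:"):
            value = stripped[len("flags:"):].strip()
            return [f.strip() for f in value.split(",") if f.strip()]
    return []


def object_map_is_invalid(rbd_info_output):
    return "object map invalid" in parse_flags(rbd_info_output)


def check_snapshot(spec):
    """Run the two commands from the trace; needs the rbd CLI and a cluster."""
    subprocess.run(["rbd", "object-map", "check", spec], check=True)
    info = subprocess.run(["rbd", "info", spec], check=True,
                          capture_output=True, text=True).stdout
    return not object_map_is_invalid(info)
```

The parsing is kept separate from the CLI calls so that it can be exercised on captured output, such as the `flags:` line shown in the log, without a running cluster.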

https://pulpito.ceph.com/dis-2023-06-19_21:33:07-rbd-wip-61616-distro-default-smithi/

@idryomov
Contributor Author

Reruns on the mergeable version (same set of jobs as above):

RBD_DEVICE_TYPE: "krbd"

https://pulpito.ceph.com/dis-2023-06-20_00:10:40-krbd-main-wip-exclusive-lock-snapc-default-smithi/

RBD_DEVICE_TYPE: "nbd"

https://pulpito.ceph.com/dis-2023-06-19_23:47:15-rbd-wip-61616-distro-default-smithi/

The current version of the continuous export-diff workunit is pretty
useless:

- "rbd bench" writes the same byte (0xff) over and over again, so
  almost all checksumming is in vain
- snapshots are taken in a steady state (i.e. not under I/O), so no
  race conditions can get exposed
- even with these caveats, it's not wired up into the suite

Redo this workunit to be a reliable reproducer for the issue fixed
in the previous commit and wire it up for both krbd and rbd-nbd.
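
The first point above, that writing the same byte everywhere defeats checksumming, can be demonstrated with a short Python sketch. It is a simulation under stated assumptions, not the workunit: objects are modelled as byte strings, `simulate` is a hypothetical helper, and a "lost" update means the destination keeps the old contents of one object.

```python
import hashlib
import os

OBJ_SIZE = 4096


def md5_of(objects):
    """MD5 over the concatenation of all objects, as a hex string."""
    h = hashlib.md5()
    for data in objects:
        h.update(data)
    return h.hexdigest()


def simulate(make_payload, lost_objno=1, nobjs=4):
    """Overwrite every object, drop one overwrite on the destination,
    and report whether the checksums still (wrongly) match."""
    old = [make_payload() for _ in range(nobjs)]
    new = [make_payload() for _ in range(nobjs)]
    src = list(new)
    dst = list(new)
    dst[lost_objno] = old[lost_objno]   # this object's delta was missed
    return md5_of(src) == md5_of(dst)


constant_hides_loss = simulate(lambda: b"\xff" * OBJ_SIZE)
random_hides_loss = simulate(lambda: os.urandom(OBJ_SIZE))
```

With constant `0xff` payloads the lost update goes unnoticed because old and new data are identical. With random payloads the checksums differ, which is what a reproducer needs.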

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
@idryomov
Contributor Author

jenkins test make check
