quincy: librbd: Fix local rbd mirror journals growing forever #50159
bdd48e6 to 667df76
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved.
This commit fixes commit 7ca1bab by pushing properly aligned discards back to m_image_extents, if corrected.

If discards are misaligned (off 0, len 4608, gran=4096), they are corrected properly, but only in object_extents and not in m_image_extents. When journal_append_event is triggered, it appends only from m_image_extents and does not know about the alignment fixes. In commit_io_events_extent it will log a message and return without completing the io, since the larger misaligned area was sent to the journal. This in turn breaks rbd journal mirroring, since the local client will wait indefinitely for the commit to complete, which it never does.

This does not affect rbd-mirror in any way, which may be confusing and dangerous since it's only rbd-mirror that updates ceph health, and not the local client.

Setting `rbd_skip_partial_discard = false` under client will restore the pre-7ca1bab behaviour and thus not trigger the bug with journals growing. This will set `rbd_discard_granularity_bytes = 0` internally. This setting is only changed during startup of a client.

Fixes: 7ca1bab
Fixes: https://tracker.ceph.com/issues/57396
Signed-off-by: Josef Johansson <josef@oderland.se>
(cherry picked from commit 21a26a7)

Conflicts:
src/librbd/io/ImageRequest.cc [ commit b2c8882 ("librbd: return area from extents_to_file()") not in quincy ]
src/test/librbd/io/test_mock_ImageRequest.cc [ commit b9a2384 ("librbd: propagate area down to file_to_extents()") not in quincy ]
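To make the (off 0, len 4608, gran=4096) case from the commit message concrete, here is a minimal sketch of granularity alignment: the start of the discard is rounded up and the end is rounded down. The `Extent` struct and `align_discard` helper are hypothetical names for illustration only, not librbd code, and the sketch ignores per-object boundaries.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, simplified stand-in for an (offset, length) extent.
struct Extent {
  uint64_t offset;
  uint64_t length;
};

// Shrink a discard to the granularity-aligned range it fully covers:
// round the start up and the end down to a multiple of `granularity`.
// Returns std::nullopt when no aligned range remains.
std::optional<Extent> align_discard(uint64_t offset, uint64_t length,
                                    uint64_t granularity) {
  assert(granularity > 0);  // granularity 0 would mean "alignment disabled"
  uint64_t start = (offset + granularity - 1) / granularity * granularity;
  uint64_t end = (offset + length) / granularity * granularity;
  if (start >= end) {
    return std::nullopt;  // discard too small to free a whole aligned block
  }
  return Extent{start, end - start};
}
```

With these inputs, `align_discard(0, 4608, 4096)` yields offset 0 and length 4096. If the journal event still records the original 4608 bytes while only 4096 bytes are committed, the two extents no longer match, which is the mismatch the commit message describes.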
Currently nothing triggers the length_modified case in ImageDiscardRequest::prune_object_extents() in isolation. It's only triggered in the DiscardGranularityJournalAppendEnabled test together with the prune_required case, and a bad refactoring could easily break the length_modified logic again.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 34e59c4)

Conflicts:
src/test/librbd/io/test_mock_ImageRequest.cc [ commit b9a2384 ("librbd: propagate area down to file_to_extents()") not in quincy ]
"rbd feature disable" appears to reliably hang if the corresponding remote request is proxied to rbd-nbd (because rbd-nbd happens to own the exclusive lock after a series of blkdiscard calls) [1]. Work around it here by enabling journaling before the image is mapped and disabling it after the image is unmapped.

Also, don't assert on the output of "rbd journal inspect --verbose" having a certain number of entries. This is racy: if the script gets delayed after the last blkdiscard call for some reason, there may be fewer entries present in the journal or none at all.

[1] https://tracker.ceph.com/issues/58740

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit fcfef0a)
667df76 to 9d08098
Rebased to resolve a trivial context conflict in |
Rados suite review: https://pulpito.ceph.com/?branch=wip-yuri4-testing-2023-02-22-0817-quincy Failures, unrelated: Details: |
backport tracker: https://tracker.ceph.com/issues/58765
backport of #49614
parent tracker: https://tracker.ceph.com/issues/57396