
mgr/rbd_support: recover from rados client blocklisting #49742

Merged
merged 4 commits on May 10, 2023

Conversation

@ajarr ajarr commented Jan 14, 2023

In certain scenarios the OSDs were slow to process RBD requests.
This led to the rbd_support module's RBD client not being able to
gracefully hand over an RBD exclusive lock to another RBD client.
After the condition persisted for some time, the other RBD client
forcefully acquired the lock by blocklisting the rbd_support module's
RBD client, and consequently blocklisted the module's RADOS client. The
rbd_support module stopped working. To recover the module, the entire
mgr service had to be restarted, which reloaded the other mgr modules.

Instead of recovering the rbd_support module from client blocklisting
in a way that is disruptive to other mgr modules, recover the module
automatically without restarting the mgr service. When the client gets
blocklisted, shut down the module's handlers and the blocklisted client,
create a new RADOS client for the module, and start new handlers.

Fixes: https://tracker.ceph.com/issues/56724
Signed-off-by: Ramana Raja rraja@redhat.com
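
A rough way to exercise the recovery flow described above from the command line (a sketch only, not part of this PR; the jq filter over "ceph mgr dump" is an assumption, the PR's qa tests compute CLIENT_ADDR in a similar way):

# Find the address of the rbd_support module's RADOS client (assumes the
# client is listed under "active_clients" in the mgr dump).
CLIENT_ADDR=$(ceph mgr dump |
    jq -r '.active_clients[] | select(.name == "rbd_support") |
           "\(.addrvec[0].addr)/\(.addrvec[0].nonce)"')

# Blocklist it and confirm that the blocklist entry was created.
ceph osd blocklist add "$CLIENT_ADDR"
ceph osd blocklist ls | grep "$CLIENT_ADDR"

# Commands issued right after blocklisting may fail while the module detects
# the blocklisting, shuts down its handlers and blocklisted client, creates a
# new RADOS client, and restarts the handlers; poll until it responds again,
# without restarting the mgr.
for i in $(seq 24); do
    ceph rbd task list && break
    sleep 10
done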

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@ajarr ajarr requested a review from a team as a code owner January 14, 2023 00:42
@ajarr ajarr marked this pull request as draft January 14, 2023 00:42
@ajarr ajarr force-pushed the fix-56724 branch 3 times, most recently from a2531e2 to 0ba629b on January 17, 2023 03:06
@ajarr ajarr changed the title from "mgr/rbd_support: add own RADOS client for MirrorSnapshotScheduleHandler" to "mgr/rbd_support: add own RADOS client for handlers" on Jan 17, 2023
@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@ajarr ajarr commented Apr 18, 2023

@ajarr ajarr commented Apr 18, 2023

@ajarr ajarr commented Apr 18, 2023

jenkins test make check

@ajarr ajarr commented Apr 19, 2023

jenkins test make check arm64

1 similar comment
@ajarr ajarr commented Apr 20, 2023

jenkins test make check arm64

Resolved review threads:
  • src/pybind/mgr/rbd_support/schedule.py (outdated)
  • src/pybind/mgr/rbd_support/module.py (outdated)
  • src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py (outdated)
  • src/pybind/mgr/rbd_support/trash_purge_schedule.py (outdated)
  • qa/workunits/rbd/cli_generic.sh (4 threads, outdated)
  • src/pybind/mgr/rbd_support/task.py
@idryomov idryomov commented May 7, 2023

Overall, I think there is room for improvement in the tests (probably OK to defer to another PR though):

  • for TrashPurgeScheduleHandler and MirrorSnapshotScheduleHandler tests, instead of just checking that both pre- and post-blocklisting schedules show up, it would be good to test that both pre- and post-blocklisting scheduled work actually gets done. For trash purge scheduler, I would suggest creating two pools with an image trashed in each, adding a short (1-2m) schedule for one pool before blocklisting and for another after blocklisting and asserting that trash purge runs after enough time passes. And similarly for mirror snapshot scheduler: a single pool with two images would do as per-image schedules can be added there.
  • for TaskHandler test, queue some (let's say 5) flattens on different images before blocklisting and a sixth flatten after blocklisting and assert that all six flattens complete.
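
A rough sketch of the trash purge scheduler test suggested in the first bullet above (hypothetical pool and image names; timings would need tuning, and CLIENT_ADDR would be obtained from "ceph mgr dump" as in the existing recovery tests):

# Two pools, each with a trashed image.
ceph osd pool create tp1 8 && rbd pool init tp1
ceph osd pool create tp2 8 && rbd pool init tp2
rbd create --size 1G tp1/img1 && rbd trash mv tp1/img1
rbd create --size 1G tp2/img1 && rbd trash mv tp2/img1

# Short schedule for the first pool before blocklisting.
rbd trash purge schedule add --pool tp1 1m

# Blocklist the rbd_support module's RADOS client.
ceph osd blocklist add "$CLIENT_ADDR"

# Schedule for the second pool after blocklisting; the first attempt is
# expected to fail, so retry until the module has recreated its client.
for i in $(seq 24); do
    rbd trash purge schedule add --pool tp2 1m && break
    sleep 10
done

# After enough time passes, assert that both trash purges actually ran.
sleep 180
test -z "$(rbd trash ls tp1)"
test -z "$(rbd trash ls tp2)"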

@ajarr ajarr requested a review from idryomov May 7, 2023 16:12
@ajarr ajarr commented May 7, 2023

@idryomov accidentally requested a re-review, please ignore.

@idryomov idryomov commented May 7, 2023

Current tests appear to be stable with the following fixups:

diff --git a/qa/workunits/rbd/cli_generic.sh b/qa/workunits/rbd/cli_generic.sh
index 46160c8059df..57279d26dcee 100755
--- a/qa/workunits/rbd/cli_generic.sh
+++ b/qa/workunits/rbd/cli_generic.sh
@@ -1503,13 +1503,15 @@ test_perf_image_iostat_recovery() {
     echo "testing recovery of perf handler after module's RADOS client is blocklisted..."
     remove_images
 
-    ceph osd pool create rbd1 8
-    rbd pool init rbd1
-    rbd namespace create rbd1/ns
+    ceph osd pool create rbd3 8
+    rbd pool init rbd3
+    rbd namespace create rbd3/ns
 
-    IMAGE_SPECS=("rbd1/test1" "rbd1/ns/test2")
+    IMAGE_SPECS=("rbd3/test1" "rbd3/ns/test2")
     for spec in "${IMAGE_SPECS[@]}"; do
-        rbd create $RBD_CREATE_ARGS --size 10G $spec
+        # ensure all images are created without a separate data pool
+        # as we filter iostat by specific pool specs below
+        rbd create $RBD_CREATE_ARGS --size 10G --rbd-default-data-pool '' $spec
     done
 
     BENCH_PIDS=()
@@ -1519,7 +1521,7 @@ test_perf_image_iostat_recovery() {
         BENCH_PIDS+=($!)
     done
 
-    test "$(rbd perf image iostat --format json rbd1 |
+    test "$(rbd perf image iostat --format json rbd3 |
         jq -r 'map(.image) | sort | join(" ")')" = 'test1'
 
     # Fetch and blocklist the rbd_support module's RADOS client
@@ -1529,10 +1531,10 @@ test_perf_image_iostat_recovery() {
     ceph osd blocklist add $CLIENT_ADDR
     ceph osd blocklist ls | grep $CLIENT_ADDR
 
-    expect_fail rbd perf image iostat --format json rbd1/ns
+    expect_fail rbd perf image iostat --format json rbd3/ns
     sleep 10
     for i in `seq 24`; do
-        test "$(rbd perf image iostat --format json rbd1/ns |
+        test "$(rbd perf image iostat --format json rbd3/ns |
             jq -r 'map(.image) | sort | join(" ")')" = 'test2' && break
 	sleep 10
     done
@@ -1543,7 +1545,7 @@ test_perf_image_iostat_recovery() {
     wait
 
     remove_images
-    ceph osd pool rm rbd1 rbd1 --yes-i-really-really-mean-it
+    ceph osd pool rm rbd3 rbd3 --yes-i-really-really-mean-it
 }
 
 test_mirror_pool_peer_bootstrap_create() {
@@ -1661,7 +1663,6 @@ test_tasks_recovery() {
     ceph osd blocklist add $CLIENT_ADDR
     ceph osd blocklist ls | grep $CLIENT_ADDR
 
-    test "$(ceph rbd task list)" = "[]"
     expect_fail ceph rbd task add flatten rbd2/clone1
     sleep 10
     for i in `seq 24`; do
@@ -1670,10 +1671,6 @@ test_tasks_recovery() {
     done
     test "$(ceph rbd task list)" != "[]"
 
-    # queue flatten and check that it completes
-    rbd info rbd2/clone1 | grep 'parent: '
-    expect_fail rbd snap unprotect rbd2/img1@snap
-    ceph rbd task add flatten rbd2/clone1
     for i in {1..12}; do
         rbd info rbd2/clone1 | grep 'parent: ' || break
         sleep 10

ajarr added 2 commits May 8, 2023 13:39
... requests to be completed.

Signed-off-by: Ramana Raja <rraja@redhat.com>
Signed-off-by: Ramana Raja <rraja@redhat.com>
ajarr added 2 commits May 8, 2023 16:45
In certain scenarios the OSDs were slow to process RBD requests.
This led to the rbd_support module's RBD client not being able to
gracefully hand over an RBD exclusive lock to another RBD client.
After the condition persisted for some time, the other RBD client
forcefully acquired the lock by blocklisting the rbd_support module's
RBD client, and consequently blocklisted the module's RADOS client. The
rbd_support module stopped working. To recover the module, the entire
mgr service had to be restarted, which reloaded the other mgr modules.

Instead of recovering the rbd_support module from client blocklisting
in a way that is disruptive to other mgr modules, recover the module
automatically without restarting the mgr service. When the client gets
blocklisted, shut down the module's handlers and the blocklisted client,
create a new RADOS client for the module, and start new handlers.

Fixes: https://tracker.ceph.com/issues/56724
Signed-off-by: Ramana Raja <rraja@redhat.com>
... after the module's RADOS client is blocklisted.

Signed-off-by: Ramana Raja <rraja@redhat.com>
@ajarr ajarr commented May 8, 2023

> Overall, I think there is room for improvement in the tests (probably OK to defer to another PR though):
>
>   • for TrashPurgeScheduleHandler and MirrorSnapshotScheduleHandler tests, instead of just checking that both pre- and post-blocklisting schedules show up, it would be good to test that both pre- and post-blocklisting scheduled work actually gets done. For trash purge scheduler, I would suggest creating two pools with an image trashed in each, adding a short (1-2m) schedule for one pool before blocklisting and for another after blocklisting and asserting that trash purge runs after enough time passes. And similarly for mirror snapshot scheduler: a single pool with two images would do as per-image schedules can be added there.
>   • for TaskHandler test, queue some (let's say 5) flattens on different images before blocklisting and a sixth flatten after blocklisting and assert that all six flattens complete.

Created a tracker ticket for now: https://tracker.ceph.com/issues/59681

@idryomov

No related failures:

https://pulpito.ceph.com/dis-2023-05-08_22:48:48-rbd-wip-dis-testing-distro-default-smithi/
https://pulpito.ceph.com/dis-2023-05-09_11:02:54-rbd-wip-dis-testing-distro-default-smithi/
https://pulpito.ceph.com/dis-2023-05-09_20:53:52-rbd-wip-dis-testing-distro-default-smithi/

This is with #49975 excluded in the last rerun -- it's causing "Exiting scrub checking -- not all pgs scrubbed." errors. Per @neha-ojha the plan is to introduce a more aggressive QoS profile for teuthology tests.
