rbd-mirror: clean-up unnecessary non-primary snapshots #34496

Merged
merged 10 commits into from Apr 15, 2020

@dillaman dillaman commented Apr 9, 2020

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


Jason Dillaman added 9 commits April 9, 2020 10:00
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This will allow a remote rbd-mirror process to have a snapshot to use for
delta sync operations during failover.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
…mage

A pending refresh could occur after setting the non-primary feature flag but
before the creation of the demotion snapshot. This would prevent the snapshot
from being created and would leave the image in a half-primary state.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Snapshot-based mirroring needs to be able to delete a
demotion snapshot during the unlink process. Previously, these
snapshots were left in place and the read-only error was ignored.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Previously only newly created user snapshots were included in the
non-primary snapshot snap-seq mapping table. However, we need to
retain a full history of the mapping table if we want to be able to
prune non-primary snapshots.

Failovers are a special case since we won't have a valid snap seq mapping
so it will need to be rebuilt. Luckily, both sides should be read-only
in the previous state so we can use the snapshot names to find matches.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
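
The name-matching idea in the commit message above can be sketched roughly as follows. This is only an illustration in Python, not the rbd-mirror implementation: the function name `rebuild_snap_seqs` and the plain `{snap_id: name}` dictionaries are hypothetical, and it assumes user snapshot names are identical on both sides because both images were read-only in the previous state.

```python
def rebuild_snap_seqs(remote_snaps: dict[int, str],
                      local_snaps: dict[int, str]) -> dict[int, int]:
    """Map remote snap ids to local snap ids by matching snapshot names.

    Both arguments are hypothetical {snap_id: snapshot_name} tables.
    Remote snapshots without a same-named local snapshot are skipped.
    """
    local_by_name = {name: snap_id for snap_id, name in local_snaps.items()}
    return {
        remote_id: local_by_name[name]
        for remote_id, name in sorted(remote_snaps.items())
        if name in local_by_name
    }


# Example: snapshot "c" exists only on the remote side, so it gets no mapping.
mapping = rebuild_snap_seqs({4: "a", 7: "b", 9: "c"}, {11: "a", 12: "b"})
print(mapping)
```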
Once a non-primary snapshot is no longer required for syncing, delete it
from the image.

Fixes: https://tracker.ceph.com/issues/44105
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
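
A minimal sketch of what "no longer required for syncing" might mean, written in Python for illustration only. It assumes that only the newest fully copied non-primary snapshot is needed as the base for the next delta sync, which is an assumption made here and may not match the exact rule rbd-mirror applies. The `NonPrimarySnap` type and `prunable` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NonPrimarySnap:
    snap_id: int
    copied: bool  # True once the snapshot is fully synced ("copied" in `rbd snap ls`)


def prunable(snaps: list[NonPrimarySnap]) -> list[int]:
    """Return ids of non-primary snapshots older than the newest fully copied one.

    Assumption: the newest copied snapshot, and anything newer that is still
    being synced, must be kept; everything older can be deleted.
    """
    ordered = sorted(snaps, key=lambda s: s.snap_id)
    copied_ids = [s.snap_id for s in ordered if s.copied]
    if not copied_ids:
        return []  # nothing has finished syncing yet, so keep everything
    newest_copied = copied_ids[-1]
    return [s.snap_id for s in ordered if s.snap_id < newest_copied]


snaps = [NonPrimarySnap(15, True), NonPrimarySnap(17, True),
         NonPrimarySnap(18, True), NonPrimarySnap(19, False)]
print(prunable(snaps))
```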
If a previous remote snapshot was synced but the unlink failed,
ensure we retry the unlink so that the remote can clean up the unused
snapshot.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
dillaman commented Apr 9, 2020

Example of failover/failback image prior to changes:

$ rbd --cluster cluster1 snap ls --all mirror/test
SNAPID  NAME                                                                                           SIZE     PROTECTED  TIMESTAMP                 NAMESPACE                                                                       
    15  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.186e9c1d-059a-4ce6-b649-9ee72e243d23  128 MiB             Thu Apr  9 09:58:37 2020  mirror (non-primary peer_uuids:[]9e6a2730-5e52-4c53-906d-60c10c4124fd:17 copied)
    17  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.348d6300-a508-4191-80ef-9fcf04e3edb8  128 MiB             Thu Apr  9 09:58:38 2020  mirror (demoted peer_uuids:[]9e6a2730-5e52-4c53-906d-60c10c4124fd:18 copied)    
    18  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.1f110610-984e-4b62-a70d-b66aa68bad6e  128 MiB             Thu Apr  9 09:58:48 2020  mirror (non-primary peer_uuids:[]9e6a2730-5e52-4c53-906d-60c10c4124fd:19 copied)
    19  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.18d1396c-3b23-4994-9c83-dde0242d8fbc  128 MiB             Thu Apr  9 09:58:49 2020  mirror (non-primary peer_uuids:[]9e6a2730-5e52-4c53-906d-60c10c4124fd:20 copied)
    20  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.9c86a943-a350-456e-945c-3b2b5677572e  128 MiB             Thu Apr  9 09:58:53 2020  mirror (demoted peer_uuids:[]9e6a2730-5e52-4c53-906d-60c10c4124fd:23 copied)    
    21  .mirror.primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.862e30e0-7118-4c1d-bc3c-7b3491de5549      128 MiB             Thu Apr  9 09:58:59 2020  mirror (primary peer_uuids:[8a773676-1f71-48f1-8c8d-22aeea2e8d67])              
    22  .mirror.primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.a769ee7b-6ecf-41bd-884e-c81bedc49def      128 MiB             Thu Apr  9 09:59:03 2020  mirror (demoted peer_uuids:[8a773676-1f71-48f1-8c8d-22aeea2e8d67])              
    23  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.1595c9c3-0e78-4ed8-8da8-4ee3007d7ce3  128 MiB             Thu Apr  9 09:59:13 2020  mirror (non-primary peer_uuids:[]9e6a2730-5e52-4c53-906d-60c10c4124fd:26 copied)
    24  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.078b0e71-d06e-45cc-ab50-0ef97590a2a9  128 MiB             Thu Apr  9 09:59:14 2020  mirror (non-primary peer_uuids:[]9e6a2730-5e52-4c53-906d-60c10c4124fd:27 copied)
    25  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.091453c0-0fdc-4810-a49b-5ab26ebd3992  128 MiB             Thu Apr  9 09:59:18 2020  mirror (demoted peer_uuids:[]9e6a2730-5e52-4c53-906d-60c10c4124fd:29 copied)    
    27  .mirror.primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.74205fef-c9a9-4f93-896d-d193e4fa3620      128 MiB             Thu Apr  9 09:59:28 2020  mirror (primary peer_uuids:[8a773676-1f71-48f1-8c8d-22aeea2e8d67])              
    29  .mirror.primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.6c9126b2-e330-47f9-8b1e-590ad1ac9c6a      128 MiB             Thu Apr  9 09:59:33 2020  mirror (demoted peer_uuids:[8a773676-1f71-48f1-8c8d-22aeea2e8d67])              
    30  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.09806369-7b4c-4a76-8349-0f090d559c1c  128 MiB             Thu Apr  9 09:59:43 2020  mirror (non-primary peer_uuids:[]9e6a2730-5e52-4c53-906d-60c10c4124fd:33 copied)
    31  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.d82eb214-7499-4bda-bff0-87c7b873dad0  128 MiB             Thu Apr  9 09:59:44 2020  mirror (non-primary peer_uuids:[]9e6a2730-5e52-4c53-906d-60c10c4124fd:34 copied)
$ rbd --cluster cluster2 snap ls --all mirror/test
SNAPID  NAME                                                                                           SIZE     PROTECTED  TIMESTAMP                 NAMESPACE                                                                       
    17  .mirror.primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.e041d9bd-621b-487d-8492-64e3e3f30098      128 MiB             Thu Apr  9 09:58:34 2020  mirror (primary peer_uuids:[e542e297-19c5-45fb-b8e6-23c129b12db1])              
    20  .mirror.primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.d2e16db9-3cfd-40cb-97c8-c0d1ecff5865      128 MiB             Thu Apr  9 09:58:47 2020  mirror (primary peer_uuids:[e542e297-19c5-45fb-b8e6-23c129b12db1])              
    23  .mirror.primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.5661d442-8502-4e42-ad16-6fe6dc4a5a2e      128 MiB             Thu Apr  9 09:58:52 2020  mirror (demoted peer_uuids:[e542e297-19c5-45fb-b8e6-23c129b12db1])              
    24  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.95a8313b-bd45-4e78-b2ac-6d899e6b956a  128 MiB             Thu Apr  9 09:59:02 2020  mirror (non-primary peer_uuids:[]558dbbd6-c0d4-4823-95c1-d744a6d03668:21 copied)
    25  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.cfee66d6-6e81-45d4-8454-5b6def9fadd8  128 MiB             Thu Apr  9 09:59:03 2020  mirror (demoted peer_uuids:[]558dbbd6-c0d4-4823-95c1-d744a6d03668:22 copied)    
    27  .mirror.primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.8725f856-2494-4511-add4-9998089aea0f      128 MiB             Thu Apr  9 09:59:13 2020  mirror (primary peer_uuids:[e542e297-19c5-45fb-b8e6-23c129b12db1])              
    29  .mirror.primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.72c02b33-fce6-488a-ac72-40926733c1cc      128 MiB             Thu Apr  9 09:59:18 2020  mirror (demoted peer_uuids:[e542e297-19c5-45fb-b8e6-23c129b12db1])              
    30  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.b753ab6d-eea5-4d5f-a8e9-b4974deab645  128 MiB             Thu Apr  9 09:59:27 2020  mirror (non-primary peer_uuids:[]558dbbd6-c0d4-4823-95c1-d744a6d03668:26 copied)
    31  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.ffe3f4f7-99d7-4d98-8334-374b278823e7  128 MiB             Thu Apr  9 09:59:28 2020  mirror (non-primary peer_uuids:[]558dbbd6-c0d4-4823-95c1-d744a6d03668:27 copied)
    32  .mirror.non_primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.e13a5465-75fc-4dae-a959-9c03b1943b95  128 MiB             Thu Apr  9 09:59:33 2020  mirror (demoted peer_uuids:[]558dbbd6-c0d4-4823-95c1-d744a6d03668:29 copied)    
    34  .mirror.primary.e1484f32-8ea5-41ce-9696-cb9c4110c8cc.9ecaab57-9d91-4b74-a2cb-52ce7a9ff397      128 MiB             Thu Apr  9 09:59:42 2020  mirror (primary peer_uuids:[e542e297-19c5-45fb-b8e6-23c129b12db1])              

Example of same image after improvements:

$ rbd --cluster cluster1 snap ls --all mirror/test
SNAPID  NAME                                                                                           SIZE     PROTECTED  TIMESTAMP                 NAMESPACE                                                                        
    40  .mirror.primary.31ecfa9a-aa92-4bcd-a14f-34fc61296309.4be93599-88ef-4709-8812-3505fc749e17      128 MiB             Thu Apr  9 08:32:28 2020  mirror (demoted peer_uuids:[68e93f48-dbfa-47bd-8062-6f9027edfbdb])               
    43  .mirror.non_primary.31ecfa9a-aa92-4bcd-a14f-34fc61296309.a745a4c6-d012-4e81-bf21-f57e1d134a3a  128 MiB             Thu Apr  9 08:32:40 2020  mirror (non-primary peer_uuids:[] f0d7ac05-468a-4926-b10d-8d04f4896303:42 copied)
$ rbd --cluster cluster2 snap ls --all mirror/test
SNAPID  NAME                                                                                       SIZE     PROTECTED  TIMESTAMP                 NAMESPACE                                                         
    42  .mirror.primary.31ecfa9a-aa92-4bcd-a14f-34fc61296309.f1f2f45f-1b8d-4e3a-9cf5-5c72c7576e6c  128 MiB             Thu Apr  9 08:32:38 2020  mirror (primary peer_uuids:[a0453180-55a7-4795-ae3e-44117ee125d3])

Note that ".mirror.primary.31ecfa9a-aa92-4bcd-a14f-34fc61296309.4be93599-88ef-4709-8812-3505fc749e17" should also be pruned, but that requires a larger refactoring of rbd-mirror since the replayer never restarts after it is promoted to primary. If the image fails over again, the snapshot will get cleaned up.
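
To compare listings like the ones above without counting rows by hand, something like the following Python sketch could tally mirror snapshots by name prefix and namespace state. It parses only the plain-text column layout shown in this comment; the regular expression and the `tally_mirror_snapshots` helper are assumptions made for illustration, not part of the `rbd` tooling.

```python
import re
from collections import Counter

# Matches one data row of the plain-text `rbd snap ls --all` output shown above.
ROW = re.compile(
    r"^\s*(?P<snap_id>\d+)\s+"
    r"\.mirror\.(?P<kind>primary|non_primary)\.\S+\s+"
    r".*\bmirror \((?P<state>primary|non-primary|demoted) "
)


def tally_mirror_snapshots(listing: str) -> Counter:
    """Count mirror snapshots per (name prefix, namespace state) pair."""
    counts = Counter()
    for line in listing.splitlines():
        m = ROW.match(line)
        if m:  # the header row and command lines do not match and are skipped
            counts[(m.group("kind"), m.group("state"))] += 1
    return counts


# The cluster1 listing after the improvements, with column padding collapsed.
AFTER_CLUSTER1 = """\
SNAPID  NAME  SIZE  PROTECTED  TIMESTAMP  NAMESPACE
    40  .mirror.primary.31ecfa9a-aa92-4bcd-a14f-34fc61296309.4be93599-88ef-4709-8812-3505fc749e17  128 MiB  Thu Apr  9 08:32:28 2020  mirror (demoted peer_uuids:[68e93f48-dbfa-47bd-8062-6f9027edfbdb])
    43  .mirror.non_primary.31ecfa9a-aa92-4bcd-a14f-34fc61296309.a745a4c6-d012-4e81-bf21-f57e1d134a3a  128 MiB  Thu Apr  9 08:32:40 2020  mirror (non-primary peer_uuids:[] f0d7ac05-468a-4926-b10d-8d04f4896303:42 copied)
"""

for key, count in sorted(tally_mirror_snapshots(AFTER_CLUSTER1).items()):
    print(key, count)
```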

dillaman commented Apr 9, 2020

jenkins test make check

dillaman commented Apr 13, 2020

The core dumps were caused by an unrelated race condition that is now tracked in [1]. The other failures are related to the "multiple mirror peers not supported" race, which has since been fixed.

[1] https://tracker.ceph.com/issues/45072

trociny commented Apr 14, 2020

@dillaman I think it is unrelated to this PR, but I still see rbd-mirror crashes on InstanceWatcher shutdown due to FAILED ceph_assert(m_requests.empty()) in both your run [1] and my run [2].

There were 4 crashes. In 3 of them it looked the same as I reported earlier, i.e. image_replayer::snapshot::Replayer appeared to be stuck in unlink_peer. The corresponding logs are:

/a/trociny-2020-04-13_17:36:28-rbd-wip-mgolub-testing-distro-basic-smithi/4951849/remote/smithi122/log/cluster1-client.mirror.1.30013.log.gz
/a/trociny-2020-04-07_18:48:34-rbd-wip-mgolub-testing-distro-basic-smithi/4932051/remote/smithi150/log/cluster1-client.mirror.0.33385.log.gz
/a/jdillaman-2020-04-09_09:42:22-rbd-wip-jd-testing-distro-basic-smithi/4938684/remote/smithi148/log/cluster1-client.mirror.1.15757.log.gz

The fourth crash was with journal-based mirroring, where ImageSync seemed to be stuck in send_prune_sync_points. The log file is /a/jdillaman-2020-04-09_09:42:22-rbd-wip-jd-testing-distro-basic-smithi/4938679/remote/smithi081/log/cluster1-client.mirror.2.51285.log.gz. For the problem request, I see that ImageSync called send_prune_sync_points, then the remove_peer_image notification was received and the sync was canceled, but the cancel did not terminate the ImageSync request.

2020-04-11T01:06:43.589+0000 7f5c8ca4f700 10 rbd::mirror::ImageSync: 0x5601b4322160 handle_flush_sync_point: r=0
2020-04-11T01:06:43.589+0000 7f5c8ca4f700 10 rbd::mirror::ImageSync: 0x5601b4322160 send_prune_sync_points
2020-04-11T01:06:43.589+0000 7f5c8ca4f700 10 rbd::mirror::ImageReplayer: 0x5601b4a7db80 [3/cda5e6c3-dc44-445a-b2db-6bcb8717a165] set_state_description: r=0, desc=bootstrapping, IMAGE_SYNC/PRUNE_SYNC_POINTS
2020-04-11T01:06:43.589+0000 7f5c8ca4f700 15 rbd::mirror::ImageReplayer: 0x5601b4a7db80 [3/cda5e6c3-dc44-445a-b2db-6bcb8717a165] update_mirror_image_status: force=0, state=--
2020-04-11T01:06:43.589+0000 7f5c9f274700 15 rbd::mirror::ImageReplayer: 0x5601b4a7db80 [3/cda5e6c3-dc44-445a-b2db-6bcb8717a165] set_mirror_image_status_update: force=0, state=--
2020-04-11T01:06:43.589+0000 7f5c9f274700 15 rbd::mirror::ImageReplayer: 0x5601b4a7db80 [3/cda5e6c3-dc44-445a-b2db-6bcb8717a165] set_mirror_image_status_update: status={state=up+syncing, description=bootstrapping, IMAGE_SYNC/PRUNE_SYNC_POINTS, last_update=0.000000]}
2020-04-11T01:06:43.589+0000 7f5c9f274700 15 rbd::mirror::MirrorStatusUpdater 0x5601b1dc2a20 set_mirror_image_status: global_image_id=cda5e6c3-dc44-445a-b2db-6bcb8717a165, mirror_image_site_status={state=up+syncing, description=bootstrapping, IMAGE_SYNC/PRUNE_SYNC_POINTS, last_update=0.000000]}
2020-04-11T01:06:43.589+0000 7f5c9f274700 15 rbd::mirror::MirrorStatusUpdater 0x5601b1dc3b00 set_mirror_image_status: global_image_id=cda5e6c3-dc44-445a-b2db-6bcb8717a165, mirror_image_site_status={state=up+syncing, description=bootstrapping, IMAGE_SYNC/PRUNE_SYNC_POINTS, last_update=0.000000]}
2020-04-11T01:06:43.872+0000 7f5c93a5d700 10 rbd::mirror::InstanceWatcher: 0x5601b1108380 handle_notify: notify_id=734439408304, handle=94565291737856, notifier_id=4314
2020-04-11T01:06:43.872+0000 7f5c93a5d700 10 rbd::mirror::InstanceWatcher: 0x5601b1108380 handle_payload: remove_peer_image: instance_id=4314, request_id=408
2020-04-11T01:06:43.872+0000 7f5c93a5d700 10 rbd::mirror::InstanceWatcher: 0x5601b1108380 prepare_request: instance_id=4314, request_id=408
2020-04-11T01:06:43.872+0000 7f5c93a5d700 10 rbd::mirror::InstanceWatcher: 0x5601b1108380 handle_peer_image_removed: global_image_id=cda5e6c3-dc44-445a-b2db-6bcb8717a165, peer_mirror_uuid=5c6a4b9c-fb17-4d0e-97ff-8b6996184ee9
2020-04-11T01:06:43.872+0000 7f5c93a5d700  5 librbd::Watcher: 0x5601b1108380 notifications_blocked: blocked=0
2020-04-11T01:06:43.872+0000 7f5c93a5d700 10 librbd::Watcher::C_NotifyAck 0x5601b41f1f40 C_NotifyAck: id=734439408304, handle=94565291737856
2020-04-11T01:06:43.872+0000 7f5c9f274700 10 rbd::mirror::InstanceReplayer: 0x5601b046de00 remove_peer_image: global_image_id=cda5e6c3-dc44-445a-b2db-6bcb8717a165, peer_mirror_uuid=5c6a4b9c-fb17-4d0e-97ff-8b6996184ee9
2020-04-11T01:06:43.872+0000 7f5c9f274700 10 rbd::mirror::ImageReplayer: 0x5601b4a7db80 [3/cda5e6c3-dc44-445a-b2db-6bcb8717a165] stop: on_finish=0x5601b46a1060, manual=0, desc=
2020-04-11T01:06:43.872+0000 7f5c9f274700 10 librbd::Watcher::C_NotifyAck 0x5601b41f1f40 finish: r=0
2020-04-11T01:06:43.872+0000 7f5c9f274700 10 rbd::mirror::ImageReplayer: 0x5601b4a7db80 [3/cda5e6c3-dc44-445a-b2db-6bcb8717a165] stop: canceling start
2020-04-11T01:06:43.872+0000 7f5c9f274700 10 rbd::mirror::ImageReplayer: 0x5601b4a7db80 [3/cda5e6c3-dc44-445a-b2db-6bcb8717a165] stop: canceling bootstrap
2020-04-11T01:06:43.872+0000 7f5c9f274700 10 rbd::mirror::image_replayer::BootstrapRequest: 0x5601b4aa16c0 cancel: 
2020-04-11T01:06:43.872+0000 7f5c9f274700 10 rbd::mirror::ImageSync: 0x5601b4322160 cancel

[1] http://pulpito.ceph.com/jdillaman-2020-04-09_09:42:22-rbd-wip-jd-testing-distro-basic-smithi/
[2] http://pulpito.ceph.com/trociny-2020-04-13_17:36:28-rbd-wip-mgolub-testing-distro-basic-smithi/
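
For tracing a stuck request in logs like the excerpt above, a small Python sketch can group the logged steps by object pointer, which makes it easy to see that a request logged `send_prune_sync_points` and `cancel` with no completion step afterwards. The line pattern is inferred only from the lines quoted in this comment and is an assumption; other Ceph log lines may not match it. `steps_by_object` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# timestamp, thread id, debug level, component (optional trailing colon),
# object pointer, then the rest of the message.
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<thread>[0-9a-f]+)\s+(?P<level>\d+) "
    r"(?P<component>\S+?):? (?P<obj>0x[0-9a-f]+) (?P<msg>.*)$"
)


def steps_by_object(log_text: str, component: str) -> dict[str, list[str]]:
    """Return the ordered messages logged by each instance of `component`."""
    steps = defaultdict(list)
    for line in log_text.splitlines():
        m = LOG_LINE.match(line.strip())
        if m and m.group("component") == component:
            steps[m.group("obj")].append(m.group("msg"))
    return dict(steps)


# Usage, assuming the log has been decompressed to a local file:
#   with open("cluster1-client.mirror.2.51285.log") as f:
#       for obj, msgs in steps_by_object(f.read(), "rbd::mirror::ImageSync").items():
#           print(obj, "last step:", msgs[-1])
```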

@dillaman
Author

@trociny It's related to that linked tracker ticket above. It's an existing issue that will need to be fixed and backported.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>

@trociny trociny left a comment

LGTM
