rbd-mirror: improved replication statistics #34408

Merged
merged 8 commits into ceph:master from wip-44727 on Apr 9, 2020

Conversation


@dillaman (Author) commented Apr 3, 2020

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


Jason Dillaman added 7 commits April 3, 2020 13:14
The initial version mimics the existing ProgressContext callback
interface. Later commits will add additional deep-copy unique methods.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
These simple stats will be utilized by rbd-mirror to compute throughput
metrics for snapshot-based mirroring.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
When metrics are incorporated, there might not be a forced status update
if no new data is available to replicate. However, we will want the metrics
to decrease over time.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The free-form journal replay status description is now JSON-encoded. The
"master"/"mirror" designators have been changed to "primary"/"non_primary"
to better align with RBD terminology.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The mirror image status for replaying journal-based images now includes
bytes and entries per second in addition to an estimated number of seconds
until the image is fully synced.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This will make it cleaner and easier to add additional data fields
to the existing JSON replaying status.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The mirror image status for replaying snapshot-based images now includes
bytes per second and per snapshot, in addition to an estimated number of
seconds until the image is fully synced.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
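For context, a minimal sketch (hypothetical names, not the PR's actual code) of how per-second replication metrics like those described in the commits above could be derived from two sampled counters taken at consecutive status updates:

#include <chrono>
#include <cstdint>

// Illustrative sketch only. Two samples of monotonically increasing replay
// counters are turned into bytes/entries per second for the mirror status.
struct ReplayCounters {
  std::chrono::steady_clock::time_point time;
  uint64_t bytes_replayed = 0;
  uint64_t entries_replayed = 0;
};

struct ReplayRates {
  double bytes_per_second = 0;
  double entries_per_second = 0;
};

ReplayRates compute_rates(const ReplayCounters& prev,
                          const ReplayCounters& curr) {
  ReplayRates rates;
  std::chrono::duration<double> elapsed = curr.time - prev.time;
  if (elapsed.count() <= 0) {
    return rates;  // no elapsed time yet -- report zero throughput
  }
  rates.bytes_per_second =
      (curr.bytes_replayed - prev.bytes_replayed) / elapsed.count();
  rates.entries_per_second =
      (curr.entries_replayed - prev.entries_replayed) / elapsed.count();
  return rates;
}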

if (m_entries_behind_master > 0 && entries_per_second > 0) {
  auto seconds_until_synced = round_to_two_places(
    m_entries_behind_master / entries_per_second);
@trociny (Contributor) commented:

Not sure if we want to improve this, but I think it will be an underestimate if the primary image is actively writing at that time. If we had something like new_journal_entries_per_second, then for the case new_journal_entries_per_second << entries_per_second the formula could be

seconds_until_synced = entries_behind_master / entries_per_second * (1 + new_journal_entries_per_second / entries_per_second)

For larger new_journal_entries_per_second it would need to include additional terms, and if new_journal_entries_per_second >= entries_per_second it would simply mean that the sync never completes.

Alternatively, I think we could estimate seconds_until_synced by measuring the rate of change of entries_behind_master. If it is growing, the sync will never complete. If it is decreasing, seconds_until_synced = entries_behind_master / rate.
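A minimal sketch of that rate-based alternative (hypothetical names, not the PR's code): the estimate comes from how quickly entries_behind_master shrinks between two status samples.

#include <chrono>
#include <cstdint>
#include <optional>

// Illustrative sketch only. Estimates seconds until synced from the observed
// rate of change of entries_behind_master between two samples; returns
// std::nullopt when the backlog is growing (the sync would never complete).
struct BacklogSample {
  std::chrono::steady_clock::time_point time;
  uint64_t entries_behind_master = 0;
};

std::optional<double> estimate_seconds_until_synced(const BacklogSample& prev,
                                                    const BacklogSample& curr) {
  std::chrono::duration<double> elapsed = curr.time - prev.time;
  if (elapsed.count() <= 0) {
    return std::nullopt;
  }
  // Positive drain rate means the backlog is shrinking.
  double drain_rate =
      (static_cast<double>(prev.entries_behind_master) -
       static_cast<double>(curr.entries_behind_master)) / elapsed.count();
  if (drain_rate <= 0) {
    return std::nullopt;  // growing or flat backlog -- never syncs at this rate
  }
  return curr.entries_behind_master / drain_rate;
}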

@dillaman (Author) replied:

I thought about that and had put something in initially to track it, but the issue is that it's going to be a highly inaccurate stat since we only discover new master entries every 30 seconds or so. Since we don't get the actual timestamp from the master position commit, we just have to guess that it's X entries per 30 seconds.

auto pending_bytes = bytes_per_snapshot * m_pending_snapshots;
if (bytes_per_second > 0 && m_pending_snapshots > 0) {
  auto seconds_until_synced = round_to_two_places(
    pending_bytes / bytes_per_second);
@trociny (Contributor) commented:

The same here: this estimate does not include new bytes that are going to be added to the image during the sync. If we had something like new_snapshots_per_second, it could be:

seconds_until_synced = bytes_per_snapshot * pending_snapshots / bytes_per_second * (1 + new_snapshots_per_second * bytes_per_snapshot / bytes_per_second)
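A rough sketch of that adjusted estimate (hypothetical names; it only restates the formula above and is not the PR's code):

#include <cstdint>

// Illustrative sketch only. Takes the base ETA from the pending snapshot
// backlog and inflates it by the rate at which new snapshot bytes are expected
// to arrive while the sync is running (first-order correction, valid only
// while new_snapshots_per_second * bytes_per_snapshot << bytes_per_second).
double estimate_seconds_until_synced(double bytes_per_snapshot,
                                     uint64_t pending_snapshots,
                                     double bytes_per_second,
                                     double new_snapshots_per_second) {
  double pending_bytes = bytes_per_snapshot * pending_snapshots;
  double base_eta = pending_bytes / bytes_per_second;
  double incoming_ratio =
      new_snapshots_per_second * bytes_per_snapshot / bytes_per_second;
  return base_eta * (1 + incoming_ratio);
}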

@dillaman (Author) replied Apr 6, 2020:

If we can measure new snapshots in "snapshots per second", I think we have a problem. 😉

Once a new snapshot is added, the stats will update after the replayer rescans the image, since we now track how many snapshots we need to sync plus the average number of bytes we copy per snapshot.

@trociny (Contributor) replied:

Sure, "snapshots per second" is expected to be a float less than 1, e.g. if snapshots are created hourly it would be 1 / 3600. Anyway, thinking more about it, I tend to agree there is no need to overcomplicate this.

@trociny (Contributor) left a review comment:

LGTM

@trociny (Contributor) commented Apr 8, 2020

@dillaman Running this on teuthology, I am observing rbd-mirror crashes [1], though I don't see how they could be related to this PR.

For that run I had 3 crashes, with the same backtrace. Here is one case:

/ceph/teuthology-archive/trociny-2020-04-07_18:48:34-rbd-wip-mgolub-testing-distro-basic-smithi/4932067/remote/smithi057/log/cluster1-client.mirror.1.30307.log.gz

2020-04-08T07:31:44.943+0000 7f68941a1700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-432-g2e5188a/rpm/el8/BUILD/ceph-16.0.0-432-g2e5188a/src/tools/rbd_mirror/InstanceWatcher.cc: In function 'rbd::mirror::InstanceWatcher<ImageCtxT>::~InstanceWatcher() [with ImageCtxT = librbd::ImageCtx]' thread 7f68941a1700 time 2020-04-08T07:31:44.944409+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-432-g2e5188a/rpm/el8/BUILD/ceph-16.0.0-432-g2e5188a/src/tools/rbd_mirror/InstanceWatcher.cc: 340: FAILED ceph_assert(m_requests.empty())

 ceph version 16.0.0-432-g2e5188a (2e5188a30ff76399e4f30f0ad2286465fb96e8cb) octopus (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f689a0ed030]
 2: (()+0x27b24a) [0x7f689a0ed24a]
 3: (rbd::mirror::InstanceWatcher<librbd::ImageCtx>::~InstanceWatcher()+0x145) [0x56262a01b455]
 4: (rbd::mirror::InstanceWatcher<librbd::ImageCtx>::~InstanceWatcher()+0xd) [0x56262a01b49d]
 5: (rbd::mirror::NamespaceReplayer<librbd::ImageCtx>::handle_shut_down_instance_watcher(int)+0x85) [0x562629ff3a45]
 6: (ThreadPool::PointerWQ<Context>::_void_process(void*, ThreadPool::TPHandle&)+0x148) [0x562629fb3148]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0xe64) [0x7f689a1d9ea4]
 8: (ThreadPool::WorkThread::entry()+0x15) [0x7f689a1da705]
 9: (()+0x82de) [0x7f689952a2de]
 10: (clone()+0x43) [0x7f6897a99133]

In the log I traced this pending request to request_id=6:

2020-04-08T07:19:09.595+0000 7f9348484700 10 rbd::mirror::InstanceWatcher: 0x555cc14f0380 prepare_request: instance_id=4435, request_id=6
2020-04-08T07:19:09.595+0000 7f9348484700 10 rbd::mirror::InstanceWatcher: 0x555cc14f0380 handle_sync_start: instance_id=4435, sync_id=11491b28a49a
2020-04-08T07:19:09.595+0000 7f935549e700 10 rbd::mirror::InstanceWatcher: C_NotifyInstanceRequest: 0x555cc44cbec0 finish: r=0
2020-04-08T07:19:09.595+0000 7f935549e700 10 rbd::mirror::InstanceWatcher: C_SyncRequest: 0x555cc46f8b90 finish: r=0
2020-04-08T07:19:09.595+0000 7f935549e700 10 rbd::mirror::InstanceWatcher: 0x555cc14f0380 handle_notify_sync_request: sync_id=11491b28a49a, r=0
2020-04-08T07:19:09.595+0000 7f935549e700 10 rbd::mirror::image_replayer::snapshot::Replayer: 0x555cc724ec00 handle_request_sync: r=0
2020-04-08T07:19:09.595+0000 7f935549e700 10 rbd::mirror::image_replayer::snapshot::Replayer: 0x555cc724ec00 copy_image: remote_snap_id_start=74, remote_snap_id_end=76, local_snap_id_start=77, last_copied_object_number=0, snap_seqs={75=79,76=18446744073709551614}

And for this replayer it seems to have gotten stuck in an unlink_peer request:

2020-04-08T07:19:09.604+0000 7f932f452700 10 rbd::mirror::image_replayer::snapshot::Replayer: 0x555cc724ec00 unlink_peer: remote_snap_id=74

Unfortunately I didn't find logs for the remote context to look deeper there.

I looked at another case: /ceph/teuthology-archive/trociny-2020-04-07_18:48:34-rbd-wip-mgolub-testing-distro-basic-smithi/4932031/remote/smithi079/log/cluster1-client.mirror.1.15789.log.gz, and that time it also got stuck on unlink_peer (pending request_id=13, snapshot::Replayer: 0x55f6ac036f00).

[1] http://pulpito.ceph.com/trociny-2020-04-07_18:48:34-rbd-wip-mgolub-testing-distro-basic-smithi/

@dillaman (Author) commented Apr 8, 2020

That is something I have fixed in my in-progress work. There is a missing lock in Replayer::request_sync that can race with shutdown and cancellation of the sync request. I can pull that one-liner out into a new PR or into this PR to unstick testing.
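For illustration, the general shape of that kind of guard (a minimal standalone sketch with hypothetical names, not the actual one-line fix):

#include <mutex>

// Illustrative sketch only. Request registration and shut-down serialize on
// the same mutex, so a shut-down either cancels an already-registered request
// or is observed before a new request is registered -- they cannot race.
class SyncRequester {
 public:
  void request_sync() {
    std::lock_guard<std::mutex> locker{m_lock};
    if (m_shut_down) {
      return;  // shutting down; do not register a new sync request
    }
    ++m_pending_requests;  // stands in for registering with the InstanceWatcher
  }

  void shut_down() {
    std::lock_guard<std::mutex> locker{m_lock};
    m_shut_down = true;
    m_pending_requests = 0;  // stands in for canceling pending sync requests
  }

 private:
  std::mutex m_lock;
  bool m_shut_down = false;
  unsigned m_pending_requests = 0;
};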

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
@dillaman (Author) commented Apr 8, 2020

@trociny I pushed the sync race fix as the last commit.

@trociny merged commit 978a2e3 into ceph:master on Apr 9, 2020
@dillaman deleted the wip-44727 branch on April 9, 2020 at 11:26
@caisan commented Jul 9, 2020

Hi, are there any plans for backporting this to the N (Nautilus) or M (Mimic) releases?

@trociny (Contributor) commented Jul 9, 2020

There is a plan to backport it to Nautilus. See https://tracker.ceph.com/issues/44727 and related tickets.

@caisan commented Jul 14, 2020

> There is a plan to backport it to Nautilus. See https://tracker.ceph.com/issues/44727 and related.

thanks.
