mgr/dashboard: added iSCSI IOPS/throughput metrics #18653
Conversation
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Fixes: http://tracker.ceph.com/issues/21391 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
image['stats'][s] = self._module.get_rate(
    'tcmu-runner', service_id, perf_key)
image['stats_history'][s] = self._module.get_counter(
    'tcmu-runner', service_id, perf_key)[perf_key]
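For context on what these two calls return, here is a minimal self-contained sketch of the semantics (this is not the real ceph-mgr implementation; the `CounterStore` class and sample data are illustrative): daemons periodically report counter samples, `get_counter` yields the stored history of `(timestamp, value)` samples, and `get_rate` derives a rate from the two most recent samples.

```python
# Illustrative model of ceph-mgr's in-memory perf counter store.
# NOT the real MgrModule API implementation, just a sketch of the
# semantics: daemons periodically report counter samples, and a module
# reads them back by (service type, service id, counter key).

class CounterStore:
    def __init__(self):
        # {(svc_type, svc_id): {counter_key: [(timestamp, value), ...]}}
        self._data = {}

    def add_sample(self, svc_type, svc_id, key, timestamp, value):
        svc = self._data.setdefault((svc_type, svc_id), {})
        svc.setdefault(key, []).append((timestamp, value))

    def get_counter(self, svc_type, svc_id, key):
        # Mirrors the shape used above: a dict mapping the counter key
        # to its list of (timestamp, value) samples.
        samples = self._data.get((svc_type, svc_id), {}).get(key, [])
        return {key: samples}

    def get_rate(self, svc_type, svc_id, key):
        # Rate derived from the two most recent samples.
        samples = self._data.get((svc_type, svc_id), {}).get(key, [])
        if len(samples) < 2:
            return 0.0
        (t0, v0), (t1, v1) = samples[-2], samples[-1]
        return (v1 - v0) / (t1 - t0) if t1 > t0 else 0.0

store = CounterStore()
store.add_sample('tcmu-runner', 'gw0', 'rd_bytes', 100.0, 0)
store.add_sample('tcmu-runner', 'gw0', 'rd_bytes', 105.0, 4096)
print(store.get_rate('tcmu-runner', 'gw0', 'rd_bytes'))  # 819.2 bytes/s
```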
@dillaman Just a question to understand how ceph-mgr works with perf counters. Does _module.get_counter query data that ceph-mgr has already collected in memory (and if so, how is it collected?), or does it query the service (tcmu-runner) directly?
@trociny tcmu-runner will now periodically send perf counters to ceph-mgr if the priority of the counter is high enough. Therefore, this get_counter is reading the latest perf stats held in memory within the ceph-mgr.
@dillaman Could you please point me to the tcmu-runner code that does this? I'd like to look at it as an example.
Also, thinking about this: don't we want librbd to send this instead (probably only when some configuration option is enabled)? I had the impression we were going to do this anyway, to support reports like a top rbd client.
@trociny It actually is librbd that is sending the stats. See commit 7a9d10a in this PR. The top-most image in a clone hierarchy (i.e. the one you open) will send read/write throughput and op stats at priority level PRIO_USEFUL.
The MgrClient will by default send any perf counters at level PRIO_USEFUL or above to ceph-mgr (see line 4135 in 7a23097: `.set_default((int64_t)PerfCountersBuilder::PRIO_USEFUL)`).
The tcmu-runner is registering itself as a service daemon so those stats will automatically be exported (https://github.com/open-iscsi/tcmu-runner/blob/8777084029c7708d8fcdbb79e23f086e730dd2e5/rbd.c#L156)
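A hedged sketch of the priority filtering described above: the priority constants mirror Ceph's PerfCountersBuilder levels, but the filtering function itself is a simplified illustrative model, not the actual MgrClient code.

```python
# Illustrative sketch of how MgrClient decides which perf counters to
# report to ceph-mgr. The priority constants mirror Ceph's
# PerfCountersBuilder levels; the filtering logic here is a simplified
# model, not the real implementation.

PRIO_CRITICAL = 10
PRIO_INTERESTING = 8
PRIO_USEFUL = 5        # default threshold: counters at this level or
PRIO_UNINTERESTING = 2 # above are reported to ceph-mgr
PRIO_DEBUGONLY = 0

def counters_to_report(counters, threshold=PRIO_USEFUL):
    """Return the names of counters whose priority meets the threshold."""
    return [name for name, prio in counters.items() if prio >= threshold]

# e.g. an RBD image's read/write throughput counters at PRIO_USEFUL get
# reported, while a hypothetical debug-only counter does not:
counters = {'rd_bytes': PRIO_USEFUL, 'wr_bytes': PRIO_USEFUL,
            'dbg_internal': PRIO_DEBUGONLY}
print(counters_to_report(counters))  # ['rd_bytes', 'wr_bytes']
```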
@dillaman Thank you for the explanation! I saw the commit that set PRIO_USEFUL, but it did not help me understand how the magic actually happens.
So after 7a9d10a is merged, we could tweak the "rbd_ls" dashboard to add IO metrics, similar to the rbd_iscsi dashboard?
@trociny The ceph-mgr perf gathering design won't really scale well to potentially tens of thousands of images sending it perf stats. I think in the future there will be two approaches for generic RBD image metric gathering: (1) an OSD-based statistical approach to provide an "rbd top"-like tool for the whole cluster and (2) a per-image opt-in system where an admin enables perf gathering for one (or a few) select RBD images which can be sent to the ceph-mgr / dashboard.
lgtm
@ceph-jenkins retest this please