Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mgr: common interface for TSDB modules #17735

Merged
merged 36 commits into from Oct 2, 2017

Conversation

Projects
None yet
7 participants
@jcsp
Copy link
Contributor

commented Sep 14, 2017

This builds on #16999 and #16990

The idea is to have a common selection of perf counters output by both TSDB plugins, where the "priority" of the perf counter is what's used to decide how much to output. This goes along with #16699, in which the priority is used to determine which counters are transmitted from daemons to ceph-mgr.

To make the priority selection make sense, it was also necessary to go and set some priorities inside the mon and OSD. These should match the counters that were already selected in the prometheus module's default list.

I have removed the influx module's ability to directly specify a list of perf counters -- this doesn't really make sense when perf counters are being selected by priority further down the stack. Adding a named perf counter in the influx module's config wouldn't cause it to be included in the reports from the daemon to mgr, so the explicit naming would only be useful for down-selecting the counters, which shouldn't really be necessary as the counters at the default PRIO_USEFUL level aren't too numerous for a modern TSDB to handle.

I also removed the influx module's "cluster" output mode where it summed the perf counters across all OSDs, on the basis that this kind of reduction is usually the TSDB's job rather than the collector's job, and it's not necessarily simple to do automatically for all counters -- for example would someone really want their rocksdb counters from mons summed with the rocksdb counters from OSDs, or similar?

While I was at it, I also switched the influx module's config to use the normal mgr module config mechanism rather than a separate file, added a self test command to the prometheus module, added an automated test to call the modules' self test commands.

@jcsp

This comment has been minimized.

Copy link
Contributor Author

commented Sep 14, 2017

@jan--f @mhdo2 how do you feel about this?

@jcsp

This comment has been minimized.

Copy link
Contributor Author

commented Sep 14, 2017

@jecluis would you mind reviewing the part where I set priorities on the mon perf counters?

@jcsp

This comment has been minimized.

Copy link
Contributor Author

commented Sep 14, 2017

@jdurgin would you mind reviewing the part where I set priorities on the OSD perf counters?

@tchaikov tchaikov self-requested a review Sep 14, 2017

@jcsp

This comment has been minimized.

Copy link
Contributor Author

commented Sep 14, 2017

(need to fix unittest_perf_counters)

@mhdo2

This comment has been minimized.

Copy link
Contributor

commented Sep 14, 2017

@jcsp Hi, I think the changes look fine but I do have some concerns with the config keys. We want the influx module to be able to run on multiple hosts, but this version gets the config keys for only one host at a time. I was working on changing the module to use the config keys instead of the config file and be able to run on multiple hosts at once. As of right now, there are still some bugs I need to work out to get the module fully working.

@jcsp

This comment has been minimized.

Copy link
Contributor Author

commented Sep 14, 2017

this version gets the config keys for only one host at a time

I'm not sure what you mean -- the module runs on whichever ceph-mgr daemon is active, and it will see the same config keys (they are stored in the mon) from wherever it runs.

@jan--f

This comment has been minimized.

Copy link
Member

commented Sep 15, 2017

@jcsp Looks good!

@mhdo2

This comment has been minimized.

Copy link
Contributor

commented Sep 15, 2017

@jcsp Sorry that was worded kind of confusing. I just meant that your version of the module can't send data to more than one influx host at once.

@jcsp

This comment has been minimized.

Copy link
Contributor Author

commented Sep 17, 2017

Ah, I see -- yes, sending to multiple servers sounds like a very reasonable capability to add

Additional optional configuration settings are:

:interval: Time between reports to InfluxDB. Default 5 seconds.

This comment has been minimized.

Copy link
@amitkumar50

amitkumar50 Sep 18, 2017

Contributor

NIT: Why 2 spaces are left b/w full stop and Default.
Also after database, port defaults fullstop is missing.

This comment has been minimized.

Copy link
@jcsp

jcsp Sep 18, 2017

Author Contributor

Neither of those things is a problem.

@jcsp jcsp force-pushed the jcsp:wip-mgr-perf-interface branch from 91d793d to 9efddc8 Sep 18, 2017

@jcsp

This comment has been minimized.

Copy link
Contributor Author

commented Sep 18, 2017

Fixed unit test and rebased

@jcsp

This comment has been minimized.

Copy link
Contributor Author

commented Sep 18, 2017

@jecluis @jdurgin this should be good to go if the commits setting priorities in mon/osd look sane to you guys (probably not that interesting/controversial!)

@jecluis

This comment has been minimized.

Copy link
Member

commented Sep 20, 2017

@jcsp If I'm not mistaken, you're exposing the perfcounters for rocksdb, but not for leveldb. Any particular reasoning behind this?

@jecluis
Copy link
Member

left a comment

This looks good to me. Withholding approval until we know for sure leaving out LevelDBStore was by design and not an oversight :)

@tchaikov

This comment has been minimized.

Copy link
Contributor

commented Sep 20, 2017

@jecluis just a wild guess, probably because "mon_keyvaluedb" is by default "rocksdb" now?

@jecluis

This comment has been minimized.

Copy link
Member

commented Sep 20, 2017

@tchaikov we still have leveldb interfaces, and I'm sure we still have people out there using it. And LevelDBStore also has perfcounters. It would make sense to have these included, imo.

pcb.add_u64_counter(l_mon_election_call, "election_call", "Elections started");
pcb.add_u64_counter(l_mon_election_win, "election_win", "Elections won");
pcb.add_u64_counter(l_mon_election_lose, "election_lose", "Elections lost");
pcb.add_u64(l_mon_num_sessions, "num_sessions", "Open sessions", "sess",

This comment has been minimized.

Copy link
@jecluis

jecluis Sep 20, 2017

Member

After speaking with Jan and getting the gist of priorities, I'd say this would be a lot more on the USEFUL side than just on the INTERESTING side.

@jcsp

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2017

Boring reason that I didn't do leveldb -- I was just referencing the list of counters that Jan had selected in his prometheus PR :-)

Will add...

@jcsp jcsp force-pushed the jcsp:wip-mgr-perf-interface branch 2 times, most recently from 8deb571 to 55d9741 Sep 25, 2017

@jcsp

This comment has been minimized.

Copy link
Contributor Author

commented Sep 25, 2017

Added a fix for MgrClient to respect the perf counters prio_adjust properly, previously this was just ignoring it and consequently the mon leveldb/rocksdb bits were getting dropped.

@jcsp jcsp force-pushed the jcsp:wip-mgr-perf-interface branch from 55d9741 to 25aeeff Sep 26, 2017

John Spray added some commits Sep 1, 2017

John Spray
common: used fixed size int for perf counter prio
...to avoid any ambiguity in allowed range and
make clear how to encode it down the wire.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mgr/prometheus: use new get_all_perf_counters interface
Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mgr/prometheus: tag stats by daemon name
Using osd=0 or similar tags was problematic because
daemons of different types have some same-named
counters (e.g. MDS and OSD both have objecter
perf counters).

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mon: set some priorities on perf counters
Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
ceph.in: use PRIO_INTERESTING as daemonperf threshold
Using PRIO_USEFUL as the threshold for what goes into
time series databases.  I'm claiming that we have
more "useful" counters than fit on the screen,
so daemonperf's "a screen's worth" threshold
should be at the "interesting" level.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mgr: define perf counter constants in mgr_module
So that modules can consume perf counter data
intelligently without having to hunt around
in C land for these constants and redefine them.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mgr: omit module list in beacon logging
This is useful in itself, but awkward when dealing
with logs generally, because it means that when you
grep on the name of a module, you get mostly beacon
messages rather than the log messages from the
module.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mgr/influx: revise perf counter handling
- Use new get_all_perf_counters path
- Consequently get counters for all daemons, not just OSD
- Tag stats with ceph_daemon rather than osd_id, as some
  stats appear from more than one daemon type
- Remove summing of perf counters, external TSDB and/or queries
  can do this.
- Remove mgr_id tag: this would change depending on which
  mgr was active, which is certainly not desirable.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mgr/influx: enable self-test without dependencies
The idea of self-test commands is that they're self
contained and just exercise the module's calls
to the Ceph-side.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mgr/influx: remove file-based config
...and also trim down the configuration to what's really
needed.  In general users don't need to pick and choose
metrics.  We could add it back if there was a strong
motivation.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mgr/prometheus: add a self-test command
Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
common: PerfCountersBuilder helper for priorities
Let the caller set a priority as the defaul, to enable them
to create a bunch at a given priority.  This is just a
convenience.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
osd: upgrade a bunch of perf counters to PRIO_USEFUL
These are broadly the OSD-wide IO stats, which happen
to also be the ones that were named in the
prometheus plugin until I changed it to be
priority-based.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mon: elevate priority of many perf counters
We can be quite liberal here, because mons are
small in number.  However, we don't want to expose
KV database counters at this database from OSDs, so
use the prio_adjust mechanism for that.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mgr/prometheus: remove explicit counter list
These have had their priorities bumped up to
USEFUL, so they'll appear in the default
get_all_counters output.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
qa: add mgr module selftest task
The module self test commands give us a chance to
catch any other ceph changes that change something
that a module was relying on reading.

Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
test: update perfcounters test for priority in output
Signed-off-by: John Spray <john.spray@redhat.com>
John Spray
mgr: respect perf counter prio_adjust in MgrClient
This awkwardly involves re-ordering some definitions
in perf_counters.h in order to refer to the prio
names defined in PerfCountersBuilder.

Signed-off-by: John Spray <john.spray@redhat.com>

@jcsp jcsp force-pushed the jcsp:wip-mgr-perf-interface branch from 25aeeff to 8816374 Sep 27, 2017

@jcsp jcsp added the needs-qa label Sep 27, 2017

@tchaikov

This comment has been minimized.

Copy link
Contributor

commented Sep 30, 2017

@jcsp jcsp merged commit 47bfe6c into ceph:master Oct 2, 2017

5 checks passed

Docs: build check OK - docs built
Details
Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details
make check (arm64) make check succeeded
Details

@jcsp jcsp deleted the jcsp:wip-mgr-perf-interface branch Oct 2, 2017

jan--f added a commit to jan--f/ceph that referenced this pull request Oct 5, 2017

Merge pull request ceph#17735 from jcsp/wip-mgr-perf-interface
mgr: common interface for TSDB modules

Reviewed-by: My Do <mhdo@umich.edu>
Reviewed-by: Jan Fajerski <jfajerski@suse.com>
Reviewed-by: John Spray <john.spray@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.