
mgr: prometheus fix metadata labels #21557

Merged
merged 2 commits into ceph:mimic on May 4, 2018

Conversation

jan--f
Contributor

@jan--f jan--f commented Apr 20, 2018

I changed some label names to make it easier to correlate metrics in prometheus. Most notably this changes the id labels in the metadata metrics to ceph_daemon, as well as the content of these labels to correspond with the values of the daemon metrics (e.g. ceph_osd_metadata{ceph_daemon="osd.1"}). While this adds a bit of redundant information it makes it much easier to correlate metrics in prometheus.

I also renamed the instance label in the disk_occupation metric and updated the docs on how to correlate metrics with (imho) less complex instructions.
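With a shared ceph_daemon label, metadata can be joined onto daemon metrics directly with PromQL vector matching. A hypothetical sketch (assuming ceph_osd_metadata carries a ceph_version label, as it does in this exporter):

```promql
# Attach the ceph_version label from the metadata series to each OSD's
# up/down state; group_left copies the extra label from the right side.
ceph_osd_up * on (ceph_daemon) group_left (ceph_version) ceph_osd_metadata
```

Without a common label name, this kind of join would first need a label_replace on one side.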

This commit makes label names that are supposed to be used for matching
consistent. Prometheus cannot match on, say, osd_metadata{id="1"} and
osd_up{ceph_daemon="osd.1"}. We want to use the label ceph_daemon="osd.1" on
all metrics.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Contributor

@b-ranto b-ranto left a comment


Could you add the ceph_daemon label instead of replacing 'id' labels with it?
(I am ok with replacing 'instance' label with ceph_daemon in disk_occupation).

@jan--f
Contributor Author

jan--f commented Apr 20, 2018

@b-ranto may I ask what the motivation for having both labels is (with somewhat redundant values)?

@b-ranto
Contributor

b-ranto commented Apr 20, 2018

It is nice to have a way to show the user the actual id like 0 instead of osd.0 in grafana without doing any label_replaces.

@jan--f
Contributor Author

jan--f commented Apr 20, 2018

I'd be more inclined to use either label_replace or a relabel_config. This keeps the exporter slimmer and keeps the redundancy out of the prometheus DB, at least in the label_replace case.
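As a sketch of the label_replace option mentioned here: a numeric id can be recovered from ceph_daemon at query time, so the exporter does not need to emit a separate id label.

```promql
# Derive id="1" from ceph_daemon="osd.1" at query time.
# label_replace(vector, dst_label, replacement, src_label, regex)
label_replace(ceph_osd_metadata, "id", "$1", "ceph_daemon", "osd\\.(.*)")
```

The same expression can be used inside a Grafana panel query to display bare ids.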

Also, it looks like we might get a grafana feature soon that allows editing label values. See grafana/grafana#7389 (comment)

@b-ranto
Contributor

b-ranto commented Apr 20, 2018

Putting the relabel rules into the prometheus config might work fine, though the additional disk/memory footprint of the ~redundancy would be really minimal anyway.

Anyway, I was hoping we could ~stabilize prometheus in luminous before 12.2.5 and that is getting really close to the release. I could add this to my latest back-port if we merged this quickly (ideally today) and I would not mind removing the id labels then. However, there still seems to be some issue with the docs build.

Having the data redundant would keep this mostly backwards compatible with the old version.

I was looking at the state of the grafana PR, but it looks like it has been ~stale for over a year now. :-/


Ceph instance label
Contributor

Why does the "Ceph instance label" section go away? I don't see how the changes here would affect the way stats end up with an "instance" label by default when scraped, which will change (undesirable) depending on which mgr is active

Contributor Author

I removed this section along with the "honor_labels" section. I'd argue that one actually wants the correct instance labels; the information about which mgr is active is otherwise lost.

In queries one can use the =~ label matcher to include all mgr instances. For example:

ceph_osd_up{instance=~".*:9283"}

returns data no matter which mgr is active. In Grafana, dashboards can use this matching style in conjunction with dashboard templates.

Contributor

That query would not be what you wanted in a multi-cluster environment (i.e. multiple clusters sharing one prometheus) -- we need the cluster stats to have some kind of identifier that isn't the mgr ID (those can change too easily) and probably also isn't the FSID (it's long and ugly).

Keeping the mgr hostname out of the labels isn't just about queryability, it's also about avoiding bloating the DB with N different time series for each counter for N mgr daemons.

Contributor Author

Point taken regarding multi-cluster environments. How about I revise this section to document both the instance re-label approach and the additional label approach?

Keeping the mgr hostname out of the labels isn't just about queryability, it's also about avoiding bloating the DB with N different time series for each counter for N mgr daemons.

Do we know this? I'd actually be curious what overhead this incurs. I would argue this shouldn't be an issue, since there is always only one mgr exporting metrics. I would hope that the overhead of, say, three time series of length 1 is negligible compared to one time series of length 3 (given otherwise more or less the same data/labels). As far as I understand, this is exactly what prometheus does well.

Contributor

I haven't looked into it in detail (working from vague recollections of presentations about the data format), but I wouldn't assume that the impact of having 3x time series would be negligible -- there will only be data in any one of them at any given time offset, but some indices may be 3x bigger in any range where multiple mgrs were outputting.

As in my other comment at the end of the PR, I'm kind of unclear on what's preferable about using hostname= rather than instance= here?

@jan--f
Contributor Author

jan--f commented Apr 20, 2018

@b-ranto I understand the desire for stability. I figured I might have a chance to fix this properly before having backwards compatibility issues.

Regarding the doc build, seems like Jenkins had an issue with cloning the ceph repo. Do you guys know the magic command to reschedule this?

@b-ranto
Contributor

b-ranto commented Apr 20, 2018

retest this please

@tchaikov
Contributor

retest this please

@liewegas liewegas added this to the mimic milestone Apr 23, 2018
@jcsp
Contributor

jcsp commented Apr 23, 2018

The "ceph_daemon" part makes sense to me, but I think I'm missing the reasoning for why using "hostname" (and relabelling node_exporter output to add the label) is easier than using "instance" (and letting mgr use honor_labels to populate it). The resulting queries are basically the same, right?

One motivation for using instance rather than an extra hostname label is the case where someone is getting the daemon stats directly from daemons (not via mgr), as we expect some people will at scale -- for those people, their instance labels really will correspond to where the daemon is running, and they will be able to correlate their daemon stats with node_exporter stats without doing any relabelling.

Regarding keeping redundant id labels in addition to ceph_daemon, I think it depends what (if anything) is already in the wild depending on the old format (having an extra label in there for display convenience is not a good enough reason IMHO). @b-ranto is your interest in the specific 12.2.5 release because of a downstream scheduling thing? If so, I'd suggest we make this change and carry a patch downstream, rather than doubling up the labels in perpetuity.

@jan--f
Contributor Author

jan--f commented Apr 23, 2018

The "ceph_daemon" part makes sense to me, but I think I'm missing the reasoning for why using "hostname" (and relabelling node_exporter output to add the label) is easier than using "instance" (and letting mgr use honor_labels to populate it). The resulting queries are basically the same, right?

The rename to "hostname" is certainly arbitrary and I don't have strong feelings about that label name. I was more after a more flexible approach...or rather the documentation of one. The current state in master requires a user to set honor_labels and add a labels stanza to their node_exporter scrape target. If one leaves out the former, prometheus adds the (original) instance label as "exported_instance", which I found confusing (but can certainly live with).
The latter might not be straightforward to add for all deployments. Say a deployment that manages the node_exporter scrape targets using the file_sd_config mechanism with a config like:

[
  {
    "targets":
      [ "node1:9100", "node2:9100", "node3:9100", "node4:9100" ],
    "labels": {}
  }
]

would need to change their approach.

So what I'm really after is to document a prometheus config approach that works for everyone. I'd be happy if we documented both approaches: honor_labels plus a labels config, and using only label_replace.
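For reference, the first approach discussed here could look roughly like the following scrape config sketch (using the ``senta04`` host from the docs example; port numbers are the usual defaults):

```yaml
# Option 1 sketch: honor_labels on the ceph job keeps the instance labels
# the mgr exports, and each node_exporter target gets a matching short
# instance label via an explicit labels stanza.
scrape_configs:
  - job_name: 'ceph'
    honor_labels: true
    static_configs:
      - targets: ['senta04:9283']
  - job_name: 'node'
    static_configs:
      - targets: ['senta04:9100']
        labels:
          instance: 'senta04'
```

The second approach would leave both jobs unlabelled and do all correlation in queries with label_replace.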

Add another option to correlate metrics from the node_exporter and Ceph
using label_replace.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
@jan--f jan--f force-pushed the mgr-prometheus-fix-metadata-labels branch from 8956be0 to 331a826 on April 26, 2018 15:23
@jan--f
Contributor Author

jan--f commented Apr 26, 2018

How's this @jcsp?

Ceph instance label
-------------------
This allows Ceph to export the proper ``instance`` label without prometheus
overwriting it. Without this setting, Prometheus applies an ``instance`` label
Contributor

If we leave honor_labels in, I think it also makes sense to leave the "instance: ceph_cluster" part in below

Contributor Author

Wouldn't this overwrite the instance label exported by the exporter, breaking the example query?

The label is also never referenced again. An example or explanation of what to do with this label would be nice.

@@ -79,34 +79,46 @@ drive statistics, special series are output like this:

::

ceph_disk_occupation{ceph_daemon="osd.0",device="sdd",instance="myhost",job="ceph"}
ceph_disk_occupation{ceph_daemon="osd.0",device="sdd", exported_instance="myhost"}
Contributor

This says exported_instance but the latest code change is using the original plain "instance"

Contributor Author

This is what a user will see without setting honor_labels though. Maybe add a note in the honor_labels section that this setting causes this label transformation?

@@ -145,7 +148,8 @@ Example configuration
---------------------

This example shows a single node configuration running ceph-mgr and
node_exporter on a server called ``senta04``.
node_exporter on a server called ``senta04``. Note that this requires to add the
appropriate instance label to every ``node_exporter`` target individually.
Contributor

We could add the relabel config in here for nodes to generate the shortname instance label by chopping off the end of the default instance label, and remove the explicit per-endpoint entry -- that part is orthogonal to the rest and doesn't seem to have any downsides to me
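The suggestion above could be sketched roughly as follows (hypothetical; it strips only the port from the default ``host:port`` address, and a similar rule could also drop the domain for a true shortname):

```yaml
# Sketch: derive a short instance label from the default scrape address
# instead of listing an explicit instance label per target.
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['senta04:9100']
    relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):\d+'     # capture the host part, drop the port
        target_label: instance
        replacement: '$1'
```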

Contributor Author

Sounds good to me. I'm happy to propose something, maybe in a separate PR though?

@jcsp
Contributor

jcsp commented May 2, 2018

jenkins render docs

@ceph-jenkins
Collaborator

Doc render available at http://docs.ceph.com/ceph-prs/21557/

@b-ranto b-ranto dismissed their stale review May 2, 2018 18:45

We can still make changes like this; the prometheus API is not stable enough yet.

@liewegas liewegas changed the base branch from master to mimic May 3, 2018 18:17
@liewegas
Member

liewegas commented May 3, 2018

@jan--f
Contributor Author

jan--f commented May 4, 2018

I'd be ok merging this. @jcsp, @b-ranto what do you think?

@jcsp
Contributor

jcsp commented May 4, 2018

I'd like to improve the docs a bit further to pick one approach and present it as the standard, but the code changes are definitely ready to go.

@jcsp jcsp merged commit 4f98ff5 into ceph:mimic May 4, 2018
@yangdongsheng
Contributor

yangdongsheng commented Jun 5, 2019

@jcsp @jan--f It was a SURPRISE that the id label disappeared when I upgraded my ceph from 12.2.5 to 12.2.12. We have a service in our system that uses the id label, and it is not working now. I am curious: why did we make this change so urgently even though it breaks backward compatibility?

@b-ranto
Contributor

b-ranto commented Jun 5, 2019

@yangdongsheng The prometheus exporter code was not really stable back then (in the earlier luminous releases). This should be improved now. You can use ceph_daemon instead of id to get data similar to what you would normally get from the id label. This generally simplifies matching in queries a lot.

@yangdongsheng
Contributor

yangdongsheng commented Jun 5, 2019

@b-ranto Thanks for the information. I will get what we want in another way, as you suggested. But I really don't think it's a good idea to change something that is used by applications in an upper layer; that could be a disaster in a production scenario. At least we should keep backward compatibility in luminous until the last version of 12.2.X. Maybe we should be more careful in backporting. :)

@b-ranto
Contributor

b-ranto commented Jun 5, 2019

@yangdongsheng Yeah, I raised a concern about it, but eventually we decided to change it anyway, hoping that no one would be using the old format yet. Obviously, we were wrong. :-/

Things should be better now. We shouldn't be doing any backwards-incompatible changes in stable branches anymore.
