mgr: prometheus fix metadata labels #21557
Conversation
This commit makes label names that are supposed to be used for matching consistent. Prometheus cannot match on, say, osd_metadata{id="1"} and osd_up{ceph_daemon="osd.1"}. We want to use the label ceph_daemon="osd.1" on all metrics. Signed-off-by: Jan Fajerski <jfajerski@suse.com>
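A hedged sketch of the kind of query this consistency enables (the metric names appear in the PR; the join itself is illustrative, not part of this change). With ceph_daemon present on both sides, metadata can be joined onto status metrics directly:

```promql
# Illustrative only: attach osd metadata to the up/down series by matching
# on the now-shared ceph_daemon label.
ceph_osd_up * on (ceph_daemon) group_left ceph_osd_metadata
```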
Could you add the ceph_daemon label instead of replacing 'id' labels with it?
(I am ok with replacing 'instance' label with ceph_daemon in disk_occupation).
@b-ranto may I ask what the motivation for having both labels is (with somewhat redundant values)?
It is nice to have a way to show the user the actual
I'd be more inclined to use either. Also, it looks like we might get a Grafana feature soon that allows editing label values. See grafana/grafana#7389 (comment)
Putting the rules to relabel these into the prometheus config might work fine, though the additional disk/memory footprint of the ~redundancy would be really minimal anyway.

Anyway, I was hoping we could ~stabilize prometheus in luminous before 12.2.5, and that is getting really close to release. I could add this to my latest back-port if we merged this quickly (ideally today), and I would not mind removing the id labels then. However, there still seems to be some issue with the docs build. Having the data redundant would keep this mostly backwards compatible with the old version.

I was looking at the state of the grafana PR, but it looks like it has been ~stale for over a year now. :-/
doc/mgr/prometheus.rst
Ceph instance label
Why does the "Ceph instance label" section go away? I don't see how the changes here affect the way stats end up with an "instance" label by default when scraped; that label will change (undesirably) depending on which mgr is active.
I removed this section along with the "honor_labels" section. I'd argue that one actually wants the correct instance labels; the information about which mgr is active is otherwise lost. In queries one can use the =~ label matcher to include all mgr instances. For example:

ceph_osd_up{instance=~".*:9283"}

returns data no matter which mgr is active. In Grafana, dashboards can use this matching style in conjunction with dashboard templates.
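A sketch of such a dashboard template variable, assuming Grafana's label_values templating function for Prometheus datasources (the variable name "$instance" is an assumption):

```promql
# Template variable query: populate a dashboard variable with mgr instances.
label_values(ceph_osd_up, instance)

# Panel query using the variable:
ceph_osd_up{instance=~"$instance"}
```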
That query would not be what you wanted in a multi-cluster environment (i.e. multiple clusters sharing one prometheus) -- we need the cluster stats to have some kind of identifier that isn't the mgr ID (those can change too easily) and probably also isn't the FSID (it's long and ugly).
Keeping the mgr hostname out of the labels isn't just about queryability, it's also about avoiding bloating the DB with N different time series for each counter for N mgr daemons.
Point taken regarding multi-cluster environments. How about I revise this section to document both the instance re-label approach and the additional label approach?
Keeping the mgr hostname out of the labels isn't just about queryability, it's also about avoiding bloating the DB with N different time series for each counter for N mgr daemons.
Do we know this? I'd actually be curious what overhead this incurs. I would argue it shouldn't be an issue, since there is always only one mgr exporting metrics. I would hope that the overhead of, say, three time series of length 1 is negligible compared to one time series of length 3 (given otherwise more or less the same data/labels). As far as I understand, this is exactly what prometheus does well.
I haven't looked into it in detail (working from vague recollections of presentations about the data format), but I wouldn't assume that the impact of having 3x time series would be negligible -- there will only be data in any one of them at any given time offset, but some indices may be 3x bigger in any range where multiple mgrs were outputting.
As in my other comment at the end of the PR, I'm kind of unclear on what's preferable about using hostname= rather than instance= here?
@b-ranto I understand the desire for stability. I figured I might have a chance to fix this properly before having backwards compatibility issues. Regarding the doc build, seems like Jenkins had an issue with cloning the ceph repo. Do you guys know the magic command to reschedule this?
retest this please
The "ceph_daemon" part makes sense to me, but I think I'm missing why using "hostname" (and relabelling node_exporter output to add the label) is easier than using "instance" (and letting mgr use honor_labels to populate it). The resulting queries are basically the same, right?

On motivation for using … Regarding keeping redundant …
The rename to "hostname" is certainly arbitrary and I don't have strong feelings regarding that label name. I was more after a more flexible approach, or rather the documentation of one. The current state in master requires a user to set honor_labels; users who can't do that would need to change their approach. So what I'm really after is to document a prometheus config approach that works for everyone. I'd be happy if we documented both approaches: honor_labels plus a label config, and using only label_rewrites.
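A rough sketch of the two configurations under discussion (job names, hostnames, and ports are assumptions, not taken from this PR):

```yaml
scrape_configs:
  # Approach 1: honor_labels -- trust the instance label the ceph-mgr
  # exporter sets itself, instead of the Prometheus-assigned host:port.
  - job_name: 'ceph'
    honor_labels: true
    static_configs:
      - targets: ['ceph-mgr-host:9283']

  # Approach 2: plain scrape -- keep the Prometheus-assigned labels and
  # correlate in queries instead (e.g. via label_replace).
  - job_name: 'node'
    static_configs:
      - targets: ['senta04:9100']
```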
Add another option to correlate metrics from the node_exporter and Ceph using label_replace. Signed-off-by: Jan Fajerski <jfajerski@suse.com>
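One possible shape of such a label_replace correlation (the node_exporter metric name and the :9100 port are assumptions for illustration):

```promql
# Strip node_exporter's ":9100" port suffix so its instance label lines up
# with the hostname-style instance used by ceph_disk_occupation, then join
# on instance and device to pull in the owning daemon.
label_replace(node_disk_bytes_written, "instance", "$1", "instance", "(.*):9100")
  * on (instance, device) group_left (ceph_daemon) ceph_disk_occupation
```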
Force-pushed from 8956be0 to 331a826.
How's this @jcsp?
Ceph instance label
-------------------
This allows Ceph to export the proper ``instance`` label without prometheus
overwriting it. Without this setting, Prometheus applies an ``instance`` label
If we leave honor_labels in, I think it also makes sense to leave the "instance: ceph_cluster" part in below
Wouldn't this overwrite the instance label exported by the exporter, breaking the example query?
The label is also never referenced again; an example or explanation of what to do with this label would be nice.
@@ -79,34 +79,46 @@ drive statistics, special series are output like this:

::

-    ceph_disk_occupation{ceph_daemon="osd.0",device="sdd",instance="myhost",job="ceph"}
+    ceph_disk_occupation{ceph_daemon="osd.0",device="sdd",exported_instance="myhost"}
This says exported_instance but the latest code change is using the original plain "instance"
This is what a user will see without setting honor_labels, though. Maybe add a note in the honor_labels section that this setting causes this label transformation?
@@ -145,7 +148,8 @@ Example configuration
---------------------

This example shows a single node configuration running ceph-mgr and
node_exporter on a server called ``senta04``. Note that this requires adding the
appropriate instance label to every ``node_exporter`` target individually.
We could add the relabel config in here for nodes to generate the shortname instance label by chopping off the end of the default instance label, and remove the explicit per-endpoint entry -- that part is orthogonal to the rest and doesn't seem to have any downsides to me
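A sketch of what that relabel rule might look like (the regex and job name are assumptions): derive the instance label from the scrape target address by chopping off the port, so no per-endpoint labels are needed.

```yaml
scrape_configs:
  - job_name: 'node'
    relabel_configs:
      # Chop the ":port" suffix off the default target address to get a
      # host-only instance label, instead of listing labels per endpoint.
      - source_labels: [__address__]
        regex: '(.*):\d+'
        target_label: instance
        replacement: '$1'
    static_configs:
      - targets: ['senta04:9100']
```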
Sounds good to me. I'm happy to propose something, maybe in a separate PR though?
jenkins render docs
Doc render available at http://docs.ceph.com/ceph-prs/21557/
We can still make changes like this; the prometheus api is not stable enough yet.
Clean run, fwiw: http://pulpito.ceph.com/sage-2018-05-03_16:47:35-rados-wip-sage2-testing-2018-05-03-0859-distro-basic-smithi/ Is this ready to merge?
I'd like to improve the docs a bit further to pick one approach and present it as the standard, but the code changes are definitely ready to go.
@yangdongsheng The prometheus exporter code was not really stable back then (in the earlier luminous releases). This should be improved now. You can use
@b-ranto Thanks for the information. I will get what we want in another way, something like you suggested. But I really don't think it's a good idea to change something that would be used by applications in an upper layer, which could be a disaster in a production scenario. At least, we should keep backward compatibility in luminous until the last version of 12.2.X. Maybe we should be more careful in backporting. :)
@yangdongsheng Yeah, I raised a concern about it but eventually we decided to change it anyway, hoping that no one would be using the old format yet. Obviously, we were wrong. :-/ Things should be better now. We shouldn't be doing any backwards incompatible changes in stable branches anymore.
I changed some label names to make it easier to correlate metrics in prometheus. Most notably this changes the id labels in the metadata metrics to ceph_daemon, as well as the content of these labels, to correspond with the values of the daemon metrics (e.g. ceph_osd_metadata{ceph_daemon="osd.1"}). While this adds a bit of redundant information, it makes it much easier to correlate metrics in prometheus. I also renamed the instance label in the disk_occupation metric and updated the docs on how to correlate metrics with (imho) less complex instructions.