mgr/prometheus: add pool crush rule and crush rule metadata #42216
Conversation
Force-pushed from e0d9c5d to 44eddd1
@clwluvw, I think crush_rule is missing
@clwluvw, yes, I mean the crush device class, because without
@k0ste How are you assigning one crush_rule to two different pools with different device classes? 🤔
@clwluvw oops, overhead in abstraction. Anyway, it will be nice to know what
Force-pushed from 547604d to 69d9aeb
@k0ste I've added it as a
@epuertat Can you please help to review this?
@clwluvw probably @tchaikov is your person here! That said, are we sure we really want to report the CRUSH rules every 10-15 seconds? Is this a metric that can somehow be displayed in Grafana? And do you need time-series of that? I think that's more of a metadata thing and you want to know the current value, not the last 30 days' progression, don't you? @p-se @aaSharma14 fyi.
Yes, correct, it's just a metadata metric like ceph_osd_metadata or the other metadata metrics, which helps build graphs with more metadata.
@clwluvw could you please 'draft' what you are trying to get here? For pseudo-static information or data whose temporal progression is not relevant, I'd rather move it to the Ceph-Dashboard (landing page or pools page).
Actually, we need this in Prometheus, not in the dashboard.
@epuertat I'm trying to expose crush rule info in the Prometheus metrics as an info metric, so it can be joined to the pool metrics, for example, to get some extra detail about them.
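For context, the pattern described here is the usual Prometheus "info metric" approach: the series value is pinned to 1 and the interesting data travels as labels, which PromQL can then join onto other series. The sketch below uses the standalone prometheus_client library purely for illustration (the mgr module has its own metric plumbing), and the metric and label names are assumptions, not the final ones from this PR:

import time

from prometheus_client import Gauge, start_http_server

# Hypothetical info metric: the value is always 1, the metadata rides in the labels.
CRUSH_RULE_METADATA = Gauge(
    'ceph_crush_rule_metadata',
    'CRUSH rule metadata (value is a constant 1; the data lives in the labels)',
    ['rule_id', 'rule_name'],
)

# Hypothetical per-pool series that shares a join key with the info metric.
POOL_BYTES_USED = Gauge(
    'ceph_pool_bytes_used',
    'Bytes used per pool',
    ['pool_id', 'crush_rule'],
)

if __name__ == '__main__':
    start_http_server(9999)  # scrape target for local testing
    # Made-up values; a real exporter would read these from the cluster.
    CRUSH_RULE_METADATA.labels(rule_id='0', rule_name='replicated_ssd').set(1)
    POOL_BYTES_USED.labels(pool_id='1', crush_rule='0').set(42 * 1024 ** 3)
    # A PromQL join over the shared key could then look like:
    #   ceph_pool_bytes_used
    #     * on (crush_rule) group_left (rule_name)
    #   label_replace(ceph_crush_rule_metadata, "crush_rule", "$1", "rule_id", "(.*)")
    while True:
        time.sleep(60)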
ping @epuertat :)
I'm getting a Traceback.
Traceback (most recent call last):
File "/home/user/src/ceph/src/pybind/mgr/prometheus/module.py", line 232, in collect
data = self.mod.collect()
File "/home/user/src/ceph/src/pybind/mgr/mgr_util.py", line 753, in wrapper
result = f(*args, **kwargs)
File "/home/user/src/ceph/src/pybind/mgr/prometheus/module.py", line 1238, in collect
self.get_metadata_and_osd_status()
File "/home/user/src/ceph/src/pybind/mgr/mgr_util.py", line 753, in wrapper
result = f(*args, **kwargs)
File "/home/user/src/ceph/src/pybind/mgr/prometheus/module.py", line 888, in get_metadata_and_osd_status
rule['ruleset'],
KeyError: 'ruleset'
It appears that ruleset equals rule_id in newer Ceph versions (and I don't get it returned on the CLI either, on the master branch):
ceph/src/crush/CrushWrapper.cc, lines 3000 to 3015 (ade690b):
encode(crush->rules[i]->len, bl);
/*
 * legacy crush_rule_mask was
 *
 * struct crush_rule_mask {
 *   __u8 ruleset;
 *   __u8 type;
 *   __u8 min_size;
 *   __u8 max_size;
 * };
 *
 * encode ruleset=ruleid, and min/max of 1/100
 */
encode((__u8)i, bl); // ruleset == ruleid
encode(crush->rules[i]->type, bl);
ceph/src/crush/CrushWrapper.cc, lines 3145 to 3149 (ade690b):
__u8 ruleset; // ignore + discard
decode(ruleset, blp);
if (ruleset != i) {
  throw ::ceph::buffer::malformed_input("crush ruleset_id != rule_id; encoding is too old");
}
You might want to simply take out ruleset or set it to rule_id, but as I'm not a C++ developer, someone else might want to add a suggestion here.
Other than that, the PR looks okay to me. It looks like the data for the metadata metric has always been retrieved alongside other data; the PR just puts it into a metric, so performance-wise this shouldn't make a difference.
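On the exporter side, one defensive way to deal with the missing key could look like the sketch below. It is a standalone helper, not the module's actual code (the real label building happens inside get_metadata_and_osd_status()), and the rule-dict shape is assumed from the output of ceph osd crush dump:

from typing import Any, Dict, Iterable, List, Tuple

def crush_rule_labels(rules: Iterable[Dict[str, Any]]) -> List[Tuple[str, str, str]]:
    """Build (rule_id, rule_name, type) label tuples from `osd crush dump` rules.

    Newer Ceph releases no longer report a separate 'ruleset' field (it always
    equals rule_id), so it is simply not emitted here; if it were still wanted,
    rule.get('ruleset', rule['rule_id']) would avoid the KeyError shown above.
    """
    labels = []
    for rule in rules:
        labels.append((
            str(rule['rule_id']),
            rule.get('rule_name', ''),
            str(rule.get('type', '')),
        ))
    return labels

# Example input, shaped like one entry of the "rules" list in `ceph osd crush dump`:
example_rules = [{'rule_id': 0, 'rule_name': 'replicated_rule', 'type': 1, 'steps': []}]
print(crush_rule_labels(example_rules))  # [('0', 'replicated_rule', '1')]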
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
Force-pushed from 69d9aeb to 95be87a
@p-se Thanks for your review. Yes, you are right: I tested it with Pacific v16.2.4, and this change wasn't backported to the pacific branch. As it isn't even printed in the CLI in recent versions, I've removed it from the exporter too.
@clwluvw my main concern about this change is that it exposes no metrics at all (the metric value is constantly set to 1), so the metric is used just to convey almost static metadata.
Yes, I know there are other metadata (label) only metrics (e.g.:
@epuertat It helps when working with pool metrics based on their crush rule. For example, you want to know the pool usage, pool IOPS, or pool throughput based on its crush rule (which maps to a device class in most scenarios), or you want to create a dashboard for your SSD tier and only see the pools that use the SSD disks (crush rule).
@clwluvw and can't we simply label the pools with the labels you want to use for the queries? PromQL and most NoSQL/time-series DBs are extremely efficient when the data is modeled based on the queries. In this very case: What I wouldn't do in any case is turn the Prometheus exporter into a new read-only API gateway for exposing all Ceph internal metadata. That's beyond the purpose of monitoring, and Ceph already provides multiple ways to do that (CLI, Dashboard UI, RESTful API, etc.).
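For reference, the "label the pools directly" alternative would look roughly like the sketch below: the exporter resolves each pool's crush_rule id to a rule name once per scrape and emits it as an extra label on the pool metadata series. The function name and the data shapes (the osd map's pools list and the crush dump's rules list) are assumptions, not the module's actual code:

from typing import Any, Dict, Iterable, List, Tuple

def pool_metadata_labels(
    pools: Iterable[Dict[str, Any]],
    rules: Iterable[Dict[str, Any]],
) -> List[Tuple[str, str, str]]:
    """Build (pool_id, pool_name, crush_rule_name) labels for a pool metadata metric.

    Unknown or missing rule ids fall back to the raw id rendered as a string.
    """
    rule_names = {r['rule_id']: r.get('rule_name', str(r['rule_id'])) for r in rules}
    return [
        (
            str(p['pool']),
            p.get('pool_name', ''),
            rule_names.get(p.get('crush_rule'), str(p.get('crush_rule'))),
        )
        for p in pools
    ]

# Example shapes, loosely following the osd map's pools and `osd crush dump` rules:
pools = [{'pool': 1, 'pool_name': 'rbd', 'crush_rule': 0}]
rules = [{'rule_id': 0, 'rule_name': 'replicated_ssd', 'type': 1}]
print(pool_metadata_labels(pools, rules))  # [('1', 'rbd', 'replicated_ssd')]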
We need device_class, not the rule name 🙂
I don't see any reference to the device class on this PR.
@epuertat Yes, we can put the crush rule metadata into the pool metadata metric directly, but I think making this metadata a dedicated metric can help a lot, because others may need to use it too, for example with the device class (
@clwluvw "a dedicated metric can help a lot because maybe others needed to use it" doesn't sound like a real use case to me. Prometheus is not a general purpose SQL database or API where you just expose things just in case someone needs them... For that you have the Dashboard API or the CLI. Prometheus is a time-series database specifically designed for monitoring/alerting purposes. |
@epuertat Yes, crush rules can change over time. Also, for example, you can see in this doc that metadata metrics are not unusual (https://prometheus.io/docs/practices/naming/ -
@clwluvw I've been talking about this change with @p-se offline. I'll try to sum up my thoughts one last time:
@epuertat It's a metric that is being used for statistics panels. Why shouldn't we have this? Metrics aren't just for alerting; they can be used for statistics panels, as many Ceph metrics and other standard Prometheus exporters do! I don't get your point here, sorry :(
I already told you the purpose: it can be used (at least I want to use it) in monitoring, to see pool metrics based on crush rule (which would be the device class in many cases, because the crush rule maps to the device class
@clwluvw I don't think the PR, as it is, adds value for the whole Ceph community (maybe just for your private use case): there are no visible outcomes from this, no docs, no examples, no tests, nothing. That said, I'm not blocking this: let someone else jump in and give their approval.
@clwluvw I'm struggling to see how this helps with alerts. We'll fire an alert based on POOL_NEARFULL or POOL_FULL - they're the things that get you out of bed, right? The workflow beyond that is to look at the pool and determine action.
So how can we use this metric to improve the escalation?
The code runs osd crush dump at every refresh, and ideally we need to keep latency down. I haven't profiled your code, but just comparing osd crush dump between a 3-OSD cluster and a 34-OSD one, I see a 0.4s to 0.8s execution time.
Before we can merge this, I think we need to understand the impact the call would have on larger clusters.
Make sense?
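As a rough way to reproduce that comparison, the sketch below times the CLI round-trip of ceph osd crush dump. It assumes the ceph CLI is available and that the CLI round-trip is a fair proxy; the mgr module's in-process call avoids the CLI overhead, so treat the numbers only as a ballpark:

import json
import subprocess
import time

def time_crush_dump(samples: int = 5) -> float:
    """Average wall-clock time of `ceph osd crush dump` over a few runs."""
    elapsed = []
    for _ in range(samples):
        start = time.perf_counter()
        out = subprocess.run(
            ['ceph', 'osd', 'crush', 'dump', '--format=json'],
            check=True, capture_output=True,
        )
        elapsed.append(time.perf_counter() - start)
        json.loads(out.stdout)  # make sure we actually got a parseable dump
    return sum(elapsed) / len(elapsed)

if __name__ == '__main__':
    print(f'average: {time_crush_dump():.2f}s')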
@pcuzner It's not only for alerting purposes; if you read my previous comments, it would be used for monitoring purposes, which Prometheus supports, like many other metrics that exist purely for monitoring! Did you do your perf test without this change, too? Because I don't think this change adds any more queries!
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
unstale please
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!
This change adds a ceph_crush_rule_metadata metric and a crush_rule label for the ceph_pool_metadata metric.
Fixes: https://tracker.ceph.com/issues/51563
Signed-off-by: Seena Fallah <seenafallah@gmail.com>