Add new Network UPS Tools exporter to the exporters list #1749
brian-brazil merged 1 commit into prometheus:master from DRuggeri:master
Conversation
Signed-off-by: Daniel Ruggeri <druggeri@primary.net>
Could this supersede the apcupsd exporter we currently list, or is it doing something different? Some advice to help improve your exporter: the variable should be part of the metric name, not a label. The various last_scrape metrics aren't very useful; I'd suggest removing them. I'd also suggest not reporting an error with a gauge, but instead failing the scrape. You're also not catching when the 2nd RPC fails. The _device metric is an info metric, so it should end with _info. It also seems to be missing many of the labels I'd expect it to have given the NUT docs - I'd expect everything bar uptime. Why not collect everything, rather than having the user provide a list?
Yes, though it may depend on the use case. That's why I created this exporter (I'm changing UPS vendors). The
I know we discussed this in a separate PR, and I guess I built this from an old template. I've switched to the Desc/MustNewConstMetric pattern. I also made sure that all logic from an invocation of Collect is encapsulated to avoid race conditions (i.e., if there are concurrent scrapes, each scrape gets its own session with the UPS daemon rather than stepping on a common session).
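For reference, here is a minimal sketch (not the exporter's actual code) of that pattern: descriptors are created once, and Collect builds fresh const metrics on every scrape, so concurrent scrapes never share mutable state. The metric name and the fetchBatteryCharge helper are illustrative assumptions.

```go
package collector

import "github.com/prometheus/client_golang/prometheus"

// fetchBatteryCharge is a hypothetical stand-in for a call to the NUT daemon.
func fetchBatteryCharge(ups string) (float64, error) {
	return 100, nil
}

// upsCollector holds only immutable descriptors; all per-scrape state is
// local to Collect.
type upsCollector struct {
	ups           string
	batteryCharge *prometheus.Desc
}

func newUPSCollector(ups string) *upsCollector {
	return &upsCollector{
		ups: ups,
		batteryCharge: prometheus.NewDesc(
			"nut_battery_charge_percent", // illustrative metric name
			"Battery charge reported by the UPS, as a percentage.",
			nil, nil,
		),
	}
}

func (c *upsCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- c.batteryCharge
}

func (c *upsCollector) Collect(ch chan<- prometheus.Metric) {
	value, err := fetchBatteryCharge(c.ups)
	if err != nil {
		return // failing the scrape properly is covered further down
	}
	ch <- prometheus.MustNewConstMetric(c.batteryCharge, prometheus.GaugeValue, value)
}
```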
I suppose this could be done by gathering the requested variables during init and then building metrics dynamically from that list, but each UPS will have an unknown set of variables available depending on the level of support, the NUT driver, and the UPS's "openness". In the case of a hard-coded list of variables to export, that's not too hard. However, in the case that the administrator wants all variables by setting the
OK - I've gone ahead and removed them. They are leftovers from the exporter that I built this one from. What is the idiomatic way to fail the scrape, though, and how does one detect the failure from the alertmanager side of things? The Collect method doesn't return an error.
I think you're referring to the population of
I've renamed it in 3b8fe22e62d8b43afd40067db37d73cb6edbe273. Can you say more about the labels you expect but are missing? These are held in the
Oops - this is already possible, I just forgot to mention it in the README (fixed in 58135dbb3dce25790e1091a2772e8f2be2950048). This can be done by setting
That's making it more complicated than it needs to be; you would usually build up a map as you go. However, the Go client will do all of this for you when you're using MustNewConstMetric.
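A hedged sketch of what that looks like, extending the collector type from the earlier sketch: with MustNewConstMetric a descriptor can be built on the fly for every variable the NUT daemon reports, so no hard-coded list is needed. fetchVariables and the naming scheme are assumptions, not the exporter's real API; the strconv and strings imports are assumed in addition to the client library.

```go
// Collect exposes every numeric NUT variable without a pre-declared list.
func (c *upsCollector) Collect(ch chan<- prometheus.Metric) {
	vars, err := fetchVariables(c.ups) // hypothetical: map[string]string of variable name -> value
	if err != nil {
		return // see the note on failing the scrape below
	}
	for name, raw := range vars {
		value, err := strconv.ParseFloat(raw, 64)
		if err != nil {
			continue // non-numeric variables are skipped in this sketch
		}
		// e.g. "battery.charge" becomes "nut_battery_charge"
		metricName := "nut_" + strings.ReplaceAll(name, ".", "_")
		desc := prometheus.NewDesc(metricName, "Value of NUT variable "+name+".", nil, nil)
		ch <- prometheus.MustNewConstMetric(desc, prometheus.GaugeValue, value)
	}
}
```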
You're making things much more difficult and error-prone for users to query, for no potential benefit.
https://www.robustperception.io/failing-a-scrape-with-the-prometheus-go-client
Silently missing data is not good; it's simplest to fail the scrape if any part of the data collection fails.
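One way to do this with the Go client, sketched here under the same assumptions as the snippets above: emitting an invalid metric from Collect makes the registry (and, with the default promhttp error handling, the HTTP handler) fail the entire scrape instead of returning partial data. The descriptor name and error wording are illustrative only.

```go
func (c *upsCollector) Collect(ch chan<- prometheus.Metric) {
	vars, err := fetchVariables(c.ups) // hypothetical NUT call, as above
	if err != nil {
		// Reporting an invalid metric causes the whole scrape to fail,
		// so Prometheus records up as 0 for the target.
		ch <- prometheus.NewInvalidMetric(
			prometheus.NewDesc("nut_scrape_error", "Error talking to the NUT daemon.", nil, nil),
			err,
		)
		return
	}
	// ... emit metrics from vars as in the earlier sketch ...
}
```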
Everything bar uptime from that list, as uptime is a number that changes over time.
Personally, as there are only ~200 of them, which is small, I'd keep it simple and always expose everything. You're not avoiding any load on the target, you're not going to have that many UPSes, and if someone really wants to micro-optimise for some reason they can use metric relabelling.

You should also decide if this is a single-target or multi-target exporter; in particular, why is Ups an option? If it should be configurable by the user then it should come from a URL parameter rather than a config file, and the target label device should not be added in, as service discovery is Prometheus's responsibility.
You should leave it as the empty string if it's missing; generally, try to keep the data as raw as you can.
What am I missing?
Up will be 0, as for any failed scrape.
That looks right, I misread the code.
By design, you don't. Instead, do what the snmp exporter does: create a collector and registry per request.
I see. OK - I've done this by instantiating a new collector during every scrape with a custom handler (all the base options set on the command line are passed to the new collector, but the name of the UPS is passed in the query string). This required a separate handler, so I've kept the UPS metrics at /metrics?ups=foo and left the usual

This may be a naive question, but is it an expensive operation to create a registry and handlers for each request, or is that just splitting hairs?
Generally, /metrics is for the metrics of the exporter itself, and some other path would have the UPS metrics. The "ups" label should be removed; it is the responsibility of Prometheus service discovery to provide that. In general you should never have the same label pair across all of a metrics page.
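To illustrate the registry-per-request pattern and the path split described above (similar in spirit to how the snmp_exporter does it), here is a hedged sketch reusing the hypothetical newUPSCollector from the earlier snippet; the /ups path, the ups parameter, and the port are illustrative assumptions, not necessarily what the exporter ended up with.

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// upsHandler builds a fresh registry and collector for every request, so
// concurrent scrapes of different UPS devices never share state.
func upsHandler(w http.ResponseWriter, r *http.Request) {
	ups := r.URL.Query().Get("ups")
	if ups == "" {
		http.Error(w, "missing 'ups' parameter", http.StatusBadRequest)
		return
	}
	registry := prometheus.NewRegistry()
	registry.MustRegister(newUPSCollector(ups)) // from the earlier sketch
	promhttp.HandlerFor(registry, promhttp.HandlerOpts{}).ServeHTTP(w, r)
}

func main() {
	http.Handle("/metrics", promhttp.Handler()) // the exporter's own process metrics
	http.HandleFunc("/ups", upsHandler)         // per-UPS metrics, selected by ?ups=
	log.Fatal(http.ListenAndServe(":8080", nil)) // arbitrary port for this sketch
}
```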
It's going to be effectively nothing.
OK, no worries. I've swapped them as proposed.
Sure - done - and I've added some notes in the README as a helpful breadcrumb trail for those with multiple UPS devices configured in a single NUT server. I think we should be good to go at this point - thanks for all the feedback!
Thanks!
Another COVID project as I work on the network and instrument more and more of it.
Feedback always welcome