prometheus metrics support #352

Open
appliedprivacy opened this issue Nov 22, 2020 · 8 comments

Comments

@appliedprivacy

Hi,
it would be great to see Prometheus metrics support directly in Unbound.
This would make third-party exporter tools of varying quality unnecessary, and the exporter would always be compatible with Unbound since it is directly integrated. NLnetLabs appears to agree that Prometheus makes sense, since other NLnetLabs projects (like Routinator) already incorporate it.

thanks!

@wcawijngaards
Member

There already seems to be a project in https://github.com/svartalf/unbound-telemetry that does this.

@wcawijngaards
Member

The commit adds the file https://github.com/NLnetLabs/unbound/blob/master/contrib/metrics.awk . You could use this file with, e.g., unbound-control stats | awk -f metrics.awk, which produces Prometheus format output. The graphs are like what contrib/unbound_munin_ produces. I have not tested it in Prometheus, but it may be helpful for getting the Unbound statistics into Prometheus. It likely needs some Grafana config too, e.g. type stacked for the histogram.
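
For readers unfamiliar with the conversion step, here is a minimal Python sketch of the idea behind that pipeline. It is not a port of contrib/metrics.awk: it assumes the stats output consists of plain name=value lines, and the function name stats_to_prometheus and the unbound_ prefix are made up for illustration. It emits every value as an untyped sample and does no special handling of the histogram lines.

```python
import re


def stats_to_prometheus(stats_text, prefix="unbound_"):
    """Turn 'name=value' lines into Prometheus text-format sample lines.

    Assumption: each statistic is one 'name=value' line with a numeric value.
    Lines that do not match that shape are skipped.
    """
    samples = []
    for raw in stats_text.splitlines():
        name, sep, value = raw.strip().partition("=")
        if not sep:
            continue  # no '=' on this line, so it is not a statistic
        value = value.strip()
        try:
            float(value)  # validate only; the original text is emitted as-is
        except ValueError:
            continue
        # Prometheus metric names may not contain dots, so replace anything
        # outside [a-zA-Z0-9_] with an underscore.
        metric = prefix + re.sub(r"[^a-zA-Z0-9_]", "_", name.strip())
        samples.append(f"{metric} {value}")
    return "".join(line + "\n" for line in samples)
```

Feeding it the text printed by unbound-control stats_noreset would give one sample line per statistic; the naming will differ from what metrics.awk produces.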

jedisct1 added a commit to jedisct1/unbound that referenced this issue Dec 1, 2020
* nlnet/master: (117 commits)
  - Fix NLnetLabs#358: Squelch udp connect 'no route to host' errors on low verbosity.
  Changelog entry for rc tags 1.13.0rc3 and rc4.
  - Fix assertion failure on double callback when iterator loses interest in query at head of line that then has the tcp stream not kept for reuse.
  - Fix contrib/metrics.awk for FreeBSD awk compatibility.
  - Fix compile warnings in rpz initialization.
  - Fix compile warnings for windows.
  - Fix when use free buffer to initialize rbtree for stream reuse.
  - Fix compile warning for type cast in http2_submit_dns_response.
  - Clear readagain upon decommission of pending tcp structure.
  - Fix that after failed read, the readagain cannot activate.
  - For NLnetLabs#352: contrib/metrics.awk for Prometheus style metrics output.
  - Fix to omit UDP receive errors from log, if verbosity low. These happen because of udp-connect.
  - tag for the 1.13.0rc2 release.
  - Fix readagain and writeagain callback functions for comm point cleanup.
  - Attempt fix for libevent state in tcp reuse cases after a packet is written.
  - Fix memory leak for edns client tag opcode config element.
  - Remove debug commands from reuse tests.
  - Better fix for reuse tree comparison for is-tls sockets. Where the tree key identity is preserved after cleanup of the TLS state.
  - Fix udp-connect on FreeBSD, do send calls on connected UDP socket.
  - with udp-connect ignore connection refused with UDP timeouts.
  ...
@appliedprivacy
Author

Thanks for your feedback.
The feature request was specifically for unbound itself to avoid having to use third party tools like
https://github.com/kumina/unbound_exporter

The issue with the awk approach is that the data is generated asynchronously, meaning that the data does not represent Unbound's state at the moment Prometheus came along and fetched it.

@wcawijngaards
Member

What is the awk issue? Asynchronous? Do you want cumulative numbers or something? The awk script runs very quickly, so it does not delay the measurement in that sense.

@appliedprivacy
Author

Maybe I misunderstood, but I assumed the awk command runs at a fixed interval (cronjob) and writes the output into a file served by a webserver (instead of running when prometheus asks for it).

We will test with the noreset version of stats
unbound-control stats_noreset | awk -f metrics.awk

Thanks!

@jsha

jsha commented Jan 23, 2023

At Let's Encrypt we deploy Unbound and unbound_exporter. We also semi-recently took over maintenance of the prometheus unbound_exporter tool: https://github.com/letsencrypt/unbound_exporter.

Some reasons we would prefer to see Prometheus metrics exported directly by Unbound:

  • Right now the stats privilege is conflated with general control privilege, but it doesn't have to be. Unbound's default notion of retrieving stats also resets them, but in Prometheus stats are never reset, so allowing a machine to fetch stats is a low-privilege operation that can be separated from the control privilege. In our environment I think we would turn off unbound-control if we could get stats without it.
  • Managing configuration of two components on a host that talk to each other, along with a control channel, is more complex than managing a single binary. This is particularly true when using containers, which typically assume they are responsible for a single process. It's possible to get around that assumption but it makes the deployment story more complex.
  • We have to maintain a build target and deployment code for unbound_exporter in each of our environments, as well as Unbound, and keep both up to date.
  • In general, it would be nice to be able to deprecate unbound_exporter.
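
The mismatch between reset-on-read statistics and Prometheus counters mentioned in the first point can be illustrated with a small Python sketch. It is an illustration only, not how Unbound or unbound_exporter work: the class name CumulativeCounters and the sample values are hypothetical, and it is only meaningful for count-style statistics, since averages and other gauge-like values must not be summed.

```python
class CumulativeCounters:
    """Accumulate per-interval counts into totals that never decrease.

    Assumption: each reading holds the counts since the previous reading,
    which is what a reset-on-read statistics call would return.
    """

    def __init__(self):
        self.totals = {}

    def add_interval(self, interval):
        # Add this interval's counts to the running totals.
        for name, delta in interval.items():
            self.totals[name] = self.totals.get(name, 0) + delta
        # Return a copy so callers cannot modify the running totals.
        return dict(self.totals)


counters = CumulativeCounters()
counters.add_interval({"total.num.queries": 10})
snapshot = counters.add_interval({"total.num.queries": 5})
# snapshot now holds the running total across both readings
```

Reading with stats_noreset avoids this bookkeeping, because the values are then not cleared between reads.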

Also, to clarify: Prometheus stats are generally fetched over the network, via an HTTP GET. So the awk script only solves a part of the problem. What we would really like is an option for Unbound to serve stats in Prometheus format via HTTP.
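
What is being asked for here can be approximated outside Unbound with a small on-demand HTTP endpoint. The following Python sketch is not Unbound functionality and not how unbound_exporter is implemented. It simply runs the pipeline from earlier in this thread on every GET, so the response reflects Unbound's state at scrape time. It assumes unbound-control and awk are on PATH and that metrics.awk is in the working directory; the names collect_metrics, make_handler and serve, the /metrics path and the port number are all hypothetical choices.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def collect_metrics(awk_script="metrics.awk"):
    """Run `unbound-control stats_noreset | awk -f metrics.awk`, return bytes."""
    stats = subprocess.run(
        ["unbound-control", "stats_noreset"],
        capture_output=True, check=True,
    ).stdout
    return subprocess.run(
        ["awk", "-f", awk_script],
        input=stats, capture_output=True, check=True,
    ).stdout


def make_handler(collect):
    """Build a request handler that calls collect() on every scrape."""

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            try:
                body = collect()
            except (OSError, subprocess.CalledProcessError):
                self.send_error(500, "collecting stats failed")
                return
            self.send_response(200)
            # Content type used for the Prometheus text exposition format.
            self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet; real deployments may want logging

    return MetricsHandler


def serve(collect=collect_metrics, host="127.0.0.1", port=9167):
    """Serve metrics until interrupted. The port is an arbitrary choice."""
    ThreadingHTTPServer((host, port), make_handler(collect)).serve_forever()
```

Calling serve() starts the endpoint. Note that this still depends on the control channel and on a second process, so it addresses only the scrape-time freshness concern, not the privilege-separation or single-binary points above.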

@james-stevens

And while you're at it, would be nice to have prometheus metrics directly out of nsd as well ;)

@james-stevens

> Maybe I misunderstood, but I assumed the awk command runs at a fixed interval (cronjob) and writes the output into a file served by a webserver (instead of running when prometheus asks for it).
>
> We will test with the noreset version of stats unbound-control stats_noreset | awk -f metrics.awk
>
> Thanks!

Probably the easiest way to integrate it with Prometheus would be to run it from xinetd (or similar).

But, TBH, we just use the Let's Encrypt exporter - it does a fine job.

4 participants