Unbound occasionally reports broken stats #485
Comments
The commit fixes the issue by making sure that the values are not negative; a negative value is set to 0. It also casts the size_t to long long, to stop what may be an integer overflow producing the negative numbers in the division function. This likely stops the wrong statistics report, but I do not know for sure; I guess it was related to a 32-bit compile. Thanks for the report!
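For readers following along, here is a minimal sketch of the kind of change described above: the remainder is carried in signed 64-bit arithmetic and any negative result is clamped to 0. It is not the actual Unbound code; the names `tv_sketch` and `tv_divide_sketch` are hypothetical and exist only for this illustration.

```c
/* Hypothetical stand-in for a seconds/microseconds pair (not Unbound's type). */
struct tv_sketch {
    long long sec;
    long long usec;
};

/*
 * Average a summed time over d samples.
 * Assumption: this mirrors the described fix only in spirit, i.e. signed
 * 64-bit intermediates plus a clamp to zero for negative results.
 */
static struct tv_sketch tv_divide_sketch(struct tv_sketch sum, long long d)
{
    struct tv_sketch avg = {0, 0};
    long long leftover;

    if (d <= 0)
        return avg; /* no samples: report zero instead of dividing */

    avg.sec = sum.sec / d;
    avg.usec = sum.usec / d;

    /* Carry the remainder of the seconds division into microseconds.
     * A long long keeps leftover * 1000000 from wrapping the way a
     * 32-bit unsigned type would. */
    leftover = sum.sec - avg.sec * d;
    avg.usec += (leftover * 1000000LL) / d;

    /* Normalise so that usec stays below one second. */
    avg.sec += avg.usec / 1000000LL;
    avg.usec %= 1000000LL;

    /* Clamp: a negative average is never meaningful for a time statistic. */
    if (avg.sec < 0)
        avg.sec = 0;
    if (avg.usec < 0)
        avg.usec = 0;
    return avg;
}
```

With these assumptions, a sum of 3 seconds over 2 samples gives 1 second and 500000 microseconds, and a negative sum gives 0 for both fields rather than a negative value that a parser could reject.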
Whoa! That's a pretty quick fix @wcawijngaards. Thank you so much 🍻
* nlnet/master:
  - Remove case fallthrough from deprecate-rsa-1024 code.
  - Add ./configure --with-deprecate-rsa-1024 that turns off RSA 1024.
  - Fix NLnetLabs#485: Unbound occasionally reports broken stats.
  - Rerun flex and bison.
  - Fix to squelch tcp socket bind failures when the interface is gone.
  - Add more logging for out-of-memory cases.
  - Fix for NLnetLabs#367: only attempt to get the interface for queries that are no longer on the tcp_waiting_list.
  Clearer template text since not everyone can reopen GitHub issues.
  Changelog note for NLnetLabs#478
  - Merge NLnetLabs#478: Allow configuration of TCP timeout while waiting for response.
  Changelog note and improved comment.
  - Fix NLnetLabs#481: Fix comment in configuration file.
  doc/example.conf.in: Clarify comment for `auto-trust-anchor-file`
  - Add that log-servfail prints an IP address and more information about one of the last failures for that query.
  Allow configuration of TCP timeout while waiting for response
  Create issue templates
  - Fix compiler warning for signed/unsigned comparison for max_reuse_tcp_queries.
  - Fix NLnetLabs#474: always_null and others inside view.
Describe the bug
Some of the time stats that unbound reports through the control socket are formatted incorrectly as floats, and this broke unbound_exporter for us. The values are unexpectedly negative and in the wrong format.
Example stats:
Prometheus exporter failed to parse these lines:
The exporter barfs on the first parse error and reports unbound as down.
It's interesting how avg is negative but median is not. That may help us reverse engineer and figure out what happened?
I can try to reproduce and find more information if required. Unfortunately the running binary didn't have debug symbols so I couldn't extract all local variables with gdb.
To reproduce
Expected behavior
System:
I know it's an old version, but it doesn't look like the metrics code has changed a lot since then. I couldn't find anything related in the changelog either.
Thanks!