Corrupted values being sent from collectd to itself. #2209
Comments
For clarification, it only seems to affect
Did some debug printfs inside the
Ah, yes, I was indeed looking in the wrong place. This seems to be related to
A couple more printf's
So the call to
I think something like this would prevent collectd sending the derivative value if the call to lookup the stored rates fails for some unknown reason:
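A minimal sketch of the kind of guard described above, in C. It is not the actual collectd patch: `metric_t`, `lookup_rates` and `format_derive` are hypothetical stand-ins invented for illustration. The point is only that when the rate lookup fails, the formatter reports an error so the caller can skip the metric instead of sending whatever happens to be in the buffer.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a metric whose rate is kept in a cache. */
typedef struct {
  const char *name;
  int cached;  /* non-zero if a rate is available for this metric */
  double rate; /* the cached rate, only meaningful when cached != 0 */
} metric_t;

/* Hypothetical lookup: returns a pointer to the rate, or NULL on failure. */
static const double *lookup_rates(const metric_t *m) {
  if (m == NULL || !m->cached)
    return NULL;
  return &m->rate;
}

/* Formats "<name> <rate>" into buf.
 * Returns 0 on success and -1 if the rate lookup failed or the buffer is
 * too small. On a failed lookup the buffer is set to the empty string, so a
 * stale or uninitialized value can never be sent by accident. */
static int format_derive(char *buf, size_t buf_size, const metric_t *m) {
  const double *rates = lookup_rates(m);
  if (rates == NULL) {
    if (buf != NULL && buf_size > 0)
      buf[0] = '\0';
    return -1;
  }

  int n = snprintf(buf, buf_size, "%s %f", m->name, rates[0]);
  if (n < 0 || (size_t)n >= buf_size)
    return -1;
  return 0;
}
```

A caller would check the return value of `format_derive` and drop the metric for that interval when it is non-zero, rather than writing the buffer to the output plugin.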
@octo - Any thoughts yet?
Hi, We do not use the "network" plugin, but only "write_graphite". This is our write_graphite config:
We're also seeing intermittent unreasonable values for CPU. The CPU plugin has no special parameters. write_graphite config:
Package: collectd-5.7.1.13.gf7e2d82-1.el7.centos.x86_64.rpm, installed from a mirror of the CI builds.
I've started noticing this for all 936c450 I can't see anything else that has changed around
Thank you very much for reporting this and for your analysis! I think not checking the return value of Thanks and best regards,
Sure, that would be the corrective logic, I think. It would still be nice to find out why it started returning
Great, thanks! Yes, that would also be interesting to know. Buffers / caching in the network plugin and the cache entry timeout would be the first thing I'd look into. How many metrics are you sending? Do you fill an entire buffer on each iteration?
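To make the cache-timeout idea concrete, here is a purely illustrative C sketch of how a rate lookup can start failing once an entry is considered too old. It is not collectd's cache code: `cache_entry_t`, `cache_get_rate` and the timeout handling are assumptions made up for this example.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical cache entry: one rate plus the time it was last updated. */
typedef struct {
  double rate;
  time_t last_update;
} cache_entry_t;

/* Returns a pointer to the cached rate, or NULL if there is no entry or the
 * entry was last updated more than `timeout` seconds before `now`. */
static const double *cache_get_rate(const cache_entry_t *e, time_t now,
                                    time_t timeout) {
  if (e == NULL)
    return NULL;
  if (now - e->last_update > timeout)
    return NULL;
  return &e->rate;
}
```

In a design like this, any code path that formats a rate has to handle the `NULL` case, because a delayed update is enough to turn a previously valid lookup into a miss.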
There are two servers receiving metrics, one is receiving 30k metrics a second, the other 8k metrics a second. Both are having the same bug. I don't think it's an issue with sending metrics, as between the two servers, only one of them sends a bad value to graphite.
Very likely, we compile with
This prevents a wrong value being sent to graphite for DERIVE types. See #2209 Signed-off-by: Florian Forster <octo@collectd.org>
Seeing the same issue with derive values on the apache plugin (with apache_requests) and the interface plugin (with if_octets). I'm willing to help. Is there anything we can do?
The immediate fix should have been released in 5.7.2. So I think this is good to close.
@magnetik this was an error in the JSON formatting, i.e. in the output plugins, and will affect all (derive) metrics, including those from the Apache plugin. If you're still having this issue after 5.7.2, please open a separate issue.
Indeed no problem in 5.7.2. Thanks a lot to everyone who contributed in finding/fixing it, you guys are awesome. 👍
Expected behavior
A collectd instance that is also configured as a server is occasionally sending nonsense values to itself from common plugins, such as cpu, disk or swap. However, the secondary server is receiving the values uncorrupted.
Both servers should receive the same uncorrupted values.
This is the (abridged) config for server 10.0.0.2. The reverse configuration is done for 10.0.0.3.
Actual behavior
tcpdump between 10.0.0.2 and localhost relay:
tcpdump between secondary server 10.0.0.3 and relay:
Steps to reproduce
This has been happening since the upgrade to 5.7.1.