Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

snmp: possible fix for oid key collision when there is an oid key partially matches with current oid_key #1044

Closed
wants to merge 2 commits into from

Conversation

relip
Copy link
Contributor

@relip relip commented Apr 16, 2016

chronograf_-_chrome_2016-04-17_02-06-13

I found this strange when I first set up Chronograf and Telegraf, it doesn't make sense that the graph falls to negative value. From the graph, I suspected there is some kind of input collision so DERIVATIVE() returns negative value, since I found several same values from both eth0 (instance id: 1) and the other network interface with instance id 11.

To see if my assumption is right, I put the following line to this line:

log.Printf("[snmp input] oid_key: %s, variable.Name: %s", oid_key, variable.Name)

and the log shows:

2016/04/17 01:52:20 [snmp input] oid_key: .1.3.6.1.2.1.31.1.1.1.10.5, variable.Name: .1.3.6.1.2.1.31.1.1.1.10.57

I'm not an expert on SNMP nor Telegraf plugins, not sure if this change won't make any side effects.

@sparrc
Copy link
Contributor

sparrc commented Apr 16, 2016

@relip you appear to have made the unit tests begin failing, I'll try to take a closer look sometime this week unless you can figure it out

@relip
Copy link
Contributor Author

relip commented Apr 17, 2016

@sparrc seems 2016/04/16 17:55:51 Reading SNMPtranslate file error: open bad_oid.txt: no such file or directory this is the error, but I don't see any relevance between the file change and the error. Maybe it was something wrong at the first place?

@sparrc
Copy link
Contributor

sparrc commented Apr 17, 2016

@relip I think you can ignore that message, that's just a log message not a test failure, the relevant failures are:

--- FAIL: TestSNMPBulk1 (0.02s)
    Error Trace:    accumulator.go:141
            snmp_test.go:365
    Error:      unknown measurement ifOutOctets with tags map[unit:octets instance:1 host:localhost]


    Error Trace:    accumulator.go:141
            snmp_test.go:377
    Error:      unknown measurement ifOutOctets with tags map[unit:octets instance:2 host:localhost]


    Error Trace:    accumulator.go:141
            snmp_test.go:389
    Error:      unknown measurement ifOutOctets with tags map[unit:octets instance:3 host:localhost]


    Error Trace:    accumulator.go:141
            snmp_test.go:401
    Error:      unknown measurement ifOutOctets with tags map[unit:octets instance:36 host:localhost]

@titilambert
Copy link
Contributor

@relip I think you made an error. Could you tell me which OID is represented in your graph ?

@relip
Copy link
Contributor Author

relip commented Apr 18, 2016

@titilambert Same as the debug output, .1.3.6.1.2.1.31.1.1.1.10.5.

@relip
Copy link
Contributor Author

relip commented Apr 18, 2016

Just pushed a follow up commit that would fix the testing issue. @sparrc @titilambert

@titilambert
Copy link
Contributor

titilambert commented Apr 18, 2016

@relip you have to know that .1.3.6.1.2.1.31.1.1.1.10.5 OID is a counter64. This means that when the max value of the counter is reached, the counter is reset to 0. This implies negative derivative.
You have to use non-derivative function to get proper values...
So, I think variable.Name == oid_key is not really useful...

@relip
Copy link
Contributor Author

relip commented Apr 18, 2016

@titilambert ah, yeah, I need to change that. That is SNMP's intended behavior, however, I guess this is still a bug, right?

@titilambert
Copy link
Contributor

@relip I don't think there is a bug. But I think this stuff: strings.HasPrefix(variable.Name, oid_key+"." is a good improvement :)

@sparrc
Copy link
Contributor

sparrc commented Apr 18, 2016

@relip There is a NON_NEGATIVE_DERIVATIVE function that you should be using instead

@relip
Copy link
Contributor Author

relip commented Apr 18, 2016

@titilambert Reading the value of .1.3.6.1.2.1.31.1.1.1.10.57 when the plugin is looking for .1.3.6.1.2.1.31.1.1.1.10.5 is intended?

The values of the graph above are measured when there is no ingoing/outgoing traffic on that interface.

@sparrc Thanks

@titilambert
Copy link
Contributor

titilambert commented Apr 18, 2016

@relip You're right ! Could you try just with strings.HasPrefix(variable.Name, oid_key+"." and tell me if it fixes your bug ?

@relip
Copy link
Contributor Author

relip commented Apr 18, 2016

@titilambert Apart from my case, I guess removing == will cause ignoring oids when oid in collect and variable.Name is exactly the same?

@relip
Copy link
Contributor Author

relip commented Apr 18, 2016

Okay changed it and compiled it and now it completely stopped sending snmp data to the database.

@sparrc
Copy link
Contributor

sparrc commented Apr 18, 2016

I would think that you would need both cases, because if the OID exactly matches it won't have a trailing ".", correct? @titilambert what is your holdup on having both cases in the if statement?

@titilambert
Copy link
Contributor

@sparrc Ho yeah you're right !

@relip
Copy link
Contributor Author

relip commented Apr 19, 2016

correct...

@sparrc sparrc closed this in 46543d6 Apr 19, 2016
@sparrc
Copy link
Contributor

sparrc commented Apr 19, 2016

thanks @relip! Do either of you know if this might fix #1043 or #961 ?

@relip
Copy link
Contributor Author

relip commented Apr 21, 2016

@sparrc possibly #961, but I don't think #1043 is related to this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants