
Fix wrong ASN discovery on non-BGP Devices #14948

Merged
merged 3 commits into from Dec 19, 2023


Bierchermuesli
Contributor

@Bierchermuesli Bierchermuesli commented Apr 7, 2023

There is a serious error in bgp-peers discovery. Instead of an SNMP GetNext, we should fetch the first entry (bgpLocalAs.0) with a regular Get. If there is none, the device has no BGP enabled.

'/usr/bin/snmpgetnext' '-v2c' '-c' 'COMMUNITY' '-OQUsv' '-m' 'BGP4-MIB' '-M' '/opt/librenms/mibs:/opt/librenms/mibs/cisco' 'udp:HOSTNAME:161' 'bgpLocalAs'

a BGP enabled Host (Cisco NCS)

... GetNextRequest(27)  15.2
... GetResponse(30)  15.2.0=<ourASN>

versus a non-BGP enabled Host (Cisco Nexus)

... GetNextRequest(27)  15.2
... GetResponse(32)  16.9.1.1.1.1=1

or an old Brocade switch

... GetNextRequest(27)  15.2
... GetResponse(32)  16.1.1.1.1.1=1
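
The captures above can be reproduced with a toy model of an agent's OID tree. This is only a sketch in Python, not the LibreNMS code: `ToyAgent` and the ASN 64512 are made up for illustration, and OIDs are written relative to 1.3.6.1.2.1 as in the tcpdump output. GetNext returns whatever OID sorts next, even when it belongs to a different subtree, while Get only answers for the exact instance.

```python
from bisect import bisect_right


def oid_key(oid):
    """Turn '15.2.0' into (15, 2, 0) so OIDs sort numerically, not as strings."""
    return tuple(int(part) for part in oid.strip(".").split("."))


class ToyAgent:
    """Hypothetical stand-in for an SNMP agent; not a real SNMP library."""

    def __init__(self, values):
        self._values = {oid_key(oid): value for oid, value in values.items()}
        self._sorted = sorted(self._values)

    def get(self, oid):
        # None stands in for noSuchObject / noSuchInstance.
        return self._values.get(oid_key(oid))

    def get_next(self, oid):
        # Return the first OID that sorts strictly after the requested one.
        idx = bisect_right(self._sorted, oid_key(oid))
        if idx == len(self._sorted):
            return None  # end of MIB view
        nxt = self._sorted[idx]
        return ".".join(str(p) for p in nxt), self._values[nxt]


# 64512 is an invented private ASN; the real capture shows <ourASN>.
bgp_host = ToyAgent({"15.2.0": 64512, "16.9.1.1.1.1": 1})
non_bgp_host = ToyAgent({"16.9.1.1.1.1": 1})

print(bgp_host.get_next("15.2"))      # ('15.2.0', 64512)
print(non_bgp_host.get_next("15.2"))  # ('16.9.1.1.1.1', 1)
print(non_bgp_host.get("15.2.0"))     # None
```

On the non-BGP host, GetNext hands back the unrelated value 1, which matches the bogus ASNs this PR removes.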

With this fix, a non-BGP host is detected and bgpLocalAs is set to NULL:

./discovery.php -h 269 -m bgp-peers -d
...
#### Load disco module bgp-peers ####
SNMP['/usr/bin/snmpget' '-v2c' '-c' 'COMMUNITY' '-OQUsv' '-m' 'BGP4-MIB' '-M' '/home/sgr/sync/code/librenms/mibs:/home/sgr/sync/code/librenms/mibs/cisco' 'udp:HOSTNAME:161' 'bgpLocalAs.0']
No Such Object available on this agent at this OID  
  
No BGP on hostSQL[UPDATE `devices` set `bgpLocalAs`=NULL WHERE device_id=? [269] 18.72ms] 

tcpdump

.... GetRequest(28)  15.2.0
.... GetResponse(28)  15.2.0=[noSuchObject]
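
The decision visible in the debug output can be sketched roughly as follows. This is an illustration in Python, not the PHP in `includes/discovery/bgp-peers.inc.php`; `parse_bgp_local_as` is a hypothetical helper, and it assumes the value-only output that `-OQUsv` produces.

```python
import re

# Net-SNMP texts for a missing object or instance.
NO_DATA_MARKERS = ("No Such Object", "No Such Instance")


def parse_bgp_local_as(raw):
    """Return the ASN as int, or None when the agent has no bgpLocalAs.0."""
    text = raw.strip()
    if not text or any(marker in text for marker in NO_DATA_MARKERS):
        return None
    # Anything that is not a plain number is treated as "no BGP" as well.
    return int(text) if re.fullmatch(r"\d+", text) else None


print(parse_bgp_local_as("No Such Object available on this agent at this OID  \n"))
print(parse_bgp_local_as("65056\n"))
```

A `None` result corresponds to the `bgpLocalAs`=NULL update shown above.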

This fixes #14752 or Point 1 in #14598 (comment), so we don't query the peering DB with nonexistent ASNs anymore.

DO NOT DELETE THE UNDERLYING TEXT

Please note

Please read this information carefully. You can run ./lnms dev:check to check your code before submitting.

  • Have you followed our code guidelines?
  • If my Pull Request does some changes/fixes/enhancements in the WebUI, I have inserted a screenshot of it.
  • If my Pull Request makes discovery/polling/yaml changes, I have added/updated test data.

Testers

If you would like to test this pull request then please run: ./scripts/github-apply 14948
After you are done testing, you can remove the changes with ./scripts/github-remove. If there are schema changes, you can ask on discord how to revert.

@Bierchermuesli
Contributor Author

Bierchermuesli commented Apr 7, 2023

Patch deployed in the field. I will give an update tomorrow on how the discovery went. Current situation:

select count(*), `bgpLocalAs` FROM `devices` WHERE `disabled` = 0 AND `ignore` = 0 AND bgpLocalAs < 10000 group by bgpLocalAs;
+----------+------------+
| count(*) | bgpLocalAs |
+----------+------------+
|       48 |          0 |
|       73 |          1 |
|        3 |          3 |
|        9 |         41 |
|        5 |         80 |
|        2 |        131 |
|        1 |        197 |
|        2 |        508 |
|        2 |        514 |
+----------+------------+

@PipoCanaja PipoCanaja added Bug 🕷️ Discovery Device 🖥️ New or added device support labels Apr 7, 2023
@Bierchermuesli
Contributor Author

Bierchermuesli commented Apr 11, 2023

My ~150 wrong ASNs are gone... 🥳

Regarding the failed CI task, I need a second opinion: is there a chance of wrong test data in vrp_ce12804-withvrf? @PipoCanaja (contributor)

$ lnms dev:simulate
$ snmpwalk -v 2c -c vrp_ce12804-withvrf 127.1.6.1:1161  '-m' 'BGP4-MIB' '-M' './mibs:./mibs/huawei' bgpLocalAs -On
.1.3.6.1.2.1.15.2 = No Such Instance currently exists at this OID

I assume bgpLocalAs: 1 in tests/data/vrp_ce12804-withvrf.json is wrong, as it is also not listed in tests/snmpsim/vrp_ce12804-withvrf.snmprec.

Compare that to another Huawei (maybe also a bit wrong?):

$ snmpwalk -v 2c -c vrp_ne8000 127.1.6.1:1161  '-m' 'BGP4-MIB' '-M' './mibs:./mibs/huawei' bgpLocalAs -On
.1.3.6.1.2.1.15.2.0 = Wrong Type (should be INTEGER): Gauge32: 26479

And compare it to this one (which is pretty surely right):

$ snmpwalk -v 2c -c iosxr_asr9001 127.1.6.1:1161  '-m' 'BGP4-MIB' '-M' './mibs:./mibs/cisco' bgpLocalAs -On
.1.3.6.1.2.1.15.2.0 = INTEGER: 65056
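
To compare the three simulator answers side by side, the `-On` lines can be classified with a small parser. This is a sketch only; `parse_walk_line` is a hypothetical helper and is not part of LibreNMS or its test tooling.

```python
import re

LINE_RE = re.compile(r"^(\.\d+(?:\.\d+)*) = (.*)$")
VALUE_RE = re.compile(r"(\w+): (\d+)$")


def parse_walk_line(line):
    """Split one 'snmpwalk -On' line into OID, type, value and a wrong-type flag."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    oid, rest = match.groups()
    value = VALUE_RE.search(rest)
    return {
        "oid": oid,
        "type": value.group(1) if value else None,
        "value": int(value.group(2)) if value else None,  # None: no data at this OID
        "wrong_type": rest.startswith("Wrong Type"),
    }


lines = [
    ".1.3.6.1.2.1.15.2 = No Such Instance currently exists at this OID",
    ".1.3.6.1.2.1.15.2.0 = Wrong Type (should be INTEGER): Gauge32: 26479",
    ".1.3.6.1.2.1.15.2.0 = INTEGER: 65056",
]
results = [parse_walk_line(line) for line in lines]
for result in results:
    print(result)
```

Only the first line yields no value, which is consistent with the test data for vrp_ce12804-withvrf not containing bgpLocalAs at all.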

@PipoCanaja
Contributor

PipoCanaja commented Apr 12, 2023

> my ~150 wrong ASns are gone... 🥳
>
> regarding the failed CI task i need a second opinion: is there a chance of wrong test data in vrp_ce12804-withvrf ? @PipoCanaja (contributor)
>
> $ lnms dev:simulate
> $ snmpwalk -v 2c -c vrp_ce12804-withvrf 127.1.6.1:1161  '-m' 'BGP4-MIB' '-M' './mibs:./mibs/huawei' bgpLocalAs -On
> .1.3.6.1.2.1.15.2 = No Such Instance currently exists at this OID
>
> I assume bgpLocalAs: 1 in tests/data/vrp_ce12804-withvrf.json is wrong. as it's also not listed in tests/snmpsim/vrp_ce12804-withvrf.snmprec

"Wrong" may be a bit excessive 😄 Let's say "incomplete" and "forged" 😄. This usually happens because people collect test data against real equipment and want to remove part of the private data.

But this is not a big issue. You just need to update the JSON files with the current snmprec data, and they will appear here in the Pull Request. This will help the reviewer to evaluate your code, and all tests must pass before the PR can be merged anyway.

@Bierchermuesli
Contributor Author

Hmm. Can you elaborate on my(?) to-do a bit?
.1.3.6.1.2.1.15.2 does not exist in vrp_ce12804-withvrf.snmprec at all, so I'm confused about how it ended up in that JSON data in the first place.

@murrant
Member

murrant commented Apr 14, 2023

I wonder about this check... seems like some devices implement bgp without implementing .1.3.6.1.2.1.15.2

@murrant murrant closed this Nov 5, 2023
@murrant murrant reopened this Nov 5, 2023
@Bierchermuesli
Contributor Author

So, how can we move on here?

vrp_ce12804-withvrf.snmprec is obviously wrong. There is no standard .1.3.6.1.2.1.15.2, and 1.3.6.1.4.1.2011.5.25.177.1.1.1.1 does not hold any bgpLocalAs value either.

The 1.3.6.1.4.1.2011.5.25.177.1.1.2.1.6 entry belongs to The Counter That Records the Times the Remote BGP, not to the value 1 that is expected.

I can implement an exception, but 'null' is in my opinion the correct value for this device.

@PipoCanaja
Contributor

PipoCanaja commented Dec 18, 2023

vrp_ce12804-withvrf.snmprec is incomplete.
I'll update the PR with a newly collected file, with your PR applied. We'll see how it goes.

@PipoCanaja
Contributor

PipoCanaja commented Dec 18, 2023

OK. Not sure why it was written with getNext in the first place. Could it be that some funky devices do not reply to BGP4-MIB::bgpLocalAs.0 but to BGP4-MIB::bgpLocalAs.1, for instance, or to some other index?
Anyway, your change follows the MIB, and indeed the vrp-ce12804 data is crap, because it works well in real life on my device.
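
If such devices did turn up, a GetNext fallback would need a subtree check so that an answer from a neighbouring table is rejected. The following Python sketch is purely hypothetical and is not what this PR implements; `in_subtree` is an invented helper.

```python
BGP_LOCAL_AS = "1.3.6.1.2.1.15.2"


def oid_parts(oid):
    """Turn '.1.3.6' or '1.3.6' into (1, 3, 6)."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def in_subtree(base, candidate):
    """True only if candidate is an instance below base, compared per component."""
    base_parts, cand_parts = oid_parts(base), oid_parts(candidate)
    return len(cand_parts) > len(base_parts) and cand_parts[: len(base_parts)] == base_parts


print(in_subtree(BGP_LOCAL_AS, ".1.3.6.1.2.1.15.2.1"))       # True
print(in_subtree(BGP_LOCAL_AS, ".1.3.6.1.2.1.16.9.1.1.1.1"))  # False
print(in_subtree(BGP_LOCAL_AS, ".1.3.6.1.2.1.15.20.0"))      # False
```

Comparing numeric components rather than raw strings avoids accepting 15.20.0 as a child of 15.2.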

@PipoCanaja
Contributor

File corrected. I checked all the VRP test data on my dev machine. CI will test the others and return the result here. If tests are OK, then this PR will make it tomorrow.

Contributor

@PipoCanaja PipoCanaja left a comment


Let's go

@PipoCanaja PipoCanaja merged commit 0028311 into librenms:master Dec 19, 2023
8 checks passed
@librenms-bot

This pull request has been mentioned on LibreNMS Community. There might be relevant details there:

https://community.librenms.org/t/24-1-0-changelog/23271/1

gunkaaa pushed a commit to gunkaaa/librenms that referenced this pull request Jan 8, 2024
* Fix wrong as discovery. if peer has no bgp enabled, the bgp as was miss-discovered

* Update includes/discovery/bgp-peers.inc.php

Co-authored-by: Tony Murray <murraytony@gmail.com>

* tests

---------

Co-authored-by: Tony Murray <murraytony@gmail.com>
Co-authored-by: PipoCanaja <38363551+PipoCanaja@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

invalid bgpLocalAs
4 participants