Implement of a MAC collector through SSH #1190

Closed
wants to merge 1 commit into from

earendilfr
Contributor

Some Cisco devices don't support the macsuck function because they don't support the BridgeMIB, which is needed to retrieve the MAC addresses connected to switch ports.

(For example, the Cisco C1100 series routers have a built-in switch but don't support the BridgeMIB.)

So I have created a simple macsuck function through the SSH collector to retrieve this data.
Like the SNMP collector, the function:

  • retrieves the current status of each interface
  • retrieves the MAC table

Currently I have implemented only the IOS platform, but it could be interesting for other devices.
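
To make the MAC table step concrete, here is a minimal Python sketch of parsing `show mac address-table` style output into records. It is not the code from this PR (Netdisco itself is written in Perl); the sample output layout, the regular expression, and the function names are assumptions for illustration, and real output differs between platforms.

```python
import re

# Assumed sample output; the exact layout varies by platform and software version.
SAMPLE_OUTPUT = """\
          Mac Address Table
-------------------------------------------

Vlan    Mac Address       Type        Ports
----    -----------       --------    -----
  10    24f2.7fc7.5bbe    DYNAMIC     Gi0/1/0
  20    0011.2233.4455    STATIC      Gi0/1/1
Total Mac Addresses for this criterion: 2
"""

# One table row: VLAN id, dotted-hex MAC, entry type, port name.
ROW = re.compile(
    r"^\s*(?P<vlan>\d+)\s+"
    r"(?P<mac>[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+"
    r"(?P<type>\S+)\s+"
    r"(?P<port>\S+)\s*$",
    re.IGNORECASE,
)


def normalize_mac(dotted):
    """Convert dotted notation (24f2.7fc7.5bbe) to colon notation."""
    digits = dotted.replace(".", "").lower()
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def parse_mac_table(output):
    """Return one dict per table row; header and summary lines are skipped."""
    entries = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            entries.append({
                "vlan": int(match["vlan"]),
                "mac": normalize_mac(match["mac"]),
                "type": match["type"],
                "port": match["port"],
            })
    return entries


entries = parse_mac_table(SAMPLE_OUTPUT)
```

Lines that do not match the row pattern are ignored, so banners and totals need no special handling.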

For devices that doesn't support the walk through SNMP with multiple
VLANs
@earendilfr earendilfr marked this pull request as ready for review March 20, 2024 21:58
@rc9000
Member

rc9000 commented Mar 20, 2024

Cool! 👍 For the record, I had similar plans a while ago and already talked @ollyg into adding an API endpoint that can upload a macsuck result (api/v1/object/device/{{hostname}}/nodes?enqueue=false). I thought this would allow using ntc-templates instead of adding a lot of command parsing code in modules again. I did some experiments with that here: https://github.com/rc9000/ntcsuck/, it's basically macsuck for IOS-XE via ntc-templates and Ansible.

However this approach showed two issues:

  • `show mac address table` and similar commands do not seem to be a priority in ntc-templates and are poorly implemented even for very common platforms, so we'd probably need to supply our own TextFSM files anyway
  • macsuck does a lot more than just submit pure CAM table results; a non-SNMP macsuck will also have to deal with ifoperstatus etc. ... but I see you already have that covered to some extent. We'd probably need a way to still run the Wirelessnodes, NAC etc. stuff while not doing the CAM/operstatus parts.

Then other stuff came in between and I parked the idea for the time being; looks like that was already two years ago now, dang. But I'd still like a way to do macsuck without SNMP. My main problem is that it is so wasteful and slow on big switches and with community-based indexing. And it doesn't even work right with SNMPv3, or is a pain because you have to create a context per VLAN.
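
To illustrate why community-based indexing gets expensive, the Python sketch below builds the per-VLAN community strings in the Cisco `community@vlan` form. Each string stands for a separate walk of the forwarding table, so the number of walks grows with the number of VLANs. The function name and the sample VLAN list are hypothetical, and no SNMP request is actually made here.

```python
def per_vlan_communities(community, vlans):
    """Return one 'community@vlan' string per distinct VLAN, in ascending order."""
    return [f"{community}@{vlan}" for vlan in sorted(set(vlans))]


# Hypothetical VLAN list; a large switch may carry hundreds of VLANs.
communities = per_vlan_communities("public", [10, 20, 30, 10])

# One forwarding-table walk would be needed per entry in this list.
walk_count = len(communities)
```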

This is just for a bit of context: I'm in favor of either an ntc-templates or an expect-ish SSH variant. If we go with the latter, sorry @ollyg that I insisted on the API and then never really came through with any useful application for it.

@ollyg
Member

ollyg commented Mar 21, 2024

@rc9000 don't feel bad! The work to support the API actually means that inside Netdisco the gathering of data is decoupled from the sanity checking and storage phases, which makes the work here easier to implement and also gives a generally cleaner design internally. I was very happy to do it. Note that we even support CLI input of MAC/ARP data via netdisco-do.

@ollyg
Member

ollyg commented Mar 21, 2024

Ah yes, I see @earendilfr has implemented this using exactly the right feature from the refactor! Makes me happy :-)
https://github.com/netdisco/netdisco/pull/1190/files#diff-2fe03540adb4f0ad088f140864852c8dd0b91b3a3e8990638137bf57010bc4bbR141

@earendilfr
Contributor Author

You are right, the solution is not perfect (clearly, I prefer the SNMP way because SSH is very slow).

But if you check the output, you can see that it is possible to have the CLI and SNMP workers side by side (it could be fun if we could run the main macsuck via CLI and the macsuck::PortAccessEntity via SNMP...):

netdisco-do -D macsuck -d ehc-mor-ert01
[89980] 2024-03-21 20:16:07  info App::Netdisco version 2.072003 loaded.
[89980] 2024-03-21 20:16:07  info macsuck: [10.12.32.61] started at Thu Mar 21 21:16:07 2024
[89980] 2024-03-21 20:16:07 debug macsuck: running with timeout 600s
[89980] 2024-03-21 20:16:07 debug => running workers for phase: check
[89980] 2024-03-21 20:16:07 debug -> run worker check/1000000 "internal::backendfqdn"
[89980] 2024-03-21 20:16:07 debug -> run worker check/1000000 "internal::snmpfastdiscover"
[89980] 2024-03-21 20:16:07 debug running with configured SNMP timeouts
[89980] 2024-03-21 20:16:07 debug -> run worker check/0 "macsuck"
[89980] 2024-03-21 20:16:07 debug Macsuck is able to run.
[89980] 2024-03-21 20:16:07 debug => running workers for phase: early
[89980] 2024-03-21 20:16:07 debug -> run worker early/0 "prepare common data"
[89980] 2024-03-21 20:16:07 debug => running workers for phase: main
[89980] 2024-03-21 20:16:07 debug -> run worker main/1000000 "gather macs from file and set interfaces"
[89980] 2024-03-21 20:16:07 debug skip: fwtable data supplied by other source
[89980] 2024-03-21 20:16:07 debug -> run worker main/200 "gather macs from CLI and set interfaces"
[89980] 2024-03-21 20:16:07 debug cli session cache warm: [10.12.32.61]
[89980] 2024-03-21 20:16:08 debug 10.12.32.61 89980 macsuck()
[89980] 2024-03-21 20:16:09 debug -> run worker main/100 "gather macs from snmp and set interfaces"
[89980] 2024-03-21 20:16:09 debug skip: namespace passed at higher priority
[89980] 2024-03-21 20:16:09 debug -> run worker main/100 "macsuck::nodes::portaccessentity"
[89980] 2024-03-21 20:16:09 debug skip: namespace passed at higher priority
[89980] 2024-03-21 20:16:09 debug -> run worker main/100 "macsuck::wirelessnodes"
[89980] 2024-03-21 20:16:09 debug snmp reader cache warm: [10.12.32.61]
[89980] 2024-03-21 20:16:09 debug [10.12.32.61:161] try_connect with v: 3, t: 0.2, r: 0, class: SNMP::Info::Layer3::CiscoSwitch, comm: <hidden>
[89980] 2024-03-21 20:16:10 debug => running workers for phase: store
[89980] 2024-03-21 20:16:10 debug -> run worker store/0 "save macs to database"
[89980] 2024-03-21 20:16:10 debug  [10.12.32.61] macsuck 24:f2:7f:c7:5b:be - port GigabitEthernet0/1/0 has undiscovered neighbor 10.17.193.16
[89980] 2024-03-21 20:16:10 debug  [10.12.32.61] macsuck - port GigabitEthernet0/1/0 vlan 10 : 1 nodes
[89980] 2024-03-21 20:16:10 debug  [10.12.32.61] macsuck - stored 1 forwarding table entries
[89980] 2024-03-21 20:16:10 debug  [10.12.32.61] macsuck - removed 0 fwd table entries to archive
[89980] 2024-03-21 20:16:10 debug => running workers for phase: late
[89980] 2024-03-21 20:16:10 debug -> run worker late/0 "macsuck::hooks"
[89980] 2024-03-21 20:16:10 debug  [10.12.32.61] hooks - 0 queued
[89980] 2024-03-21 20:16:10  info macsuck: finished at Thu Mar 21 21:16:10 2024
[89980] 2024-03-21 20:16:10  info macsuck: status done: Ended macsuck for 10.12.32.61

One big difficulty (the root cause is the Cisco devices) is the need to launch all the commands at once, because the SSH session is dead after the capture function has been used, so calling that function again generates an error.
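
One way to work with a single-use session is to send every command in one batch and split the captured transcript afterwards. The Python sketch below shows only the splitting step and is not taken from this PR. It assumes the device echoes each command right after its prompt, and the prompt string, the commands, and the transcript are invented sample data.

```python
# Assumed transcript of one session in which all commands were sent together.
TRANSCRIPT = """\
router#terminal length 0
router#show interfaces status
Port      Status
Gi0/1/0   connected
router#show mac address-table
  10    24f2.7fc7.5bbe    DYNAMIC     Gi0/1/0
router#exit
"""

COMMANDS = ["show interfaces status", "show mac address-table"]


def split_transcript(transcript, prompt, commands):
    """Map each wanted command to the output lines that followed its echo."""
    wanted = set(commands)
    sections = {}
    current = None
    for line in transcript.splitlines():
        stripped = line.strip()
        if stripped.startswith(prompt):
            # A prompt line starts a new section; unwanted commands are skipped.
            echoed = stripped[len(prompt):].strip()
            current = echoed if echoed in wanted else None
            if current is not None:
                sections[current] = []
            continue
        if current is not None:
            sections[current].append(line)
    return {cmd: "\n".join(lines) for cmd, lines in sections.items()}


sections = split_transcript(TRANSCRIPT, "router#", COMMANDS)
```

This breaks down when command output itself contains the prompt string, which is part of why prompt handling is usually left to a dedicated library.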

@rc9000
Member

rc9000 commented Mar 21, 2024

> One big difficulty (root cause is Cisco devices) is the need to launch all the cmd in one time

There were also various issues in IOSXR.pm over the years. The current version uses Expect instead of capture(), which seemed to work fine in the end.

@ollyg
Member

ollyg commented Mar 22, 2024

I hesitate strongly to say this ... has Net::Appliance::Session been tried? Unfortunately I have no devices to test on, here.

@rc9000
Member

rc9000 commented Mar 22, 2024

> I hesitate strongly to say this ... has Net::Appliance::Session been tried? Unfortunately I have no devices to test on, here.

I can confirm that Net::Appliance::Session works well with pretty much everything that Cisco has sold in the last two decades :) IIRC in the first couple SSHCollector modules I didn't use it because the dependencies were a pain in our airgapped DC equipped with quite old RHEL servers, Expect was just there.

Nowadays everybody seems to have Internet-straight-to-critical-infrastructure pipelines with Docker, Artifactory and DevopsThisAndThat, so that should not be an issue anymore.

@ollyg
Member

ollyg commented Mar 25, 2024

This patch is fine. The reason I've not merged it yet is that I wanted to see if there's a way to let Netdisco gather the interface status via SNMP and cache that, then let the SSH mac-address table command run, and then combine the two at the end.
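
The combine step described here could look roughly like the Python sketch below. It only illustrates the idea and is not Netdisco code: the cached status mapping, the entry layout, and the `combine` function are hypothetical, and it assumes both sources use the same port naming.

```python
def combine(port_status, mac_entries):
    """Keep only MAC entries seen on ports that the status cache reports as up."""
    return [entry for entry in mac_entries if port_status.get(entry["port"]) == "up"]


# Hypothetical cached SNMP result: port name -> operational status.
port_status = {"GigabitEthernet0/1/0": "up", "GigabitEthernet0/1/1": "down"}

# Hypothetical entries parsed from the SSH MAC table command.
mac_entries = [
    {"vlan": 10, "mac": "24:f2:7f:c7:5b:be", "port": "GigabitEthernet0/1/0"},
    {"vlan": 20, "mac": "00:11:22:33:44:55", "port": "GigabitEthernet0/1/1"},
]

usable = combine(port_status, mac_entries)
```

Entries on ports missing from the cache are dropped as well, since `dict.get` returns `None` for them.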

@ollyg
Member

ollyg commented Mar 26, 2024

Actually @rc9000, I wonder why we implemented it as Discover/PortProperties/PortAccessEntity.pm instead of Discover/PortAccessEntity.pm in #937 ... the latter would have allowed it to still run with SNMP, I think.

@earendilfr
Contributor Author

I can try to update the PR to replace the use of the Net::SSH module with Expect or Net::Appliance::Session.
If I read the "Net::Appliance::Session" documentation correctly, its advantage is that it removes the need to detect the prompt correctly (I have done some work on RANCID and, clearly, prompt detection is a real nightmare 😨)

@ollyg
Member

ollyg commented Apr 13, 2024

note to self @ollyg: remove macs on forbidden vlans (sanity_vlans)
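
As a rough illustration of that note, the Python sketch below drops parsed MAC entries whose VLAN is in a forbidden set. It is not Netdisco's `sanity_vlans` logic: the VLAN numbers, the entry layout, and the function name are placeholders chosen for the example.

```python
# Illustrative placeholder set only; not the list Netdisco actually uses.
FORBIDDEN_VLANS = frozenset({0, 4095})


def drop_forbidden_vlans(entries, forbidden=FORBIDDEN_VLANS):
    """Return the entries whose VLAN is not in the forbidden set."""
    return [entry for entry in entries if entry["vlan"] not in forbidden]


# Hypothetical parsed entries.
parsed = [
    {"vlan": 10, "mac": "24:f2:7f:c7:5b:be", "port": "GigabitEthernet0/1/0"},
    {"vlan": 4095, "mac": "00:11:22:33:44:55", "port": "GigabitEthernet0/1/1"},
]

kept = drop_forbidden_vlans(parsed)
```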

@ollyg
Member

ollyg commented Apr 15, 2024

replaced with #1202

@ollyg ollyg closed this Apr 15, 2024
@ollyg
Member

ollyg commented Apr 15, 2024

Hi @earendilfr, thanks for the patch! Because there were quite significant changes I wanted to make (removing the interface status checks), it was easier to make a new branch and PR, which is over at #1202.

ollyg added a commit that referenced this pull request Apr 22, 2024
* Implement of a MAC collector through SSH

For devices that doesn't support the walk through SNMP with multiple
VLANs

* fix typo in SSH transport macsuck

* update macsuck ssh to remove interface update and add sanity/debug

* update IOS SSH collector to remove interfaces and add safeguarding

* fix typo syntax error

* fall back to provided port abbreviation if not known

* add example output for macsuck and change regexp to allow zero numbers on port name

* fix another typo in the worker

* missing dependency

---------

Co-authored-by: earendilfr <earendil@toleressea.fr>
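
The "fall back to provided port abbreviation if not known" item in the commit message above can be pictured with the Python sketch below. It is an assumption-laden illustration, not the merged Perl code: the abbreviation table, the regular expression, and the `expand_port` name are all hypothetical.

```python
import re

# Hypothetical abbreviation table; the real collector may know other prefixes.
ABBREVIATIONS = {
    "Fa": "FastEthernet",
    "Gi": "GigabitEthernet",
    "Te": "TenGigabitEthernet",
    "Po": "Port-channel",
}

# Alphabetic prefix followed by slash-separated numbers, e.g. Gi0/1/0.
PORT = re.compile(r"^([A-Za-z-]+)(\d+(?:/\d+)*)$")


def expand_port(name):
    """Expand a known prefix; return the name as provided when it is unknown."""
    match = PORT.match(name)
    if not match:
        return name
    prefix, numbers = match.groups()
    return ABBREVIATIONS.get(prefix, prefix) + numbers


expanded = [expand_port(p) for p in ("Gi0/1/0", "Po1", "Xx0/0")]
```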