Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear vxlan.vlans for nodes with no vlans (e.g. EVPN RR) #466

Closed

Conversation

jbemmel
Copy link
Collaborator

@jbemmel jbemmel commented Sep 19, 2022

A node that has the evpn module enabled but not the vlan module ( e.g. EVPN RR ) hits an error due to undefined vlan.id values. This causes confusing errors about evpn.evi values that are based on those ( not int but {})

Fixes #417

… 'vlan' module is not enabled

This leads to undefined vlan.id values, causing confusing errors about evpn.evi values that are based on those
Copy link
Owner

@ipspace ipspace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there's no topology describing the challenge you're facing i'm trying to reverse-engineer what's going on. If I got it right, you have a node with VXLAN-enabled VLANs (+ EVPN) but without the VLAN module, so the VLANs have no vlan.id.

Should that be the case, the error should be reported in the VXLAN module, not in EVPN module, and if VXLAN module is not present, then we have to start discussing what to do with vxlan.vlans attribute which could only happen on a node, because globally that would be detected.

@jbemmel
Copy link
Collaborator Author

jbemmel commented Sep 19, 2022

The error happens with only a global vxlan.vlans defined, on a node with vxlan enabled but none of the listed vlans.
node.vlans = [] but node.vxlan.vlans == < global list >

Then when the evpn module iterates over vxlan.vlans, it errors out

One way to fix this, is for the vxlan module to remove vxlan.vlans when node.vlans is empty

@jbemmel jbemmel changed the title Add a check and error out when a node has VXLAN-enabled vlans without having the 'vlan' module enabled Clear vxlan.vlans for nodes with no vlans Sep 19, 2022
@ipspace
Copy link
Owner

ipspace commented Sep 20, 2022

Please add the topology that generates this. I'm worried what the root cause of this behavior is, and that we're not just papering over the symptoms.

@jbemmel jbemmel changed the title Clear vxlan.vlans for nodes with no vlans Clear vxlan.vlans for nodes with no vlans (e.g. EVPN RR) Sep 20, 2022
@ipspace
Copy link
Owner

ipspace commented Sep 20, 2022

Thanks for an excellent test topology -- it nailed it ;)

I knew there was something fishy about the whole thing. VLAN module is a requirement for VXLAN module, and yet you managed to create a topology where a node has VXLAN module, but not VLAN module. It's either a SNAFU I created with 31bf8e1 (shame on me), or another edge case we haven't identified yet.

In any case, this needs to be fixed -- obviously we need to check module dependencies for every node, as we might have a situation where the global list of modules is consistent, but a particular node combo violates the requirements.

I'll fix it, just give me a few hours.

Finally, as I keep saying: focus on the root cause of the problem, not a quick fix somewhere else that will make the current lab you're working on work. The root cause was the shaky module dependency check, and fixing the VXLAN attributes in EVPN module would just kick the can down the road.

Last but not least, now that I know what the root cause is, this is the minimum topology I'll add to the test cases:

defaults.device: eos

nodes:
  r1:
    module: [ vlan, vxlan ]
  r2:
    module: [ vxlan ]

@jbemmel
Copy link
Collaborator Author

jbemmel commented Sep 20, 2022

I still think it's better to also clear vxlan.vlans when no vlans match - even when the vlan module is correctly applied, and it seems to me that the vxlan module is the right place to do that.

Agree that there is the underlying issue of module dependency checks too

@ipspace
Copy link
Owner

ipspace commented Sep 21, 2022

I still think it's better to also clear vxlan.vlans when no vlans match - even when the vlan module is correctly applied, and it seems to me that the vxlan module is the right place to do that.

Agree. #476 has been fixed.

Agree that there is the underlying issue of module dependency checks too

Could you please check whether #475 and #476 solved what this PR is addressing?

@ipspace
Copy link
Owner

ipspace commented Sep 25, 2022

Based on lack-of-response I'm guessing that this is fixed.

@ipspace ipspace closed this Sep 25, 2022
@jbemmel jbemmel deleted the evpn-check-for-missing-vlan-module branch September 26, 2022 12:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants