Merged EVPN VxLAN MH HLD from Cisco and BCM #1702
@adyeung, could you please add me to the reviewers list?
3. Asymmetric IRB support
4. Support of VRRP over EVPN Multihoming. Static Anycast Gateway should be used instead.
5. EVPN Multihomed interface as router-port or routed sub-interface. EVPN Multihomed interface can only be a switchport.
Are there any Interop requirements between EVPN MH and MCLAG based multi-homing solutions?
They are mutually exclusive.
They are mutually exclusive within the same node. But my question is more about the interop between Leaf1 (EVPN-MH) and Leaf2A-Leaf2B (an MLAG pair). In this case the MLAG pair wouldn't participate in any of the EVPN MH procedures.
That is correct. MC-LAG and EVPN-MH use different mechanisms to exchange information (ICCP vs BGP).
```
sudo config interface PortChannel2 system-mac 00:00:00:00:22:22
sudo config interface PortChannel2 evpn-esi auto-system-mac
```
The Type-3 ESI value will be derived by combining the configured system-mac address on the PortChannel interface, and the PortChannel number. The system-mac is required to be same on all of the VTEPs multihoming a given LAG.
The PortChannel number also has to be the same on all of the VTEPs multihoming the LAG, right?
If you are using ESI Type-3, the PortChannel ID is used as a local discriminator. However, ESI Type-0 can be used instead, where the whole 10 bytes are configurable; in that case the PortChannel ID is no longer relevant.
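For illustration, here is a minimal sketch of how a Type-3 ESI could be assembled, following the RFC 7432 layout (one type octet `0x03`, the 6-octet system MAC, then a 3-octet local discriminator). The function name `type3_esi` is hypothetical, and using the PortChannel number as the discriminator is an assumption taken from the discussion above, not a confirmed detail of the implementation.

```python
def type3_esi(system_mac: str, local_discriminator: int) -> str:
    """Build a Type-3 (MAC-based) ESI string using the RFC 7432 layout."""
    mac_octets = [int(part, 16) for part in system_mac.split(":")]
    if len(mac_octets) != 6 or any(not 0 <= o <= 0xFF for o in mac_octets):
        raise ValueError(f"invalid system MAC: {system_mac!r}")
    if not 0 <= local_discriminator <= 0xFFFFFF:
        raise ValueError("local discriminator must fit in 3 octets")
    # Type octet, then the MAC, then the discriminator in network byte order.
    octets = [0x03] + mac_octets + list(local_discriminator.to_bytes(3, "big"))
    return ":".join(f"{o:02x}" for o in octets)


# Hypothetical example: PortChannel2 with the system-mac from the hunk above.
print(type3_esi("00:00:00:00:22:22", 2))
```

Because both inputs feed the ESI, two VTEPs only arrive at the same Type-3 value when they agree on the system MAC and on the discriminator.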
The BUM (Broadcast, Unknown Unicast, and Multicast) traffic in EVPN Multihoming network requires different handling depending on whether BUM traffic is originated from VTEP participating or not participating in the EVPN Multihoming.

##### 1.2.1.1.1 Local-bias and split-horizon filtering
If the BUM traffic originates from a device attached to an EVPN Ethernet-Segment, local-bias procedure is applied and BUM traffic is flooded on all locally attached multi-homed and single-homed devices. The BUM traffic is also replicated to the remote VTEPs, and split-horizon filtering is applied such that traffic is not replicated to any of the devices that are attached to the shared EVPN Ethernet-Segment.
Can you please mention that this applies only to all-active MH?
Local bias is used with VxLAN because the encapsulation carries no indication that incoming traffic is BUM, nor an easy way to identify the corresponding Ethernet-Segment. In MPLS, it is possible to apply ESI filtering to BUM traffic; this cannot be done for VxLAN. That said, the method also works for other load-balancing modes, including single-active and port-active. The initial design does not support them yet.
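The local-bias and split-horizon behaviour described in this thread can be sketched roughly as below. This is only an illustrative model: `AccessPort` and both function names are hypothetical, the addresses are made up, and designated-forwarder filtering (which additionally restricts BUM traffic arriving from VTEPs that do not share the Ethernet-Segment) is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPort:
    name: str
    # Remote VTEPs sharing this port's Ethernet-Segment (empty if single-homed).
    es_peer_vteps: set = field(default_factory=set)


def egress_ports_for_local_bum(ports, ingress_port):
    """Local bias: flood to every local access port except the ingress port."""
    return [p.name for p in ports if p.name != ingress_port]


def egress_ports_for_remote_bum(ports, source_vtep):
    """Split horizon: skip ports whose Ethernet-Segment is shared with the source VTEP."""
    return [p.name for p in ports if source_vtep not in p.es_peer_vteps]


ports = [
    AccessPort("PortChannel1", {"10.0.0.2"}),
    AccessPort("PortChannel2", {"10.0.0.3"}),
    AccessPort("Ethernet8"),
]
# BUM arriving over VxLAN from 10.0.0.2 is not sent back to PortChannel1,
# because 10.0.0.2 already flooded its own member of that Ethernet-Segment.
print(egress_ports_for_remote_bum(ports, "10.0.0.2"))
```

The filter keys on the source VTEP address rather than on an ESI label, which is why this approach fits VxLAN, where the data plane carries no per-segment identifier.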
#### 1.2.1.2 L2 Unicast forwarding in EVPN MH
The control-plane and forwarding-plane handling of MAC and local adjacency (ARP/ND) learnt on the Multihomed Ethernet-Segments significantly differs from the single homed Ethernet-Segments in the EVPN VxLAN network.

Each VTEP advertises an EVPN Type-1 (per ES/EAD) route per local Ethernet-Segment. These VTEPs also advertise an EVPN Type-2 (MAC/IP) route per locally learnt adjacency (ARP/ND). Remote VTEPs perform EVPN route resolution based on the matching ESI value carried as part of these routes. For each Ethernet-Segment, a unique L2 Next-hop group (NHG) is formed that contains the participating VTEPs. All processed EVPN Type-2 routes point to that L2 NHG on matching ESI.
Could you please add a section to capture the new BGP EVPN routes (Type-1 and Type-4) being supported?
Will do
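Until that section exists, the resolution described in the hunk at the top of this thread (Type-1 routes forming a per-ES L2 next-hop group, Type-2 routes pointing at it by ESI) can be sketched as follows. The function names and the tuple-based route representation are hypothetical simplifications, not FRR or SONiC data structures.

```python
from collections import defaultdict


def build_l2_nhgs(ead_routes):
    """Group the VTEPs advertising each ESI (from Type-1 routes) into one NHG."""
    nhgs = defaultdict(set)
    for esi, vtep in ead_routes:
        nhgs[esi].add(vtep)
    return dict(nhgs)


def resolve_macs(mac_routes, nhgs):
    """Point each MAC from a Type-2 route at the NHG of its ESI, if one exists."""
    return {mac: sorted(nhgs[esi]) for mac, esi in mac_routes if esi in nhgs}


esi = "03:00:00:00:00:22:22:00:00:02"
nhgs = build_l2_nhgs([(esi, "10.0.0.1"), (esi, "10.0.0.2")])
# The MAC resolves to both VTEPs attached to the Ethernet-Segment.
print(resolve_macs([("00:aa:bb:cc:dd:01", esi)], nhgs))
```

In this model a Type-2 route whose ESI has no matching Type-1 route is simply left unresolved.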
Hello @pbrisset
I represent a team working on the EVPN Multihoming implementation, and we have left some comments on this PR.
Could you please check them? We think they are crucial for the implementation.
Thank you
**EVPN_ETHERNET_SEGMENT**

```
;New table
;Specifies EVPN Ethernet Segment interface
;
; Status: stable
key = EVPN_ETHERNET_SEGMENT|"ifname"if_id
; field = value
esi = "AUTO" or es_id
; es_id is 10 byte colon (:) separated string when type = "TYPE_0_OPERATOR_CONFIGURED".
; Otherwise, esi value should be "AUTO" falling back on ESI type 1 or 3.
type = esi_type
; esi_type should be string with one of the below values:
; "TYPE_0_OPERATOR_CONFIGURED" for Type-0 ESI
; "TYPE_1_LACP_BASED" for Type-1 ESI
; "TYPE_3_MAC_BASED" for Type-3 ESI
ifname = "ifname"if_id
; if_id is the interface identifier. Same value as in the key.
; e.g., PortChannel1, where the ifname is PortChannel and the if_id is 1.
df_pref = 1*5DIGIT
; Designated-Forwarder election preference for this router in (1..65535) range.
; Default=32767.
```
The EVPN_ETHERNET_SEGMENT table does not map onto existing FRR commands and is missing required functionality: the MAC and ID for a Type-3 ESI. Both items are mandatory in FRR for a Type-3 ES to work. Also, to our knowledge there is no Type-1 ESI support in FRR right now or in any PR. So even if everything else is ready to resolve the ESI through LACP, FRR is not ready at all.
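To make the gap concrete, the sketch below checks one entry against the schema exactly as quoted in the hunk. The helper name `validate_es_entry` is hypothetical; the field names and allowed values are taken from the schema. Note that nothing in the entry carries a system MAC or a local discriminator for Type-3, which is the point raised above.

```python
import re

ESI_TYPES = {"TYPE_0_OPERATOR_CONFIGURED", "TYPE_1_LACP_BASED", "TYPE_3_MAC_BASED"}
# Ten colon-separated hexadecimal bytes, as the schema requires for Type-0.
ESI_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){9}$")


def validate_es_entry(entry: dict) -> list:
    """Return a list of problems found in one EVPN_ETHERNET_SEGMENT entry."""
    problems = []
    esi_type = entry.get("type")
    esi = entry.get("esi", "")
    if esi_type not in ESI_TYPES:
        problems.append(f"unknown type: {esi_type!r}")
    elif esi_type == "TYPE_0_OPERATOR_CONFIGURED":
        if not ESI_RE.match(esi):
            problems.append("Type-0 esi must be 10 colon-separated bytes")
    elif esi != "AUTO":
        problems.append('esi must be "AUTO" for Type-1 and Type-3')
    # The schema gives 32767 as the default preference.
    df_pref = int(entry.get("df_pref", 32767))
    if not 1 <= df_pref <= 65535:
        problems.append("df_pref must be in 1..65535")
    return problems


print(validate_es_entry({"ifname": "PortChannel1", "type": "TYPE_3_MAC_BASED", "esi": "AUTO"}))
```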
```
curl -X PATCH "https://SWITCH_IP:9090/restconf/data/openconfig-network-instance:network-instances/network-instance=default/evpn/evpn-mh/config" -H "accept: */*" -H "Content-Type: application/yang-data+json" -d "{\"openconfig-network-instance:config\":{\"startup-delay\":150,\"mac-holdtime\":200,\"neigh-holdtime\":250}}"
```
<a id="4-Flow-Diagrams"></a>
# 4 Flow Diagrams
The Flow Diagrams section names and implements the feature in several ways that differ from the rest of the document. We think these diagrams are outdated artifacts from the #1638 requirements and contradict the new requirements.
### 3.3.8 L2nhgorch
- New Orchagent class being introduced.
- Subscription to different tables
  - APP DB - L2_NEXTHOP_GROUP, L2_NEXTHOP_GROUP_MEMBER
  - CONFIG DB - EVPN_ES_INTERFACE.
- Subscription to L2_NEXTHOP_GROUP
  - Creates/deletes the L2_ECMP_GROUP ASIC DB table.
  - Interacts with PortsOrch to create a bridge-port object of type L2_ECMP_GROUP.
- Subscription to L2_NEXTHOP_GROUP_MEMBER
  - Creates/deletes the L2_ECMP_GROUP_MEMBER ASIC DB tables entries on the L2_NEXTHOP_GROUP APP DB table updates.
The L2_NEXTHOP_GROUP schema in APP_DB differs between the "DB Changes" section and the "Impact to existing modules" section (the latter mentions a separate table for members). We propose treating the schema in "DB Changes" as the correct approach and rewriting the "L2nhgorch" subsection accordingly.
**EVPN_SPLIT_HORIZON_TABLE**

```
; New table
; Specifies split-horizon filtering source VTEPs for EVPN Multihomed interface
; Producer: fpmsyncd
; Consumer: shlorch
; Status: stable
key = EVPN_SPLIT_HORIZON_TABLE:"Vlan"vlan_id:"ifname"if_id
; vlan_id is 1-4 DIGIT Vlan ID.
; if_id is interface number
; field = value
vteps = vtep_list
; String of Comma(,) separated list of VTEP IP addresses.
```

**EVPN_DF_TABLE**
```
; New table
; Specifies designated-forwarder election for this router.
; Producer: fpmsyncd
; Consumer: evpnmhorch
; Status: stable
key = EVPN_DF_TABLE:"Vlan"vlan_id:"ifname"pif_id
; vlan_id is 1-4 DIGIT VLAN ID.
; if_id is interface number
; field = value
df = True
```
The EVPN_SPLIT_HORIZON_TABLE table is converted to an SAI Isolation Group. EVPN_DF_TABLE adds a new attribute to the Bridgeport attributes (of the L2_ECMP_GROUP type): SAI_BRIDGE_PORT_ATTR_NON_DF. In neither case is a binding to a VLAN necessary.
### 3.3.9 EvpnMhOrch
A new EvpnMhOrch class will be introduced to perform the following activities:

1. Subscribe to EVPN_MH_GLOBAL config table and set the SAI switch attribute.
2. Subscribe to EVPN_ES_INTERFACE config table updates and maintain internal cache.
3. Provide a set of APIs to check if the given access interface is ES associated or not.
4. Subscribe to LAG_TABLE oper status changes and trigger MAC update requests to FdbOrch.
We think a new rule is needed:
Subscribe to EVPN_DF_TABLE from APP-DB and set the SAI bridge port attribute SAI_BRIDGE_PORT_ATTR_NON_DF on the relevant Bridgeport.
sudo config interface <interface name> sys-mac XX:XX:XX:XX:XX:XX
sudo config interface PortChannel1 evpn-esi {XX:XX:XX:XX:XX:XX:XX:XX:XX:XX | auto-lacp | auto-system-mac}
sudo config interface PortChannel1 evpn-df-pref (1-65535)
- These commands do not follow the "config interface" command syntax (https://github.com/sonic-net/sonic-utilities/blob/0e6a55ef5eac306ef61d6f0241625a6baee42ab8/doc/Command-Reference.md?plain=1#L5227)
- The "sudo config interface PortChannel1 evpn-esi {XX:XX:XX:XX:XX:XX:XX:XX:XX:XX | auto-lacp | auto-system-mac}" command does not allow configuring a local discriminator for a Type-3 ESI. Also, it looks like there is no Type-1 (auto-lacp) ESI support in FRR. (NOTE: this comment is connected to the one on the EVPN_ETHERNET_SEGMENT table.)
Proposed commands:
sudo config interface sys-mac <interface_name> XX:XX:XX:XX:XX:XX
sudo config interface evpn-esi <interface_name> {type-0 XX:XX:XX:XX:XX:XX:XX:XX:XX:XX | type-3 (1-16777215)}
sudo config interface evpn-df-pref <interface_name> (1-65535)
### 2.2.4 Linux Kernel Support
EVPN Multihoming feature requires L2 next-hop group (NHG) support in the Linux kernel that is available in v5.10.
EVPN Multihoming feature will not be supported in SONiC releases running 4.x kernel versions.
According to RFC 8365, Section 8.3.3:
"VXLAN-GPE encapsulation needs to be used between these PEs and the ingress PE needs to set the BUM Traffic Bit (B bit) to indicate that this is an ingress-replicated BUM traffic."
Might this need an additional patch to be supported in v5.10?
In the current implementation, does all the functionality already work on the VS (virtual switch) platform?
Yes, it was tested on the VS platform. There is also work in progress to align the ESI type logic with the current requirements.
This is the result of merging Cisco and BCM HLDs.
Cisco HLD
BCM HLD