EVPN for VxLAN #619

Closed
wants to merge 49 commits into
base: master
from

10 participants
@vivek-cumulus
Contributor

vivek-cumulus commented May 26, 2017

This series of commits implements EVPN for VxLAN (as per RFC 7432 and draft-ietf-bess-evpn-overlay) to provide the standard EVPN L2 functionality for a VTEP/NVE. It comprises the following key functions:

  • Interface with the kernel (only supported over netlink) to learn about Layer-2 interfaces (bridges and their members), MAC addresses and IP neighbors (ARP/ND)
  • VNI hash object with MAC and neighbor tables maintained against a VNI
  • Interaction between zebra and bgpd to exchange VNI/VTEP information and local/remote MACs and MAC+IP
  • BGP/EVPN protocol implementation to create, exchange and process EVPN type-2 (MAC+IP) and type-3 (Multicast) routes
  • Support for MAC/VM mobility
  • Support for exchange of sticky (static) MACs
  • Basic (essential) configuration to enable EVPN (i.e., learn local VNI/MAC/IP and advertise them); optional configuration for per-VNI parameters (RD and RTs)
  • Operational commands in bgpd and zebra
  • Essential extensions to debug logs
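
To make the type-3 (Multicast) route mentioned in the list above more concrete, here is a minimal Python sketch of how its NLRI is laid out on the wire according to RFC 7432, section 7.3. It is an illustration only and is not taken from this patch set; the function names and the sample values (VTEP address, RD number) are assumptions.

```python
import ipaddress
import struct

EVPN_ROUTE_TYPE_IMET = 3  # Inclusive Multicast Ethernet Tag route (RFC 7432)


def encode_rd_type1(router_id: str, local_id: int) -> bytes:
    """Type-1 Route Distinguisher: 2-byte type, IPv4 address, 2-byte number."""
    return struct.pack("!H4sH", 1, ipaddress.IPv4Address(router_id).packed, local_id)


def encode_type3_nlri(rd: bytes, eth_tag: int, originator: str) -> bytes:
    """Route type, length, then RD + Ethernet Tag + IP length (in bits) + IP."""
    ip = ipaddress.ip_address(originator).packed
    body = rd + struct.pack("!IB", eth_tag, len(ip) * 8) + ip
    return struct.pack("!BB", EVPN_ROUTE_TYPE_IMET, len(body)) + body


# Hypothetical values: VTEP 10.224.200.10, RD local number 2, Ethernet Tag 0.
nlri = encode_type3_nlri(encode_rd_type1("10.224.200.10", 2), 0, "10.224.200.10")
print(nlri.hex())
```

The length octet covers everything after itself, so an IPv4 originator gives 17 bytes of route data and an IPv6 originator gives 29.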

The features implemented in this patch set are sufficient to realize basic inter-subnet routing also using the asymmetric mode (draft-ietf-bess-evpn-inter-subnet-forwarding), but additional refinements are necessary for specific scenarios (e.g., centralized gateways) as well as for symmetric mode.

vivek-cumulus added some commits May 15, 2017

lib: VLAN definition
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
lib: VxLAN Network Identifier definition
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
lib: Macro for number of entries in hash table
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
lib: Additional APIs in bitfield library
Added APIs to:
a) pre-assign 0th bit in the bitfield
b) free 0th bit in the bitfield
c) free the allocated bitfield data

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by:   Vivek Venkatraman <vivek@cumulusnetworks.com>
lib: Define generic IP address structure
Define an IP address structure which is a union of an IPv4 and IPv6
address. This is for subsequent use in EVPN.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
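
As a rough illustration of the "generic IP address" idea in the commit above (one value that can hold either an IPv4 or an IPv6 address, tagged with which one it is), here is a small Python sketch. The actual structure in lib/ is a C union; the class and field names below are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from enum import Enum


class AddrType(Enum):
    NONE = 0
    V4 = 4
    V6 = 6


@dataclass(frozen=True)
class GenericIp:
    kind: AddrType
    raw: bytes  # 4 bytes for V4, 16 bytes for V6, empty for NONE

    @classmethod
    def parse(cls, text: str) -> "GenericIp":
        addr = ipaddress.ip_address(text)
        kind = AddrType.V4 if addr.version == 4 else AddrType.V6
        return cls(kind, addr.packed)

    def __str__(self) -> str:
        if self.kind is AddrType.NONE:
            return ""
        return str(ipaddress.ip_address(self.raw))


# Callers can carry either family without caring which one it is.
for text in ("10.224.200.5", "2001:db8::1"):
    ip = GenericIp.parse(text)
    print(ip.kind.name, ip)
```
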
lib: Refine EVPN prefix definition
Modify EVPN prefix to use the generic IP address structure. Add support
for EVPN type-2 and type-3 prefix dump. Fix references to modified fields
as needed.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
zebra: New API for filling netlink attribute
Define addattr16().

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
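
For readers unfamiliar with netlink attributes, the sketch below shows the wire layout that a 16-bit attribute helper has to produce: a 4-byte `rtattr` header (length, type) followed by the payload, padded to a 4-byte boundary. This is a Python illustration of the format only, not zebra's addattr16(); the function name and the sample attribute type are assumptions.

```python
import struct

RTA_ALIGNTO = 4
RTA_HDRLEN = 4  # sizeof(struct rtattr): u16 rta_len + u16 rta_type


def rta_align(length: int) -> int:
    """Round up to the netlink attribute alignment."""
    return (length + RTA_ALIGNTO - 1) & ~(RTA_ALIGNTO - 1)


def pack_attr_u16(attr_type: int, value: int) -> bytes:
    """Encode one attribute carrying a 16-bit value in host byte order."""
    payload = struct.pack("=H", value)
    rta_len = RTA_HDRLEN + len(payload)  # rta_len excludes trailing padding
    attr = struct.pack("=HH", rta_len, attr_type) + payload
    return attr.ljust(rta_align(rta_len), b"\0")


# Hypothetical attribute type 1 carrying the value 100 (e.g. a VLAN id).
attr = pack_attr_u16(1, 100)
print(len(attr), attr.hex())
```
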
zebra: Set nlmsg_pid in netlink_talk()
While it is not essential to set this, it seems a good thing to do.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
zebra: Format netlink requests correctly
When zebra issues read (GET) requests to the kernel using the netlink
interface, it is incorrect to format all of them in a generic manner
using 'struct ifinfomsg' or 'struct rtgenmsg'. Rather, messages for a
particular entity (e.g., routes) should use the corresponding structure
for encoding (e.g., 'struct rtmsg'). Of course, this has to correlate
with what the kernel expects.

In the absence of this, there is the possibility of sending extraneous
information in the request which the kernel wouldn't like.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   David Ahern <dsa@cumulusnetworks.com>
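
The point of the commit above can be sketched as follows: each dump request carries the family-specific header the kernel expects for that message type, rather than one generic structure for everything. This Python sketch only packs the bytes; it is not zebra's code, the function name is hypothetical, and the field layouts are those of `struct ifinfomsg`, `struct rtmsg` and `struct ndmsg` in the Linux UAPI headers.

```python
import struct

NLM_F_REQUEST = 0x01
NLM_F_DUMP = 0x300  # NLM_F_ROOT | NLM_F_MATCH
RTM_GETLINK, RTM_GETROUTE, RTM_GETNEIGH = 18, 26, 30
AF_UNSPEC = 0
NLMSG_HDRLEN = 16

# Request body per message type; the first field is always the address family.
_BODY_FORMATS = {
    RTM_GETLINK: "=BBHiII",      # struct ifinfomsg (16 bytes)
    RTM_GETROUTE: "=BBBBBBBBI",  # struct rtmsg (12 bytes)
    RTM_GETNEIGH: "=BBHiHBB",    # struct ndmsg (12 bytes)
}


def build_dump_request(msg_type: int, seq: int, family: int = AF_UNSPEC, pid: int = 0) -> bytes:
    """Build a GET/dump request using the body that matches msg_type."""
    fmt = _BODY_FORMATS[msg_type]
    nfields = len(fmt) - 1  # one struct field per format character after "="
    body = struct.pack(fmt, family, *([0] * (nfields - 1)))
    header = struct.pack(
        "=IHHII", NLMSG_HDRLEN + len(body), msg_type, NLM_F_REQUEST | NLM_F_DUMP, seq, pid
    )
    return header + body


for name, mtype in (("link", RTM_GETLINK), ("route", RTM_GETROUTE), ("neigh", RTM_GETNEIGH)):
    print(name, len(build_dump_request(mtype, seq=1)))
```
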
zebra: Layer-2 interface handling
Define interface types of interest and recognize the types. Store layer-2
information (VLAN Id, VNI etc.) for interfaces, process bridge interfaces
and map bridge members to bridge. Display all the additional information
to user (through "show interface").

Note: Only implemented for the netlink interface.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
zebra: VNI and VTEP definition
Define the base data structures for a VxLAN Network Identifier (VNI) and
VxLAN Tunnel End Point (VTEP). These will be used by the EVPN function.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
zebra: VNI and VTEP handling
Implement fundamental handling for VNIs and VTEPs:
- Handle EVPN enable/disable by client (advertise-all-vni)
- Create/update/delete VNIs based on VxLAN interface events and inform
client
- Handle VTEP add/delete from client and install into kernel
- New debug command for VxLAN/EVPN
- kernel interface (Linux/netlink only)

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
zebra: MAC and Neighbor hash table definition
Define the MAC and Neighbor (ARP/ND) data structures. These are maintained
as hash tables against the VNI. Also, define context structures used for
performing various operations on these two tables.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
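
The relationship described above (a table of VNIs, each owning its own MAC table and neighbor table) can be pictured with the following Python sketch. It is a simplified model, not zebra's data structures; all class, field and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MacEntry:
    mac: str
    local: bool
    sticky: bool = False
    remote_vtep: Optional[str] = None  # set only for remote MACs


@dataclass
class NeighEntry:
    ip: str
    mac: str
    local: bool


@dataclass
class Vni:
    vni: int
    mac_table: Dict[str, MacEntry] = field(default_factory=dict)
    neigh_table: Dict[str, NeighEntry] = field(default_factory=dict)


vni_table: Dict[int, Vni] = {}  # keyed by VNI number


def vni_get_or_create(vni_id: int) -> Vni:
    return vni_table.setdefault(vni_id, Vni(vni_id))


def add_local_mac(vni_id: int, mac: str, sticky: bool = False) -> MacEntry:
    entry = MacEntry(mac=mac, local=True, sticky=sticky)
    vni_get_or_create(vni_id).mac_table[mac] = entry
    return entry


def add_local_neigh(vni_id: int, ip: str, mac: str) -> NeighEntry:
    entry = NeighEntry(ip=ip, mac=mac, local=True)
    vni_get_or_create(vni_id).neigh_table[ip] = entry
    return entry


add_local_mac(100, "ce:8c:b2:ed:89:69")
add_local_neigh(100, "10.0.0.10", "ce:8c:b2:ed:89:69")  # hypothetical host address
print(vni_table[100])
```
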
zebra: MAC and Neighbor (ARP/ND) handling
Implement handling of MACs and Neighbors (ARP/ND entries) in zebra:
- MAC and Neighbor database handlers
- Read MACs and Neighbors from the kernel, when needed and create
entries in zebra's MAC and Neighbor databases.
- Handle add/update/delete notifications from the kernel for MACs and
Neighbors and update zebra's database appropriately
- Inform locally learnt MACs and Neighbors to client
- Handle MACIP add/delete from client and install appropriate entries
into the kernel
- Since Neighbor entries will be installed on an SVI, implement the
needed mappings

NOTE: kernel interface is only implemented for Linux/netlink

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
zebra: EVPN/VxLAN UI definitions and handling
Implement various UI (vty) commands for EVPN/VxLAN.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
bgpd: Refine extended community handling
Define helper functions to form different kinds of route targets. Also,
refine functions that encode extended communities as well as generate
a string from an extended community.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
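
As an illustration of what "forming a route target" involves, the sketch below encodes and prints a two-octet-AS route target extended community (type 0x00, sub-type 0x02, 2-byte AS, 4-byte local value). It is a Python sketch of the wire format only, not bgpd's helper functions; the names and the sample AS:value pair are assumptions.

```python
import struct

ECOMM_TYPE_AS2 = 0x00             # transitive, two-octet AS specific
ECOMM_SUBTYPE_ROUTE_TARGET = 0x02


def encode_rt_as2(asn: int, local: int) -> bytes:
    """Encode an 8-byte route target of the form AS:value."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("this form only carries a two-octet AS number")
    return struct.pack("!BBHI", ECOMM_TYPE_AS2, ECOMM_SUBTYPE_ROUTE_TARGET, asn, local)


def rt_to_str(ecomm: bytes) -> str:
    """Render an encoded two-octet-AS route target as a string."""
    etype, subtype, asn, local = struct.unpack("!BBHI", ecomm)
    if (etype, subtype) != (ECOMM_TYPE_AS2, ECOMM_SUBTYPE_ROUTE_TARGET):
        raise ValueError("not a two-octet-AS route target")
    return f"RT:{asn}:{local}"


# Hypothetical example: AS 65001 with the VNI 100 as the local value.
rt = encode_rt_as2(65001, 100)
print(rt.hex(), rt_to_str(rt))
```
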
bgpd: Separate out RD handling functions
BGP Route Distinguisher (RD) handling is common for different flavors
of BGP VPNs such as BGP/MPLS IP VPNs (RFC 4364) and BGP EVPNs (RFC 7432).
Separate out the RD handling functions into its own files.

Note: No functional change introduced with this commit.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
bgpd: Display extended communities in debug log
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
bgpd: Function to encode Encapsulation type extended community
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
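
For context, the Encapsulation extended community (RFC 5512) is an opaque extended community with sub-type 0x0c whose last two octets carry the tunnel type; VXLAN is tunnel type 8 in the IANA registry. The Python sketch below shows that layout only. It is not the bgpd function added by this commit, and the names are hypothetical.

```python
import struct

ECOMM_TYPE_OPAQUE = 0x03
ECOMM_SUBTYPE_ENCAP = 0x0C
TUNNEL_TYPE_VXLAN = 8


def encode_encap(tunnel_type: int) -> bytes:
    """Type, sub-type, 4 reserved octets (zero), 2-octet tunnel type."""
    return struct.pack("!BBIH", ECOMM_TYPE_OPAQUE, ECOMM_SUBTYPE_ENCAP, 0, tunnel_type)


print(encode_encap(TUNNEL_TYPE_VXLAN).hex())
```
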
bgpd: Fix check for martian next hops
Ensure that the check for martian next hop is correct, including for MP
nexthops, if IPv4.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
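
A martian next-hop check, in general terms, rejects addresses that can never be a valid forwarding next hop. The Python sketch below uses an assumed set of IPv4 ranges (this-network, loopback, multicast, reserved); the exact set that bgpd checks is not stated in this commit, so treat the list and the function name as illustrative only.

```python
import ipaddress

# Assumed martian ranges for illustration; not taken from bgpd.
_MARTIAN_V4 = [
    ipaddress.ip_network(net)
    for net in ("0.0.0.0/8", "127.0.0.0/8", "224.0.0.0/4", "240.0.0.0/4")
]


def is_martian_nexthop_v4(nexthop: str) -> bool:
    """Return True if the IPv4 next hop falls in one of the assumed ranges."""
    addr = ipaddress.IPv4Address(nexthop)
    return any(addr in net for net in _MARTIAN_V4)


for candidate in ("10.224.200.5", "127.0.0.1", "224.0.0.5"):
    print(candidate, is_martian_nexthop_v4(candidate))
```

The same check would apply whether the IPv4 next hop arrives in the NEXT_HOP attribute or inside an MP_REACH_NLRI attribute, which is the case the commit message calls out.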
bgpd: EVPN definitions
Define the EVPN (EVI) hash table and objects for mapping route targets to EVIs.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
bgpd: EVPN initialization and cleanup
Define the EVPN (EVI) hash table and related structures and initialize
and cleanup.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
bgpd: Fixes related to use of L2VPN/EVPN
Add checks related to AFI_L2VPN/SAFI_EVPN that were missing in some parts
of the code. Fix incorrect check skipping EVPN when sending End of RIB.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
bgpd: Fix route detailed show for EVPN
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
bgpd: Fix route handling for 2-level routing tables
For certain (sub) address families such as EVPN or L3VPN, the routing
table is organized as a 2-level tree. Ensure that code walking the
routing table does the correct handling in such cases.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
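
To illustrate the 2-level organization mentioned above, here is a Python sketch in which the outer table is keyed by Route Distinguisher and each entry holds an inner table of prefixes, while a single-level table maps prefixes directly to routes. The dictionaries stand in for bgpd's route tables and the function name is hypothetical.

```python
from typing import Any, Dict, Iterator, Optional, Tuple


def walk_routes(table: Dict[str, Any], two_level: bool) -> Iterator[Tuple[Optional[str], str, Any]]:
    """Yield (rd, prefix, route); rd is None for single-level tables."""
    if not two_level:
        for prefix, route in table.items():
            yield None, prefix, route
        return
    for rd, inner in table.items():        # first level: one node per RD
        for prefix, route in inner.items():  # second level: prefixes under that RD
            yield rd, prefix, route


# Hypothetical contents for illustration.
ipv4_unicast = {"10.0.0.0/24": "route-a"}
evpn = {
    "10.224.200.5:2": {"[3]:[0]:[32]:[10.224.200.5]": "route-b"},
    "10.224.200.10:2": {"[3]:[0]:[32]:[10.224.200.10]": "route-c"},
}

print(list(walk_routes(ipv4_unicast, two_level=False)))
print(list(walk_routes(evpn, two_level=True)))
```

Code that iterates only the outer level of an EVPN or L3VPN table would see RD nodes rather than routes, which is the class of bug this commit addresses.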
bgpd: Fix next hop setting for EVPN
The next hop for EVPN routes must be an IPv4 or IPv6 address as per
RFC 7432. Ensure this is correctly handled. Also, ensure there
are correct checks for AFI_L2VPN and nexthop AFI is not AFI_L2VPN.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
bgpd: Implement EVPN enable/disable
Implement the command 'advertise-all-vni' under the EVPN address-family
in order to allow the local system to learn about local VNIs (and MACs
and Neighbors corresponding to those VNIs) and exchange with other EVPN
speakers.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by:   Dinesh Dutt <ddutt@cumulusnetworks.com>
bgpd: Install or remove only relevant routes from zebra
Ensure that the AFI/SAFI is relevant to the FIB before attempting to install
or remove the route from zebra.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
lib: Define handlers for VNI and MACIP
Define client handlers for processing add or delete of local VNIs
and local MACIPs.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
bgpd: EVPN route handling
Core EVPN route handling functionality. This includes support for the
following:
- interface with zebra to learn about local VNIs and MACIPs as well as
to install remote VTEPs (per VNI) and remote MACIPs
- create/update/delete EVPN type-2 and type-3 routes
- attribute creation, route selection and install
- route handling per VNI and for the global routing table
- parsing of received EVPN routes and handling by route type
- encoding attributes for EVPN routes and EVPN prefix creation (for
Updates)

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by:   Daniel Walton <dwalton@cumulusnetworks.com>
@louberger

Member

louberger commented Jun 7, 2017

@vivek-cumulus

Contributor

vivek-cumulus commented Jun 8, 2017

@vincentbernat - you are correct about assumption of IPv4 in the underlay. I agree with you that it should be addressed sometime in the future.

@louberger - my implementation started prior to the "<afi> <safi>" standardization, but I am making changes to the BGP EVPN commands to be of the form "show bgp l2vpn evpn ...". I shall commit that tomorrow as well as update the evpn.md. The configuration command should already be "address-family l2vpn evpn", but I shall check. We will retain "address-family evpn" as a private change for backwards compatibility for our customers.

I do not want to change "vni" to "vni-policy". To me, a policy is a set of rules or definitions that are independent of any entity (i.e., like a template) and can be applied to many entities. Our route-map is a classic "policy". Other NOS have similar constructs. The configuration under a VNI is just that - parameters specific to this VNI.

When we extend the support to other encapsulation types including MPLS, we can call this an "evi".

@louberger

Member

louberger commented Jun 8, 2017

WRT vni-policy the approach taken with vrf-policy was the result of community discussion - if you want to deviate from this, I think we need to run it by folks - perhaps via the frr list...

@vincentbernat

Contributor

vincentbernat commented Jun 9, 2017

It seems there is some recent kernel requirement to distribute MAC+IP. I think @vivek-cumulus already told me about this but I don't remember exactly. With a 4.4 kernel, an IPv6 MAC-IP route will override an existing static FDB-only entry. This makes a local MAC appear to be remote. If the cause is known, it would be nice to document it.

@vivek-cumulus

Contributor

vivek-cumulus commented Jun 12, 2017

@louberger - yes, I shall send a mail out to the FRR list tomorrow on this point.

If this is the only real sticking point, since there is a lot of code and functionality being brought in by this PR, I'd like to request for it to be merged in. If the consensus that emerges is to use the keyword "vni-policy" or something else, I shall of course submit a change to that effect.

@vincentbernat - yes, there is some contention on what the kernel considers as "static". Currently, the Linux kernel allows learning a dynamic MAC on some interface X even if a static entry for the same MAC has been installed into the bridge FDB. This in turn causes some confusion for the control plane as it does not expect to see a dynamic entry for a static MAC. I have asked our Cumulus Linux kernel team - who are very involved with the kernel networking community - to propose and implement a change that keeps a static MAC as really "static" (to be overridden only by another static entry).

@pguibert6WIND

Member

pguibert6WIND commented Jun 13, 2017

I had two comments:

  • A minor one: when configuring a peer-group, an error was encountered:
    [AFI_IP6][SAFI_EVPN]
    should be replaced by
    [AFI_L2VPN][SAFI_EVPN]

  • I think some unit tests (make check) should be added to check against EVPN packets (RT2, for example).
    This can be a separate ticket.

vivek-cumulus added some commits Jun 14, 2017

bgpd, doc: Update EVPN operational commands
Make EVPN operational commands follow the "<afi> <safi>" syntax.
The relevant commands are:
- show bgp l2vpn evpn summary
- show bgp l2vpn evpn vni ...
- show bgp l2vpn evpn route ...
- show bgp l2vpn evpn route vni ...
- show bgp l2vpn evpn import-rt

Also update the configuration document.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
bgpd: Fix incorrect AFI reference
Fixes: "bgpd: Fixes related to use of L2VPN/EVPN"
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
@vivek-cumulus

Contributor

vivek-cumulus commented Jun 14, 2017

Thank you @pguibert6WIND , I have addressed the first issue that you reported.

@NetDEF-CI

Collaborator

NetDEF-CI commented Jun 14, 2017

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-924/

This is a comment from an EXPERIMENTAL automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.


CLANG Static Analyzer Summary

  • Github Pull Request 619, comparing to Git base SHA 3264522

New warnings:

Static Analysis warning summary compared to base:

  • Fixed warnings: 0
  • New warnings: 2

147 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-924/artifact/shared/static_analysis/index.html

@NetDEF-CI

Collaborator

NetDEF-CI commented Jun 14, 2017

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-925/

This is a comment from an EXPERIMENTAL automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.


CLANG Static Analyzer Summary

  • Github Pull Request 619, comparing to Git base SHA 3264522

New warnings:

Static Analysis warning summary compared to base:

  • Fixed warnings: 0
  • New warnings: 2

147 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-925/artifact/shared/static_analysis/index.html

@donaldsharp

Member

donaldsharp commented Jun 20, 2017

Lou -> Interesting case is the bridged Router case and how we intend to handle it appropriately

@donaldsharp

Member

donaldsharp commented Jun 27, 2017

@vivek-cumulus responded to email queries yesterday.

@donaldsharp

Member

donaldsharp commented Jun 27, 2017

@louberger is suggesting that we break up the commit into CLI and non-CLI patches, to see what we can get in.

@vivek-cumulus opinions?

@mwinter-osr

Member

mwinter-osr commented Jul 4, 2017

@vivek-cumulus What is the status here?

eqvinox added a commit that referenced this pull request Jul 10, 2017

Merge branch 'evpn-prep'
First 12-and-a-half commits from PR #619

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
@eqvinox

Contributor

eqvinox commented Jul 10, 2017

Merged the first 12 commits (until "bgpd: Fixes related to use of L2VPN/EVPN"),
+ "lib: Remove typedef from ipaddr"

@louberger

Member

louberger commented Jul 10, 2017

@louberger louberger self-requested a review Jul 10, 2017

@louberger louberger added the iterating label Jul 10, 2017

@donaldsharp

Member

donaldsharp commented Jul 11, 2017

@louberger would like to see a consistent configuration approach for L2 and L3 VPN. We had previously reached an agreement, and this code does not conform to it. We need to come to a new agreement.
@pguibert6WIND: L2 and L3 VPNs should be configured similarly. We need to find an agreement on that. The vrf configuration needs to handle both L2 and L3 from the PE side.

Action items: disable the CLI, fix merge conflicts, and get it in; file an issue; schedule a special meeting to discuss, possibly in the last week of July.

@donaldsharp

Member

donaldsharp commented Jul 14, 2017

Committed under #809

@ecbaldwin ecbaldwin referenced this pull request Jul 14, 2017

Merged

Evpn plus struct attr #809

@ecbaldwin


ecbaldwin commented Jul 14, 2017

I was just trying this out yesterday. I was able to get a very simple EVPN working. I had a question.

The routes are not displayed with all of the fields that I expect for type 2/3 routes. Any ideas why this is? (BTW, I tried to switch to master seeing this PR was closed in favor of #809 but that code seems to be broken.)

frr-1> show bgp l2vpn evpn route vni 100
BGP table version is 0, local router ID is 10.224.200.10
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]

   Network          Next Hop            Metric LocPrf Weight Path
*> [2]:[5a:15:13:e1:9c:8e]/224
                    10.224.200.5                           0 65001 i
*> [2]:[ce:8c:b2:ed:89:69]/224
                    10.224.200.10                      32768 i
*> [3]:[10.224.200.5]/224
                    10.224.200.5                           0 65001 i
*> [3]:[10.224.200.10]/224
                    10.224.200.10                      32768 i
@donaldsharp

Member

donaldsharp commented Jul 14, 2017

@ecbaldwin on master please use --enable-cumulus=yes on your configure line to get the cli working again. I've also asked another developer to answer your question.

@ecbaldwin


ecbaldwin commented Jul 14, 2017

@donaldsharp Thanks, I saw your reply on #809 just after I posted that question above. I missed the configure flag to enable the CLI. Is there any plan to carry the content from evpn.md in this PR somewhere where it will be useful? If you want, I could help validate that document and post a PR with it somewhere.

@mkanjari

Contributor

mkanjari commented Jul 14, 2017

@ecbaldwin : Can you please try "show bgp evpn route vni 100" instead.

Sample output:

tor-11# show bgp evpn route vni 1000101
BGP table version is 0, local router ID is 6.0.0.15
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]

   Network          Next Hop            Metric LocPrf Weight Path
*> [3]:[0]:[32]:[6.0.0.15]
                    6.0.0.15                           32768 i

Displayed 1 prefixes (1 paths)
tor-11#
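
For anyone scripting against this output, the type-3 prefix string follows the legend printed above it: `[3]:[EthTag]:[IPlen]:[OrigIP]`. The Python sketch below formats and parses that string form. It is derived only from the legend and the sample line in this comment, not from bgpd's source, and the function names are hypothetical.

```python
import ipaddress
from typing import Tuple


def format_type3(originator: str, eth_tag: int = 0) -> str:
    """Render [3]:[EthTag]:[IPlen]:[OrigIP] for an IPv4 or IPv6 originator."""
    ip = ipaddress.ip_address(originator)
    return f"[3]:[{eth_tag}]:[{ip.max_prefixlen}]:[{ip}]"


def parse_type3(text: str) -> Tuple[int, int, str]:
    """Return (eth_tag, ip_len, originator) from a type-3 prefix string."""
    fields = text.strip()[1:-1].split("]:[")
    if len(fields) != 4 or fields[0] != "3":
        raise ValueError(f"not a type-3 EVPN prefix: {text!r}")
    return int(fields[1]), int(fields[2]), fields[3]


line = format_type3("6.0.0.15")
print(line, parse_type3(line))
```

Splitting on `]:[` rather than on `:` keeps IPv6 originator addresses intact.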

@ecbaldwin


ecbaldwin commented Jul 17, 2017

@mkanjari Thanks. I ended up making the switch to master as suggested by @donaldsharp above. The command that you gave me to try seems to work fine and shows me the fields that I was not seeing before. The l2vpn command that I tried above doesn't seem to be recognized any more at all. So, I think this is resolved now.

qlyoung pushed a commit to qlyoung/frr that referenced this pull request Nov 6, 2017

Merge branch 'evpn-prep'
First 12-and-a-half commits from PR FRRouting#619

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>