Dealing with VRFs #1648

Open
amanshaikh75 opened this issue Apr 18, 2018 · 18 comments

Comments

@amanshaikh75

amanshaikh75 commented Apr 18, 2018

It seems to me that the way GoBGP handles VRFs is not correct. Let me illustrate this by describing two work-flows:

Handling receipt of a BGP update message

  • How GoBGP does it according to my understanding
    - Check the update for correctness.
    - Add path(s) to Adj-RIB-In of the peer from which the update is received.
    - If the peer belongs to a VRF, import path(s) to the Global RIB by converting IPv4/v6 routes to VPNv4/v6 routes respectively.
    - Apply applicable import policies to the path(s).
    - Run best-path selection in the context of the global RIB.
    - If necessary, install updated routes into kernel forwarding table through Zebra. While doing this, check which VRFs the route can be imported into, and install it in every one of them.
    - Propagate changed routes to all the peers after applying export policies.
  • How this should be done in my opinion:
    - Check the update for correctness.
    - Add path(s) to Adj-RIB-In of the peer from which the update is received.
    - Apply applicable import policies to the path(s).
    - Run best-path selection in the context of the VRF to which the peer belongs.
    - If necessary, install updated routes into VRF’s kernel forwarding table through Zebra.
    - Propagate installed routes to other peers in the VRF after applying relevant export policies.
    - If the peer’s VRF is not global VRF, export the route(s) to the Global RIB by converting IPv4/v6 routes to VPNv4/v6 routes respectively.
    - Run best-path selection in the context of the global RIB.
    - However, do NOT install VPN routes into the kernel’s global forwarding table.
    - But send updated VPN routes to peers that belong to the global VRF.
    - Check which other VRFs the newly received paths need to be imported into (based on the import route-targets of each VRF).
    - For each VRF the route is imported into:
      - Run the best-path calculation in the context of the VRF.
      - If necessary, install updated routes into VRF’s kernel forwarding table through Zebra.
      - Propagate installed routes to other peers in the VRF after applying relevant export policies.
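
The "check which other VRFs the route needs to be imported into" step can be illustrated with a minimal Go sketch. The `VRF` type and the `importingVRFs` helper are hypothetical names made up for this illustration and are not GoBGP's data structures; route targets are plain strings for brevity.

```go
package main

import "fmt"

// VRF is a hypothetical, simplified VRF definition (not GoBGP's type).
type VRF struct {
	Name     string
	ImportRT []string
}

// importingVRFs returns the names of the VRFs whose import route-target
// list shares at least one route target with the path.
func importingVRFs(pathRTs []string, vrfs []VRF) []string {
	carried := make(map[string]bool, len(pathRTs))
	for _, rt := range pathRTs {
		carried[rt] = true
	}
	var names []string
	for _, v := range vrfs {
		for _, rt := range v.ImportRT {
			if carried[rt] {
				names = append(names, v.Name)
				break // one matching route target is enough
			}
		}
	}
	return names
}

func main() {
	vrfs := []VRF{
		{Name: "vrf1", ImportRT: []string{"65000:100"}},
		{Name: "vrf2", ImportRT: []string{"65000:200"}},
	}
	// A path carrying RT 65000:100 is a candidate for vrf1 only.
	fmt.Println(importingVRFs([]string{"65000:100"}, vrfs))
}
```

Under the proposal above, each VRF returned here would then run its own best-path calculation before anything is installed through Zebra.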

Addition of a new VRF through configuration

  • How GoBGP handles this:
    - If VRF already exists, ignore the request.
    - Add a VRF structure to the table-manager.
    - For every route-target in the import route-target list of this VRF, create a RouteTargetMemberShipNLRI and send it to the appropriate neighbors.
    - No VPN routes are imported into the VRF from the global RIB.
  • How this should be done in my opinion:
    - Carry out all the above steps.
    - Import VPN routes into the VRF.
    - Run best-path selection algorithm in the context of the VRF.
    - Install best routes into VRF’s kernel forwarding table through Zebra.
    - Propagate installed routes to other peers in the VRF after applying relevant export policies.
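
A rough sketch of the proposed VRF-addition flow, assuming a simplified table manager: an existing VRF is left alone, and a new one is backfilled with VPN paths that are already in the global RIB. `Manager`, `VPNPath` and `AddVRF` are hypothetical names for this illustration only, and best-path selection and Zebra installation are left out.

```go
package main

import "fmt"

// VPNPath is a hypothetical, simplified VPN route held in the global RIB.
type VPNPath struct {
	RD     string
	Prefix string
	RTs    []string
}

// Manager is a hypothetical stand-in for a table manager.
type Manager struct {
	global []VPNPath
	vrfs   map[string][]VPNPath // VRF name -> paths imported into it
}

// AddVRF registers a VRF and backfills it with already-known VPN paths
// that carry one of its import route targets. It returns false, and
// changes nothing, when the VRF already exists.
func (m *Manager) AddVRF(name string, importRT []string) bool {
	if _, exists := m.vrfs[name]; exists {
		return false
	}
	wanted := make(map[string]bool, len(importRT))
	for _, rt := range importRT {
		wanted[rt] = true
	}
	imported := []VPNPath{}
	for _, p := range m.global {
		for _, rt := range p.RTs {
			if wanted[rt] {
				imported = append(imported, p)
				break
			}
		}
	}
	m.vrfs[name] = imported
	return true
}

func main() {
	m := &Manager{
		global: []VPNPath{
			{RD: "65000:300", Prefix: "192.168.1.0/24", RTs: []string{"65000:100"}},
			{RD: "65000:200", Prefix: "192.168.1.0/24", RTs: []string{"65000:200"}},
		},
		vrfs: map[string][]VPNPath{},
	}
	fmt.Println(m.AddVRF("vrf1", []string{"65000:100"}), len(m.vrfs["vrf1"]))
	fmt.Println(m.AddVRF("vrf1", []string{"65000:100"})) // duplicate request is ignored
}
```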
@iwaseyusuke
Contributor

@amanshaikh75 Hi, please let me clear my head first.
The key point of your suggestion is that GoBGP should calculate the best path per VRF and then install the path into Zebra with the VRF context, right? I guess it makes little difference whether the best path selection algorithm is applied in the context of the global table or of the VRF table: the RD should be unique across all VRF tables, so <RD>:<Prefix> is unique per VRF and the best path is effectively selected per VRF.

With the following topology

              +------------------------+
        IPv4  | r3                     |
+----+  Uni   | +------+    +--------+ |
| r1 |----------| VRF1 |--->| Global | |
+----+        | +------+    |        | |  VPNv4  +----+
              |             |        |-----------| r4 |
        IPv4  |             |        | |         +----+
+----+  Uni   | +------+    |        | |
| r2 |----------| VRF2 |--->|        | |
+----+        | +------+    +--------+ |
              |               ZAPI |   |
              |                    V   |
              | +--------------------+ |
              | | Zebra              | |
              | +--------------------+ |
              +------------------------+

When r1 and r2 advertise the same prefix, 192.168.1.0/24:

r1> gobgp global rib -a ipv4 add 192.168.1.0/24
r1> gobgp global rib -a ipv4
   Network              Next Hop             AS_PATH              Age        Attrs
*> 192.168.1.0/24       0.0.0.0                                   00:00:00   [{Origin: ?}]

r2> gobgp global rib -a ipv4 add 192.168.1.0/24
r2> gobgp global rib -a ipv4
   Network              Next Hop             AS_PATH              Age        Attrs
*> 192.168.1.0/24       0.0.0.0                                   00:00:00   [{Origin: ?}]

GoBGP imports each path per VRF separately and also imports them with RD to the global table.

r3> gobgp vrf 1 rib -a ipv4
   Network              Next Hop             AS_PATH              Age        Attrs
   192.168.1.0/24       10.0.0.1             65001                00:00:00   [{Origin: ?}]

r3> gobgp vrf 2 rib -a ipv4
   Network              Next Hop             AS_PATH              Age        Attrs
   192.168.1.0/24       10.0.0.2             65002                00:00:00   [{Origin: ?}]

r3> gobgp global rib -a vpnv4
   Network                  Labels     Next Hop             AS_PATH              Age        Attrs
*> 65000:100:192.168.1.0/24 [0]        10.0.0.1             65001                00:00:00   [{Origin: ?} {Extcomms: [65000:100]}]
*> 65000:200:192.168.1.0/24 [0]        10.0.0.2             65002                00:00:00   [{Origin: ?} {Extcomms: [65000:200]}]
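
As a minimal sketch of why the two paths do not collide in the global table: if the table is keyed by "<RD>:<Prefix>", the same IPv4 prefix learned in two VRFs yields two distinct destinations. The `vpnKey` helper and the map-based table are assumptions for illustration, not GoBGP's internal representation.

```go
package main

import "fmt"

// vpnKey builds the "<RD>:<Prefix>" string used here as a table key.
// It is a hypothetical helper, not a GoBGP function.
func vpnKey(rd, prefix string) string {
	return rd + ":" + prefix
}

func main() {
	global := map[string]string{} // key -> next hop

	// The same IPv4 prefix from VRF 1 and VRF 2 becomes two VPNv4 keys.
	global[vpnKey("65000:100", "192.168.1.0/24")] = "10.0.0.1"
	global[vpnKey("65000:200", "192.168.1.0/24")] = "10.0.0.2"

	fmt.Println(len(global)) // two separate destinations, so two best paths
}
```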

Then, GoBGP (zclient.go) will install VPN routes into Zebra by using the paths in the global table, which contain VRF IDs.

gobgp/server/zclient.go

Lines 241 to 249 in d31262d

case bgp.RF_IPv6_UC, bgp.RF_IPv6_VPN:
	if path.GetRouteFamily() == bgp.RF_IPv6_UC {
		prefix = path.GetNlri().(*bgp.IPv6AddrPrefix).IPAddrPrefixDefault.Prefix.To16()
	} else {
		prefix = path.GetNlri().(*bgp.LabeledVPNIPv6AddrPrefix).IPAddrPrefixDefault.Prefix.To16()
	}
	for _, p := range paths {
		nexthops = append(nexthops, p.GetNexthop().To16())
	}

gobgp/server/zclient.go

Lines 507 to 510 in d31262d

for _, i := range path.VrfIds {
	if body, isWithdraw := newIPRouteBody(pathList{path}); body != nil {
		z.client.SendIPRoute(i, body, isWithdraw)
	}

Am I misunderstanding?

@amanshaikh75
Author

@iwaseyusuke

Hi,

You're right that in the case above, there is no need to calculate the BGP best path for each VRF separately, since the two VRFs use different RDs, as they should. However, consider a case where r4 sends the following VPNv4 route to r3:

65000:300:192.168.1.0/24 [0]        10.0.0.1             65001                00:00:00   [{Origin: ?} {Extcomms: [65000:100]}]

When this route arrives, GoBGP at r3 will calculate best paths in global RIB, and will choose all three routes as best since each of the three routes has a distinct RD.

r3> gobgp global rib -a vpnv4
   Network                  Labels     Next Hop             AS_PATH              Age        Attrs
*> 65000:100:192.168.1.0/24 [0]        10.0.0.1             65001                00:00:00   [{Origin: ?} {Extcomms: [65000:100]}]
*> 65000:200:192.168.1.0/24 [0]        10.0.0.2             65002                00:00:00   [{Origin: ?} {Extcomms: [65000:200]}]
*> 65000:300:192.168.1.0/24 [0]        10.0.0.1             65001                00:00:00   [{Origin: ?} {Extcomms: [65000:100]}]

Since the first and the third routes have the same route targets, both of them will be imported into VRF 1, and will be installed as best routes into Zebra. In reality, the first route is better than the third route since it is learned over an eBGP session while the other route is learned over an iBGP session (assuming r3-r4 session is an iBGP one). Do you agree?

Speaking more generally, the following cases will create problems with GoBGP's current way of handling VRF routes:

  • When different PEs use different RDs for the same VPN VRF.
  • When routes are leaked from one VRF to another on the same PE.
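
The difference between the two selection scopes can be illustrated with a small Go sketch using the three routes above. The `Path` type, the `better` comparison (which models only the "eBGP over iBGP" step of the decision process) and `bestPerKey` are hypothetical simplifications, not GoBGP code.

```go
package main

import "fmt"

// Path is a hypothetical, simplified VPN path.
type Path struct {
	RD, Prefix, RT string
	EBGP           bool // true if learned over an eBGP session
}

// better reports whether a should be preferred over b, using only the
// "eBGP over iBGP" step of the decision process.
func better(a, b Path) bool {
	return a.EBGP && !b.EBGP
}

// bestPerKey keeps one best path per key produced by keyOf.
func bestPerKey(paths []Path, keyOf func(Path) string) map[string]Path {
	best := map[string]Path{}
	for _, p := range paths {
		k := keyOf(p)
		if cur, ok := best[k]; !ok || better(p, cur) {
			best[k] = p
		}
	}
	return best
}

func main() {
	paths := []Path{
		{RD: "65000:100", Prefix: "192.168.1.0/24", RT: "65000:100", EBGP: true},
		{RD: "65000:200", Prefix: "192.168.1.0/24", RT: "65000:200", EBGP: true},
		{RD: "65000:300", Prefix: "192.168.1.0/24", RT: "65000:100", EBGP: false},
	}

	// Global scope: keyed by RD:prefix, so every path is its own destination.
	global := bestPerKey(paths, func(p Path) string { return p.RD + ":" + p.Prefix })

	// VRF 1 scope: import by route target, then key by the bare prefix.
	var vrf1 []Path
	for _, p := range paths {
		if p.RT == "65000:100" {
			vrf1 = append(vrf1, p)
		}
	}
	perVRF := bestPerKey(vrf1, func(p Path) string { return p.Prefix })

	fmt.Println(len(global), len(perVRF), perVRF["192.168.1.0/24"].RD)
}
```

In the global scope all three paths remain best, while in the VRF 1 scope the two imported paths compete and the eBGP-learned one with RD 65000:100 is kept.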

@iwaseyusuke
Contributor

@amanshaikh75 Hi,

I don't know whether different RD and RT values are used in real (production) MPLS VPN services, and I think using the same value is better for maintainability.

But different RD and RT values are not prohibited, so I tried the following with Cisco routers.

+----+                        +----+
| R1 |---------(iBGP)---------| R2 |
+----+                        +----+
 - Vrf1                        - Vrf3
   RD 65000:100                  RD 65000:300
   RT 65000:100                  RT 65000:100  <--- the same RT with Vrf1
   * 192.168.1.0/24              * 192.168.1.0/24  <--- the same prefix exist on R1

 - Vrf2
   RD 65000:200
   RT 65000:200
   * 192.168.3.0/24

In the above situation, the route from R2, "65000:300:192.168.1.0/24", was imported as follows.

R1#show ip bgp all
For address family: IPv4 Unicast


For address family: VPNv4 Unicast

BGP table version is 4, local router ID is 10.0.0.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 65000:100 (default for vrf Vrf1)
 * i 192.168.1.0      10.0.0.2                 0    100      0 ?
 *>                   0.0.0.0                  0         32768 ?
Route Distinguisher: 65000:200 (default for vrf Vrf2)
 *>  192.168.3.0      0.0.0.0                  0         32768 ?
Route Distinguisher: 65000:300
 *>i 192.168.1.0      10.0.0.2                 0    100      0 ?

For address family: IPv4 Multicast


For address family: MVPNv4 Unicast

It seems that "65000:300:192.168.1.0/24" was copied and translated to "65000:100:192.168.1.0/24", and both paths were imported into the global table.

Also, as you said, the best path seems to be selected per VRF (a locally connected path is selected in "Vrf1"). Is this the behavior you mean?

@amanshaikh75
Author

Hi @iwaseyusuke

Yes, this is exactly what I am referring to. I don't know how common this is, but it's not precluded.

Another way this situation arises is when routes are shared between different VPNs (or VRFs).

@iwaseyusuke
Contributor

@amanshaikh75 Hi,

Does this PR (#1656) address this issue? The patch clones a VPN path that has a different RD and overwrites the RD on the cloned path with the RD of the matching VRF.

@amanshaikh75
Author

@iwaseyusuke Hi

Your PR seems to be a step in the right direction. However, here are a couple of items to think about:

  • When routes are cloned and imported into the VRF, keeping the RD is not really required since all routes are going to have the same RD. In fact, routes coming from the VRF should be imported as Labeled unicast routes (SAFI-4) in my opinion. https://tools.ietf.org/html/rfc8277#section-5 talks about comparing SAFI-4 with SAFI-1 routes.
  • Does your code perform best path calculation in the context of the VRF? If not, how does it decide which route to send to attached CE?

@iwaseyusuke
Contributor

@amanshaikh75 Sorry for the delay and thank you for reviewing.

  • When routes are cloned and imported into the VRF, keeping the RD is not really required since all routes are going to have the same RD. In fact, routes coming from the VRF should be imported as Labeled unicast routes (SAFI-4) in my opinion. https://tools.ietf.org/html/rfc8277#section-5 talks about comparing SAFI-4 with SAFI-1 routes.

I think routes from a VRF should be imported as MPLS-labeled VPN routes (SAFI=128). Sorry, I may not understand RFC 8277 well enough, but does section 5 also apply to SAFI=128? Also, section 5 mentions some implementations but does not say which approach is best.

  • Does your code perform best path calculation in the context of the VRF? If not, how does it decide which route to send to attached CE?

Yes, the cloned path should be selected as the best path on the specific VRF. The behavior (the output of gobgp command) is almost equivalent to the Cisco router on my previous comment.

@amanshaikh75
Author

  • I think routes from a VRF should be imported as MPLS-labeled VPN routes (SAFI=128). Sorry, I may not understand RFC 8277 well enough, but does section 5 also apply to SAFI=128? Also, section 5 mentions some implementations but does not say which approach is best.

I agree that section-5 of RFC 8277 does not advocate one particular approach. On the other hand, bringing routes into VRF as SAFI=128 still has the problem in that routes learned from CEs are usually SAFI=1 routes. So, now you face the problem of comparing SAFI=1 and SAFI=128 routes, right?

  • Yes, the cloned path should be selected as the best path on the specific VRF. The behavior (the output of gobgp command) is almost equivalent to the Cisco router on my previous comment.

So, are you now maintaining separate route table for each VRF?

@iwaseyusuke
Contributor

iwaseyusuke commented May 8, 2018

I agree that section-5 of RFC 8277 does not advocate one particular approach. On the other hand, bringing routes into VRF as SAFI=128 still has the problem in that routes learned from CEs are usually SAFI=1 routes. So, now you face the problem of comparing SAFI=1 and SAFI=128 routes, right?

You mean routes from PE (or P) routers cannot be compared with routes from CE routers because routes from PEs are SAFI=128 and routes from CEs are SAFI=1, right?
GoBGP runs the best path calculation only on its global table, and routes from CEs are translated to VPN routes (SAFI=128) before being imported into the global table. In a VRF, routes from CEs are represented as SAFI=1, but in the global table they are SAFI=128. So routes from both PEs and CEs are SAFI=128 in the global table, and I guess GoBGP can compare them there.
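
To make the SAFI point concrete, here is a minimal sketch that packs an AFI/SAFI pair into a single route-family value, using the IANA-assigned numbers (AFI 1 = IPv4; SAFI 1 = unicast, 4 = MPLS-labeled, 128 = MPLS-labeled VPN). The `routeFamily` helper and the constant names are assumptions for this example. Paths are only comparable when their route family matches, which is why a CE route has to be translated to SAFI=128 before it can compete with a route from a PE.

```go
package main

import "fmt"

// IANA-assigned address family numbers used in this thread.
const (
	afiIPv4        = 1
	safiUnicast    = 1   // plain IPv4/IPv6 unicast
	safiLabeled    = 4   // MPLS-labeled unicast (RFC 8277)
	safiLabeledVPN = 128 // MPLS-labeled VPN (VPNv4/VPNv6)
)

// routeFamily packs AFI and SAFI into one comparable value.
// It is a hypothetical helper written for this example.
func routeFamily(afi uint16, safi uint8) uint32 {
	return uint32(afi)<<16 | uint32(safi)
}

func main() {
	fromCE := routeFamily(afiIPv4, safiUnicast)    // as learned in the VRF
	fromPE := routeFamily(afiIPv4, safiLabeledVPN) // as learned from a PE
	fmt.Println(fromCE == fromPE)                  // different families

	// After translation to VPNv4, the CE route is in the same family.
	translated := routeFamily(afiIPv4, safiLabeledVPN)
	fmt.Println(translated == fromPE)
}
```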

So, are you now maintaining separate route table for each VRF?

No, GoBGP does not maintain a table per VRF; it maintains only the global table.

For example, with the following topology, suppose both CE1 and CE2 have the same prefix "192.168.1.0/24", the RD of GoBGP's VRF is 65000:100, and the RD of the PE's VRF is 65000:200.

                              +-----------------------+
                        IPv4  | GoBGP                 |
               +-----+  Uni   | +-----+    +--------+ |  VPNv4  +----+
192.168.1.0/24 | CE1 |----------| VRF |--->| Global |<----------| PE |... CE2 192.168.1.0/24
               +-----+        | +-----+    +--------+ |         +----+
                              +-----------------------+
                                 RD 65000:100                    RD 65000:200

GoBGP will receive an IPv4 prefix "192.168.1.0/24" from CE and another VPNv4 prefix "65000:200:192.168.1.0/24" from PE.

On the CE side, GoBGP translates the prefix "192.168.1.0/24" from CE to the VPNv4 prefix "65000:100:192.168.1.0/24" and imports it into its global table.
On the other hand, with my patch, GoBGP clones the VPNv4 prefix "65000:200:192.168.1.0/24" from PE, makes a copy with the VPNv4 prefix "65000:100:192.168.1.0/24", and imports both into its global table.
At this point, the global table holds 3 paths:

- "65000:100:192.168.1.0/24" (from CE) (in this example, this path is preferred over the copy from PE)
- "65000:100:192.168.1.0/24" (copy of the path from PE)
- "65000:200:192.168.1.0/24" (from PE)

Then GoBGP applies the best path calculation, and the first and the last ones are selected, but the last one is not installed into Zebra because it has no VRF ID.
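
The cloning approach described here can be sketched roughly as follows. This is not the code from the PR: `Path`, `VRF`, `cloneIntoVRF` and `selectBest` are hypothetical names, and the comparison simply prefers the CE-learned path, as in the example.

```go
package main

import "fmt"

// Path is a hypothetical, simplified VPN path in the global table.
type Path struct {
	RD, Prefix, RT string
	FromCE         bool
	VrfIDs         []uint32 // empty means "not installed through Zebra"
}

// VRF is a hypothetical, simplified VRF definition.
type VRF struct {
	ID           uint32
	RD, ImportRT string
}

// cloneIntoVRF copies a path received from a remote PE, rewrites its RD
// to the VRF's RD and tags the copy with the VRF ID.
func cloneIntoVRF(p Path, v VRF) Path {
	c := p
	c.RD = v.RD
	c.VrfIDs = []uint32{v.ID}
	return c
}

// selectBest keeps one path per "<RD>:<Prefix>", preferring CE-learned paths.
func selectBest(paths []Path) map[string]Path {
	best := map[string]Path{}
	for _, p := range paths {
		key := p.RD + ":" + p.Prefix
		if cur, ok := best[key]; !ok || (p.FromCE && !cur.FromCE) {
			best[key] = p
		}
	}
	return best
}

func main() {
	vrf := VRF{ID: 1, RD: "65000:100", ImportRT: "65000:100"}
	fromCE := Path{RD: vrf.RD, Prefix: "192.168.1.0/24", RT: "65000:100", FromCE: true, VrfIDs: []uint32{vrf.ID}}
	fromPE := Path{RD: "65000:200", Prefix: "192.168.1.0/24", RT: "65000:100"}

	table := []Path{fromCE, fromPE}
	if fromPE.RT == vrf.ImportRT && fromPE.RD != vrf.RD {
		table = append(table, cloneIntoVRF(fromPE, vrf))
	}

	best := selectBest(table)
	for _, key := range []string{"65000:100:192.168.1.0/24", "65000:200:192.168.1.0/24"} {
		p := best[key]
		fmt.Printf("%s fromCE=%v install=%v\n", key, p.FromCE, len(p.VrfIDs) > 0)
	}
}
```

The clone competes with the CE path under the VRF's RD, while the original PE path stays best under its own RD but carries no VRF ID and is therefore skipped at installation time.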

@amanshaikh75
Author

I see. So essentially, you are creating a per-VRF table within the global table by cloning paths if necessary. All the VRF-specific paths will have VPN prefixes with the VRF's RD which will allow gobgp to run the decision process appropriately. I can't think of any obvious problems with this approach.

@iwaseyusuke
Contributor

@amanshaikh75 Thanks for your confirmation.

I think this approach is similar to the option to "regard SAFI-1 routes and SAFI-4 routes as completely independent".

@amanshaikh75
Author

One more thing, @iwaseyusuke.

I have noticed that with the current implementation, when a new VRF and a neighbor within it are added to GoBGP, the daemon does not send existing VPNv4 routes that can be imported into the VRF to the neighbor during RIB synchronization. Is this something your PR fixes?

@iwaseyusuke
Contributor

@amanshaikh75 Thanks for your report! Sorry, but I don't think this PR addresses that issue.

@amanshaikh75
Author

amanshaikh75 commented May 10, 2018

Will you be willing to create a new PR to address this issue?

BTW with ZAPI version 5, I have been able to find a way to install VPN routes with double encapsulation at the ingress PE. See zapi_version_5 in my gobgp repository. With this update, I am now able to use GoBGP for CE-to-CE traffic in L3VPN scenario.

@iwaseyusuke
Contributor

Will you be willing to create a new PR to address this issue?

Yes, thanks! It is highly welcomed!

BTW with ZAPI version 5, I have been able to find a way to install VPN routes with double encapsulation at the ingress PE. See zapi_version_5 in my gobgp repository. With this update, I am now able to use GoBGP for CE-to-CE traffic in L3VPN scenario.

That sounds great! But supporting both v4 and v5 seems to make the code complex... Hmmm...

irino added a commit to irino/gobgp that referenced this issue Mar 2, 2019
 - This commit aims to solve reported problem on issues osrg#1611, osrg#1648 and osrg#1912
 - Partial changes of this commit duplicate with changes on PR osrg#1587 (not merged) and PR osrg#1766 (not merged and already closed)
 - This commit is tested with only FRRouting version 6.0.2 (which uses Zebra API 6)
 - This commit fixes lack of LABEL_MANAGER_CONNECT_ASYNC for ZAPI6.
   (This bug is introduced on commit 2bdb76f "Supporting Zebra API version 6 which is used in FRRouting version 6")
fujita pushed a commit that referenced this issue Mar 14, 2019
 - This commit aims to solve reported problem on issues #1611, #1648 and #1912
 - Partial changes of this commit duplicate with changes on PR #1587 (not merged) and PR #1766 (not merged and already closed)
 - This commit is tested with only FRRouting version 6.0.2 (which uses Zebra API 6)
 - This commit fixes lack of LABEL_MANAGER_CONNECT_ASYNC for ZAPI6.
   (This bug is introduced on commit 2bdb76f "Supporting Zebra API version 6 which is used in FRRouting version 6")
@adisai123

I have a scenario: the same route, 1.1.1.1/32, is published with both the red and the blue VRF tags and arrives at GoBGP. The two routes are inserted into different routing tables. Now I want to access both routes from the same VM. How can I do that? Please guide me.

@adisai123

From router 3 (r3), I want to access both 65000:100:192.168.1.0/24 and 65000:200:192.168.1.0/24. How can I do that? Please guide me.

@lastorel

lastorel commented Feb 3, 2024

Even with only one VRF and a unique RD (for example 65000:100), the router must run the path selection algorithm on the VRF table when a prefix (for example 10.0.0.0/8) is received from CE1.
At the moment of receipt, the VRF table can already contain 10.0.0.0/8 from another CE, CE2 (with local pref 90 and not best), and the same prefix from a remote CE3 (via VPNv4).

In my case the new prefix wins because it is received over eBGP and its local preference has not been lowered. After that, it must be injected (cloned) into the global table in VPNv4 format. The global table then performs its own independent path selection in the VPNv4 address family (where the new route may not be the best).

               +------------------------+
         IPv4  | PE                     |
+-----+  Uni   | +------+    +--------+ |
| CE1 |----------| VRF1 |--->| Global | |
+-----+        | |      |    |        | |  VPNv4  +----+    +-----+
               | |      |    |        |-----------| PE |----| CE3 |
         IPv4  | |      |    |        | |         +----+    +-----+
+-----+  Uni   | |      |    |        | |
| CE2 |----------|      |    |        | |
+-----+        | +------+    +--------+ |
               |                        |
               |                        |
               +------------------------+

After that, the existing local CE2 can receive a BGP Update with the new path, because the new path was selected as best in the VRF.
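
A minimal sketch of the VRF-level selection in this scenario, assuming a local preference of 100 for the CE1 and CE3 paths (only the value 90 for CE2 is given above) and modelling just two steps of the decision process: higher local preference first, then eBGP over iBGP. `Path`, `better` and `best` are hypothetical names.

```go
package main

import "fmt"

// Path is a hypothetical, simplified path in a VRF table.
type Path struct {
	Source    string
	LocalPref uint32
	EBGP      bool
}

// better compares two paths: higher local preference wins, and on a tie
// an eBGP-learned path wins over an iBGP-learned one.
func better(a, b Path) bool {
	if a.LocalPref != b.LocalPref {
		return a.LocalPref > b.LocalPref
	}
	return a.EBGP && !b.EBGP
}

// best returns the preferred path from a non-empty list.
func best(paths []Path) Path {
	b := paths[0]
	for _, p := range paths[1:] {
		if better(p, b) {
			b = p
		}
	}
	return b
}

func main() {
	// State of the VRF table for 10.0.0.0/8 before CE1's update arrives.
	vrf := []Path{
		{Source: "CE2", LocalPref: 90, EBGP: true},
		{Source: "CE3 via remote PE", LocalPref: 100, EBGP: false},
	}
	fmt.Println(best(vrf).Source)

	// CE1's update arrives over eBGP with an unchanged local preference.
	vrf = append(vrf, Path{Source: "CE1", LocalPref: 100, EBGP: true})
	fmt.Println(best(vrf).Source)
}
```

Because the VRF-level best path changes from the CE3 path to the CE1 path, the other peers in the VRF, including CE2, would be sent an update, independently of what the global VPNv4 selection later decides.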
