lib: nexthops compare vrf only if ip type #5368

Closed
wants to merge 1 commit into from


@tylerlinp tylerlinp commented Nov 19, 2019

Using FRR 7.2, I encountered a problem where zebra crashes when a VRF is removed (I will create a separate issue for that). While investigating, I found that when an interface changes VRF, deleting the route ff00::/8 fails because the nexthop's VRF has changed.

To compare nexthops when an ifindex is given, it is enough to compare the ifindex: the VRF is derived from the ifindex, and the ifindex is more reliable. For blackhole nexthops, which I think of as a special interface, the VRF may be useless. So only the IP nexthop types need to compare the VRF.
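
The comparison rule proposed above can be sketched in isolation. This is a minimal illustration, not FRR's `_nexthop_cmp_no_labels()`: the `nh_type` enum, `struct nh`, and `nh_cmp_sketch()` are hypothetical stand-ins, IPv6 is omitted for brevity, and treating the gateway-plus-ifindex type as ifindex-only is an assumption read from the description.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for FRR's nexthop types; not the real lib/nexthop.h. */
enum nh_type {
	NH_IFINDEX,
	NH_IPV4,
	NH_IPV4_IFINDEX,
	NH_BLACKHOLE,
};

struct nh {
	enum nh_type type;
	uint32_t vrf_id;
	uint32_t ifindex;
	uint8_t gate[4]; /* IPv4 gateway, network byte order */
};

/* Three-way compare helper: returns -1, 0 or 1. */
static int cmp_u32(uint32_t a, uint32_t b)
{
	return (a > b) - (a < b);
}

/*
 * Compare two nexthops the way the PR description proposes: the VRF takes
 * part in the comparison only for gateway-only types. Types that carry an
 * ifindex are compared by ifindex, and blackholes ignore the VRF.
 */
static int nh_cmp_sketch(const struct nh *a, const struct nh *b)
{
	int ret;

	assert(a && b);

	ret = cmp_u32(a->type, b->type);
	if (ret)
		return ret;

	switch (a->type) {
	case NH_IPV4:
		ret = cmp_u32(a->vrf_id, b->vrf_id);
		if (ret)
			return ret;
		return memcmp(a->gate, b->gate, sizeof(a->gate));
	case NH_IPV4_IFINDEX:
		ret = memcmp(a->gate, b->gate, sizeof(a->gate));
		if (ret)
			return ret;
		return cmp_u32(a->ifindex, b->ifindex);
	case NH_IFINDEX:
		return cmp_u32(a->ifindex, b->ifindex);
	case NH_BLACKHOLE:
		return 0;
	}
	return 0;
}
```

With this rule, a stored nexthop `{NH_IFINDEX, vrf 459, ifindex 433}` and a kernel-supplied `{NH_IFINDEX, vrf 0, ifindex 433}` compare equal, which is the behavior the description argues for and the reviewers question.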

logs:

2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1276, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_NEWLINK update for Ethernet8(433) sl_type 1 master 0 flags 0x11043
2019/11/14 09:48:05 ZEBRA: Intf Ethernet8(433) PTM up, notifying clients
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_UP Ethernet8(459)
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1248, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_NEWLINK vrf-change for Ethernet8(433) vrf_id 459 -> 0 flags 0x11043
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE fe80::42:acff:fe11:2/64 on Ethernet8(459)
2019/11/14 09:48:05 ZEBRA: rib_delnode: 459:fe80::/64: rn 0x55610789b5c0, re 0x55610788f8a0, removing
2019/11/14 09:48:05 ZEBRA: rib_delnode: 459:fe80::/64 (MRIB): rn 0x55610788ed00, re 0x55610788dc90, removing
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_VRF_UPDATE/DEL Ethernet8 VRF Id 459 -> 0
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_VRF_UPDATE/ADD Ethernet8 VRF Id 459 -> 0
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_ADD fe80::42:acff:fe11:2/64 on Ethernet8(0)
2019/11/14 09:48:05 ZEBRA: rib_add_multipath: 0:fe80::/64: Inserting route rn 0x5561079b2050, re 0x55610788e7b0 (connected) existing (nil)
2019/11/14 09:48:05 ZEBRA: rib_add_multipath: 0:fe80::/64 (MRIB): Inserting route rn 0x55610789f250, re 0x55610789e040 (connected) existing (nil)
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELROUTE(25), len=116, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_DELROUTE ipv6 unicast proto kernel NS 0
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELROUTE(25), len=116, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_DELROUTE ipv6 unicast proto boot NS 0
2019/11/14 09:48:05 ZEBRA: RTM_DELROUTE ff00::/8 vrf 459(1001) metric: 256 Admin Distance: 0
2019/11/14 09:48:05 ZEBRA: rib_delete: 459:ff00::/8: via :: ifindex 433 type 1 doesn't exist in rib

@polychaeta polychaeta left a comment

Thanks for your contribution to FRR!

  • One of your commits is missing a Signed-off-by line; we can't accept your contribution until all of your commits have one
  • One of your commits does not have a blank line between the summary and body; this will break git log --oneline

If you are a new contributor to FRR, please see our contributing guidelines.

@@ -120,6 +114,12 @@ static int _nexthop_cmp_no_labels(const struct nexthop *next1,
switch (next1->type) {
case NEXTHOP_TYPE_IPV4:
case NEXTHOP_TYPE_IPV6:
if (next1->vrf_id < next2->vrf_id)
Member

What if the VRF is used not only with an IPv4/IPv6 nexthop, but also with an interface nexthop?

Author

If an ifindex is given, the VRF is redundant: it is actually derived from the ifindex. Furthermore, an interface may change VRF, and then RTM_DELROUTE handling fails; the netlink message doesn't contain the nexthop VRF either.

@LabN-CI
Collaborator

LabN-CI commented Nov 19, 2019

💚 Basic BGPD CI results: SUCCESS, 0 tests failed

Results table
_ _
Result SUCCESS git merge/5368 08950b1
Date 11/19/2019
Start 03:54:46
Finish 04:20:42
Run-Time 25:56
Total 1815
Pass 1815
Fail 0
Valgrind-Errors 0
Valgrind-Loss 0
Details vncregress-2019-11-19-03:54:46.txt
Log autoscript-2019-11-19-03:55:42.log.bz2
Memory 433 435 359

For details, please contact louberger

To compare nexthops when an ifindex is given, it is enough to compare
the ifindex: the vrf is derived from the ifindex, and the ifindex is
more reliable. For blackhole, which I think of as a special interface,
the vrf may be useless. So only the ip types need to compare the vrf.

Signed-off-by: Tyler Li <tyler.li@mediatek.com>
@LabN-CI
Collaborator

LabN-CI commented Nov 19, 2019

💚 Basic BGPD CI results: SUCCESS, 0 tests failed

Results table
_ _
Result SUCCESS git merge/5368 d94edea
Date 11/19/2019
Start 04:35:25
Finish 05:01:17
Run-Time 25:52
Total 1815
Pass 1815
Fail 0
Valgrind-Errors 0
Valgrind-Loss 0
Details vncregress-2019-11-19-04:35:25.txt
Log autoscript-2019-11-19-04:36:19.log.bz2
Memory 422 422 360

For details, please contact louberger

@NetDEF-CI
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9768/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

@NetDEF-CI
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9769/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

Member

@riw777 riw777 left a comment

It seems, to me, that moving this code would allow processes to delete routes they shouldn't be deleting... I think this is the wrong place to fix this problem.

@sworleys
Member

Okay, I took a couple hours to look at this one.

So this is only an issue with kernel routes.

And it seems this specific case is fixed on current master for all cases except NEXTHOP_TYPE_IFINDEX.

It was most likely fixed due to #5184 since we reprocess the kernel routes on the interface event.

VRF vrf-blue:
C>* 2.2.2.0/24 is directly connected, dummyVRFblue, 00:00:06
K>* 8.8.8.8/32 [0/0] via 2.2.2.2, dummyVRFblue, 00:00:02
ubuntu_nh# 
root@ubuntu_nh:/home/sworley/Development/frr# ip ro add 8.8.8.8/32 via 2.2.2.2 dev dummyVRFblue vrf vrf-blue
root@ubuntu_nh:/home/sworley/Development/frr# ip link set dev dummyVRFblue master vrf-red
root@ubuntu_nh:/home/sworley/Development/frr# 
ubuntu_nh# show ip ro vrf vrf-blue
ubuntu_nh# 

However, with nexthops of type NEXTHOP_TYPE_IFINDEX, we don't bother processing them and just assume they are active.

Therefore, the fix might actually be to do this (untested):

sworley@alfred ~/D/c/e/v/u/D/frr> git diff zebra
diff --git a/zebra/zebra_nhg.c b/zebra/zebra_nhg.c
index 05da25b2b..fd381ae8e 100644
--- a/zebra/zebra_nhg.c
+++ b/zebra/zebra_nhg.c
@@ -1492,7 +1492,8 @@ static unsigned nexthop_active_check(struct route_node *rn,
        switch (nexthop->type) {
        case NEXTHOP_TYPE_IFINDEX:
                ifp = if_lookup_by_index(nexthop->ifindex, nexthop->vrf_id);
-               if (ifp && if_is_operative(ifp))
+               if (ifp && if_is_operative(ifp)
+                   && (ifp->vrf_id == nexthop->vrf_id))
                        SET_FLAG(nexthop->flags, NEXTHOP_FLAG_ACTIVE);
                else
                        UNSET_FLAG(nexthop->flags, NEXTHOP_FLAG_ACTIVE);
sworley@alfred ~/D/c/e/v/u/D/frr> 
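
The extra condition in the diff above can be exercised on its own with a small mock. This is only a sketch under stated assumptions: `struct ifc`, `ifc_find()`, and `ifindex_nh_active()` are hypothetical names, the lookup here is by ifindex alone, and the ifindex/VRF values are illustrative. It does not reproduce zebra's `nexthop_active_check()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified interface record (not FRR's struct interface). */
struct ifc {
	uint32_t ifindex;
	uint32_t vrf_id;
	bool operative;
};

/* Linear search by ifindex; stands in for an interface lookup. */
static const struct ifc *ifc_find(const struct ifc *tbl, size_t n,
				  uint32_t ifindex)
{
	assert(tbl || n == 0);

	for (size_t i = 0; i < n; i++)
		if (tbl[i].ifindex == ifindex)
			return &tbl[i];
	return NULL;
}

/*
 * An ifindex-only nexthop counts as active only if its interface exists,
 * is operative, and still belongs to the VRF recorded in the nexthop.
 */
static bool ifindex_nh_active(const struct ifc *tbl, size_t n,
			      uint32_t nh_ifindex, uint32_t nh_vrf_id)
{
	const struct ifc *ifp = ifc_find(tbl, n, nh_ifindex);

	return ifp && ifp->operative && ifp->vrf_id == nh_vrf_id;
}
```

Once the interface's `vrf_id` no longer matches the value stored in the nexthop, the nexthop is reported inactive, which is the effect the suggested diff is after.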

BUT we do have a second bug actually:

We don't handle deletes from the kernel when the nexthop vrf is specified, regardless of whether it changed.

root@ubuntu_nh:/home/sworley/Development/frr# ip ro add 8.8.8.8/32 dev dummyVRFblue
root@ubuntu_nh:/home/sworley/Development/frr# ip ro del 8.8.8.8/32 dev dummyVRFblue

and it's still present in the rib:

ubuntu_nh# show ip ro 8.8.8.8/32
Routing entry for 8.8.8.8/32
  Known via "kernel", distance 0, metric 0, best
  Last update 00:00:14 ago
  * directly connected, dummyVRFblue(vrf vrf-blue)

ubuntu_nh# 

and this seems to be because we are not passing the nexthop's vrf_id into rib_delete() when we get it from the kernel:

#2  0x000000000045df0b in rib_delete (afi=AFI_IP, safi=SAFI_UNICAST, vrf_id=0, type=1, instance=0, flags=0, p=0x7fffffff5b88, src_p=0x7fffffff5b70, 
    nh=0x7fffffff57f0, nhe_id=0, table_id=254, metric=0, distance=0 '\000', fromkernel=true) at zebra/zebra_rib.c:2884
2884                            if (nexthop_same_no_labels(rtnh, nh)) {
(gdb) p rtnh->vrf_id
$4 = 191
(gdb) p nh->vrf_id
$5 = 0

which is not being set in the netlink code.

The RTM_DELROUTE needs to call parse_nexthop_unicast() like RTM_NEWROUTE does, where it looks up the interface by ifindex and then sets the nexthop->vrf_id appropriately.

        if (index) {                                                             
                ifp = if_lookup_by_index_per_ns(zebra_ns_lookup(ns_id), index);  
                if (ifp)                                                         
                        nh_vrf_id = ifp->vrf_id;                                 
        }                                                                        
        nh.vrf_id = nh_vrf_id;  
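
The mismatch seen in gdb (stored `vrf_id` 191, kernel-supplied `vrf_id` 0) and the effect of filling in the VRF from the interface can be sketched as follows. The types and helpers (`struct ifrec`, `struct nhrec`, `vrf_from_ifindex()`, `nh_same()`) are hypothetical and the ifindex value is made up; this is not the netlink code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified records for an interface and a nexthop. */
struct ifrec {
	uint32_t ifindex;
	uint32_t vrf_id;
};

struct nhrec {
	uint32_t ifindex;
	uint32_t vrf_id;
};

/* Return the interface's VRF, or fallback if the ifindex is 0 or unknown. */
static uint32_t vrf_from_ifindex(const struct ifrec *tbl, size_t n,
				 uint32_t ifindex, uint32_t fallback)
{
	assert(tbl || n == 0);

	if (ifindex == 0)
		return fallback;
	for (size_t i = 0; i < n; i++)
		if (tbl[i].ifindex == ifindex)
			return tbl[i].vrf_id;
	return fallback;
}

/* Strict match, as with an unchanged comparison: VRF and ifindex. */
static bool nh_same(const struct nhrec *a, const struct nhrec *b)
{
	assert(a && b);

	return a->vrf_id == b->vrf_id && a->ifindex == b->ifindex;
}
```

A delete nexthop built with `vrf_id = 0` fails `nh_same()` against the stored entry; after `del.vrf_id = vrf_from_ifindex(tbl, n, del.ifindex, 0)` the two match, provided the interface is still in the VRF it had when the route was learned.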

@sworleys
Member

So two commits are probably needed to fix this one.

  1. The diff I shared where you check the vrf_id with the interface looked up in nexthop_active_check()
  2. Netlink route updates of type RTM_DELROUTE need to call parse_nexthop_unicast() like RTM_NEWROUTE does in order to get nh.vrf_id when passed to rib_delete()

That should resolve the issue without having to change the nexthop comparison function.

@ryan44guo


The root question is why we need the vrf_id at all when the nexthop ifindex is given. If only the nexthop IP is given, we need the vrf_id to look up the outgoing interface, and then we send the nexthop IP and outgoing interface over netlink. (See the function nexthop_same_firsthop: what we really need is the nexthop IP and the outgoing interface.) But if the ifindex is given, that is enough, and the vrf_id is redundant. Keeping a vrf_id for this type only makes us do extra work that is useless, or even harmful if we get something wrong, as in this problem. So I think Tyler's change is better.

@sworleys
Member

@ryan44guo I'm not confident enough to answer whether the vrf_id is necessary or not but this patch presented alone will not fix the issue described in either 7.2 or current master with recent linux kernels.

We no longer receive explicit route deletes from the kernel on interface events that cause routes to be removed (they are silently deleted and we are expected to know).

ex)

This is with this patch added to current master on a 4.15 kernel:

root@ubuntu_nh:/home/sworley/Development/frr# ip link set dev dummyVRFblue master vrf-blue
root@ubuntu_nh:/home/sworley/Development/frr# ip ro add 8.8.8.8/32 dev dummyVRFblue vrf vrf-blue
root@ubuntu_nh:/home/sworley/Development/frr# ip link set dev dummyVRFblue master vrf-red
root@ubuntu_nh:/home/sworley/Development/frr# ip ro show vrf vrf-blue
root@ubuntu_nh:/home/sworley/Development/frr# 
ubuntu_nh# show ip ro vrf vrf-blue
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route


VRF vrf-blue:
K>* 8.8.8.8/32 [0/0] is directly connected, dummyVRFblue, 00:00:35
ubuntu_nh# 
root@ubuntu_nh:/home/sworley/Development/frr# uname -a
Linux ubuntu_nh 4.15.0-70-generic #79-Ubuntu SMP Tue Nov 12 10:36:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu_nh:/home/sworley/Development/frr# 

The route with nexthop of type NEXTHOP_TYPE_IFINDEX is not removed from our rib.

@tylerlinp
Author

@sworleys Yes, you are right. My patch cannot fix the issue in your example. In that case the kernel doesn't send RTM_DELROUTE.

Member

@bisdhdh bisdhdh left a comment

I too think this is not the correct place to fix this issue. It might fix one issue but open the door to others.

@ryan44guo

ryan44guo commented Nov 21, 2019

@sworleys Yes, you are right. My patch cannot fix the issue in your example. In that case the kernel doesn't send RTM_DELROUTE.

If the kernel doesn't send DELROUTE, that is a kernel bug: it should send DELROUTE over netlink if it sent ADDROUTE. These are two separate problems. This modification can make the route inactive, but making a route inactive is not the same as removing it, so it is not a correct workaround. Suppose the interface is bound back to its original VRF: in FRR's logic the route can become active again, but does the kernel do the same? At root this is a VRF + route problem, and it cannot be repaired by changes to the nexthop VRF.
So although this patch makes these two bugs look better, it may not be the correct fix for either of them.

@ryan44guo

It seems, to me, that moving this code would allow processes to delete routes they shouldn't be deleting... I think this is the wrong place to fix this problem.

Can you give an example? So far the only example we have is of routes that should be deleted but are not.

@sworleys
Member

If the kernel doesn't send DELROUTE, this is the kernel's bug.

Actually, no. From what I have been told, it was done intentionally in the kernel to limit route notifications on interface events.

@ryan44guo

If the kernel doesn't send DELROUTE, this is the kernel's bug.

Actually, no it was intentionally done in the kernel to limit route notifications on interface events from what I have been told.

If so, we can accept that as their design, although I doubt it is a good one. What they did is remove static routes whose nexthop points to a down interface, so when the interface comes back up, they cannot be added back. I don't think that is what we want. Changing the upper device (VRF) looks like a down followed by an up, so the static route is lost.
OK, if that is their design, what can we do? I doubt we can distinguish, from the netlink message, an automatically generated route (e.g. ff00::/8) from a static route configured by a user command. For the automatically generated route, the kernel may send us DELROUTE, so there is nothing more to do (of course, we need Tyler's change so that the route is really deleted). For a static route configured by a user command, maybe what we should do is delete it (not deactivate it, because the kernel has already deleted it).

@sworleys
Member

@ryan44guo I added #5184 to handle the lack of explicit RTM_DELROUTE for kernel routes.

I shared my suggestions earlier to add to this patch so that we can handle the vrf changing case. I tested it and it seems to work on my 4.19 kernel.

So two commits are probably needed to fix this one.

  1. The diff I shared where you check the vrf_id with the interface looked up in nexthop_active_check()
  2. Netlink route updates of type RTM_DELROUTE need to call parse_nexthop_unicast() like RTM_NEWROUTE does in order to get nh.vrf_id when passed to rib_delete()

If we add (2) I think we will no longer need tyler's change with the vrf_id though.

@sworleys
Member

It might be worth discussing this in our slack instead.

If you aren't a member, please join by clicking the slack icon and self-inviting yourself

https://frrouting.org/#participate

We can discuss this further in the #development channel or a private group one with everyone

@ryan44guo

@sworleys
I am sorry, but I cannot access Slack at my company because of its IT policy.
For this problem, I doubt that what you said in 2) can resolve Tyler's problem. What we discovered is that the route DELROUTE message arrives later than the interface VRF-change event (I guess because of the work queue in the kernel's IPv6 routing code), so if you derive the vrf_id from the ifp, you may still get the new vrf_id.
What you said in 1) can perhaps resolve a large part of the problem caused by the kernel's behavior (if you do not want to call it a bug), but there is a logical error here, although it does not come from your modification. Our rib_process is a timed task, so if an interface goes down and up (or leaves and rejoins the same VRF) within a short time, the kernel route is kept in FRR because it is still active when checked, even though the kernel has removed it. If we want to work around the kernel's behavior, we need to do so immediately after the interface event (at least before the next event).
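
The ordering concern about 2) can be made concrete with a small simulation. It assumes the event order described above (VRF change processed first, DELROUTE second), which the thread does not independently confirm; `struct port`, `struct kroute`, `on_vrf_change()`, and `on_route_delete()` are hypothetical names, and the ifindex/VRF values are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interface and kernel-route records. */
struct port {
	uint32_t ifindex;
	uint32_t vrf_id;
};

struct kroute {
	uint32_t nh_ifindex;
	uint32_t nh_vrf_id; /* VRF recorded when the route was learned */
	bool present;
};

/* The interface moves to another VRF. */
static void on_vrf_change(struct port *p, uint32_t new_vrf)
{
	assert(p);
	p->vrf_id = new_vrf;
}

/*
 * Handle a route delete by resolving the nexthop VRF from the interface's
 * current VRF and requiring an exact match with the stored nexthop.
 * Returns true if the route was removed.
 */
static bool on_route_delete(struct kroute *rt, const struct port *p)
{
	assert(rt && p);

	if (!rt->present || rt->nh_ifindex != p->ifindex)
		return false;
	if (rt->nh_vrf_id != p->vrf_id)
		return false; /* interface already moved: VRFs differ */
	rt->present = false;
	return true;
}
```

If `on_vrf_change()` runs before `on_route_delete()`, the lookup yields the new VRF, the stored nexthop still carries the old one, and the route stays in the table. If the delete is handled first, it matches and the route is removed.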

@qlyoung qlyoung self-requested a review November 26, 2019 16:44
@qlyoung qlyoung removed their request for review December 3, 2019 17:05
@sworleys
Member

partially fixed by #5553

@qlyoung qlyoung closed this Feb 5, 2020