Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zebra crashes when removing vrf #5369

Closed
tylerlinp opened this issue Nov 19, 2019 · 7 comments · Fixed by #5510
Closed

zebra crashes when removing vrf #5369

tylerlinp opened this issue Nov 19, 2019 · 7 comments · Fixed by #5510
Assignees

Comments

@tylerlinp
Copy link

After updating frr to 7.2(514f508) in SONiC, zebra crashes when removing vrf. I think there should be two bugs.

  1. In the process of deleting, cleanup all rib dest in this vrf immediately, but some dest may be referenced by fpm work queue that scheduling by timer.
  2. In comparing nexthops, including vrf, but vrf is get from ifindex if it exists, after interface changed master, deleting route would fail as vrf changed.

The implement in SONiC ensure removing routes before vrf, so the first bug can avoid in frr7.1. But with second bug, vrf is not empty at deleting, so the first bug appears.

I created PR#5368 to fix bug 2.

The first bug, we can see the dest have two flags, RIB_DEST_UPDATE_FPM and RIB_DEST_SENT_TO_FPM, in function rib_gc_dest, we can see if these flags exist, the dest can not been deleted. But when vrf remove, they are freed regardless of these flags. The difficult is they want to free the vrf here, if we keep the dest for fpm, the vrf remove should been delayed until all of these dests freed after fpm done.

2019/11/14 09:47:50 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_ADD 20.0.0.8/29 on Ethernet16(0)
2019/11/14 09:47:51 ZEBRA: MESSAGE: ZEBRA_INTERFACE_DOWN Ethernet16(0)
2019/11/14 09:47:51 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE 20.0.0.8/29 on Ethernet16(0)
2019/11/14 09:47:51 ZEBRA: MESSAGE: ZEBRA_INTERFACE_DOWN Ethernet16(0)
2019/11/14 09:47:51 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE 20.0.0.8/29 on Ethernet16(0)
2019/11/14 09:47:52 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE 20.0.0.8/29 on Ethernet16(0)
2019/11/14 09:47:54 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1248, seq=0, pid=0
2019/11/14 09:47:54 ZEBRA: RTM_NEWLINK update for Ethernet8(433) sl_type 0 master 0 flags 0x11043
2019/11/14 09:47:54 ZEBRA: Intf Ethernet8(433) has come UP
2019/11/14 09:47:54 ZEBRA: MESSAGE: ZEBRA_INTERFACE_UP Ethernet8(0)
2019/11/14 09:47:54 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:47:54 ZEBRA: RTM_NEWROUTE ipv6 unicast proto boot NS 0
2019/11/14 09:47:54 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:47:54 ZEBRA: RTM_NEWROUTE ipv6 unicast proto kernel NS 0
2019/11/14 09:47:56 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWADDR(20), len=72, seq=0, pid=0
2019/11/14 09:47:56 ZEBRA: netlink_interface_addr RTM_NEWADDR Ethernet8 flags 0x80:
2019/11/14 09:47:56 ZEBRA:   IFA_ADDRESS   fe80::42:acff:fe11:2/64
2019/11/14 09:47:56 ZEBRA:   IFA_CACHEINFO pref -1, valid -1
2019/11/14 09:47:56 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_ADD fe80::42:acff:fe11:2/64 on Ethernet8(0)
2019/11/14 09:47:56 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:47:56 ZEBRA: RTM_NEWROUTE ipv6 local proto  NS 0
2019/11/14 09:47:56 ZEBRA: Route rtm_type: local(2) intentionally ignoring
2019/11/14 09:47:56 ZEBRA: FPM: fe80::/64 triggering update to FPM - Reason: new route selected
2019/11/14 09:47:56 ZEBRA: FPM: fe80::/64 triggering update to FPM - Reason: new route selected
2019/11/14 09:47:56 ZEBRA: FPM: zfpm_netlink_encode_route : RTM_NEWROUTE fe80::/64, Proto: kernel, Metric: 0
2019/11/14 09:47:56 ZEBRA: FPM:   Intf: 433, Gateway: <No address>, Recursive: no, Type: Directly connected, Encap type: none
2019/11/14 09:47:56 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1260, seq=0, pid=0
2019/11/14 09:47:56 ZEBRA: RTM_NEWLINK for VRF Vrf_0(459) table 1001
2019/11/14 09:47:56 ZEBRA: VRF Vrf_0 created, id 459
2019/11/14 09:47:56 ZEBRA: VRF Vrf_0 id 459 is now active
2019/11/14 09:47:56 ZEBRA: MESSAGE: ZEBRA_VRF_ADD Vrf_0
2019/11/14 09:47:56 ZEBRA: RTM_NEWLINK ADD for Vrf_0(459) vrf_id 459 type 2 sl_type 0 master 0 flags 0x480
2019/11/14 09:47:56 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADD Vrf_0(459)
2019/11/14 09:47:56 ZEBRA: interface Vrf_0 vrf 459 index 459 becomes active.
2019/11/14 09:47:56 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1260, seq=0, pid=0
2019/11/14 09:47:56 ZEBRA: RTM_NEWLINK for VRF Vrf_0(459) table 1001
2019/11/14 09:47:56 ZEBRA: RTM_NEWLINK update for Vrf_0(459) sl_type 0 master 0 flags 0x104c1
2019/11/14 09:47:56 ZEBRA: Intf Vrf_0(459) has come UP
2019/11/14 09:47:56 ZEBRA: MESSAGE: ZEBRA_INTERFACE_UP Vrf_0(459)
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1248, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: RTM_NEWLINK update for Ethernet8(433) sl_type 0 master 0 flags 0x11043
2019/11/14 09:47:57 ZEBRA: Intf Ethernet8(433) PTM up, notifying clients
2019/11/14 09:47:57 ZEBRA: MESSAGE: ZEBRA_INTERFACE_UP Ethernet8(0)
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1276, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: RTM_NEWLINK vrf-change for Ethernet8(433) vrf_id 0 -> 459 flags 0x11043
2019/11/14 09:47:57 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE fe80::42:acff:fe11:2/64 on Ethernet8(0)
2019/11/14 09:47:57 ZEBRA: MESSAGE: ZEBRA_INTERFACE_VRF_UPDATE/DEL Ethernet8 VRF Id 0 -> 459
2019/11/14 09:47:57 ZEBRA: MESSAGE: ZEBRA_INTERFACE_VRF_UPDATE/ADD Ethernet8 VRF Id 0 -> 459
2019/11/14 09:47:57 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_ADD fe80::42:acff:fe11:2/64 on Ethernet8(459)
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELROUTE(25), len=116, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: RTM_DELROUTE ipv6 unicast proto kernel NS 0
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELROUTE(25), len=116, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: RTM_DELROUTE ipv6 unicast proto boot NS 0
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELNEIGH(29), len=88, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELNEIGH(29), len=88, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELNEIGH(29), len=88, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELADDR(21), len=72, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: netlink_interface_addr RTM_DELADDR Ethernet8 flags 0x80:
2019/11/14 09:47:57 ZEBRA:   IFA_ADDRESS   fe80::42:acff:fe11:2/64
2019/11/14 09:47:57 ZEBRA:   IFA_CACHEINFO pref -1, valid -1
2019/11/14 09:47:57 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE fe80::42:acff:fe11:2/64 on Ethernet8(459)
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELROUTE(25), len=116, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: RTM_DELROUTE ipv6 local proto  NS 0
2019/11/14 09:47:57 ZEBRA: Route rtm_type: local(2) intentionally ignoring
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1276, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: RTM_NEWLINK update for Ethernet8(433) sl_type 1 master 0 flags 0x1002
2019/11/14 09:47:57 ZEBRA: Intf Ethernet8(433) has gone DOWN
2019/11/14 09:47:57 ZEBRA: MESSAGE: ZEBRA_INTERFACE_DOWN Ethernet8(459)
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1276, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: RTM_NEWLINK update for Ethernet8(433) sl_type 1 master 0 flags 0x11043
2019/11/14 09:47:57 ZEBRA: Intf Ethernet8(433) has come UP
2019/11/14 09:47:57 ZEBRA: MESSAGE: ZEBRA_INTERFACE_UP Ethernet8(459)
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: RTM_NEWROUTE ipv6 unicast proto boot NS 0
2019/11/14 09:47:57 ZEBRA: RTM_NEWROUTE ff00::/8 vrf 459(1001) metric: 256 Admin Distance: 0
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: RTM_NEWROUTE ipv6 unicast proto kernel NS 0
2019/11/14 09:47:57 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1276, seq=0, pid=0
2019/11/14 09:47:57 ZEBRA: RTM_NEWLINK update for Ethernet8(433) sl_type 1 master 0 flags 0x11043
2019/11/14 09:47:57 ZEBRA: Intf Ethernet8(433) PTM up, notifying clients
2019/11/14 09:47:57 ZEBRA: MESSAGE: ZEBRA_INTERFACE_UP Ethernet8(459)
2019/11/14 09:47:57 ZEBRA: FPM: fe80::/64 triggering update to FPM - Reason: removing existing route
2019/11/14 09:47:57 ZEBRA: FPM: ff00::/8 triggering update to FPM - Reason: new route selected
2019/11/14 09:47:57 ZEBRA: FPM: zfpm_netlink_encode_route : RTM_DELROUTE fe80::/64, Proto: , Metric: 0
2019/11/14 09:47:57 ZEBRA: FPM: zfpm_netlink_encode_route : RTM_NEWROUTE ff00::/8, Proto: kernel, Metric: 256
2019/11/14 09:47:57 ZEBRA: FPM:   Intf: 433, Gateway: <No address>, Recursive: no, Type: Directly connected, Encap type: none
2019/11/14 09:47:58 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:47:58 ZEBRA: RTM_NEWROUTE ipv6 unicast proto kernel NS 0
2019/11/14 09:47:59 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWADDR(20), len=72, seq=0, pid=0
2019/11/14 09:47:59 ZEBRA: netlink_interface_addr RTM_NEWADDR Ethernet8 flags 0x80:
2019/11/14 09:47:59 ZEBRA:   IFA_ADDRESS   fe80::42:acff:fe11:2/64
2019/11/14 09:47:59 ZEBRA:   IFA_CACHEINFO pref -1, valid -1
2019/11/14 09:47:59 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_ADD fe80::42:acff:fe11:2/64 on Ethernet8(459)
2019/11/14 09:47:59 ZEBRA: rib_add_multipath: 459:fe80::/64: Inserting route rn 0x55610789b5c0, re 0x55610788f8a0 (connected) existing (nil)
2019/11/14 09:47:59 ZEBRA: rib_add_multipath: 459:fe80::/64 (MRIB): Inserting route rn 0x55610788ed00, re 0x55610788dc90 (connected) existing (nil)
2019/11/14 09:47:59 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:47:59 ZEBRA: RTM_NEWROUTE ipv6 local proto  NS 0
2019/11/14 09:47:59 ZEBRA: Route rtm_type: local(2) intentionally ignoring
2019/11/14 09:47:59 ZEBRA: FPM: fe80::/64 triggering update to FPM - Reason: new route selected
2019/11/14 09:47:59 ZEBRA: 459:fe80::/64: Adding route rn 0x55610789b5c0, re 0x55610788f8a0 (connected)
2019/11/14 09:47:59 ZEBRA: FPM: fe80::/64 triggering update to FPM - Reason: new route selected
2019/11/14 09:47:59 ZEBRA: 459:fe80::/64: Adding route rn 0x55610788ed00, re 0x55610788dc90 (connected)
2019/11/14 09:47:59 ZEBRA: FPM: zfpm_netlink_encode_route : RTM_NEWROUTE fe80::/64, Proto: kernel, Metric: 0
2019/11/14 09:47:59 ZEBRA: FPM:   Intf: 433, Gateway: <No address>, Recursive: no, Type: Directly connected, Encap type: none
2019/11/14 09:47:59 ZEBRA: 459:fe80::/64 update_from_ctx(): no fib nhg
2019/11/14 09:47:59 ZEBRA: 459:fe80::/64 update_from_ctx(): rib nhg matched, changed 'true'
2019/11/14 09:47:59 ZEBRA: 459:fe80::/64: Redist update re 0x55610788f8a0 (connected), old 0x0 (None)
2019/11/14 09:47:59 ZEBRA: Redist update filter prefix fe80::/64
2019/11/14 09:48:00 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWADDR(20), len=72, seq=0, pid=0
2019/11/14 09:48:00 ZEBRA: netlink_interface_addr RTM_NEWADDR Ethernet8 flags 0x80:
2019/11/14 09:48:00 ZEBRA:   IFA_ADDRESS   fc00::1/126
2019/11/14 09:48:00 ZEBRA:   IFA_CACHEINFO pref -1, valid -1
2019/11/14 09:48:00 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_ADD fc00::1/126 on Ethernet8(459)
2019/11/14 09:48:00 ZEBRA: rib_add_multipath: 459:fc00::/126: Inserting route rn 0x55610789c060, re 0x55610788e7b0 (connected) existing (nil)
2019/11/14 09:48:00 ZEBRA: rib_add_multipath: 459:fc00::/126 (MRIB): Inserting route rn 0x556107899b30, re 0x55610789c5b0 (connected) existing (nil)
2019/11/14 09:48:00 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:48:00 ZEBRA: RTM_NEWROUTE ipv6 local proto  NS 0
2019/11/14 09:48:00 ZEBRA: Route rtm_type: local(2) intentionally ignoring
2019/11/14 09:48:00 ZEBRA: FPM: fc00::/126 triggering update to FPM - Reason: new route selected
2019/11/14 09:48:00 ZEBRA: 459:fc00::/126: Adding route rn 0x55610789c060, re 0x55610788e7b0 (connected)
2019/11/14 09:48:00 ZEBRA: FPM: fc00::/126 triggering update to FPM - Reason: new route selected
2019/11/14 09:48:00 ZEBRA: 459:fc00::/126: Adding route rn 0x556107899b30, re 0x55610789c5b0 (connected)
2019/11/14 09:48:00 ZEBRA: FPM: zfpm_netlink_encode_route : RTM_NEWROUTE fc00::/126, Proto: kernel, Metric: 0
2019/11/14 09:48:00 ZEBRA: FPM:   Intf: 433, Gateway: <No address>, Recursive: no, Type: Directly connected, Encap type: none
2019/11/14 09:48:00 ZEBRA: 459:fc00::/126 update_from_ctx(): no fib nhg
2019/11/14 09:48:00 ZEBRA: 459:fc00::/126 update_from_ctx(): rib nhg matched, changed 'true'
2019/11/14 09:48:00 ZEBRA: 459:fc00::/126: Redist update re 0x55610788e7b0 (connected), old 0x0 (None)
2019/11/14 09:48:04 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELADDR(21), len=72, seq=0, pid=0
2019/11/14 09:48:04 ZEBRA: netlink_interface_addr RTM_DELADDR Ethernet8 flags 0x80:
2019/11/14 09:48:04 ZEBRA:   IFA_ADDRESS   fc00::1/126
2019/11/14 09:48:04 ZEBRA:   IFA_CACHEINFO pref -1, valid -1
2019/11/14 09:48:04 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE fc00::1/126 on Ethernet8(459)
2019/11/14 09:48:04 ZEBRA: rib_delnode: 459:fc00::/126: rn 0x55610789c060, re 0x55610788e7b0, removing
2019/11/14 09:48:04 ZEBRA: rib_delnode: 459:fc00::/126 (MRIB): rn 0x556107899b30, re 0x55610789c5b0, removing
2019/11/14 09:48:04 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELROUTE(25), len=116, seq=0, pid=0
2019/11/14 09:48:04 ZEBRA: RTM_DELROUTE ipv6 local proto  NS 0
2019/11/14 09:48:04 ZEBRA: Route rtm_type: local(2) intentionally ignoring
2019/11/14 09:48:04 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELROUTE(25), len=116, seq=0, pid=0
2019/11/14 09:48:04 ZEBRA: RTM_DELROUTE ipv6 unicast proto kernel NS 0
2019/11/14 09:48:04 ZEBRA: FPM: fc00::/126 triggering update to FPM - Reason: removing existing route
2019/11/14 09:48:04 ZEBRA: 459:fc00::/126: Deleting route rn 0x55610789c060, re 0x55610788e7b0 (connected)
2019/11/14 09:48:04 ZEBRA: 459:fc00::/126: Redist del: re 0x55610788e7b0 (connected), new re 0x0 (None)
2019/11/14 09:48:04 ZEBRA: rib_process: 459:fc00::/126: rn 0x55610789c060, removing re 0x55610788e7b0
2019/11/14 09:48:04 ZEBRA: rib_unlink: 459:fc00::/126: rn 0x55610789c060, re 0x55610788e7b0
2019/11/14 09:48:04 ZEBRA: 459:fc00::/126: Redist del: re 0x55610789c5b0 (connected), new re 0x0 (None)
2019/11/14 09:48:04 ZEBRA: rib_process: 459:fc00::/126 (MRIB): rn 0x556107899b30, removing re 0x55610789c5b0
2019/11/14 09:48:04 ZEBRA: rib_unlink: 459:fc00::/126 (MRIB): rn 0x556107899b30, re 0x55610789c5b0
2019/11/14 09:48:04 ZEBRA: rib_gc_dest: 459:fc00::/126 (MRIB): removing dest from table
2019/11/14 09:48:04 ZEBRA: FPM: zfpm_netlink_encode_route : RTM_DELROUTE fc00::/126, Proto: , Metric: 0
2019/11/14 09:48:04 ZEBRA: rib_gc_dest: 459:fc00::/126: removing dest from table
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1276, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_NEWLINK update for Ethernet8(433) sl_type 1 master 0 flags 0x11043
2019/11/14 09:48:05 ZEBRA: Intf Ethernet8(433) PTM up, notifying clients
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_UP Ethernet8(459)
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1248, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_NEWLINK vrf-change for Ethernet8(433) vrf_id 459 -> 0 flags 0x11043
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE fe80::42:acff:fe11:2/64 on Ethernet8(459)
2019/11/14 09:48:05 ZEBRA: rib_delnode: 459:fe80::/64: rn 0x55610789b5c0, re 0x55610788f8a0, removing
2019/11/14 09:48:05 ZEBRA: rib_delnode: 459:fe80::/64 (MRIB): rn 0x55610788ed00, re 0x55610788dc90, removing
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_VRF_UPDATE/DEL Ethernet8 VRF Id 459 -> 0
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_VRF_UPDATE/ADD Ethernet8 VRF Id 459 -> 0
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_ADD fe80::42:acff:fe11:2/64 on Ethernet8(0)
2019/11/14 09:48:05 ZEBRA: rib_add_multipath: 0:fe80::/64: Inserting route rn 0x5561079b2050, re 0x55610788e7b0 (connected) existing (nil)
2019/11/14 09:48:05 ZEBRA: rib_add_multipath: 0:fe80::/64 (MRIB): Inserting route rn 0x55610789f250, re 0x55610789e040 (connected) existing (nil)
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELROUTE(25), len=116, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_DELROUTE ipv6 unicast proto kernel NS 0
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELROUTE(25), len=116, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_DELROUTE ipv6 unicast proto boot NS 0
2019/11/14 09:48:05 ZEBRA: RTM_DELROUTE ff00::/8 vrf 459(1001) metric: 256 Admin Distance: 0
2019/11/14 09:48:05 ZEBRA: rib_delete: 459:ff00::/8: via :: ifindex 433 type 1 doesn't exist in rib
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELNEIGH(29), len=88, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELNEIGH(29), len=88, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELNEIGH(29), len=88, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELNEIGH(29), len=88, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELADDR(21), len=72, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: netlink_interface_addr RTM_DELADDR Ethernet8 flags 0x80:
2019/11/14 09:48:05 ZEBRA:   IFA_ADDRESS   fe80::42:acff:fe11:2/64
2019/11/14 09:48:05 ZEBRA:   IFA_CACHEINFO pref -1, valid -1
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_DELETE fe80::42:acff:fe11:2/64 on Ethernet8(0)
2019/11/14 09:48:05 ZEBRA: rib_delnode: 0:fe80::/64: rn 0x5561079b2050, re 0x55610788e7b0, removing
2019/11/14 09:48:05 ZEBRA: rib_delnode: 0:fe80::/64 (MRIB): rn 0x55610789f250, re 0x55610789e040, removing
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELROUTE(25), len=116, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_DELROUTE ipv6 local proto  NS 0
2019/11/14 09:48:05 ZEBRA: Route rtm_type: local(2) intentionally ignoring
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1248, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_NEWLINK update for Ethernet8(433) sl_type 0 master 0 flags 0x1002
2019/11/14 09:48:05 ZEBRA: Intf Ethernet8(433) has gone DOWN
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_DOWN Ethernet8(0)
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1248, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_NEWLINK update for Ethernet8(433) sl_type 0 master 0 flags 0x11043
2019/11/14 09:48:05 ZEBRA: Intf Ethernet8(433) has come UP
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_UP Ethernet8(0)
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_NEWROUTE ipv6 unicast proto boot NS 0
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_NEWROUTE ipv6 unicast proto kernel NS 0
2019/11/14 09:48:05 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1248, seq=0, pid=0
2019/11/14 09:48:05 ZEBRA: RTM_NEWLINK update for Ethernet8(433) sl_type 0 master 0 flags 0x11043
2019/11/14 09:48:05 ZEBRA: Intf Ethernet8(433) PTM up, notifying clients
2019/11/14 09:48:05 ZEBRA: MESSAGE: ZEBRA_INTERFACE_UP Ethernet8(0)
2019/11/14 09:48:05 ZEBRA: FPM: fe80::/64 triggering update to FPM - Reason: removing existing route
2019/11/14 09:48:05 ZEBRA: 459:fe80::/64: Deleting route rn 0x55610789b5c0, re 0x55610788f8a0 (connected)
2019/11/14 09:48:05 ZEBRA: 459:fe80::/64: Redist del: re 0x55610788f8a0 (connected), new re 0x0 (None)
2019/11/14 09:48:05 ZEBRA: 459:fe80::/64: Redist del old: skipping invalid prefix
2019/11/14 09:48:05 ZEBRA: rib_process: 459:fe80::/64: rn 0x55610789b5c0, removing re 0x55610788f8a0
2019/11/14 09:48:05 ZEBRA: rib_unlink: 459:fe80::/64: rn 0x55610789b5c0, re 0x55610788f8a0
2019/11/14 09:48:05 ZEBRA: 459:fe80::/64: Redist del: re 0x55610788dc90 (connected), new re 0x0 (None)
2019/11/14 09:48:05 ZEBRA: 459:fe80::/64: Redist del old: skipping invalid prefix
2019/11/14 09:48:05 ZEBRA: rib_process: 459:fe80::/64 (MRIB): rn 0x55610788ed00, removing re 0x55610788dc90
2019/11/14 09:48:05 ZEBRA: rib_unlink: 459:fe80::/64 (MRIB): rn 0x55610788ed00, re 0x55610788dc90
2019/11/14 09:48:05 ZEBRA: rib_gc_dest: 459:fe80::/64 (MRIB): removing dest from table
2019/11/14 09:48:05 ZEBRA: rib_process: 0:fe80::/64: rn 0x5561079b2050, removing re 0x55610788e7b0
2019/11/14 09:48:05 ZEBRA: rib_unlink: 0:fe80::/64: rn 0x5561079b2050, re 0x55610788e7b0
2019/11/14 09:48:05 ZEBRA: rib_gc_dest: 0:fe80::/64: removing dest from table
2019/11/14 09:48:05 ZEBRA: rib_process: 0:fe80::/64 (MRIB): rn 0x55610789f250, removing re 0x55610789e040
2019/11/14 09:48:05 ZEBRA: rib_unlink: 0:fe80::/64 (MRIB): rn 0x55610789f250, re 0x55610789e040
2019/11/14 09:48:05 ZEBRA: rib_gc_dest: 0:fe80::/64 (MRIB): removing dest from table
2019/11/14 09:48:05 ZEBRA: FPM: zfpm_netlink_encode_route : RTM_DELROUTE fe80::/64, Proto: , Metric: 0
2019/11/14 09:48:05 ZEBRA: rib_gc_dest: 459:fe80::/64: removing dest from table
2019/11/14 09:48:07 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWADDR(20), len=72, seq=0, pid=0
2019/11/14 09:48:07 ZEBRA: netlink_interface_addr RTM_NEWADDR Ethernet8 flags 0x80:
2019/11/14 09:48:07 ZEBRA:   IFA_ADDRESS   fe80::42:acff:fe11:2/64
2019/11/14 09:48:07 ZEBRA:   IFA_CACHEINFO pref -1, valid -1
2019/11/14 09:48:07 ZEBRA: MESSAGE: ZEBRA_INTERFACE_ADDRESS_ADD fe80::42:acff:fe11:2/64 on Ethernet8(0)
2019/11/14 09:48:07 ZEBRA: rib_add_multipath: 0:fe80::/64: Inserting route rn 0x5561079b2050, re 0x55610789e040 (connected) existing (nil)
2019/11/14 09:48:07 ZEBRA: rib_add_multipath: 0:fe80::/64 (MRIB): Inserting route rn 0x55610788d740, re 0x55610788ce30 (connected) existing (nil)
2019/11/14 09:48:07 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=116, seq=0, pid=0
2019/11/14 09:48:07 ZEBRA: RTM_NEWROUTE ipv6 local proto  NS 0
2019/11/14 09:48:07 ZEBRA: Route rtm_type: local(2) intentionally ignoring
2019/11/14 09:48:07 ZEBRA: FPM: fe80::/64 triggering update to FPM - Reason: new route selected
2019/11/14 09:48:07 ZEBRA: 0:fe80::/64: Adding route rn 0x5561079b2050, re 0x55610789e040 (connected)
2019/11/14 09:48:07 ZEBRA: FPM: fe80::/64 triggering update to FPM - Reason: new route selected
2019/11/14 09:48:07 ZEBRA: 0:fe80::/64: Adding route rn 0x55610788d740, re 0x55610788ce30 (connected)
2019/11/14 09:48:07 ZEBRA: FPM: zfpm_netlink_encode_route : RTM_NEWROUTE fe80::/64, Proto: kernel, Metric: 0
2019/11/14 09:48:07 ZEBRA: FPM:   Intf: 433, Gateway: <No address>, Recursive: no, Type: Directly connected, Encap type: none
2019/11/14 09:48:07 ZEBRA: 0:fe80::/64 update_from_ctx(): no fib nhg
2019/11/14 09:48:07 ZEBRA: 0:fe80::/64 update_from_ctx(): rib nhg matched, changed 'true'
2019/11/14 09:48:07 ZEBRA: 0:fe80::/64: Redist update re 0x55610789e040 (connected), old 0x0 (None)
2019/11/14 09:48:07 ZEBRA: Redist update filter prefix fe80::/64
2019/11/14 09:48:11 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWLINK(16), len=1260, seq=0, pid=0
2019/11/14 09:48:11 ZEBRA: RTM_NEWLINK for VRF Vrf_0(459) table 1001
2019/11/14 09:48:11 ZEBRA: RTM_NEWLINK update for Vrf_0(459) sl_type 0 master 0 flags 0x480
2019/11/14 09:48:11 ZEBRA: Intf Vrf_0(459) has gone DOWN
2019/11/14 09:48:11 ZEBRA: MESSAGE: ZEBRA_INTERFACE_DOWN Vrf_0(459)
2019/11/14 09:48:11 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_DELLINK(17), len=544, seq=0, pid=0
2019/11/14 09:48:11 ZEBRA: RTM_DELLINK for VRF Vrf_0(459)
2019/11/14 09:48:11 ZEBRA: VRF Vrf_0 id 459 is now inactive
2019/11/14 09:48:11 ZEBRA: MESSAGE: ZEBRA_VRF_DELETE Vrf_0
2019/11/14 09:48:11 ZEBRA: rib_unlink: 459:ff00::/8: rn 0x5561078980a0, re 0x5561078960c0
2019/11/14 09:48:11 ZEBRA: VRF Vrf_0 id 459 deleted
2019/11/14 09:48:11 ZEBRA: RTM_DELLINK for unknown interface Vrf_0(459)
ZEBRA: Received signal 11 at 1573724891 (si_addr 0x313dd6, PC 0x7fa6ed967be6); aborting...
zlog_signal+0x113                  7fa6f0a13083     7ffc21368f40 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fa6f09c4000)
PBKDF2_SHA256+0x5e4                7fa6f0a33384     7ffc21369070 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fa6f09c4000)
__restore_rt+0                     7fa6efd630e0     7ffc213691c0 /lib/x86_64-linux-gnu/libpthread.so.0 (mapped at 0x7fa6efd52000)
    ---- signal ----
zfpm_netlink_encode_route+0x76     7fa6ed967be6     7ffc21369750 /usr/lib/x86_64-linux-gnu/frr/modules/zebra_fpm.so (mapped at 0x7fa6ed963000)
zfpm_route_for_update+0x38d        7fa6ed9678ed     7ffc2136d830 /usr/lib/x86_64-linux-gnu/frr/modules/zebra_fpm.so (mapped at 0x7fa6ed963000)
thread_call+0x60                   7fa6f0a40a50     7ffc2136d8a0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fa6f09c4000)
frr_run+0xd8                       7fa6f0a10fa8     7ffc2136da40 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fa6f09c4000)
main+0x333                         55610596d9b3     7ffc2136dc60 /usr/lib/frr/zebra (mapped at 0x556105952000)
__libc_start_main+0xf1             7fa6ef9d32e1     7ffc2136dd70 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7fa6ef9b3000)
_start+0x2a                        55610596e1ea     7ffc2136de30 /usr/lib/frr/zebra (mapped at 0x556105952000)
?                                             0     7ffc2136de38
in thread zfpm_write_cb scheduled from zebra/zebra_fpm.c:482

@rzalamena
Copy link
Member

FPM part was fixed here: #5361

We probably need to backport it to 7.2.

@tylerlinp
Copy link
Author

@rzalamena Is that all? It seems my problem wasn't fixed at that PR. In my test, there is no zebra shutdown in running, just remove vrf device in kernel when FPM connected. If that vrf has routes, zebra could crash.

@rzalamena
Copy link
Member

@tylerlinp you are right. Sorry I recognized the last lines of the stack and assumed that was happening during zebra shutdown.

I managed to reproduce the problem myself and here is what address sanitizer complains:

    #0 0x7fd3e7ead992 in zfpm_trigger_update zebra/zebra_fpm.c:1458
    #1 0x559da4eecb0e in hook_call_rib_update zebra/zebra_rib.c:69
    #2 0x559da4ef1080 in rib_process_del_fib zebra/zebra_rib.c:867               
    #3 0x559da4ef2e23 in rib_process zebra/zebra_rib.c:1230
    #4 0x559da4ef67a9 in process_subq_route zebra/zebra_rib.c:2031
    #5 0x559da4ef6a4c in process_subq zebra/zebra_rib.c:2070
    #6 0x559da4ef6bde in meta_queue_process zebra/zebra_rib.c:2104
    #7 0x7fd3edaffbe3 in work_queue_run lib/workqueue.c:291
    #8 0x7fd3edae129f in thread_call lib/thread.c:1549
    #9 0x7fd3eda35d08 in frr_run lib/libfrr.c:1064
    #10 0x559da4e75f7f in main zebra/main.c:468
    #11 0x7fd3ed13ab96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #12 0x559da4e44cd9 in _start (/usr/lib/frr/zebra+0x6ccd9)

0x6080000058e8 is located 72 bytes inside of 88-byte region [0x6080000058a0,0x6080000058f8)
freed by thread T0 here:
    #0 0x7fd3edf037b8 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xde7b8)
    #1 0x7fd3eda44664 in qfree lib/memory.c:129
    #2 0x559da4f174e5 in zebra_rtable_node_cleanup zebra/zebra_vrf.c:414
    #3 0x7fd3edacbdac in route_node_free lib/table.c:89
    #4 0x7fd3edacbfdf in route_table_free lib/table.c:124
    #5 0x7fd3edacbc58 in route_table_finish lib/table.c:60
    #6 0x559da4effe01 in zebra_router_free_table zebra/zebra_router.c:167
    #7 0x559da4f000dc in zebra_router_release_table zebra/zebra_router.c:190
    #8 0x559da4f1656e in zebra_vrf_disable zebra/zebra_vrf.c:232
    #9 0x7fd3edae8eb7 in vrf_disable lib/vrf.c:319
    #10 0x7fd3edae897f in vrf_delete lib/vrf.c:229
    #11 0x559da4e4cb40 in netlink_vrf_change zebra/if_netlink.c:365
    #12 0x559da4e52bfa in netlink_link_change zebra/if_netlink.c:1228
    #13 0x559da4e6d08d in netlink_information_fetch zebra/kernel_netlink.c:284
    #14 0x559da4e70185 in netlink_parse_info zebra/kernel_netlink.c:932
    #15 0x559da4e6d2fa in kernel_read zebra/kernel_netlink.c:406
    #16 0x7fd3edae129f in thread_call lib/thread.c:1549
    #17 0x7fd3eda35d08 in frr_run lib/libfrr.c:1064
    #18 0x559da4e75f7f in main zebra/main.c:468
    #19 0x7fd3ed13ab96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

previously allocated by thread T0 here:                                                                                                                                                                                         [7/4861]
    #0 0x7fd3edf03d38 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded38)
    #1 0x7fd3eda4454e in qcalloc lib/memory.c:110
    #2 0x559da4ef7848 in zebra_rib_create_dest zebra/zebra_rib.c:2299
    #3 0x559da4ef79d6 in rib_link zebra/zebra_rib.c:2359
    #4 0x559da4ef7d14 in rib_addnode zebra/zebra_rib.c:2390
    #5 0x559da4efa655 in rib_add_multipath zebra/zebra_rib.c:2773
    #6 0x559da4efbfe5 in rib_add zebra/zebra_rib.c:3034
    #7 0x559da4e84859 in netlink_route_change_read_unicast zebra/rt_netlink.c:734
    #8 0x559da4e8635e in netlink_route_change zebra/rt_netlink.c:968
    #9 0x559da4e6d042 in netlink_information_fetch zebra/kernel_netlink.c:278
    #10 0x559da4e70185 in netlink_parse_info zebra/kernel_netlink.c:932
    #11 0x559da4e6d2fa in kernel_read zebra/kernel_netlink.c:406
    #12 0x7fd3edae129f in thread_call lib/thread.c:1549
    #13 0x7fd3eda35d08 in frr_run lib/libfrr.c:1064
    #14 0x559da4e75f7f in main zebra/main.c:468
    #15 0x7fd3ed13ab96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

To reproduce I just created a VRF, put an interface on it, installed some routes via BGP and opened a socket with FPM only discarding data. After that I deleted the VRF and the crash happened.

@tylerlinp
Copy link
Author

@rzalamena Yes, you reproduced it. Thank you.

@rzalamena
Copy link
Member

@tylerlinp it seems that this PR fixes the problem more precisely: #5553 .

@tylerlinp
Copy link
Author

@rzalamena That is not enough. The RTM_DELROUTE message for ff00::/8 comes after RTM_NEWLINK vrf-change for Ethernet8. The vrf_id gets from ifindex in RTM_DELROUTE message, so it changes as vrf-change. So it is needed to update all routes nexthops in rib when handling vrf-change message. The worst thing is that handling is asynchronous, maybe nexthop vrf hasn't been updated when to delete the route, then compare gets not same.

@rzalamena
Copy link
Member

@tylerlinp thanks for the feedback. Meanwhile we don't get a definite fix for this, would you be willing to try my new FPM module that uses data plane framework?

This new module doesn't the same race problem as the old one (it uses dplane_ctx that only gets freed after every data plane plugin processed it) and is much simpler (uses the same netlink code as the zebra <-> kernel socket).

PR: #5510 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants