
Zebra not able to get interfaces addresses and kernel route #10404

Closed
fluboi opened this issue Jan 22, 2022 · 15 comments
Labels
triage Needs further investigation

Comments

fluboi commented Jan 22, 2022

Describe the bug

Zebra is not able to get kernel routes and interface addresses. sh ip route returns nothing, and in the sh int brief output the Addresses column is empty.

tc2# sh ip route
tc2# 
tc2# sh int brief
Interface       Status  VRF             Addresses
---------       ------  ---             ---------
dummy0          up      default         
eno2            up      default         
lo              up      default         
tap103i0        up      default         
tap103i1        up      default         
tap113i0        up      default         
tap113i1        up      default         
tap113i2        up      default         
tap115i0        up      default         
tap116i0        up      default         
tap117i0        up      default         
tap118i0        up      default         
tap119i0        up      default         
tap120i0        up      default         
vmbr5           up      default         
vmbr5.2         up      default         

It does not match the kernel state:

lo               UNKNOWN        127.0.0.1/8 ::1/128 
eno2             UP             
vmbr5            UP             fe80::842f:9bff:fea6:488d/64 
vmbr5.2@vmbr5    UP             10.8.2.202/24 2a00:XX/64 fe80::842f:9bff:fea6:488d/64 
tap115i0         UNKNOWN        
tap119i0         UNKNOWN        
tap120i0         UNKNOWN        
tap103i0         UNKNOWN        
tap103i1         UNKNOWN        
tap116i0         UNKNOWN        
tap117i0         UNKNOWN        
tap113i0         UNKNOWN        
tap113i1         UNKNOWN        
tap113i2         UNKNOWN        
tap118i0         UNKNOWN        
dummy0           UNKNOWN        172.22.55.12/32 fe80::7ccb:a9ff:feca:5d52/64

Manually launching Zebra with --log-level debug shows:

root@tc2:~# /usr/lib/frr/zebra -t -F traditional -A 127.0.0.1 -s 90000000 --log-level debug
2022/01/22 14:24:48 ZEBRA: [KQNKJ-R5QVV][EC 4043309092] netlink-cmd (NS 0) error: data remnant size 32768
2022/01/22 14:24:48 ZEBRA: [KQNKJ-R5QVV][EC 4043309092] netlink-cmd (NS 0) error: data remnant size 32768
2022/01/22 14:24:48 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-cmd (NS 0) error: Device or resource busy, type=RTM_GETADDR(22), seq=4, pid=3468439451
2022/01/22 14:24:48 ZEBRA: [NNACN-54BDA][EC 4043309110] Disabling MPLS support (no kernel support)

[X] Did you check if this is a duplicate issue?
[ ] Did you test it on the latest FRRouting/frr master branch?

To Reproduce
I see the same bug on 2 different nodes, but they are in the same Proxmox cluster, with the same hardware...
I'm not able to reproduce it on another Proxmox host (same Kernel/PVE/FRR versions...)

It might be linked to vmbr5, a vlan_aware bridge, and its sub-interface vmbr5.2:

auto vmbr5
iface vmbr5 inet manual
        bridge_ports eno2
        bridge_stp off
        bridge_fd 0
        bridge_vlan_aware yes
        up echo 1 > /sys/devices/virtual/net/$IFACE/bridge/multicast_querier
        up echo 0 > /sys/devices/virtual/net/$IFACE/bridge/multicast_snooping

auto vmbr5.2
iface vmbr5.2 inet static
        address 10.8.2.202/24
        gateway 10.8.2.1

Versions

  • OS Version: Proxmox PVE 7.1-10 (based on Debian 11.2)
  • Kernel: 5.13.19-3-pve
  • FRR Version: 8.1
fluboi added the triage Needs further investigation label Jan 22, 2022
@tufeigunchu

When you kill the zebra task and let it restart, does the status become normal?

fluboi commented Jan 27, 2022

No, killing zebra (kill or kill -9) and letting watchfrr restart it does not change anything.

fluboi commented Jan 27, 2022

Indeed, it might be linked to #10423.
Interestingly, if I launch Zebra and then ifup a dummy interface, that dummy interface correctly appears in zebra. All the other interfaces still do not.

@tufeigunchu

What about changing the IP address of a tap interface?

fluboi commented Jan 27, 2022

If I add a new IP address on any interface while zebra is already running, it appears in frr/zebra. All the other addresses, which were there before zebra started, are still missing.

fluboi commented Jan 27, 2022

The issue is not present with FRR 7.4.
It is broken with 8.1, 8.0.1 and 7.5.1.

mjstapp (Contributor) commented Jan 27, 2022

I'm not able to reproduce this on a fairly vanilla Linux (Ubuntu 20, for example), so it sounds like there's something special going on in your environment.

fluboi commented Jan 27, 2022

OK, I finally managed to understand and reproduce it.

Vanilla Ubuntu 20.04.3 LTS
Kernel: 5.4.0-96-generic
FRR: 8.1

To reproduce:

ip link add vmbr0 type bridge vlan_filtering 1
for i in {10..20}; do
  ip link add dummy$i type dummy
  ip link set dev dummy$i up
  ip link set dummy$i master vmbr0
  bridge vlan del dev dummy$i vid 1
  bridge vlan add dev dummy$i vid 2-4094
done

systemctl restart frr

vtysh -c "sh int brief"
vtysh -c "sh ip route"

The issue occurs when there are too many VLANs.
In my real-world Proxmox setup, I have 2 VMs with no VLAN ID defined on their NICs in the GUI; the goal is to create a trunk with all VLANs available to the VMs (virtual FW/router VMs).
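
A hypothetical standalone probe (not FRR code; the request layout, the AF_BRIDGE family and the minimal error handling are illustrative only) can confirm what the "data remnant size 32768" log line above suggests: with a port carrying VLANs 2-4094, a single kernel netlink reply no longer fits in a fixed 32 KiB read buffer.

/*
 * Hypothetical probe: request an RTM_GETLINK dump with the per-port bridge
 * VLAN list and read it into a fixed 32 KiB buffer.  If recvmsg() reports
 * MSG_TRUNC, the kernel's reply for at least one interface is larger than
 * the buffer.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(void)
{
    struct {
        struct nlmsghdr nlh;
        struct ifinfomsg ifm;
        struct rtattr ext_req;
        __u32 ext_filter_mask;
    } req = {
        .nlh = { .nlmsg_len = sizeof(req),
                 .nlmsg_type = RTM_GETLINK,
                 .nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP,
                 .nlmsg_seq = 1 },
        .ifm = { .ifi_family = AF_BRIDGE },
        /* ask for the uncompressed per-port VLAN list */
        .ext_req = { .rta_len = RTA_LENGTH(sizeof(__u32)),
                     .rta_type = IFLA_EXT_MASK },
        .ext_filter_mask = RTEXT_FILTER_BRVLAN,
    };
    char buf[32768];                         /* fixed receive buffer */
    struct sockaddr_nl kernel = { .nl_family = AF_NETLINK };
    struct iovec iov = { buf, sizeof(buf) };
    struct msghdr msg = { .msg_name = &kernel, .msg_namelen = sizeof(kernel),
                          .msg_iov = &iov, .msg_iovlen = 1 };
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return 1;
    sendto(fd, &req, sizeof(req), 0,
           (struct sockaddr *)&kernel, sizeof(kernel));

    for (;;) {
        ssize_t n = recvmsg(fd, &msg, 0);
        struct nlmsghdr *h = (struct nlmsghdr *)buf;

        if (n <= 0)
            break;
        if (msg.msg_flags & MSG_TRUNC)       /* reply did not fit in 32 KiB */
            printf("truncated: kernel message larger than %zu bytes\n",
                   sizeof(buf));
        /* a real parser would walk every nlmsghdr in buf; checking the
         * first one is enough to spot the end of the dump here */
        if (h->nlmsg_type == NLMSG_DONE)
            break;
    }
    close(fd);
    return 0;
}

Compiled with gcc and run on a host prepared with the commands above, a "truncated" line would indicate that the dump exceeds the buffer.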

@donaldsharp (Member)

I have reproduced this.

donaldsharp added a commit to donaldsharp/frr that referenced this issue Feb 2, 2022
Currently when the kernel sends netlink messages to FRR
the buffers to receive this data is of fixed length.
The kernel, with certain configurations, will send
netlink messages that are larger than this fixed length.
This leads to situations where, on startup, zebra gets
really confused about the state of the kernel.  Effectively
the current algorithm is this:

read up to buffer in size
while (data to parse)
     get netlink message header, look at size
        parse if you can

The problem is that there is a 32k buffer we read.
We get the first message that is say 1k in size,
subtract that 1k to 31k left to parse.  We then
get the next header and notice that the length
of the message is 33k.  Which is obviously larger
than what we read in.  FRR has no recover mechanism
nor is there a way to know, a priori, what the maximum
size the kernel will send us.

Modify FRR to look at the kernel message and see if the
buffer is large enough, if not, make it large enough to
read in the message.

This code has to be per netlink socket because of the usage
of pthreads.  So add to `struct nlsock` the buffer and current
buffer length.  Growing it as necessary.

Fixes: FRRouting#10404
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
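
A minimal sketch of the approach this commit message describes, with illustrative names rather than the literal FRR patch: peek at the pending message with MSG_PEEK | MSG_TRUNC, which makes recvmsg() return the real datagram length even when the supplied buffer is too small, grow the buffer if needed, then perform the real read.

#include <stdlib.h>
#include <sys/socket.h>

/* Illustrative only: in FRR the buffer and its length live in the
 * per-socket `struct nlsock`; plain globals are used here for brevity. */
static char *nl_buf;
static size_t nl_buflen;

static ssize_t netlink_read_grow(int fd, struct msghdr *msg)
{
    struct iovec iov = { NULL, 0 };
    ssize_t need;

    /* Ask the kernel how big the next pending message really is. */
    msg->msg_iov = &iov;
    msg->msg_iovlen = 1;
    need = recvmsg(fd, msg, MSG_PEEK | MSG_TRUNC);
    if (need < 0)
        return need;

    /* Grow the receive buffer if the current size is not enough. */
    if ((size_t)need > nl_buflen) {
        char *p = realloc(nl_buf, need);
        if (!p)
            return -1;
        nl_buf = p;
        nl_buflen = need;
    }

    /* Now the whole message fits; read it for real. */
    iov.iov_base = nl_buf;
    iov.iov_len = nl_buflen;
    return recvmsg(fd, msg, 0);
}

As the commit message notes, the buffer has to be per netlink socket because of the pthread usage, which is why the real change hangs the buffer and its current length off `struct nlsock`.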
donaldsharp added a commit to donaldsharp/frr that referenced this issue Feb 2, 2022 (same commit message as above)
donaldsharp added a commit to donaldsharp/frr that referenced this issue Feb 4, 2022 (same commit message as above)
donaldsharp added a commit to donaldsharp/frr that referenced this issue Feb 8, 2022 (same commit message as above)
riw777 closed this as completed in 2cf7651 Feb 9, 2022
plsaranya pushed a commit to plsaranya/frr that referenced this issue Feb 28, 2022 (same commit message as above)

cosmedd commented Mar 30, 2022

Will this be backported to FRR 8.2.x?

@aderumier

@fluboi

Hi,
I'm the Proxmox FRR package maintainer. I'll try to see if I can backport it to the current Proxmox 8.0.1. (Another user has reported the same kind of netlink error.)

fluboi commented Apr 24, 2022

Hi,
Thanks @aderumier !
Currently the fix is in master, but not in 8.2.2 nor 8.3-dev.
@donaldsharp Any idea when we could expect that fix to be in a stable version?

@aderumier

@fluboi
I had tried to backport it to 8.0.1, but there are 2 other patches (#10482) and I'm not sure about the stability.

I think I'll update to 8.2.2 + patches (they seem to apply fine; I have done quick tests and I don't see any EVPN regression).

If you want to test, here is a build with the 3 patches:

wget https://mutulin1.odiso.net/frr_8.2.2-1+pve1_amd64.deb
dpkg -i frr_8.2.2-1+pve1_amd64.deb
systemctl restart frr

For the record, I have another Proxmox user on the forum with the same kind of problem:

Apr 22 09:06:55 parker zebra[1597466]: [WVJCK-PPMGD][EC 4043309093] netlink-cmd (NS 0) error: Device or resource busy, type=RTM_GETROUTE(26), seq=5, pid=2594392672

Apr 22 11:01:49 parker bgpd[1632074]: [VX6SM-8YE5W][EC 33554460] 10.0.10.4: nexthop_set failed, resetting connection - intf 0x0

https://forum.proxmox.com/threads/implementations-of-sdn-networking.99628/page-2

fluboi commented Apr 25, 2022

@aderumier
Just tried your package; it fixes the issue, and EVPN seems to work as expected (but it's a small lab). :)

@aderumier

@fluboi
Thanks. It also fixes the forum user's bug. I'm currently testing 8.2.2 on a big test cluster for 7-10 days. If it's OK, I'll update the Proxmox repo to 8.2.2.

patrasar pushed a commit to patrasar/frr that referenced this issue Apr 28, 2022 (same commit message as above)
gpnaveen pushed a commit to gpnaveen/frr that referenced this issue Jun 7, 2022 (same commit message as above)