Neighbor entry is installed on every NIC on a MultiNIC node #28660
Comments
Cc @ysksuzuki could you take a look? On multi-NIC we should only install it where the actual route is.
@borkmann Sure!
cilium/pkg/datapath/linux/node.go Line 687 in a79241a
EDIT: We might be able to use
This PR fixes the issue that neighbor entries can be installed on devices where no route to the destination host exists.

`netlink.RouteGetWithOptions` (equivalent to `ip route get`) with oif returns a route even if the destination is unreachable from the specified interface. The neighbor entries can be installed on all devices because of this.

`netlink.RouteGetWithOptions` with `FIBMatch` option (equivalent to `fibmatch` flag) returns full fib lookup matched route. It returns `No route to host` if the destination is non-routable from the specified device. This PR adds `FIBMatch` option to avoid installing the unnecessary entries.

Note:
- With IPv6, it returns `Network is unreachable` if the destination is unreachable with or without fibmatch.
- With `FIBMatch` option, `netlink.RouteGetWithOptions` returns `MultiPath` field if there are multiple paths to the dest. It returns `GW` field without `FIBMatch`. See examples below.

```
// FIBMatch: false
netlink.Route{ GW: 8.8.8.250, MultiPath: [] }
netlink.Route{ GW: 9.9.9.250, MultiPath: [] }

// FIBMatch: true
netlink.Route{ GW: <nil>, MultiPath: [{Ifindex: 1218 Weight: 1 Gw: 9.9.9.250 Flags: []}, {Ifindex: 1220 Weight: 1 Gw: 8.8.8.250 Flags: []}]}
netlink.Route{ GW: <nil>, MultiPath: [{Ifindex: 1218 Weight: 1 Gw: 9.9.9.250 Flags: []}, {Ifindex: 1220 Weight: 1 Gw: 8.8.8.250 Flags: []}]}
```

`ip route get` examples

```
$ ip route
10.0.0.0/24 dev veth1 proto kernel scope link src 10.0.0.1
10.0.1.0/24 dev veth3 proto kernel scope link src 10.0.1.1

$ ping -I veth1 10.0.0.1
PING 10.0.0.1 (10.0.0.1) from 10.0.0.1 veth1: 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.066 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.056 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.057 ms

$ ping -I veth3 10.0.0.1
PING 10.0.0.1 (10.0.0.1) from 10.0.1.1 veth3: 56(84) bytes of data.
From 10.0.1.1 icmp_seq=1 Destination Host Unreachable
From 10.0.1.1 icmp_seq=2 Destination Host Unreachable
From 10.0.1.1 icmp_seq=3 Destination Host Unreachable

$ ip route get 10.0.0.1 oif veth1
local 10.0.0.1 dev lo table local src 10.0.0.1 uid 1000
    cache <local>

// `ip route get` returns a route even if the destination is unreachable
// from the specified interface
$ ip route get 10.0.0.1 oif veth3
10.0.0.1 dev veth3 src 10.0.1.1 uid 1000
    cache

// With fibmatch flag, it returns full fib lookup matched route
$ ip route get fibmatch 10.0.0.1 oif veth1
local 10.0.0.1 dev veth1 proto kernel scope host src 10.0.0.1

$ ip route get fibmatch 10.0.0.1 oif veth3
RTNETLINK answers: No route to host
```

Fixes: cilium#28660
Signed-off-by: Yusuke Suzuki <ysuzuki4112@gmail.com>
What happened?
When Cilium runs on a MultiNIC node with `enable-l2-neigh-discovery`, a neighbor entry is installed for every node on every NIC, even if there is no route to the node through that NIC.
According to the code, the entry should only be installed if there is a route to the node on that NIC. But from what I tested on the node, the route lookup always returns a route,
so a neighbor entry is added for each node and each NIC. Because ARP resolution fails on those NICs, the kernel keeps trying to refresh the entries constantly. I see a retry from the kernel every 3 seconds.
Cilium Version
1.13.6
Kernel Version
6.1.42
Kubernetes Version
v1.28.1