bpf: Re-introduce ICMPv6 NS responder on from-netdev #30837
ci-e2e keeps failing on north-south-loadbalancing, but I couldn't reproduce it locally: https://github.com/cilium/cilium/actions/runs/7971912812/job/21798175918 Adding the pwru GitHub Action to see if I can get something useful.
This turns out to be a kernel-specific bug. c0dfeb0 (cc @julianwiedmann) added a test case for 5.4 with KPR=on + routing=vxlan, which revealed this hidden bug. It was accidentally fixed on 1.14, but has come back now that I intended to simply revert #25329. The scenario is north-south connectivity: we are curling from outside to node_ip:node_port. The key part is when the first TCP reply with SYN+ACK arrives at cilium_vxlan: it is supposed to be rev-NAT-ed to the original tuple, followed by a bpf_fib_lookup and then a bpf_redirect. Whether the bpf NS responder is present makes a difference to the v6 neighbor system:
That in turn affects bpf_fib_lookup:
#27642 is going to be helpful once we have the necessary kernel patch backported. For now, I'm going to work on another solution: bring back the bpf NS responder only if the NS is asking for a pod; if the NS is asking for the node IP, just hand it over to the stack.
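To make the proposed split concrete, here is a minimal userspace sketch in plain C of the decision described in this comment. It is not the datapath code: `classify_ns`, `is_local_pod`, `struct in6` and the verdict names are all hypothetical, the pod table stands in for an endpoint lookup, and passing NS for unknown targets to the stack is an assumption of the sketch rather than something stated here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical verdicts for this sketch; not Cilium's real return codes. */
enum ns_verdict {
	NS_REPLY_FROM_BPF, /* the bpf NS responder answers for a local pod */
	NS_PASS_TO_STACK,  /* the kernel neighbor code answers */
};

/* Hypothetical 128-bit address holder. */
struct in6 {
	uint8_t b[16];
};

static bool in6_equal(const struct in6 *a, const struct in6 *b)
{
	return memcmp(a->b, b->b, sizeof(a->b)) == 0;
}

/* Hypothetical stand-in for an endpoint lookup: linear scan of pod IPs. */
static bool is_local_pod(const struct in6 *target,
			 const struct in6 *pods, size_t n_pods)
{
	for (size_t i = 0; i < n_pods; i++) {
		if (in6_equal(target, &pods[i]))
			return true;
	}
	return false;
}

/* Decide who answers a Neighbor Solicitation for `target`. */
static enum ns_verdict classify_ns(const struct in6 *target,
				   const struct in6 *node_ip,
				   const struct in6 *pods, size_t n_pods)
{
	/* NS for the node IP: hand it over to the stack. */
	if (in6_equal(target, node_ip))
		return NS_PASS_TO_STACK;

	/* NS for a local pod: the bpf NS responder replies. */
	if (is_local_pod(target, pods, n_pods))
		return NS_REPLY_FROM_BPF;

	/* Assumption of this sketch: anything else also goes to the stack. */
	return NS_PASS_TO_STACK;
}
```

Checking the node IP first means the stack always owns the node address, even if it were to show up in the pod table by mistake.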
CI was happy: https://github.com/cilium/cilium/actions/runs/8018397688/job/21904227831 Will add more bpf unit tests and polish the commit messages for review.
[ upstream commit: dc9dfd7 ]
[ backporter's notes: in v1.15 icmp6_handle_ns() doesn't have ext_err parameter, so icmp6_host_handle() doesn't need to accept it either. ]
This reverts commit 6580714, to fix the breakage of "IPv6 NS responder for pod" introduced by cilium#12086 (bpf: Reply NA when recv ND for local IPv6 endpoints).
6580714 was merged to solve cilium#14509. To not revive cilium#14509, this commit also passes through ICMPv6 NS if the target is native node IP (eth0's addr). By letting stack take care of those NS-for-node-IP packets, we managed to:
1. Solve cilium#14509 again, but in a way keeping NS responder. The cause of cilium#14509 was NS responder always generates ND whose source IP is "router_ip" (cilium_internal_ip) rather than "node_ip". Once we pass those NS-for-node-IP packets to stack, the ND response would naturally have "node_ip" as source.
2. Avoid the fib_lookup failure mentioned at cilium#30837 (comment).
icmp6_host_handle() also has a new parameter `handle_ns` to control if we want NS responder to be active. If it is called from `to-netdev` code path, handle_ns is set to false. This is suggested by julianwiedmann.
Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
[ upstream commit: dc9dfd7 ]
[ backporter's notes: in v1.14 icmp6_handle_ns() doesn't have ext_err parameter, so icmp6_host_handle() doesn't need to accept it either. ]
This reverts commit 6580714, to fix the breakage of "IPv6 NS responder for pod" introduced by cilium#12086 (bpf: Reply NA when recv ND for local IPv6 endpoints).
6580714 was merged to solve cilium#14509. To not revive cilium#14509, this commit also passes through ICMPv6 NS if the target is native node IP (eth0's addr). By letting stack take care of those NS-for-node-IP packets, we managed to:
1. Solve cilium#14509 again, but in a way keeping NS responder. The cause of cilium#14509 was NS responder always generates ND whose source IP is "router_ip" (cilium_internal_ip) rather than "node_ip". Once we pass those NS-for-node-IP packets to stack, the ND response would naturally have "node_ip" as source.
2. Avoid the fib_lookup failure mentioned at cilium#30837 (comment).
icmp6_host_handle() also has a new parameter `handle_ns` to control if we want NS responder to be active. If it is called from `to-netdev` code path, handle_ns is set to false. This is suggested by julianwiedmann.
Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
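The role of the `handle_ns` flag from the commit message can be sketched in plain C as below. This is a simplified model under stated assumptions, not the real `icmp6_host_handle()`: the function names, the verdict enum and `MODEL_ICMP6_NS_TYPE` are hypothetical, `handle_ns` being true on the from-netdev path is inferred from the PR title, and passing every other ICMPv6 type to the stack is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* ICMPv6 Neighbor Solicitation type value (RFC 4861); the macro name is hypothetical. */
#define MODEL_ICMP6_NS_TYPE 135

/* Hypothetical verdicts for this sketch; not Cilium's real return codes. */
enum host_icmp6_verdict {
	HOST_ICMP6_RUN_NS_RESPONDER, /* continue into the NS responder logic */
	HOST_ICMP6_PASS_TO_STACK,    /* leave the packet to the kernel */
};

/* Model of the gate: the responder only runs for NS when handle_ns is set. */
static enum host_icmp6_verdict host_icmp6_model(uint8_t icmp6_type,
						bool handle_ns)
{
	if (icmp6_type == MODEL_ICMP6_NS_TYPE && handle_ns)
		return HOST_ICMP6_RUN_NS_RESPONDER;

	/* Assumption of this sketch: everything else goes to the stack. */
	return HOST_ICMP6_PASS_TO_STACK;
}

/* from-netdev path: NS responder active (inferred from the PR title). */
static enum host_icmp6_verdict from_netdev_model(uint8_t icmp6_type)
{
	return host_icmp6_model(icmp6_type, true);
}

/* to-netdev path: handle_ns is set to false, as the commit message states. */
static enum host_icmp6_verdict to_netdev_model(uint8_t icmp6_type)
{
	return host_icmp6_model(icmp6_type, false);
}
```

Passing the flag as an argument lets both code paths share one handler while only from-netdev ever answers NS.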
This reverts commit 6580714 to fix the breakage of the "IPv6 NS responder for pod" feature introduced by #12086 (bpf: Reply NA when recv ND for local IPv6 endpoints).
6580714 was merged to solve #14509. To avoid reviving #14509, this commit also passes ICMPv6 NS through to the stack if the target is the native node IP (eth0's address). By letting the stack take care of those NS-for-node-IP packets, we manage to:
1. Solve #14509 (IPV6 access to node is lost after installing cilium with ipv6 enabled) again, but in a way that keeps the NS responder. The cause of #14509 was that the NS responder always generated an ND response whose source IP is "router_ip" (cilium_internal_ip) rather than "node_ip". Once we pass those NS-for-node-IP packets to the stack, the ND response naturally has "node_ip" as its source.
2. Avoid the fib_lookup failure mentioned at #30837 (comment).
This PR also adds a bpf unit test to cover the IPv6 NS responder feature.
Fixes: #30926