datapath: Do not send ICMP6 NA over cilium_wg0 #23969
While running the upcoming ci-datapath IPv6 connectivity tests on a cluster with WireGuard (WG) node-to-node encryption enabled, I noticed that the pod-to-host tests sometimes flake. For example:

    ❌ host-entity/pod-to-host/ping-3: cilium-test/client-7b78db77d5-dpl5z (fd00:10:244:2::83cf) -> fc00:f853:ccd:e793::6 (fc00:f853:ccd:e793::6:0)

Further inspection traced the failure to the following ICMPv6 neighbor resolution:

    eth0       Out IP6 fc00:f853:ccd:e793::3 > ff02::1:ff00:6: ICMP6, neighbor solicitation, who has fc00:f853:ccd:e793::6, length 32
    cilium_wg0 In  IP6 fd00:10:244:1::b8ed > fc00:f853:ccd:e793::3: ICMP6, neighbor advertisement, tgt is fc00:f853:ccd:e793::6, length 32

Meanwhile, a successful resolution looked like this:

    eth0 Out IP6 fe80::42:acff:fe0c:103 > ff02::1:ff00:6: ICMP6, neighbor solicitation, who has fc00:f853:ccd:e793::6, length 32
    eth0 In  IP6 fd00:10:244:1::b8ed > fe80::42:acff:fe0c:103: ICMP6, neighbor advertisement, tgt is fc00:f853:ccd:e793::6, length 32

The second exchange didn't go over the WG tunnel, because neither fe80::42:acff:fe0c:103 nor ff02::1:ff00:6 is in the IPCache, while fc00:f853:ccd:e793::3 (the eth0 IPv6 address) is.

With the help of bpftrace:

    kprobe:__neigh_update {
        printf("__neigh_update: state:%x, flags:%x, new:%d\n",
               ((struct neighbour *)arg0)->nud_state, arg3, arg2);
    }

the following parameters were observed for the NA received over the WG tunnel:

    __neigh_update: state:40, flags:7, new:2

NUD_NOARP (0x40) was set, which caused the condition at [1] to be hit. Even if we removed IFF_NOARP from cilium_wg0, the condition would still be hit due to IFF_POINTOPOINT [2].

To fix the issue, make sure that we don't send ICMPv6 NA packets over the WG tunnel.

[1]: https://github.com/torvalds/linux/blob/v6.1/net/core/neighbour.c#L1329
[2]: https://github.com/torvalds/linux/blob/v6.1/net/ipv6/ndisc.c#L357

Signed-off-by: Martynas Pumputis <m@lambda.lt>
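As an illustration of the fix described in the commit message, the check can be sketched in plain C. This is a minimal sketch and not the actual Cilium datapath code: `is_icmp6_na`, `should_encrypt`, and the `dst_in_ipcache` parameter are hypothetical names introduced here, and the parser assumes the buffer starts at the IPv6 header and carries no extension headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN     40  /* size of the fixed IPv6 header */
#define IPV6_NEXTHDR_OFF 6   /* byte offset of the Next Header field */
#define NEXTHDR_ICMPV6   58  /* IANA protocol number for ICMPv6 */
#define ICMPV6_NA        136 /* Neighbor Advertisement type (RFC 4861) */

/* Hypothetical helper: returns true if pkt, which must start at the IPv6
 * header, is an ICMPv6 neighbor advertisement. Assumes no extension headers. */
static bool is_icmp6_na(const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL);
    if (len < IPV6_HDR_LEN + 1)      /* need the ICMPv6 type byte */
        return false;
    if ((pkt[0] >> 4) != 6)          /* IP version must be 6 */
        return false;
    if (pkt[IPV6_NEXTHDR_OFF] != NEXTHDR_ICMPV6)
        return false;
    return pkt[IPV6_HDR_LEN] == ICMPV6_NA;
}

/* Hypothetical policy: a packet is redirected into the WG tunnel only if its
 * destination is known to the IPCache and it is not an ICMPv6 NA. */
static bool should_encrypt(const uint8_t *pkt, size_t len, bool dst_in_ipcache)
{
    if (!dst_in_ipcache)
        return false;
    return !is_icmp6_na(pkt, len);
}
```

With this kind of check, an NA addressed to a node IP that is present in the IPCache would leave via the native device instead of cilium_wg0, so the receiver would not process it on a NOARP/point-to-point interface.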
LGTM
/ci-datapath
/test-1.25-4.19
/ci-verifier
/test
@sayboras Forgot to mention in a comment, but I intentionally ran only those CI jobs that can be affected by the change.
No worries, I guessed your intention. However, I ran the tests during off-peak hours, so it was actually quite fast.
See commit msg.
Successful test run - https://github.com/cilium/cilium/actions/runs/4251516818/jobs/7393997184 (the upcoming IPv6 connectivity tests).
cc @gandro
Fix #23899