datapath: Do not send ICMP6 NA over cilium_wg0 #23969

Merged
sayboras merged 1 commit into master from pr/brb/fix-wg-icmp6-na on Feb 24, 2023

Conversation

@brb (Member) commented Feb 23, 2023

See commit msg.

Successful test run - https://github.com/cilium/cilium/actions/runs/4251516818/jobs/7393997184 (the upcoming IPv6 connectivity tests).

cc @gandro

Fix #23899

While running the upcoming ci-datapath IPv6 connectivity tests on a
cluster with WG's node-to-node encryption enabled, I noticed that the
pod-to-host tests occasionally flake. For example:

  ❌ host-entity/pod-to-host/ping-3: cilium-test/client-7b78db77d5-dpl5z
  (fd00:10:244:2::83cf) -> fc00:f853:ccd:e793::6
  (fc00:f853:ccd:e793::6:0)

Further inspection revealed the following ICMPv6 neighbor resolution
exchange:

   eth0  Out IP6 fc00:f853:ccd:e793::3 > ff02::1:ff00:6: ICMP6, neighbor
   solicitation, who has fc00:f853:ccd:e793::6, length 32

   cilium_wg0 In IP6 fd00:10:244:1::b8ed > fc00:f853:ccd:e793::3: ICMP6,
   neighbor advertisement, tgt is fc00:f853:ccd:e793::6, length 32

By contrast, a successful resolution looked like this:

    eth0  Out IP6 fe80::42:acff:fe0c:103 > ff02::1:ff00:6: ICMP6,
    neighbor solicitation, who has fc00:f853:ccd:e793::6, length 32

    eth0  In  IP6 fd00:10:244:1::b8ed > fe80::42:acff:fe0c:103: ICMP6,
    neighbor advertisement, tgt is fc00:f853:ccd:e793::6, length 32

In the second case the NA did not go over the WG tunnel, because
neither fe80::42:acff:fe0c:103 nor ff02::1:ff00:6 is in the IPCache,
whereas fc00:f853:ccd:e793::3 (the eth0 IPv6 address) is.
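
For readers less familiar with neighbor discovery: the ff02::1:ff00:6 destination in both solicitations is the solicited-node multicast address of the target fc00:f853:ccd:e793::6, formed (per RFC 4291) by appending the low 24 bits of the target to ff02::1:ff00:0/104. The sketch below is only an illustration of that derivation; the function name is made up for this example and is not part of Cilium.

```python
import ipaddress

# Prefix defined by RFC 4291 for solicited-node multicast addresses.
SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(target: str) -> str:
    """Return the solicited-node multicast address for an IPv6 target.

    Hypothetical helper for illustration only.
    """
    low_24_bits = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    return str(ipaddress.IPv6Address(SOLICITED_NODE_BASE | low_24_bits))


if __name__ == "__main__":
    # Target address taken from the traces above.
    print(solicited_node_multicast("fc00:f853:ccd:e793::6"))
```

Because this address is derived from the target rather than assigned to any node, it is unsurprising that it has no IPCache entry.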

With the help of bpftrace:

    kprobe:__neigh_update {
        printf("__neigh_update: state:%x, flags:%x, new:%d\n",
            ((struct neighbour *)arg0)->nud_state, arg3, arg2);
    }

The following params were set for the NA received over the WG tunnel:

    __neigh_update: state:40, flags:7, new:2
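
To make that output easier to read, here is a small decoding sketch. It assumes the bit values from the kernel headers as I understand them for v6.1 (NUD_* in include/uapi/linux/neighbour.h, NEIGH_UPDATE_F_* in include/net/neighbour.h); only the low flag bits are listed, and the values are worth double-checking against the kernel version in use. The helper names are hypothetical. Note that the bpftrace script prints `state` and `flags` in hex and `new` in decimal.

```python
import re

# Assumed values from include/uapi/linux/neighbour.h (v6.1).
NUD_STATES = {
    0x01: "NUD_INCOMPLETE",
    0x02: "NUD_REACHABLE",
    0x04: "NUD_STALE",
    0x08: "NUD_DELAY",
    0x10: "NUD_PROBE",
    0x20: "NUD_FAILED",
    0x40: "NUD_NOARP",
    0x80: "NUD_PERMANENT",
}

# Assumed values from include/net/neighbour.h (v6.1); low bits only.
NEIGH_UPDATE_FLAGS = {
    0x1: "NEIGH_UPDATE_F_OVERRIDE",
    0x2: "NEIGH_UPDATE_F_WEAK_OVERRIDE",
    0x4: "NEIGH_UPDATE_F_OVERRIDE_ISROUTER",
}

TRACE_RE = re.compile(r"state:([0-9a-f]+), flags:([0-9a-f]+), new:(\d+)")


def decode_bits(value, names):
    """Return the names of all bits set in value."""
    return [name for bit, name in names.items() if value & bit]


def decode_trace_line(line):
    """Decode one line printed by the bpftrace script above."""
    match = TRACE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized trace line: {line!r}")
    state = int(match.group(1), 16)   # printed with %x
    flags = int(match.group(2), 16)   # printed with %x
    new = int(match.group(3), 10)     # printed with %d
    return {
        "state": decode_bits(state, NUD_STATES),
        "flags": decode_bits(flags, NEIGH_UPDATE_FLAGS),
        "new": decode_bits(new, NUD_STATES),
    }


if __name__ == "__main__":
    print(decode_trace_line("__neigh_update: state:40, flags:7, new:2"))
```

Under these assumptions, the entry's current state is NUD_NOARP and the requested new state is NUD_REACHABLE, with all three override flags set.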

NUD_NOARP was set, which caused the condition at [1] to be hit. Even if
we removed IFF_NOARP from cilium_wg0, the condition would still be hit
due to IFF_POINTOPOINT [2].

To fix the issue, make sure that we don't send ICMPv6 NA packets over
the WG tunnel.
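
As a rough illustration of the kind of check this implies (not the actual Cilium datapath code, which is written in BPF C and is not shown in this conversation), the sketch below classifies a raw IPv6 packet as an ICMPv6 neighbor advertisement. It assumes no IPv6 extension headers precede the ICMPv6 header; the function and helper names are hypothetical.

```python
import ipaddress
import struct

IPV6_HEADER_LEN = 40                 # fixed IPv6 header size
IPPROTO_ICMPV6 = 58                  # IPv6 next-header value for ICMPv6
ICMPV6_NEIGHBOR_SOLICITATION = 135   # RFC 4861
ICMPV6_NEIGHBOR_ADVERTISEMENT = 136  # RFC 4861


def is_icmpv6_na(packet: bytes) -> bool:
    """Return True if packet is an IPv6 ICMPv6 neighbor advertisement.

    Simplification: assumes ICMPv6 directly follows the fixed IPv6
    header (no extension headers).
    """
    if len(packet) < IPV6_HEADER_LEN + 1:
        return False
    if packet[0] >> 4 != 6:          # IP version field
        return False
    if packet[6] != IPPROTO_ICMPV6:  # next-header field
        return False
    return packet[IPV6_HEADER_LEN] == ICMPV6_NEIGHBOR_ADVERTISEMENT


def build_icmpv6_packet(src: str, dst: str, icmp_type: int) -> bytes:
    """Build a minimal test packet; the checksum is left as zero."""
    body = struct.pack("!BBH", icmp_type, 0, 0) + bytes(20)
    header = struct.pack("!IHBB", 6 << 28, len(body), IPPROTO_ICMPV6, 255)
    header += ipaddress.IPv6Address(src).packed
    header += ipaddress.IPv6Address(dst).packed
    return header + body


if __name__ == "__main__":
    # Addresses taken from the failing trace above.
    na = build_icmpv6_packet(
        "fd00:10:244:1::b8ed", "fc00:f853:ccd:e793::3",
        ICMPV6_NEIGHBOR_ADVERTISEMENT,
    )
    if is_icmpv6_na(na):
        print("NA: keep off the WG tunnel")
```

A packet matching this check would be left on the regular path instead of being redirected to cilium_wg0.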

[1]: https://github.com/torvalds/linux/blob/v6.1/net/core/neighbour.c#L1329
[2]: https://github.com/torvalds/linux/blob/v6.1/net/ipv6/ndisc.c#L357

Signed-off-by: Martynas Pumputis <m@lambda.lt>
@brb added the kind/bug, sig/datapath, release-note/bug, and area/encryption labels Feb 23, 2023
@brb added this to the 1.14 milestone Feb 23, 2023
@brb requested a review from a team as a code owner February 23, 2023 10:46

@dylandreimerink (Member) left a comment

LGTM

@brb (Member, Author) commented Feb 23, 2023

/ci-datapath

@brb (Member, Author) commented Feb 23, 2023

/test-1.25-4.19

@brb (Member, Author) commented Feb 23, 2023

/ci-verifier

@brb added the ready-to-merge label Feb 23, 2023

@sayboras (Member) commented

/test

@brb (Member, Author) commented Feb 24, 2023

@sayboras Forgot to mention in a comment, but I intentionally ran only those CI jobs which can be affected by the change.

@sayboras (Member) commented

> Forgot to mention in a comment, but I intentionally ran only those CI jobs which can be affected by the change.

No worries, I guessed your intention. However, I ran the tests during off-peak hours, so they actually finished quite fast.

@sayboras merged commit 84fb5fd into master Feb 24, 2023
@sayboras deleted the pr/brb/fix-wg-icmp6-na branch February 24, 2023 05:43
Labels
area/encryption: Impacts encryption support such as IPSec, WireGuard, or kTLS.
kind/bug: This is a bug in the Cilium logic.
ready-to-merge: This PR has passed all tests and received consensus from code owners to merge.
release-note/bug: This PR fixes an issue in a previous release of Cilium.
sig/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Development

Successfully merging this pull request may close these issues.

wireguard: ICMPv6 NA over cilium_wg0 is dropped
4 participants