Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datapath: pass IPv6 NDP traffic to stack without policy check #24919

Merged
merged 1 commit into from Jun 20, 2023

Conversation

jschwinger233
Copy link
Member

@jschwinger233 jschwinger233 commented Apr 17, 2023

Previously, our policy check for IPv6 NDP traffic caused issues such as #23852 and #23910 because this traffic was identified as WORLD_ID, which would be given a verdict of drop when CiliumNetworkPolicy is applied for per-endpoint routing.

To resolve this issue, we pass all IPv6 NDP traffic to the stack without policy check.

This change aligns with how we handle IPv4 ARP: the cilium bpf never performs policy check for ARP, regardless of whether we enable ENABLE_ARP_PASSTHROUGH or ENABLE_ARP_RESPONDER.

Fixes: #23852
Fixes: #23910

Signed-off-by: Zhichuan Liang gray.liang@isovalent.com

Bypassing policy check for IPv6 NDP to fix broken pod-to-pod connectivity when per-endpoint route is enabled with policy.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 17, 2023
@jschwinger233 jschwinger233 changed the title Gray/ipv6 ndp ICMPv6 NDP Apr 17, 2023
@jschwinger233 jschwinger233 added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Apr 28, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 28, 2023
@jschwinger233 jschwinger233 changed the title ICMPv6 NDP datapath: pass IPv6 NDP traffic to stack withtou policy check Apr 28, 2023
@jschwinger233 jschwinger233 requested a review from brb April 28, 2023 09:29
@jschwinger233 jschwinger233 marked this pull request as ready for review April 28, 2023 09:29
@jschwinger233 jschwinger233 requested a review from a team as a code owner April 28, 2023 09:29
@jschwinger233
Copy link
Member Author

@brb We don't really need to abandon ICMPv6 ND responder to resolve #23852 and #23910, just simply let the ND traffic pass to stack without policy check. Do you think this is an acceptable solution?

@brb
Copy link
Member

brb commented Apr 28, 2023

We don't really need to abandon ICMPv6 ND responder

SGTM! Just for what packets the built-in responder would be still used?

@jschwinger233
Copy link
Member Author

jschwinger233 commented Apr 28, 2023

Just for what packets the built-in responder would be still used?

All the ICMPv6 NS packets from containers are handled by responder, and the NS in the opposite direction, from host to container, are on kernel. The bpf program will lookup the endpoints map by NS target IPv6, then send back the ICMPv6 ND after amending the packet payload in place. Because container's IPv6 routing gateway is set to host router IPv6, almost all NS packets from container are targeting host router IPv6, and responder composes ND with lxc L2 address.

As mentioned in #23852 (comment), simply removing the NS responder doesn't make kernel capable of handling these NS packets. I managed to remove responder in a draft PR https://github.com/cilium/cilium/pull/24778/files by changing IPv6 gateway address inside each container during CNI runtime, it worked, in spite of breaking the Docker CNM plugin, but my main concern is, we must introduce a new option like ENABLE_IPV6_ND_RESPONDER, while I can't answer the question "for what scenario we should enable this option". So I think maybe just keep it simple.

@jschwinger233 jschwinger233 changed the title datapath: pass IPv6 NDP traffic to stack withtou policy check datapath: pass IPv6 NDP traffic to stack without policy check Apr 28, 2023
@jschwinger233
Copy link
Member Author

/test

@jschwinger233
Copy link
Member Author

Just for what packets the built-in responder would be still used?

Hold on, there are more places using ND responder, for example, ingress of eth0 when nodport / wireguard / host firewall is enabled.

That causes #14509 due to our responder setting the NA packet's source ipv6 to router ip, and some users complained about this behavior because their firewall/anti-spoofing filter doesn't recognize router ip and drops these NA packets as a result.

I think I still need to remove ND responder to address that issue.

But this PR could be independent to that. This PR is adequate to address #23910 and #23852, I'll open a new PR to remove responder.

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@pchaigno pchaigno added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. feature/ipv6 Relates to IPv6 protocol support labels May 15, 2023
@github-actions
Copy link

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jun 15, 2023
Previously, our policy check for IPv6 NDP traffic caused issues such
as cilium#23852 and cilium#23910 because this traffic was identified as WORLD_ID,
which would be given a verdict of drop when CiliumNetworkPolicy is
applied for per-endpoint routing.

To resolve this issue, we pass all IPv6 NDP traffic to the stack without
policy check.

This change aligns with how we handle IPv4 ARP: the cilium bpf never
performs policy check for ARP, regardless of whether we enable
`ENABLE_ARP_PASSTHROUGH` or `ENABLE_ARP_RESPONDER`.

Fixes: cilium#23852
Fixes: cilium#23910

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
@jschwinger233
Copy link
Member Author

/test

@jschwinger233
Copy link
Member Author

/ci-ginkgo

@jschwinger233
Copy link
Member Author

jschwinger233 commented Jun 20, 2023

😿 can't trigger Conformance ginkgo (ci-ginkgo)
working now

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 20, 2023
@borkmann borkmann added the pinned These issues are not marked stale by our issue bot. label Jun 20, 2023
@borkmann borkmann merged commit 5039106 into cilium:main Jun 20, 2023
62 checks passed
@tamilmani1989
Copy link
Contributor

tamilmani1989 commented Dec 28, 2023

can this fix be backported to 1.13?

PS: I tried to cherry-pick this PR and this one #24921 on top on 1.13.10 and its not direct and have to resolve a conflict and couple of minor fixes. wanted to see if we can still backport this to 1.13 with those changes?

@jschwinger233
Copy link
Member Author

@tamilmani1989 I'm not sure, because all those works are based on #23445, which is not likely to be backported to 1.13.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature/ipv6 Relates to IPv6 protocol support pinned These issues are not marked stale by our issue bot. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.
Projects
None yet
5 participants