Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement BPF-based masquerading for IPv6 #18694

Closed
wants to merge 7 commits into from

Conversation

pchaigno
Copy link
Member

@pchaigno pchaigno commented Feb 1, 2022

Summary of commits:

  1. Preparatory: use a bit of inline BPF asm to avoid bad optimization from LLVM.
  2. Preparatory: refactor ct_is_reply4 as a macro to facilitate implem. of IPv6 counterpart.
  3. Implement the datapath changes by following the IPv4 implementation.
  4. Preparatory: refactor InitBPFMasqueradeAddrs to facilitate implem. of IPv6 counterpart.
  5. Implement agent changes for the flags and macros.
  6. Update documentation.
  7. Extend the tests to cover IPv6 BPF masquerading.

See commits for details.

Fixes: #14350.

@pchaigno pchaigno added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/major This PR introduces major new functionality to Cilium. feature/ipv6 Relates to IPv6 protocol support labels Feb 1, 2022
@pchaigno pchaigno force-pushed the ipv6-bpf-masquerading branch 6 times, most recently from 84df485 to 796fdcb Compare February 6, 2022 17:10
@pchaigno pchaigno force-pushed the ipv6-bpf-masquerading branch 3 times, most recently from 92cff28 to 9e8f18d Compare February 10, 2022 16:22
@pchaigno pchaigno marked this pull request as ready for review February 10, 2022 22:10
@pchaigno pchaigno requested a review from a team February 10, 2022 22:10
@pchaigno pchaigno requested a review from a team as a code owner February 10, 2022 22:10
@pchaigno pchaigno requested a review from a team February 10, 2022 22:10
@pchaigno pchaigno requested a review from a team as a code owner February 10, 2022 22:10
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, nice work!

Maybe one more thing to update:

$ grep supported Documentation/concepts/networking/masquerading.rst
    eBPF based masquerading is currently not supported for IPv6 traffic.

@pchaigno pchaigno force-pushed the ipv6-bpf-masquerading branch 2 times, most recently from 3b2eef3 to 3cc9787 Compare May 23, 2022 17:09
@github-actions

This comment was marked as resolved.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jul 7, 2022
@github-actions github-actions bot removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jul 16, 2022
@brb
Copy link
Member

brb commented Jul 29, 2022

The complexity has been improved - #20425 (comment). @qmonnet @pchaigno this PR is no longer blocked.

Since ctx->{data,data_end,data_meta} are 32-bits fields, LLVM sometimes
generates 32-bit assignments for those pointers. This leads to verifier
errors as the verifier is unable to track the packet pointers through
the 32-bit assignments, as show below.

    ; return (void *)(unsigned long)ctx->data;
    2: (61) r9 = *(u32 *)(r7 +76)
    ; R7_w=ctx(id=0,off=0,imm=0) R9_w=pkt(id=0,off=0,r=0,imm=0)
    ; return (void *)(unsigned long)ctx->data;
    3: (bc) w6 = w9
    ; R6_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R9_w=pkt(id=0,off=0,r=0,imm=0)
    ; if (data + tot_len > data_end)
    4: (bf) r2 = r6
    ; R2_w=inv(id=1,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=inv(id=1,umax_value=4294967295,var_off=(0x0; 0xffffffff))
    5: (07) r2 += 54
    ; R2_w=inv(id=0,umin_value=54,umax_value=4294967349,var_off=(0x0; 0x1ffffffff))
    ; if (data + tot_len > data_end)
    6: (2d) if r2 > r1 goto pc+466
    ; R1_w=pkt_end(id=0,off=0,imm=0) R2_w=inv(id=0,umin_value=54,umax_value=4294967349,var_off=(0x0; 0x1ffffffff))
    ; tmp = a->d1 - b->d1;
    7: (71) r2 = *(u8 *)(r6 +22)
    R6 invalid mem access 'inv'

As a workaround, Yonghong Song suggested to read those fields using asm
bytecode. With that trick, LLVM is unaware that the original field is
actually on 32 bits and can't perform the above "optimization".

This workaround is needed only now because a subsequent commit
introduces some BPF C code that causes this bytecode to be generated
(snat_v6_needed).

Related: https://lore.kernel.org/bpf/c5a76b4d-abed-51f6-bf16-040eb0baf290@fb.com/T/#m933626f7404562a6e98af78a8f44c30977681ffa
Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Paul Chaignon <paul@cilium.io>
This will allow us to easily define the IPv6 equivalent in a subsequent
commit. This commit should include no functional changes.

Signed-off-by: Paul Chaignon <paul@cilium.io>
This commit implements BPF masquerading for IPv6 by following the
implementation for IPv4. The main function changed here is
snat_v6_needed with almost the exact same steps as IPv4.

Support for the ipmasq agent in IPv6 is not implemented here.

[ Quentin: Rebased on master branch, addressed minor conflicts in
  nodeport.h due to commit 0ea4341 ("bpf: Make union v6addr const
  where possible"). ]

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
This commit refactors InitBPFMasqueradeAddrs to easily extend it for
IPv6 in a subsequent commit. This commit shouldn't have any functional
changes.

Signed-off-by: Paul Chaignon <paul@cilium.io>
This commit allows enabling BPF masquerading for IPv6 and defines all
the proper macros for the datapath implementation.

[ Quentin: Rebased on master branch, addressed minor conflicts on
  daemon.go and loader.go. ]

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
BPF-based IPv6 masquerading is now supported, except for the ip-masq
agent. This commit therefore updates the documentation accordingly.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Since Cilium supports BPF masquerading for IPv6, we can now test it.

The upgrade/downgrade test needs a small exception because IPv6 BPF
masquerading is still not supported on v1.11, from which we upgrade.
The exception logic comes with an assertion, to ensure we don't forget
to remove the exception once the upgrade is performed from v1.12.

Signed-off-by: Paul Chaignon <paul@cilium.io>
@qmonnet
Copy link
Member

qmonnet commented Jul 29, 2022

@pchaigno I rebased, but I can't push on your repo. I pushed to cilium/pr/pchaigno/ipv6-bpf-masquerading instead, any chance you can swap to that branch for the PR?

I addressed minor conflicts on daemon.go, loader.go, nodeport.h. Nothing that looked significant.

Diff on nodeport.h
diff --git a/bpf/lib/nodeport.h b/bpf/lib/nodeport.h
index fa29b2908c90..22b7a7f8575d 100644
--- a/bpf/lib/nodeport.h
+++ b/bpf/lib/nodeport.h
@@ -179,10 +179,10 @@ static __always_inline bool nodeport_uses_dsr6(const struct ipv6_ct_tuple *tuple
  * then the helper function won't depend the dsr checks.
  */
 static __always_inline bool snat_v6_needed(struct __ctx_buff *ctx,
-					   const union v6addr *addr)
+					   union v6addr *addr)
 {
 	union v6addr masq_addr __maybe_unused;
-	union v6addr dr_addr = IPV6_DIRECT_ROUTING;
+	const union v6addr dr_addr = IPV6_DIRECT_ROUTING;
 	struct remote_endpoint_info *remote_ep __maybe_unused;
 	struct endpoint_info *local_ep __maybe_unused;
 	void *data, *data_end;

@pchaigno
Copy link
Member Author

Still hitting a complexity issue on 4.9, in bpf_host, when the host firewall is enabled.

@brb
Copy link
Member

brb commented Aug 2, 2022

Still hitting a complexity issue on 4.9, in bpf_host, when the host firewall is enabled.

We can either allow IPv6 BPF masq only on kernels with HAVE_LARGE_INSN_LIMIT, or wait until the second part of the SNAT refactoring has been done in #19712.

@pchaigno
Copy link
Member Author

pchaigno commented Aug 2, 2022

The 4.9 complexity issue seems to be affecting function handle_lxc_traffic in bpf_host.c. So that one should be really easy to fix by introducing tail calls, to separate the from_host_to_lxc and to_host_from_lxc code paths. It's ~20min of work if you rely on CI to run the tests.

There seem to be another complexity issue on 4.19 with the host endpoint as well. Hopefully it's the same, but it's impossible to know since logs are missing (I suspect because of #20738).

@github-actions

This comment was marked as resolved.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Sep 2, 2022
@qmonnet qmonnet added pinned These issues are not marked stale by our issue bot. and removed stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. labels Sep 2, 2022
@batistein
Copy link

@pchaigno Will this also work in tunneling mode without an IPv4 address?

@qmonnet
Copy link
Member

qmonnet commented Jun 16, 2023

Completed in #23165.

@qmonnet qmonnet closed this Jun 16, 2023
@networkop
Copy link
Contributor

would it make sense to disable the log.Fatal in this ip6_table module check?

@qmonnet
Copy link
Member

qmonnet commented Jul 24, 2023

Discussed offline with Joe:

Pretty sure even with that we still depend on ip6tables for some functionality, so by fataling we can make it crystal clear to the user that they have configured Cilium in a way that will cause IPv6 connectivity issues.

So we keep the current behaviour for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature/ipv6 Relates to IPv6 protocol support pinned These issues are not marked stale by our issue bot. release-note/major This PR introduces major new functionality to Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement BPF based masquerading for IPv6 traffic
10 participants