-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement BPF-based masquerading for IPv6 #18694
Conversation
84df485
to
796fdcb
Compare
92cff28
to
9e8f18d
Compare
9e8f18d
to
30820cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, nice work!
Maybe one more thing to update:
$ grep supported Documentation/concepts/networking/masquerading.rst
eBPF based masquerading is currently not supported for IPv6 traffic.
30820cf
to
df0cdb8
Compare
3b2eef3
to
3cc9787
Compare
This comment was marked as resolved.
This comment was marked as resolved.
The complexity has been improved - #20425 (comment). @qmonnet @pchaigno this PR is no longer blocked. |
Since ctx->{data,data_end,data_meta} are 32-bits fields, LLVM sometimes generates 32-bit assignments for those pointers. This leads to verifier errors as the verifier is unable to track the packet pointers through the 32-bit assignments, as show below. ; return (void *)(unsigned long)ctx->data; 2: (61) r9 = *(u32 *)(r7 +76) ; R7_w=ctx(id=0,off=0,imm=0) R9_w=pkt(id=0,off=0,r=0,imm=0) ; return (void *)(unsigned long)ctx->data; 3: (bc) w6 = w9 ; R6_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R9_w=pkt(id=0,off=0,r=0,imm=0) ; if (data + tot_len > data_end) 4: (bf) r2 = r6 ; R2_w=inv(id=1,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=inv(id=1,umax_value=4294967295,var_off=(0x0; 0xffffffff)) 5: (07) r2 += 54 ; R2_w=inv(id=0,umin_value=54,umax_value=4294967349,var_off=(0x0; 0x1ffffffff)) ; if (data + tot_len > data_end) 6: (2d) if r2 > r1 goto pc+466 ; R1_w=pkt_end(id=0,off=0,imm=0) R2_w=inv(id=0,umin_value=54,umax_value=4294967349,var_off=(0x0; 0x1ffffffff)) ; tmp = a->d1 - b->d1; 7: (71) r2 = *(u8 *)(r6 +22) R6 invalid mem access 'inv' As a workaround, Yonghong Song suggested to read those fields using asm bytecode. With that trick, LLVM is unaware that the original field is actually on 32 bits and can't perform the above "optimization". This workaround is needed only now because a subsequent commit introduces some BPF C code that causes this bytecode to be generated (snat_v6_needed). Related: https://lore.kernel.org/bpf/c5a76b4d-abed-51f6-bf16-040eb0baf290@fb.com/T/#m933626f7404562a6e98af78a8f44c30977681ffa Suggested-by: Yonghong Song <yhs@fb.com> Signed-off-by: Paul Chaignon <paul@cilium.io>
This will allow us to easily define the IPv6 equivalent in a subsequent commit. This commit should include no functional changes. Signed-off-by: Paul Chaignon <paul@cilium.io>
This commit implements BPF masquerading for IPv6 by following the implementation for IPv4. The main function changed here is snat_v6_needed with almost the exact same steps as IPv4. Support for the ipmasq agent in IPv6 is not implemented here. [ Quentin: Rebased on master branch, addressed minor conflicts in nodeport.h due to commit 0ea4341 ("bpf: Make union v6addr const where possible"). ] Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
This commit refactors InitBPFMasqueradeAddrs to easily extend it for IPv6 in a subsequent commit. This commit shouldn't have any functional changes. Signed-off-by: Paul Chaignon <paul@cilium.io>
This commit allows enabling BPF masquerading for IPv6 and defines all the proper macros for the datapath implementation. [ Quentin: Rebased on master branch, addressed minor conflicts on daemon.go and loader.go. ] Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
BPF-based IPv6 masquerading is now supported, except for the ip-masq agent. This commit therefore updates the documentation accordingly. Signed-off-by: Paul Chaignon <paul@cilium.io>
Since Cilium supports BPF masquerading for IPv6, we can now test it. The upgrade/downgrade test needs a small exception because IPv6 BPF masquerading is still not supported on v1.11, from which we upgrade. The exception logic comes with an assertion, to ensure we don't forget to remove the exception once the upgrade is performed from v1.12. Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno I rebased, but I can't push on your repo. I pushed to cilium/pr/pchaigno/ipv6-bpf-masquerading instead, any chance you can swap to that branch for the PR? I addressed minor conflicts on daemon.go, loader.go, nodeport.h. Nothing that looked significant. Diff on nodeport.hdiff --git a/bpf/lib/nodeport.h b/bpf/lib/nodeport.h
index fa29b2908c90..22b7a7f8575d 100644
--- a/bpf/lib/nodeport.h
+++ b/bpf/lib/nodeport.h
@@ -179,10 +179,10 @@ static __always_inline bool nodeport_uses_dsr6(const struct ipv6_ct_tuple *tuple
* then the helper function won't depend the dsr checks.
*/
static __always_inline bool snat_v6_needed(struct __ctx_buff *ctx,
- const union v6addr *addr)
+ union v6addr *addr)
{
union v6addr masq_addr __maybe_unused;
- union v6addr dr_addr = IPV6_DIRECT_ROUTING;
+ const union v6addr dr_addr = IPV6_DIRECT_ROUTING;
struct remote_endpoint_info *remote_ep __maybe_unused;
struct endpoint_info *local_ep __maybe_unused;
void *data, *data_end; |
3cc9787
to
ae16b5d
Compare
Still hitting a complexity issue on 4.9, in bpf_host, when the host firewall is enabled. |
We can either allow IPv6 BPF masq only on kernels with |
The 4.9 complexity issue seems to be affecting function There seem to be another complexity issue on 4.19 with the host endpoint as well. Hopefully it's the same, but it's impossible to know since logs are missing (I suspect because of #20738). |
This comment was marked as resolved.
This comment was marked as resolved.
@pchaigno Will this also work in tunneling mode without an IPv4 address? |
Completed in #23165. |
would it make sense to disable the log.Fatal in this ip6_table module check? |
Discussed offline with Joe:
So we keep the current behaviour for now. |
Summary of commits:
ct_is_reply4
as a macro to facilitate implem. of IPv6 counterpart.InitBPFMasqueradeAddrs
to facilitate implem. of IPv6 counterpart.See commits for details.
Fixes: #14350.