Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpf: nat: Make snat_v6_needed() get on par with snat_v4_prepare_state() #26236

Merged
merged 5 commits into from Jun 26, 2023

Conversation

qmonnet
Copy link
Member

@qmonnet qmonnet commented Jun 14, 2023

We recently implemented snat_v6_needed() in the datapath to determine whether we need SNAT for IPv6 masquerading; but the function was written long before it was merged and was based on snat_v4_needed(), which has since been consolidated and renamed as snat_v4_prepare_state(), and changed again to prepare for Egress Gateway.

This PR aims at bringing snat_v6_needed() as close to snat_v4_prepare_state() as possible. In the process, we also fix a condition for SNAT-ing packets from local endpoints, and the IP address to use for masquerading packets leaving through a tunnel.

Commits:

  • bpf: nat: Move check on dest being outside cluster in snat_v6_needed()
  • bpf: nat: Pass target directly to snat_v6_needed()
  • bpf: nat: Do not skip IPv6 SNAT for packets from local endpoints
  • bpf: nat: Rename snat_v6_needed() into snat_v6_prepare_state()
  • bpf: nat: Use gateway IP for SNAT in the case of BPF overlay
bpf: Update IPv6 BPF masquerading code to bring it closer to IPv4's, fix SNAT for packets from local endpoints, for overlay

@qmonnet qmonnet added kind/enhancement This would improve or streamline existing functionality. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/misc This PR makes changes that have no direct user impact. feature/ipv6 Relates to IPv6 protocol support release-blocker/1.14 This issue will prevent the release of the next version of Cilium. labels Jun 14, 2023
@qmonnet
Copy link
Member Author

qmonnet commented Jun 14, 2023

/test

@joestringer
Copy link
Member

joestringer commented Jun 16, 2023

Is this really a kind/feature? What user-facing impact is there for this change?

At first I was a bit alarmed that a feature was being proposed two days before feature freeze, and still marked as draft the day before. But looking at the size of the change, the release note descrption, and the fact it's marked as release-note/misc, now I think maybe it's not actually a feature, just a nice-to-have code cleanup?

If you can get this into the tree tomorrow in coordination with the relevant code owners then I don't mind, but the window is starting to close on PR proposals for 1.14. We will be moving to a lower-churn phase of development shortly and at that point we'll be scrutinizing each change a bit more closely to determine whether it really matters to merge this around now in time for 1.14 or in a couple of weeks' time (for 1.15 dev cycle).

Woops, I mixed up my lists! The bit about churn still holds, but for now there's still some room for wiggle room around including enhancements, particularly if you think it'll help the 1.14 maintenance cycle post-release.

@qmonnet
Copy link
Member Author

qmonnet commented Jun 16, 2023

Thanks @joestringer for your comment and explanations! Yes, I'm aware of the feature freeze and lower churn phase. Our current labels don't allow us to make the distinction between “we absolutely need this for the release”, and “this is maybe not essential, but important enough that I would like some careful consideration on whether we should include this, and not let it slip unnoticed”. This PR would fall into the second category, I'd say.

I do believe it's an enhancement worth having, as it would bring IPv4 and IPv6 paths closer and make it easier to maintain in the future.

One additional consideration is that I'm not currently sure of whether the check on local_ep in the PR should be moved outside of the #ifdef ENABLE_MASQUERADE_IPV6, like the PR currently does (and like we do on the IPv4 path); I'm trying to understand if we need this and what the consequences are, and whether this is a bug or not in the current version of the code that we would address by merging this - but then, if there's a bug, we'll backport anyway. This is why this PR is still in draft status, by the way. [EDIT: Cleared it out with Gilberto, there's no bug, and we don't need this code to move outside of the #ifdef guard, so all is good.]

@qmonnet
Copy link
Member Author

qmonnet commented Jun 16, 2023

/test

@qmonnet
Copy link
Member Author

qmonnet commented Jun 16, 2023

/test
Rebase mistakes 🙄

@qmonnet
Copy link
Member Author

qmonnet commented Jun 16, 2023

/test

@qmonnet qmonnet force-pushed the pr/snat_v6_needed branch 2 times, most recently from 5762574 to 9bb7f55 Compare June 16, 2023 10:24
@qmonnet qmonnet marked this pull request as ready for review June 16, 2023 10:32
@qmonnet qmonnet requested a review from a team as a code owner June 16, 2023 10:32
@qmonnet qmonnet requested review from aspsk and julianwiedmann and removed request for aspsk June 16, 2023 10:32
@julianwiedmann
Copy link
Member

👋 thanks! Not sure I will manage to get in a proper review today, but some notes:

  • when called with IS_BPF_OVERLAY, do we need a matching case for the SNAT with IPV4_GATEWAY? Looks like this is ROUTER_IP in the IPv6 world.
  • it's a bit suspicious that we don't need to set target->from_local_endpoint. Did we miss to update snat_v6_nat_can_skip() ?

@qmonnet
Copy link
Member Author

qmonnet commented Jun 23, 2023

* when called with `IS_BPF_OVERLAY`, do we need a matching case for the SNAT with `IPV4_GATEWAY`? Looks like this is `ROUTER_IP` in the IPv6 world.

Yes, I think you're right. Otherwise we won't have the gateway/router IP when leaving through the tunnel. I've added a commit to fix this.

* it's a bit suspicious that we don't need to set `target->from_local_endpoint`. Did we miss to update `snat_v6_nat_can_skip()` ?

It is. And yes you're right again, I think this is because the IPv6 masquerading PR didn't pick up the update in that function. I've also added a commit to fix this.

Thanks a lot for the preliminary comments, I'm looking forward to the review 😄

@qmonnet
Copy link
Member Author

qmonnet commented Jun 23, 2023

Given the fixes mentioned above, I'll change the type for this PR to kind/bug.

@qmonnet qmonnet added kind/bug This is a bug in the Cilium logic. and removed kind/enhancement This would improve or streamline existing functionality. labels Jun 23, 2023
@qmonnet
Copy link
Member Author

qmonnet commented Jun 23, 2023

Looks like I messed up when reordering commits, and I've got one that doesn't compile. I'll try to fix this in the evening.

We recently implemented snat_v6_needed() in the datapath to determine
whether we need SNAT for IPv6 masquerading; but the function was written
long before it was merged and was based on snat_v4_needed(), which has
since been consolidated and renamed as snat_v4_prepare_state(), and
changed again to prepare for Egress Gateway.

The objective of this commit is to bring the two functions closer, to
make the datapath more consistent and to make it simpler to support
Egress Gateway with IPv6 in the future. Here we replicate the change
done for IPv4 in commit cdfc30d ("bpf: egressgw: sync logic to
determine if destination is outside cluster"): in the context of Egress
Gateway, the check on the traffic leaving the cluster, relying on
ipv6_addr_in_net() under IPV6_SNAT_EXCLUSION_DST_CIDR, may incorrectly
exclude some trafic from Egress Gateway SNAT. So we move the test above
the IPV6_SNAT_EXCLUSION_DST_CIDR block. See the IPv4 commit for more
details.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
We recently implemented snat_v6_needed() in the datapath to determine
whether we need SNAT for IPv6 masquerading; but the function was written
long before it was merged and was based on snat_v4_needed(), which has
since been consolidated and renamed as snat_v4_prepare_state(), and
changed again to prepare for Egress Gateway.

The objective here is to bring snat_v6_needed() slightly closer to
snat_v4_prepare_state(), by passing "target" instead of "target.addr"
from nodeport_snat_fwd_ipv6(), as done in IPv4 for 4532996 ("bpf:
nat: always track egress gateway connections"). This should help track
Egress Gateway connections in the future, although this is not supported
at this time.

No functional change.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
We're not setting or using target->from_local_endpoint in the IPv6 BPF
masquerading, so that we may skip masquerading for packets from local
endpoints. Let's fix and align on what we do for IPv4.

Fixes: 87981ac ("bpf: Implement IPv6 BPF masquerading")
Reported-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
We recently implemented snat_v6_needed() in the datapath to determine
whether we need SNAT for IPv6 masquerading; but the function was written
long before it was merged and was based on snat_v4_needed(), which has
since been consolidated and renamed as snat_v4_prepare_state(), and
changed again to prepare for Egress Gateway.

Here we replicate the changes from commit 59ac84f ("bpf: nat:
consolidate snat_v4_needed() state into ipv4_nat_target"), in order to
bring the IPv4 and IPv6 functions closer to one another.

No functional change.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Similarly to the IPv4 path, we need to use the "router IP" for IPv6 SNAT
when processing packets that leave the node via a tunnel.

Fixes: 87981ac ("bpf: Implement IPv6 BPF masquerading")
Reported-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
@joestringer
Copy link
Member

[EDIT: Cleared it out with Gilberto, there's no bug, and we don't need this code to move outside of the #ifdef guard, so all is good.]

Given the fixes mentioned above, I'll change the type for this PR to kind/bug.

I'm a bit confused. This is a refactor to align code right? The release note and release note label don't indicate anything about a bug being fixed.

At this point I'm willing to ship this PR in v1.14 if it lands before I branch (some time in the next few days), but I think we should remove the release-blocker label. It's not obvious to me that it's making sufficient ongoing progress to guarantee it lands in time, and it's about time we consider v1.14 development done and move on. If it turns out that this is still a bugfix and it doesn't land before branching, then we can make a call about whether to prepare+backport a more targeted fix or backport this PR when it's ready.

@qmonnet
Copy link
Member Author

qmonnet commented Jun 26, 2023

I'm a bit confused. This is a refactor to align code right? The release note and release note label don't indicate anything about a bug being fixed.

The confusion probably comes from my poor messaging, apologies. These two quotes refer to different things.

  • This PR started as a refactor to align code.
  • Doing so, I believed at first there might be a bug we might need to address with regards to the conditions under which we needed to run the checks on local_ep. But it fact no, we're good on this side.
  • Then Julian raised that there were still some divergences between the IPv4 and IPv6 paths after the PR regarding conditions for SNAT-ing packets from local endpoints, and for overlay. These are bugs in the current implementation, and this PR now addresses them.

I did forget to update the release note. This is fixed now.

I'm not sure what expectations you have regarding the release note label. My understanding is that release-note/bug is for PRs addressing issues affecting previous releases, which is not the case at the moment (IPv6 masquerading was added recently).

At this point [...]

Sorry to hear about the insufficient progress - You know I've been busy with patch releases, but then I know you're making sure everything is in order for 1.14 so I guess it's fair. But the proposed resolution sounds good, let's target completion before branching, or consider a backport otherwise. Thanks a lot for your feedback!

@qmonnet qmonnet removed the release-blocker/1.14 This issue will prevent the release of the next version of Cilium. label Jun 26, 2023
@qmonnet
Copy link
Member Author

qmonnet commented Jun 26, 2023

/test

Copy link
Member

@borkmann borkmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, both v4 and v6 variant are aligned logic-wise modulo the egress GW which is only for IPv4 at the moment.

The v4 code has this one:

# if defined(ENABLE_CLUSTER_AWARE_ADDRESSING) && defined(ENABLE_INTER_CLUSTER_SNAT)
	if (target->cluster_id != 0 &&
	    target->cluster_id != CLUSTER_ID) {
		target->addr = IPV4_INTER_CLUSTER_SNAT;
		return true;
	}
# endif

Is this again for v4-only? :/

@qmonnet
Copy link
Member Author

qmonnet commented Jun 26, 2023

Is this again for v4-only? :/

I thought so, because I only found IPV4_INTER_CLUSTER_SNAT and no IPv6 equivalent, plus I couldn't find where ENABLE_CLUSTER_AWARE_ADDRESSING and ENABLE_INTER_CLUSTER_SNAT were being set at first. But looking again, the IPv6 version was maybe missing only because IPv6 BPF masquerading was not here.

@YutaroHayakawa Could you please take a look and comment? Do we need the IPv6 equivalent to the IPv4 snippet quoted by Daniel now that we have IPv6 masquerading?

[EDIT: It looks like the changes for inter-cluster NAT are not trivial, so if we need them now, this would likely be for a follow-up PR.]

Copy link
Member

@julianwiedmann julianwiedmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm now, thank you Quentin! All good wrt inter-cluster-SNAT, that whole feature is currently IPv4-only.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 26, 2023
@qmonnet qmonnet merged commit b6235e4 into cilium:main Jun 26, 2023
65 checks passed
@qmonnet qmonnet deleted the pr/snat_v6_needed branch June 26, 2023 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature/ipv6 Relates to IPv6 protocol support kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants