bpf: nat: reduce CT lookup scope #25917

Merged 3 commits on Jun 7, 2023


Member

@julianwiedmann julianwiedmann commented Jun 6, 2023

Continuing from #25826, this PR applies the concept of "single-scope" CT lookup to the SNAT / RevSNAT paths. This makes the code paths simpler and more robust, while also reducing BPF program complexity.

@julianwiedmann julianwiedmann added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/misc This PR makes changes that have no direct user impact. kind/complexity-issue Relates to BPF complexity or program size issues labels Jun 6, 2023
@julianwiedmann
Member Author

/test

@julianwiedmann julianwiedmann marked this pull request as ready for review June 6, 2023 09:22
@julianwiedmann julianwiedmann requested a review from a team as a code owner June 6, 2023 09:22
bpf/lib/nat.h (outdated)
/* CT expects a tuple with the source and destination ports reversed,
 * while NAT uses normal tuples that match packet headers.
 */
ipv4_ct_tuple_swap_ports(&tuple_revsnat);
Contributor

Nice saving on one tuple copy! It was bugging me a bit back when I worked on my revSNAT fix.

It might also be worth dropping `ipv4_ct_tuple_swap_ports` and just assigning sport and dport to the right places. What do you think?

Member Author

I bet we can simplify this quite a bit more (especially in combination with your suggestion to pull the tuple swap out of the CT lookup). But let's do that in a follow-on PR, I still have nightmares from too much CT tuple wrangling 😬 .
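
The suggestion in this thread can be illustrated with a minimal sketch. It assumes a simplified, hypothetical `demo_tuple` struct (not Cilium's `struct ipv4_ct_tuple`) and a stand-in `demo_swap_ports()` helper. It only shows that filling a tuple in packet order and then swapping the ports gives the same result as assigning the ports to their final places directly.

```c
#include <stdint.h>

/* Hypothetical, simplified tuple; NOT Cilium's struct ipv4_ct_tuple. */
struct demo_tuple {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
};

/* Stand-in for a port-swap helper such as ipv4_ct_tuple_swap_ports(). */
static void demo_swap_ports(struct demo_tuple *t)
{
	uint16_t tmp = t->sport;

	t->sport = t->dport;
	t->dport = tmp;
}

/* Variant 1: fill the tuple in packet-header order, then swap the ports. */
static struct demo_tuple demo_build_then_swap(uint32_t saddr, uint32_t daddr,
					      uint16_t sport, uint16_t dport)
{
	struct demo_tuple t = {
		.saddr = saddr, .daddr = daddr,
		.sport = sport, .dport = dport,
	};

	demo_swap_ports(&t);
	return t;
}

/* Variant 2: assign the ports to their final places right away. */
static struct demo_tuple demo_build_direct(uint32_t saddr, uint32_t daddr,
					   uint16_t sport, uint16_t dport)
{
	struct demo_tuple t = {
		.saddr = saddr, .daddr = daddr,
		.sport = dport, .dport = sport,
	};

	return t;
}
```

Both variants produce the same tuple; the second simply avoids the extra helper call, which is the kind of simplification deferred to a follow-on PR here.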

The NAT code's CT helper currently doesn't differentiate whether it's
called from SNAT or RevSNAT context. If the CT lookup doesn't return an
entry, it unconditionally creates a new CT entry.

But RevSNAT should only be applied to replies of outbound connections. The
expected behaviour is that when such a connection's first packet is handled
by the SNAT path, we create (1) a NAT entry, (2) a RevNAT entry, and
(3) a CT entry that keeps the NAT entries from being GCed.

Thus we should never encounter a situation where the RevSNAT path finds a
matching RevNAT entry for a packet, but then doesn't find the corresponding
CT entry. And the CT entry we *would* create in this case does not match
what we create from the SNAT path at all (for a start, the type is
CT_INGRESS when it should be CT_EGRESS).

So skip the ct_create*() call from the RevSNAT path. This gives us more
robust behaviour, and reduces code size.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
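
The behaviour described in this commit message can be sketched roughly as follows. This is a toy model under stated assumptions: the CT map is replaced by a small array, connections are identified by a single integer, and all `demo_*` names are hypothetical rather than taken from `bpf/lib/nat.h`. What the real RevSNAT path does on a CT miss is not modelled; the sketch only shows that the reverse path no longer creates an entry.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CT_SLOTS 8

/* Hypothetical stand-in for the CT map: a tiny array of connection IDs. */
struct demo_ct_table {
	uint32_t ids[DEMO_CT_SLOTS];
	int used;
};

static bool demo_ct_lookup(const struct demo_ct_table *ct, uint32_t id)
{
	for (int i = 0; i < ct->used; i++) {
		if (ct->ids[i] == id)
			return true;
	}
	return false;
}

static bool demo_ct_create(struct demo_ct_table *ct, uint32_t id)
{
	if (ct->used >= DEMO_CT_SLOTS)
		return false;
	ct->ids[ct->used++] = id;
	return true;
}

/* SNAT path: on a lookup miss, the first outbound packet creates the entry.
 * Returns true if a CT entry exists for the connection afterwards.
 */
static bool demo_snat_track(struct demo_ct_table *ct, uint32_t id)
{
	if (demo_ct_lookup(ct, id))
		return true;
	return demo_ct_create(ct, id);
}

/* RevSNAT path: lookup only. A miss is reported to the caller and no
 * entry is created, so the table is never modified from this path.
 */
static bool demo_rev_snat_track(const struct demo_ct_table *ct, uint32_t id)
{
	return demo_ct_lookup(ct, id);
}
```

Taking the table as `const` in the reverse path makes the "never creates" property visible in the function signature.
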
The behaviour for SNAT and RevSNAT is sufficiently different that it makes
little sense to squeeze everything into a shared helper. Just open-code the
CT lookup in the callers. For the RevSNAT case this even allows us to
remove one of the temporary CT tuples.

Also remove the relevant BPF unit tests - we still have coverage for these
paths through `bpf_nat_tests.c`.

(Note that this doesn't completely remove the unused `ext_err` from the
RevSNAT path. I expect we will be using it pretty soon for other errors, so
let's avoid the churn for now).

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
SNAT should only care about outbound connections, while RevSNAT only wants
the replies for such connections. Apply the corresponding scope to their
CT lookups.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
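
As a rough illustration of what a scoped lookup means, the sketch below models a single tracked connection and a scope argument that restricts which direction may match. The `demo_scope` enum, `demo_flow` struct and `demo_ct_match()` function are hypothetical names for this example only, not Cilium's CT API. In this model, SNAT would use the forward scope and RevSNAT the reverse scope.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lookup scopes for this sketch. */
enum demo_scope {
	DEMO_SCOPE_FORWARD,	/* match the original (outbound) direction only */
	DEMO_SCOPE_REVERSE,	/* match the reply direction only */
	DEMO_SCOPE_BIDIR,	/* match either direction */
};

/* Hypothetical, simplified flow key. */
struct demo_flow {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

static bool demo_flow_eq(const struct demo_flow *a, const struct demo_flow *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport;
}

/* Returns the flow as seen in the opposite direction. */
static struct demo_flow demo_flow_reverse(const struct demo_flow *f)
{
	struct demo_flow r = {
		.saddr = f->daddr, .daddr = f->saddr,
		.sport = f->dport, .dport = f->sport,
	};

	return r;
}

/* 'entry' is the tracked connection in its original (outbound) direction;
 * 'pkt' is the flow key taken from the packet being processed.
 */
static bool demo_ct_match(const struct demo_flow *entry,
			  const struct demo_flow *pkt, enum demo_scope scope)
{
	struct demo_flow rev = demo_flow_reverse(entry);
	bool fwd_hit = demo_flow_eq(entry, pkt);
	bool rev_hit = demo_flow_eq(&rev, pkt);

	switch (scope) {
	case DEMO_SCOPE_FORWARD:
		return fwd_hit;
	case DEMO_SCOPE_REVERSE:
		return rev_hit;
	case DEMO_SCOPE_BIDIR:
		return fwd_hit || rev_hit;
	}
	return false;
}
```

With a single scope per caller, a reply packet can never match in the SNAT path and an outbound packet can never match in the RevSNAT path, which is the narrowing this commit describes.
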
@julianwiedmann
Member Author

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 6, 2023
@julianwiedmann julianwiedmann merged commit 7362cbe into cilium:main Jun 7, 2023
62 checks passed
@julianwiedmann julianwiedmann deleted the 1.14-bpf-snat-ct branch June 7, 2023 05:29
2 participants