Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpf: nodeport: reduce CT lookup scope #25826

Merged
merged 4 commits into from
Jun 5, 2023

Conversation

julianwiedmann
Copy link
Member

@julianwiedmann julianwiedmann commented Jun 1, 2023

The standard behaviour for a CT lookup is that we first check whether the packet is a reply for some tracked connection. If that doesn't return a match, we do an additional lookup for the packet's actual direction (which might either find a match, or return CT_NEW). The exception is when we look for entries of type CT_SERVICE - here we already avoid the second lookup.

But we have some additional code paths that would benefit from excluding the second lookup

  • nodeport RevDNAT only wants to handle a CT_REPLY lookup result (as it only wants to reverse-translate the replies for nodeport connections). This PR converts those users.
  • the inbound nodeport code only cares about the forward direction, but currently needs to tolerate results from the reverse direction (treating them as stale CT entries that need to be re-created). All of this can go away if we switch to SCOPE_FORWARD.
  • the SNAT and RevSNAT paths should only care about their respective direction (I'll convert those in a follow-on, they need a bit more cleanup before).

Make the needed changes so that we can avoid the second lookup. This reduces code size / complexity, and offers more robust behaviour (as we no longer have to consider "impossible" CT results).

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 1, 2023
@julianwiedmann julianwiedmann added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/misc This PR makes changes that have no direct user impact. labels Jun 1, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 1, 2023
@julianwiedmann
Copy link
Member Author

/test

@julianwiedmann julianwiedmann added the kind/complexity-issue Relates to BPF complexity or program size issues label Jun 1, 2023
@julianwiedmann julianwiedmann changed the title 1.14 ct scope bpf: conntrack: reduce lookup scope for RevDNAT / RevSNAT paths Jun 2, 2023
@julianwiedmann julianwiedmann marked this pull request as ready for review June 2, 2023 07:52
@julianwiedmann julianwiedmann requested a review from a team as a code owner June 2, 2023 07:52
@julianwiedmann
Copy link
Member Author

@gentoo-root ha, I was hoping it would hit you 😄 . No particular rush, let's give this a good thought.

But the complexity results look pretty nice....

@julianwiedmann julianwiedmann marked this pull request as draft June 2, 2023 10:51
@julianwiedmann
Copy link
Member Author

(back to draft for a bit, we can do even better ...)

@julianwiedmann julianwiedmann changed the title bpf: conntrack: reduce lookup scope for RevDNAT / RevSNAT paths bpf: nodeport: reduce CT lookup scope Jun 2, 2023
@julianwiedmann
Copy link
Member Author

/test

Copy link
Contributor

@gentoo-root gentoo-root left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice cleanup! Conntrack lookups are becoming easier and easier to understand.

I'll take another look after you move it out from draft.

bpf/lib/conntrack.h Outdated Show resolved Hide resolved
/* Lookup entry in forward direction */
if (dir != CT_SERVICE) {
/* now lookup in forward direction: */
case SCOPE_FORWARD:
ipv6_ct_tuple_reverse(tuple);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A further improvement could be to make the callers pass the right tuple from the beginning, without the need to reverse it here if scope is SCOPE_FORWARD.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, nice idea. Definitely follow-on material - we'd need to check how far back we can push this in the call chain without having subtle changes in behaviour. But would make a lot of sense to me.

@julianwiedmann
Copy link
Member Author

julianwiedmann commented Jun 2, 2023

Really nice cleanup! Conntrack lookups are becoming easier and easier to understand.

I'll take another look after you move it out from draft.

Glad that you feel it becomes easier to understand! That's worth even more than the complexity reductions 😊.

FWIW, I'm in no rush to squeeze this in just before feature freeze. Thinking it might make more sense to merge it early for v1.15, add the SNAT changes, give it some time to mature, and take a thorough pass over all the debug statements etc in the CT code to make sure that everything still fits.

@julianwiedmann
Copy link
Member Author

/ci-l4lb

The function only returns a very narrow scope of values, document this
clearly.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
A CT lookup is usually done in two directions - first the reverse
direction (to match replies), then the actual direction. Currently the only
exception is CT_SERVICE lookups, where we only check the reverse direction.

Control this behaviour independently of the CT_SERVICE type, so that we can
introduce more uni-directional lookups with subsequent patches.

No functional change intended.

Note that users of SCOPE_REVERSE need to tolerate that their passed-in
CT tuple won't be reversed after the CT lookup.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
For RevDNAT we only care about lookups that return CT_REPLY. So avoid the
second lookup in the forward direction.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Both the NAT path in the LB (for DNATed service requests) and the DSR path
in the backend node (for requests on a DSR connection) need to perform
CT tracking.

Currently their CT code needs to handle an unexpected CT_REPLY result
from the CT lookup - which we're treating as a match of a stale entry that
needs to be recreated. But if we instead limit the lookup scope to the
forward direction, we won't even match such entries. Instead we get a
CT_NEW result, and create a fresh entry - implicitly clearing any sort of
stale entry.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
@julianwiedmann
Copy link
Member Author

/test

@julianwiedmann julianwiedmann marked this pull request as ready for review June 5, 2023 13:04
@julianwiedmann
Copy link
Member Author

Really nice cleanup! Conntrack lookups are becoming easier and easier to understand.
I'll take another look after you move it out from draft.

Glad that you feel it becomes easier to understand! That's worth even more than the complexity reductions blush.

FWIW, I'm in no rush to squeeze this in just before feature freeze. Thinking it might make more sense to merge it early for v1.15, add the SNAT changes, give it some time to mature, and take a thorough pass over all the debug statements etc in the CT code to make sure that everything still fits.

I take that back 😄. Looks like we might need it a bit earlier than I expected...

Updated the switch statement per your review, and squeezed some DBG improvements in as well.

bpf/lib/conntrack.h Show resolved Hide resolved
bpf/lib/dbg.h Show resolved Hide resolved
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 5, 2023
@julianwiedmann julianwiedmann merged commit 9a1a04f into cilium:main Jun 5, 2023
61 checks passed
@julianwiedmann julianwiedmann deleted the 1.14-ct-scope branch June 5, 2023 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/complexity-issue Relates to BPF complexity or program size issues ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants