[v1.14] bpf: lxc: defer CT_INGRESS entry creation for loopback connections #27920

Merged
merged 2 commits into from Sep 7, 2023

Conversation

julianwiedmann
Member

Manual backport (due to contextual conflicts and missing test infra) of:

Once this PR is merged, you can update the PR labels via:

for pr in 27602; do contrib/backporting/set-labels.py $pr done 1.14; done

or with

make add-labels BRANCH=v1.14 ISSUES=27602

@maintainer-s-little-helper maintainer-s-little-helper bot added the backport/1.14 and kind/backports labels Sep 4, 2023
@julianwiedmann
Member Author

/test-backport-1.14

@julianwiedmann julianwiedmann marked this pull request as ready for review September 4, 2023 12:42
@julianwiedmann julianwiedmann requested a review from a team as a code owner September 4, 2023 12:42
@julianwiedmann
Member Author

(resolve a context conflict with #27381)

@julianwiedmann
Member Author

/test-backport-1.14

Member

@qmonnet qmonnet left a comment

Looks good to me, thanks - I'll adjust my version for 1.12

[ upstream commit dbf4351 ]

[ backporter's notes: bring back ipv4_ct_tuple_swap_addrs() to avoid churn
  in the tests. ]

Currently the CT_INGRESS entry for a loopback connection is already created
when the first request leaves the client, as from-container creates the
CT_EGRESS entry (see the loopback handling in ct_create4() for details).

This is unusual - we would normally just create the CT_INGRESS entry as
the first packet passes through to-container into the backend. But for
loopback connections it is needed, so that
1.) to-container finds a CT entry with .loopback set, and thus skips
    network policy enforcement even for the first packet, and
2.) the CT entry has its rev_nat_index field populated, and thus can
    RevNAT replies in from-container.

This approach conflicts with the fact that loopback replies skip the
client's to-container path (to avoid network policy enforcement).

Once the loopback connection is closed, the backend's from-container path
observes the FIN / RST, and __ct_lookup4() updates the CT_INGRESS entry's
lifetime to CT_CLOSE_TIMEOUT. But the client's to-container path will not
observe the FIN / RST, and thus the CT_EGRESS entry's lifetime remains as
CT_CONNECTION_LIFETIME_TCP. Therefore the CT_INGRESS entry will expire
earlier, and potentially gets garbage-collected while the CT_EGRESS entry
stays in place.
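
The lifetime mismatch described above can be sketched with a small stand-alone model. This is not Cilium's datapath code: the struct, the helper names and the timeout values are assumptions made for illustration, and the close handshake is collapsed into a single FIN / RST event that only the backend's from-container path observes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only (assumption), not Cilium's configured timeouts. */
#define CT_CONNECTION_LIFETIME_TCP 21600u
#define CT_CLOSE_TIMEOUT           10u

/* Hypothetical, heavily simplified CT entry. */
struct ct_entry {
	bool present;
	uint32_t expires; /* absolute time in seconds */
};

struct loopback_conn {
	struct ct_entry egress;  /* client side, CT_EGRESS */
	struct ct_entry ingress; /* backend side, CT_INGRESS */
};

/* Models a CT lookup hit: a closing packet shortens the lifetime. */
static void ct_update_lifetime(struct ct_entry *e, uint32_t now, bool fin_or_rst)
{
	e->expires = now + (fin_or_rst ? CT_CLOSE_TIMEOUT
				       : CT_CONNECTION_LIFETIME_TCP);
}

/* Models garbage collection of an expired entry. */
static void ct_gc(struct ct_entry *e, uint32_t now)
{
	if (e->present && e->expires <= now)
		e->present = false;
}

static struct loopback_conn simulate_close(uint32_t gc_time)
{
	struct loopback_conn c = { { true, 0 }, { true, 0 } };

	/* t=0: the request refreshes both entries. */
	ct_update_lifetime(&c.egress, 0, false);
	ct_update_lifetime(&c.ingress, 0, false);

	/* t=5: the backend's from-container path sees the FIN / RST and
	 * shortens CT_INGRESS. The reply skips the client's to-container
	 * path, so CT_EGRESS is never updated.
	 */
	ct_update_lifetime(&c.ingress, 5, true);

	ct_gc(&c.egress, gc_time);
	ct_gc(&c.ingress, gc_time);
	return c;
}
```

In this model, calling `simulate_close(60)` leaves the CT_EGRESS entry in place while the CT_INGRESS entry has already been collected, which is the state that the next paragraph starts from.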

If the same loopback connection is subsequently re-opened, the client's
from-container path finds the CT_EGRESS entry and thus will *not* call
ct_create4(). Consequently the CT_INGRESS entry is not created, and the
backend will not apply the loopback-specific handling described above.
Inbound requests are potentially dropped (due to network policy), and/or
replies are not RevNATed.

Fix this up by letting the backend path create its CT_INGRESS entry as
usual. It just needs a bit of detection code in its CT_NEW handling to
understand that the first packet belongs to a .loopback connection, and
populate its own CT_INGRESS entry accordingly.
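
A minimal model of the old and new behaviour, under stated assumptions: the types and function names below are hypothetical and do not mirror the actual datapath code, and the "detection" is modelled as finding a matching CT_EGRESS entry with `.loopback` set, which is an assumption of this sketch rather than a description of the patch. The sketch replays the re-open scenario once with early creation and once with deferred creation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified CT entry. */
struct ct_entry {
	bool present;
	bool loopback;
	uint16_t rev_nat_index;
};

struct ct_state {
	struct ct_entry egress;  /* CT_EGRESS, client side */
	struct ct_entry ingress; /* CT_INGRESS, backend side */
};

/* Client's from-container path. Entries are only created on a CT_EGRESS
 * miss. With create_ingress_early the CT_INGRESS entry is created too
 * (old behaviour).
 */
static void client_egress(struct ct_state *ct, uint16_t rev_nat_index,
			  bool create_ingress_early)
{
	if (ct->egress.present)
		return; /* established: nothing gets created */

	ct->egress = (struct ct_entry){ true, true, rev_nat_index };
	if (create_ingress_early)
		ct->ingress = ct->egress;
}

/* Backend's to-container path. Returns true when the packet is handled
 * as loopback (policy skipped, rev_nat_index available for replies).
 */
static bool backend_ingress(struct ct_state *ct, bool detect_loopback)
{
	if (!ct->ingress.present) {
		/* CT_NEW: create the CT_INGRESS entry here. */
		struct ct_entry e = { true, false, 0 };

		if (detect_loopback && ct->egress.present &&
		    ct->egress.loopback) {
			e.loopback = true;
			e.rev_nat_index = ct->egress.rev_nat_index;
		}
		ct->ingress = e;
	}
	return ct->ingress.loopback;
}

/* Open a loopback connection, let CT_INGRESS expire, then re-open it. */
static bool reopen_is_loopback(bool deferred)
{
	struct ct_state ct = { { false, false, 0 }, { false, false, 0 } };

	client_egress(&ct, 7, !deferred);
	backend_ingress(&ct, deferred);

	/* CT_INGRESS is garbage-collected, CT_EGRESS stays in place. */
	ct.ingress = (struct ct_entry){ false, false, 0 };

	client_egress(&ct, 7, !deferred); /* CT_EGRESS hit */
	return backend_ingress(&ct, deferred);
}
```

With early creation (`reopen_is_loopback(false)`) the re-opened connection loses its loopback handling; with deferred creation (`reopen_is_loopback(true)`) the backend path rebuilds its own entry and keeps it.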

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>

[ upstream commit f0b44c0 ]

The test currently assumes SVC_PORT == BACKEND_PORT. This limits the test
coverage (we're not testing that L4 NAT works properly), and the assumption
does not necessarily hold in real-world scenarios.

Change the test so that the backend uses a different port.
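
Why equal ports weaken the test can be shown with a toy DNAT model. Everything here is hypothetical (struct layout, function names, the `translate_port` switch that simulates a broken port translation); it is not the test from this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical destination tuple and service definition. */
struct l4_tuple {
	uint32_t daddr;
	uint16_t dport;
};

struct svc {
	uint32_t vip;
	uint16_t svc_port;
	uint32_t backend_ip;
	uint16_t backend_port;
};

/* Toy DNAT. translate_port == false simulates a bug where only the
 * address is rewritten and the L4 port is left untouched.
 */
static struct l4_tuple dnat(const struct svc *s, struct l4_tuple t,
			    bool translate_port)
{
	if (t.daddr == s->vip && t.dport == s->svc_port) {
		t.daddr = s->backend_ip;
		if (translate_port)
			t.dport = s->backend_port;
	}
	return t;
}

/* The check a test would make: does the request reach the backend? */
static bool reaches_backend(const struct svc *s, bool translate_port)
{
	struct l4_tuple t = { s->vip, s->svc_port };

	t = dnat(s, t, translate_port);
	return t.daddr == s->backend_ip && t.dport == s->backend_port;
}
```

When `svc_port` equals `backend_port`, `reaches_backend()` returns true even with `translate_port` set to false, so the simulated bug goes unnoticed. With distinct ports the same check fails, which is the extra coverage the commit is after.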

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
@julianwiedmann
Member Author

/test-backport-1.14

@julianwiedmann
Member Author

(one more round of CI to make the branch protection happy)

@julianwiedmann julianwiedmann merged commit 622781b into cilium:v1.14 Sep 7, 2023
56 checks passed
@julianwiedmann julianwiedmann deleted the v1.14-lxc-ct-ingress branch September 7, 2023 10:25
Labels
backport/1.14: This PR represents a backport for Cilium 1.14.x of a PR that was merged to main.
kind/backports: This PR provides functionality previously merged into master.
3 participants