Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.14 Backports 2023-10-30 #28870

Merged
merged 25 commits into from Nov 7, 2023
Merged

Conversation

pippolo84
Copy link
Member

@pippolo84 pippolo84 commented Oct 30, 2023

@pippolo84 pippolo84 added kind/backports This PR provides functionality previously merged into master. backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. labels Oct 30, 2023
Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My commit looks good. Thanks!

@pippolo84 pippolo84 marked this pull request as ready for review October 30, 2023 14:12
@pippolo84 pippolo84 requested review from a team as code owners October 30, 2023 14:12
@pippolo84
Copy link
Member Author

@learnitall @julianwiedmann please review your commits very carefully, I hit quite a number of conflicts during the backporting.

Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My PR looks good. Thanks!

@pippolo84
Copy link
Member Author

/test-backport-1.14

Copy link
Member

@nebril nebril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My changes LGTM

@julianwiedmann
Copy link
Member

Peeked at the related ci-conformance fail

--- FAIL: Test (0.00s)
    --- FAIL: Test/CTMapPrivilegedTestSuite (0.00s)
        --- FAIL: Test/CTMapPrivilegedTestSuite/TestCtGcDsr (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x48 pc=0xc783f3]

Looks like we need below change for 1.14. Let me know if this is getting too involved and you would prefer a separate backport :).

diff --git a/pkg/maps/ctmap/ctmap_privileged_test.go b/pkg/maps/ctmap/ctmap_privileged_test.go
index f5c3fec337..d8b206b005 100644
--- a/pkg/maps/ctmap/ctmap_privileged_test.go
+++ b/pkg/maps/ctmap/ctmap_privileged_test.go
@@ -213,9 +213,9 @@ func (k *CTMapPrivilegedTestSuite) TestCtGcTcp(c *C) {
        defer natMap.Map.Unpin()
 
        ctMapName := MapNameTCP4Global + "_test"
-       mapInfo[mapTypeIPv4TCPGlobal] = mapAttributes{
-               natMap: natMap, natMapLock: mapInfo[mapTypeIPv4TCPGlobal].natMapLock,
-       }
+       setupMapInfo(mapTypeIPv4TCPGlobal, ctMapName,
+               &CtKey4Global{}, int(unsafe.Sizeof(CtKey4Global{})),
+               100, natMap)
 
        ctMap := newMap(ctMapName, mapTypeIPv4TCPGlobal)
        err = ctMap.OpenOrCreate()
@@ -323,9 +323,9 @@ func (k *CTMapPrivilegedTestSuite) TestCtGcDsr(c *C) {
        defer natMap.Map.Unpin()
 
        ctMapName := MapNameTCP4Global + "_test"
-       mapInfo[mapTypeIPv4TCPGlobal] = mapAttributes{
-               natMap: natMap, natMapLock: mapInfo[mapTypeIPv4TCPGlobal].natMapLock,
-       }
+       setupMapInfo(mapTypeIPv4TCPGlobal, ctMapName,
+               &CtKey4Global{}, int(unsafe.Sizeof(CtKey4Global{})),
+               100, natMap)
 
        ctMap := newMap(ctMapName, mapTypeIPv4TCPGlobal)
        err = ctMap.OpenOrCreate()
@@ -413,9 +413,9 @@ func (k *CTMapPrivilegedTestSuite) TestCtGcLegacyDsr(c *C) {
        defer natMap.Map.Unpin()
 
        ctMapName := MapNameTCP4Global + "_test"
-       mapInfo[mapTypeIPv4TCPGlobal] = mapAttributes{
-               natMap: natMap, natMapLock: mapInfo[mapTypeIPv4TCPGlobal].natMapLock,
-       }
+       setupMapInfo(mapTypeIPv4TCPGlobal, ctMapName,
+               &CtKey4Global{}, int(unsafe.Sizeof(CtKey4Global{})),
+               100, natMap)

@pippolo84
Copy link
Member Author

Looks like we need below change for 1.14. Let me know if this is getting too involved and you would prefer a separate backport :).

Thanks @julianwiedmann . I've applied the changes, let's see how it goes.

@pippolo84
Copy link
Member Author

/test-backport-1.14

@pippolo84 pippolo84 mentioned this pull request Oct 31, 2023
6 tasks
Copy link
Member

@julianwiedmann julianwiedmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, and the current CI fails look unrelated 🤞. Thank you!

Copy link
Contributor

@learnitall learnitall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for my PR, thank you!

@pippolo84
Copy link
Member Author

Should wait for #28940 to be merged first, in order to fix CI.

aanm and others added 17 commits November 3, 2023 16:52
[ upstream commit 07d3a21 ]

We need to refetch the pod labels again because we have just added
the endpoint into the endpoint manager. If we have received any pod
events, more specifically any events that modified the pod labels,
between the time the pod was created and the time it was added
into the endpoint manager, the pod event would not have been processed
since the pod event handler would not find the endpoint for that pod
in the endpoint manager. Thus, we will fetch the labels again
and update the endpoint with these labels.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit c1803ba ]

This commit changes the bugtool report to collect the XFRM error
counters (i.e., /proc/net/xfrm_stat) twice instead of only once. We will
collect at the beginning and end of the bugtool collection. In that way,
there will be around 5-6 seconds between the two collections and we may
see if any counter is currently increasing.

    $ diff cilium-bugtool-cilium-7d54p-20231025-115151/cmd/cat*--proc-net-xfrm_stat.md
    5c5
    < XfrmInStateProtoError   	4
    ---
    > XfrmInStateProtoError   	6

In this example, we can easily see that the XfrmInStateProtoError is
increasing. That suggests a key rotation issue is currently ongoing (cf.
IPsec troubleshooting docs).

I tried other approaches to collect over a longer timespan. That may
allow us to see slower increases. They all end up being more complex or
messier (we'd need to collect at beginning and end of the sysdump
instead). In the end, considering this is already a fallback plan for
when customers don't collect Prometheus metrics, I think the current,
simpler approach is good enough.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit b836cb1 ]

We recently introduced AWS-CONNMARK-CHAIN iptables rules deletion, but
didn't add them to an if statement guarding actual deletion.

Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 6ab728d ]

This change causes Cilium DaemonSet postStart hook to always delete AWS
iptable rules unless `cni.chainingMode` is set to `aws-cni`.

This will result in the postStart hook being a noop in all non-AWS
deployments. Unfortunately there is no way for helm chart to know
whether it is running on AWS not in ENI mode.

This approach will make sure that we are deleting AWS-specific iptables
rules that cause issues while not requiring us to introduce new
configuration flags for users.

Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 55517ea ]

The previous version of the implementation was actually computing the
labels starting from broader prefixes to narrower ones (first "/0", then
"/1" and so on).  As soon as we had a cache hit, the recursion stopped
without calculating the remaining labels for the CIDRs up to "ones".
This produced an incorrect (shorter) set of labels for a CIDR.

Also, netip.PrefixFrom(...) does not mask the internally stored address,
thus lowering the cache hit ratio even if two different CIDRs, used as
keys in the LRU cache, are equal in terms of masked address. (e.g:
"1.1.1.1/16" and "1.1.0.0/16").

So, netip.Addr.Prefix(...) is used instead.

After the fix, the performance are roughly equal (but with an increased
chance of having a cache hit). Instead, the maximum heap usage in the
worst case (LRU cache filled up with IPv6 labels) is increased 10x.

BenchmarkCIDRLabelsCacheHeapUsageIPv4
    cidr_test.go:628: Memoization map heap usage: 5483.24 KiB
BenchmarkCIDRLabelsCacheHeapUsageIPv6
    cidr_test.go:670: Memoization map heap usage: 54721.70 KiB

name                             old time/op    new time/op    delta
GetCIDRLabels/0.0.0.0/0-8           256ns ±39%     218ns ±46%     ~     (p=0.393 n=10+10)
GetCIDRLabels/10.16.0.0/16-8       1.35µs ± 3%    1.39µs ± 5%   +2.66%  (p=0.012 n=9+10)
GetCIDRLabels/192.0.2.3/32-8       2.52µs ± 2%    2.58µs ± 2%   +2.58%  (p=0.001 n=10+9)
GetCIDRLabels/192.0.2.3/24-8       2.57µs ± 1%    2.24µs ± 3%  -12.69%  (p=0.000 n=8+10)
GetCIDRLabels/192.0.2.0/24-8       2.27µs ± 4%    2.26µs ± 3%     ~     (p=0.690 n=9+8)
GetCIDRLabels/::/0-8                277ns ± 2%     278ns ± 3%     ~     (p=0.796 n=9+9)
GetCIDRLabels/fdff::ff/128-8       9.42µs ± 1%    9.34µs ± 6%     ~     (p=0.484 n=9+10)
GetCIDRLabels/f00d:42::ff/128-8    9.58µs ± 2%    9.62µs ± 7%     ~     (p=0.905 n=10+9)
GetCIDRLabels/f00d:42::ff/96-8     9.63µs ± 1%    8.45µs ± 3%  -12.27%  (p=0.000 n=8+9)
GetCIDRLabelsConcurrent/1-8         205µs ± 3%     207µs ± 3%     ~     (p=0.356 n=9+10)
GetCIDRLabelsConcurrent/2-8         385µs ± 5%     386µs ± 7%     ~     (p=0.631 n=10+10)
GetCIDRLabelsConcurrent/4-8         784µs ± 5%     780µs ± 1%     ~     (p=0.156 n=10+9)
GetCIDRLabelsConcurrent/16-8       3.24ms ± 1%    3.25ms ± 2%     ~     (p=0.529 n=10+10)
GetCIDRLabelsConcurrent/32-8       6.40ms ± 1%    6.39ms ± 3%     ~     (p=0.497 n=9+10)
GetCIDRLabelsConcurrent/48-8       9.69ms ± 1%   10.09ms ± 6%   +4.09%  (p=0.008 n=8+9)

name                             old alloc/op   new alloc/op   delta
GetCIDRLabels/0.0.0.0/0-8            624B ± 0%      624B ± 0%     ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8       2.40kB ± 0%    2.40kB ± 0%     ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8       5.01kB ± 0%    5.01kB ± 0%     ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8       5.01kB ± 0%    4.93kB ± 0%   -1.64%  (p=0.002 n=8+10)
GetCIDRLabels/192.0.2.0/24-8       4.93kB ± 0%    4.93kB ± 0%     ~     (all equal)
GetCIDRLabels/::/0-8                 624B ± 0%      624B ± 0%     ~     (all equal)
GetCIDRLabels/fdff::ff/128-8       18.5kB ± 0%    18.5kB ± 0%     ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8    18.5kB ± 0%    18.5kB ± 0%     ~     (p=0.108 n=9+10)
GetCIDRLabels/f00d:42::ff/96-8     18.5kB ± 0%    18.5kB ± 0%   -0.06%  (p=0.000 n=10+10)
GetCIDRLabelsConcurrent/1-8         321kB ± 0%     321kB ± 0%     ~     (p=0.127 n=10+8)
GetCIDRLabelsConcurrent/2-8         641kB ± 0%     641kB ± 0%     ~     (p=0.928 n=10+10)
GetCIDRLabelsConcurrent/4-8        1.28MB ± 0%    1.28MB ± 0%     ~     (p=0.853 n=10+10)
GetCIDRLabelsConcurrent/16-8       5.13MB ± 0%    5.13MB ± 0%     ~     (p=0.739 n=10+10)
GetCIDRLabelsConcurrent/32-8       10.3MB ± 0%    10.3MB ± 0%     ~     (p=0.218 n=10+10)
GetCIDRLabelsConcurrent/48-8       15.4MB ± 0%    15.4MB ± 0%     ~     (p=0.218 n=10+10)

name                             old allocs/op  new allocs/op  delta
GetCIDRLabels/0.0.0.0/0-8            2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8         2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8         2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8         2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/192.0.2.0/24-8         2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/::/0-8                 2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/fdff::ff/128-8         3.00 ± 0%      3.00 ± 0%     ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8      3.00 ± 0%      3.00 ± 0%     ~     (all equal)
GetCIDRLabels/f00d:42::ff/96-8       3.00 ± 0%      3.00 ± 0%     ~     (all equal)
GetCIDRLabelsConcurrent/1-8           138 ± 0%       138 ± 0%     ~     (all equal)
GetCIDRLabelsConcurrent/2-8           277 ± 0%       277 ± 0%     ~     (all equal)
GetCIDRLabelsConcurrent/4-8           555 ± 0%       555 ± 0%     ~     (p=0.248 n=10+9)
GetCIDRLabelsConcurrent/16-8        2.22k ± 0%     2.22k ± 0%     ~     (p=0.353 n=7+10)
GetCIDRLabelsConcurrent/32-8        4.44k ± 0%     4.44k ± 0%     ~     (p=0.723 n=10+10)
GetCIDRLabelsConcurrent/48-8        6.66k ± 0%     6.66k ± 0%     ~     (p=0.090 n=10+9)

Fixes: e0f6c47 ("labels/cidr: Cache GetCIDRLabels computation")

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 1b9b3fc ]

After the introduction of a LRU cache in GetCIDRLabels, the tests should
verify the labels computation both when the cache is cold but also when
it is hot.

Thus, the tests are refactored to check the returned labels twice.

Also, an additional test is added to verify that the labels stay
consistent when we call GetCIDRLabels with the following sequences of
prefixes:

1) "xxx/32", "xxx/31", ..., "xxx/0", "xxx/1", ..., "xxx/32"

2) "xxx/0", "xxx/1", ..., "xxx/32", "xxx/31", ..., "xxx/0"

Finally, InCluster tests are removed since cluster identity does not
exist anymore.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 9f2034e ]

Migrate remaining tests relying on checker to the testing package from
Go standard library.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 71b7ad5 ]

TestCIDRLabelsCacheHeapUsageIP{v4,v6} are meant to estimate the maximum
heap usage when filling up the CIDR labels LRU cache with labels derived
only from IPv4 and labels derived only from IPv6.

Since they give meaningful results only when running them with
benchtime=1x, thery are refactored to be just tests with a t.Log() to
output the heap usage statistics.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 6f1253e ]

After fixing the GetCIDRLabels implementation to generate all the labels
required for a CIDR, the heap usage of the LRU cache increased 10x in
the worst case (all IPv6 labels).  To reduce heap usage, the cache size
is halved, resulting in ~25 MiB in the IPv6 only case with roughly the
same performance.

=== RUN   TestCIDRLabelsCacheHeapUsageIPv4
    cidr_test.go:527: Memoization map heap usage: 1714.41 KiB
--- PASS: TestCIDRLabelsCacheHeapUsageIPv4 (0.67s)
=== RUN   TestCIDRLabelsCacheHeapUsageIPv6
    cidr_test.go:571: Memoization map heap usage: 26527.13 KiB
--- PASS: TestCIDRLabelsCacheHeapUsageIPv6 (0.71s)

name                             old time/op    new time/op    delta
GetCIDRLabels/0.0.0.0/0-8           198ns ±40%     238ns ±34%    ~     (p=0.325 n=10+10)
GetCIDRLabels/10.16.0.0/16-8       1.32µs ± 8%    1.33µs ± 8%    ~     (p=0.812 n=10+10)
GetCIDRLabels/192.0.2.3/32-8       2.41µs ± 3%    2.39µs ± 5%    ~     (p=0.278 n=10+9)
GetCIDRLabels/192.0.2.3/24-8       2.05µs ± 2%    2.05µs ± 1%    ~     (p=0.948 n=9+9)
GetCIDRLabels/192.0.2.0/24-8       2.05µs ± 2%    2.04µs ± 1%    ~     (p=0.797 n=9+8)
GetCIDRLabels/::/0-8                277ns ±31%     257ns ± 1%    ~     (p=0.349 n=10+8)
GetCIDRLabels/fdff::ff/128-8       9.02µs ± 6%    8.80µs ± 3%    ~     (p=0.077 n=9+9)
GetCIDRLabels/f00d:42::ff/128-8    9.40µs ± 6%    9.01µs ± 5%  -4.12%  (p=0.035 n=10+10)
GetCIDRLabels/f00d:42::ff/96-8     7.78µs ± 4%    7.58µs ± 1%  -2.59%  (p=0.011 n=8+9)
GetCIDRLabelsConcurrent/1-8         189µs ± 8%     173µs ± 3%  -8.85%  (p=0.000 n=10+9)
GetCIDRLabelsConcurrent/2-8         350µs ± 2%     346µs ± 1%  -1.05%  (p=0.001 n=8+8)
GetCIDRLabelsConcurrent/4-8         703µs ± 1%     692µs ± 1%  -1.59%  (p=0.000 n=9+9)
GetCIDRLabelsConcurrent/16-8       2.97ms ± 7%    2.91ms ± 1%    ~     (p=0.122 n=10+8)
GetCIDRLabelsConcurrent/32-8       5.81ms ± 1%    5.77ms ± 1%  -0.57%  (p=0.011 n=8+9)
GetCIDRLabelsConcurrent/48-8       8.87ms ± 6%    8.71ms ± 1%    ~     (p=0.139 n=9+8)

name                             old alloc/op   new alloc/op   delta
GetCIDRLabels/0.0.0.0/0-8            624B ± 0%      624B ± 0%    ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8       2.40kB ± 0%    2.40kB ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8       5.01kB ± 0%    5.01kB ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8       4.93kB ± 0%    4.93kB ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.0/24-8       4.93kB ± 0%    4.93kB ± 0%    ~     (all equal)
GetCIDRLabels/::/0-8                 624B ± 0%      624B ± 0%    ~     (all equal)
GetCIDRLabels/fdff::ff/128-8       18.5kB ± 0%    18.5kB ± 0%    ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8    18.5kB ± 0%    18.5kB ± 0%    ~     (all equal)
GetCIDRLabels/f00d:42::ff/96-8     18.5kB ± 0%    18.5kB ± 0%    ~     (all equal)
GetCIDRLabelsConcurrent/1-8         321kB ± 0%     321kB ± 0%    ~     (p=0.645 n=10+10)
GetCIDRLabelsConcurrent/2-8         641kB ± 0%     641kB ± 0%    ~     (p=0.796 n=10+10)
GetCIDRLabelsConcurrent/4-8        1.28MB ± 0%    1.28MB ± 0%    ~     (p=0.353 n=10+10)
GetCIDRLabelsConcurrent/16-8       5.13MB ± 0%    5.13MB ± 0%    ~     (p=0.083 n=10+8)
GetCIDRLabelsConcurrent/32-8       10.3MB ± 0%    10.3MB ± 0%    ~     (p=0.481 n=10+10)
GetCIDRLabelsConcurrent/48-8       15.4MB ± 0%    15.4MB ± 0%    ~     (p=0.796 n=10+10)

name                             old allocs/op  new allocs/op  delta
GetCIDRLabels/0.0.0.0/0-8            2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8         2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8         2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8         2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.0/24-8         2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/::/0-8                 2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/fdff::ff/128-8         3.00 ± 0%      3.00 ± 0%    ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8      3.00 ± 0%      3.00 ± 0%    ~     (all equal)
GetCIDRLabels/f00d:42::ff/96-8       3.00 ± 0%      3.00 ± 0%    ~     (all equal)
GetCIDRLabelsConcurrent/1-8           138 ± 0%       138 ± 0%    ~     (all equal)
GetCIDRLabelsConcurrent/2-8           277 ± 0%       277 ± 0%    ~     (all equal)
GetCIDRLabelsConcurrent/4-8           555 ± 0%       555 ± 0%    ~     (all equal)
GetCIDRLabelsConcurrent/16-8        2.22k ± 0%     2.22k ± 0%    ~     (p=0.176 n=10+7)
GetCIDRLabelsConcurrent/32-8        4.44k ± 0%     4.44k ± 0%    ~     (p=0.867 n=10+10)
GetCIDRLabelsConcurrent/48-8        6.66k ± 0%     6.66k ± 0%    ~     (p=0.682 n=8+10)

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 7f3888f ]

De-obfuscate some of the flags to improve readability.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit c19e447 ]

Test that when the CT entry for 192.168.61.11:38193 -> 192.168.61.12:80
is removed, then the related IN and OUT NAT entries are purged along with
it.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 5b22423 ]

DSR uses a OUT NAT entry for RevDNAT of backend replies. Prior to the
changes in cilium#22978, this NAT entry was
protected by the CT_INGRESS entry which bpf_lxc creates for the backend
connection.

Test that GC of the NAT entry works when the CT entry is removed.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 6b9e351 ]

Point out that PurgeOrphanNATEntries() is only a fallback, to purge NAT
entries that are unexpectedly no longer backed by any CT entry.

During normal operations NAT entries should get purged as part of the GC
for their specific CT entry.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 74b3f56 ]

CT entries that get created for a DSR connection by the datapath will have
the `dsr` flag set. Reflect this in the CT entries that we use for tests.

The flag currently doesn't make a difference for the GC logic, but let's
still be a bit more accurate.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit e743901 ]

We want to add more advanced handling into the GC logic, which requires
information from the actual CT entry. Let's consolidate all of the
decision making in the purgeCtEntry*() functions, so that the NAT code
doesn't need to understand all the details of how CT and NAT interact.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit c1a2d1f ]

Clarify which CT entries potentially require purging of a DSR-related NAT
entry. This reduces the risk of accidentally purging unrelated NAT entries,
and allows the GC logic to do less work.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 21072cd ]

With cilium#22978 we changed how DSR NAT
entries are managed. Instead of associating the NAT entry's lifetime with
bpf_lxc's CT_INGRESS entry, the nodeport code on the backend now creates
its own CT_EGRESS entry. When such a CT_EGRESS entry is GC'ed, we should
therefore also purge the related DSR NAT entry.

Also add a test for this case.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
@pippolo84
Copy link
Member Author

Force-pushed to remove commit already backported here. Description has been updated as well.

@pippolo84 pippolo84 marked this pull request as ready for review November 3, 2023 15:55
@pippolo84
Copy link
Member Author

/test-backport-1.14

@squeed squeed self-requested a review November 6, 2023 19:12
@squeed squeed self-requested a review November 6, 2023 19:28
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 6, 2023
@jibi jibi merged commit cbf01f8 into cilium:v1.14 Nov 7, 2023
57 of 58 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet