v1.14 Backports 2023-10-30 #28870
Conversation
My commit looks good. Thanks!
Force-pushed from 041b96f to f139784.
@learnitall @julianwiedmann please review your commits very carefully, I hit quite a number of conflicts during the backporting.
My PR looks good. Thanks!
/test-backport-1.14
My changes LGTM
Peeked at the related ci-conformance fail
Looks like we need below change for 1.14. Let me know if this is getting too involved and you would prefer a separate backport :).
Force-pushed from f139784 to d92711e.
Thanks @julianwiedmann. I've applied the changes, let's see how it goes.
/test-backport-1.14
lgtm, and the current CI fails look unrelated 🤞. Thank you!
Looks good for my PR, thank you!
Should wait for #28940 to be merged first, in order to fix CI.
[ upstream commit 07d3a21 ]

We need to refetch the pod labels again because we have just added the endpoint into the endpoint manager. If we have received any pod events (more specifically, any events that modified the pod labels) between the time the pod was created and the time it was added into the endpoint manager, the pod event would not have been processed, since the pod event handler would not find the endpoint for that pod in the endpoint manager. Thus, we will fetch the labels again and update the endpoint with these labels.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit c1803ba ]

This commit changes the bugtool report to collect the XFRM error counters (i.e., /proc/net/xfrm_stat) twice instead of only once. We collect at the beginning and at the end of the bugtool collection. That way, there are around 5-6 seconds between the two collections and we may see if any counter is currently increasing.

    $ diff cilium-bugtool-cilium-7d54p-20231025-115151/cmd/cat*--proc-net-xfrm_stat.md
    5c5
    < XfrmInStateProtoError 4
    ---
    > XfrmInStateProtoError 6

In this example, we can easily see that XfrmInStateProtoError is increasing. That suggests a key rotation issue is currently ongoing (cf. IPsec troubleshooting docs).

I tried other approaches to collect over a longer timespan, which might allow us to see slower increases. They all end up being more complex or messier (we'd need to collect at the beginning and end of the sysdump instead). In the end, considering this is already a fallback plan for when customers don't collect Prometheus metrics, I think the current, simpler approach is good enough.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit b836cb1 ]

We recently introduced AWS-CONNMARK-CHAIN iptables rules deletion, but didn't add the rules to the if statement guarding actual deletion.

Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 6ab728d ]

This change causes the Cilium DaemonSet postStart hook to always delete AWS iptables rules unless `cni.chainingMode` is set to `aws-cni`. This will result in the postStart hook being a no-op in all non-AWS deployments. Unfortunately, there is no way for the Helm chart to know whether it is running on AWS but not in ENI mode. This approach makes sure that we delete the AWS-specific iptables rules that cause issues, while not requiring us to introduce new configuration flags for users.

Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 55517ea ]

The previous version of the implementation was actually computing the labels starting from broader prefixes to narrower ones (first "/0", then "/1" and so on). As soon as we had a cache hit, the recursion stopped without calculating the remaining labels for the CIDRs up to "ones". This produced an incorrect (shorter) set of labels for a CIDR.

Also, netip.PrefixFrom(...) does not mask the internally stored address, thus lowering the cache hit ratio even when two different CIDRs, used as keys in the LRU cache, are equal in terms of masked address (e.g. "1.1.1.1/16" and "1.1.0.0/16"). So, netip.Addr.Prefix(...) is used instead.

After the fix, performance is roughly equal (but with an increased chance of a cache hit). However, the maximum heap usage in the worst case (LRU cache filled up with IPv6 labels) increased 10x.

BenchmarkCIDRLabelsCacheHeapUsageIPv4
    cidr_test.go:628: Memoization map heap usage: 5483.24 KiB
BenchmarkCIDRLabelsCacheHeapUsageIPv6
    cidr_test.go:670: Memoization map heap usage: 54721.70 KiB

name                              old time/op    new time/op    delta
GetCIDRLabels/0.0.0.0/0-8          256ns ±39%     218ns ±46%       ~     (p=0.393 n=10+10)
GetCIDRLabels/10.16.0.0/16-8      1.35µs ± 3%    1.39µs ± 5%    +2.66%   (p=0.012 n=9+10)
GetCIDRLabels/192.0.2.3/32-8      2.52µs ± 2%    2.58µs ± 2%    +2.58%   (p=0.001 n=10+9)
GetCIDRLabels/192.0.2.3/24-8      2.57µs ± 1%    2.24µs ± 3%   -12.69%   (p=0.000 n=8+10)
GetCIDRLabels/192.0.2.0/24-8      2.27µs ± 4%    2.26µs ± 3%       ~     (p=0.690 n=9+8)
GetCIDRLabels/::/0-8               277ns ± 2%     278ns ± 3%       ~     (p=0.796 n=9+9)
GetCIDRLabels/fdff::ff/128-8      9.42µs ± 1%    9.34µs ± 6%       ~     (p=0.484 n=9+10)
GetCIDRLabels/f00d:42::ff/128-8   9.58µs ± 2%    9.62µs ± 7%       ~     (p=0.905 n=10+9)
GetCIDRLabels/f00d:42::ff/96-8    9.63µs ± 1%    8.45µs ± 3%   -12.27%   (p=0.000 n=8+9)
GetCIDRLabelsConcurrent/1-8        205µs ± 3%     207µs ± 3%       ~     (p=0.356 n=9+10)
GetCIDRLabelsConcurrent/2-8        385µs ± 5%     386µs ± 7%       ~     (p=0.631 n=10+10)
GetCIDRLabelsConcurrent/4-8        784µs ± 5%     780µs ± 1%       ~     (p=0.156 n=10+9)
GetCIDRLabelsConcurrent/16-8      3.24ms ± 1%    3.25ms ± 2%       ~     (p=0.529 n=10+10)
GetCIDRLabelsConcurrent/32-8      6.40ms ± 1%    6.39ms ± 3%       ~     (p=0.497 n=9+10)
GetCIDRLabelsConcurrent/48-8      9.69ms ± 1%   10.09ms ± 6%    +4.09%   (p=0.008 n=8+9)

name                              old alloc/op   new alloc/op   delta
GetCIDRLabels/0.0.0.0/0-8           624B ± 0%      624B ± 0%       ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8      2.40kB ± 0%    2.40kB ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8      5.01kB ± 0%    5.01kB ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8      5.01kB ± 0%    4.93kB ± 0%    -1.64%   (p=0.002 n=8+10)
GetCIDRLabels/192.0.2.0/24-8      4.93kB ± 0%    4.93kB ± 0%       ~     (all equal)
GetCIDRLabels/::/0-8                624B ± 0%      624B ± 0%       ~     (all equal)
GetCIDRLabels/fdff::ff/128-8      18.5kB ± 0%    18.5kB ± 0%       ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8   18.5kB ± 0%    18.5kB ± 0%       ~     (p=0.108 n=9+10)
GetCIDRLabels/f00d:42::ff/96-8    18.5kB ± 0%    18.5kB ± 0%    -0.06%   (p=0.000 n=10+10)
GetCIDRLabelsConcurrent/1-8        321kB ± 0%     321kB ± 0%       ~     (p=0.127 n=10+8)
GetCIDRLabelsConcurrent/2-8        641kB ± 0%     641kB ± 0%       ~     (p=0.928 n=10+10)
GetCIDRLabelsConcurrent/4-8       1.28MB ± 0%    1.28MB ± 0%       ~     (p=0.853 n=10+10)
GetCIDRLabelsConcurrent/16-8      5.13MB ± 0%    5.13MB ± 0%       ~     (p=0.739 n=10+10)
GetCIDRLabelsConcurrent/32-8      10.3MB ± 0%    10.3MB ± 0%       ~     (p=0.218 n=10+10)
GetCIDRLabelsConcurrent/48-8      15.4MB ± 0%    15.4MB ± 0%       ~     (p=0.218 n=10+10)

name                              old allocs/op  new allocs/op  delta
GetCIDRLabels/0.0.0.0/0-8           2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8        2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8        2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8        2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.0/24-8        2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/::/0-8                2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/fdff::ff/128-8        3.00 ± 0%      3.00 ± 0%       ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8     3.00 ± 0%      3.00 ± 0%       ~     (all equal)
GetCIDRLabels/f00d:42::ff/96-8      3.00 ± 0%      3.00 ± 0%       ~     (all equal)
GetCIDRLabelsConcurrent/1-8          138 ± 0%       138 ± 0%       ~     (all equal)
GetCIDRLabelsConcurrent/2-8          277 ± 0%       277 ± 0%       ~     (all equal)
GetCIDRLabelsConcurrent/4-8          555 ± 0%       555 ± 0%       ~     (p=0.248 n=10+9)
GetCIDRLabelsConcurrent/16-8       2.22k ± 0%     2.22k ± 0%       ~     (p=0.353 n=7+10)
GetCIDRLabelsConcurrent/32-8       4.44k ± 0%     4.44k ± 0%       ~     (p=0.723 n=10+10)
GetCIDRLabelsConcurrent/48-8       6.66k ± 0%     6.66k ± 0%       ~     (p=0.090 n=10+9)

Fixes: e0f6c47 ("labels/cidr: Cache GetCIDRLabels computation")

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 1b9b3fc ]

After the introduction of an LRU cache in GetCIDRLabels, the tests should verify the labels computation both when the cache is cold and when it is hot. Thus, the tests are refactored to check the returned labels twice.

Also, an additional test is added to verify that the labels stay consistent when we call GetCIDRLabels with the following sequences of prefixes:

1) "xxx/32", "xxx/31", ..., "xxx/0", "xxx/1", ..., "xxx/32"
2) "xxx/0", "xxx/1", ..., "xxx/32", "xxx/31", ..., "xxx/0"

Finally, InCluster tests are removed since the cluster identity does not exist anymore.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 9f2034e ]

Migrate the remaining tests relying on checker to the testing package from the Go standard library.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 71b7ad5 ]

TestCIDRLabelsCacheHeapUsageIP{v4,v6} are meant to estimate the maximum heap usage when filling up the CIDR labels LRU cache with labels derived only from IPv4 and labels derived only from IPv6. Since they give meaningful results only when run with benchtime=1x, they are refactored to be plain tests with a t.Log() to output the heap usage statistics.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 6f1253e ]

After fixing the GetCIDRLabels implementation to generate all the labels required for a CIDR, the heap usage of the LRU cache increased 10x in the worst case (all IPv6 labels). To reduce heap usage, the cache size is halved, resulting in ~25 MiB in the IPv6-only case with roughly the same performance.

=== RUN   TestCIDRLabelsCacheHeapUsageIPv4
    cidr_test.go:527: Memoization map heap usage: 1714.41 KiB
--- PASS: TestCIDRLabelsCacheHeapUsageIPv4 (0.67s)
=== RUN   TestCIDRLabelsCacheHeapUsageIPv6
    cidr_test.go:571: Memoization map heap usage: 26527.13 KiB
--- PASS: TestCIDRLabelsCacheHeapUsageIPv6 (0.71s)

name                              old time/op    new time/op    delta
GetCIDRLabels/0.0.0.0/0-8          198ns ±40%     238ns ±34%       ~     (p=0.325 n=10+10)
GetCIDRLabels/10.16.0.0/16-8      1.32µs ± 8%    1.33µs ± 8%       ~     (p=0.812 n=10+10)
GetCIDRLabels/192.0.2.3/32-8      2.41µs ± 3%    2.39µs ± 5%       ~     (p=0.278 n=10+9)
GetCIDRLabels/192.0.2.3/24-8      2.05µs ± 2%    2.05µs ± 1%       ~     (p=0.948 n=9+9)
GetCIDRLabels/192.0.2.0/24-8      2.05µs ± 2%    2.04µs ± 1%       ~     (p=0.797 n=9+8)
GetCIDRLabels/::/0-8               277ns ±31%     257ns ± 1%       ~     (p=0.349 n=10+8)
GetCIDRLabels/fdff::ff/128-8      9.02µs ± 6%    8.80µs ± 3%       ~     (p=0.077 n=9+9)
GetCIDRLabels/f00d:42::ff/128-8   9.40µs ± 6%    9.01µs ± 5%    -4.12%   (p=0.035 n=10+10)
GetCIDRLabels/f00d:42::ff/96-8    7.78µs ± 4%    7.58µs ± 1%    -2.59%   (p=0.011 n=8+9)
GetCIDRLabelsConcurrent/1-8        189µs ± 8%     173µs ± 3%    -8.85%   (p=0.000 n=10+9)
GetCIDRLabelsConcurrent/2-8        350µs ± 2%     346µs ± 1%    -1.05%   (p=0.001 n=8+8)
GetCIDRLabelsConcurrent/4-8        703µs ± 1%     692µs ± 1%    -1.59%   (p=0.000 n=9+9)
GetCIDRLabelsConcurrent/16-8      2.97ms ± 7%    2.91ms ± 1%       ~     (p=0.122 n=10+8)
GetCIDRLabelsConcurrent/32-8      5.81ms ± 1%    5.77ms ± 1%    -0.57%   (p=0.011 n=8+9)
GetCIDRLabelsConcurrent/48-8      8.87ms ± 6%    8.71ms ± 1%       ~     (p=0.139 n=9+8)

name                              old alloc/op   new alloc/op   delta
GetCIDRLabels/0.0.0.0/0-8           624B ± 0%      624B ± 0%       ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8      2.40kB ± 0%    2.40kB ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8      5.01kB ± 0%    5.01kB ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8      4.93kB ± 0%    4.93kB ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.0/24-8      4.93kB ± 0%    4.93kB ± 0%       ~     (all equal)
GetCIDRLabels/::/0-8                624B ± 0%      624B ± 0%       ~     (all equal)
GetCIDRLabels/fdff::ff/128-8      18.5kB ± 0%    18.5kB ± 0%       ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8   18.5kB ± 0%    18.5kB ± 0%       ~     (all equal)
GetCIDRLabels/f00d:42::ff/96-8    18.5kB ± 0%    18.5kB ± 0%       ~     (all equal)
GetCIDRLabelsConcurrent/1-8        321kB ± 0%     321kB ± 0%       ~     (p=0.645 n=10+10)
GetCIDRLabelsConcurrent/2-8        641kB ± 0%     641kB ± 0%       ~     (p=0.796 n=10+10)
GetCIDRLabelsConcurrent/4-8       1.28MB ± 0%    1.28MB ± 0%       ~     (p=0.353 n=10+10)
GetCIDRLabelsConcurrent/16-8      5.13MB ± 0%    5.13MB ± 0%       ~     (p=0.083 n=10+8)
GetCIDRLabelsConcurrent/32-8      10.3MB ± 0%    10.3MB ± 0%       ~     (p=0.481 n=10+10)
GetCIDRLabelsConcurrent/48-8      15.4MB ± 0%    15.4MB ± 0%       ~     (p=0.796 n=10+10)

name                              old allocs/op  new allocs/op  delta
GetCIDRLabels/0.0.0.0/0-8           2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8        2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8        2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8        2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/192.0.2.0/24-8        2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/::/0-8                2.00 ± 0%      2.00 ± 0%       ~     (all equal)
GetCIDRLabels/fdff::ff/128-8        3.00 ± 0%      3.00 ± 0%       ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8     3.00 ± 0%      3.00 ± 0%       ~     (all equal)
GetCIDRLabels/f00d:42::ff/96-8      3.00 ± 0%      3.00 ± 0%       ~     (all equal)
GetCIDRLabelsConcurrent/1-8          138 ± 0%       138 ± 0%       ~     (all equal)
GetCIDRLabelsConcurrent/2-8          277 ± 0%       277 ± 0%       ~     (all equal)
GetCIDRLabelsConcurrent/4-8          555 ± 0%       555 ± 0%       ~     (all equal)
GetCIDRLabelsConcurrent/16-8       2.22k ± 0%     2.22k ± 0%       ~     (p=0.176 n=10+7)
GetCIDRLabelsConcurrent/32-8       4.44k ± 0%     4.44k ± 0%       ~     (p=0.867 n=10+10)
GetCIDRLabelsConcurrent/48-8       6.66k ± 0%     6.66k ± 0%       ~     (p=0.682 n=8+10)

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 7f3888f ]

De-obfuscate some of the flags to improve readability.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit c19e447 ]

Test that when the CT entry for 192.168.61.11:38193 -> 192.168.61.12:80 is removed, the related IN and OUT NAT entries are purged along with it.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 5b22423 ]

DSR uses an OUT NAT entry for RevDNAT of backend replies. Prior to the changes in cilium#22978, this NAT entry was protected by the CT_INGRESS entry which bpf_lxc creates for the backend connection. Test that GC of the NAT entry works when the CT entry is removed.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 6b9e351 ]

Point out that PurgeOrphanNATEntries() is only a fallback, to purge NAT entries that are unexpectedly no longer backed by any CT entry. During normal operations, NAT entries should get purged as part of the GC for their specific CT entry.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 74b3f56 ]

CT entries that get created for a DSR connection by the datapath will have the `dsr` flag set. Reflect this in the CT entries that we use for tests. The flag currently doesn't make a difference for the GC logic, but let's still be a bit more accurate.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit e743901 ]

We want to add more advanced handling into the GC logic, which requires information from the actual CT entry. Let's consolidate all of the decision making in the purgeCtEntry*() functions, so that the NAT code doesn't need to understand all the details of how CT and NAT interact.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit c1a2d1f ]

Clarify which CT entries potentially require purging of a DSR-related NAT entry. This reduces the risk of accidentally purging unrelated NAT entries, and allows the GC logic to do less work.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 21072cd ]

With cilium#22978 we changed how DSR NAT entries are managed. Instead of associating the NAT entry's lifetime with bpf_lxc's CT_INGRESS entry, the nodeport code on the backend now creates its own CT_EGRESS entry. When such a CT_EGRESS entry is GC'ed, we should therefore also purge the related DSR NAT entry. Also add a test for this case.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Force-pushed from d92711e to 2a4145c.
Force-pushed to remove a commit already backported here. The description has been updated as well.
/test-backport-1.14
cidr.go: moved in main branch
resource_ctors.go: moved in main branch
cidr.go: moved in main branch
nat.IPv4: constant missing

Once this PR is merged, a GitHub action will update the labels of these PRs: