Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.14 Backports 2024-02-20 #30864

Merged
merged 9 commits into from Feb 21, 2024
Merged

v1.14 Backports 2024-02-20 #30864

merged 9 commits into from Feb 21, 2024

Conversation

tklauser
Copy link
Member

@tklauser tklauser commented Feb 20, 2024

PRs skipped due to conflicts:

Once this PR is merged, a GitHub action will update the labels of these PRs:

 27796 30756 30795 29717 30790

bimmlerd and others added 9 commits February 20, 2024 10:34
[ upstream commit aa5f699 ]

This case is exercised heavily in the toFQDN policy incremental policy
computation flow.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit 9b43d42 ]

This is on the hot path when we have a fqdn policy for S3, where a
single hostname maps to many IPs. This occurs due to a combination of
factors:

1. We allocate a CIDR identitiy for each IP for a hostname which matches
   a fqdn policy.
2. For each CIDR identity, we generate labels for all super-CIDRs (i.e.
   CIDRs which contain this CIDR).
3. We opportunistically allocate an identity for all IPs which are
   matched by a fqdn selector. For those we already knew, other parts of
   the code decrement the ref count again.
4. We use the SortedList serialization as the key to lookup for existing
   identities here, which sorts and serializes the labels to a byte
   array, which is reasonably expensive.

Since we can't as easily fix the other factors at play here, at least
avoid doing it twice for each label set during the opportunistic
acquire/release path. We can lookup by identity for the fast path, and
build the string representation for the slower path if need be.

We hit this path for every IP returned by S3 that's still being kept
alive (be that because of DNS TTL or the zombie mechanism), hence we can
easily get to 5k acquire/release pairs per DNS request.

While we're at it, also reduce the critical section by moving the
SortedList call outside in lookupOrCreate.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit c1e21d8 ]

SortedList appears prominently in both CPU and Heap pprofs when in a
scenario with an FQDN policy matching S3. This is because DNS requests
to S3 return short-lived DNS responses from a pool of many IPs, each of
which will receive a CIDR identitiy in cilium.

Since we furthermore call SortedList repeatedly for these CIDR
identities (represented as a set of CIDR labels containing all
super-CIDRs), avoiding ~32 buffer allocations per call is worth it.

Before:
Labels_SortedList-10    	1000000	    3079 ns/op	   504 B/op	    13 allocs/op
Labels_SortedListCIDRIDs-10    	  52702	    21417 ns/op	   3680 B/op	     41 allocs/op

After:
Labels_SortedList-10    	1000000	    2164 ns/op	   360 B/op	     3 allocs/op
Labels_SortedListCIDRIDs-10    	  72180	    15444 ns/op	   1624 B/op	      3 allocs/op

Benchstat:
                     │     old     │                 opt                 │
                     │   sec/op    │   sec/op     vs base                │
Labels_SortedList-10   3.279µ ± 6%   2.209µ ± 6%  -32.65% (p=0.000 n=10)

                     │    B/op    │    B/op     vs base                │
Labels_SortedList-10   504.0 ± 0%   360.0 ± 0%  -28.57% (p=0.000 n=10)
                     │  allocs/op  │ allocs/op   vs base                │
Labels_SortedList-10   13.000 ± 0%   3.000 ± 0%  -76.92% (p=0.000 n=10)

pkg: github.com/cilium/cilium/pkg/labels/cidr
                            │     old     │                 opt                 │
                            │   sec/op    │   sec/op     vs base                │
Labels_SortedListCIDRIDs-10   21.23µ ± 5%   14.89µ ± 9%  -29.87% (p=0.000 n=10)
                            │     B/op     │     B/op      vs base                │
Labels_SortedListCIDRIDs-10   3.594Ki ± 0%   1.586Ki ± 0%  -55.87% (p=0.000 n=10)
                            │  allocs/op  │ allocs/op   vs base                │
Labels_SortedListCIDRIDs-10   41.000 ± 0%   3.000 ± 0%  -92.68% (p=0.000 n=10)

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit dc6cf34 ]

While fixing one of the review comments in PR that introduced this test,
I changed datapath mode to be explicitly set from matrix.mode.
Unfortunately, setting `native` makes it actually use `tunneling` mode.
Switching to `gke` mode resolves this issue.

Fixes #30247

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit d7f5e58 ]

In the AKS release cycle, a gap exists between the introduction of new supported Kubernetes versions
and the removal of older versions, leading to failures in scheduled tests.
This PR introduces the capability to disable older Kubernetes versions, mitigating test failures.

Signed-off-by: Birol Bilgin <birol@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit 14d68f2 ]

This commit revises the Kubernetes versions tested for compatibility across all supported cloud providers.
Additionally, it adjusts the default Kubernetes version to match the default version provided by each cloud provider

Signed-off-by: Birol Bilgin <birol@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit bb81c06 ]

The current process delegates the review of ariane-config.yaml changes to the contributing group.
With this commit reviewing responsibilities be transferred to the github-sec and ci-structure groups.

Signed-off-by: Birol Bilgin <birol@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit 27430d4 ]

This bitwise lpm trie is a non-thread-safe binary
trie that indexes arbitrarily long bit-based keys
with associated prefixes indexed from most
significant bit to least significant bit using
the longest prefix match algorithm.

Documenting the behavior of the datastructure is
localized around the method calls in the trie.go
file.

The tests specifically test boundary cases for the
various methods and fuzzes the RangeLookup method.

Updating CODEOWNERS to put sig-policy and ipcache
in charge of this library.

Fixes: #29519

Co-authored-by: Casey Callendrello <cdc@isovalent.com>
Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit b19321e ]

This commit updates the Ariane configuration to include the GitHub organization team 'organization-members' in the list of allowed teams.
Consequently, only members of this specific team will have the authorization to initiate test runs via issue comments.

Signed-off-by: Birol Bilgin <birol@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
@tklauser tklauser added kind/backports This PR provides functionality previously merged into master. backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. labels Feb 20, 2024
@tklauser tklauser mentioned this pull request Feb 20, 2024
8 tasks
@tklauser tklauser marked this pull request as ready for review February 20, 2024 10:43
@tklauser tklauser requested review from a team as code owners February 20, 2024 10:43
@tklauser
Copy link
Member Author

/test-backport-1.14

Copy link
Contributor

@brlbil brlbil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Member

@bimmlerd bimmlerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my commits look good

@tklauser tklauser merged commit e8770c6 into v1.14 Feb 21, 2024
223 checks passed
@tklauser tklauser deleted the pr/v1.14-backport-2024-02-20 branch February 21, 2024 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants