New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backport v1.13: FQDN fixes #28401
Merged
jrajahalme
merged 8 commits into
cilium:v1.13
from
joamaki:pr/joamaki/fqdn-v1.13-backports
Oct 6, 2023
Merged
Backport v1.13: FQDN fixes #28401
jrajahalme
merged 8 commits into
cilium:v1.13
from
joamaki:pr/joamaki/fqdn-v1.13-backports
Oct 6, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
maintainer-s-little-helper
bot
added
backport/1.13
This PR represents a backport for Cilium 1.13.x of a PR that was merged to main.
kind/backports
This PR provides functionality previously merged into master.
labels
Oct 4, 2023
/test-backport-1.13 |
[ upstream commit 08903e0 ] [ Backporter's note: ToMapState had changed fair bit, had to reimplement the patch on top of the old code to pass in SelectorCache. ] Endpoint.GetLabelsLocked existed on the premise that selector cache could not be locked while endpoint has been locked, due to selector cache locking ip cache in some code paths. This does not seem to be correct, as ip cache calls in to selector cache, not the other way around. With this the Endpoint.GetLabelsLocked can be removed, and SelectorCache.GetLabelsLocked can be used instead also when calling from the Endpoint locked state. This is also in line with the commend on policy DistillPolicy that states that "PolicyOwner (aka Endpoint) is also locked during this call", and then within takes the Selector Cache read lock. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit d9b9e65 ] getNets is used in DenyPreferredInsert for MapState. It is somewhat costly, and uses a cache to compute the result at most once for each MapStateEntry. Speed up the computation with two strategies: - skip looking for CIDR labels when the identity is not a local identity. This works due to CIDR identities always being locally allocated - skip allocating a slice when not needed, returning a nil map instead if the locally allocated identity has no CIDR labels Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit cb246d7 ] Have selector cache precompute the most specific CIDR for an identity when the identity is added, rathter than computing it when needed for each MapStateEntry. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 96ea098 ] The UpdateController method performs either a creation, or updating of a controller. If an update is performed the controller is immediately triggered. In some cases we want to create the controller once without triggering or modifying it if it already has been created. Add a CreateController method for those use cases. Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 867e41a ] Performing a full dump of the policy map on every policy map synchronization is expensive and is only necessary to catch rare cases where the agent's view of the policy map has diverged from the kernel's view which should only happen either due to kernel or other bugs or some other application modifying the endpoint policy map. To reduce the cost of synchronizing large endpoint policy maps perform the full synchronization only periodically. The full synchronization is defaults to 15 minutes. Configurable with the hidden option "--policy-map-full-reconciliation-interval". Fixes: 9dc1350 ("endpoint: Enhance policy map sync") Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 9b43d42 ] This is on the hot path when we have a fqdn policy for S3, where a single hostname maps to many IPs. This occurs due to a combination of factors: 1. We allocate a CIDR identitiy for each IP for a hostname which matches a fqdn policy. 2. For each CIDR identity, we generate labels for all super-CIDRs (i.e. CIDRs which contain this CIDR). 3. We opportunistically allocate an identity for all IPs which are matched by a fqdn selector. For those we already knew, other parts of the code decrement the ref count again. 4. We use the SortedList serialization as the key to lookup for existing identities here, which sorts and serializes the labels to a byte array, which is reasonably expensive. Since we can't as easily fix the other factors at play here, at least avoid doing it twice for each label set during the opportunistic acquire/release path. We can lookup by identity for the fast path, and build the string representation for the slower path if need be. We hit this path for every IP returned by S3 that's still being kept alive (be that because of DNS TTL or the zombie mechanism), hence we can easily get to 5k acquire/release pairs per DNS request. While we're at it, also reduce the critical section by moving the SortedList call outside in lookupOrCreate. Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Jussi Maki <jussi.maki@isovalent.com>
[ upstream commit c1e21d8 ] SortedList appears prominently in both CPU and Heap pprofs when in a scenario with an FQDN policy matching S3. This is because DNS requests to S3 return short-lived DNS responses from a pool of many IPs, each of which will receive a CIDR identitiy in cilium. Since we furthermore call SortedList repeatedly for these CIDR identities (represented as a set of CIDR labels containing all super-CIDRs), avoiding ~32 buffer allocations per call is worth it. Before: Labels_SortedList-10 1000000 3079 ns/op 504 B/op 13 allocs/op Labels_SortedListCIDRIDs-10 52702 21417 ns/op 3680 B/op 41 allocs/op After: Labels_SortedList-10 1000000 2164 ns/op 360 B/op 3 allocs/op Labels_SortedListCIDRIDs-10 72180 15444 ns/op 1624 B/op 3 allocs/op Benchstat: │ old │ opt │ │ sec/op │ sec/op vs base │ Labels_SortedList-10 3.279µ ± 6% 2.209µ ± 6% -32.65% (p=0.000 n=10) │ B/op │ B/op vs base │ Labels_SortedList-10 504.0 ± 0% 360.0 ± 0% -28.57% (p=0.000 n=10) │ allocs/op │ allocs/op vs base │ Labels_SortedList-10 13.000 ± 0% 3.000 ± 0% -76.92% (p=0.000 n=10) pkg: github.com/cilium/cilium/pkg/labels/cidr │ old │ opt │ │ sec/op │ sec/op vs base │ Labels_SortedListCIDRIDs-10 21.23µ ± 5% 14.89µ ± 9% -29.87% (p=0.000 n=10) │ B/op │ B/op vs base │ Labels_SortedListCIDRIDs-10 3.594Ki ± 0% 1.586Ki ± 0% -55.87% (p=0.000 n=10) │ allocs/op │ allocs/op vs base │ Labels_SortedListCIDRIDs-10 41.000 ± 0% 3.000 ± 0% -92.68% (p=0.000 n=10) Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Jussi Maki <jussi.maki@isovalent.com>
joamaki
force-pushed
the
pr/joamaki/fqdn-v1.13-backports
branch
from
October 5, 2023 10:25
4f8a464
to
3893c3e
Compare
joamaki
requested review from
nebril,
nathanjsweet,
jrajahalme and
bimmlerd
and removed request for
nathanjsweet and
nebril
October 6, 2023 07:38
/test-backport-1.13 |
ti-mo
approved these changes
Oct 6, 2023
bimmlerd
approved these changes
Oct 6, 2023
jrajahalme
approved these changes
Oct 6, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My PRs look good to me.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
backport/1.13
This PR represents a backport for Cilium 1.13.x of a PR that was merged to main.
kind/backports
This PR provides functionality previously merged into master.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backports FQDN related fixes to v1.13 that were not straightforward to backport.