New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CIDR labels computation #28788
Merged
nathanjsweet
merged 5 commits into
cilium:main
from
pippolo84:pr/pippolo84/fix-get-cidr-labels
Oct 27, 2023
Merged
Fix CIDR labels computation #28788
nathanjsweet
merged 5 commits into
cilium:main
from
pippolo84:pr/pippolo84/fix-get-cidr-labels
Oct 27, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The previous version of the implementation was actually computing the labels starting from broader prefixes to narrower ones (first "/0", then "/1" and so on). As soon as we had a cache hit, the recursion stopped without calculating the remaining labels for the CIDRs up to "ones". This produced an incorrect (shorter) set of labels for a CIDR. Also, netip.PrefixFrom(...) does not mask the internally stored address, thus lowering the cache hit ratio even if two different CIDRs, used as keys in the LRU cache, are equal in terms of masked address. (e.g: "1.1.1.1/16" and "1.1.0.0/16"). So, netip.Addr.Prefix(...) is used instead. After the fix, the performance are roughly equal (but with an increased chance of having a cache hit). Instead, the maximum heap usage in the worst case (LRU cache filled up with IPv6 labels) is increased 10x. BenchmarkCIDRLabelsCacheHeapUsageIPv4 cidr_test.go:628: Memoization map heap usage: 5483.24 KiB BenchmarkCIDRLabelsCacheHeapUsageIPv6 cidr_test.go:670: Memoization map heap usage: 54721.70 KiB name old time/op new time/op delta GetCIDRLabels/0.0.0.0/0-8 256ns ±39% 218ns ±46% ~ (p=0.393 n=10+10) GetCIDRLabels/10.16.0.0/16-8 1.35µs ± 3% 1.39µs ± 5% +2.66% (p=0.012 n=9+10) GetCIDRLabels/192.0.2.3/32-8 2.52µs ± 2% 2.58µs ± 2% +2.58% (p=0.001 n=10+9) GetCIDRLabels/192.0.2.3/24-8 2.57µs ± 1% 2.24µs ± 3% -12.69% (p=0.000 n=8+10) GetCIDRLabels/192.0.2.0/24-8 2.27µs ± 4% 2.26µs ± 3% ~ (p=0.690 n=9+8) GetCIDRLabels/::/0-8 277ns ± 2% 278ns ± 3% ~ (p=0.796 n=9+9) GetCIDRLabels/fdff::ff/128-8 9.42µs ± 1% 9.34µs ± 6% ~ (p=0.484 n=9+10) GetCIDRLabels/f00d:42::ff/128-8 9.58µs ± 2% 9.62µs ± 7% ~ (p=0.905 n=10+9) GetCIDRLabels/f00d:42::ff/96-8 9.63µs ± 1% 8.45µs ± 3% -12.27% (p=0.000 n=8+9) GetCIDRLabelsConcurrent/1-8 205µs ± 3% 207µs ± 3% ~ (p=0.356 n=9+10) GetCIDRLabelsConcurrent/2-8 385µs ± 5% 386µs ± 7% ~ (p=0.631 n=10+10) GetCIDRLabelsConcurrent/4-8 784µs ± 5% 780µs ± 1% ~ (p=0.156 n=10+9) GetCIDRLabelsConcurrent/16-8 3.24ms ± 1% 3.25ms ± 2% ~ (p=0.529 n=10+10) GetCIDRLabelsConcurrent/32-8 6.40ms ± 1% 6.39ms ± 3% ~ (p=0.497 n=9+10) GetCIDRLabelsConcurrent/48-8 9.69ms ± 1% 10.09ms ± 6% +4.09% (p=0.008 n=8+9) name old alloc/op new alloc/op delta GetCIDRLabels/0.0.0.0/0-8 624B ± 0% 624B ± 0% ~ (all equal) GetCIDRLabels/10.16.0.0/16-8 2.40kB ± 0% 2.40kB ± 0% ~ (all equal) GetCIDRLabels/192.0.2.3/32-8 5.01kB ± 0% 5.01kB ± 0% ~ (all equal) GetCIDRLabels/192.0.2.3/24-8 5.01kB ± 0% 4.93kB ± 0% -1.64% (p=0.002 n=8+10) GetCIDRLabels/192.0.2.0/24-8 4.93kB ± 0% 4.93kB ± 0% ~ (all equal) GetCIDRLabels/::/0-8 624B ± 0% 624B ± 0% ~ (all equal) GetCIDRLabels/fdff::ff/128-8 18.5kB ± 0% 18.5kB ± 0% ~ (all equal) GetCIDRLabels/f00d:42::ff/128-8 18.5kB ± 0% 18.5kB ± 0% ~ (p=0.108 n=9+10) GetCIDRLabels/f00d:42::ff/96-8 18.5kB ± 0% 18.5kB ± 0% -0.06% (p=0.000 n=10+10) GetCIDRLabelsConcurrent/1-8 321kB ± 0% 321kB ± 0% ~ (p=0.127 n=10+8) GetCIDRLabelsConcurrent/2-8 641kB ± 0% 641kB ± 0% ~ (p=0.928 n=10+10) GetCIDRLabelsConcurrent/4-8 1.28MB ± 0% 1.28MB ± 0% ~ (p=0.853 n=10+10) GetCIDRLabelsConcurrent/16-8 5.13MB ± 0% 5.13MB ± 0% ~ (p=0.739 n=10+10) GetCIDRLabelsConcurrent/32-8 10.3MB ± 0% 10.3MB ± 0% ~ (p=0.218 n=10+10) GetCIDRLabelsConcurrent/48-8 15.4MB ± 0% 15.4MB ± 0% ~ (p=0.218 n=10+10) name old allocs/op new allocs/op delta GetCIDRLabels/0.0.0.0/0-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/10.16.0.0/16-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/192.0.2.3/32-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/192.0.2.3/24-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/192.0.2.0/24-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/::/0-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/fdff::ff/128-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) GetCIDRLabels/f00d:42::ff/128-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) GetCIDRLabels/f00d:42::ff/96-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) GetCIDRLabelsConcurrent/1-8 138 ± 0% 138 ± 0% ~ (all equal) GetCIDRLabelsConcurrent/2-8 277 ± 0% 277 ± 0% ~ (all equal) GetCIDRLabelsConcurrent/4-8 555 ± 0% 555 ± 0% ~ (p=0.248 n=10+9) GetCIDRLabelsConcurrent/16-8 2.22k ± 0% 2.22k ± 0% ~ (p=0.353 n=7+10) GetCIDRLabelsConcurrent/32-8 4.44k ± 0% 4.44k ± 0% ~ (p=0.723 n=10+10) GetCIDRLabelsConcurrent/48-8 6.66k ± 0% 6.66k ± 0% ~ (p=0.090 n=10+9) Fixes: e0f6c47 ("labels/cidr: Cache GetCIDRLabels computation") Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
After the introduction of a LRU cache in GetCIDRLabels, the tests should verify the labels computation both when the cache is cold but also when it is hot. Thus, the tests are refactored to check the returned labels twice. Also, an additional test is added to verify that the labels stay consistent when we call GetCIDRLabels with the following sequences of prefixes: 1) "xxx/32", "xxx/31", ..., "xxx/0", "xxx/1", ..., "xxx/32" 2) "xxx/0", "xxx/1", ..., "xxx/32", "xxx/31", ..., "xxx/0" Finally, InCluster tests are removed since cluster identity does not exist anymore. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Migrate remaining tests relying on checker to the testing package from Go standard library. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
TestCIDRLabelsCacheHeapUsageIP{v4,v6} are meant to estimate the maximum heap usage when filling up the CIDR labels LRU cache with labels derived only from IPv4 and labels derived only from IPv6. Since they give meaningful results only when running them with benchtime=1x, thery are refactored to be just tests with a t.Log() to output the heap usage statistics. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
After fixing the GetCIDRLabels implementation to generate all the labels required for a CIDR, the heap usage of the LRU cache increased 10x in the worst case (all IPv6 labels). To reduce heap usage, the cache size is halved, resulting in ~25 MiB in the IPv6 only case with roughly the same performance. === RUN TestCIDRLabelsCacheHeapUsageIPv4 cidr_test.go:527: Memoization map heap usage: 1714.41 KiB --- PASS: TestCIDRLabelsCacheHeapUsageIPv4 (0.67s) === RUN TestCIDRLabelsCacheHeapUsageIPv6 cidr_test.go:571: Memoization map heap usage: 26527.13 KiB --- PASS: TestCIDRLabelsCacheHeapUsageIPv6 (0.71s) name old time/op new time/op delta GetCIDRLabels/0.0.0.0/0-8 198ns ±40% 238ns ±34% ~ (p=0.325 n=10+10) GetCIDRLabels/10.16.0.0/16-8 1.32µs ± 8% 1.33µs ± 8% ~ (p=0.812 n=10+10) GetCIDRLabels/192.0.2.3/32-8 2.41µs ± 3% 2.39µs ± 5% ~ (p=0.278 n=10+9) GetCIDRLabels/192.0.2.3/24-8 2.05µs ± 2% 2.05µs ± 1% ~ (p=0.948 n=9+9) GetCIDRLabels/192.0.2.0/24-8 2.05µs ± 2% 2.04µs ± 1% ~ (p=0.797 n=9+8) GetCIDRLabels/::/0-8 277ns ±31% 257ns ± 1% ~ (p=0.349 n=10+8) GetCIDRLabels/fdff::ff/128-8 9.02µs ± 6% 8.80µs ± 3% ~ (p=0.077 n=9+9) GetCIDRLabels/f00d:42::ff/128-8 9.40µs ± 6% 9.01µs ± 5% -4.12% (p=0.035 n=10+10) GetCIDRLabels/f00d:42::ff/96-8 7.78µs ± 4% 7.58µs ± 1% -2.59% (p=0.011 n=8+9) GetCIDRLabelsConcurrent/1-8 189µs ± 8% 173µs ± 3% -8.85% (p=0.000 n=10+9) GetCIDRLabelsConcurrent/2-8 350µs ± 2% 346µs ± 1% -1.05% (p=0.001 n=8+8) GetCIDRLabelsConcurrent/4-8 703µs ± 1% 692µs ± 1% -1.59% (p=0.000 n=9+9) GetCIDRLabelsConcurrent/16-8 2.97ms ± 7% 2.91ms ± 1% ~ (p=0.122 n=10+8) GetCIDRLabelsConcurrent/32-8 5.81ms ± 1% 5.77ms ± 1% -0.57% (p=0.011 n=8+9) GetCIDRLabelsConcurrent/48-8 8.87ms ± 6% 8.71ms ± 1% ~ (p=0.139 n=9+8) name old alloc/op new alloc/op delta GetCIDRLabels/0.0.0.0/0-8 624B ± 0% 624B ± 0% ~ (all equal) GetCIDRLabels/10.16.0.0/16-8 2.40kB ± 0% 2.40kB ± 0% ~ (all equal) GetCIDRLabels/192.0.2.3/32-8 5.01kB ± 0% 5.01kB ± 0% ~ (all equal) GetCIDRLabels/192.0.2.3/24-8 4.93kB ± 0% 4.93kB ± 0% ~ (all equal) GetCIDRLabels/192.0.2.0/24-8 4.93kB ± 0% 4.93kB ± 0% ~ (all equal) GetCIDRLabels/::/0-8 624B ± 0% 624B ± 0% ~ (all equal) GetCIDRLabels/fdff::ff/128-8 18.5kB ± 0% 18.5kB ± 0% ~ (all equal) GetCIDRLabels/f00d:42::ff/128-8 18.5kB ± 0% 18.5kB ± 0% ~ (all equal) GetCIDRLabels/f00d:42::ff/96-8 18.5kB ± 0% 18.5kB ± 0% ~ (all equal) GetCIDRLabelsConcurrent/1-8 321kB ± 0% 321kB ± 0% ~ (p=0.645 n=10+10) GetCIDRLabelsConcurrent/2-8 641kB ± 0% 641kB ± 0% ~ (p=0.796 n=10+10) GetCIDRLabelsConcurrent/4-8 1.28MB ± 0% 1.28MB ± 0% ~ (p=0.353 n=10+10) GetCIDRLabelsConcurrent/16-8 5.13MB ± 0% 5.13MB ± 0% ~ (p=0.083 n=10+8) GetCIDRLabelsConcurrent/32-8 10.3MB ± 0% 10.3MB ± 0% ~ (p=0.481 n=10+10) GetCIDRLabelsConcurrent/48-8 15.4MB ± 0% 15.4MB ± 0% ~ (p=0.796 n=10+10) name old allocs/op new allocs/op delta GetCIDRLabels/0.0.0.0/0-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/10.16.0.0/16-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/192.0.2.3/32-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/192.0.2.3/24-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/192.0.2.0/24-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/::/0-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GetCIDRLabels/fdff::ff/128-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) GetCIDRLabels/f00d:42::ff/128-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) GetCIDRLabels/f00d:42::ff/96-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) GetCIDRLabelsConcurrent/1-8 138 ± 0% 138 ± 0% ~ (all equal) GetCIDRLabelsConcurrent/2-8 277 ± 0% 277 ± 0% ~ (all equal) GetCIDRLabelsConcurrent/4-8 555 ± 0% 555 ± 0% ~ (all equal) GetCIDRLabelsConcurrent/16-8 2.22k ± 0% 2.22k ± 0% ~ (p=0.176 n=10+7) GetCIDRLabelsConcurrent/32-8 4.44k ± 0% 4.44k ± 0% ~ (p=0.867 n=10+10) GetCIDRLabelsConcurrent/48-8 6.66k ± 0% 6.66k ± 0% ~ (p=0.682 n=8+10) Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84
added
area/daemon
Impacts operation of the Cilium daemon.
kind/performance
There is a performance impact of this.
release-note/bug
This PR fixes an issue in a previous release of Cilium.
area/fqdn
Affects the FQDN policies feature
labels
Oct 25, 2023
/test |
joamaki
approved these changes
Oct 25, 2023
Thanks @pippolo84! This passes my little simple test case! 🚀 |
joamaki
added
release-blocker/1.12
This issue will prevent the release of the next version of Cilium.
release-blocker/1.13
This issue will prevent the release of the next version of Cilium.
release-blocker/1.14
This issue will prevent the release of the next version of Cilium.
labels
Oct 27, 2023
nathanjsweet
approved these changes
Oct 27, 2023
maintainer-s-little-helper
bot
added
the
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
label
Oct 27, 2023
jrajahalme
added
needs-backport/1.12
needs-backport/1.13
This PR / issue needs backporting to the v1.13 branch
needs-backport/1.14
This PR / issue needs backporting to the v1.14 branch
labels
Oct 30, 2023
pippolo84
added
backport-pending/1.14
The backport for Cilium 1.14.x for this PR is in progress.
and removed
needs-backport/1.14
This PR / issue needs backporting to the v1.14 branch
labels
Oct 30, 2023
pippolo84
added
backport-pending/1.13
The backport for Cilium 1.13.x for this PR is in progress.
and removed
needs-backport/1.13
This PR / issue needs backporting to the v1.13 branch
labels
Oct 30, 2023
maintainer-s-little-helper
bot
moved this from Needs backport from main
to Backport pending to v1.14
in 1.14.4
Oct 30, 2023
maintainer-s-little-helper
bot
moved this from Needs backport from main
to Backport pending to v1.13
in 1.13.9
Oct 30, 2023
maintainer-s-little-helper
bot
moved this from Needs backport from main
to Backport pending to v1.12
in 1.12.16
Oct 31, 2023
pippolo84
added
backport-done/1.12
The backport for Cilium 1.12.x for this PR is done.
backport-done/1.13
The backport for Cilium 1.13.x for this PR is done.
and removed
backport-pending/1.13
The backport for Cilium 1.13.x for this PR is in progress.
labels
Nov 2, 2023
maintainer-s-little-helper
bot
moved this from Backport pending to v1.13
to Backport done to v1.13
in 1.13.9
Nov 2, 2023
jibi
added
backport-done/1.14
The backport for Cilium 1.14.x for this PR is done.
and removed
backport-pending/1.14
The backport for Cilium 1.14.x for this PR is in progress.
labels
Nov 7, 2023
This was referenced Nov 12, 2023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/daemon
Impacts operation of the Cilium daemon.
area/fqdn
Affects the FQDN policies feature
backport-done/1.12
The backport for Cilium 1.12.x for this PR is done.
backport-done/1.13
The backport for Cilium 1.13.x for this PR is done.
backport-done/1.14
The backport for Cilium 1.14.x for this PR is done.
kind/bug
This is a bug in the Cilium logic.
kind/performance
There is a performance impact of this.
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
release-blocker/1.12
This issue will prevent the release of the next version of Cilium.
release-blocker/1.13
This issue will prevent the release of the next version of Cilium.
release-blocker/1.14
This issue will prevent the release of the next version of Cilium.
release-note/bug
This PR fixes an issue in a previous release of Cilium.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The previous version of the GetCIDRLabels implementation was computing the set of labels for a CIDR incorrectly in case of a cache hit: #28465 (comment)
Also, netip.PrefixFrom(...) does not mask the internally stored address, thus lowering the cache hit ratio even if two different CIDRs, used as keys in the LRU cache, are equal in terms of "masked address". (e.g: "1.1.1.1/16" and "1.1.0.0/16").
So, netip.Addr.Prefix(...) is used instead.
After the fix, the performance are roughly equal (but with an increased chance of having a cache hit). To keep the maximum heap usage reasonably low, the cache size is now halved.
Also, the tests are refactored to increase the coverage (checking the labels computation both in case of cache miss and in case of a cache hit).
Finally, the only remaining test using the deprecated checker package is migrated to the testing package from the standard Go library.
Read each commit message for further details and performance comparison.
cc @odinuge @sjdot