Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CIDR labels computation #28788

Merged
merged 5 commits into from Oct 27, 2023

Conversation

pippolo84
Copy link
Member

The previous version of the GetCIDRLabels implementation was computing the set of labels for a CIDR incorrectly in case of a cache hit: #28465 (comment)

Also, netip.PrefixFrom(...) does not mask the internally stored address, thus lowering the cache hit ratio even if two different CIDRs, used as keys in the LRU cache, are equal in terms of "masked address". (e.g: "1.1.1.1/16" and "1.1.0.0/16").
So, netip.Addr.Prefix(...) is used instead.

After the fix, the performance are roughly equal (but with an increased chance of having a cache hit). To keep the maximum heap usage reasonably low, the cache size is now halved.

Also, the tests are refactored to increase the coverage (checking the labels computation both in case of cache miss and in case of a cache hit).

Finally, the only remaining test using the deprecated checker package is migrated to the testing package from the standard Go library.

Read each commit message for further details and performance comparison.

cc @odinuge @sjdot

The previous version of the implementation was actually computing the
labels starting from broader prefixes to narrower ones (first "/0", then
"/1" and so on).  As soon as we had a cache hit, the recursion stopped
without calculating the remaining labels for the CIDRs up to "ones".
This produced an incorrect (shorter) set of labels for a CIDR.

Also, netip.PrefixFrom(...) does not mask the internally stored address,
thus lowering the cache hit ratio even if two different CIDRs, used as
keys in the LRU cache, are equal in terms of masked address. (e.g:
"1.1.1.1/16" and "1.1.0.0/16").

So, netip.Addr.Prefix(...) is used instead.

After the fix, the performance are roughly equal (but with an increased
chance of having a cache hit). Instead, the maximum heap usage in the
worst case (LRU cache filled up with IPv6 labels) is increased 10x.

BenchmarkCIDRLabelsCacheHeapUsageIPv4
    cidr_test.go:628: Memoization map heap usage: 5483.24 KiB
BenchmarkCIDRLabelsCacheHeapUsageIPv6
    cidr_test.go:670: Memoization map heap usage: 54721.70 KiB

name                             old time/op    new time/op    delta
GetCIDRLabels/0.0.0.0/0-8           256ns ±39%     218ns ±46%     ~     (p=0.393 n=10+10)
GetCIDRLabels/10.16.0.0/16-8       1.35µs ± 3%    1.39µs ± 5%   +2.66%  (p=0.012 n=9+10)
GetCIDRLabels/192.0.2.3/32-8       2.52µs ± 2%    2.58µs ± 2%   +2.58%  (p=0.001 n=10+9)
GetCIDRLabels/192.0.2.3/24-8       2.57µs ± 1%    2.24µs ± 3%  -12.69%  (p=0.000 n=8+10)
GetCIDRLabels/192.0.2.0/24-8       2.27µs ± 4%    2.26µs ± 3%     ~     (p=0.690 n=9+8)
GetCIDRLabels/::/0-8                277ns ± 2%     278ns ± 3%     ~     (p=0.796 n=9+9)
GetCIDRLabels/fdff::ff/128-8       9.42µs ± 1%    9.34µs ± 6%     ~     (p=0.484 n=9+10)
GetCIDRLabels/f00d:42::ff/128-8    9.58µs ± 2%    9.62µs ± 7%     ~     (p=0.905 n=10+9)
GetCIDRLabels/f00d:42::ff/96-8     9.63µs ± 1%    8.45µs ± 3%  -12.27%  (p=0.000 n=8+9)
GetCIDRLabelsConcurrent/1-8         205µs ± 3%     207µs ± 3%     ~     (p=0.356 n=9+10)
GetCIDRLabelsConcurrent/2-8         385µs ± 5%     386µs ± 7%     ~     (p=0.631 n=10+10)
GetCIDRLabelsConcurrent/4-8         784µs ± 5%     780µs ± 1%     ~     (p=0.156 n=10+9)
GetCIDRLabelsConcurrent/16-8       3.24ms ± 1%    3.25ms ± 2%     ~     (p=0.529 n=10+10)
GetCIDRLabelsConcurrent/32-8       6.40ms ± 1%    6.39ms ± 3%     ~     (p=0.497 n=9+10)
GetCIDRLabelsConcurrent/48-8       9.69ms ± 1%   10.09ms ± 6%   +4.09%  (p=0.008 n=8+9)

name                             old alloc/op   new alloc/op   delta
GetCIDRLabels/0.0.0.0/0-8            624B ± 0%      624B ± 0%     ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8       2.40kB ± 0%    2.40kB ± 0%     ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8       5.01kB ± 0%    5.01kB ± 0%     ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8       5.01kB ± 0%    4.93kB ± 0%   -1.64%  (p=0.002 n=8+10)
GetCIDRLabels/192.0.2.0/24-8       4.93kB ± 0%    4.93kB ± 0%     ~     (all equal)
GetCIDRLabels/::/0-8                 624B ± 0%      624B ± 0%     ~     (all equal)
GetCIDRLabels/fdff::ff/128-8       18.5kB ± 0%    18.5kB ± 0%     ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8    18.5kB ± 0%    18.5kB ± 0%     ~     (p=0.108 n=9+10)
GetCIDRLabels/f00d:42::ff/96-8     18.5kB ± 0%    18.5kB ± 0%   -0.06%  (p=0.000 n=10+10)
GetCIDRLabelsConcurrent/1-8         321kB ± 0%     321kB ± 0%     ~     (p=0.127 n=10+8)
GetCIDRLabelsConcurrent/2-8         641kB ± 0%     641kB ± 0%     ~     (p=0.928 n=10+10)
GetCIDRLabelsConcurrent/4-8        1.28MB ± 0%    1.28MB ± 0%     ~     (p=0.853 n=10+10)
GetCIDRLabelsConcurrent/16-8       5.13MB ± 0%    5.13MB ± 0%     ~     (p=0.739 n=10+10)
GetCIDRLabelsConcurrent/32-8       10.3MB ± 0%    10.3MB ± 0%     ~     (p=0.218 n=10+10)
GetCIDRLabelsConcurrent/48-8       15.4MB ± 0%    15.4MB ± 0%     ~     (p=0.218 n=10+10)

name                             old allocs/op  new allocs/op  delta
GetCIDRLabels/0.0.0.0/0-8            2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8         2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8         2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8         2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/192.0.2.0/24-8         2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/::/0-8                 2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GetCIDRLabels/fdff::ff/128-8         3.00 ± 0%      3.00 ± 0%     ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8      3.00 ± 0%      3.00 ± 0%     ~     (all equal)
GetCIDRLabels/f00d:42::ff/96-8       3.00 ± 0%      3.00 ± 0%     ~     (all equal)
GetCIDRLabelsConcurrent/1-8           138 ± 0%       138 ± 0%     ~     (all equal)
GetCIDRLabelsConcurrent/2-8           277 ± 0%       277 ± 0%     ~     (all equal)
GetCIDRLabelsConcurrent/4-8           555 ± 0%       555 ± 0%     ~     (p=0.248 n=10+9)
GetCIDRLabelsConcurrent/16-8        2.22k ± 0%     2.22k ± 0%     ~     (p=0.353 n=7+10)
GetCIDRLabelsConcurrent/32-8        4.44k ± 0%     4.44k ± 0%     ~     (p=0.723 n=10+10)
GetCIDRLabelsConcurrent/48-8        6.66k ± 0%     6.66k ± 0%     ~     (p=0.090 n=10+9)

Fixes: e0f6c47 ("labels/cidr: Cache GetCIDRLabels computation")

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
After the introduction of a LRU cache in GetCIDRLabels, the tests should
verify the labels computation both when the cache is cold but also when
it is hot.

Thus, the tests are refactored to check the returned labels twice.

Also, an additional test is added to verify that the labels stay
consistent when we call GetCIDRLabels with the following sequences of
prefixes:

1) "xxx/32", "xxx/31", ..., "xxx/0", "xxx/1", ..., "xxx/32"

2) "xxx/0", "xxx/1", ..., "xxx/32", "xxx/31", ..., "xxx/0"

Finally, InCluster tests are removed since cluster identity does not
exist anymore.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Migrate remaining tests relying on checker to the testing package from
Go standard library.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
TestCIDRLabelsCacheHeapUsageIP{v4,v6} are meant to estimate the maximum
heap usage when filling up the CIDR labels LRU cache with labels derived
only from IPv4 and labels derived only from IPv6.

Since they give meaningful results only when running them with
benchtime=1x, thery are refactored to be just tests with a t.Log() to
output the heap usage statistics.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
After fixing the GetCIDRLabels implementation to generate all the labels
required for a CIDR, the heap usage of the LRU cache increased 10x in
the worst case (all IPv6 labels).  To reduce heap usage, the cache size
is halved, resulting in ~25 MiB in the IPv6 only case with roughly the
same performance.

=== RUN   TestCIDRLabelsCacheHeapUsageIPv4
    cidr_test.go:527: Memoization map heap usage: 1714.41 KiB
--- PASS: TestCIDRLabelsCacheHeapUsageIPv4 (0.67s)
=== RUN   TestCIDRLabelsCacheHeapUsageIPv6
    cidr_test.go:571: Memoization map heap usage: 26527.13 KiB
--- PASS: TestCIDRLabelsCacheHeapUsageIPv6 (0.71s)

name                             old time/op    new time/op    delta
GetCIDRLabels/0.0.0.0/0-8           198ns ±40%     238ns ±34%    ~     (p=0.325 n=10+10)
GetCIDRLabels/10.16.0.0/16-8       1.32µs ± 8%    1.33µs ± 8%    ~     (p=0.812 n=10+10)
GetCIDRLabels/192.0.2.3/32-8       2.41µs ± 3%    2.39µs ± 5%    ~     (p=0.278 n=10+9)
GetCIDRLabels/192.0.2.3/24-8       2.05µs ± 2%    2.05µs ± 1%    ~     (p=0.948 n=9+9)
GetCIDRLabels/192.0.2.0/24-8       2.05µs ± 2%    2.04µs ± 1%    ~     (p=0.797 n=9+8)
GetCIDRLabels/::/0-8                277ns ±31%     257ns ± 1%    ~     (p=0.349 n=10+8)
GetCIDRLabels/fdff::ff/128-8       9.02µs ± 6%    8.80µs ± 3%    ~     (p=0.077 n=9+9)
GetCIDRLabels/f00d:42::ff/128-8    9.40µs ± 6%    9.01µs ± 5%  -4.12%  (p=0.035 n=10+10)
GetCIDRLabels/f00d:42::ff/96-8     7.78µs ± 4%    7.58µs ± 1%  -2.59%  (p=0.011 n=8+9)
GetCIDRLabelsConcurrent/1-8         189µs ± 8%     173µs ± 3%  -8.85%  (p=0.000 n=10+9)
GetCIDRLabelsConcurrent/2-8         350µs ± 2%     346µs ± 1%  -1.05%  (p=0.001 n=8+8)
GetCIDRLabelsConcurrent/4-8         703µs ± 1%     692µs ± 1%  -1.59%  (p=0.000 n=9+9)
GetCIDRLabelsConcurrent/16-8       2.97ms ± 7%    2.91ms ± 1%    ~     (p=0.122 n=10+8)
GetCIDRLabelsConcurrent/32-8       5.81ms ± 1%    5.77ms ± 1%  -0.57%  (p=0.011 n=8+9)
GetCIDRLabelsConcurrent/48-8       8.87ms ± 6%    8.71ms ± 1%    ~     (p=0.139 n=9+8)

name                             old alloc/op   new alloc/op   delta
GetCIDRLabels/0.0.0.0/0-8            624B ± 0%      624B ± 0%    ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8       2.40kB ± 0%    2.40kB ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8       5.01kB ± 0%    5.01kB ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8       4.93kB ± 0%    4.93kB ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.0/24-8       4.93kB ± 0%    4.93kB ± 0%    ~     (all equal)
GetCIDRLabels/::/0-8                 624B ± 0%      624B ± 0%    ~     (all equal)
GetCIDRLabels/fdff::ff/128-8       18.5kB ± 0%    18.5kB ± 0%    ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8    18.5kB ± 0%    18.5kB ± 0%    ~     (all equal)
GetCIDRLabels/f00d:42::ff/96-8     18.5kB ± 0%    18.5kB ± 0%    ~     (all equal)
GetCIDRLabelsConcurrent/1-8         321kB ± 0%     321kB ± 0%    ~     (p=0.645 n=10+10)
GetCIDRLabelsConcurrent/2-8         641kB ± 0%     641kB ± 0%    ~     (p=0.796 n=10+10)
GetCIDRLabelsConcurrent/4-8        1.28MB ± 0%    1.28MB ± 0%    ~     (p=0.353 n=10+10)
GetCIDRLabelsConcurrent/16-8       5.13MB ± 0%    5.13MB ± 0%    ~     (p=0.083 n=10+8)
GetCIDRLabelsConcurrent/32-8       10.3MB ± 0%    10.3MB ± 0%    ~     (p=0.481 n=10+10)
GetCIDRLabelsConcurrent/48-8       15.4MB ± 0%    15.4MB ± 0%    ~     (p=0.796 n=10+10)

name                             old allocs/op  new allocs/op  delta
GetCIDRLabels/0.0.0.0/0-8            2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/10.16.0.0/16-8         2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.3/32-8         2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.3/24-8         2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/192.0.2.0/24-8         2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/::/0-8                 2.00 ± 0%      2.00 ± 0%    ~     (all equal)
GetCIDRLabels/fdff::ff/128-8         3.00 ± 0%      3.00 ± 0%    ~     (all equal)
GetCIDRLabels/f00d:42::ff/128-8      3.00 ± 0%      3.00 ± 0%    ~     (all equal)
GetCIDRLabels/f00d:42::ff/96-8       3.00 ± 0%      3.00 ± 0%    ~     (all equal)
GetCIDRLabelsConcurrent/1-8           138 ± 0%       138 ± 0%    ~     (all equal)
GetCIDRLabelsConcurrent/2-8           277 ± 0%       277 ± 0%    ~     (all equal)
GetCIDRLabelsConcurrent/4-8           555 ± 0%       555 ± 0%    ~     (all equal)
GetCIDRLabelsConcurrent/16-8        2.22k ± 0%     2.22k ± 0%    ~     (p=0.176 n=10+7)
GetCIDRLabelsConcurrent/32-8        4.44k ± 0%     4.44k ± 0%    ~     (p=0.867 n=10+10)
GetCIDRLabelsConcurrent/48-8        6.66k ± 0%     6.66k ± 0%    ~     (p=0.682 n=8+10)

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
@pippolo84 pippolo84 added area/daemon Impacts operation of the Cilium daemon. kind/performance There is a performance impact of this. release-note/bug This PR fixes an issue in a previous release of Cilium. area/fqdn Affects the FQDN policies feature labels Oct 25, 2023
@pippolo84 pippolo84 requested review from a team as code owners October 25, 2023 14:08
@pippolo84 pippolo84 added the kind/bug This is a bug in the Cilium logic. label Oct 25, 2023
@pippolo84
Copy link
Member Author

/test

@odinuge
Copy link
Member

odinuge commented Oct 25, 2023

Thanks @pippolo84! This passes my little simple test case! 🚀

@joamaki joamaki added release-blocker/1.12 This issue will prevent the release of the next version of Cilium. release-blocker/1.13 This issue will prevent the release of the next version of Cilium. release-blocker/1.14 This issue will prevent the release of the next version of Cilium. labels Oct 27, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 27, 2023
@nathanjsweet nathanjsweet merged commit 6f1253e into cilium:main Oct 27, 2023
63 checks passed
@jrajahalme jrajahalme added needs-backport/1.12 needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Oct 30, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.13.9 Oct 30, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.4 Oct 30, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.12.16 Oct 30, 2023
@pippolo84 pippolo84 mentioned this pull request Oct 30, 2023
9 tasks
@pippolo84 pippolo84 added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Oct 30, 2023
@pippolo84 pippolo84 mentioned this pull request Oct 30, 2023
6 tasks
@pippolo84 pippolo84 added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. and removed needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels Oct 30, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.14 in 1.14.4 Oct 30, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.13 in 1.13.9 Oct 30, 2023
@pippolo84 pippolo84 mentioned this pull request Oct 31, 2023
4 tasks
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.12 in 1.12.16 Oct 31, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.12 in 1.12.16 Nov 2, 2023
@pippolo84 pippolo84 added backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. labels Nov 2, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport done to v1.12 in 1.12.16 Nov 2, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.13 to Backport done to v1.13 in 1.13.9 Nov 2, 2023
@jibi jibi added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Nov 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/daemon Impacts operation of the Cilium daemon. area/fqdn Affects the FQDN policies feature backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. kind/bug This is a bug in the Cilium logic. kind/performance There is a performance impact of this. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-blocker/1.12 This issue will prevent the release of the next version of Cilium. release-blocker/1.13 This issue will prevent the release of the next version of Cilium. release-blocker/1.14 This issue will prevent the release of the next version of Cilium. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.12.16
Backport done to v1.12
1.13.9
Backport done to v1.13
1.14.4
Backport pending to v1.14
Development

Successfully merging this pull request may close these issues.

None yet

6 participants