Don't expand CIDR labels, match smartly in Labels instead #30897
Conversation
In principle this seems like a really neat win. I can't currently think of any issues with encoding this into the selectors to identify labels with prefixes that are more specific than the selector.

I didn't review in deep detail beyond the second patch, primarily for time and because I think you were interested in high-level feedback at this point. Perhaps another @cilium/sig-policy member could do a further in-depth review, or I can schedule a subsequent follow-up review later on.

Probably the biggest issue this change "tickles" is the inconsistency around "directionality" in the Label / Labels / LabelArray APIs. This is because label equality, as written, is non-commutative:

```go
a := ParseLabel("any:foo=bar")
b := ParseLabel("k8s:foo=bar")

// What is this, Javascript?
a.Equals(b) // == true
b.Equals(a) // == false
a == b      // false
```

Oh, and:

```go
aa := labels.FromSlice({a}); bb := labels.FromSlice({b})
aa.Equals(bb) // == false
bb.Equals(aa) // == false
```

But:

```go
aa.LabelArray().Equals(bb.LabelArray()) // true
bb.LabelArray().Equals(aa.LabelArray()) // false
```

For more fun:

```go
aa.LabelArray().Has(b) // false
aa.Has(b)              // true
```

My eventual goal with all of this:

So, the first step is a consistent …
One thing this change relies on is the current limitation that CIDR labels don't have a value. We might fail the case for:

```go
["cidr:1.1.1.0/24=foo", "cidr:1.1.1.1/32=bar"].Get("cidr:1.1.1.1/32")
["cidr:1.1.1.0/24=foo", "cidr:1.1.1.1/32=bar"].Get("cidr:1.1.1.0/31")
```
I second what Joe said regarding this being a neat win. I remember thinking about something similar while working on #28788, but got overwhelmed by the potential impact of such a change. As you said, the CIDR labels are way too expensive, and in the IPv6 case this also limits the effectiveness of the CIDR labels cache introduced in #28788, as it is not possible to get a meaningful hit ratio without a very large memory footprint.

My only concern is the current inconsistencies with the API you already highlighted, but the plan you depicted here seems sound and might also give us a chance to clean this up, besides improving performance. 👍
Very impressive! Thanks! Two questions.
The changes LGTM and I can't think of any adverse effects from the PR. I am curious, though, whether we have an understanding of what the actual performance impact of this change is. AFAIU, `GetCIDRLabels` consumed a lot of CPU and memory generating CIDR labels constantly when the DNS proxy sees many DNS requests, but how much has this PR really changed that? Have we done a pprof comparison, for example? I don't doubt that this PR improves things, but it's also useful to quantify how much has changed and, worst case, potentially point out a flaw that we didn't detect previously.
We actually rely quite heavily on the LabelSourceAny mechanism. EndpointSelectors in CiliumNetworkPolicies always have LabelSourceAny added. For example, the block

```yaml
toEndpoints:
  - matchLabels:
      io.kubernetes.pod.namespace: kube-system
      k8s-app: kube-dns
```

converts to the label selector `{any.io.kubernetes.pod.namespace: kube-system,any.k8s-app: kube-dns,}`. So, explicitly mention this in comments and update the SelectorCache tests to capture this behavior.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
Adding some invariants that should not be broken during coming refactors.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
This test adds a very specific invariant that is needed by the policy engine. Specifically, the expanded set of CIDR labels must always `.Has()` a CIDR label that contains it. This will be relevant when we stop expanding CIDR labels and, instead, logically compute CIDR matching.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
The label matching API is complicated and inconsistent. This change tries to bring some sanity to the API going forward, without changing existing behavior.

Label matching is directional / non-commutative. Specifically, `"any:foo=bar".Equals("k8s:foo=bar")` is true, whereas `"k8s:foo=bar".Equals("any:foo=bar")` is false. So, with the eventual goal of removing `Label.Equals()`, this commit adds new `Label.Has()` and `Label.HasKey()` APIs, with clear documentation around directionality. The fixed point here is `LabelArray.Has()`, which needs a specific directionality as required by the k8s label selector library. Everything else is based off of that.

This also changes `Labels.Has()` to match directionality w.r.t. `any`-source selectors. In theory this is a breaking change; in actuality, `Labels.Has()` is never passed `any` selectors, so this is moot.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
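The directional matching described above can be sketched with a tiny stand-in type. Note that `label` and `has` here are hypothetical names for illustration, not the actual Cilium API:

```go
package main

import "fmt"

// label is a simplified stand-in for a source:key=value label.
type label struct {
	source, key, value string
}

// has reports whether the receiver (a label on an identity) is selected by
// sel. Directionality matters: a selector with source "any" matches a label
// from any concrete source, but a concrete-source selector matches only
// labels from that exact source.
func (l label) has(sel label) bool {
	if sel.source != "any" && sel.source != l.source {
		return false
	}
	return sel.key == l.key && sel.value == l.value
}

func main() {
	k8s := label{source: "k8s", key: "foo", value: "bar"}
	anySel := label{source: "any", key: "foo", value: "bar"}

	fmt.Println(k8s.has(anySel)) // true: "any" selector matches a k8s label
	fmt.Println(anySel.has(k8s)) // false: "k8s" selector is narrower than "any"
}
```

The asymmetry of `has` is exactly why a commutative `Equals` was the wrong primitive: the selector side and the identity side play different roles.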
This changes the Labels API to be CIDR-aware. It then logically "expands" CIDR labels when computing matches, so that selectors can match CIDRs even when they are not literally present. It does this by parsing CIDRs on label creation, then checking CIDR overlap in the `MatchesKey()` function. The API contract we expose to the policy engine is unchanged:

```
GetCIDRLabels("10.0.0.0/24").LabelArray().Has("cidr.10.0.0.0/8") == true
```

The goal is to stop manually expanding CIDR labels, which is very inefficient. This will follow in a subsequent commit.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
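The prefix-containment check at the heart of this can be sketched with Go's `net/netip`. This is a minimal illustration, assuming a hypothetical helper name; the real logic lives in the Labels / `MatchesKey()` code:

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// cidrLabelMatches reports whether a selector CIDR label (e.g.
// "cidr:10.0.0.0/8") logically matches an endpoint's CIDR label, without
// materializing every intermediate prefix. Hypothetical helper for
// illustration only.
func cidrLabelMatches(selector, label string) bool {
	sel, err1 := netip.ParsePrefix(strings.TrimPrefix(selector, "cidr:"))
	lbl, err2 := netip.ParsePrefix(strings.TrimPrefix(label, "cidr:"))
	if err1 != nil || err2 != nil {
		return false
	}
	// A broader-or-equal selector prefix matches iff the label's network
	// address falls inside it. Contains only compares the top sel.Bits()
	// bits, so this is a pure prefix-containment test.
	return sel.Bits() <= lbl.Bits() && sel.Contains(lbl.Addr())
}

func main() {
	fmt.Println(cidrLabelMatches("cidr:10.0.0.0/8", "cidr:10.0.1.0/24")) // true: /8 contains /24
	fmt.Println(cidrLabelMatches("cidr:10.0.1.0/24", "cidr:10.0.0.0/8")) // false: narrower selector
	fmt.Println(cidrLabelMatches("cidr:10.0.0.0/8", "cidr:11.0.0.0/24")) // false: disjoint
}
```

With a check like this, a single stored label per prefix is enough for any containing-CIDR selector to match it.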
Previously, we would expand a CIDR into the full set of possible CIDRs that could select it. For example, "1.1.1.1/32" would be expanded into [0.0.0.0/0, 0.0.0.0/1, ..., 1.1.1.0/31, 1.1.1.1/32]. This causes significant memory and CPU usage, especially in circumstances such as ToFQDN policies, where many /32 and /128 identities are created.

Now that CIDR selectors are prefix-aware, rather than just string matches, we can stop generating the complete list of CIDRs. This is safe because CIDR labels now select the CIDRs contained within them.

Benchmark results:

```
                      │ ../bench_main.out │        ../bench_cidr.out        │
                      │      sec/op       │    sec/op      vs base
UpdateGenerateDNS-12         4.972 ±  2%     2.882 ±  3%  -42.02% (p=0.000 n=10)

                      │ ../bench_main.out │        ../bench_cidr.out        │
                      │       B/op        │     B/op       vs base
UpdateGenerateDNS-12        77.26Mi ± 0%   24.52Mi ±  0%  -68.26% (p=0.000 n=10)

                      │ ../bench_main.out │        ../bench_cidr.out        │
                      │     allocs/op     │   allocs/op    vs base
UpdateGenerateDNS-12        508.0k ±  0%    291.7k ±  0%  -42.59% (p=0.000 n=10)
```

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
CIDR labels are expensive. Very expensive. For an IPv6 prefix, `GetCidrLabels()` can take over 2 KiB of memory. This change refactors the label internals so that CIDR labels also match when the CIDR is larger. Then, we no longer have to expand CIDR labels at all; the matching logic handles it correctly.
Note to reviewers: please review by commit; the earlier commits add some test invariants, and another commit lightly refactors the label matching API to try to bring some sanity. Only the final two commits change any logic.