Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipcache: fix flapping labels in SelectorCache when reserved:host identity has multiple IPs #28332

Merged
merged 1 commit into from Oct 5, 2023

Conversation

squeed
Copy link
Contributor

@squeed squeed commented Sep 29, 2023

The reserved:host identity is special: the numeric identity is fixed and the set of labels is mutable. (The datapath requires this.) So, we need to determine all prefixes that have the reserved:host label and capture their labels. Then, we must aggregate all labels from all IPs and insert them as the reserved:host identity labels.

However, the code as written has a race condition whenever the local node has more than one IP address. This can happen when, for example vxlan or ipv6 is enabled. The basic sequence is this:

  1. Insert IP A as reserved:host in to the ipcache. ID 1 now has labels reserved:host
  2. Insert IP A as reserved:kube-apiserver in to the ipcache. ID 1 is updated with labels labels reserved:host, reserved:kube-apsierver
  3. Insert IP B as reserved:host in to the ipcache. ID 1 is updated with labels reserved:host
    And now policies are broken.

Likewise, we need to always update the SelectorCache; we cannot short-circuit if the ipcache already has that identity. Again, this is needed because the identity is mutable. So this bug can take another form:

  1. Insert IP A as reserved:host in to the ipcache. Because IP A is not known to the ipcache, treat ID 1 as a new identity and update the selector cache
  2. Insert IP A as reserved:kube-apiserver. Mutate the labels of ID 1. But, because IP A already has ID 1, short-circuit the update to the selector cache (if the Source is the same, which it may be).
  3. Now the selector cache has incorrect labels for ID 1.

Without this, when there are multiple IPs with the host label, the identity may flap and the SelectorCache may be missing updates.

Fixes: #28259
Fixes: b099ba6
Fixes: f5d9532

@squeed squeed added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Sep 29, 2023
@squeed squeed requested a review from a team as a code owner September 29, 2023 13:09
@squeed squeed force-pushed the merge-host-labels branch 2 times, most recently from d5b7717 to 59e7f3b Compare September 29, 2023 13:17
@nathanjsweet
Copy link
Member

Additional CI for "pod-to-node-policycidr" test: #28334.

@squeed
Copy link
Contributor Author

squeed commented Sep 30, 2023

/test

@christarazi
Copy link
Member

Without this, when there are multiple IPs with the host label, the identity may flap and the SelectorCache may be missing updates.

I'm having trouble understanding this and without understanding, it's difficult to review the PR. Could you go into more detail and also provide an example?

@nathanjsweet
Copy link
Member

This test keeps failing, but when I run it locally, it passes. I'm going to rerun it a few more times, because the error is simply one of the test pods not coming up.

@christarazi christarazi added the sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. label Oct 2, 2023
@squeed squeed changed the title ipcache: aggregate labels from all IPs with local host identity ipcache: fix flapping labels in SelectorCache when reserved:host identity has multiple IPs Oct 2, 2023
@squeed squeed added the affects/v1.14 This issue affects v1.14 branch label Oct 2, 2023
pkg/ipcache/metadata.go Outdated Show resolved Hide resolved
pkg/ipcache/metadata.go Outdated Show resolved Hide resolved
pkg/ipcache/metadata.go Outdated Show resolved Hide resolved
Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Core change LGTM from a high level, I walked through the logic along with @christarazi . He may have some other minor feedback.

@squeed squeed added affects/v1.13 This issue affects v1.13 branch needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Oct 4, 2023
@squeed squeed requested a review from schlosna October 4, 2023 09:52
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.3 Oct 4, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.13.8 Oct 4, 2023
@squeed
Copy link
Contributor Author

squeed commented Oct 4, 2023

@schlosna updated based on feedback; PTAL.

@squeed squeed requested a review from a team as a code owner October 4, 2023 09:53
@squeed squeed self-assigned this Oct 4, 2023
Copy link
Contributor

@schlosna schlosna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments, this looks good to me.

The commit check is unhappy with a duplicated word labels in the commit message and references to commits b099ba6 & f5d9532 that were part of #19765

{[1/1] ipcache: aggregate labels from all IPs with local host identity}
Running on ae661971b3e961b64b9f7fc28a6e162eb72ed58e
Warning: WARNING:REPEATED_WORD: Possible repeated word: 'labels'
#20:
with labels labels reserved:host, reserved:kube-apsierver

Warning: WARNING:UNKNOWN_COMMIT_ID: Unknown commit id 'b099ba614', maybe rebased or not pulled?
#41:
Fixes: b099ba6

Warning: WARNING:UNKNOWN_COMMIT_ID: Unknown commit id 'f5d95325c', maybe rebased or not pulled?
#42:
Fixes: f5d9532

@squeed
Copy link
Contributor Author

squeed commented Oct 5, 2023

FYI @joestringer I was chatting with @christarazi, then I realized a few hours later that I had missed a case (albeit unlikely): if an IP loses reserved:host but keeps, say, reserved:kube-apiserver or cidr:xxxx. In that case, we didn't remove reserved:kube-apiserver from ID 1. That is now fixed.

The `reserved:host` identity is special: the numeric identity is fixed and the
set of labels is mutable. (The datapath requires this.) So, we need to
determine all prefixes that have the `reserved:host` label and capture their
labels. Then, we must aggregate *all* labels from all IPs and insert them as
the `reserved:host` identity labels.

However, the code as written has a race condition whenever the local node has
more than one IP address. This can happen when, for example vxlan or ipv6 is
enabled. The basic sequence is this:

1. Insert IP A as `reserved:host` in to the ipcache. ID 1 now has labels
  `reserved:host`
2. Insert IP A as `reserved:kube-apiserver` in to the ipcache. ID 1 is updated
  with labels `reserved:host, reserved:kube-apsierver`
3. Insert IP B as `reserved:host` in to the ipcache. ID 1 is updated with
  labels `reserved:host`. And now policies that select
  `reserved:kube-apiserver` are broken

Likewise, we need to always update the SelectorCache; we cannot short-circuit
if the ipcache already has that identity. Again, this is needed because the
identity is mutable. So this bug can take another form:

1. Insert IP A as `reserved:host` in to the ipcache. Because IP A is not known
to the ipcache, treat ID 1 as a new identity and update the selector cache
2. Insert IP A as `reserved:kube-apiserver`. Mutate the labels of ID 1. But,
because IP A already has ID 1, short-circuit the update to the selector cache
(if the Source is the same, which it _may_ be).
3. Now the selector cache has incorrect labels for ID 1.

Without this, when there are multiple IPs with the host label, the identity may
flap and the SelectorCache may be missing updates.

Fixes: cilium#28259
Fixes: e0d403a
Fixes: 308c142

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
@squeed
Copy link
Contributor Author

squeed commented Oct 5, 2023

/test

@squeed squeed added the backport/author The backport will be carried out by the author of the PR. label Oct 5, 2023
@squeed squeed merged commit d50a525 into cilium:main Oct 5, 2023
59 of 61 checks passed
@squeed squeed added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. and removed needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels Oct 5, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.13 in 1.13.8 Oct 5, 2023
@squeed squeed added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Oct 5, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.14 in 1.14.3 Oct 5, 2023
@christarazi christarazi added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. labels Oct 5, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.13 to Backport done to v1.13 in 1.13.8 Oct 5, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.13 to Backport done to v1.13 in 1.13.8 Oct 5, 2023
@joestringer joestringer added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Oct 6, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.14 to Backport done to v1.14 in 1.14.3 Oct 6, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.14 to Backport done to v1.14 in 1.14.3 Oct 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/v1.13 This issue affects v1.13 branch affects/v1.14 This issue affects v1.14 branch backport/author The backport will be carried out by the author of the PR. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.
Projects
No open projects
1.13.8
Backport done to v1.13
1.14.3
Backport done to v1.14
Development

Successfully merging this pull request may close these issues.

"kube-apiserver" policy entity is broken on dual-stack clusters
5 participants