Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v1.14] ipcache: fix flapping labels in SelectorCache when reserved:host identity has multiple IPs #28418

Merged
merged 1 commit into from Oct 6, 2023

Conversation

squeed
Copy link
Contributor

@squeed squeed commented Oct 5, 2023

Once this PR is merged, you can update the PR labels via:

$ for pr in 28332; do contrib/backporting/set-labels.py $pr done 1.14; done

[ upstream commit d50a525 ]

[ Backporter's notes: minor renames, otherwise a clean backport ]

The `reserved:host` identity is special: the numeric identity is fixed and the
set of labels is mutable. (The datapath requires this.) So, we need to
determine all prefixes that have the `reserved:host` label and capture their
labels. Then, we must aggregate *all* labels from all IPs and insert them as
the `reserved:host` identity labels.

However, the code as written has a race condition whenever the local node has
more than one IP address. This can happen when, for example vxlan or ipv6 is
enabled. The basic sequence is this:

1. Insert IP A as `reserved:host` in to the ipcache. ID 1 now has labels
  `reserved:host`
2. Insert IP A as `reserved:kube-apiserver` in to the ipcache. ID 1 is updated
  with labels `reserved:host, reserved:kube-apsierver`
3. Insert IP B as `reserved:host` in to the ipcache. ID 1 is updated with
  labels `reserved:host`. And now policies that select
  `reserved:kube-apiserver` are broken

Likewise, we need to always update the SelectorCache; we cannot short-circuit
if the ipcache already has that identity. Again, this is needed because the
identity is mutable. So this bug can take another form:

1. Insert IP A as `reserved:host` in to the ipcache. Because IP A is not known
to the ipcache, treat ID 1 as a new identity and update the selector cache
2. Insert IP A as `reserved:kube-apiserver`. Mutate the labels of ID 1. But,
because IP A already has ID 1, short-circuit the update to the selector cache
(if the Source is the same, which it _may_ be).
3. Now the selector cache has incorrect labels for ID 1.

Without this, when there are multiple IPs with the host label, the identity may
flap and the SelectorCache may be missing updates.

Fixes: cilium#28259
Fixes: e0d403a
Fixes: 308c142

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
@squeed squeed requested a review from a team as a code owner October 5, 2023 11:58
@squeed squeed added kind/backports This PR provides functionality previously merged into master. backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. labels Oct 5, 2023
@squeed
Copy link
Contributor Author

squeed commented Oct 5, 2023

/test-backport-1.14

@christarazi
Copy link
Member

E2E test hit #27762, rerunning.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 5, 2023
@ti-mo ti-mo merged commit 5eb57b9 into cilium:v1.14 Oct 6, 2023
58 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants