Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v1.13] ipcache: aggregate labels from all IPs with local host identity #28416

Merged
merged 1 commit into from
Oct 5, 2023

Conversation

squeed
Copy link
Contributor

@squeed squeed commented Oct 5, 2023

Once this PR is merged, you can update the PR labels via:

$ for pr in 28332; do contrib/backporting/set-labels.py $pr done 1.13; done

[ upstream commit d50a525 ]

[ backporter's notes:
    - oldID was renamed to id, adapted to that
    - v1.13 already mocked out the selector cache, updated that
]

The `reserved:host` identity is special: the numeric identity is fixed and the
set of labels is mutable. (The datapath requires this.) So, we need to
determine all prefixes that have the `reserved:host` label and capture their
labels. Then, we must aggregate *all* labels from all IPs and insert them as
the `reserved:host` identity labels.

However, the code as written has a race condition whenever the local node has
more than one IP address. This can happen when, for example vxlan or ipv6 is
enabled. The basic sequence is this:

1. Insert IP A as `reserved:host` in to the ipcache. ID 1 now has labels
  `reserved:host`
2. Insert IP A as `reserved:kube-apiserver` in to the ipcache. ID 1 is updated
  with labels `reserved:host, reserved:kube-apsierver`
3. Insert IP B as `reserved:host` in to the ipcache. ID 1 is updated with
  labels `reserved:host`. And now policies that select
  `reserved:kube-apiserver` are broken

Likewise, we need to always update the SelectorCache; we cannot short-circuit
if the ipcache already has that identity. Again, this is needed because the
identity is mutable. So this bug can take another form:

1. Insert IP A as `reserved:host` in to the ipcache. Because IP A is not known
to the ipcache, treat ID 1 as a new identity and update the selector cache
2. Insert IP A as `reserved:kube-apiserver`. Mutate the labels of ID 1. But,
because IP A already has ID 1, short-circuit the update to the selector cache
(if the Source is the same, which it _may_ be).
3. Now the selector cache has incorrect labels for ID 1.

Without this, when there are multiple IPs with the host label, the identity may
flap and the SelectorCache may be missing updates.

Fixes: cilium#28259
Fixes: e0d403a
Fixes: 308c142

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
@squeed squeed requested a review from a team as a code owner October 5, 2023 11:46
@squeed squeed added kind/backports This PR provides functionality previously merged into master. backport/1.13 This PR represents a backport for Cilium 1.13.x of a PR that was merged to main. labels Oct 5, 2023
@squeed
Copy link
Contributor Author

squeed commented Oct 5, 2023

/test-backport-1.13

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 5, 2023
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the backporting notes, easy to review the diff.

@christarazi christarazi merged commit b3118a2 into cilium:v1.13 Oct 5, 2023
61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.13 This PR represents a backport for Cilium 1.13.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants