Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix labels synchronization issues on Cilium #29248

Merged
merged 5 commits into from
Dec 1, 2023
Merged

Conversation

aanm
Copy link
Member

@aanm aanm commented Nov 17, 2023

Read on a per commit basis.

Fixed label synchronization issues in Cilium, ensuring accurate representation of endpoint labels during restoration and addressing out-of-sync problems caused by label changes while the Cilium agent is down.

@aanm aanm added kind/bug This is a bug in the Cilium logic. sig/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers. release-note/bug This PR fixes an issue in a previous release of Cilium. affects/v1.12 This issue affects v1.12 branch backport/author The backport will be carried out by the author of the PR. affects/v1.13 This issue affects v1.13 branch affects/v1.14 This issue affects v1.14 branch needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Nov 17, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.5 Nov 17, 2023
@aanm
Copy link
Member Author

aanm commented Nov 17, 2023

/test

@aanm aanm marked this pull request as ready for review November 17, 2023 23:36
@aanm aanm requested review from a team as code owners November 17, 2023 23:36
@aanm aanm added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Nov 18, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.14 in 1.14.5 Nov 18, 2023
Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good overall, but I have some questions around the details

pkg/labels/oplabels.go Show resolved Hide resolved
pkg/labels/oplabels.go Show resolved Hide resolved
pkg/endpoint/endpoint.go Outdated Show resolved Hide resolved
pkg/endpoint/endpoint.go Outdated Show resolved Hide resolved
pkg/endpoint/endpoint.go Outdated Show resolved Hide resolved
Copy link
Member

@nebril nebril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM aside from Sebastian's comments.

@aanm aanm requested a review from gandro November 20, 2023 13:03
Copy link
Contributor

@learnitall learnitall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending @gandro's approval

Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-wise looks good to me. I have one more question around the controller lifetime, which I don't think it can be hit with the current callers to RunMetadataResolver (as they don't provide baseLabels), but might be in the future.

pkg/endpoint/endpoint.go Outdated Show resolved Hide resolved
pkg/endpoint/endpoint.go Outdated Show resolved Hide resolved
pkg/endpoint/endpoint.go Outdated Show resolved Hide resolved
pkg/endpoint/endpoint.go Show resolved Hide resolved
pkg/endpoint/endpoint.go Outdated Show resolved Hide resolved
pkg/endpoint/endpoint.go Outdated Show resolved Hide resolved
Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once Joe's unresolved review comments are addressed.

@aanm
Copy link
Member Author

aanm commented Nov 29, 2023

/test

@aanm
Copy link
Member Author

aanm commented Nov 30, 2023

/test

@aanm aanm enabled auto-merge November 30, 2023 11:45
When replacing the endpoint labels we want to keep all labels that are
part of the source for which we are replacing the labels. For example,
labels added through the API should not be replaced when a K8s label
update is received.

Signed-off-by: André Martins <andre@cilium.io>
Cilium shouldn't replace labels that come from a different source even
if they have the same key. In order for a label to be replaced, the new
label should have the same source as the old label.

Signed-off-by: André Martins <andre@cilium.io>
If an endpoint does not contain a pod nor a namespace then don't
resolve its metadata.

Signed-off-by: André Martins <andre@cilium.io>
Fix two Cilium bugs related to label handling:

1. Addressed an issue during endpoint restoration where Cilium would incorrectly
   replace labels not sourced from Kubernetes. Previously, labels set on an
   endpoint outside of Kubernetes were wiped out upon restoration, as all labels
   were overwritten with those fetched from Kubernetes.

2. Resolved a bug that occurred when a user added or removed a label from a pod
   or namespace while the Cilium agent was inactive. Upon Cilium restart, the
   affected endpoint failed to reflect these changes, leading to synchronization
   issues in label management.

Signed-off-by: André Martins <andre@cilium.io>
If we execute the 'RunMetadataResolver' more than one time we could face
the situation of deleting this new controller from an older run since
they both shared the same name. Since controllers are never executed if
the their 'RunInterval' is set to 0 then don't need to remove from the
list of controllers.

Signed-off-by: André Martins <andre@cilium.io>
@aanm
Copy link
Member Author

aanm commented Dec 1, 2023

/test

@aanm aanm added this pull request to the merge queue Dec 1, 2023
Merged via the queue into cilium:main with commit d618a70 Dec 1, 2023
60 of 61 checks passed
@aanm aanm deleted the pr/fix-cni-add branch December 1, 2023 22:25
@github-actions github-actions bot added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Dec 6, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.14 in 1.14.5 Dec 6, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.14 to Backport done to v1.14 in 1.14.5 Dec 6, 2023
@julianwiedmann julianwiedmann removed the affects/v1.14 This issue affects v1.14 branch label Feb 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/v1.12 This issue affects v1.12 branch affects/v1.13 This issue affects v1.13 branch backport/author The backport will be carried out by the author of the PR. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers.
Projects
No open projects
Status: Released
Development

Successfully merging this pull request may close these issues.

None yet

7 participants