Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bug where cilium host IP is not read from k8s node annotations #27590

Merged
merged 1 commit into from Aug 23, 2023

Conversation

hemanthmalla
Copy link
Member

@hemanthmalla hemanthmalla commented Aug 18, 2023

After 0696874 refactored the logic to read annotations and change the default behavior, addrs array was never assigned to newNode.IPAddresses after it was populated.

Fixes: #27439 partly, there seem to be a race between when n.localNode.IPAddresses is populated and when the first ciliumnode(CN) update is made after an agent restart. In our fork we temporarily added a behavior to retain host IP from ciliumnode when localNode hasn't yet been populated to avoid the double writes. This isn't perfect because in the rare event where cilium_host's router IP has been removed, it would not be updated on the CN.

Open Questions :

  • The daemon initialization process already seems to have strict requirements in what order restoration should happen. Can the order be improved so that we don't update CN before localNode.IPAddresses are populated ?

Thanks to @jaredledvina for identifying the duplicate writes

@hemanthmalla hemanthmalla requested a review from a team as a code owner August 18, 2023 13:58
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Aug 18, 2023
@hemanthmalla
Copy link
Member Author

cc @christarazi / @aanm

@joestringer joestringer added release-note/bug This PR fixes an issue in a previous release of Cilium. affects/v1.12 This issue affects v1.12 branch affects/v1.13 This issue affects v1.13 branch affects/v1.14 This issue affects v1.14 branch needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Aug 18, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Aug 18, 2023
@christarazi
Copy link
Member

The daemon initialization process already seems to have strict requirements in what order restoration should happen. Can the order be improved so that we don't update CN before localNode.IPAddresses are populated ?

I think this sound reasonable to me. I am a bit surprised that's not the case today actually.

Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find!

@aanm aanm self-requested a review August 18, 2023 21:52
@aanm aanm added needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch kind/regression This functionality worked fine before, but was broken in a newer release of Cilium. labels Aug 21, 2023
Copy link
Member

@aanm aanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Thank you

@aanm
Copy link
Member

aanm commented Aug 21, 2023

/test

After
cilium@0696874
refactored the logic to read annotations and change the default behavior,
addrs array was never assigned to newNode.IPAddresses after it was
populated.

Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com>
@hemanthmalla
Copy link
Member Author

/test

@joestringer joestringer merged commit c0a5fbe into cilium:main Aug 23, 2023
60 checks passed
@tklauser tklauser mentioned this pull request Aug 23, 2023
8 tasks
@tklauser tklauser added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. and removed needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels Aug 23, 2023
@tklauser tklauser mentioned this pull request Aug 24, 2023
9 tasks
@tklauser tklauser added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Aug 24, 2023
@joestringer joestringer added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Aug 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/v1.12 This issue affects v1.12 branch affects/v1.13 This issue affects v1.13 branch affects/v1.14 This issue affects v1.14 branch backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. kind/regression This functionality worked fine before, but was broken in a newer release of Cilium. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CiliumInternalIP removed & re-added on cilium-agent restart
5 participants