Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nodeipam: align eTP=Cluster to kubernetes cloud-providers service lb #31406

Merged
merged 2 commits into from Apr 2, 2024

Conversation

MrFreezeex
Copy link
Member

@MrFreezeex MrFreezeex commented Mar 15, 2024

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!

This aligns Node IPAM eTP=Cluster with how kubernetes CCM handle service lb node selection meaning that it will consider all nodes instead of only the node where your selected pods are scheduled. As a result of this change nodeipam will be compatible to Cilium Ingress/GatewayAPI by default which use a dummy endpoint which would fail with the previous behavior. See the first commit description for more details about this.

It also add nodeipam.cilium.io/match-node-labels annotation now that by default select all nodes and that nodeipam might be less usable if this wasn't added on anything that is not a really small cluster.

Fixes: #31356

Change Node IPAM to select all nodes if externalTrafficPolicy=Cluster and add `nodeipam.cilium.io/match-node-labels` annotation

@MrFreezeex MrFreezeex requested review from a team as code owners March 15, 2024 00:50
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 15, 2024
@MrFreezeex MrFreezeex requested a review from qmonnet March 15, 2024 00:50
Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helm-wise this looks good to me, thanks! I'll defer the other bits to the corresponding CODEOWNERs.

@gandro gandro added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. feature/lb-ipam labels Mar 18, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 18, 2024
Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super familiar with node IPAM so I'll leave the review of the behaviour change to others. Doc change looks good overall, although I have minor improvement suggestions on grammar and style (please see inline below).

Thanks!

Documentation/network/node-ipam.rst Outdated Show resolved Hide resolved
Documentation/network/node-ipam.rst Outdated Show resolved Hide resolved
Documentation/network/node-ipam.rst Outdated Show resolved Hide resolved
Documentation/network/node-ipam.rst Outdated Show resolved Hide resolved
operator/pkg/nodeipam/node_predicates.go Outdated Show resolved Hide resolved
operator/pkg/nodeipam/node_predicates.go Outdated Show resolved Hide resolved
operator/pkg/nodeipam/nodesvclb.go Outdated Show resolved Hide resolved
operator/pkg/nodeipam/nodesvclb.go Outdated Show resolved Hide resolved
Documentation/network/node-ipam.rst Outdated Show resolved Hide resolved
@MrFreezeex MrFreezeex force-pushed the nodeipam-align-ccm branch 4 times, most recently from 0adfa10 to ef44b60 Compare March 18, 2024 19:14
@MrFreezeex MrFreezeex requested a review from qmonnet March 18, 2024 19:15
Copy link
Member

@pippolo84 pippolo84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! 💯

Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs look better, thanks! I still have a few suggestions on the new paragraph, but it's just trivial changes.

Documentation/network/node-ipam.rst Outdated Show resolved Hide resolved
Documentation/network/node-ipam.rst Outdated Show resolved Hide resolved
Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@qmonnet
Copy link
Member

qmonnet commented Mar 19, 2024

/test

@MrFreezeex
Copy link
Member Author

I rebased to fix the conflict on the helm values and related generated files ~

@MrFreezeex
Copy link
Member Author

/test

MrFreezeex and others added 2 commits April 2, 2024 13:28
nodeipam was always looking at the related EndpointSlices of the Service
LoadBalancer to decide which nodes should be "advertised". This is a
problem when Service LoadBalancers are created with dummy endpoints which
is the case for Cilium Ingress/GatewayAPI for instance.

This commit attempts to replicate a bit more the behavior that a CCM
would do to select nodes when eTP=Cluster. In that case we select all
Nodes we consider all nodes as potential candidate instead of
checking where the pods are scheduled via their EndpointSlices.

In the case of eTP=Local, we fallback to the previous behavior of
checking the EndpointSlice to know which Nodes are backing your
corresponding Service. This is not the behavior done in classic CCM
as eTP local seems to be typically implemented by Cloud providers via an
health check mechanism that we currently don't have in nodeipam. And at
this very moment is not planned to be implemented because of the extra
complexity. If this gets implemented at some point nodeipam could also
align with CCM on the eTP=Local case though.

Also in both eTP=Cluster/Local we will respect KEP-3458 that is becoming
stable in Kubernetes 1.30 and dictate how CCM does their first node
filtering. The Predicates were extracted as is from
kubernetes/cloud-provider repo where this is normally implemented for
CCM.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
Now that nodeipam consider all nodes as potential candidate in
the eTP=Cluster case, a way for user to filter nodes become way more
critical and thus this commit is implementing this.

Co-authored-by: Brendan Dalpe <bdalpe@gmail.com>
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
auto-merge was automatically disabled April 2, 2024 11:28

Head branch was pushed to by a user without write access

@MrFreezeex
Copy link
Member Author

Rebased again because the ipsec upgrade job was failing because of this: https://cilium.slack.com/archives/C7PE7V806/p1711615975498649?thread_ts=1711592330.743979&cid=C7PE7V806

@MrFreezeex
Copy link
Member Author

/test

@aanm aanm enabled auto-merge April 2, 2024 11:44
@joestringer joestringer added this pull request to the merge queue Apr 2, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 2, 2024
Merged via the queue into cilium:main with commit 11e7dbe Apr 2, 2024
62 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature/lb-ipam ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CFP: align Cilium nodeipam node selection with kubernetes cloud-providers
6 participants