Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clustermesh: Add EndpointSlice support for API server #20697

Merged

Conversation

YutaroHayakawa
Copy link
Member

Please see the commit message for details. I think we need to backport this to 1.10 and 1.11 because it meets the backport criteria "Major bugfixes relevant to the correct operation of Cilium" because, without this change, global service only contains local endpoints for secondary IP family when users configure dual-stack service and their k8s support EndpointSlice which is an incorrect behavior I think.

Add EndpointSlice support for clustermesh-apiserver

Currently, clustermesh-apiserver doesn't support synchronizing
EndpointSlice to kvstore. This happens because clustermesh-apiserver
always set DaemonConfig.K8sEnableEndpointSlice option to false while
calling DaemonConfig.Populate(). As a result, clustermesh-apiserver
always fallbacks to the Endpoints.

The problem of this is we cannot support dual-stack global service since
Endpoints only contains the backends for primary IP address family
(Service.spec.ipFamilies[0]). Thus, clustermesh-apiserver only
synchronizes backends with single IP address family.

Also, when cilium-agents are running with EndpoitSlice and
clustermesh-apiservers are running with Endpoints, it will make a state
that global services only contain local endpoints for secondary IP
address family which is unexpected from users' perspective.

To fix that, we expose a new command line
configuration knob --enable-k8s-endpointslice=<bool> for
clustermesh-apiserver and populate its value from cilium-config
ConfigMap. So that cilium-agent and clustermesh-apiserver always use the
same endpoint type.

Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
@YutaroHayakawa YutaroHayakawa added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. needs-backport/1.10 labels Jul 28, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.11.8 Jul 28, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.14 Jul 28, 2022
@YutaroHayakawa YutaroHayakawa marked this pull request as ready for review July 28, 2022 15:20
@YutaroHayakawa YutaroHayakawa requested review from a team as code owners July 28, 2022 15:20
Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 💯

@YutaroHayakawa
Copy link
Member Author

YutaroHayakawa commented Jul 29, 2022

/test

Job 'Cilium-PR-K8s-1.23-kernel-4.19' failed:

Click to show.

Test Name

K8sVerifier Runs the kernel verifier against Cilium's BPF datapath

Failure Output

FAIL: terminating containers are not deleted after timeout

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.23-kernel-4.19 so I can create one.

Job 'Cilium-PR-K8s-1.16-kernel-4.9' failed:

Click to show.

Test Name

K8sVerifier Runs the kernel verifier against Cilium's BPF datapath

Failure Output

FAIL: Failed to load BPF program bpf_host with datapath configuration:

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.16-kernel-4.9 so I can create one.

@aanm aanm merged commit 5b2b93f into cilium:master Jul 29, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.10.14 Aug 9, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.11.8 Aug 9, 2022
@tklauser tklauser added the backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. label Aug 11, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.11 in 1.11.8 Aug 11, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.10 in 1.10.14 Aug 11, 2022
@tklauser tklauser added backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. and removed backport-pending/1.12 labels Aug 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.10.14
Backport done to v1.10
1.11.8
Backport done to v1.11
Development

Successfully merging this pull request may close these issues.

None yet

5 participants