Fix endpoint slice filtering to ensure we get all the necessary objects #25351
pkg/k8s/watchers/watcher.go
Outdated
@@ -567,7 +572,7 @@ func (k *K8sWatcher) enableK8sWatchers(ctx context.Context, resourceNames []stri
 	case resources.K8sAPIGroupEndpointSliceV1Discovery:
 		// no-op; handled in resources.K8sAPIGroupEndpointV1Core.
 	case resources.K8sAPIGroupEndpointV1Core:
-		k.initEndpointsOrSlices(k.clientset.Slim(), serviceOptModifier)
+		k.initEndpointsOrSlices(k.clientset.Slim(), serviceAndEndpointOptModifier, endpointSliceOptModifier)
Not sure if we should plumb it in here or just run GetEndpointSliceListOptionsModifier where we need it.
Fine here. The PR #23977 removes this from here anyway. I'll rebase and incorporate these changes once this PR lands.
/test |
8a6396e
to
124a014
Compare
@odinuge thank you for your contribution! I think this would be useful to land in 1.14, but we can hold off until after the release has been cut and treat this as a backport. The tree is currently considered frozen, so now would be a good time to rebase. 😄 |
/test |
@odinuge, this PR is suffering from two tests that are stuck. This can be fixed by rebasing this PR again |
@odinuge do you have time to rebase this? I realize all the box-checking is annoying, and we really appreciate the contribution. I'm sorry it takes so much clicking to get stuff in 😞. |
Yes, I'll rebase when I'm back at my computer at work later today. I've been OOTO for a few weeks (it was also in my gh status until yesterday), so I have not been able to rebase during that time. |
124a014
to
e6c96a3
Compare
Rebased now, though. Some earlier changes completely changed how this works, so this won't be possible to backport now 😞 It does, however, look like #23977 incorrectly changed how this works, adding this filtering with the wrong arguments... I'll debug some more, but that feels like a big release blocker if that's the case.
e6c96a3
to
385f75c
Compare
Yes, so for endpointslices, labels are mirrored from the service to the endpointSlice only on k8s v1.20+ (kubernetes/kubernetes#94443). However, for endpoints, this has been done for significantly longer (https://github.com/kubernetes/kubernetes/blame/v1.27.0/pkg/controller/endpoint/endpoints_controller.go#L461-L479). And as I understand it, cilium still wants to support k8s versions older than v1.20, right? (assuming I read the current code right) And to be clear: this PR should mitigate this, and k8s pre-v1.20 should start working again. Then we can consider adding the proxy-name filtering to endpointslices when it's safe. Should probably add some more docs and a TODO about that.
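To make the version difference above concrete, here is a minimal sketch of why an equality selector on the service proxy label drops endpoint slices created by a pre-v1.20 controller. It is an illustration only, not Cilium's or Kubernetes' code: the helper `matchesEquals`, the service name `my-svc`, and the proxy name `my-proxy` are hypothetical, while the two label keys are the well-known Kubernetes ones.

```go
package main

import "fmt"

// Well-known Kubernetes label keys.
const (
	proxyNameLabel   = "service.kubernetes.io/service-proxy-name"
	serviceNameLabel = "kubernetes.io/service-name"
)

// matchesEquals is a hypothetical helper that mimics an equality-based
// label selector of the form "key=value".
func matchesEquals(labels map[string]string, key, value string) bool {
	got, ok := labels[key]
	return ok && got == value
}

func main() {
	// The service carries the proxy-name label (value is made up).
	svc := map[string]string{proxyNameLabel: "my-proxy"}

	// v1.20+: service labels are mirrored onto the endpoint slice.
	mirroredSlice := map[string]string{
		serviceNameLabel: "my-svc",
		proxyNameLabel:   "my-proxy",
	}

	// Pre-v1.20: the slice only names its service; nothing is mirrored.
	legacySlice := map[string]string{serviceNameLabel: "my-svc"}

	fmt.Println("service matches:       ", matchesEquals(svc, proxyNameLabel, "my-proxy"))
	fmt.Println("mirrored slice matches:", matchesEquals(mirroredSlice, proxyNameLabel, "my-proxy"))
	fmt.Println("legacy slice matches:  ", matchesEquals(legacySlice, proxyNameLabel, "my-proxy"))
}
```

With the same selector applied to every resource type, the service is kept but the legacy slice is filtered out, so the service ends up without backends. That is the case the PR avoids by not filtering endpoint slices on the proxy label.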
On that note, i'd be curious about the expected behavior of cilium with |
/test |
Likewise, it seems like the PR description is no longer accurate; by my reading of the code, we also dropped headless endpointslices before. The bug fix now is to make service name work on <v1.20. Right? |
I pushed a new commit yesterday with
Yee. I could also update the commit messages (I did add an For the changelog I think it's fine/useful to state the new filtering, given that the previous PR that changed the filtering was only a refactor that I assume wasn't intentionally changing the filtering.
Oooh, or did |
/test |
bingo. but this should be an easy "backport". |
Please do update the commit messages (and maybe squash it to one?). I realize this PR has been a bit of a saga, but the git way is to forget the journey :-). We'll get this in soon. I appreciate the fix. |
3c85df3
to
6726db0
Compare
Thanks! Squashed it and updated the commit msg now 😄 Need some more time to wrap my head around things, but I think I got it right now. Mind sanity checking the last statement to make sure I/we all see this the same way?
That is true, right? Since if the service is missing, this will ensure we only add it to that internal cache, and nothing else will use it? Or? https://github.com/cilium/cilium/blob/v1.14.0-snapshot.0/pkg/k8s/service_cache.go#L284-L286 The code base has changed sooo much since I went OOTO a few weeks ago... 😅
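The behavior being asked about can be sketched roughly as follows. This is a simplified stand-in, not the code in `service_cache.go`: the types `serviceID` and `cache` and the method `upsertEndpoints` are hypothetical, and the sketch assumes that endpoints are always stored but only become usable once a service with the same ID is known.

```go
package main

import "fmt"

// serviceID is a hypothetical key identifying a service.
type serviceID struct {
	Namespace string
	Name      string
}

// cache is a hypothetical, much simplified service cache.
type cache struct {
	services  map[serviceID]struct{}
	endpoints map[serviceID][]string // backend addresses per service
}

func newCache() *cache {
	return &cache{
		services:  map[serviceID]struct{}{},
		endpoints: map[serviceID][]string{},
	}
}

// upsertEndpoints always stores the backends, but reports whether they can
// be used, which is only the case when the matching service is known.
func (c *cache) upsertEndpoints(id serviceID, backends []string) bool {
	c.endpoints[id] = backends
	_, known := c.services[id]
	return known
}

func main() {
	c := newCache()
	id := serviceID{Namespace: "default", Name: "filtered-svc"}

	// The service was filtered out by its label selector, so the slice's
	// backends sit in the cache without being used.
	fmt.Println("usable without service:", c.upsertEndpoints(id, []string{"10.0.0.1"}))

	// Once the service is present, the same backends become usable.
	c.services[id] = struct{}{}
	fmt.Println("usable with service:   ", c.upsertEndpoints(id, []string{"10.0.0.1"}))
}
```

Under that assumption, an unfiltered endpoint slice for a filtered service costs some memory in the cache but has no effect on the datapath.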
This fixes the filtering of endpoint slices to ensure that we support all the k8s versions we intend to. This ensures that we always filter out endpoint slices with the well-known "headless" label, and _do not_ filter out any endpoint slices based on the service proxy label.

In pre Kubernetes v1.20, the labels on a service were not mirrored into the labels of the endpoint slice. The headless label was not applied. See PR 94443 in kubernetes/kubernetes for more info.

When no longer supporting Kubernetes v1.20, we can remove this custom logic - and use the same label filter for endpoints, services and endpoint slices.

Historically, we had no filters on the endpoint slice objects, but with the two referred commits, the same filter we had for endpoints and services was introduced to endpoint slices as part of the refactor. The reason we don't revert the behavior directly, is that we _do want_ to filter out endpoint slices for headless services, like we do with normal endpoints.

For completeness; the end user behavior will now be equal for both endpoints and endpoint slices; since we will always filter the services in the same way, and when we get an endpoint slice without a corresponding service in state, we effectively ignore that endpoint slice.

Fixes: ca3a4df ("k8s: Add Resource[*Endpoints] to shared resources")
Fixes: 82a728a ("agent, operator, clustermesh-apiserver: use Resource[*Endpoints]")

Signed-off-by: Odin Ugedal <ougedal@palantir.com>
Signed-off-by: Odin Ugedal <odin@uged.al>
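A rough sketch of the two selectors this commit message describes, expressed as Kubernetes label-selector strings. The function names and the `proxyName` parameter are hypothetical and are not the modifiers used in the PR. The sketch also assumes a single combined selector for services and endpoints, as the `serviceAndEndpointOptModifier` name in the diff suggests.

```go
package main

import (
	"fmt"
	"strings"
)

// Well-known Kubernetes label keys.
const (
	headlessLabel  = "service.kubernetes.io/headless"
	proxyNameLabel = "service.kubernetes.io/service-proxy-name"
)

// serviceAndEndpointSelector (hypothetical) excludes headless objects and
// filters on the service proxy label. An empty proxyName means "only objects
// without the label"; otherwise only objects with that exact value match.
func serviceAndEndpointSelector(proxyName string) string {
	proxyReq := "!" + proxyNameLabel
	if proxyName != "" {
		proxyReq = proxyNameLabel + "=" + proxyName
	}
	return strings.Join([]string{"!" + headlessLabel, proxyReq}, ",")
}

// endpointSliceSelector (hypothetical) only excludes headless slices. It
// deliberately ignores the proxy label, which is not mirrored onto endpoint
// slices before Kubernetes v1.20.
func endpointSliceSelector() string {
	return "!" + headlessLabel
}

func main() {
	fmt.Println(serviceAndEndpointSelector(""))
	fmt.Println(serviceAndEndpointSelector("my-proxy"))
	fmt.Println(endpointSliceSelector())
}
```

Strings of this form are what a list or watch call would carry in its label selector field. Once mirrored labels can be relied on, the endpoint slice selector could be replaced by the combined one.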
6726db0
to
2c5d742
Compare
I would mention that this fixes the specific use case of using services with |
/test |
Sounds like when Cilium v1.16 is out then we can drop the changes in this PR? That's when K8s 1.20 support would be dropped. |
Missed this before the approvals. Still want me to add this to changelog and/or commit messages? Feels like things are pretty overloaded with info already tbh., but happy to add if you still want |