Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

service cache should handle duplicate endpoints addresses #24202

Merged
merged 1 commit into from Mar 7, 2023

Conversation

aojea
Copy link
Contributor

@aojea aojea commented Mar 6, 2023

In some corner cases, a group of endpointslices may have the same address duplicate in different slices.

Since the Endpoints are cached by address, when aggregating the endpoints, we should merge the content instead of overwriting.

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Thanks for contributing!

Kubernetes e2e test now passes

e2e.test -kubeconfig /tmp/kindkind -ginkgo.v --ginkgo.focus="should.mirror.a.custom.Endpoint.with.multiple.subsets.and.same.IP.address"

Fixes: #22446

Solved an issue failing to forward traffic to Services if the Endpoint Slices had the same Address on different Slices

@aojea aojea requested a review from a team as a code owner March 6, 2023 22:07
@aojea aojea requested a review from christarazi March 6, 2023 22:07
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 6, 2023
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Mar 6, 2023
@aojea
Copy link
Contributor Author

aojea commented Mar 6, 2023

/assign @aanm @squeed

This is much easier to review, and we'll need for 1.11 if possible 🙏

Thanks

@christarazi christarazi added kind/bug This is a bug in the Cilium logic. sig/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers. sig/agent Cilium agent related. affects/v1.11 This issue affects v1.11 branch affects/v1.12 This issue affects v1.12 branch affects/v1.13 This issue affects v1.13 branch release-note/bug This PR fixes an issue in a previous release of Cilium. labels Mar 6, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 6, 2023
Copy link

@varunmar varunmar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely contained change. Thanks!

@borkmann borkmann requested a review from aditighag March 6, 2023 22:22
@borkmann
Copy link
Member

borkmann commented Mar 6, 2023

/test

Job 'Cilium-PR-K8s-1.25-kernel-4.19' failed:

Click to show.

Test Name

K8sDatapathVerifier Runs the kernel verifier against Cilium's BPF datapath

Failure Output

FAIL: terminating containers are not deleted after timeout

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.25-kernel-4.19 so I can create one.

Copy link
Member

@aditighag aditighag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes mostly look good! Thanks!

Kubernetes e2e test now passes

Can we add this test to the cilium k8s conformance test suite?

nit: Can you fix the release note?

Fix a bug where Cilium didn't handle correctly

*didn't correctly handle

Also, release notes are user facing, so they should highlight the issue from the user's perspective. I think your original issue title might be better as a release note "fail to forward traffic to custom Endpoint Slices with the same Address on different Slices".

pkg/k8s/endpoints.go Show resolved Hide resolved
pkg/k8s/endpoints.go Show resolved Hide resolved
@aojea
Copy link
Contributor Author

aojea commented Mar 7, 2023

Can we add this test to the cilium k8s conformance test suite?

I specifically added an e2e test in Kubernetes to verify this behavior, as this bug was reported internally to me and was only present in Cilium and I wanted to check nobody else has it. kubernetes/kubernetes#114155

I apologize but I don't have more time to add new conformance tests, however, I added a commit in #24174 to run this e2e kubernetes conformance tests, so it will be tested once that merges

I spend a lot of time in Kubernetes creating, maintaining and ensuring the kubernetes SIG-network tests covers and test all the known issues and features so the ecosystem is consistent with the behaviors we define in Kubernetes, cilium should run these e2e tests, it should have caught issues like this one or #24174

In some corner cases, a group of endpointslices may have the same
address duplicate in different slices.

Since the Endpoints are cached by address, when aggregating the
endpoints, we should merge the content instead of overwriting.

Signed-off-by: Antonio Ojea <aojea@google.com>
@squeed
Copy link
Contributor

squeed commented Mar 7, 2023

/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathConfig Host firewall With VXLAN and endpoint routes

Failure Output

FAIL: Failed to reach 10.0.1.244:80 from testclient-host-nmsvw

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Hit #15455

@squeed squeed added area/kube-proxy Issues related to kube-proxy (not the kube-proxy-free mode). and removed area/kube-proxy Issues related to kube-proxy (not the kube-proxy-free mode). labels Mar 7, 2023
@aanm aanm added the affects/v1.10 This issue affects v1.10 branch label Mar 7, 2023
@squeed
Copy link
Contributor

squeed commented Mar 7, 2023

/test-1.26-net-next

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 7, 2023
@YutaroHayakawa YutaroHayakawa merged commit 305af9e into cilium:master Mar 7, 2023
@aojea aojea mentioned this pull request Mar 23, 2023
29 tasks
aojea added a commit to aojea/community-1 that referenced this pull request May 17, 2023
Per the contributor ladder document https://github.com/cilium/community/blob/main/CONTRIBUTOR-LADDER.md#organization-member I'd like to request membership in the org for myself, since I plan to contribute regularly, mainly in the areas related to Kubernetes where I'm sig-network and sig-testing member and Kind maintainer.

I've already contributed adding more test coverage for the Kubernetes integration to the project
- cilium/cilium#25258
- cilium/cilium#24209
fixing some bugs:
- cilium/cilium#24202
- cilium/cilium#24174
and helping to triage issues

Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
pchaigno pushed a commit to cilium/community that referenced this pull request May 20, 2023
Per the contributor ladder document https://github.com/cilium/community/blob/main/CONTRIBUTOR-LADDER.md#organization-member I'd like to request membership in the org for myself, since I plan to contribute regularly, mainly in the areas related to Kubernetes where I'm sig-network and sig-testing member and Kind maintainer.

I've already contributed adding more test coverage for the Kubernetes integration to the project
- cilium/cilium#25258
- cilium/cilium#24209
fixing some bugs:
- cilium/cilium#24202
- cilium/cilium#24174
and helping to triage issues

Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/v1.10 This issue affects v1.10 branch affects/v1.11 This issue affects v1.11 branch affects/v1.12 This issue affects v1.12 branch affects/v1.13 This issue affects v1.13 branch kind/bug This is a bug in the Cilium logic. kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/agent Cilium agent related. sig/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers.
Projects
None yet
8 participants