-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update apiserver endpoint reconcilers to support dual-stack #117704
Update apiserver endpoint reconcilers to support dual-stack #117704
Conversation
Skipping CI for Draft Pull Request. |
b96bc83
to
9996147
Compare
/triage accepted |
to produce better outputs on errors
The unit tests were testing the behavior of the code when the name or namespace of the kubernetes service changed, but (a) this has never happened, (b) this probably won't ever happen, and (c) if it was going to happen, the behavior currently being asserted by the unit tests is clearly not the behavior we would actually want (it leaves a stale set of endpoints behind forever, rather than either deleting the old object or continuing to update both objects) so we would have to completely rewrite those tests at that point anyway. So simplify the unit tests by not testing that; just use default/kubernetes in all the tests.
Sort the test case data so the "setup" fields appear before the "expected result" fields, to make it easier to understand each case. Add descriptions to the tests that didn't have any. Swap the names of two test cases that did the opposite of what they claimed. Improve the names of the Endpoints and EndpointSlice objects used by the tests to make it clearer what some of them are, and so that the two basic endpoint sets (endpoints1/epSlice1 and endpoints2/epSlice2) are the same in both the Create test and the Update test.
They are both compile-time constants; previously the namespace was hardcoded in multiple places and the name was passed from the controller to the reconciler on every call. Simplify by just passing them to the adapter at construct time instead.
The unit tests were checking that TargetRef got copied from the Endpoints to the EndpointSlice, but neither of the reconcilers creates Endpoints with TargetRefs set (and they couldn't even if they wanted to, since the reconciler currently has no information about how/where the apiserver is running.)
They were only used by the unit tests, but even there they are now redundant with expectCreate/expectUpdate. Also remove an incorrect usage of RetryOnConflict in one of the reconcilers; we'll fix this generically later.
Make Endpoints updates work the same way as EndpointSlice updates do, where we just check if the Endpoints provided by the reconciler is (relevantly) different from the one on the server, and Update it if so (or Create it if it doesn't already exist). Merge the Create and Update unit tests together, and update for the fact that Sync will do nothing on a no-op update, and won't try to Create something that already exists or Update something that doesn't.
Rather than having EndpointsAdapter.Get return a v1.Endpoints object to the reconciler, and having the reconciler modify that and pass it back, and duplicating a bunch of code between the two reconcilers, have the adapter just return the list of apiserver IPs instead, and have the reconcilers work with just that, and move all the shared Endpoints comparing/fixing/creating/updating code into the adapter.
When reconciling endpoints, correctly handle the cases where we race with another apiserver on a Create, Update, or Delete.
Pass the service IP to the EndpointsAdapter at construct time so it can reliably filter the endpoint IPs and set the slice AddressType correctly.
Rather than generating the Endpoints from scratch and then generating the EndpointSlice from the Endpoints, generate both of them from scratch (preparing for the possibility of supporting EndpointSlice-only features, such as dual-stack).
In particular, if the reconciler finds itself in a cluster that has a dual-stack pair of EndpointSlices for default/kubernetes, then it should delete the extra slice, since it is probably stale data left behind by a rollback.
(This does not add user-facing dual-stack apiserver support; it just updates the backend so that the rest of the feature can be added later.)
9996147
to
02cd140
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: danwinship The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@danwinship: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close |
@k8s-triage-robot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What type of PR is this?
/kind feature
/kind bug
What this PR does / why we need it:
This makes the apiserver endpoint reconcilers support dual-stack. It does not add dual-stack support to anything outside of
pkg/controlplane/reconcilers
; this is just step 1 to making the apiservers dual-stack.In fact, most of this PR is just rewriting the reconcilers in preparation for supporting dual-stack. In current git, the way that the reconcilers work is:
v1.Endpoints
object with the correct IPs and portsThat won't work for dual-stack, because the
v1.Endpoints
object can't represent dual-stack endpoints. So the PR changes the adapter/reconciler interface so that now the reconcilers only work with IPs and ports, and the adapter takes the IPs/ports and creates/updates an appropriate set of Endpoints and one or more EndpointSlices (simplifying the code and fixing a handful of bugs in the process).Which issue(s) this PR fixes:
Fixes #101070
(and does more on top of that)
Special notes for your reviewer:
The first 4 commits (up to "
Misc endpointsadapter_test.go improvements
") are just improvements/cleanups to the unit tests.Then the next 4 (up to "
Remove unused EndpointsAdapter Create/Update return values
") are minor API improvements and code cleanups.The next 2 ("
Merge Create/Update/EnsureEndpointSliceFromEndpoints together
" and "Refactor duplicated EndpointReconciler code into EndpointsAdapter
") are the core of the PR, changing the adapter/reconciler interface as described above so that most of the work is done in the adapter and the reconcilers only work with IPs/ports. (With the following commit, "Use retry.OnError in reconciler Sync
" as a fixup to the new API.)Then the next 2 ("
Make the endpoint reconciler explicitly aware of the primary IP family
" and "Generate both Endpoints and EndpointSlices from scratch
") prepare it to deal with multiple EndpointSlices rather than only one, and "Make the reconciler correctly handle unexpected apiserver EndpointSlices
" actually implements that, fixing #101070 and in particular fixing the "downgrading from dual-stack post-KEP-2438 apiservers to pre-KEP-2438 apiservers" problem.Then finally the last commit actually adds dual-stack support. FWIW, we could omit that commit for now and just merge everything else, since it doesn't do anything externally-visible yet, and maybe the way I changed the API (in terms of separate primary/secondary IP args vs a single array arg) will turn out to not work great with the eventual apiserver changes.
However, it definitely makes sense to commit everything up to "
Make the reconciler correctly handle unexpected apiserver EndpointSlices
", since that is something we want before having proper dual-stack support, for rollback support.Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
/sig network
/sig api-machinery
/assign @aojea @robscott