Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating new EndpointSliceProxying feature gate for kube-proxy, enabling EndpointSlice feature gate by default #86137

Merged
merged 2 commits into from Jan 18, 2020

Conversation

robscott
Copy link
Member

@robscott robscott commented Dec 10, 2019

What type of PR is this?
/kind feature

What this PR does / why we need it:
This creates a new EndpointSliceProxying feature gate to cover EndpointSlice consumption (kube-proxy) and allow the existing EndpointSlice feature gate to focus on EndpointSlice production only. Along with that addition, this enables the EndpointSlice feature gate by default, now only affecting the controller.

The rationale here is that it's really difficult to guarantee all EndpointSlices are created in a cluster upgrade process before kube-proxy attempts to consume them. Although masters are generally upgraded before nodes, and in most cases, the controller would have enough time to create EndpointSlices before a new node with kube-proxy spun up, there are plenty of edge cases where that might not be the case. The primary limitation on EndpointSlice creation is the API rate limit of 20QPS. In clusters with a lot of endpoints and/or with a lot of other API requests, it could be difficult to create all the EndpointSlices before a new node with kube-proxy targeting EndpointSlices spun up.

Separating this into 2 feature gates allows for a more gradual rollout with the EndpointSlice controller being enabled by default in 1.18, and EndpointSlices for kube-proxy being enabled by default in the next release.

Special notes for your reviewer:
I'm not sure if the new EndpointSliceProxying feature gate should be alpha or beta at this point. It will not be enabled by default, but it is being split out of a feature gate that is already beta.

Does this PR introduce a user-facing change?:

EndpointSlices will now be enabled by default. A new `EndpointSliceProxying` feature gate determines if kube-proxy will use EndpointSlices, this is disabled by default.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

/sig network
/priority important-longterm
/cc @freehan @liggitt @thockin

@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Dec 10, 2019
@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. sig/network Categorizes an issue or PR as relevant to SIG Network. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/ipvs sig/auth Categorizes an issue or PR as relevant to SIG Auth. labels Dec 10, 2019
@robscott
Copy link
Member Author

robscott commented Jan 8, 2020

The EndpointSlice controller has had all known bugs fixed, I think it's ready to be enabled by default now. This PR allows us to gradually enable EndpointSlices, starting with only the controller. Let me know if there's any reason not to enable this by default now.

/assign @thockin @liggitt

@aojea
Copy link
Member

aojea commented Jan 8, 2020

/test pull-kubernetes-e2e-kind-ipv6

@liggitt
Copy link
Member

liggitt commented Jan 8, 2020

The EndpointSlice controller has had all known bugs fixed, I think it's ready to be enabled by default now. This PR allows us to gradually enable EndpointSlices, starting with only the controller. Let me know if there's any reason not to enable this by default now.

/assign @thockin @liggitt

I haven't had a chance to review this specifically, but that matches my expectations

@robscott
Copy link
Member Author

/retest

2 similar comments
@robscott
Copy link
Member Author

/retest

@robscott
Copy link
Member Author

/retest

@@ -1051,6 +1051,13 @@ items:
- create
- patch
- update
- apiGroups:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these changes in the same commit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I've split them into two separate commits now, thanks!

This creates a new EndpointSliceProxying feature gate to cover EndpointSlice
consumption (kube-proxy) and allow the existing EndpointSlice feature gate to
focus on EndpointSlice production only. Along with that addition, this enables
the EndpointSlice feature gate by default, now only affecting the controller.

The rationale here is that it's really difficult to guarantee all EndpointSlices
are created in a cluster upgrade process before kube-proxy attempts to consume
them. Although masters are generally upgraded before nodes, and in most cases,
the controller would have enough time to create EndpointSlices before a new node
with kube-proxy spun up, there are plenty of edge cases where that might not be
the case. The primary limitation on EndpointSlice creation is the API rate limit
of 20QPS. In clusters with a lot of endpoints and/or with a lot of other API
requests, it could be difficult to create all the EndpointSlices before a new
node with kube-proxy targeting EndpointSlices spun up.

Separating this into 2 feature gates allows for a more gradual rollout with the
EndpointSlice controller being enabled by default in 1.18, and EndpointSlices
for kube-proxy being enabled by default in the next release.
This enables the EndpointSlice controller by default, but does not make
kube-proxy a consumer of the EndpointSlice API.
Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 18, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: robscott, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 18, 2020
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 967b40f into kubernetes:master Jan 18, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Jan 18, 2020
@robscott robscott deleted the endpointslice-proxy-feature branch January 24, 2020 18:07
@liggitt liggitt removed the sig/auth Categorizes an issue or PR as relevant to SIG Auth. label Feb 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/ipvs cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/network Categorizes an issue or PR as relevant to SIG Network. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants