-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updating EndpointSlice KEP to include EndpointsMirroring and plans for 1.19 and beyond #1713
Updating EndpointSlice KEP to include EndpointsMirroring and plans for 1.19 and beyond #1713
Conversation
* Kubernetes 1.19: If the number of endpoints in one Endpoints object exceeds | ||
1000, generated a warning event on the Endpoints resource. | ||
* Kubernetes 1.20: Limit the number of endpoints in one Endpoints object to | ||
1000 and generate warning events on Endpoints resource if additional endpoints |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how was this limit selected? Does this map to a specific problematic quantity or threshold? How frequently would events be created to indicate this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great question, @freehan probably has more background on this. I'm not tied to this in any way, it's something from the original KEP that I've clarified here, but I'm happy to push the timeline on this and/or change the parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scalability team has a recommended threshold based on testing. For endpoint, the threshold is 250. Above 250 endpoints with many services and nodes, API server starts to get stressed.
In reality, if a cluster has one or few service with 1000+ endpoints that rarely changes, it does not hurt that much. Hence, picking the threshold and warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
putting an Event creation in the Endpoint controller sync path could be impactful depending on how often this was written. was an annotation on the Endpoints object considered instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had originally thought that an event would be more visible, especially if it was a warning. I'd imagine that watching a cluster for warning events is reasonably common. With that said, I'd already thought we'd have to have some kind of rate-limiting mechanism here to only publish an event like this intermittently for each Endpoints resource.
As I've thought about it more, I like your annotation idea. It seems like a warning event might actually be too noisy for something that users can't actually do anything about (other than disabling the Endpoints controller or having smaller Services, neither of which we want).
To go one step further, would a label be appropriate here? I think it could be useful to easily select all Endpoints in a cluster that are currently over-capacity to understand the impact of upgrading to the release that will truncate those Endpoints resources. Maybe a endpoints.kubernetes.io/over-capacity: "true"
label? This could persist after the release where Endpoints start to get truncated as a helpful indication of the effect of this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed an update with this approach. Added a slight tweak to it that may not make sense. In 1.20 endpoints.kubernetes.io/over-capacity
will be set to warning
and in 1.21 it will be set to truncated
.
c330cc4
to
062d0a1
Compare
Also, btw: thanks for adding the Windows plans to the KEP! |
062d0a1
to
a6f88a5
Compare
a6f88a5
to
ebd5114
Compare
### EndpointSliceMirroring Controller | ||
|
||
In some cases, custom Endpoints resources are created by applications. To ensure | ||
that these applications will not need to immediately update their |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
immediately
This makes it sound like they will eventually have to switch their implementations. I would expect this controller will be effective as long as the v1 Endpoints API is supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks! Updated to:
To ensure that these applications will not need to concurrently write to both Endpoints and EndpointSlice resources, a new EndpointSliceMirroring controller will...
EndpointSliceMirroring controller will be used to mirror custom Endpoints | ||
resources to corresponding EndpointSlices. | ||
|
||
* A new `endpointslicemirroring.kubernetes.io/skip` label will be introduced. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arbitrary label names! bikeshed time!
I have a slight preference for the prefix being a resource, like endpointslice.kubernetes.io/...
, and making the part after the slash describe what the label is opting out of. Other label/annotation keys that follow a similar pattern:
- statefulset.kubernetes.io/pod-name
- cronjob.kubernetes.io/instantiate
- endpointslice.kubernetes.io/managed-by
To opt out, are you wanting a true value, a false value, or any value (I have a slight preference for any value, which lets us set to an empty string)?
-
true value means we need a negative sounding label name:
endpointslice.kubernetes.io/skip-mirror=true
endpointslice.kubernetes.io/no-mirror=true
-
false value means we need a positive sounding label name:
endpointslice.kubernetes.io/mirror=false
-
any value means we need a negative sounding label name but don't have the mental overhead of a literal true value (mere presence with any value means we skip)
endpointslice.kubernetes.io/skip-mirror
endpointslice.kubernetes.io/no-mirror
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my preference would be for the label value to be meaningful. I think it's confusing for endpointslice.kubernetes.io/skip-mirror=false
to result in a "true" interpretation. Admittedly we'd have a similar problem with endpointslice.kubernetes.io/skip-mirror=yes
, but that seems at least slightly less likely. I'd originally thought of this as a controller specific annotation, but because it's something that will be part of the EndpointSlice API, I agree with the endpointslice
prefix here. So maybe endpointslice.kubernetes.io/skip-mirror=true
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated PR to include this approach.
…s for 1.19 and beyond The most notable addition here is a new concept of EndpointSliceMirroring. This will ensure that custom Endpoints are mirrored to EndpointSlices automatically. Beyond this addition, this clarifies the roll out plan and updates anything that has become outdated.
ebd5114
to
1aa15cb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM - Will approve and we can iterate nits
/approve
/lgtm
|
||
* Kubernetes 1.20: If the number of endpoints in one Endpoints object exceeds | ||
1000, set `endpoints.kubernetes.io/over-capacity` label to "warning". | ||
* Kubernetes 1.21: Limit the number of endpoints in one Endpoints object to 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favor of doing this, but that seems relatively short. Any reason not to drag this out a bit? E.g. 22 or 23?
and EndpointSlice resources, a new EndpointSliceMirroring controller will be | ||
used to mirror custom Endpoints resources to corresponding EndpointSlices. | ||
|
||
* A new `endpointslice.kubernetes.io/skip-mirror` label will be introduced. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we do this without a label?
If not, it's a bit weird to have an endpointslice.kubernetes.io
label on an Endpoints. Maybe endpoints.kubernetes.io/disable-endpoint-slice-mirror
? Minor point
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: robscott, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The most notable addition here is a new concept of EndpointsMirroring. This will ensure that custom Endpoints are mirrored to EndpointSlices automatically. Beyond this addition, this clarifies the roll out plan and updates anything that has become outdated.
/sig network
/cc @liggitt @freehan @thockin