KEP 2369: Kubelet Sizing Providers #2370
| ## Alternatives | ||
| 1. add a built-in sizing provider in-tree. | ||
Why not add an HTTP endpoint with a standardized schema that one can POST the desired values to, with the appropriate authn and authz?
Please make sure you add a consideration and discussion of this to the Alternatives section. Right now it has one bullet about an in-tree implementation, but doesn't consider pros and cons. Pros and cons of all alternative solutions should be outlined.
Thanks @ehashman
| ## Summary | ||
| Kubelet should have a sizing provider mechanism that lets it dynamically fetch sizing values for memory and CPU reservations. |
The title of the KEP is a little misleading. When I first saw it, I thought it was about dynamically changing node capacity rather than node allocatable. I have requested both in the past. ;-)
@dchen1107 Thanks for your comment. I will be happy to change the title if that helps avoid the confusion. Please let me know what you would suggest.
| ### Non-Goals | ||
| * For now, the plugin mechanism proposed here is only for fetching the values of `system reserved` and `kube reserved`. A similar approach could be taken to dynamically fetch the values of other kubelet parameters (e.g. `evictionHard`), but those are out of scope for this KEP. |
It would be an interesting follow-on exercise to enumerate the kubelet configuration values that reasonably vary by instance type. Implicit in this proposal is a thesis I agree with: merging the existing file-based kubelet configuration with instance-type-specific overrides is cumbersome in many practical environments compared with exec-ing out to some other local intelligence.
Thanks @derekwaynecarr for your comment.
c04d99f to
aa26351
Compare
|
/assign @deads2k @derekwaynecarr |
aa26351 to
6467340
Compare
|
I'm aware this KEP is in-review and hasn't been merged yet. But I just wanted to highlight that the KEP number (in the title of this PR, the |
6467340 to
bb7fb25
Compare
|
Thanks @JamesLaverack for pointing that out. I have updated all the references to |
bb7fb25 to
ab1414e
Compare
|
Not sure why |
|
I think your table of contents is out of date. The PRR section in the ToC has a few extra headings that are not in the text. |
ab1414e to
9919dcb
Compare
Yeah, that was the issue. @ehashman thanks for the hint to run |
| TBD for beta. | ||
| * **What specific metrics should inform a rollback?** |
This isn't strictly required for alpha, but as I recall, values that are too low can lead to node readiness or responsiveness problems. It may be a good idea to indicate how to tell whether the current size is well suited to the current usage.
Thanks @deads2k for your comment.
|
approving prr for alpha. /approve |
|
/assign @dchen1107 |
| #### Alpha -> Beta Graduation | ||
| * integration or e2e tests. | ||
| * at least one working plugin implementation. |
Two metrics worth tracking for exec plugins were defined as part of the kubelet credential provider plugin.
Similar metrics would apply here.
There is a slight difference between the credential provider plugin and this node sizing provider plugin.
The credential provider plugin is executed throughout the lifetime of the kubelet, whenever there is a request to provision a pod with a matching image name. So it makes sense for it to have `kubelet_credential_provider_plugin_errors`, which tracks the number of errors that occurred while invoking an exec plugin, and `kubelet_credential_provider_plugin_duration`, which tracks how long plugin executions take.
The node sizing provider plugin, on the other hand, is executed only during kubelet startup. Unlike the credential provider plugin, it is not invoked repeatedly while the kubelet is running. If a node sizing provider plugin is enabled and it fails to fetch the sizing values, the kubelet will abort startup and log the error in the kubelet logs. Since the user has explicitly requested that the sizing values be fetched using the exec plugin, the kubelet should not fall back to any other values. Falling back to other values and bringing up the kubelet anyway may result in unpredictable behavior, such as node lock-ups, because the default sizing values may not be optimal.
The node sizing provider plugin either succeeds or fails during kubelet startup. Since it does not generate time series data the way the credential provider plugin does, I have deliberately dropped the Metrics section from this KEP.
But in case you still feel that there is a scope for adding any other metrics, please let me know.
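For readers following the thread, the fail-fast behavior described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the plugin's output format is not shown in this conversation, so the JSON shape with `systemReserved` and `kubeReserved` keys, the function names, and the timeout are all hypothetical and are not the KEP's actual API.

```python
import json
import subprocess
import sys


class SizingProviderError(RuntimeError):
    """Raised when the sizing plugin cannot supply usable values."""


def fetch_sizing_values(plugin_cmd, timeout_seconds=10.0):
    """Run the exec plugin once and return its parsed output.

    ASSUMPTION: the plugin prints a JSON object containing the keys
    "systemReserved" and "kubeReserved". This schema is hypothetical.
    """
    try:
        result = subprocess.run(
            plugin_cmd,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            check=True,  # a non-zero exit status counts as a failure
        )
    except (OSError, subprocess.SubprocessError) as exc:
        raise SizingProviderError(f"sizing plugin failed: {exc}") from exc

    try:
        values = json.loads(result.stdout)
    except json.JSONDecodeError as exc:
        raise SizingProviderError(f"sizing plugin returned invalid JSON: {exc}") from exc

    if not isinstance(values, dict):
        raise SizingProviderError("sizing plugin output is not a JSON object")
    for key in ("systemReserved", "kubeReserved"):
        if key not in values:
            raise SizingProviderError(f"sizing plugin output is missing {key!r}")
    return values


def startup(plugin_cmd):
    """Return 0 on success, 1 if startup must be aborted.

    There is deliberately no fallback to default values on failure.
    """
    try:
        values = fetch_sizing_values(plugin_cmd)
    except SizingProviderError as exc:
        print(f"aborting startup: {exc}", file=sys.stderr)
        return 1
    print(f"using reservations: {values}")
    return 0
```

The point of the sketch is the single invocation and the absence of a default path: any failure surfaces as an error and a non-zero return rather than a silently chosen reservation.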
|
I am supportive of the idea of delegating the computation of these fields to a plugin that can apply its own local formula. As this moves from alpha to beta, I suspect we may want to clarify whether the pods-per-core heuristic is provided as an argument to this plugin or derived from it, since the desired pod density is something that would skew reservations. I think that can be worked out in a pre-beta evaluation of the feature. I defer to @dchen1107 for further comment. |
9919dcb to
f432371
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: deads2k, harche The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
d45d31a to
edaf034
Compare
Signed-off-by: Harshal Patil <harpatil@redhat.com>
edaf034 to
43c0826
Compare
|
Another potential idea was to allow kubelet |
| ### Test Plan | ||
| Alpha: | ||
| * Unit tests for the exec plugin provider | ||
| * Unit tests for API validation |
Is there any more detail here?
ravisantoshgudimetla
left a comment
Overall this LGTM. Thank you for working on this @harche.
I think it will be beneficial from a Windows node standpoint too, considering that we provide static recommendations.
| ### Metrics | ||
| N/A |
I think one useful metric is startup time. Even though it is not a time series, it would be good to know how long the kubelet took to start up with a plugin.
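As a rough illustration of the suggestion above, a one-shot measurement of the plugin's contribution to startup time might look like this. The function names are hypothetical, and how the value would be exposed (log line, metric, or otherwise) is left open because the thread does not settle it.

```python
import subprocess
import sys
import time


def timed_plugin_run(plugin_cmd, timeout_seconds=10.0):
    """Run the plugin once and return (exit_code, elapsed_seconds).

    A monotonic clock is used so the measurement is not affected by
    wall-clock adjustments during startup.
    """
    start = time.monotonic()
    result = subprocess.run(
        plugin_cmd,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
    elapsed = time.monotonic() - start
    return result.returncode, elapsed


def report(plugin_cmd):
    """Log the single startup-time observation to stderr."""
    exit_code, elapsed = timed_plugin_run(plugin_cmd)
    print(f"sizing plugin exited with {exit_code} after {elapsed:.3f}s", file=sys.stderr)
    return exit_code, elapsed
```

Because the plugin runs only once per kubelet start, this yields a single observation per startup rather than a series.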
| ## Alternatives | ||
| ### Add a built-in sizing provider in-tree. | ||
| #### Pros |
Another alternative is what @derekwaynecarr suggested. We implemented a plugin-based mechanism in the scheduler using both in-tree and HTTP implementations. Using a metadata server is more in line with what cloud providers like AWS do. Perhaps we can add this as a supported implementation when we graduate to beta?
|
@ravisantoshgudimetla just updating from conversation in sig-node, but @dchen1107 was against a sparse type of plugin model but appeared more open to supporting --config as a URL per my comment above on slack. |
+1 that this would be beneficial for Windows. |
|
I believe we decided not to pursue this. /close |
|
@ehashman: Closed this PR. In response to this:
|
Do we have an alternative for this feature? |
|
This can be calculated by code responsible for provisioning a cluster/setting up a node with kubelet rather than delegating it to the kubelet itself per this proposal. |
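To make that alternative concrete, here is a minimal sketch of provisioning-side code that computes a memory reservation from the node's size and emits a kubelet configuration fragment before the kubelet starts. The tier boundaries and fractions are purely illustrative placeholders, not recommended values, and the function names are hypothetical. `systemReserved` is an existing `KubeletConfiguration` field.

```python
import json

MIB = 1024 * 1024
GIB = 1024 * MIB

# HYPOTHETICAL tiers: (upper bound of the tier in bytes, fraction reserved).
# These numbers only illustrate the idea of a size-dependent formula.
MEMORY_TIERS = [
    (4 * GIB, 0.25),
    (8 * GIB, 0.20),
    (16 * GIB, 0.10),
    (float("inf"), 0.05),
]


def reserved_memory_bytes(total_bytes):
    """Sum the reserved share of each tier that the node's memory spans."""
    reserved = 0.0
    lower = 0
    for upper, fraction in MEMORY_TIERS:
        if total_bytes <= lower:
            break
        span = min(total_bytes, upper) - lower
        reserved += span * fraction
        lower = upper
    return int(reserved)


def kubelet_config_fragment(total_memory_bytes):
    """Build the fragment that provisioning code would merge into the kubelet config."""
    reserved_mib = reserved_memory_bytes(total_memory_bytes) // MIB
    return {"systemReserved": {"memory": f"{reserved_mib}Mi"}}


def render(total_memory_bytes):
    """Serialize the fragment as JSON (which is also valid YAML)."""
    return json.dumps(kubelet_config_fragment(total_memory_bytes), indent=2)
```

With this approach the kubelet itself stays unchanged: whatever provisions the node runs the formula and writes the result into the configuration file the kubelet already reads.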
Dynamic sizing providers for kubelet
Enhancement Issue: #2369
/sig node
/cc @derekwaynecarr @dchen1107 @mrunalp @ehashman @rphillips
Signed-off-by: Harshal Patil harpatil@redhat.com