
Discover Instance Type Capacity Memory Overhead Instead of vmMemoryOverheadPercent #5161

Open
jonathan-innis opened this issue Nov 25, 2023 · 2 comments
Labels
feature New feature or request

Comments

@jonathan-innis
Contributor

jonathan-innis commented Nov 25, 2023

Description

Tell us about your request

We could consider a few options for discovering the expected capacity overhead for a given instance type:

  1. We could store the instance type's capacity in memory once a node of that type has been launched, and use that observed value for subsequent launches rather than basing our calculations on a heuristic (see the sketch after this list).
  2. We could launch each instance type, check its capacity, and record the difference between the EC2-reported capacity and the actual capacity in a generated file shipped with each Karpenter release, so that we always have accurate measurements of instance type overhead.
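
A minimal sketch of option 1, assuming hypothetical names throughout (CapacityCache and its methods are not existing Karpenter APIs): cache the kubelet-reported memory per instance type after the first launch, and fall back to the vmMemoryOverheadPercent heuristic until an observation exists.

```go
package overhead

import (
	"sync"

	"k8s.io/apimachinery/pkg/api/resource"
)

// CapacityCache remembers the memory capacity that the kubelet actually
// reported for each instance type. Hypothetical type, not Karpenter code.
type CapacityCache struct {
	mu       sync.RWMutex
	observed map[string]resource.Quantity // instance type -> observed memory capacity
}

func NewCapacityCache() *CapacityCache {
	return &CapacityCache{observed: map[string]resource.Quantity{}}
}

// Record stores the memory capacity the kubelet reported once a node of
// this instance type has registered with the cluster.
func (c *CapacityCache) Record(instanceType string, memory resource.Quantity) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.observed[instanceType] = memory
}

// Memory returns the observed capacity when one exists; otherwise it falls
// back to the heuristic applied to the EC2-reported value.
func (c *CapacityCache) Memory(instanceType string, ec2MemoryMiB int64, overheadPercent float64) resource.Quantity {
	c.mu.RLock()
	defer c.mu.RUnlock()
	if q, ok := c.observed[instanceType]; ok {
		return q
	}
	adjustedMiB := int64(float64(ec2MemoryMiB) * (1 - overheadPercent))
	return *resource.NewQuantity(adjustedMiB*1024*1024, resource.BinarySI)
}
```

In this sketch, Record would be invoked from the node-registration path once the kubelet publishes its capacity, so every launch after the first uses a measured value.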

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Calculating the difference between the EC2-reported memory capacity and the actual capacity of the instance as reported by the kubelet.
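
The measurement itself is a straightforward subtraction; here is a hypothetical sketch (memoryOverheadMiB is an illustrative helper, not Karpenter code) of the per-type overhead that option 2 above would record:

```go
package overhead

import (
	corev1 "k8s.io/api/core/v1"
)

// memoryOverheadMiB returns how much of the EC2-reported memory never shows
// up in the node's capacity (hypervisor, firmware, kernel reservations, etc.).
func memoryOverheadMiB(ec2MemoryMiB int64, node *corev1.Node) int64 {
	// node.Status.Capacity holds what the kubelet reported at registration.
	actual := node.Status.Capacity[corev1.ResourceMemory]
	return ec2MemoryMiB - actual.Value()/(1024*1024)
}
```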

Are you currently working around this issue?

We are currently using a heuristic vmMemoryOverheadPercent value, which is tunable by users and passed through karpenter-global-settings.
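
To illustrate the heuristic with example numbers (the 0.075 and 16384 MiB figures are illustrative, not claims about any particular default or instance type): with vmMemoryOverheadPercent set to 0.075, an instance type that EC2 reports as having 16384 MiB is assumed to provide 16384 × (1 − 0.075) ≈ 15155 MiB, regardless of how much memory the hypervisor and OS actually reserve on that specific type.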

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@jonathan-innis jonathan-innis added the feature New feature or request label Nov 25, 2023
@jonathan-innis
Contributor Author

This is a "transferred" version of the issue in kubernetes-sigs/karpenter: kubernetes-sigs/karpenter#716

@sosheskaz
Contributor

Could a simple mitigation be to add a static setting, e.g. vmMemoryOverheadOffset, which would reserve a fixed amount of memory?

When running compute that is highly heterogeneous but relatively large, this would simplify managing the setting: a sufficiently large static overhead would outweigh the percentage and allow for less waste.
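
A sketch of one possible semantics for this suggestion, assuming a hypothetical vmMemoryOverheadOffset setting (not an existing Karpenter option) combined with the existing percentage by taking whichever reserves more:

```go
package overhead

// adjustedMemoryMiB returns the memory capacity Karpenter would assume for
// an instance type under this combined heuristic: the larger of the
// percentage-based overhead and the static offset is subtracted.
func adjustedMemoryMiB(ec2MemoryMiB int64, overheadPercent float64, offsetMiB int64) int64 {
	overheadMiB := int64(float64(ec2MemoryMiB) * overheadPercent)
	if offsetMiB > overheadMiB {
		overheadMiB = offsetMiB
	}
	return ec2MemoryMiB - overheadMiB
}
```

Under this reading, a fleet of large instances could set the percentage near zero and let the offset carry the reservation, so the waste no longer scales with instance size.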
