Add support for more kubelet and container runtime configuration #1346
Conversation
✔️ Deploy Preview for karpenter-docs-prod ready!
🔨 Explore the source changes: 61856fb
🔍 Inspect the deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/621358737b017b0007cbc491
😎 Browse the preview: https://deploy-preview-1346--karpenter-docs-prod.netlify.app
// ContainerRuntimeConfiguration defines args to be used when configuring the Container Runtime.
// Note, not all Container Runtime implementations will support all of these settings.
Where is this struct from? Is this a generic interface?
I have not found any existing API (struct) for those parameters. However, most Container Runtimes (CRI implementations) accept some configuration, usually via config files. This was inspired by Bottlerocket, which creates the corresponding config files:
- For the Docker Container Runtime, the settings are added to daemon.json
- For containerd, they are written to the containerd config
As Container Runtime configuration is not cloud-provider specific, I thought it would make sense to place it alongside KubeletConfiguration in the Constraints struct.
I'm not comfortable defining this abstraction in Karpenter without deep understanding of where this is going. I worry that this will become an unmaintainable bag of parameters. It's a different story for kubelet configuration, since it's an OSS standard we're taking a subset of.
Are you blocked on this? What's the use case?
Yes, I am blocked on this. We need to be able to define registry mirrors, to make sure all container images are pulled through our own (pull-through caching and security-policy enforcing) registries.
We could also move it into the AWS provider-specific part (the AWS struct). However, as written before, these kinds of settings are quite common among Container Runtimes.
Is registry mirrors all you needed? There's a lot that's being added here and I want to make sure we only add fields as they're necessary and prevent a bloated spec.
Keep in mind that eventually we do intend on inlining UserData in the Provisioner as well which could be used to support any kind of customization (OS level or kubelet / container-runtime level).
Keep in mind that eventually we do intend on inlining UserData in the Provisioner as well which could be used to support any kind of customization (OS level or kubelet / container-runtime level).
@suket22 With this PR I just followed the path already paved in the current code base, which does not support setting a verbatim UserData field and instead uses input from the KubeletConfiguration and AWS structs to assemble the UserData. If you intend to replace all that logic with a simple userData field (e.g. in the Constraints or AWS structs), that's fine and I am with you (and I will gladly refactor things in this PR). Just let me know which direction you want to go.
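For illustration, an inlined verbatim user data field might look roughly like this. No such field exists yet, so everything below is hypothetical:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: custom
spec:
  provider:
    # Hypothetical field: raw user data passed through verbatim to the
    # launch template instead of being assembled by Karpenter.
    userData: |
      #!/bin/bash
      echo "custom node bootstrap steps here"
```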
Is registry mirrors all you needed? There's a lot that's being added here and I want to make sure we only add fields as they're necessary and prevent a bloated spec.
While I understand your concerns, keep in mind that when you abstract over a configurable system (for example AML2, Bottlerocket, the kubelet, containerd, ...), there is a reason why those systems have so many configuration parameters, and your abstraction must provide a means to pass such configuration parameters on to the underlying systems. Actually, this is a strong hint that Approach 3 - Provide a more generic way to specify launch template contents in the Provider - might be the right way to avoid having to change Karpenter with each user-specific need.
In our specific case we need to fine-tune the kubelet and containerd to make sure we are compliant with several different high-availability and security compliance requirements. For example, we must be able to set eventRecordQPS to unlimited for security compliance reasons (see CIS Kubernetes Benchmark control 4.2.9).
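As a sketch, the compliance use case above might look like this in a Provisioner manifest. Only clusterDNS exists in the current kubeletConfiguration; eventRecordQPS and registryMirrors reflect fields proposed in this PR, so the exact shape is tentative:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: compliance
spec:
  kubeletConfiguration:
    clusterDNS: ["10.0.100.10"]
    # CIS Kubernetes Benchmark 4.2.9: a value of 0 means no limit is
    # enforced, so security-relevant events are never dropped.
    eventRecordQPS: 0
  containerRuntimeConfiguration:
    registryMirrors:
      docker.io:
        - https://mirror.example.internal
```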
We should also update the documentation for this. If you could list out all the kubeletConfiguration fields here https://karpenter.sh/v0.6.3/provisioner/#speckubeletconfiguration and add containerRuntimeConfiguration, that would be great!
Mostly lgtm.
- Could we add tests for this code? The UserData logic should definitely be covered, IMO. @spring1843 recently added support for coveralls reports btw, but I'm not going to block this PR on those results.
- Did you get a chance to try your change out as well and verify that it worked as expected? The existing CI and unit tests aren't exhaustive, so a manual test / smoke test would be great since the blast radius is quite high - if we mess up UserData generation, nobody's nodes join the cluster.
- We can probably do this outside of this PR, but we should add validation to the AWS provider that fails if a launch template is specified in addition to either kubeletConfiguration / containerRuntimeConfig, since we can't honor both.
eventRecordQPS: 5
eventBurst: 10
I think the spacing is incorrect for these - it should be on the same level as clusterDNS
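In other words, the docs example should nest these keys directly under kubeletConfiguration, at the same level as clusterDNS (the clusterDNS value here is illustrative):

```yaml
kubeletConfiguration:
  clusterDNS: ["10.0.100.10"]
  eventRecordQPS: 5
  eventBurst: 10
```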
// KubeletConfiguration defines args to be used when configuring kubelet on provisioned nodes.
// They are a subset of the upstream types, recognizing not all options may be supported.
// Wherever possible, the types and names should reflect the upstream kubelet types from
Wherever possible,
This concerns me. This spec should be an explicit, word-for-word copy-paste of snippets from that struct.
EvictionHard map[string]string `json:"evictionHard,omitempty"`
}
const notNegative = "cannot be negative" |
This is used only once; it's cleaner to inline it where it's used.
const notNegative = "cannot be negative"
func (k *KubeletConfiguration) validate() (errs *apis.FieldError) {
	errs = addErrIfNegative(errs, k.EventRecordQPS, "eventRecordQPS")
errs.Also will collapse nil errors. This would be a bit more idiomatic, I think:

return errs.Also(
    validateGreaterThan(k.EventRecordQPS, 0, "eventRecordQPS"),
    validateGreaterThan(k.RegistryPullQPS, 0, "registryPullQPS"),
)
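A self-contained sketch of the nil-collapsing pattern the comment above describes. The FieldError here is a hypothetical minimal stand-in for knative's apis.FieldError (knative.dev/pkg/apis), which Karpenter actually uses; the point is that Also ignores nil errors, so validators can be chained unconditionally and a valid config yields nil:

```go
package main

import "fmt"

// FieldError is a hypothetical stand-in for knative's apis.FieldError.
type FieldError struct {
	Msgs []string
}

// Also merges non-nil errors into the receiver, ignoring nil ones.
// It is safe to call on a nil receiver.
func (e *FieldError) Also(errs ...*FieldError) *FieldError {
	out := e
	for _, err := range errs {
		if err == nil {
			continue
		}
		if out == nil {
			out = &FieldError{}
		}
		out.Msgs = append(out.Msgs, err.Msgs...)
	}
	return out
}

// validateNonNegative returns nil when the value is unset or valid.
func validateNonNegative(v *int32, field string) *FieldError {
	if v != nil && *v < 0 {
		return &FieldError{Msgs: []string{field + " cannot be negative"}}
	}
	return nil
}

type KubeletConfiguration struct {
	EventRecordQPS  *int32
	RegistryPullQPS *int32
}

func (k *KubeletConfiguration) validate() (errs *FieldError) {
	return errs.Also(
		validateNonNegative(k.EventRecordQPS, "eventRecordQPS"),
		validateNonNegative(k.RegistryPullQPS, "registryPullQPS"),
	)
}

func main() {
	bad, ok := int32(-1), int32(5)
	k := &KubeletConfiguration{EventRecordQPS: &bad, RegistryPullQPS: &ok}
	fmt.Println(k.validate().Msgs) // only the invalid field produces an error
}
```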
// Note that not all providers may use all addresses.
//+optional
ClusterDNS []string `json:"clusterDNS,omitempty"`
// EventRecordQPS is the maximum event creations per second. If 0,
// there is no limit enforced.
EventRecordQPS *int32 `json:"eventRecordQPS,omitempty"`
Do you need all of these parameters for your use case? Can you provide a sample of how you'd use it?
Yes, please see #1346 (comment) (above)
//+optional
KubeletConfiguration KubeletConfiguration `json:"kubeletConfiguration,omitempty"`
// ContainerRuntimeConfiguration are options passed to the container runtime when provisioning nodes
//+optional
ContainerRuntimeConfiguration ContainerRuntimeConfiguration `json:"containerRuntimeConfiguration,omitempty"`
I'm not comfortable including this custom spec into our API surface.
Yes, please see #1346 (comment) (above)
I hear you on this request. Still, this is API surface and we need to get it right.
Can we at least remove it from this PR so that we can move forward with the kubeletConfiguration changes, which are less contentious? We can address the remaining gap in a follow-on.
@suket22 I guess a provided launch template does not conflict with kubeletConfiguration / containerRuntimeConfig as long as its
@suket22 No, I have no reasonable (local) test infrastructure / setup for this yet. It would be great if we could add this to CI, instead of each part-time contributor having to figure out how to test this locally.
We're working on e2e CI that should help here. FWIW, I think it's pretty critical that you test something manually when it's developed. Otherwise it can be easy to miss edge cases, performance problems, etc.
Hey @alex-berger are you still working on this? Anything I can do to unblock?
Hi @ellistarn, the discussions and comments on this PR got me thinking and I started to dig a bit deeper to figure out how we could achieve:
Currently, I am preparing a (WIP) PR which I would like to share with you soon to gather feedback and to see whether Karpenter would benefit from such changes. I would like to stall this PR here for now and wait for the feedback on the new PR. If the new PR has any chance of being accepted, it might obsolete this one (otherwise not). As soon as the above-mentioned new PR is ready, I will share a link here.
We're starting to add more of these options in this PR: #1720. Thanks for bringing up these issues!
Add support for more kubelet and container runtime configuration parameters
1. Issue, if available:
None
2. Description of changes:
This PR adds support for commonly used kubelet configuration parameters and also introduces the first
container runtime configuration parameter (registry mirrors).
3. How was this change tested?
Only by running the unit tests via
make dev
.4. Does this change impact docs?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.