
feat: make vm memory overhead configurable #1953

Merged

Merged 7 commits on Jun 17, 2022

Conversation

bwagner5
Contributor

Fixes #

Description

  • Custom AMIs can require a different VM memory overhead because of other software or agents running on the machine. This PR makes the VM memory overhead parameter configurable via a CLI arg or env var.
  • I've also added a script that generates a table of the env vars / CLI flags and their usage descriptions for configuring the Karpenter controller.

How was this change tested?

  • Ran make docgen and observed that the overhead calculation is the same as before.
  • Varied the VM overhead parameter across 0, 1, and 0.075, observing correct generated memory values in the docs.

Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: #
  • No

Release Note

VM Memory Overhead is now configurable via an environment variable to the Karpenter controller (see the options here: https://karpenter.sh/preview/tasks/configuration/)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@bwagner5 bwagner5 requested a review from a team as a code owner June 17, 2022 16:49
@bwagner5 bwagner5 requested a review from suket22 June 17, 2022 16:49
netlify bot commented Jun 17, 2022

Deploy Preview for karpenter-docs-prod canceled.

🔨 Latest commit: a743baa
🔍 Latest deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/62accad7204da000076436f1

@@ -31,7 +33,9 @@ func main() {

os.Setenv("AWS_SDK_LOAD_CONFIG", "true")
os.Setenv("AWS_REGION", "us-east-1")
ctx := context.Background()
os.Setenv("CLUSTER_NAME", "docs-gen")
os.Setenv("CLUSTER_ENDPOINT", "https://docs-gen.aws")
Contributor

:O What's this?

Contributor Author

Just dummy params to allow flag parsing to take place. This let me easily plug in other params that could change the generated instance-type resources, like max-pods and VM overhead. I figured I'd leave in how I did it so that if we need or want to change some of those parameters, it's a simple os.Setenv(...).

@@ -33,6 +33,20 @@ func WithDefaultInt(key string, def int) int {
return i
}

// WithDefaultFloat64 returns the float64 value of the supplied environment variable or, if not present,
// the supplied default value. If the float64 conversion fails, returns the default
func WithDefaultFloat64(key string, def float64) float64 {
Contributor

Is there a generics way to do this?

Contributor Author
@bwagner5 Jun 17, 2022

I don't think so. The float64 func uses strconv.ParseFloat, the Bool one uses strconv.ParseBool, and the Int one uses strconv.Atoi. I think the only way you could do it is to use a generic return type and accept a default interface{} as the parameter so that you could do a type switch:

func WithDefault[V float64 | int | string | bool](key string, def interface{}) V {
    val, ok := os.LookupEnv(key)
    if !ok {
        return def.(V)
    }
    var p interface{}
    var err error
    switch def.(type) {
    case float64:
        p, err = strconv.ParseFloat(val, 64)
    case int:
        p, err = strconv.Atoi(val)
    case bool:
        p, err = strconv.ParseBool(val)
    default:
        panic("not supported")
    }
    if err != nil {
        return def.(V)
    }
    return p.(V)
}

I think they're better as separate funcs :)

f.IntVar(&opts.HealthProbePort, "health-probe-port", env.WithDefaultInt("HEALTH_PROBE_PORT", 8081), "The port the health probe endpoint binds to for reporting controller health")
f.IntVar(&opts.KubeClientQPS, "kube-client-qps", env.WithDefaultInt("KUBE_CLIENT_QPS", 200), "The smoothed rate of qps to kube-apiserver")
f.IntVar(&opts.KubeClientBurst, "kube-client-burst", env.WithDefaultInt("KUBE_CLIENT_BURST", 300), "The maximum allowed burst of queries to the kube-apiserver")
f.Float64Var(&opts.VMMemoryOverhead, "vm-memory-overhead", env.WithDefaultFloat64("VM_MEMORY_OVERHEAD", 0.075), "The VM memory overhead as a percent that will be subtracted from the total memory for all instance types")
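For context on what the vm-memory-overhead fraction does: per the flag description, the percentage is subtracted from each instance type's total memory. A minimal illustration (the helper name and MiB figures here are hypothetical, not Karpenter's actual code):

```go
package main

import "fmt"

// usableMemoryMiB is a hypothetical helper showing how a fractional VM
// memory overhead is applied: the fraction is subtracted from the
// instance type's total memory.
func usableMemoryMiB(totalMiB, vmMemoryOverhead float64) float64 {
	return totalMiB * (1 - vmMemoryOverhead)
}

func main() {
	// An 8 GiB (8192 MiB) instance with the 0.075 default keeps 92.5%:
	fmt.Printf("%.1f\n", usableMemoryMiB(8192, 0.075)) // 7577.6
	// The 0 and 1 extremes exercised in testing: full memory vs. none.
	fmt.Println(usableMemoryMiB(8192, 0), usableMemoryMiB(8192, 1))
}
```

Setting the fraction too low risks overcommitting memory; too high leaves nodes less densely packed, which is the trade-off discussed in the review comments.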
Contributor

I remember you doing some testing on this - did we want to tune 0.075 any further or is that still the best approximation we've got by default?

Have we also given up trying to do this per instance type?

Contributor Author

Testing found 0.075 to be very close to the best parameter for AL2, but this at least gives users control if they want to tune it themselves, which could also make sense if their instance types are constrained.

I gave up on per-instance-type overheads; there's too much variability, and we could be wrong as new instance types are released. The percentage is a conservative, safe value.

Contributor

As a customer, how do I know what value to be setting here?

I'd probably see that 0.075 is too high and therefore my nodes aren't as densely packed as they would be with just node groups + the k8s scheduler, but how do I tune this to, say, 0.072? Is it just guesswork?

If I choose a really bad number, the worst that will happen is that Karpenter may launch an incorrect number of nodes to satisfy my pending pods, but it should eventually converge. Is my understanding correct?

3 participants