docs: [RFC] Add v1beta1 API RFC #4274

jonathan-innis · 2023-07-17T02:13:42Z

Fixes #N/A

Description

This PR proposes the v1beta1 APIs and full Change List into the design directory

How was this change tested?

Does this change impact docs?

[] Yes, PR includes docs updates
[] Yes, issue opened: #
No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

netlify · 2023-07-17T02:13:48Z

✅ Deploy Preview for karpenter-docs-prod canceled.

Name	Link
🔨 Latest commit	`8184ef1`
🔍 Latest deploy log	https://app.netlify.com/sites/karpenter-docs-prod/deploys/65040439bc3ebd00082ced2f

ellistarn

Awesome work on this!!!

designs/v1beta1-api.md

designs/v1beta1-full-changelist.md

designs/v1beta1-api.md

designs/v1beta1-full-changelist.md

Co-authored-by: Tim Bannister <tim@scalefactory.com>

designs/v1beta1-full-changelist.md

designs/v1beta1-api.md

designs/v1beta1-full-changelist.md

designs/v1beta1-api.md

garvinp-stripe · 2023-08-22T17:55:19Z

Great work! I had some time to think, this isn't necessarily a strong opinion but I think there is some value in considering. I don't know what the timeline looks like and maybe what I am thinking doesn't make sense at v1beta1 however I think there is an opportunity to drive some usability improvement since there is going to be significant work in changing the custom resources.

I wanted to see if we can take some inspiration from https://gateway-api.sigs.k8s.io/ which is somewhat new and have been through quite a bit of changes in its overall design. Specifically the way it carved out custom resources that is able to abstract discrete configuration depending on personas. I think this way of thinking can yield quite a bit of insight into how to structure custom resources so that cluster operators can easily provide application developers the appropriate level of control.

Starting with NodePool. I think it has a lot of abstraction leaks between App dev vs Cluster Operator. Ideally app devs can specify their pods -> node, maybe some taints and label for their nodes (pretty much all the parameters in spec). However app dev shouldn't be able to specify kubelet configurations since this is usually managed by cluster operator. I think if possible this can be carved out to something like NodeProviders which are associated to NodePool. Similar to HTTPRoutes to Gateway in GatewayAPI. I think this would significantly improve the flexibility of how cluster operators manage Provisioners/ NodePool since often times there are much more app dev teams to cluster operator teams and most cluster operators are currently creating provisioners per app team.

I think something nice would be "version" similar to LaunchTemplate version. This allows Cluster Operators to rollout and rollback changes to nodes safely and have the necessary controls when managing many NodeClasses. Ideally each NodeClasses version should be immutable. Right now it is difficult to expand on NodeClass for nodes outside of AWS. It may be valuable to split NodeClasses, I don't have a good name, such that we have something that is at the cloud provider level and one that just defines nodes specification specific to a cloud provider. (Similar to gatewayclass and gateway in Gateway API). The Cloud provider layer resource defines the controller for creating nodes for that cloud provider and the nodeclass themselves work in the context of each cloud provider.

Thanks

jonathan-innis · 2023-08-23T05:41:33Z

I think something nice would be "version" similar to LaunchTemplate version. This allows Cluster Operators to rollout and rollback changes to nodes safely and have the necessary controls when managing many NodeClasses. Ideally each NodeClasses version should be immutable. Right now it is difficult to expand on NodeClass for nodes outside of AWS. It may be valuable to split NodeClasses, I don't have a good name, such that we have something that is at the cloud provider level and one that just defines nodes specification specific to a cloud provider. (Similar to gatewayclass and gateway in Gateway API). The Cloud provider layer resource defines the controller for creating nodes for that cloud provider and the nodeclass themselves work in the context of each cloud provider.

I'm still thinking on the other comment that you made around the Gateway API, but, for this one, the idea was that the NodeClass is the cloudprovider extensibility mechanism; it's effectively the interface to define how Karpenter can actually bring up a Node in a given Cloudprovider. If another Cloudprovider came along and wanted to support Karpenter, I would expect that they would create their own NodeClass and this NodeClass would define its own interface with how to launch Nodes in that Cloudprovider.

Generally, we wanted to avoid creating too many disparate concepts that have to be individually reasoned about (i.e. more API kinds). We effectively have taken the stance that if something is common enough that it should be useful to all cloudproviders, then it should be hoisted up into the NodePool; otherwise, it belongs within the cloudprovider-specific NodeClass.

Is there a specific use-case that you have in mind where you think splitting out the details of the NodeClass could be useful beyond hoisting common cross-cloudprovider values into the NodePool?

designs/v1beta1-api.md

tallaxes · 2023-09-18T01:05:37Z

designs/v1beta1-full-changelist.md


-1. API Group Renames
-    1. `karpenter.k8s.aws` → `compute.k8s.aws`


Same question here, wondering what prompted the reversal ...

tallaxes · 2023-09-18T01:09:19Z

designs/v1beta1-api.md

-#### `compute.k8s.aws`
-
-1. `compute.k8s.aws/instance-hypervisor`
-2. `compute.k8s.aws/instance-encryption-in-transit-supported`
-3. `compute.k8s.aws/instance-category`
-4. `compute.k8s.aws/instance-family`
-5. `compute.k8s.aws/instance-generation`
-6. `compute.k8s.aws/instance-local-nvme`
-7. `compute.k8s.aws/instance-size`
-8. `compute.k8s.aws/instance-cpu`
-9. `compute.k8s.aws/instance-memory`
-10. `compute.k8s.aws/instance-network-bandwidth`
-11. `compute.k8s.aws/instance-pods`
-12. `compute.k8s.aws/instance-gpu-name`
-13. `compute.k8s.aws/instance-gpu-manufacturer`
-14. `compute.k8s.aws/instance-gpu-count`
-15. `compute.k8s.aws/instance-gpu-memory`
-16. `compute.k8s.aws/instance-accelerator-name`
-17. `compute.k8s.aws/instance-accelerator-manufacturer`
-18. `compute.k8s.aws/instance-accelerator-count`
+#### `karpenter.k8s.aws`
+
+1. `karpenter.k8s.aws/instance-hypervisor`
+2. `karpenter.k8s.aws/instance-encryption-in-transit-supported`
+3. `karpenter.k8s.aws/instance-category`
+4. `karpenter.k8s.aws/instance-family`
+5. `karpenter.k8s.aws/instance-generation`
+6. `karpenter.k8s.aws/instance-local-nvme`
+7. `karpenter.k8s.aws/instance-size`
+8. `karpenter.k8s.aws/instance-cpu`
+9. `karpenter.k8s.aws/instance-memory`
+10. `karpenter.k8s.aws/instance-network-bandwidth`
+11. `karpenter.k8s.aws/instance-gpu-name`
+12. `karpenter.k8s.aws/instance-gpu-manufacturer`
+13. `karpenter.k8s.aws/instance-gpu-count`
+14. `karpenter.k8s.aws/instance-gpu-memory`
+15. `karpenter.k8s.aws/instance-accelerator-name`
+16. `karpenter.k8s.aws/instance-accelerator-manufacturer`
+17. `karpenter.k8s.aws/instance-accelerator-count`


@jonathan-innis, I am curios what prompted the reversal of the original plan?

jonathan-innis requested a review from a team as a code owner July 17, 2023 02:13

jonathan-innis requested a review from tzneal July 17, 2023 02:13

jonathan-innis force-pushed the rfc-v1beta1-design branch 3 times, most recently from 238e4ee to 810bdfc Compare July 17, 2023 02:20

jonathan-innis changed the title ~~docs: [RFC] Add v1beta1 RFC to designs directory~~ docs: [RFC] Add v1beta1 API RFC Jul 17, 2023

jonathan-innis force-pushed the rfc-v1beta1-design branch 8 times, most recently from b6b3d21 to 4c9f48d Compare July 19, 2023 05:05

jonathan-innis force-pushed the rfc-v1beta1-design branch 2 times, most recently from a647e9a to fa8ccb7 Compare July 24, 2023 23:26

jonathan-innis added needs-design Design required needs-review PRs that are still going through the review process and removed needs-design Design required labels Jul 25, 2023

jonathan-innis force-pushed the rfc-v1beta1-design branch 2 times, most recently from ca8bfc2 to 06eb4fa Compare July 25, 2023 18:45

Add v1beta1 RFC to designs directory

9b80758

jonathan-innis force-pushed the rfc-v1beta1-design branch 3 times, most recently from ad4b2a2 to 2068978 Compare July 25, 2023 21:45

ellistarn reviewed Jul 25, 2023

View reviewed changes

sftim reviewed Jul 25, 2023

View reviewed changes

Update NodeTemplate to NodeClass

affeabf

jonathan-innis force-pushed the rfc-v1beta1-design branch from 2068978 to affeabf Compare July 28, 2023 19:11

sftim reviewed Jul 30, 2023

View reviewed changes

designs/v1beta1-full-changelist.md Outdated Show resolved Hide resolved

jonathan-innis and others added 2 commits July 30, 2023 22:31

Update designs/v1beta1-full-changelist.md

a42d5e9

Co-authored-by: Tim Bannister <tim@scalefactory.com>

Fix security group typo

79fe642

jonathan-innis force-pushed the rfc-v1beta1-design branch from 618f3d3 to 79fe642 Compare July 31, 2023 18:26

jonathan-innis mentioned this pull request Jul 31, 2023

chore: Add v1beta1 APIs kubernetes-sigs/karpenter#426

Merged

garvinp-stripe reviewed Aug 3, 2023

View reviewed changes

designs/v1beta1-full-changelist.md Show resolved Hide resolved

jonathan-innis mentioned this pull request Aug 9, 2023

chore: Create v1beta1 APIs #4405

Merged

3 tasks

sftim mentioned this pull request Aug 11, 2023

option to enable / disable consolidation by replace and delete kubernetes-sigs/karpenter#684

Closed

rothgar reviewed Aug 17, 2023

View reviewed changes

designs/v1beta1-api.md Show resolved Hide resolved

tallaxes reviewed Aug 20, 2023

View reviewed changes

designs/v1beta1-api.md Show resolved Hide resolved

tallaxes reviewed Aug 20, 2023

View reviewed changes

designs/v1beta1-api.md Outdated Show resolved Hide resolved

tallaxes reviewed Aug 20, 2023

View reviewed changes

designs/v1beta1-api.md Show resolved Hide resolved

sftim reviewed Aug 20, 2023

View reviewed changes

designs/v1beta1-full-changelist.md Outdated Show resolved Hide resolved

designs/v1beta1-full-changelist.md Outdated Show resolved Hide resolved

designs/v1beta1-full-changelist.md Outdated Show resolved Hide resolved

designs/v1beta1-api.md Show resolved Hide resolved

Update docs from review comments

d813dbe

garvinp-stripe reviewed Aug 23, 2023

View reviewed changes

designs/v1beta1-api.md Outdated Show resolved Hide resolved

garvinp-stripe reviewed Aug 24, 2023

View reviewed changes

designs/v1beta1-api.md Show resolved Hide resolved

Update field naming

163b70c

sftim reviewed Aug 31, 2023

View reviewed changes

designs/v1beta1-api.md Show resolved Hide resolved

jonathan-innis force-pushed the rfc-v1beta1-design branch 2 times, most recently from 20c3bdf to 4b8edd2 Compare September 14, 2023 22:42

Change name to EC2NodeClass and remove some fast-follow features

5d5ec0d

jonathan-innis force-pushed the rfc-v1beta1-design branch from 4b8edd2 to 5d5ec0d Compare September 14, 2023 22:43

Tidy up some details and naming

8184ef1

jonathan-innis enabled auto-merge (squash) September 15, 2023 07:16

ellistarn approved these changes Sep 15, 2023

View reviewed changes

jonathan-innis merged commit 6ea6132 into aws:main Sep 15, 2023
15 checks passed

tallaxes reviewed Sep 18, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: [RFC] Add v1beta1 API RFC #4274

docs: [RFC] Add v1beta1 API RFC #4274

jonathan-innis commented Jul 17, 2023 •

edited

netlify bot commented Jul 17, 2023 •

edited

ellistarn left a comment

garvinp-stripe commented Aug 22, 2023 •

edited

jonathan-innis commented Aug 23, 2023

tallaxes Sep 18, 2023 •

edited

tallaxes Sep 18, 2023


		1. API Group Renames
		1. `karpenter.k8s.aws` → `compute.k8s.aws`

docs: [RFC] Add v1beta1 API RFC #4274

docs: [RFC] Add v1beta1 API RFC #4274

Conversation

jonathan-innis commented Jul 17, 2023 • edited

netlify bot commented Jul 17, 2023 • edited

✅ Deploy Preview for karpenter-docs-prod canceled.

ellistarn left a comment

Choose a reason for hiding this comment

garvinp-stripe commented Aug 22, 2023 • edited

jonathan-innis commented Aug 23, 2023

tallaxes Sep 18, 2023 • edited

Choose a reason for hiding this comment

tallaxes Sep 18, 2023

Choose a reason for hiding this comment

jonathan-innis commented Jul 17, 2023 •

edited

netlify bot commented Jul 17, 2023 •

edited

garvinp-stripe commented Aug 22, 2023 •

edited

tallaxes Sep 18, 2023 •

edited