Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1alpha4 API Proposal #676

Merged
merged 1 commit into from
Sep 16, 2021
Merged

v1alpha4 API Proposal #676

merged 1 commit into from
Sep 16, 2021

Conversation

ellistarn
Copy link
Contributor

@ellistarn ellistarn commented Sep 14, 2021

1. Issue, if available:
#661

2. Description of changes:

3. Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: link to issue
  • No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@netlify
Copy link

netlify bot commented Sep 14, 2021

✔️ Deploy Preview for karpenter-docs-prod canceled.

🔨 Explore the source changes: e308989

🔍 Inspect the deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/6143783b71f8f2000771f6db

launchTemplateName: "foo"
```

The Cloud Provider *may* continue to support corresponding well known labels at the pod level (e.g. `node.k8s.aws/subnet-name`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear whether this will hold true for the alternatives listed in later sections.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind, we'd be resetting most of these and just adding them at the provisioner level (for now).

*Note: Not all `spec.provider` fields must have a corresponding well known label.*

### Alternative: Inline vs Nested Parameters
It's possible to leverage inline json capabilities to collapse these parameters to the top level. For example
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear what 'inline json capabilities' means - capabilities of what/where?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


We introduce a new `provider` field that enables strongly typed vendor specific parameters without violating vendor neutrality principles in the Karpenter codebase. We leverage Kubernetes `runtime.RawExtensions` to encapsulate these fields as raw bytes, which are then unmarshaled in vendor specific code. Vendors may implement arbitrary validation, defaulting, and provisioning behavior over the entire structure of these extensions. These structures are versioned separately from the Provisioner GVK to enable Cloud Providers to make backwards incompatible changes to provider specific configuration without requiring a version bump to the Provisioner CRD. If versioning is not specified by the user, it will be inferred and defaulted.

```bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these bash rather than yaml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call

limits: {} # Vendor neutral, recognized by Karpenter generic controllers
labels: {} # Vendor neutral, recognized by Karpenter generic controllers
taints: [] # Vendor netural, recognized by Karpenter generic controllers
subnets: [] # Vendor specific, only recognized by AWS Cloud Provider code
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't you back to the name collision (potential) problem here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Structural vs nominal types. Discuss. 😛

Unless we want to generalise over abstract providers for some reason, I recommend avoiding this and sticking to composition of nominal types (the RawExtension recommendation above).

2. (+) Vendors may change versions decoupled from each other.
3. (+) Parameters that are natural siblings (e.g. `zones`, `securityGroups`) will be no longer be separated by `provider`.
4. (-) Cloud Provider must do more than simply implementing an interface (they must define an CR, etc).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a con here that customers can get confused about whether a field is vendor-specific or not. And also that when "porting" someone else's yaml code from one vendor to another, they may not even realize certain fields need to change (e.g., the subnets example above).

3. (+) Parameters that are natural siblings (e.g. `zones`, `securityGroups`) will be no longer be separated by `provider`.
4. (-) Cloud Provider must do more than simply implementing an interface (they must define an CR, etc).

This approach is quite attractive, but requires significant effort and changes to the project.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably re-word this a bit, as "quite attractive" is somewhat difficult to quantify. I'd suggest more focusing on how much effort it would be (perhaps quantify that a bit) as a reason not to do it. (I'm also not 100% sure all of the potential downsides were listed?)

These changes recommend:
1. Cloud Provider specific extensions under `spec.provider`.
2. Removal of `spec.cluster`.
3. Pluralization of `spec.architecture` and `spec.operatingSystem`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was pretty confusing, I thought you meant you were going to change the names to (eg) operatingSystems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do!

designs/v1alpha4-api.md Outdated Show resolved Hide resolved
@ellistarn ellistarn changed the title [WIP] v1alpha4 API Proposal v1alpha4 API Proposal Sep 15, 2021
apiVersion: karpenter.sh/v1alpha4
kind: Provisioner
spec:
provider:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ provider :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this looks like a great improvement. from our discussion on Sep 16, would this be an appropriate place for a provider to inject a secret location for kubeconfigs that it might need?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct!

- `spec.labels['node.k8s.aws/subnet-name']`

One benefit of this approach is that pods may use corresponding `spec.nodeSelector[...]` to request additional constraints on provisioned nodes. This information must be communicated through Kubernetes label values, which is awkward for the following use cases:
1. Parameters are a list (e.g. `subnets: ["subneta", "subnetb", "subnetc"]`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd change to "may be a list" unless all parameters are a list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in reference to the previous "in the following use cases" -> "parameters are a list", "parameters are a struct"

Comment on lines +40 to +41
node.k8s.aws/subnet-name: "123" # Vendor Specific Field
kubernetes.io/instance-type: "m5.large" # Vendor Neutral Field
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh so it's vendor neutral based on the key and not the value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct -- the value is just a string

```yaml
spec:
limits:
ready: 500 # Flat or Percentage. Karpenter will not launch additional capacity if ready nodes exceeds this value

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "ready=90%" mean? (percentage of what total?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. It slipped my mind when I was thinking "IntOrString all the things" and "unready should have parity, so let's make ready". I suppose it could mean, this provisioner can deploy 50% of the nodes in this cluster. Though I'd much prefer that the provisioners are their own universes and are not aware of other provisioners.

I keep going back and forth on a node count as a limit (as opposed to resource limits below). I'm leaning simply leaving this out entirely until I have a real use case for it.

limits: {} # Vendor neutral, recognized by Karpenter generic controllers
labels: {} # Vendor neutral, recognized by Karpenter generic controllers
taints: [] # Vendor netural, recognized by Karpenter generic controllers
subnets: [] # Vendor specific, only recognized by AWS Cloud Provider code

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Structural vs nominal types. Discuss. 😛

Unless we want to generalise over abstract providers for some reason, I recommend avoiding this and sticking to composition of nominal types (the RawExtension recommendation above).

bwagner5
bwagner5 previously approved these changes Sep 16, 2021
Copy link
Contributor

@bwagner5 bwagner5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice doc!

We expand the constraints of a Provisioner from a scalar to a vector in all cases. This will apply to both `operatingSystem` and `architecture` enabling the operator to specify greater flexibility. This supports use cases such as heterogenous architectures within a single provisioner, selected dynamically at runtime by the cloud provider. More importantly, this change creates a consistent and predictable semantic for all vendor neutral constraints. Similar to other constraints, the semantic of this change allows cloud providers to choose any value in the constraint slice, e.g. prioritizing arm64 for cost reasons.
### Example
```yaml
spec:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would these be expanded? We'd need to inspect the container manifest's support for the architecture and operating system before we could expand the constraints which is possible.

Architecture could be considered optimistic where Karpenter assumes that a container for both amd64 and arm64 exist. But that same strategy couldn't be used for OS, I don't think (at least from a practical use-case).

Copy link
Contributor

@bwagner5 bwagner5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, nice doc

@ellistarn ellistarn merged commit a8a5ab4 into aws:main Sep 16, 2021
@ellistarn ellistarn deleted the api branch September 16, 2021 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants