KEP-5000: API Linting and CRD Schema Change Validation Tooling #5102
Conversation
everettraven commented Jan 29, 2025
- One-line PR description: Adding new KEP for API Linting and CRD Schema Change Validation Tooling
- Issue link: API linting and CRD schema change validation tooling #5000
- Other comments:
Signed-off-by: Bryce Palmer <bpalmer@redhat.com>
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
<!-- /toc -->

## Release Signoff Checklist
Is there any guidance on what we should do when there is no release related to the KEP?
Leaving this as an open question for folks more familiar with the KEP process as I wasn't able to find any guidance on this.
[documentation style guide]: https://github.com/kubernetes/community/blob/master/contributors/guide/style-guide.md
-->
This proposal introduces two new sig-api-machinery subprojects to improve the experience of developing Kubernetes APIs (including CRDs). `kube-api-linter` (aka `kal`) is a golangci-lint plugin for evaluating Go types that are used to generate CRDs to ensure they follow the Kubernetes API conventions and best practices. `crdify` is a CLI that compares CRD YAML to identify changes in CRD schemas, ensuring that any changes are compatible.
Is there someone who needs to sign off on names of projects?
I wonder if we want to use a name for the schema upgrade checker that says more about what it's doing? `crd-schema-upgrade-checker`?
I figured naming could be discussed as part of this KEP. I'm open to suggestions on better naming
Flesh out details around KAL
Signed-off-by: Bryce Palmer <bpalmer@redhat.com>
Signed-off-by: Bryce Palmer <bpalmer@redhat.com>
| Name | Scope | Description | Implemented | Carry |
| ---- | ----- | ----------- | ----------- | ----- |
| enum | Property | Validates compatibility of changes to enum constraints on a property. Net new enum constraints, adding and removing enum values are flagged. | Yes | Yes |
Is there an option to tailor this? If we had a use case where we were confident that adding a new enum value would be sufficiently handled such that this was safe, how would we make an exception?
There are configuration options implemented today for most of the validations here. See: https://everettraven.github.io/crd-diff/#/validations?id=enum
I didn't include configuration information in this table though, because I'm not quite happy with the existing configuration method and called this out explicitly as needing some further considerations: https://github.com/kubernetes/enhancements/pull/5102/files#diff-41e4d1a3fd8006a9e40e6415cede33325066400913bdaa49808839037eb2fecaR271
If it makes sense to add at least the configuration options that we anticipate having in some form, I can run back through and add them.
Instead of exploding the validations table size with all the possible configuration options, I added a link to the bottom of this section to point to the documentation site for crd-diff in e69b3ed
| minitems | Property | Validates compatibility of changes to minitems constraints on a property. Net new minitems constraints or an increase in minitems value are flagged | Yes | Yes |
| minlength | Property | Validates compatibility of changes to minlength constraints on a property. Net new minlength constraints or an increase in minlength value are flagged | Yes | Yes |
| minproperties | Property | Validates compatibility of changes to minproperties constraints on a property. Net new minproperties constraints or an increase in minproperties value are flagged | Yes | Yes |
| required | Property | Validates compatibility of changes to required constraints on a property. Adding new required properties are flagged | Yes | Yes |
Do current implementations handle the case where a new required property is added, as a member of a new optional property?
I think so, but I would have to double check.
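For concreteness, the tricky case being asked about looks something like this hypothetical schema fragment, where a brand-new optional object carries a required child:

```yaml
# Old schema: `newObj` does not exist.
# New schema (below): `newObj` is optional, but its child `mustSet` is
# required. Existing objects remain valid because `newObj` may be
# omitted entirely, yet a naive "new required property" check could
# still flag `mustSet`.
properties:
  newObj:
    type: object
    required:
      - mustSet
    properties:
      mustSet:
        type: string
```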
| type | Property | Validates compatibility of changes to type constraints on a property. Changes to property types are flagged | Yes | Yes |
| scope | CRD | Validates that scope doesn't change for the CRD | Yes | Yes |
| existingfieldremoval | CRD | Validates that existing fields are not removed from the CRD | Yes | Yes |
| storedversionremoval | CRD | Validates that stored versions are not removed from the CRD. Only valid for CRDs that are previously present on a cluster | Yes | Yes |
This is checking that a version of the schema that is also listed as a stored version is not removed from the schema, right?
Correct
| exclusiveMaximum | Property | Validates compatibility of changes to the exclusiveMaximum of a property. Net new exclusiveMaximum constraints or a decrease in exclusiveMaximum are flagged | No | Yes |
| exclusiveMinimum | Property | Validates compatibility of changes to the exclusiveMinimum of a property. Net new exclusiveMinimum constraints or an increase in exclusiveMinimum are flagged | No | Yes |
| uniqueItems | Property | Validates compatibility of changes to the uniqueItems of a property. Net new uniqueItems constraints are flagged | No | Yes |
| pattern | Property | Validates compatibility of changes to the pattern constraint of a property. Net new pattern constraints and more restrictive validations are flagged | No | Yes |
How would you determine if a more restrictive pattern is introduced?
Not something I've put much thought into, but I do understand that this could be complex and might not be possible. I figure more nuanced discussion on the exact validation logic would take place during implementation of the validation.
Happy to remove the "more restrictive validations" piece if needed.
I suspect, if we were to review the API guidelines and took them strictly to heart, we would come to the conclusion that any change, restrictive or more permissive, would be a breaking change. Maybe wait for others to weigh in on this, but I suspect when we come to implement this, the default probably wants to be "any change to the pattern", with configuration for other options.
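For illustration, deciding whether one pattern is "more restrictive" than another amounts to reasoning about containment of the languages the two regexes accept, which is hard in general. A hypothetical pair:

```yaml
old:
  pattern: "^[a-z]+$"
new:
  # More restrictive, but detecting that requires reasoning about the
  # added length bound, not just diffing the pattern strings.
  pattern: "^[a-z]{1,63}$"
```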
| ListsMustHaveSSATags | Property | Validates that lists have the `x-kubernetes-list-type` tag for server side apply | Yes | No, if desired this is better suited for a linter in `kal` |
| ConditionsMustHaveSSATags | Property | Validates that status conditions fields have the appropriate tags for server side apply | Yes | No, if desired this is better suited for a linter in `kal` |
| NoNewRequiredFields | CRD | Validates that no new required fields are added | Yes | Yes |
| MustNotExceedCostBudget | Property | Validates that `XValidations` don't exceed the CEL cost budget | Yes | No, if desired this is better suited for a linter in `kal` |
I haven't currently got this on the KAL wishlist, but it probably should be there.
With the current implementation it also gives output in the form of warnings and info-level messages about the complexity of calculations, warning of unbounded cardinality. It would be a shame to lose that, though I don't know how I'd fit it into a linter.
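For context, this check concerns CEL rules under `x-kubernetes-validations`; a hypothetical schema fragment where the cost estimate (and the unbounded-cardinality warning mentioned above) comes into play:

```yaml
# Hypothetical CRD schema fragment. Without a maxItems bound, the
# worst-case cost of the CEL rule below is unbounded, which is what
# MustNotExceedCostBudget (and the cardinality warnings mentioned
# above) would surface.
type: array
# maxItems: 100   # adding a bound caps the estimated cost
items:
  type: string
x-kubernetes-validations:
  - rule: "self.all(i, i.size() <= 63)"
    message: "items must be at most 63 characters"
```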
…s, add information on configuration Signed-off-by: Bryce Palmer <bpalmer@redhat.com>
Signed-off-by: Bryce Palmer <bpalmer@redhat.com>
```yaml
apiVersion: crdify.sigs.k8s.io/v1alpha1
kind: <validationName>Config
customField: value
```
What is the benefit to embedded yaml in yaml like this? In terms of implementing validation for these sub structures, does it make it easier?
How strict do we plan to be for the API versioning of the configuration of our tooling? Do we expect to make significant/breaking changes to the configuration over time?
> What is the benefit to embedded yaml in yaml like this? In terms of implementing validation for these sub structures, does it make it easier?
I took inspiration from the existing admission plugin configuration approach where each admission plugin has its own configuration that can be specified in an embedded YAML fashion like this. I'm not sure the original motivations for this approach with the admission plugins, but I can imagine it is beneficial to be able to version the configuration options for an individual admission plugin separately from the top-level admission plugin configuration type. I don't imagine us having much churn on the configuration options for individual checks, but having the additional flexibility to modify and version the configuration options respected by the individual CRD schema change validations may be helpful to adjust as unknown use cases present themselves.
I would definitely be curious to hear from those with more experience in maintaining the admission plugin configuration approach if there have been any major pain points or gotchas of using this approach that may need some additional consideration.
> How strict do we plan to be for the API versioning of the configuration of our tooling? Do we expect to make significant/breaking changes to the configuration over time?
I would expect that we follow the same conventions of versioning as a standard Kubernetes API. I don't know that we have any intentions of making significant/breaking changes to the configuration over time, but I think the flexibility to do so if absolutely needed will be good to have. We have a varying chunk of use cases that we are attempting to account for, but I highly doubt we have captured all that there will be in the future.
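For reference, the admission plugin pattern being referenced nests each plugin's own configuration under a top-level versioned object, roughly like the documented ImagePolicyWebhook example:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: ImagePolicyWebhook
    configuration:
      imagePolicy:
        kubeConfigFile: /etc/kubernetes/imagepolicy.kubeconfig
        allowTTL: 50
        denyTTL: 50
        retryBackoff: 500
        defaultAllow: true
```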
non-trivial yaml inside of an opaque string inside of yaml is going to have poor tooling support ...
I also don't think validation tools should be making breaking changes in general, it will discourage usage, but if we have to, we have the top level version.
golangci-lint for example tends to make new linters opt-in to encourage upgrading to the latest version
Also worth pointing out: the component config approach hasn't been super great for stability so far: not a single component has GA config, and as a result we hold ourselves to alpha/beta rules and ... ship breaking config changes that we probably wouldn't have been emboldened to make if we hadn't added a GVK to them.
IMHO: linters should have additive opt-in/out config (add new options over time, do not break old options) and avoid breaking changes more generally; otherwise the tradeoff leans towards maintaining usage of the linter being more effort than hand-reviewing.
There's a middle ground where you use `map[string]interface{}`, right? The yaml can be nested inline without making the inner yaml stringly-typed. Support for completions, formatting-error checks etc. would still be worse than with a fully typed struct but perhaps marginally better than stringly.
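A minimal sketch of that middle ground, assuming hypothetical type names: the outer file stays fully structured while each validation's config block is decoded in a second step.

```go
package config

import (
	"sigs.k8s.io/yaml"
)

// ValidationConfig pairs a validation's enforcement level with its
// (schemaless) configuration block.
type ValidationConfig struct {
	Name        string `json:"name"`
	Enforcement string `json:"enforcement"`
	// Config stays loosely typed at the top level; each validation
	// decodes it into its own typed struct in a second step.
	Config map[string]interface{} `json:"config,omitempty"`
}

// Configuration is the hypothetical top-level, GVK-versioned object.
type Configuration struct {
	APIVersion  string             `json:"apiVersion"`
	Kind        string             `json:"kind"`
	Validations []ValidationConfig `json:"validations"`
}

// Load unmarshals the whole file in one pass; only the per-validation
// Config blocks require a second, per-validation decode.
func Load(data []byte) (*Configuration, error) {
	var cfg Configuration
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}
```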
Thanks for sharing your thoughts and insights! I do agree that we should avoid breaking changes as much as possible.
`map[string]interface{}` could work. If we want to get rid of the inner yaml altogether we could make explicit typed fields for each validation, maybe something like:
```yaml
apiVersion: crdify.sigs.k8s.io/v1alpha1
kind: ValidationConfiguration
validations:
  foo:
    enforcement: Error || Warn || None
    someConfigField: someConfigValue
```
That gets rid of the inner YAML and maintains fully typed structs.
I'm also open to discussing further:
- Removing the GVK for the config file. This would remove the ability to version the configuration itself, but maybe that is for the best? Any breaking configuration changes would warrant a new major release of the tooling so versioning is still possible if absolutely necessary.
- Keeping the GVK and the v1alpha1 versioning at the start
- Keeping the GVK and starting the versioning of the config file at v1 immediately to discourage breaking config changes
What degree of consistency with configuration approaches are generally expected for sig sponsored sub-projects? Would it be a better UX to stay consistent with other component configurations and add the GVK?
> Removing the GVK for the config file. This would remove the ability to version the configuration itself, but maybe that is for the best? Any breaking configuration changes would warrant a new major release of the tooling so versioning is still possible if absolutely necessary.
I wouldn't suggest that, I would have a version for the top-level config object for sure, and GVK is a reasonable way to do that. It leaves the option to do a major revision with conversion later if we really have to
I think nested versions / objects sounds confusing and verbose for users though.
I think it is reasonable to keep the GVK for the top-level config object.
From the discussion so far, I am leaning towards two options (open to further suggestions as well):
Option 1:
```yaml
apiVersion: crdify.sigs.k8s.io/v1alpha1 # Could also be v1 if preferred
kind: ValidationConfiguration
validations:
  foo:
    enforcement: Error || Warn || None
    someConfigField: someConfigValue
```
Option 2:
```yaml
apiVersion: crdify.sigs.k8s.io/v1alpha1 # Could also be v1 if preferred
kind: ValidationConfiguration
validations:
  - name: foo
    enforcement: Error || Warn || None
    config:
      someConfigField: someConfigValue
```
Are there any strong preferences for one approach over the other?
I am leaning towards option 2 with the idea that a looser structure here might make it easier to implement new validations in the future because they don't have to modify the basic configuration file structure. It might also force us to be careful with configuration changes because each change to the configuration format impacts every validation rather than a single one.
If you go down the option 2 route, that means that `config` is basically a raw blob of yaml that can have any structure? I suspect that will make validation of the configuration harder, not easier.
Option 1 gives you a concrete type and makes unmarshalling the data a one-shot operation. Any unrecognised fields would either be dropped, or an error reported. I don't see how that would be possible in option 2 without lots of manual validation code to first check the name and then unmarshal the config from the raw data?
For option 2, my thinking is to still take an approach similar to that of the admission plugins in kubernetes, without the yaml-in-yaml approach. This gets into the implementation details, but essentially each validation would register a function to initialize the validation with a provided configuration. It would be up to each validation to have logic to validate the provided configuration. If initializing a validation fails it returns an error and the process will exit, returning that error to the user.
^ is definitely not a one-shot evaluation, but I anticipate the user experience would be the same.
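A rough sketch of the registration approach described above, with hypothetical names (not the actual crdify API):

```go
package validations

import "fmt"

// Validation is the interface every schema-change check implements;
// comparison details are elided in this sketch.
type Validation interface {
	Name() string
}

// Factory initializes a validation from its raw config block. Each
// factory is responsible for validating its own configuration.
type Factory func(config map[string]interface{}) (Validation, error)

var registry = map[string]Factory{}

// Register is called by each validation (e.g. from an init function).
func Register(name string, f Factory) {
	registry[name] = f
}

// Initialize looks up and constructs a named validation; an error here
// (unknown name or bad config) aborts the run and is surfaced to the user.
func Initialize(name string, config map[string]interface{}) (Validation, error) {
	f, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("unknown validation %q", name)
	}
	return f(config)
}
```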
reviewers:
  - TBD
approvers:
  - TBD
Thanks!
Went ahead and updated this to include them both in the reviewers and approvers list.
Signed-off-by: Bryce Palmer <bpalmer@redhat.com>
This feels like two KEPs for the price of one? :-)
Is there a reason both tools are one KEP?
(also, the repo creation process does not require a KEP FWIW, though it makes sense to capture design discussion somewhere https://github.com/kubernetes/community/blob/master/github-management/kubernetes-repositories.md#sig-repositories)
It was requested during the sig-api-machinery meeting when these tools were proposed that a small KEP be created. There didn't seem to be a preference in how, but IIRC it was mentioned that both tools could be included in a single KEP. I'm not strongly opinionated one way or another and am happy to make whatever updates folks think make sense, including separating each tool into its own KEP.
I love that this is being worked on! Added some feedback. I'm a big supporter of this.
This proposal introduces two new sig-api-machinery subprojects to improve the experience of developing Kubernetes APIs (including CRDs).

A linter, `kube-api-linter` (aka `kal`) as a golangci-lint plugin for evaluating Go types that are used to generate both built-in types and CRDs, to ensure they follow the Kubernetes API conventions and best practices.
I believe "kal" has a vulgar meaning in at least one language. Consider avoiding the non-capitalized acronym, or avoiding the acronym entirely.
(Someone taught me to check for this a few years ago and it's surprising how often it comes up).
Ahhhhhhhh, yes, I've just found an example.
Will make sure to name the repo/go module fully as `kube-api-linter` when we make the transfer of the code from where it is now. I don't really know what else we would call this (unless we go for a nautically themed name that doesn't tell me it's a linter), so I'm not sure we can avoid KAL altogether.
While we can make an effort to not abbreviate it, others will, I suspect.
Good catch! I think as long as we make an explicit effort to ensure we only ever call it `kube-api-linter` it should be fine?
I'll update this shortly to remove the acronym.
### Goals

- Create tooling to help developers writing Kubernetes APIs in Go to follow best practices and Kubernetes API conventions
This is excellent.
Having a "delegate" from @kubernetes/api-reviewers might be valuable here. (Any volunteers to do a pass and provide guidance feedback @liggitt @msau42 @deads2k @thockin)?
cc @yongruilin, @aaron-prindle who have been working on in-tree validation linting for Declarative Validation
I'd be keen to catch up with @yongruilin and @aaron-prindle about their progress. So far I've been focussing the linters as much as possible on those that don't need markers for validation, or where kube markers are also supported for CRDs.
There are some examples now though (`maxlength`) where it checks for a maximal string/list length, which currently only supports CRDs. I'd hope in the future to be able to extend that linter to cover built-in types too. Understanding the goals of the in-tree validation would help me understand how to drive KAL towards being useful for both built-ins and CRDs.
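As a hypothetical illustration of the kind of type the `maxlength` linter inspects (the marker shown is the standard kubebuilder one; the type is invented):

```go
package v1alpha1

// WidgetSpec is a hypothetical CRD type of the kind the linter inspects.
type WidgetSpec struct {
	// Without the MaxLength marker below, the generated CRD schema
	// would allow unbounded strings, which the maxlength linter flags.
	// +kubebuilder:validation:MaxLength=253
	// +optional
	DisplayName string `json:"displayName,omitempty"`
}
```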
+1 on the delegate from the kubernetes/api-reviewers group. I've got some text in the "Risks and Mitigation" section below that calls out the need for a robust review process with "stakeholders". I'll add an entry to that list.
## Proposal

This KEP introduces two new sig-api-machinery subprojects.
I support creating these projects.
### Go-based Kubernetes API Linting

A linter, built as a `golangci-lint` plugin, that evaluates Go type definitions (typically `_types.go`), and flags deviations from Kubernetes API conventions and best practices.
Are there any plans for setting up communication channels or regular meetings?
I think it would be valuable for @yongruilin to attend (if he agrees).
I hadn't considered a need before beyond the sig-apimachinery calls, but, now you mention it, I'm totally happy.
I'm not sure what the best practices are for other similar projects, something along the lines of:
- Slack channel
- Mailing List
- Bi-weekly meeting
Would that be about right?
I'm in the same boat as Joel, having not thought of needing more than the existing sig-api-machinery channels, but I can see value in this having separate communication channels/meetings.
Maybe a slack channel and a bi-weekly meeting to start? Maybe we could take advantage of the existing sig-api-machinery mailing list to announce larger milestones until a need arises for it to have a separate mailing list?
- Slack channel
- Mailing List
- Bi-weekly meeting
Yes, this is totally sufficient.
The community can provision additional channels, so just ask (myself or contributor experience or whomever).
The main SIG meetings are busy so having your own meeting is best for collaboration. Frequency is totally up to the participants, but bi-weekly works well for many groups.
Mailing list: maybe not as essential, but totally fine to have.
If, in the future, this group develops a sufficiently large charter or contributor group, please also feel free to propose the formation of a formal working group or subproject. (In the short term, getting some repos and getting actual coding done is probably the priority, but something to remember as the project progresses)
I can look into a slack channel, something like `#sig-apimachinery-dev-tools`?
### CRD Schema Change Validation

Create a CLI tool that evaluates the differences between an old and new CRD in YAML to identify changes that may break users.
nit: JSON too? Or maybe just drop the serialization choice here in KEP and make sure the implementation supports everything needed.
- Comparing CRD across git revisions (could be used in CI systems) - `crdify git://main?path=mycrd.yaml git://HEAD?path=mycrd.yaml`
- Comparing CRD from git revision to a local file - `crdify git://main?path=mycrd.yaml file://mycrd.yaml`

The schema change validation will focus on changes that may break users, and changes that can be detected by comparing the CRD schemas in YAML.
So this tool focused on backward compatibility and the linter focuses on conventions/best practices?
Yeah that's correct. crdify will have a view of what's changed over time so is better suited to picking up things like "you shouldn't be increasing the minimum value".
We did actually start with our version of this tool doing both, but the feedback loop for developers was long. They would typically not run the schema diff check until they created a PR, and of course, have to regenerate the CRDs to be able to even run this.
The linter however is much more lightweight, they don't have to generate the CRDs and it can even be integrated into IDEs to give users very quick feedback on the changes they've made.
I had suggested in the initial pass that we try and make them focused on the two separate "phases" of development, where each is best suited to check specific things.
### Why do we need both?

When authoring types for Kubernetes, we typically use Go types, and add "markers" (e.g. `+optional`) to provide information about how the fields should behave.
Just a heads up that https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/5073-declarative-validation-with-validation-gen/README.md is being implemented and aims to significantly increase the usage of markers for validation of built-in types.
I would expect the ideas being proposed in this KEP to have broader applications as more declarative validation tags are added to built-in types, since the amount of machine accessible information about built-in types will increase.
I hadn't seen the proposal from that KEP before, I'll give that a read as this is exactly the kind of thing I think KAL will want to be supporting in the future.
To implement some of the desired checks, for example ensuring that a required field is not a pointer, or that all fields are marked either `+optional` or `+required`, we need to evaluate the Go types themselves.

On the other hand, typical linter implementations do not provide the ability to compare two versions of a file, and identify changes that may break users.
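To make the checks named above concrete, a hypothetical type (invented for illustration, using the `+optional`/`+required` markers the text mentions):

```go
package v1alpha1

// GadgetSpec is a hypothetical type illustrating convention checks that
// only the Go source can answer: marker presence and pointer-ness.
type GadgetSpec struct {
	// Required fields should not be pointers; the field must always be
	// set, so there is no "unset" state to represent.
	// +required
	Replicas int32 `json:"replicas"`

	// Optional fields are typically pointers with omitempty, so that
	// "unset" can be distinguished from the zero value (false).
	// +optional
	Paused *bool `json:"paused,omitempty"`
}
```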
So the capability grid is:

| | linter | backward compatibility checker |
| --- | --- | --- |
| Go types | kube-api-linter | unsupported |
| OpenAPIv3 | unsupported | crdify |
I'll admit I can imagine both the unsupported cells in the grid being useful. Being able to lint OpenAPIv3 of CRDs in particular seems super handy.
WRT linting the openapi schema, what kind of use case did you have in mind? If we lint conventions at the Go level, then the authors of the API can be certain that their types meet conventions (or the configured subset of them).
If we then also say, let's lint conventions at the openapi level, well for CRD authors, the Go level is easier so they'd probably just use that. So is this level then for cluster admins to lint the CRDs they're installing? Can they effect a change to the CRD if they find a violation? I'd expect probably not?
For the change validation inspecting Go types, that is something we could possibly look into. It's not uncommon for linters to look at git history and determine "what's new", so in theory yes, we could look into adding "breaking change validation" to the linter as well.
Do you think it's a significant imposition to people to suggest using two tools for the different stages as proposed here?
For linting the openapi schema, I could see the value to those that create kubernetes API extensions outside of the Go programming language. How often this is done is unclear to me as I haven't taken the time to survey the non-Go based operators and aggregated apiservers (not sure if there are non-Go libraries to simplify this). Having a linter that can catch common things via the openapi schema could be helpful in a broader set of use cases.
I, personally, wouldn't mind adding that to the `crdify` tool via a subcommand like `crdify lint`, and the compatibility checks could be under something like `crdify compare`.
That being said, for those developing in Go I would generally expect the recommendation to be:
- use `kube-api-linter` for conventions/best practices of your Go types
- use `crdify` for comparing compatibility of changes to your CRDs

Someone could certainly try to use `kube-api-linter` and `crdify lint` together, but we would probably want to make sure we have a tight coupling of best practice evaluations in both tools. If `kube-api-linter` results in a user making a change, then `crdify lint` should not contradict that change (and vice versa).
> then crdify lint should not contradict that change (and vice versa).
We can/should set up testing to ensure that
Is this additional linting behavior for `crdify` something we want to include in the scope of this enhancement?
> for CRD authors, the Go level is easier so they'd probably just use that. So is this level then for cluster admins to lint the CRDs they're installing? Can they effect a change to the CRD if they find a violation? I'd expect probably not?
I do agree that IF each tool needs to pick a single input format, we've picked the best options.
But I can't help but notice that this project is going to need to support both input formats, which makes me wonder if it would be worthwhile to somehow decouple the input processing from the rest of each tool such that both tools can accept both input formats.
Re: CRD authors that write OpenAPI instead of Go types. I'd love to see data. I suspect there is a community here, although I might be wrong.
Re: Schema backward compatibility checking for Go types - could be useful in-tree?
Oh also, please treat my feedback here as non-blocking. I'm just sharing thoughts.
The way these two tools are written today, I don't think decoupling the input formats is possible. To support both we would need to rewrite validations in a lot of cases. (One is looking at Go AST using the go analysis constructs, the other walks the openapi schema)
I have a feeling that trying to decouple (by which I'm interpreting as convert into some intermediate syntax) so that analysis logic can be shared, is likely to be quite hard, and make two projects that are otherwise fairly simple, quite complex.
There are also many examples where they wouldn't make sense (or even be possible) if you crossed the input format.
In particular, there are some linters where we are looking specifically at the existence of markers, or json tags, or an optional field being a pointer with omitempty. None of those would make any sense/be possible for the schema format.
Right now, I'd rather focus on what's documented here and keep this idea on the back burner as a nice to have in the future
Signed-off-by: Bryce Palmer <bpalmer@redhat.com>
…s, config format updates Signed-off-by: Bryce Palmer <bpalmer@redhat.com>