Feature Request: Allow openapi references in CRD validation schema #54579

mtaufen · 2017-10-25T17:48:27Z

Offline chat with @mbohlool: CRD validation schema doesn't currently support references. You can get around this by specifying all of the types inline, but it's extraordinarily verbose (and prone to typos).

nikhita · 2017-10-26T09:51:38Z

/area custom-resources
/cc @sttts @deads2k @enisoc @colemickens

sttts · 2017-10-26T10:08:44Z

We were chatting about this idea before. Basically, we need a reference namespace for our API groups, something like apps.k8s.io/v1.StateFulSetSpec (whatever the actual syntax is).

One note about this: the apiextensions apiserver is not coupled to kube-apiserver right now and shouldn't be in the future. How does apiextensionsapiserver know about the specs of the types of other groups? The aggregrator is the only one who knows how to resolve such a reference correctly or at least it know which downstream server to ask.

In other words, getting the information flow right is not so trivial.

mtaufen · 2017-10-26T13:31:43Z

It should at least be possible for CRDs that are registered with the kube-apiserver, right? Or am I misunderstanding how CRD works?

…

On Thu, Oct 26, 2017 at 3:09 AM, Dr. Stefan Schimanski < ***@***.***> wrote: We were chatting about this idea before. Basically, we need a reference namespace for our API groups, something like apps.k8s.io/v1.StateFulSetSpec (whatever the actual syntax is). One note about this: the apiextensions apiserver is not coupled to kube-apiserver right now and shouldn't be in the future. How does apiextensionsapiserver know about the specs of the types of other groups? The aggregrator is the only one who knows how to resolve such a reference correctly or at least it know which downstream server to ask. In other words, getting the information flow right is not so trivial. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#54579 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AA3JwTYEnOBHjCR7QLFQqGm7hK4z-zMLks5swFpZgaJpZM4QGZfO> .

-- Michael Taufen Google SWE

enisoc · 2017-10-26T18:33:17Z

Could we have apiextensions-apiserver download OpenAPI from kube-apiserver, like kubectl does?

mbohlool · 2017-10-26T19:58:08Z

/cc @cheftako @mml @lavalamp

sttts · 2017-10-26T20:03:26Z

Could we have apiextensions-apiserver download OpenAPI from kube-apiserver, like kubectl does?

Yes, something like that which loosely couples those two.

lavalamp · 2017-10-27T00:15:11Z

I am concerned about giving ourselves really hard problems when we do version updates that drop a previously deprecated field.

sttts · 2017-10-27T08:05:30Z

I am concerned about giving ourselves really hard problems when we do version updates that drop a previously deprecated field.

We cannot express in our CRD JSON Schema that certain fields do not exist (following the kube API conventions). So a CR cannot become invalid because a deprecated field was dropped. But as we store raw JSON for CRs, those removed fields would continue to exist in old CRs.

lavalamp · 2017-10-27T19:06:08Z

CRD embeds v1.Pod. (time passes) we deprecate v1.Pod in favor of v2.Pod (...much later...) We remove v1.Pod. Suddenly the entire system loses the ability to interpret the CRD that embedds the v1.Pod object.

sttts · 2017-10-27T20:18:27Z

At least it won't validate it anymore. Deletion is not a problem.

sttts · 2017-10-27T20:20:02Z

Which also means that we need a migration early enough, or more precisely the CRD author has to take care of this. Also with native types: if you miss the point in time to migrate, the objects are lost.

nikhita · 2017-10-29T19:16:36Z

/cc @sdminonne

sdminonne · 2017-10-30T09:54:52Z

@lavalamp, personally I agree with @sttts, my understanding is that a coupling problem does exist but it's totally on the CRD creation/handling side. A CR/CRD is explicitly created with a reference to a native object. If the object does not exist anymore, it's a user error.

@nikhita thanks
@clamoriniere1A FYI

lavalamp · 2017-11-07T00:25:15Z

It's not about new objects, I agree those will be fine. It is about stored objects.

sdminonne · 2017-11-08T10:04:46Z

OK, I didn't get it. Thanks for the clarification.

nikhita · 2018-01-17T21:28:25Z

/kind feature

erictune · 2018-03-01T23:32:43Z

We should not support this.

There should not be references to schemas that cross two separately released components (the core and the software that has the CRD). This will result in unexpected addition or deletion (think rollback of server version) of fields in the CRD schema. It also makes it hard for the controller to determine if objects it previously created are what it intended to create, since they change unexpectedly.

Most CRDs I've seen do not typically embed a pod spec. They only have select fields in them.
When they do embed a pod spec, it is usually optional and not the common case.
I think it is better to give up validation in uncommon cases than to couple unrelated systems.

markmandel · 2018-03-01T23:45:26Z

Most CRDs I've seen do not typically embed a pod spec.

Apologies, but I'm working on a project right now that embeds a PodSpecTemplate. From my experience this is not that uncommon an occurrence. Sounds like something we might need more data on.

Not saying that refs are the absolute best answer, but from my perspective there are definitely projects that will need this type of functionality.

enisoc · 2018-03-03T00:36:23Z

Just to add another aspect to consider:

So far the only proposal for supporting server-side apply (and/or strategic merge patch) on resources served through CRD is to put the necessary schema info in OpenAPI. Without the ability to directly pull in the schema for a PodTemplateSpec, you would have to recursively copy in all those schemas into your own, if you want server-side apply to work correctly for your resource.

Maybe we could consider some tooling to expand references inline, which would then be checked in with the CRD manifest. That wouldn't be my ideal user experience, although it would get a little better if we at least had the ability to break down a schema into definitions that can be referenced from within the same schema.

ant31 · 2018-03-05T10:17:24Z

Maybe we could consider some tooling to expand references inline

In case it helps

I'm a maintainer of prometheus-operator, its CRD uses many of the Kubernetes API specifications.
To add CRD validation, we developed a small OpenAPI generator to inline all references.
output example: https://github.com/coreos/prometheus-operator/blob/master/example/prometheus-operator-crd/prometheus.crd.yaml

To use it:

statically generate openapi_spec (with references) using: https://github.com/kubernetes/kube-openapi:
Call https://github.com/ant31/crd-validation/blob/master/pkg/crdvalidation.go#L39:

crd.Spec.validation: GetCustomResourceValidation(config.SpecDefinitionName, config.GetOpenAPIDefinitions)

That wouldn't be my ideal user experience

It's not a terrible one

enisoc · 2018-03-05T21:27:48Z

@ant31 That's awesome!

+cc @pwittrock Have you considered this approach for kube-builder?

fejta-bot · 2018-12-22T08:10:24Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot · 2019-01-21T08:56:41Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot · 2019-02-20T09:15:44Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot · 2019-02-20T09:15:52Z

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

nick4fake · 2019-12-17T04:43:29Z

I am sorry for sending notification to everyone involved in this thread but I simply wanted to note another aspect: same-file referencing.
What do you think about allowing referencing at least definitions in the same file?

rjeberhard · 2020-02-27T09:08:01Z

While migrating from CRD v1beta1 to v1, this issue comes back up... We used the same technique as described above by the maintainer of the Prometheus operator to generate and inline the schema for the built-in types that we reference; however, this makes for very large schema sections. This was acceptable when all versions had to share a single, backward compatible schema, but since CRD v1 wants the schema for each version, even if they are the same, this quickly leads to a CRD that is too big to create.

Has there been any momentum on this discussion? Or, have I misunderstood that CRD v1 can still allow for a single definition of a shared schema?

Note: x-kubernetes-embedded-resource isn't useful because few of the types we need to reference are standalone resources, e.g. we want to allow customers to specify additional volumes for the pods our operator will create.

erictune · 2020-02-28T01:16:32Z

I acknowledge this is an issue and will look into it.

@rjeberhard
Can you point at an example of your types.go files that ends up being too large?

rjeberhard · 2020-03-05T12:14:02Z

Sorry for the delay. Our operator isn't written in Go, but instead in Java using the standard client. Also, I discovered that Kubernetes will accept the large CRD if I use kubectl create or replace rather than apply. Therefore, I'm not blocked. Finally, all of the standard types we reference are for the use case where a customer needs to specify additional content for the pods our operator will create, e.g. resources or additional volumes. While it would be simpler for me to be able to reference standard types, I'm going to explore if we could change to having the customer specify a pod template.

lavalamp · 2020-03-13T22:48:37Z

Serverside apply doesn't use an annotation and wouldn't hit this limit.

rajasaur · 2020-06-07T18:16:10Z

How would this work for recursive references? We have a case where a property references a type that was previously defined in the same document.

lavalamp · 2020-06-08T15:19:17Z

References to things included in the type definition already seem safe. @sttts or @jpbetz might have thoughts?

It would be nice for a user to know that an EDS resource they're trying to create is invalid. There are already several validations in the CRDs' definition, but it's still missing some, like in .spec.template.metadata. When a resource with invalid metadata (like annotations or labels where the value is an int instead of a string, which is common) is created, it will break the informer since it's no longer able to de-serialize it into the Go structs, throwing the following error: ``` E0520 14:25:07.400582 1 reflector.go:123] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:204: Failed to list *v1alpha1.ExtendedDaemonSet: v1alpha1.ExtendedDaemonSetList.Items: []v1alpha1.ExtendedDaemonSet: v1alpha1.ExtendedDaemonSet.Spec: v1alpha1.ExtendedDaemonSetSpec.Template: v1.PodTemplateSpec.ObjectMeta: v1.ObjectMeta.CreationTimestamp: Annotations: ReadString: expects " or n, but found 1, error found in #10 byte of ...|rollout":1},"creatio|..., bigger context ...|e-eu1.staging.dog","wheelofmisery/force-rollout":1},"creationTimestamp":null,"labels":{"app":"datado|... ``` To avoid that, this commit expands the validation to also check the metadata more thoroughly. The actual validation inlined here comes from K8s OpenAPI specs for a PodSpec. Currently there is no way to reference that, so unfortunately we have to inline it (see kubernetes/kubernetes#54579). Now, when a user tries to create an invalid resource, they'll be stopped with an error message: ``` $ kubectl apply -f invalid-foo-eds.yml The ExtendedDaemonSet "foo" is invalid: spec.template.metadata.annotations.rollout: Invalid value: "integer": spec.template.metadata.annotations.rollout in body must be of type string: "integer" ```

drpmma · 2021-05-12T02:51:23Z

References to things included in the type definition already seem safe. @sttts or @jpbetz might have thoughts?

Is there any update? same issue here.

james-mchugh · 2024-02-05T22:04:14Z

What's the recommended solution for this in 2024? I have a CRD that uses a podTemplateSpec. I attempted to pull the OpenAPI spec for the core API, inline all references, and then use its podTemplateSpec definition in my CRD, but that did not work. I got various validation errors when trying to apply the CRD. I've been trying to resolve the errors one-by-one, but I'm hoping there is a better way to do this.

rjeberhard · 2024-02-05T22:56:55Z

@james-mchugh, just responding to let you know that I've not found a better solution. We're using the Java client, so we generate schemas that pull in standard types by building a large CRD with schema that matches the standard types by walking the types and then accounting for special cases like IntOrString.

It's a pain...

AlexanderYastrebov · 2024-03-22T11:32:52Z

We used the same technique as described above by the maintainer of the Prometheus operator to generate and inline the schema for the built-in types that we reference; however, this makes for very large schema sections.

We trimmed descriptions to stay below annotation size limits, see zalando-incubator/stackset-controller#498

mtaufen added area/api Indicates an issue on api area. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Oct 25, 2017

mtaufen assigned mbohlool Oct 25, 2017

nikhita mentioned this issue Oct 26, 2017

Getting CRD Validation to Beta #53829

Closed

k8s-ci-robot added the area/custom-resources label Oct 26, 2017

k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 17, 2018

nikhita mentioned this issue Jan 23, 2018

Umbrella issue for CRDs moving to GA #58682

Closed

54 tasks

nikhita mentioned this issue Feb 16, 2018

CRD validation support references for core types kubernetes/apiextensions-apiserver#27

Closed

markmandel mentioned this issue Feb 22, 2018

GameServer definition validation googleforgames/agones#10

Closed

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 22, 2018

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 21, 2019

k8s-ci-robot closed this as completed Feb 20, 2019

This was referenced Apr 17, 2020

How to generate swagger from without definitions RicoSuter/NJsonSchema#642

Open

Autogenerated OpenAPI schema classes don't work as expected kubernetes-client/csharp#417

Closed

ukclivecox mentioned this issue May 22, 2020

OpenAPI Validation for PredictiveUnits limited to 5 levels SeldonIO/seldon-core#1864

Closed

juliogreff mentioned this issue Jun 11, 2020

Validate CRs .spec.template.metadata DataDog/extendeddaemonset#21

Merged

bfjelds mentioned this issue Dec 19, 2020

Consider implementing a validator for Configurations project-akri/akri#180

Closed

yuzisun mentioned this issue Jan 12, 2021

Adversarial robustness explainer kserve/kserve#1244

Merged

DmitriGekhtman mentioned this issue Jan 25, 2021

[autoscaler][kubernetes][operator] Validation for embedded K8s objects in CRD ray-project/ray#13674

Closed

cici37 mentioned this issue Aug 10, 2021

Update KEP jpbetz/enhancements#3

Merged

Feature Request: Allow openapi references in CRD validation schema #54579

Feature Request: Allow openapi references in CRD validation schema #54579

Comments

mtaufen commented Oct 25, 2017 • edited Loading

nikhita commented Oct 26, 2017

sttts commented Oct 26, 2017

mtaufen commented Oct 26, 2017 via email

enisoc commented Oct 26, 2017

mbohlool commented Oct 26, 2017 • edited Loading

sttts commented Oct 26, 2017

lavalamp commented Oct 27, 2017

sttts commented Oct 27, 2017

lavalamp commented Oct 27, 2017

sttts commented Oct 27, 2017

sttts commented Oct 27, 2017

nikhita commented Oct 29, 2017

sdminonne commented Oct 30, 2017 • edited by sttts Loading

lavalamp commented Nov 7, 2017

sdminonne commented Nov 8, 2017

nikhita commented Jan 17, 2018

erictune commented Mar 1, 2018

markmandel commented Mar 1, 2018

enisoc commented Mar 3, 2018

ant31 commented Mar 5, 2018 • edited Loading

enisoc commented Mar 5, 2018

fejta-bot commented Dec 22, 2018

fejta-bot commented Jan 21, 2019

fejta-bot commented Feb 20, 2019

k8s-ci-robot commented Feb 20, 2019

nick4fake commented Dec 17, 2019

rjeberhard commented Feb 27, 2020

erictune commented Feb 28, 2020

rjeberhard commented Mar 5, 2020

lavalamp commented Mar 13, 2020

rajasaur commented Jun 7, 2020

lavalamp commented Jun 8, 2020

drpmma commented May 12, 2021

james-mchugh commented Feb 5, 2024

rjeberhard commented Feb 5, 2024

AlexanderYastrebov commented Mar 22, 2024

mtaufen commented Oct 25, 2017 •

edited

Loading

mbohlool commented Oct 26, 2017 •

edited

Loading

sdminonne commented Oct 30, 2017 •

edited by sttts

Loading

ant31 commented Mar 5, 2018 •

edited

Loading