Feature Request: Allow openapi references in CRD validation schema #54579

Closed
mtaufen opened this Issue Oct 25, 2017 · 38 comments

Comments

@mtaufen (Contributor) commented Oct 25, 2017

From an offline chat with @mbohlool: the CRD validation schema doesn't currently support references. You can work around this by specifying all of the types inline, but that's extraordinarily verbose (and prone to typos).
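For context, a minimal sketch of what the inline workaround looks like with the apiextensions v1beta1 Go types; the resource and field names here are made up for illustration:

```go
package crdschema

import (
	apiextensionsv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
)

// inlineValidation shows the current workaround: every nested type is spelled
// out by hand because the schema cannot reference an existing definition such
// as io.k8s.api.core.v1.PodTemplateSpec.
func inlineValidation() *apiextensionsv1beta1.CustomResourceValidation {
	return &apiextensionsv1beta1.CustomResourceValidation{
		OpenAPIV3Schema: &apiextensionsv1beta1.JSONSchemaProps{
			Type: "object",
			Properties: map[string]apiextensionsv1beta1.JSONSchemaProps{
				"spec": {
					Type:     "object",
					Required: []string{"replicas"},
					Properties: map[string]apiextensionsv1beta1.JSONSchemaProps{
						"replicas": {Type: "integer"},
						// A real PodTemplateSpec would need hundreds of
						// nested properties written out here by hand.
						"template": {Type: "object"},
					},
				},
			},
		},
	}
}
```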

@nikhita (Member) commented Oct 26, 2017

/area custom-resources
/cc @sttts @deads2k @enisoc @colemickens

@sttts (Contributor) commented Oct 26, 2017

We were chatting about this idea before. Basically, we need a reference namespace for our API groups, something like apps.k8s.io/v1.StatefulSetSpec (whatever the actual syntax ends up being).

One note about this: the apiextensions apiserver is not coupled to kube-apiserver right now and shouldn't be in the future. How would the apiextensions apiserver know about the specs of types from other groups? The aggregator is the only one that knows how to resolve such a reference correctly, or at least it knows which downstream server to ask.

In other words, getting the information flow right is not so trivial.
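Purely for illustration, a hypothetical fragment using such a reference namespace; neither this reference value format nor $ref resolution is supported by the apiextensions apiserver today, and the property name is invented:

```go
package crdschema

import (
	apiextensionsv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
)

// referenceSchema sketches the proposed syntax. Ref serializes to "$ref",
// but nothing resolves this value today, so this is illustrative only.
func referenceSchema() apiextensionsv1beta1.JSONSchemaProps {
	statefulSetSpecRef := "apps.k8s.io/v1.StatefulSetSpec" // hypothetical reference namespace
	return apiextensionsv1beta1.JSONSchemaProps{
		Type: "object",
		Properties: map[string]apiextensionsv1beta1.JSONSchemaProps{
			"workload": {Ref: &statefulSetSpecRef},
		},
	}
}
```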

@enisoc (Member) commented Oct 26, 2017

Could we have apiextensions-apiserver download OpenAPI from kube-apiserver, like kubectl does?

@sttts (Contributor) commented Oct 26, 2017

Could we have apiextensions-apiserver download OpenAPI from kube-apiserver, like kubectl does?

Yes, something like that, which would loosely couple the two.
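For reference, this is roughly how a client can pull the aggregated OpenAPI document today via client-go's discovery client; a minimal sketch only, since actually wiring this into the apiextensions-apiserver would be the real design work:

```go
package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load a kubeconfig the way command-line clients do; a server-side
	// component could use in-cluster config instead.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	dc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		panic(err)
	}
	// OpenAPISchema fetches the aggregated OpenAPI v2 document, the same
	// data kubectl downloads for client-side validation.
	doc, err := dc.OpenAPISchema()
	if err != nil {
		panic(err)
	}
	fmt.Println("definitions available:", len(doc.GetDefinitions().GetAdditionalProperties()))
}
```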

@lavalamp (Member) commented Oct 27, 2017

I am concerned about giving ourselves really hard problems when we do version updates that drop a previously deprecated field.

@sttts (Contributor) commented Oct 27, 2017

I am concerned about giving ourselves really hard problems when we do version updates that drop a previously deprecated field.

We cannot express in our CRD JSON Schema that certain fields do not exist (following the kube API conventions). So a CR cannot become invalid because a deprecated field was dropped. But as we store raw JSON for CRs, those removed fields would continue to exist in old CRs.

@lavalamp (Member) commented Oct 27, 2017

A CRD embeds v1.Pod. Time passes and we deprecate v1.Pod in favor of v2.Pod. Much later, we remove v1.Pod. Suddenly the entire system loses the ability to interpret the CRD that embeds the v1.Pod object.

@sttts (Contributor) commented Oct 27, 2017

At least it won't validate it anymore. Deletion is not a problem.

@sttts (Contributor) commented Oct 27, 2017

Which also means that we need a migration early enough; more precisely, the CRD author has to take care of this. The same holds for native types: if you miss the point in time to migrate, the objects are lost.

@sdminonne (Contributor) commented Oct 30, 2017

@lavalamp, personally I agree with @sttts. My understanding is that a coupling problem does exist, but it's entirely on the CRD creation/handling side: a CR/CRD is explicitly created with a reference to a native object, and if that object no longer exists, it's a user error.

@nikhita thanks
@clamoriniere1A FYI

@lavalamp (Member) commented Nov 7, 2017

It's not about new objects, I agree those will be fine. It is about stored objects.

@sdminonne (Contributor) commented Nov 8, 2017

OK, I didn't get it. Thanks for the clarification.

@nikhita (Member) commented Jan 17, 2018

/kind feature

@erictune (Member) commented Mar 1, 2018

We should not support this.

There should not be references to schemas that cross two separately released components (the core and the software that has the CRD). This will result in unexpected addition or deletion (think rollback of server version) of fields in the CRD schema. It also makes it hard for the controller to determine if objects it previously created are what it intended to create, since they change unexpectedly.

Most CRDs I've seen do not embed a pod spec; they only have select fields in them.
When they do embed a pod spec, it is usually optional and not the common case.
I think it is better to give up validation in uncommon cases than to couple unrelated systems.

@markmandel (Contributor) commented Mar 1, 2018

Most CRDs I've seen do not typically embed a pod spec.

Apologies, but I'm working on a project right now that embeds a PodTemplateSpec. In my experience this is not that uncommon an occurrence. Sounds like something we might need more data on.

Not saying that refs are the absolute best answer, but from my perspective there are definitely projects that will need this type of functionality.

@enisoc (Member) commented Mar 3, 2018

Just to add another aspect to consider:

So far the only proposal for supporting server-side apply (and/or strategic merge patch) on resources served through CRD is to put the necessary schema info in OpenAPI. Without the ability to directly pull in the schema for a PodTemplateSpec, you would have to recursively copy all of those schemas into your own if you want server-side apply to work correctly for your resource.

Maybe we could consider some tooling to expand references inline, which would then be checked in with the CRD manifest. That wouldn't be my ideal user experience, although it would get a little better if we at least had the ability to break down a schema into definitions that can be referenced from within the same schema.
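For illustration, a sketch of what same-schema definitions could look like with the existing JSONSchemaProps fields; the apiextensions apiserver does not resolve these internal references today, and the property names are invented:

```go
package crdschema

import (
	apiextensionsv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
)

// sharedDefinitionSchema writes a fragment once under Definitions and
// references it from several properties within the same schema.
func sharedDefinitionSchema() apiextensionsv1beta1.JSONSchemaProps {
	podTemplateRef := "#/definitions/podTemplate" // internal $ref, not resolved today
	return apiextensionsv1beta1.JSONSchemaProps{
		Type: "object",
		Definitions: apiextensionsv1beta1.JSONSchemaDefinitions{
			"podTemplate": {Type: "object"},
		},
		Properties: map[string]apiextensionsv1beta1.JSONSchemaProps{
			"primary": {Ref: &podTemplateRef},
			"canary":  {Ref: &podTemplateRef},
		},
	}
}
```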

@ant31 (Member) commented Mar 5, 2018

Maybe we could consider some tooling to expand references inline

In case it helps:

I'm a maintainer of prometheus-operator; its CRDs use many of the Kubernetes API specifications.
To add CRD validation, we developed a small OpenAPI generator that inlines all references.
Output example: https://github.com/coreos/prometheus-operator/blob/master/example/prometheus-operator-crd/prometheus.crd.yaml

To use it:

crd.Spec.Validation = GetCustomResourceValidation(config.SpecDefinitionName, config.GetOpenAPIDefinitions)

That wouldn't be my ideal user experience

It's not a terrible one

@enisoc (Member) commented Mar 5, 2018

@ant31 That's awesome!

+cc @pwittrock Have you considered this approach for kube-builder?

@G-Harmon commented Mar 14, 2018

cc me. I saw this listed as a potential GSoC project on https://github.com/cncf/soc.

@Bubblemelon commented Mar 20, 2018

Hi! I'm wondering how I could start helping with this issue. Any tips on getting started or advice would be much appreciated! Thanks in advance 😄

@liggitt (Member) commented Mar 21, 2018

I agree with Eric that we should not do this. Another wrinkle in allowing references is the security aspect: reference expansion is a significant source of confused-deputy issues, where a reference gets resolved as a URL fetch or file load by a component with access to different networks or file systems than you.

@ant31 (Member) commented Mar 21, 2018

Most CRDs I've seen do not typically embed a pod spec. They only have select fields in them.

I tend to disagree; most operators I've seen use some part of the API. Depending on the operator, some configuration is exposed to the user, and most of the time it directly uses a Kubernetes API spec.

@liggitt, what about inlining all references in the CRD spec, as proposed by @enisoc?
We can provide tooling to help users statically generate such a spec (without references), similar to https://github.com/kubernetes/kube-openapi.
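Loosely what such tooling could do, sketched against kube-openapi's generated definitions; how you obtain the generated GetOpenAPIDefinitions function and the top-level definition name you pass in are project-specific assumptions:

```go
package inliner

import (
	"github.com/go-openapi/spec"
	"k8s.io/kube-openapi/pkg/common"
)

// BuildDefinitionSet calls a kube-openapi-generated GetOpenAPIDefinitions
// function and returns the transitive closure of definitions needed to turn
// the given top-level type into one self-contained, reference-free schema.
func BuildDefinitionSet(top string, getDefs func(common.ReferenceCallback) map[string]common.OpenAPIDefinition) map[string]common.OpenAPIDefinition {
	defs := getDefs(func(path string) spec.Ref {
		return spec.MustCreateRef("#/definitions/" + path)
	})

	out := map[string]common.OpenAPIDefinition{}
	var walk func(string)
	walk = func(name string) {
		if _, seen := out[name]; seen {
			return
		}
		def, ok := defs[name]
		if !ok {
			return // dependency not generated; a real inliner would report this
		}
		out[name] = def
		for _, dep := range def.Dependencies {
			walk(dep)
		}
	}
	walk(top)
	return out
}
```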

@sttts (Contributor) commented Mar 21, 2018

We can provide tooling to help users to statically generate such spec

If we don't add dynamic reference support, this is the way out. There are a couple of problems, though:

  • those references will be resolved at development time. The schema might not fit the cluster the user is running.
  • the schemata in /swagger.json are incomplete. They are purely about the structure of the Go types, not about syntactic restrictions on fields or cross-field restrictions (implemented in validation code in the apiserver today).

@lavalamp (Member) commented Mar 23, 2018

I don't think we can take this up until we have a universal stored-object-upgrade solution (#52185). If we had that, then CRD objects with references could automatically be upgraded.

It might still be a bad idea for other reasons (per @liggitt's comment), but in particular I think this is not a good idea if changes to built-in resources are going to have unintuitive implications for CRD authors.

@lavalamp (Member) commented Mar 23, 2018

To be extra clear: @erictune, @liggitt, and I are all against working on this right now. Please, no one work on this right now--I don't want to have to reject your PRs! :)

Instead, work on the blocker I mentioned ;)

@quinton-hoole (Member) commented Mar 26, 2018

@kubernetes/sig-multicluster-misc
Just to chime in that Cluster Federation v2 has a strong need to validate templates based on Kubernetes resource types. All of our federated types include embedded templates of Kubernetes types, and we would benefit enormously from being able to reliably validate these at creation/update time. In the absence of such early validation, we force our users to deal with asynchronous validation failures when the Kubernetes resources are propagated to the clusters, which is significantly harder. Happy to follow the consensus view here, and/or provide real use-case examples/requirements if that's helpful.

@fejta-bot commented Jun 24, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@nikhita (Member) commented Jun 24, 2018

/remove-lifecycle stale

@fejta-bot commented Sep 22, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@nikhita (Member) commented Sep 23, 2018

/remove-lifecycle stale

@fejta-bot commented Dec 22, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@fejta-bot commented Jan 21, 2019

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@fejta-bot commented Feb 20, 2019

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot (Contributor) commented Feb 20, 2019

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
