Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Directly convert CRD structuralSchema to smdSchema #87872

Closed
wants to merge 1 commit into from

Conversation

jennybuckley
Copy link

What type of PR is this?
/kind feature
/sig api-machinery
/wg apply
/priority important-longterm
/cc @apelisse @jpbetz @sttts

What this PR does / why we need it:
Currently we convert the validation field in CRDs into v2 openapi, then into 'proto.Models', then into an smd schema. This causes issues because CRDs are allowed to put v3 openapi in their validation fields, some of which an smd schema could express, but isn't compatible with 'proto.Models', so that information is lost.

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 6, 2020
@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. wg/apply priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/apiserver labels Feb 6, 2020
@jennybuckley jennybuckley force-pushed the crd-smd-direct branch 2 times, most recently from b57f3e8 to 0c8bbbc Compare February 6, 2020 21:27
@k8s-ci-robot k8s-ci-robot added the area/dependency Issues or PRs related to dependency changes label Feb 6, 2020
@apelisse
Copy link
Member

apelisse commented Feb 7, 2020

the verify failure seem legitimate :-)

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jennybuckley
To complete the pull request process, please assign apelisse, sttts
You can assign the PR to them by writing /assign @apelisse @sttts in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Feb 7, 2020
@jennybuckley jennybuckley force-pushed the crd-smd-direct branch 6 times, most recently from 4023fe4 to 6caad85 Compare February 10, 2020 23:34
@jennybuckley jennybuckley changed the title [WIP] Directly convert CRD structuralSchema to smdSchema Directly convert CRD structuralSchema to smdSchema Feb 11, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 11, 2020
@jennybuckley
Copy link
Author

This is mostly done but I still need to figure out a plan for testing it

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 12, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 14, 2020
@jennybuckley
Copy link
Author

rebased

@jennybuckley
Copy link
Author

/retest

Copy link
Member

@apelisse apelisse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've mostly started looking at the wiring. Haven't looked at the actual conversion. Let's talk about it in person though.

// and uses a default deduced type converter if none is provided.
func NewCRDStructuredMergeManager(typeConverter TypeConverter, objectConverter runtime.ObjectConvertor, objectDefaulter runtime.ObjectDefaulter, gv schema.GroupVersion, hub schema.GroupVersion) (_ Manager, err error) {
if typeConverter == nil {
typeConverter = internal.DeducedTypeConverter{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize that this is how we used to do it, but wouldn't it be better to have this somewhere else actually? It feels a little odd here now.

utilruntime.HandleError(fmt.Errorf("error building openapi models for %s: %v", crd.Name, err))
openAPIModels = nil
utilruntime.HandleError(fmt.Errorf("failed to building typeConverter for apply: %s: %v", crd.Name, err))
return nil, fmt.Errorf("the server could not properly serve the CR schema") // validation should avoid this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even understanding the context, I'm not sure I understand what this error means?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the error we return in many other cases in this file, and then we log a more specific reason in the apiserver logs.

Not that we couldn't also return the specific error to the client

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like something people can take an action on? "It failed because of XYZ", at least they can figure out what to do, no? Agreed that if it's a bug that requires an update/bug-fix then it's not as useful.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the validation of the crd's structural schema is done correctly, then this shouldn't ever be hit, so if it does get hit, then this would be due to a bug that requires an update/bug-fix

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't meant to catch structural schema errors, it's meant to catch errors where, the structural schema is valid and it still fails to convert

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s := &structuralschema.Structural{}
if structuralSchema, ok := structuralSchemas[v]; ok {
if len(structuralschema.ValidateStructural(nil, structuralSchema)) == 0 {
s = structuralSchema
}
}
err = builder.AddStructural(v, s)

Copy link
Member

@liggitt liggitt Mar 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than block all get/put/post/delete requests to the custom resource on this, I'd rather see us attempt this same operation in the CRD controller and add an error condition to the CRD if it fails, to surface the error to the CRD author (similar to what we do for structural errors)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against doing that for safety, but we don't expect this error to be able to be possibly triggered by users, since we ensure that only valid structural schemas are passed into the builder.AddStructural, which should only return an error if an invalid structural schema was passed in.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user sends the apiserver a CRD with a schema that fails structural validation, then we won't even try to pass it into the builder.AddStructural function (and we pass in an empty structural schema instead).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the surface area is large enough and the conversion complex enough that I think we should gain confidence with a non-blocking condition-adding approach before making it blocking

// and merges it with the models defined in the static OpenAPI spec.
// Returns nil models if the ServerSideApply feature is disabled, or the static spec is nil, or an error is encountered.
func buildOpenAPIModelsForApply(staticOpenAPISpec *spec.Swagger, crd *apiextensionsv1.CustomResourceDefinition) (proto.Models, error) {
// Returns nil if the ServerSideApply feature is disabled, or the static spec is nil, or an error is encountered.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not something you just changed, but I think it's a little odd to read the code when you can return nil even if there is no error. That's not ideal. Especially since both these checks (feature enabled, staticOpenAPISpec == nil) could be done before calling the method. Would you mind moving these out?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 4, 2020
@k8s-ci-robot
Copy link
Contributor

@jennybuckley: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added wg/api-expression Categorizes an issue or PR as relevant to WG API Expression. and removed wg/apply labels Apr 17, 2020
@apelisse
Copy link
Member

apelisse commented May 9, 2020

Quick question: is this only going to change the schemas that are known as structural, or all of them?

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 7, 2020
@apelisse
Copy link
Member

apelisse commented Aug 7, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 7, 2020
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 5, 2020
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 5, 2020
@fejta-bot
Copy link

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Copy link
Contributor

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/apiserver area/dependency Issues or PRs related to dependency changes area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. wg/api-expression Categorizes an issue or PR as relevant to WG API Expression.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants