Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crd versioning with nop Conversion #63830

Merged
merged 5 commits into from May 23, 2018

Conversation

mbohlool
Copy link
Contributor

@mbohlool mbohlool commented May 14, 2018

Implements Custom Resource Definition versioning according to design doc.

Note: I recreated this PR instead of #63518. Huge number of comments there broke github.

Follow-ups: #64136

@sttts @nikhita @deads2k @liggitt @lavalamp

Add CRD Versioning with NOP converter

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 14, 2018
@kubernetes kubernetes deleted a comment from k8s-ci-robot May 15, 2018
@mbohlool mbohlool force-pushed the crd_versioning_nop branch 2 times, most recently from aee96d2 to 928e86c Compare May 15, 2018 04:52
@kubernetes kubernetes deleted a comment from k8s-ci-robot May 15, 2018
@@ -25,6 +25,9 @@ type CustomResourceDefinitionSpec struct {
// Group is the group this resource belongs in
Group string
// Version is the version this resource belongs in
// Should be always first item in Versions field if provided.
// Optional, but at least one of Version or Versions must be set.
// Deprecated: Please use `Versions`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented before: why do we keep this field in the internal types? Let's use conversion to recreate it it from the Versions slice

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed offline: this is needed because validation is done on internal types. Without Version here we would have to validate in conversion that it matches the first item in Versions.

@sttts
Copy link
Contributor

sttts commented May 15, 2018

Second commit has a misleading commit msg.


unstructOut.SetUnstructuredContent(unstructIn.UnstructuredContent())

_, err := c.ConvertToVersion(unstructOut, unstructOut.GroupVersionKind().GroupVersion())
Copy link
Contributor

@sttts sttts May 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks wrong: we cannot expect that the out object has a group version set, can we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just set the content of the out from in object. if In object has gv set, the out would have it too, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a noop then if input and output GV matches?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a noop then if input and output GV matches?

yes, that's a bug. any info we expect to get from unstructOut must be retrieved before we stomp it with unstructOut.SetUnstructuredContent(unstructIn.UnstructuredContent())

if we can't determine the target group version, we have to return an error

at least in some cases, the context arg is used to pass version info, though I'm not sure how normative that is:

if err := c.convertor.Convert(obj, into, c.decodeVersion); err != nil {
return nil, gvk, err
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That line is the only one calling ObjectConverter.Convert (outside of parameter codecs). So we know pretty well what the context is there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a noop then if input and output GV matches?

I think the assumption (at least from UnstructuredConverter implementation) is they are the same gvk. I am calling ConvertToVersion to take care of the list case where GVKs are the same but list items may not.

func (c *nopConverter) ConvertToVersion(in runtime.Object, target runtime.GroupVersioner) (runtime.Object, error) {
var err error
// Run the converter on the list items instead of list itself
if meta.IsListType(in) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this returns true as soon as there is a items part of a CR. Use a cast to UnstructuredList here instead as @lavalamp proposed before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I should use the cast to unstructuredList here too.

}

func (c *nopConverter) convertToVersion(in runtime.Object, target runtime.GroupVersioner) error {
if kind := in.GetObjectKind().GroupVersionKind(); !kind.Empty() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it correct to do nothing in case of an empty kind?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compare mbohlool#5.

At https://github.com/sttts/kubernetes/blob/2c1a689952ec34e3f9ecb7bcd1772c3fa35c9597/staging/src/k8s.io/apiserver/pkg/endpoints/handlers/patch.go#L80 the patch endpoint code call the NewFunc of the CR store which is an empty Unstructure in this case. This ends up here in the converter. To not fall over, we have this if clause.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref: #63880

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#63880 has merged. The empty kind exception here can go.

@@ -30,6 +30,7 @@ import (
"github.com/go-openapi/validate"
"github.com/golang/glog"

"k8s.io/apiextensions-apiserver/pkg/apiserver/conversion"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import order is wrong

@@ -640,3 +683,18 @@ func (in crdStorageMap) clone() crdStorageMap {
}
return out
}

type CrdConversionRESTOptionGetter struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be private. Also it deserves a go doc.

cast := obj.(*apiextensions.CustomResourceDefinition)
c.enqueueCRD(cast)
UpdateFunc: func(oldObj, newObj interface{}) {
c.enqueueCRD(oldObj.(*apiextensions.CustomResourceDefinition))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably worth a comment why we're enqueuing both (to make sure we process removed versions correctly)

Group: groupVersion.Group,
Version: groupVersion.Version,
GroupPriorityMinimum: 1000, // CRDs should have relatively low priority
VersionPriority: int32(100 + len(crd.Spec.Versions) - index), // CRDs should have relatively low priority
Copy link
Member

@liggitt liggitt May 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we want this to result in 100 for CRDs with exactly one version to avoid dueling old API servers. This produces 101.

also, this comment was outstanding from the previous PR:

what happens if two CRDs define the same group versions, but in opposite order?

For example:

  • CRD A defines group=foo, versions=v1,v2
  • CRD B defines group=foo, versions=v2,v1

What should the versionpriority for foo/v1 and foo/v2 be? How do we ensure the version priority for the API service is deterministic no matter which CRD happens to come first when searching?

One possibility is to keep track of the version priority calculated for a particular CRD but keep searching through all the CRDs in case any others define it with a higher priority. Highest priority is used. That keeps it deterministic, and if all CRDs for a group define the same versions in the same order, that becomes the order in discovery

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replied there for another suggestion. What if we add Priority field to CustomResourceDefinitionVersion type? If the user does not provide Priority then we can fill it for them (if that is an accepted pattern in our APIs). By having Priority field, there is a way for our users to keep order in any way they want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we add Priority field to CustomResourceDefinitionVersion type?

whether it is computed or explicitly set, there is still the possibility of the two versions having flipped priorities in two CRDs. we need to make the outcome of that situation deterministic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's step back a bit here:

Why can different CRD resources have different version orders?

I want to argue that we should not support this. The same way that names must not overlap between different CRD objects, neither should versions have a mismatch (= being in different order; of course it is fine that different CRDs have different version subsets, but the order should match).

Any algorithm to create a stable order is wrong for the CRD author in certain cases: the order will potentially not match his intent.

With names we postponed the verification of the not-overlapping invariant into the NamingController which sets the NamesAccepted condition. Clearly, it can only use a temporary snapshots of the CRDs it knows (fromt the lister), so we don't have consistency there as we don't have transactions.

I want to argue further that version order is far less critical than names. Hence:

Why don't we add a VersionsAccepted condition from a VersionsController and enforce matching version orders?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the controller as long as it's HA safe...

It is as safe as the existing naming controller (i.e. there is a race in HA). The assumption is that CRDs are statically created by developers. Even if in an HA environment it can go wrong (= a conflict is not detected and the new CRD versions are served) in 1 out of 100 cases, it will be noticed in 99 cases, and the developer will fix the order.

On the other hand, an algorithm will silently compute some order (which does not match the intend) and the developer of the API group (assuming it is one owner of the group) will not notice. I am not sure I want to trade HA consistency with having a behaviour that the user expects and understands.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stefan's makes a good point about a stable order algorithm leading to action at a distance. If you modify your CRD with another version and create a conflict, the stable ordering algorithm may make what you're looking at succeed while breaking something else.

By having a proper condition for it, the action happens for your change. You don't see what you expect, you check status and you see why.

It'll be fairly hard to take back the choice and fix behavior later (see TPR to CRD). I think we should learn from the naming problems and take the proper status an reconciliation approach.

@@ -56,7 +64,7 @@ func TestHandleVersionUpdate(t *testing.T) {
Group: "group.com",
Version: "v1",
GroupPriorityMinimum: 1000,
VersionPriority: 100,
VersionPriority: 101,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is demonstrating a change that is going to fight old API servers

// TODO: should this be a typed error?
return fmt.Errorf("%v is unstructured and is not suitable for converting to %q", kind, target)
}
if !apiextensions.HasCRDVersion(c.crd, gvk.Version) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a general pattern I expected to see was to do "expensive" work when setting up the convertor/handler in response to CRD changes, rather than in the actual Convert/Handle methods.

for example:

// NewCRDConverter returns a new CRD converter based on the conversion settings in crd object.
func NewCRDConverter(crd *apiextensions.CustomResourceDefinition) runtime.ObjectConvertor {
  validVersions := map[schema.GroupVersion]bool{}
  for _, version := range crd.Spec.Versions {
    validVersions[schema.GroupVersion{Group:crd.Spec.Group, Version: version.Name}] = true
  }

  return &nopConverter{
    clusterScoped: crd.Spec.Scope == apiextensions.ClusterScoped,
    validVersions: validVersions,
  }
}

and in the actual noop conversion:

inGV := in.GroupVersionKind().GroupVersion()
if _, ok := c.validVersions[inGV]; !ok {
  ... return unknown version error ...
}

outGV := out.GroupVersionKind().GroupVersion()
if _, ok := c.validVersions[outGV]; !ok {
  ... return unknown version error ...
}

... set content and stomp version ...
return nil

it looks like the handler setup was close to that already (set up maps from versions to storages, scopes, etc), which is good.

Copy link
Contributor Author

@mbohlool mbohlool May 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the setup is already like that. This function call apiextensions.HasCRDVersion(c.crd, gvk.Version) is not expensive and also replaced another check existed before. We need to do that because the handler is for all CRDs with all versions. When we pass this, in the getCrdInfo function, we have the map and that map is pre-populated with all valid versions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

structuring the convertor so that all in/out transformations are determined at construction time makes for O(1) lookups in Convert(), and as convertors become more complex (declarative transformation, webhook, etc), we'll only want to do that complex setup (building the transforming functions, the webhook client, etc) once at construction as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. I assumed this line is in handler but it is in nop_converter. My bad. It make sense now.

Copy link
Contributor Author

@mbohlool mbohlool May 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not check outGV anywhere in nop_converter

we should. the convertor's job is to convert from inGV to outGV... making sure outGV is in the list of valid versions is part of that.

if !apiextensions.HasCRDVersion(c.crd, gvk.Version) {
return fmt.Errorf("request to convert CRD to an invalid version: %s", gvk.String())
}
in.GetObjectKind().SetGroupVersionKind(gvk)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we changing the GVK of the incoming object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ObjectConverter.ConvertToVersion comment says:

// This method does not guarantee that the in object is not mutated.

leaving the door open to mutate in object and return it. Do you think it is better to duplicate it? We do not have ObjectCreator interface in converter but we can pass it here if we need to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the no-op conversion. Its only task is to set the GVK of an Unstructured. Both Convert and ConvetToVersion allow mutation of the incoming object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I misread the diff and thought this was part of the Convert(in, out runtime.Object) method and was confused why we were messing with in, not out. this is fine

@@ -96,7 +96,21 @@ func (c *DiscoveryController) sync(version schema.GroupVersion) error {
Version: crd.Spec.Version,
})

if crd.Spec.Version != version.Version {
foundThisVersion := false
for _, v := range crd.Spec.Versions {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're iterating over and adding all served versions, we can drop the append on line 94 of spec.Version, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

}
}

if !foundThisVersion {
continue
}
foundVersion = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with multiple versions supported, this statement is no longer true:

Versions: apiVersionsForDiscovery,
// the preferred versions for a group is arbitrary since there cannot be duplicate resources
PreferredVersion: apiVersionsForDiscovery[0],

the ordering of apiVersionsForDiscovery needs to reflect the ordering of the versions in the CRDs for the group (and the open question about conflicting order needs resolving).

c.queue.Add(schema.GroupVersion{Group: obj.Spec.Group, Version: obj.Spec.Version})
for _, v := range obj.Spec.Versions {
c.queue.Add(schema.GroupVersion{Group: obj.Spec.Group, Version: v.Name})
}
}

func (c *DiscoveryController) addCustomResourceDefinition(obj interface{}) {
Copy link
Member

@liggitt liggitt May 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same change needed to enqueue both old and new in updateCustomResourceDefinition?

@liggitt
Copy link
Member

liggitt commented May 22, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 22, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, mbohlool, sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 22, 2018
@mbohlool
Copy link
Contributor Author

/retest

@mbohlool mbohlool added this to the v1.11 milestone May 22, 2018
@mbohlool mbohlool added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label May 22, 2018
@lavalamp
Copy link
Member

lavalamp commented May 22, 2018 via email

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels May 22, 2018
@lavalamp
Copy link
Member

lavalamp commented May 22, 2018 via email

@k8s-github-robot
Copy link

[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process

@liggitt @mbohlool

Pull Request Labels
  • sig/api-machinery: Pull Request will be escalated to these SIGs if needed.
  • priority/important-soon: Escalate to the pull request owners and SIG owner; move out of milestone after several unsuccessful escalation attempts.
  • kind/feature: New functionality.
Help

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants