Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump CRD versions to v1beta1 #1782

Merged
merged 45 commits into from Oct 1, 2019
Merged

Conversation

thbkrkr
Copy link
Contributor

@thbkrkr thbkrkr commented Sep 24, 2019

This commit bumps the version of the CRDS from version v1alpha1 to version v1betav1:

  • Copy pkg/apis/*/v1alpha1 in pkg/apis/*/v1beta1. Only differences exists for the ElasticsearchStatus (some fields have been removed since the 0.9)
  • all occurrences of v1alpha1 are replaced by v1betav1 in pkg/, config/, test/ and docs/*.asciidoc
  • Add JSON struct tag fields to make the controller-gen happy
  • Add kubebuilder annotations
  • Restore the controller-gen target to generate the CRDs (with trivialVersions=true for now, CRD multi-versions support with schema changes #1823)
  • Add a patch-crd go script to remove the podTemplate schema (CRD podTemplate schema forces to declare an empty spec.containers #1822)
  • Add the apm shortname for the APM Server

Resolves #1140.

@thbkrkr thbkrkr added >enhancement Enhancement of existing functionality v1.0.0-beta1 >breaking labels Sep 24, 2019
@charith-elastic
Copy link
Contributor

If we don't retain the v1alpha1 definitions, would it be possible to deploy the new version on top of an existing cluster without breaking anything? I can't seem to be able to apply the new CRDs without deleting the old ones first (The CustomResourceDefinition "kibanas.kibana.k8s.elastic.co" is invalid: spec.version: Invalid value: "v1beta1": must match the first version in spec.versions). Unless I have made a silly mistake, I assume that this means that people can't upgrade without some down time?

Also, should we consider implementing conversion to support a smoother upgrade flow?

@pebrc
Copy link
Collaborator

pebrc commented Sep 24, 2019

I think conversion would require a webhook server which is currently removed in master due to #1723 But it is definitely something we would like to do once it is restored.

As to having multiple versions of the API:
I think we are still using the deprecated version attribute and we should be using versions instead which lists the different versions and indicates via served whether users can submit new objects using a particular version.

Can we try that and see if keeping v1alpha1 in the versions (but served: false) smoothens the upgrade process?

@thbkrkr
Copy link
Contributor Author

thbkrkr commented Sep 24, 2019

If we don't retain the v1alpha1 definitions, would it be possible to deploy the new version on top of an existing cluster without breaking anything?

No.

I assume that this means that people can't upgrade without some down time?

Yes.

In the current state it will be a breaking change that implies a downtime.

Can we try that and see if keeping v1alpha1 in the versions (but served: false) smoothens the upgrade process?

Yes, I'm going to try this to avoid the breaking.

@thbkrkr
Copy link
Contributor Author

thbkrkr commented Sep 25, 2019

As the schema has changed between v1alpha1 and v1beta1, I don't think we can have both versions referenced in the latest CRDs without implementing conversion.

I tried with:

  versions:
  - name: v1beta1
    served: true
    storage: true
  - name: v1alpha1
    served: false
    storage: false

And this workflow:

  • deploy the operator 0.9
  • deploy a test ES resource in v1alpha1
  • apply the new CRDs with versions v1beta1 and v1alpha1
  • update the operator to the latest version

Then the operator is failing with the following errors:

2019-09-25T14:50:26.314+0200 ERROR controller-runtime.eventhandler.EnqueueRequestForOwner Could not retrieve rest mapping {"ver": "0.10.0-SNAPSHOT-263b5431", "kind": "Elasticsearch.elasticsearch.k8s.elastic.co", "error": "no matches for kind "Elasticsearch" in version "elasticsearch.k8s.elastic.co/v1alpha1""}

2019-09-25T14:50:26.562+0200 ERROR controller-runtime.controller Reconciler error {"ver": "0.10.0-SNAPSHOT-263b5431", "controller": "elasticsearch-controller", "request": "default/test", "error": "Elasticsearch.elasticsearch.k8s.elastic.co "test" is invalid: []: Invalid value: map[string]interface {}{...}: validation failure list:\nspec.nodes.name in
body should match '[a-zA-Z0-9-]+'"}

Even if it's not our goal for now, I tried to update the test ES resource with the new schema (adding the name) and got:

2019-09-25T15:29:08.949+0200 ERROR controller-runtime.controller Reconciler error {"ver": "0.10.0-SNAPSHOT-263b5431", "controller": "elasticsearch-controller", "request": "default/test", "error": "Object default/test-es-http-certs-internal is already owned by another Elasticsearch controller test", "errorCauses": [{"error": "Object default/test-es-http-certs-internal is already owned by another Elasticsearch controller test"}]}

@pebrc
Copy link
Collaborator

pebrc commented Sep 25, 2019

Would it help if instead of renaming we would keep the alpha versions around?

@pebrc
Copy link
Collaborator

pebrc commented Sep 25, 2019

Ok I played around with this a bit here pebrc@76fddfc

and it looks like if we keep the original code around and fix the CRD generation (which was not running and broken) we can have both versions.

But:

  • if served: true old resources will be reconciled by the new controller
  • if served: false then controller-runtime.eventhandler.EnqueueRequestForOwner Could not retrieve rest mapping

Also:

  • kubectl apply -f crds.yaml will fail miserably on update because the last-applied-configuration annoation created by kubectl will exceed the max annotation length
  • our controller level version check should work once we update to the correct minVersion in the code

@thbkrkr
Copy link
Contributor Author

thbkrkr commented Sep 25, 2019

Well done!

  • kubectl apply -f crds.yaml will fail miserably on update because the last-applied-configuration annoation created by kubectl will exceed the max annotation length

The description fields can be truncated or removed using crd:maxDescLen=[-1|0|n] (kubernetes-sigs/controller-tools#299).

-       $(CONTROLLER_GEN) crd paths="./pkg/apis/..." output:crd:artifacts:config=config/crds
+       $(CONTROLLER_GEN) crd:maxDescLen=100 paths="./pkg/apis/..." output:crd:artifacts:config=config/crds

@pebrc
Copy link
Collaborator

pebrc commented Sep 25, 2019

The description fields can be truncated

I used kubectl replace instead which does not write the annotation

@@ -102,8 +102,7 @@ func (k *Kibana) SetAssociationConf(assocConf *commonv1alpha1.AssociationConf) {
k.assocConf = assocConf
}

// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +kubebuilder:object:root=true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

assocConf *commonv1alpha1.AssociationConf
Spec KibanaSpec `json:"spec,omitempty"`
Status KibanaStatus `json:"status,omitempty"`
assocConf *commonv1alpha1.AssociationConf `json:"-"` //nolint:govet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here

Copy link
Contributor Author

@thbkrkr thbkrkr Oct 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we prefer to keep this hidden (that was the behaviour in 0.9).

@anyasabo
Copy link
Contributor

anyasabo commented Oct 1, 2019

Just to be clear since the v1beta1 types are essentially just a copy and paste of the v1alpha1 types, and they need to be that way to be compatible between versions: if we make any type changes in the future we need to make sure we change both the v1alpha1 and v1beta1 types, correct? Until we have a conversion webhook that is.

@thbkrkr
Copy link
Contributor Author

thbkrkr commented Oct 1, 2019

if we make any type changes in the future we need to make sure we change both the v1alpha1 and v1beta1 types, correct?

No, if we make any type changes in the future, we need to create a new version (e.g. v1beta2 or v1).

@thbkrkr
Copy link
Contributor Author

thbkrkr commented Oct 1, 2019

Also, we use our controller-version to prevent the latest operator (1.0.0-beta1) to handle previous custom resources (0.9), to overcome the fact that we have no conversion webhook.

const ControllerVersionAnnotation = "common.k8s.elastic.co/controller-version"
const (
ControllerVersionAnnotation = "common.k8s.elastic.co/controller-version"
MinCompatibleControllerVersion = "0.10.0-SNAPSHOT"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These names are confusing to me at least, just going by the names it's not clear why we would need both the last incompatible version and the minimum compatible version. Going by how they're used I think it's because we're using "last incompatible" to mean the version before we started adding the annotation, but it might be worth finding more descriptive names and/or adding explanatory comments on why these exist

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm fine with MinCompatibleControllerVersion .

How about we change LastIncompatibleControllerVersion to:

UnknownControllerVersion = "UNKNOWN" // may match resources created with ECK-0.8.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will that work? I think it needs to be a parseable version for later versions of the controller to know they shouldn't do anything with it when it does a version comparison, but definitely possible I'm misunderstanding

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, indeed! Then maybe:

UnknownControllerVersion = "0.0.0-UNKNOWN" // may match resources created with ECK-0.8.0

hack/patch-crd/main.go Outdated Show resolved Hide resolved
Copy link
Contributor

@sebgl sebgl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
I tested the alpha -> beta migration locally, everything seems to behave as expected.

Copy link
Contributor

@anyasabo anyasabo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

A note to other reviewers: make sure you are on controller-gen 0.2.1, 0.2.0 has some surprising behavior

@anyasabo
Copy link
Contributor

anyasabo commented Oct 1, 2019

This should also close out #1770

@thbkrkr
Copy link
Contributor Author

thbkrkr commented Oct 1, 2019

Glad we are finally here. Thank you for your feedbacks.

@thbkrkr thbkrkr merged commit 25caaf1 into elastic:master Oct 1, 2019
@thbkrkr thbkrkr deleted the bump-version-v1beta1 branch October 4, 2019 20:46
@pebrc pebrc changed the title Bump version v1betav1 Bump CRD versions to v1betav1 Oct 9, 2019
@pebrc pebrc removed the >enhancement Enhancement of existing functionality label Oct 9, 2019
@thbkrkr thbkrkr changed the title Bump CRD versions to v1betav1 Bump CRD versions to v1beta1 Oct 11, 2019
@ximenzaoshi
Copy link

ximenzaoshi commented Oct 12, 2019

Object default/test-es-http-certs-internal is already owned by another Elasticsearch controller test
I still don't know how to solve this error. The old resource's owner ref belongs to v1alpha1, but the new operator use v1beta1. I think version conversion cannot fix this. Is there any way to use new controllers to manage old version resources?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move CRDs to v1beta1 for 1.0.0-beta1
6 participants