Crossplane Core Fails to Start up after upgrading from V1.7.0 to V1.13.2 #4557
Comments
For reference, we tested the upgrade path v1.3.1 -> v1.7.0 -> v1.12.2 and then tested with code from v1.13.2. |
Can you show the output of this command: kubectl get crd compositionrevisions.apiextensions.crossplane.io -o json | jq '.status.storedVersions' |
Hey @haarchri Our CI pipeline collects an obj dump from the cluster before disposing of it. Looking in the dump from Parisa's last pipeline run I found the following
GLOBAL+I540621@W-R90YFPQE /cygdrive/c/dev/fpa131/2023-08-29-crossplane-canary-upgrade-problem/parisa-can-pr/b4/a5110b-dmi-resources/all_resources |
Hey @haarchri As I mentioned in the Slack thread, we're very keen to get 1.13.2 out to canary quickly, so any workaround that would let us move forward would be much appreciated. Looking at the timestamps on some of the objects in the obj dump, I see we have this sequence Now I also mentioned in Slack that one of the things different about our deployment of Crossplane is that we have an additional init container that runs before the init containers provided by Crossplane. Since compositionrevisions is not enabled for our 1.7 install, could we work around this situation by having our init container in the 1.13.2 deploy delete the compositionrevision CRD? Or maybe conditionally delete the CRD if it finds that the only supported version is v1alpha1. That would probably be safer. What do you think? |
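The conditional check proposed above could look roughly like the following. This is only a sketch under stated assumptions: the function names `only_serves_v1alpha1` and `safe_to_delete` are hypothetical, the CRD is assumed to be available as a parsed JSON dict (for example from `kubectl get crd ... -o json`), and `instance_count` is assumed to be obtained separately by listing the custom resources.

```python
def only_serves_v1alpha1(crd: dict) -> bool:
    """True when v1alpha1 is the only version listed in spec.versions."""
    names = [v["name"] for v in crd.get("spec", {}).get("versions", [])]
    return names == ["v1alpha1"]


def safe_to_delete(crd: dict, instance_count: int) -> bool:
    """Allow deletion only for a v1alpha1-only CRD that has no objects.

    Deleting a CRD also deletes its custom resources, so the caller must
    pass the number of existing objects (hypothetical input).
    """
    return only_serves_v1alpha1(crd) and instance_count == 0


# Shape of the CRD as it would look on a cluster that never moved past v1alpha1.
old_crd = {
    "spec": {"versions": [{"name": "v1alpha1", "storage": True}]},
    "status": {"storedVersions": ["v1alpha1"]},
}

print(safe_to_delete(old_crd, instance_count=0))
```

Requiring a zero instance count is a deliberately conservative choice in this sketch, since deleting a CRD removes every object of that kind.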
I can reproduce this issue simply by trying an upgrade from v1.7.0 to v1.13.2:
I'll investigate what's going on further... |
And it works fine if I try it from |
@turkenh I suppose the problem is that we should first apply the CRDs and then proceed with the migration part if needed ? |
In 1.13.2 we upgrade the CRDs to the latest version available before installing the new CRDs. If I'm not wrong the point is that jumping from 1.7.x to 1.13.2 there is no v1beta1 to upgrade it to first, hence the init job does nothing and then when we upgrade the crd removing v1alpha1 it fails. So yes, upgrading to 1.12 first should do the trick. |
This was my first impression while reading the migrator code, but applying CRD fails with that error already, so not possible like a chicken-egg problem. In general, if we attempt upgrading from a version that has an old version as So, I believe it is fair to expect upgrading over an intermediate version since it is not feasible to support upgrading from v1.7.0 -> v1.12.2: CRD apply works since v1.12.2 -> v1.13.2: CRD apply works since new storageVersion, which is For reference, composition revision CRDs have the following versions:
In
versions:
- name: v1alpha1
  storage: true
In
versions:
- name: v1
  storage: true
- name: v1beta1
  storage: false
- name: v1alpha1
  storage: false
In
|
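The rule behind the error in this issue is that every entry in `status.storedVersions` must still appear in `spec.versions` of the CRD being applied. A minimal sketch of that check, using the two version lists quoted above, follows. The function name `missing_stored_versions` is hypothetical, and the `target` list (v1 and v1beta1 only) is an assumption, since the thread only states that v1alpha1 is removed in 1.13.2.

```python
def missing_stored_versions(stored_versions, new_spec_versions):
    """Return stored versions that the new spec.versions no longer lists.

    A non-empty result corresponds to the "must appear in spec.versions"
    validation error reported in this issue.
    """
    listed = {v["name"] for v in new_spec_versions}
    return [s for s in stored_versions if s not in listed]


# storedVersions on a cluster that has only ever stored v1alpha1.
stored = ["v1alpha1"]

# Second list quoted above: v1alpha1 is still present.
intermediate = [
    {"name": "v1", "storage": True},
    {"name": "v1beta1", "storage": False},
    {"name": "v1alpha1", "storage": False},
]

# ASSUMPTION: a target CRD that has dropped v1alpha1.
target = [
    {"name": "v1", "storage": True},
    {"name": "v1beta1", "storage": False},
]

print(missing_stored_versions(stored, intermediate))
print(missing_stored_versions(stored, target))
```

The first call finds nothing missing, which matches the observation that applying the intermediate CRD works, while the second reports v1alpha1 as missing, which matches the direct-upgrade failure.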
I also believe it's fair to expect the end user to take an intermediary upgrade path. cc @plumbis |
Hey guys, thanks for the analysis so far. At this point I don't think we will easily be able to roll out < 1.13.2 to our canary and prod landscapes. The folks who own the CI/CD infrastructure enforce a rollout of dev, then canary, then production. So we would first have to downgrade dev clusters to whatever intermediate version was selected, then roll that out to canary and production, and then begin with the rollout of 1.13.2. That said, would the workaround that I suggested here |
Deleting the v1alpha1 version could result in data loss. An option would be to let your init container add the v1 and v1beta1 versions to the existing CRD, so that the latter can move all resources to v1 and then update the crd to drop the v1alpha1 |
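The ordering suggested above (add the new versions, move the resources, then drop v1alpha1) can be illustrated with a small in-memory model. This is a sketch only: it does not call the Kubernetes API, the helpers `apply_versions` and `migrate_objects` and the `stored_as` marker are hypothetical, and the behaviour of recording each storage version in `storedVersions` is modelled in simplified form.

```python
def apply_versions(crd, versions):
    """Replace spec.versions, rejecting the change if a stored version is dropped."""
    names = [v["name"] for v in versions]
    dropped = [s for s in crd["status"]["storedVersions"] if s not in names]
    if dropped:
        raise ValueError(f"{dropped} must appear in spec.versions")
    crd["spec"]["versions"] = versions
    # Record the storage version if it has not been a storage version before.
    storage = next(v["name"] for v in versions if v["storage"])
    if storage not in crd["status"]["storedVersions"]:
        crd["status"]["storedVersions"].append(storage)


def migrate_objects(crd, objects, version):
    """Rewrite every object at `version`, then prune storedVersions to match."""
    for obj in objects:
        obj["stored_as"] = version  # hypothetical marker, not a real field
    crd["status"]["storedVersions"] = [version]


crd = {
    "spec": {"versions": [{"name": "v1alpha1", "storage": True}]},
    "status": {"storedVersions": ["v1alpha1"]},
}
objects = [{"name": "example-revision", "stored_as": "v1alpha1"}]

# Step 1: add v1 and v1beta1 while keeping v1alpha1 listed.
apply_versions(crd, [
    {"name": "v1", "storage": True},
    {"name": "v1beta1", "storage": False},
    {"name": "v1alpha1", "storage": False},
])

# Step 2: move all resources to v1.
migrate_objects(crd, objects, "v1")

# Step 3: only now can v1alpha1 be dropped without tripping the check.
apply_versions(crd, [
    {"name": "v1", "storage": True},
    {"name": "v1beta1", "storage": False},
])

print(crd["status"]["storedVersions"])
```

Skipping steps 1 and 2 and going straight to step 3 raises the error in this model, which mirrors the direct 1.7.0 to 1.13.2 upgrade.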
Thanks for the response @phisco. How would the deletion of the CRD result in data loss given that
|
If that's the case then there should be no problem |
Thanks! |
Just realized that you called out that there could be data loss but didn't indicate whether or not you thought my idea would actually get my team unstuck. What do you think? |
Opened a discussion to make sure the maintainers come to an agreement on upgrade policies and we'll get that included in the docs. |
Hey everyone, Just asking again if folks think the workaround mentioned here |
Yes, @dee0sap, it should |
I've hit the same bug trying to go from 1.9 to 1.12.1:
|
@darkmuggle for locks.pkg.crossplane.io we fixed this in https://github.com/crossplane/crossplane/releases/tag/v1.13.2 |
Crossplane does not currently have enough maintainers to address every issue and pull request. This issue has been automatically marked as |
I ran into the same issue. I resolved it by uninstalling the old Crossplane CRDs. |
I'm running into the same issue trying to go from 1.9.1 to 1.15.2 |
What happened?
After upgrading from 1.7.0 to 1.13.2, the Crossplane core doesn't start up. We see error messages like
crossplane: error: core.initCommand.Run(): cannot initialize core: cannot apply crd: cannot patch object: CustomResourceDefinition.apiextensions.k8s.io "compositionrevisions.apiextensions.crossplane.io" is invalid: status.storedVersions[0]: Invalid value: "v1alpha1": must appear in spec.versions
This is a canary upgrade from 1.7.0 to 1.13.2. In the dev environment, we upgraded from 1.7.0 to 1.12.2, then picked up the 1.13.2 fix release, and everything works fine. In canary, however, the upgrade goes directly from 1.7.0 to 1.13.2, and we again see an error similar to the one we saw before the 1.13.2 fix.
Something to note here is we did not have compositionrevisions enabled before the upgrade.
Logs from init-container before upgrade and new init-container after upgrade are attached
How can we reproduce it?
It happened in the transient test clusters that are part of our CI/CD pipelines, but it should be reproducible by upgrading from 1.7.0 to 1.13.2.
What environment did it happen in?
oldinit-vs-newinit.txt