lock CRD upgrade fail going from 1.7.0 to 1.12.2 or higher. #4442

Closed
Tracked by #4372
dee0sap opened this issue Aug 3, 2023 · 13 comments · Fixed by #4447
Labels
bug Something isn't working
Milestone

Comments

@dee0sap
Contributor

dee0sap commented Aug 3, 2023

What happened?

After upgrading from 1.7.0 to 1.12.2, we found that the Crossplane init container failed to complete and reported:
crossplane: error: core.initCommand.Run(): cannot initialize core: cannot apply crd: cannot patch object: CustomResourceDefinition.apiextensions.k8s.io "locks.pkg.crossplane.io" is invalid: status.storedVersions[0]: Invalid value: "v1alpha1": must appear in spec.versions
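
For readers unfamiliar with this error, the rule it reflects is that every entry in a CRD's status.storedVersions must also be listed in spec.versions. The following is a minimal sketch of that check, assuming the CRD has already been loaded into a plain dict; the helper name and the sample CRD fragment are hypothetical and are not taken from Crossplane or the Kubernetes API server.

```python
def missing_stored_versions(crd: dict) -> list[str]:
    """Return the stored versions that are absent from spec.versions."""
    spec_names = {v["name"] for v in crd.get("spec", {}).get("versions", [])}
    stored = crd.get("status", {}).get("storedVersions", [])
    return [v for v in stored if v not in spec_names]


# Illustrative fragment only: a Lock CRD that serves just v1beta1 while
# the cluster still records v1alpha1 as a stored version.
lock_crd = {
    "metadata": {"name": "locks.pkg.crossplane.io"},
    "spec": {"versions": [{"name": "v1beta1", "served": True, "storage": True}]},
    "status": {"storedVersions": ["v1alpha1", "v1beta1"]},
}

print(missing_stored_versions(lock_crd))
```

A non-empty result means that applying this spec would be rejected with an error of the kind quoted above.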

How can we reproduce it?

Upgrade from 1.7.0 to 1.12.2 plus ?????

@turkenh tried the upgrade but didn't see the above. Not sure what extra action is required.

What environment did it happen in?

Crossplane version: 1.12.2
k8s: 1.25.10 (will try to get the history of k8s upgrades)
k8s distro: aws (a teammate is checking how widespread this problem is; it could be on others as well)
os: Unavailable at this time
kernel: Unavailable at this time

@haarchri
Contributor

haarchri commented Aug 3, 2023

Just a note: we dropped v1alpha1 for Lock in v1.11.0 (#3479).

@phisco
Contributor

phisco commented Aug 3, 2023

No, I don't think it would help, @haarchri. If objects using the old, dropped API exist, you need to go through the process described here, which we automated for CompositionRevisions in 1.13.1.

@dee0sap
Contributor Author

dee0sap commented Aug 3, 2023

Btw, signing off. 'Tis 2:30 a.m. here and I have an early meeting tomorrow :(

@haarchri
Contributor

haarchri commented Aug 3, 2023

@phisco then we need a new release for that, one that adds the Lock CRD with v1alpha1 back, like the v1.13.1 fix.

@dee0sap
Contributor Author

dee0sap commented Aug 3, 2023

@phisco @haarchri

So now that I have had a couple of hours of sleep...

Are these Lock objects transient? Only 6 of our 8 dev clusters failed to update, and they were all on 1.7.0. And @turkenh also performed a 1.7.0 -> 1.12.2 upgrade without error.

I am wondering if we can 'fix' our broken clusters by:

  • scaling the Crossplane replicas to 0
  • deleting any Locks and the Lock CRD
  • deploying 1.13.1
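
The steps above could be sketched as the following command sequence. This is only an illustration of the proposed workaround, not a vetted procedure: the namespace, Deployment name, Helm release name, and chart reference are assumptions based on a default Helm install and may differ in a given cluster. The script prints the commands and does not run them.

```python
import shlex


def workaround_commands(
    namespace: str = "crossplane-system",        # assumed default namespace
    deployment: str = "crossplane",              # assumed Deployment name
    release: str = "crossplane",                 # assumed Helm release name
    chart: str = "crossplane-stable/crossplane", # assumed chart reference
    version: str = "1.13.1",
) -> list[list[str]]:
    """Build the commands for the proposed workaround, in order."""
    return [
        # 1. Stop Crossplane so nothing recreates the Lock mid-cleanup.
        ["kubectl", "scale", "deployment", deployment,
         "--namespace", namespace, "--replicas=0"],
        # 2. Delete any Lock objects, then the Lock CRD itself.
        ["kubectl", "delete", "locks.pkg.crossplane.io", "--all"],
        ["kubectl", "delete", "crd", "locks.pkg.crossplane.io"],
        # 3. Deploy the target Crossplane version.
        ["helm", "upgrade", release, chart,
         "--namespace", namespace, "--version", version],
    ]


for cmd in workaround_commands():
    print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run anywhere; each line can be reviewed and adapted before being used against a real cluster.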

@dee0sap
Contributor Author

dee0sap commented Aug 3, 2023

Posted to Slack, asking for someone to vet my workaround idea:
https://crossplane.slack.com/archives/CEG3T90A1/p1691085062031389

@danports

danports commented Aug 3, 2023

Pretty sure that workaround is exactly what I did when I hit the Lock CRD issue with one of my clusters when upgrading to 1.11.

@haarchri
Contributor

haarchri commented Aug 4, 2023

If you installed Crossplane prior to v1.4.0, you would have the v1alpha1 version of locks.pkg.crossplane.io. Starting with v1.4.0, the storage version was updated to v1beta1, and v1alpha1 was dropped in v1.11.0.
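
To illustrate why that history leads to the error in this issue, here is a small simulation. It assumes the documented Kubernetes behaviour that every version that has ever been the storage version is recorded in status.storedVersions and is not removed automatically. The apply_crd function and InvalidCRD exception are hypothetical stand-ins, and the version lists for each release are illustrative rather than copied from the actual CRDs.

```python
class InvalidCRD(Exception):
    """Raised when a stored version is missing from the new spec."""


def apply_crd(stored_versions, spec_versions):
    """Simulate applying a CRD spec.

    spec_versions is a list of (name, is_storage) tuples.
    Returns the resulting status.storedVersions list.
    """
    names = [name for name, _ in spec_versions]
    for i, version in enumerate(stored_versions):
        if version not in names:
            raise InvalidCRD(
                f'status.storedVersions[{i}]: Invalid value: "{version}": '
                "must appear in spec.versions"
            )
    storage = next(name for name, is_storage in spec_versions if is_storage)
    result = list(stored_versions)
    if storage not in result:
        result.append(storage)  # old entries are never dropped automatically
    return result


# Installed before v1.4.0: v1alpha1 is the storage version.
stored = apply_crd([], [("v1alpha1", True)])
# Upgraded to v1.4.0 or later: v1beta1 becomes the storage version.
stored = apply_crd(stored, [("v1alpha1", False), ("v1beta1", True)])
print(stored)

# Upgraded to v1.11.0 or later: v1alpha1 is gone from the spec.
error = None
try:
    apply_crd(stored, [("v1beta1", True)])
except InvalidCRD as exc:
    error = exc
print(error)
```

A cluster first installed at v1.4.0 or later would never record v1alpha1 in the simulation, which may be why not every upgrade from 1.7.0 reproduces the failure.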

I added a repo to reproduce this issue: https://github.com/haarchri/crossplane-issue-4442
and opened a fix in #4447.

jbw976 added this to the v1.14 milestone on Aug 4, 2023
@dee0sap
Contributor Author

dee0sap commented Aug 4, 2023

Thanks @haarchri, @phisco & @jbw976

I manually fixed one of our 2 broken dev clusters last night.
And I have proposed to the team that

  • We leave the other one to go through the normal rollout process (which we will engage as soon as a Crossplane release with the fix is available)
  • We identify the canary and prod clusters that will need the fix
  • Assuming the new Crossplane release fixes our broken dev cluster, roll the new Crossplane out to canary and prod asap

Again, thanks so much for the assist :)

@jbw976
Member

jbw976 commented Aug 5, 2023

"as soon as a Crossplane release with the fix is available"

@phisco @turkenh - did you all already discuss a patch release for this Lock migration fixed by #4447 too? I'm not sure if I can articulate it, but I feel like this issue isn't quite as pervasive/severe/urgent as the CompositionRevision migration fix we put into v1.13.1. Thoughts?

@dee0sap
Contributor Author

dee0sap commented Aug 5, 2023

Hey @jbw976 @phisco @turkenh

For my team, we're very keen to have a fix. However I don't think it matters to us whether it is 1.13.2 or 1.14.x. I mean we just jumped from 1.7.0 to 1.12.2, and failed :), so we should be fine with jumping to 1.14.x.

@dee0sap
Contributor Author

dee0sap commented Aug 6, 2023

Hey everyone,

I surveyed all of our clusters and, while v1alpha1 shows up in versionNames a few times, storedVersions is always v1beta1 at this point.

Is v1alpha1 a problem if it is in versionNames, or only if it is in storedVersions?

@jbw976
Member

jbw976 commented Aug 7, 2023

@phisco has opened a backport PR for the fix in #4462, and we're expecting to do a 1.13.2 to patch this tomorrow 😇

/cc @dee0sap

Projects
Status: Done
5 participants