
Auto-alias resource apiVersions #798

Merged · 6 commits · Oct 4, 2019
Conversation

@lblackstone (Member) commented Sep 11, 2019

Proposed changes

Kubernetes apiVersions are mostly forward/backward
compatible, so for cases where we know it's safe, we
auto-alias the apiVersions so that the engine does not
force a replacement when a resource is updated to a
compatible apiVersion.

Related issues (optional)

Fixes #573
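The aliasing idea can be sketched roughly as follows. This is an illustrative Go sketch only, not the provider's actual code; the function name and the exact token lists are assumptions (the real provider generates these per resource kind):

```go
// Illustrative sketch of auto-aliasing (not the actual provider code).
// For each kind, list the Pulumi type tokens of compatible apiVersions
// so the engine aliases them instead of forcing a replacement.
package main

import "fmt"

// aliasesForKind is a hypothetical helper; the real provider generates
// these alias lists for every resource kind it supports.
func aliasesForKind(kind string) []string {
	switch kind {
	case "Deployment":
		return []string{
			"kubernetes:apps/v1:Deployment",
			"kubernetes:apps/v1beta1:Deployment",
			"kubernetes:apps/v1beta2:Deployment",
			"kubernetes:extensions/v1beta1:Deployment",
		}
	case "DaemonSet":
		return []string{
			"kubernetes:apps/v1:DaemonSet",
			"kubernetes:apps/v1beta2:DaemonSet",
			"kubernetes:extensions/v1beta1:DaemonSet",
		}
	default:
		return nil
	}
}

func main() {
	// A Deployment registered under any of these tokens is treated as
	// the same resource, so changing apiVersion becomes an update.
	for _, alias := range aliasesForKind("Deployment") {
		fmt.Println(alias)
	}
}
```

With aliases attached, switching a resource between any two listed apiVersions shows up to the engine as an update of one resource rather than a delete-and-create pair.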

@lblackstone force-pushed the lblackstone/autoaliasing branch 4 times, most recently from 6e1c788 to d596f9e on September 12, 2019 22:53
@lblackstone force-pushed the lblackstone/autoaliasing branch 3 times, most recently from 4dcf058 to 74d9191 on September 19, 2019 18:30
@lblackstone marked this pull request as ready for review September 19, 2019 18:30
@metral (Contributor) left a comment

Overall LGTM.

For future PRs of this size, it'd be great if we could break it up into multiple smaller commits, preferably with identical changes segmented together. It took me a couple of review sessions to get through it so I'm sure I've missed something 😄

case kinds.DaemonSet:
    return []string{
        "kubernetes:apps/v1:DaemonSet",
        // For some reason, there is no `apps/v1beta1:DaemonSet`.
Contributor:

// It's unsafe to move between `extensions/v1beta1`, and the newer apiVersions due to differences in
// the behavior of the Deployment and ReplicaSet. Even if the apiVersion is changed,
// the resource will continue to use the old behavior, which will break the await logic. Without an
// alias set, the engine will recreate the resource with the newer apiVersion.
Contributor:

This general issue makes me think that I don't understand the implications of this PR:

  • It is true API versions are generally incompatible (e.g., Deployment.spec.selector is not required in e/v1b1, but is required in other versions), but I do not get the sense that we are super confident that we understand which API versions are actually schematically incompatible. Is this wrong?
  • In the event where there are incompatibilities, does aliasing actually cause problems? If a person changes a Deployment apiVersion from e/v1b1 to apps/v1, it seems like they'd get an error that .spec.selector is not set, and then they'll set it, and the upgrade will "just work"—right?
  • In the event where aliasing does cause problems, it seems like we'd want a very clear answer on which of these are schematically different, because any pair that is schematically different can't be aliased at all.

@lblackstone (Member Author):

  • I do not get the sense that we are super confident that we understand which API versions are actually schematically incompatible. Is this wrong?

Any schema incompatibilities will be caught during an update. For the example you mentioned with the Deployment selector, the SDK will raise an error if the selector is not present on newer apiVersions.
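As an illustration of that selector difference: a Deployment that validated under `extensions/v1beta1` without `spec.selector` must set it explicitly under `apps/v1`. The manifest below is only an example, not taken from the PR:

```yaml
# Under extensions/v1beta1, spec.selector was defaulted from the pod
# template labels; under apps/v1 it is required, and the apply fails
# with a validation error if it is omitted.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:            # required in apps/v1
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.17
```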

  • In the event where there are incompatibilities, does aliasing actually cause problems?

Yes, aliasing can cause problems in specific cases. The one I know about currently is the e/v1b1 Deployment. The problem is a bit subtle, which I tried to capture in the comments here:

  1. If the Deployment is created with the e/v1b1 API, the Deployment will never include the newer status fields, regardless of which API is used to retrieve the state. This is true even when the apiVersion of the Deployment is changed to a newer version. The Deployment has to be deleted and then recreated to get the new behavior.
  2. The await logic works differently for the e/v1b1 Deployment. The problem with aliasing is that it changes the associated apiVersion, and breaks the await logic (which can't tell that the Deployment was created with the e/v1b1 apiVersion). I suspect that this is a bug in Kubernetes.

In summary, it's not exactly aliasing that is the problem, it's the different behavior from e/v1b1 Deployment resources, even after the apiVersion is changed explicitly. We need to recreate these resources to get them to work properly, which is probably out of scope for this PR.

Member:

But Kubernetes itself treats these as the same, and doesn't replace, right? Like if I had an apps/v1/DaemonSet named foo, and then tried to post an extensions/v1beta1/DaemonSet named foo, it would modify the existing resource instead of creating a new one (or even replacing the old one), right? Which is not the same as what would happen if I posted a core/v1/Pod named foo.

So we should match that - right?

@lblackstone (Member Author):

But Kubernetes itself treats these as the same, and doesn't replace, right?

Yes, Kubernetes would not replace the resource for a changed apiVersion.

I think we have two choices here:

  1. Force a replacement for the extensions/v1beta1 apiVersion when the resource is updated to a newer apiVersion. The behavior is different between these versions, and a replacement is necessary to avoid having a resource that claims to be a newer version but keeps the old behavior.
  2. Change the await logic so that every Deployment apiVersion uses the extensions/v1beta1 logic. This has the advantage of not requiring a replacement, but is less robust and more complex. This has a chance of introducing bugs for newer apiVersions since the readiness logic would change.

I'm in favor of 1. Worst case, this is a one-time replacement for a limited number of resources. The extensions/v1beta1 apiVersion will be completely unsupported in another 6 months (k8s 1.18), at which time this will be a moot point, and we can likely remove the legacy code path.

@hausdorff @metral Thoughts?

Contributor:

Just to loop back on this, I believe it should be sufficient to mark apiVersion as a replacement-requiring property for e/v1b1 and alias all apiVersions of every resource. The schemas are incompatible, but if you're changing the apiVersion users will change the incompatible properties too. Replacing e/v1b1s will also fix the status issue (though obviously this is not an amazing experience).

Contributor:

Oh, just saw the new comments. We could match the non-replace behavior here, but we'd have to keep track of which resources used to be e/v1b1. That's kind of a pain, and I don't see it as the end of the world if we don't do it this way, but that's fine with me too.

@lblackstone (Member Author):

The latest commits to this PR take approach 1. I think this is the best tradeoff, and is vastly better than the current behavior, which requires manual user intervention to resolve. As @hausdorff mentioned, many Helm charts are rolling apiVersions now, so it's important to get this fix out ASAP.

Contributor:

It's a bit of a pain to do it the other way, but not that big of a pain. You could put this state in the annotations, for example.

@lblackstone (Member Author):

@hausdorff I decided to go with your annotation suggestion to minimize disruption to existing resources. That should also catch hard-to-diagnose cases where the user aliased a e/v1b1 Deployment themselves.

Tested this out and was able to seamlessly move between apiVersions without breaking the await logic.
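The annotation approach could look roughly like this in Go. The annotation key and helper names below are assumptions for illustration, not the provider's actual identifiers:

```go
// Illustrative sketch of recording the create-time apiVersion in an
// annotation so the await logic can pick the correct code path even
// after the user changes the manifest's apiVersion. The annotation key
// and function names are hypothetical.
package main

import "fmt"

const initialAPIVersionAnnotation = "pulumi.com/initialApiVersion" // assumed key

// setInitialAPIVersion records the apiVersion used at create time,
// without overwriting a value that is already present.
func setInitialAPIVersion(annotations map[string]string, apiVersion string) {
	if _, ok := annotations[initialAPIVersionAnnotation]; !ok {
		annotations[initialAPIVersionAnnotation] = apiVersion
	}
}

// usesLegacyAwaitLogic reports whether the resource was originally
// created as extensions/v1beta1 and should use the legacy await path.
func usesLegacyAwaitLogic(annotations map[string]string) bool {
	return annotations[initialAPIVersionAnnotation] == "extensions/v1beta1"
}

func main() {
	ann := map[string]string{}
	setInitialAPIVersion(ann, "extensions/v1beta1")
	// Later, the user changes the manifest to apps/v1; the annotation
	// still records the original create-time apiVersion.
	setInitialAPIVersion(ann, "apps/v1")
	fmt.Println(usesLegacyAwaitLogic(ann)) // prints true
}
```

Because the annotation is written only once, the resource's create-time behavior stays visible to the await logic no matter how many times the declared apiVersion changes afterward.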

@hausdorff (Contributor):

I think I don't understand @metral's comment about breaking up the PR because it's too big. It's not actually 375 files, it's 5 files and 370 autogenerated files. Does that seem unmanageable?

@hausdorff (Contributor):

This is causing issues because a lot of people are upgrading Charts: since extensions/v1beta1 has been removed in Kubernetes v1.16, many charts are transitioning to newer apiVersions. Pulumi treats identical resources with different values for this field as different resources, while Kubernetes considers them the same, so when the apiVersion changes we schedule a Create of the "new" resource and a subsequent Delete of the "old" resource. When we try to do the Create, Kubernetes rejects the operation because the resource already exists.

Kubernetes apiVersions are mostly forward/backward
compatible, so for cases where we know it's safe, we
auto-alias the apiVersions so that the engine does not
force a replacement when a resource is updated to a
compatible apiVersion.
Force a replacement for resources with the
extensions/v1beta1 apiVersion when the apiVersion
changes.
Reverse the force-replace change for e/v1b1 apiVersions,
and instead store the apiVersion for the original create as
an annotation on the k8s resource. The await logic uses
this annotation to choose the correct logic path.
pkg/gen/typegen.go (thread resolved)
pkg/metadata/annotations.go (thread resolved, outdated)
pkg/metadata/annotations.go (thread resolved)
@hausdorff (Contributor):

Moving from e/v1b1 -> apps/v1 triggers a replace. This is not intentional, right?

[Screenshot: Screen Shot 2019-10-04 at 11 25 39 AM]

@hausdorff (Contributor) left a comment

Alright I think this is good enough to ship. The previous replace issue I mentioned before turns out to have been because I was using a new build of the provider plugin, but somehow an old version of the TS code.

Can we file follow-up issues for the things I mentioned?

Also, it would be nice for the alias tests to specifically test that e/v1b1 does not hang when it's upgraded and a new rollout is triggered.

Check that e/v1b1 Deployment can update successfully after
changing the apiVersion.
@lblackstone (Member Author):

@hausdorff Updated the test to verify the rollout of an upgraded e/v1b1 Deployment and opened issues to track feedback.


Successfully merging this pull request may close these issues.

Automatically alias different ApiVersions
4 participants