Auto-alias resource apiVersions #798
Conversation
Overall LGTM.
For future PRs of this size, it'd be great if we could break it up into multiple smaller commits, preferably with identical changes segmented together. It took me a couple of review sessions to get through it so I'm sure I've missed something 😄
pkg/gen/typegen.go
Outdated
case kinds.DaemonSet:
	return []string{
		"kubernetes:apps/v1:DaemonSet",
		// For some reason, there is no `apps/v1beta1:DaemonSet`.
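The alias table being reviewed here maps each kind to the set of type tokens that should be treated as the same resource. A minimal sketch of the idea (hypothetical helper name and trimmed alias lists; the real table is generated in pkg/gen/typegen.go):

```go
package main

import "fmt"

// aliasesForKind returns the fully-qualified type tokens that should alias one
// another, so the engine does not force a replacement when only the apiVersion
// changes. This is an illustrative sketch, not the generated code.
func aliasesForKind(kind string) []string {
	switch kind {
	case "DaemonSet":
		return []string{
			"kubernetes:apps/v1:DaemonSet",
			"kubernetes:apps/v1beta2:DaemonSet",
			// There is no apps/v1beta1:DaemonSet; the old version lives
			// under extensions/v1beta1.
			"kubernetes:extensions/v1beta1:DaemonSet",
		}
	default:
		// Kinds without known-compatible versions get no aliases.
		return nil
	}
}

func main() {
	fmt.Println(aliasesForKind("DaemonSet"))
}
```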
There is; it's under `extensions`: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#daemonset-v1beta1-extensions
pkg/gen/typegen.go
Outdated
// It's unsafe to move between `extensions/v1beta1` and the newer apiVersions due to differences in
// the behavior of the Deployment and ReplicaSet. Even if the apiVersion is changed,
// the resource will continue to use the old behavior, which will break the await logic. Without an
// alias set, the engine will recreate the resource with the newer apiVersion.
This general issue makes me think that I don't understand the implications of this PR:

- It is true API versions are generally incompatible (e.g., `Deployment.spec.selector` is not required in e/v1b1, but is required in other versions), but I do not get the sense that we are super confident that we understand which API versions are actually schematically incompatible. Is this wrong?
- In the event where there are incompatibilities, does aliasing actually cause problems? If a person changes a Deployment `apiVersion` from e/v1b1 to apps/v1, it seems like they'd get an error that `.spec.selector` is not set, and then they'll set it, and the upgrade will "just work", right?
- In the event where aliasing does cause problems, it seems like we'd want a very clear answer on which of these are schematically different, because any pair that is schematically different can't be aliased at all.
> I do not get the sense that we are super confident that we understand which API versions are actually schematically incompatible. Is this wrong?

Any schema incompatibilities will be caught during an update. For the example you mentioned with the Deployment selector, the SDK will raise an error if the selector is not present on newer apiVersions.
> In the event where there are incompatibilities, does aliasing actually cause problems?
Yes, aliasing can cause problems in specific cases. The one I know about currently is the e/v1b1 Deployment. The problem is a bit subtle, which I tried to capture in the comments here:
- If the Deployment is created with the e/v1b1 API, the Deployment will never include the newer status fields, regardless of which API is used to retrieve the state. This is true even when the apiVersion of the Deployment is changed to a newer version. The Deployment has to be deleted and then recreated to get the new behavior.
- The await logic works differently for the e/v1b1 Deployment. The problem with aliasing is that it changes the associated apiVersion, and breaks the await logic (which can't tell that the Deployment was created with the e/v1b1 apiVersion). I suspect that this is a bug in Kubernetes.
In summary, it's not exactly aliasing that is the problem, it's the different behavior from e/v1b1 Deployment resources, even after the apiVersion is changed explicitly. We need to recreate these resources to get them to work properly, which is probably out of scope for this PR.
But Kubernetes itself treats these as the same, and doesn't replace, right? Like if I had an `apps/v1` DaemonSet named `foo`, and then tried to post an `extensions/v1beta1` DaemonSet named `foo`, it would modify the existing resource instead of creating a new one (or even replacing the old one), right? Which is not the same as what would happen if I posted a `core/v1` Pod named `foo`.

So we should match that, right?
> But Kubernetes itself treats these as the same, and doesn't replace, right?
Yes, Kubernetes would not replace the resource for a changed apiVersion.
I think we have two choices here:

1. Force a replacement for the `extensions/v1beta1` apiVersion when the resource is updated to a newer apiVersion. The behavior is different between these versions, and a replacement is necessary to avoid having a resource that claims to be a newer version but keeps the old behavior.
2. Change the await logic so that every Deployment apiVersion uses the `extensions/v1beta1` logic. This has the advantage of not requiring a replacement, but is less robust and more complex. This has a chance of introducing bugs for newer apiVersions since the readiness logic would change.

I'm in favor of 1. Worst case, this is a one-time replacement for a limited number of resources. The `extensions/v1beta1` apiVersion will be completely unsupported in another 6 months (k8s 1.18), at which time this will be a moot point, and we can likely remove the legacy code path.
@hausdorff @metral Thoughts?
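Option 1 amounts to a diff rule along these lines. This is a minimal sketch with a hypothetical function name, not the provider's actual diff code:

```go
package main

import "fmt"

// requiresReplace reports whether changing apiVersion should force the engine
// to delete and recreate the resource. Sketch of option 1: a resource moving
// off extensions/v1beta1 keeps its old server-side behavior, so a replacement
// is required to actually pick up the newer semantics.
func requiresReplace(oldAPIVersion, newAPIVersion string) bool {
	return oldAPIVersion == "extensions/v1beta1" && newAPIVersion != oldAPIVersion
}

func main() {
	// Moving off e/v1b1 forces a replacement.
	fmt.Println(requiresReplace("extensions/v1beta1", "apps/v1"))
	// Moves between newer, compatible versions stay aliased in place.
	fmt.Println(requiresReplace("apps/v1beta2", "apps/v1"))
}
```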
Just to loop back on this, I believe it should be sufficient to mark `apiVersion` as a replacement-requiring property for e/v1b1 and alias all `apiVersion`s of every resource. The schemas are incompatible, but if you're changing the `apiVersion`, users will change the incompatible properties too. Replacing e/v1b1s will also fix the status issue (though obviously this is not an amazing experience).
Oh, just saw the new comments. We could match the non-replace behavior here, but we'd have to keep track of which resources used to be e/v1b1. That's kind of a pain, and I don't see it as the end of the world if we don't do it this way, but that's fine with me too.
The latest commits to this PR take approach 1. I think this is the best tradeoff, and is vastly better than the current behavior, which requires manual user intervention to resolve. As @hausdorff mentioned, many Helm charts are rolling apiVersions now, so it's important to get this fix out ASAP.
It's a bit of a pain to do it the other way, but not that big of a pain. You could put this state in the annotations, for example.
@hausdorff I decided to go with your annotation suggestion to minimize disruption to existing resources. That should also catch hard-to-diagnose cases where the user aliased an e/v1b1 Deployment themselves.
Tested this out and was able to seamlessly move between apiVersions without breaking the await logic.
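The annotation approach can be sketched roughly as follows. The annotation key and helper names here are hypothetical placeholders; the provider defines its own key and wires this into its create and await paths:

```go
package main

import "fmt"

// Hypothetical annotation key; the provider uses its own prefixed key.
const initialAPIVersionKey = "example.com/initial-api-version"

// recordInitialAPIVersion stamps the apiVersion used at create time onto the
// resource's annotations, without overwriting an existing record.
func recordInitialAPIVersion(annotations map[string]string, apiVersion string) {
	if _, ok := annotations[initialAPIVersionKey]; !ok {
		annotations[initialAPIVersionKey] = apiVersion
	}
}

// awaitLogicFor picks the readiness-check path: a resource originally created
// as extensions/v1beta1 keeps the legacy await logic even after the user
// changes its apiVersion, because the server-side behavior doesn't change.
func awaitLogicFor(annotations map[string]string) string {
	if annotations[initialAPIVersionKey] == "extensions/v1beta1" {
		return "legacy"
	}
	return "standard"
}

func main() {
	ann := map[string]string{}
	recordInitialAPIVersion(ann, "extensions/v1beta1")
	// Later the user edits apiVersion to apps/v1; the annotation survives,
	// so the await logic still takes the legacy path.
	fmt.Println(awaitLogicFor(ann))
}
```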
I think I don't understand @metral's comment about breaking up the PR because it's too big. It's not actually 375 files, it's 5 files and 370 autogenerated files. Does that seem unmanageable?
This is causing issues because a lot of people are upgrading Charts.
Kubernetes apiVersions are mostly forward/backward compatible, so for cases where we know it's safe, we auto-alias the apiVersions so that the engine does not force a replacement when a resource is updated to a compatible apiVersion.
Force a replacement for resources with the extensions/v1beta1 apiVersion when the apiVersion changes.
Reverse the force-replace change for e/v1b1 apiVersions, and instead store the apiVersion for the original create as an annotation on the k8s resource. The await logic uses this annotation to choose the correct logic path.
Alright I think this is good enough to ship. The previous replace issue I mentioned before turns out to have been because I was using a new build of the provider plugin, but somehow an old version of the TS code.
Can we file follow-up issues for the things I mentioned?
Also, it would be nice for the alias tests to specifically test that e/v1b1 does not hang when it's upgraded and a new rollout is triggered.
Check that e/v1b1 Deployment can update successfully after changing the apiVersion.
@hausdorff Updated the test to verify the rollout of an upgraded e/v1b1 Deployment and opened issues to track feedback.
Proposed changes
Kubernetes apiVersions are mostly forward/backward compatible, so for cases where we know it's safe, we auto-alias the apiVersions so that the engine does not force a replacement when a resource is updated to a compatible apiVersion.
Related issues (optional)
Fixes #573