Server-side apply: migration from client-side apply leaves stuck fields in the object #99003
Comments
/sig api-machinery
Note that the same happens if the field is removed after the initial client-side apply and the following server-side apply. Since I'm doing SSA from a custom app, and not with …
/assign @apelisse
Thanks for the copy-pastable instructions @aermakov-zalando, that really made things super simple to reproduce! Yeah, we don't entirely convert …
I'm not sure I follow that use-case, can you describe that for me?
Is that custom-app using kubectl at all? The whole "conversion from csa to ssa" only happens for kubectl. I'm happy to understand your use-case and see if you're going to run into this situation again.
Sorry, I've missed a word there. What I meant is that I would expect the field to be dropped if I remove it from the local manifest after the client-side apply but before the first server-side apply, but this is not something that happens.
We have several apps that deploy resources to our clusters. They've been using …
Looking at the SSA docs, though, it doesn't look like this use case is supported by SSA. Handover of fields from one applier to another requires the former owner to apply again with an empty manifest to relinquish the field ownership. This means that a user can't just switch from applying with tool A to applying with tool B, which can be problematic for manifests that are only applied infrequently (now the legacy tool has to be dragged around forever). Wouldn't it make more sense to have an SSA with …
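For illustration, a handover between two SSA appliers looks roughly like the following sketch (the ConfigMap name, the data.key field, and the tool-a/tool-b field managers are made up):

# tool-b applies the field it wants to take over; with an identical value this
# succeeds without --force-conflicts and both managers co-own data.key
kubectl apply --server-side --field-manager=tool-b -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-cm
data:
  key: value
EOF

# tool-a then re-applies a manifest that omits data.key, relinquishing its
# ownership and leaving tool-b as the sole owner
kubectl apply --server-side --field-manager=tool-a -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-cm
EOF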
Yeah, we don't have that feature, and since it's the first time someone has asked for it, I'm not sure how widely needed this is. An alternative workaround is to remove the entry from ManagedFields (use at your own risk).
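For reference, a minimal sketch of that workaround (the ConfigMap name is made up; note this clears all managedFields entries on the object, not just the stale client-side-apply one, so the caveat above applies):

# resets managedFields; subsequent applies rebuild field ownership from scratch
kubectl patch configmap example-cm --type=merge -p '{"metadata":{"managedFields":[{}]}}'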
This is a problem for …
Julian and I talked about this and we'll see how we can address it specifically for …
I see. It'd be nice if custom controllers could benefit from this (since as far as I understand one of the stated goals of SSA was to make …
Maybe I'm wrong but I don't expect this to be very frequent. Removing the managed fields is still a fairly easy operation to run in any language.
@julianvmodesto That might be an option for the client (kubectl) too, just remove the "kubectl-client-side-apply" manager when server-side apply is run.
I don't think it's that easy. This only works if the manifest that's applied with …
You're right, I don't think we would remove the fields merely because the manager was removed, but let me try to reproduce first. Again, I don't expect this to be a very frequent process or that it'll be massively used by people (especially outside kubectl).
Is that a solution for your use-case? Client-side apply and then remove the annotation, then you can server-side apply freely? I know it's not great.
I don't really like this. This means that even though the resource is in state X' (edited manually) and the manifest is in state Y (edited by the user), an apply would reset it first to X (last value from the last-applied-configuration annotation), which can potentially break things. It also makes the apply process non-atomic, so I potentially have to retry this multiple times (and I don't think I can PATCH a particular resource version).
I think I can reproduce this in a kubectl integration test: #99277
I don't think I'd call this done just yet.
Shouldn't csaupgrade.UpgradeManagedFields be backported to earlier versions of client-go? This is a bug that affects people now, so a fix should be made readily available so that tools can adopt it now, without having to wait for v1.26 and then only support the fix for v1.26 and later.
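For tools that can vendor client-go v0.26+, a rough sketch of the migration step might look like this (assuming the csaupgrade package API roughly as it ships in v0.26; the my-ssa-manager field manager name is made up):

package migrate

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/sets"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/util/csaupgrade"
)

// migrateOwnership rewrites the managedFields entry left behind by
// kubectl-client-side-apply so that the SSA field manager "my-ssa-manager"
// owns those fields, then persists the change with an Update call.
func migrateOwnership(ctx context.Context, client kubernetes.Interface, namespace, name string) error {
	cm, err := client.CoreV1().ConfigMaps(namespace).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	if err := csaupgrade.UpgradeManagedFields(cm, sets.New("kubectl-client-side-apply"), "my-ssa-manager"); err != nil {
		return err
	}
	_, err = client.CoreV1().ConfigMaps(namespace).Update(ctx, cm, metav1.UpdateOptions{})
	return err
}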
I think there's an issue if a resource was first applied with kubectl 1.14, then modified with kubectl 1.19 and then server-side-applied with kubectl 1.26:

Apply with kubectl 1.14:

then apply with kubectl 1.19:

then apply with kubectl 1.26 with `--server-side`:

Result:

This was tested on a Kubernetes 1.20 apiserver.
I was able to reproduce this locally. This case is not supported by the automatic ownership transition -- only fields owned by the field manager which owns … Since I don't think there is a change we can make in the code to automatically fix this in general, since it is not possible to distinguish your … In your case, this can be worked around by:

kubectl apply (v1.26)

apiVersion: v1
data:
  one:
  three: three
kind: ConfigMap
metadata:
  name: test-ssa

kubectl apply (v1.26)

apiVersion: v1
data:
  three: three
kind: ConfigMap
metadata:
  name: test-ssa

In other words, take ownership of the old fields used in 1.14, and send another patch to remove them. Even if we can't automatically transition these users, I do think a better/cleaner workaround is necessary, so I am reopening this.

/reopen
@alexzielenski: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This commit incorporates several significant changes to the resource diff logic. These changes are summarized below, followed by detailed explanations of each:

1. Drop usage of the "kubectl.kubernetes.io/last-applied-configuration" annotation.
2. Compute preview diffs using resource inputs rather than making a dry-run API call.
3. Automatically update `.metadata.managedFields` to work with resources that were managed with client-side apply, and later upgraded to use server-side apply.
4. Fix a bug with the diff calculation so that resource drift is detected accurately after a refresh.

---

1. Previous versions of the provider used the "kubectl.kubernetes.io/last-applied-configuration" annotation to store a copy of the configuration as part of the Kubernetes resource. This annotation was used for input diff computation in client-side apply mode. Specifically, input diffs were computed between the new inputs and the last-applied-configuration value from the previous live state. Using this annotation resulted in a large number of reported issues (30+) for the provider, and it will no longer be used. 14916d3 added logic to prune a map based on a target map. This logic is used to prune live state to match the previous inputs before performing an input diff computation. This avoids the need to store the last-applied-configuration.

2. Server-side apply mode in 3.x versions of the provider performed several network-bound API calls as part of the resource diff computation. This led to poor performance and diverged from the way that other Pulumi providers compute resource diffs. This behavior was brought in line with other providers, and now previews resource changes using the client-side inputs. Accepted changes are still updated using server-side apply. Note that this now requires the stack to be refreshed to detect resource drift in both client-side apply and server-side apply modes.

3. In server-side apply (SSA) mode, Kubernetes uses information in the `.metadata.managedFields` property to determine ownership of each field in a resource. Kubernetes v1.18+ sets the managedFields property in both client-side and server-side apply modes. Operations performed using client-side mode set a field manager with an operation type of "Update", while server-side operations use "Apply". Kubernetes treats these as separate field managers, which can lead to problems during server-side updates. To avoid this class of problems, the provider automatically takes ownership of fields managed by known client-side field managers during the first SSA-enabled update. This change should be transparent to users. See kubernetes/kubernetes#99003 for additional details.

4. Previous versions of the provider contained a bug where previous inputs rather than live state were used to compute a diff. This prevented the provider from detecting resource drift caused by other controllers. This bug was fixed so that drift will be corrected on update after a refresh is performed.

---------

Co-authored-by: Ramon Quitales <ramon@pulumi.com>
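As a rough illustration of the map-pruning idea described in point 1 above (the function name and shape are assumptions, not the provider's actual code):

// pruneMap returns a copy of live containing only the keys that also appear
// in target, recursing into nested maps, so that live state can be compared
// against previous inputs without the extra server-populated fields.
func pruneMap(live, target map[string]interface{}) map[string]interface{} {
	result := make(map[string]interface{}, len(target))
	for key, targetVal := range target {
		liveVal, ok := live[key]
		if !ok {
			continue
		}
		liveMap, liveOK := liveVal.(map[string]interface{})
		targetMap, targetOK := targetVal.(map[string]interface{})
		if liveOK && targetOK {
			result[key] = pruneMap(liveMap, targetMap)
			continue
		}
		result[key] = liveVal
	}
	return result
}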
@alexzielenski That use case is using server-side apply to remove the annotation added by … The kubectl version is 1.23.

command output:

The kubectl 1.27.3 output:
I'm thinking that many use cases would be covered by adding support in apply to migrate the ownership of all fields except .status. So some alternative implementation of csaupgrade.FindFieldsOwners is needed here:

One typical case for this would be when calling ApplyOptions.Run from https://github.com/kubernetes-sigs/cli-utils/blob/master/pkg/apply/task/apply_task.go#L158 since the cli-utils library, in my understanding, is built for having a single owner of a resource (except .status).

In my case I use kpt to apply manifests to clusters, where I face the problem described in this issue (as specified in kptdev/kpt#3592). Kpt is in turn implemented using cli-utils. After the version of cli-utils was upgraded in kpt (so csaupgrade is used) the problem has lessened, but as described above it isn't solved.
I'm not sure why you'd like to remove the annotation, but you can do so via SSA using two kubectl applies. Two commands are needed since …

First, move the field into SSA management and take ownership of it:

kubectl apply --server-side --force-conflicts --field-manager temporary-annotation-manager -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: default
spec:
  template:
    metadata:
      annotations:
        "kubectl.kubernetes.io/restartedAt": null
EOF

After this command …
Then inform the k8s apiserver that …

kubectl apply --server-side --field-manager temporary-annotation-manager -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: default
EOF

Resultant yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "4"
  creationTimestamp: "2023-12-11T17:43:26Z"
  generation: 5
  managedFields:
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations: {}
      f:spec:
        f:progressDeadlineSeconds: {}
        f:replicas: {}
        f:revisionHistoryLimit: {}
        f:selector: {}
        f:strategy:
          f:rollingUpdate:
            f:maxSurge: {}
            f:maxUnavailable: {}
          f:type: {}
        f:template:
          f:metadata:
            f:creationTimestamp: {}
            f:labels:
              f:app: {}
          f:spec:
            f:containers:
              k:{"name":"nginx"}:
                .: {}
                f:image: {}
                f:imagePullPolicy: {}
                f:name: {}
                f:ports:
                  k:{"containerPort":80,"protocol":"TCP"}:
                    .: {}
                    f:containerPort: {}
                    f:protocol: {}
                f:resources: {}
                f:terminationMessagePath: {}
                f:terminationMessagePolicy: {}
            f:dnsPolicy: {}
            f:restartPolicy: {}
            f:schedulerName: {}
            f:securityContext: {}
            f:terminationGracePeriodSeconds: {}
    manager: kubectl
    operation: Apply
    time: "2023-10-05T09:06:49Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:replicas: {}
    manager: kubectl
    operation: Update
    subresource: scale
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:deployment.kubernetes.io/revision: {}
      f:status:
        f:availableReplicas: {}
        f:conditions:
          .: {}
          k:{"type":"Available"}:
            .: {}
            f:lastTransitionTime: {}
            f:lastUpdateTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
          k:{"type":"Progressing"}:
            .: {}
            f:lastTransitionTime: {}
            f:lastUpdateTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
        f:observedGeneration: {}
        f:readyReplicas: {}
        f:replicas: {}
        f:unavailableReplicas: {}
        f:updatedReplicas: {}
    manager: kube-controller-manager
    operation: Update
    subresource: status
    time: "2023-12-11T17:46:50Z"
  name: nginx-deployment
  namespace: default
  resourceVersion: "397"
  uid: 44bcee41-f1e7-49f2-ad18-17eaa03daad5
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.18
        imagePullPolicy: IfNotPresent
        name: nginx
        ports:
        - containerPort: 80
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  conditions:
  - lastTransitionTime: "2023-12-11T17:43:26Z"
    lastUpdateTime: "2023-12-11T17:43:26Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2023-12-11T17:43:26Z"
    lastUpdateTime: "2023-12-11T17:46:50Z"
    message: ReplicaSet "nginx-deployment-584f64656" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
  observedGeneration: 5
  replicas: 2
  unavailableReplicas: 2
  updatedReplicas: 1
Migrating from client-side apply to server-side apply (as currently implemented in kubectl and documented here) leaves the objects in a somewhat corrupted state, and results in the resources forever being inconsistent with their expected state. Namely, removed fields that were set in the initial client-side-applied version stay in the resource forever and aren't dropped.
What happened:
If I apply a manifest using client-side apply and then switch to server-side apply with no changes, the operation succeeds. However, if I then delete a field from the manifest and run server-side apply again, the field stays in the resource instead of being removed.
What you expected to happen:
I would obviously expect the field to be removed from the resource.
How to reproduce it (as minimally and precisely as possible):
Create a configmap in the cluster with client-side apply:
Confirm that the configmap is fine:
Apply the same manifest with server-side apply:
Confirm that it wasn't changed:
Remove one of the values from the configmap and apply again using server-side apply:
Check the configmap, expecting the `legacy` value to be removed:
As you can see, the `legacy` value is still there.
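For illustration, the whole sequence boils down to something like this (the ConfigMap name and the current key are made up; legacy is the key being removed):

# 1. Create the ConfigMap with client-side apply
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: test-cm
data:
  current: "true"
  legacy: "true"
EOF

# 2. Migrate by server-side applying the identical manifest
kubectl apply --server-side -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: test-cm
data:
  current: "true"
  legacy: "true"
EOF

# 3. Server-side apply again with the legacy key removed
kubectl apply --server-side -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: test-cm
data:
  current: "true"
EOF

# 4. data.legacy is still present even though it was dropped from the manifest
kubectl get configmap test-cm -o yaml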
Anything else we need to know?:
This happens because the client-side apply results are recorded in `managedFields` with an `Update` operation by the `kubectl-client-side-apply` field manager. When the resource is migrated to server-side apply, the results are instead tracked as `Apply` by the `kubectl` field manager. If there are no conflicts between the two field managers (and naturally you wouldn't expect any, since you're trying to convert the same resource to SSA), both field managers will own the fields and the client-side apply entry won't get kicked out. If you then remove a field from the list of fields managed by SSA, the field will still be owned by the old `Update` entry from `kubectl-client-side-apply` and won't be removed as a result.
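For illustration, after step 3 of the sketch above the object carries two overlapping entries along these lines (abridged; key names follow that sketch):

managedFields:
- manager: kubectl-client-side-apply
  operation: Update
  apiVersion: v1
  fieldsType: FieldsV1
  fieldsV1:
    f:data:
      f:current: {}
      f:legacy: {}
- manager: kubectl
  operation: Apply
  apiVersion: v1
  fieldsType: FieldsV1
  fieldsV1:
    f:data:
      f:current: {}

Because the Update entry still lists f:legacy, dropping the key from the server-side applied manifest does not delete it from the object.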
Environment:
Kubernetes version (use `kubectl version`):