Empty objects vs. null objects cause repeated syncs #18213
Comments
Is this with Server Side Diff on or off? I would imagine looking at https://argo-cd.readthedocs.io/en/stable/user-guide/diff-strategies/#server-side-diff may help with scenarios like this. I don't believe there are plans to make Server Side Diff the default until a 3.x release, but this may be something you want to take a look at... |
I just tried enabling that and it didn't seem to make a difference in my environment. |
FWIW, this works for me for the "resources: {}" diff
|
I haven't set any options for server side diff in the application or the main configmap, so it should be off.
@evanrich Thanks, I was hoping I wouldn't have to ignore resources for the cases where there are real differences, but I can probably use that for now. |
This is very strange, I have multiple clusters with basically the same configuration. Same application (external-secrets) deployed to both clusters. I verified that on the cluster which is not showing a diff and therefore not syncing has So same application running in the same version of Argo and Kubernetes seems to sometimes ignore this difference and sometimes not ignore it. |
Noticed this problem with version 2.11 quite a lot. Adding this to ignoreDifferences doesn't seem like a good idea to me. |
This has suddenly emerged in my k3s cluster as well. Kubernetes version: v1.30.0+k3s1 |
Similar issue: #13004 |
Just wanted to note that I also tried enabling server side diff, but it didn't have any effect on this issue. |
I haven't seen any updates in the release notes for recent releases that indicate this has been changed, so I'm staying on helm chart version 6.7.18 (argo v2.10.9). Hopefully this gets addressed or commented on soon, because it's not just empty objects: a ton of other resources are suddenly throwing sync errors in the latest versions as well. |
Same problem here. Like others, I have used ignoreDifferences:
  - group: apps
    kind: "*"
    jqPathExpressions:
      - ".spec.template.spec.containers[].resources"
  - group: apps
    kind: "*"
    jqPathExpressions:
      - ".spec.template.spec.initContainers[].resources"
|
Comparing a working vs. non-working cluster, I found that the gvkParser is nil on the non-working cluster (the one where the empty resources is detected as a difference). With some changes to the code I see this error message in the logs on the cluster that's not working:
See also: argoproj/gitops-engine#425 This seems to be a known issue in the GVKParser: kubernetes/kubernetes#103597 |
The issue in my case turned out to be caused by an older version of the Kubernetes metrics server installed in the EKS cluster. First I updated the application controller using a locally built container (see argoproj/gitops-engine#585) to display the parser error.
This duplication comes from the kube api server having multiple versions of the APIResourceList definition, which can be seen in the raw openapi schema.
Then I searched for where
The only place it was referenced was the metrics server api endpoint
After the metrics update removed Edit: also wanted to note that after applying the newer metrics server it takes a few minutes before the openapi schema is updated, and then you'll also need to restart the argocd-application-controller pod. |
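For anyone who wants to check their own cluster for the kind of duplication described in this comment, here is a minimal sketch. It assumes the OpenAPI v2 document has already been saved as JSON (for example with `kubectl get --raw /openapi/v2`) and relies only on the `x-kubernetes-group-version-kind` extension found in the `definitions` section. The function name and the sample model names are made up for illustration; this is not code from Argo CD or gitops-engine.

```python
from collections import defaultdict

GVK_EXTENSION = "x-kubernetes-group-version-kind"


def find_duplicate_gvks(openapi_doc):
    """Return {(group, version, kind): [model names]} for GVKs claimed by 2+ models."""
    owners = defaultdict(list)
    for model_name, model in openapi_doc.get("definitions", {}).items():
        for gvk in model.get(GVK_EXTENSION, []):
            key = (gvk.get("group", ""), gvk.get("version", ""), gvk.get("kind", ""))
            owners[key].append(model_name)
    return {key: names for key, names in owners.items() if len(names) > 1}


# Hypothetical, trimmed-down schema: two models claim the same GVK.
sample = {
    "definitions": {
        "example.v1.APIResourceList": {
            GVK_EXTENSION: [{"group": "", "version": "v1", "kind": "APIResourceList"}]
        },
        "example.v1.APIResourceList_copy": {
            GVK_EXTENSION: [{"group": "", "version": "v1", "kind": "APIResourceList"}]
        },
        "example.v1.Pod": {
            GVK_EXTENSION: [{"group": "", "version": "v1", "kind": "Pod"}]
        },
    }
}

for gvk, models in find_duplicate_gvks(sample).items():
    print(gvk, "is defined by", models)
```

To run it against a real cluster, load the saved file with `json.load` and pass the resulting dict to `find_duplicate_gvks`.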
Interesting. I've seen similar duplications in k8s 1.30.0, so I'll be diving into the problem more. Glad you were able to sort it in your instance! |
Same here, fresh k3s deployment ships with |
I'm seeing the same since (I think) I upgraded from k8s 1.29.* to 1.30.1. Most of my helm charts render |
This disappeared for me with the upgrade to 1.30.2 |
@pschichtel interesting... was there a change in version of metrics-server? (Or of any other aggregated API?) |
@crenshaw-dev metrics-server was updated as part of the 1.29.* -> 1.30.1 upgrade, but it hasn't changed when upgrading to 1.30.2 |
That's strange... are you able to port-forward your metrics-server service and dump the |
not sure that's helpful
|
I just checked argo again and the issue actually came back, sorry for the confusion! |
Okay that makes more sense, I'm not seeing any relevant changes in k8s or metrics-server. I spent a bunch of time today trying to de-dupe OpenAPI models before feeding them to GVKParser. It avoids the NewGVKParser error, but apparently some models (like Working on it though. |
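A rough illustration of the de-duplication idea mentioned in this comment, not the actual gitops-engine change: keep the first model seen for each GVK and record every later model that claims an already-seen GVK, so the dropped ones can be logged or surfaced. The function and model names are hypothetical, and as the comment notes, dropping models can have side effects that this sketch does not handle.

```python
GVK_EXTENSION = "x-kubernetes-group-version-kind"


def dedupe_models(definitions):
    """Split models into (kept, dropped); a model is dropped if any of its GVKs was already seen."""
    seen = set()
    kept = {}
    dropped = []  # list of (model name, clashing GVKs)
    for name, model in definitions.items():
        gvks = [
            (g.get("group", ""), g.get("version", ""), g.get("kind", ""))
            for g in model.get(GVK_EXTENSION, [])
        ]
        clashes = [g for g in gvks if g in seen]
        if clashes:
            dropped.append((name, clashes))
            continue
        seen.update(gvks)
        kept[name] = model
    return kept, dropped


# Hypothetical models: "b" repeats the GVK of "a"; "d" has no GVK extension at all.
definitions = {
    "a": {GVK_EXTENSION: [{"group": "", "version": "v1", "kind": "APIResourceList"}]},
    "b": {GVK_EXTENSION: [{"group": "", "version": "v1", "kind": "APIResourceList"}]},
    "c": {GVK_EXTENSION: [{"group": "apps", "version": "v1", "kind": "Deployment"}]},
    "d": {"type": "object"},
}

kept, dropped = dedupe_models(definitions)
for name, clashes in dropped:
    print("dropping model", name, "because of duplicate GVKs", clashes)
```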
I'm working on the fix and have written up a description of the issue here: https://github.com/argoproj/gitops-engine/pull/587/files#diff-bbeadede3a86f27922253617b66c49b7672bb52896a10cf5626b398fa0fa9280 I'm trying to decide between just logging the dropped duplicate GVKs or also surfacing to the user that their diff might be wrong due to duplicate GVKs. So far I'm trying to surface it to the user, but might end up being more code than it's worth. A couple open questions:
|
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
* fix(controller): bad server-side diffs (#18213)
* use upstream commit
The app eventually got in sync. |
* fix(controller): bad server-side diffs (#18213) (2.10)
* fix revision
* hopefully the right hash now
* fix(controller): bad server-side diffs (#18213) (2.11)
* hopefully the right hash now
* fix(controller): bad server-side diffs (#18213) (2.9)
* fix(controller): bad server-side diffs (#18213) (2.9)
* hopefully the right hash now
Summary
When a config option is not set in a Kubernetes manifest (e.g. a statefulset), ArgoCD continually tries to re-sync the object because it sees a diff between an empty object and an undefined one.
Motivation
I have a statefulset that does not define the resources section. ArgoCD continually tries to sync the object because Kubernetes automatically injects an empty object.
![Screenshot 2024-05-14 at 7 50 11 AM](https://private-user-images.githubusercontent.com/156120/330434311-cede707b-8c9e-4e55-8625-fc0b1a9b1880.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjAwMzc1NTAsIm5iZiI6MTcyMDAzNzI1MCwicGF0aCI6Ii8xNTYxMjAvMzMwNDM0MzExLWNlZGU3MDdiLThjOWUtNGU1NS04NjI1LWZjMGIxYTliMTg4MC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzAzJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwM1QyMDA3MzBaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1iNDg4MDM4MjJmZjQ0ZjYwOTM3MWI3OTZlZGIxZjE0ZGZkZThkNjFiNTc5MDBmMzY5MjJkZWFjY2ZlMmIwYWY1JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.3mQvj-g_QbUA_j_aH-5U0ygFwAEfe3mVngnVs_4UyhM)
Proposal
The diff calculator should ignore the difference between an empty and null object in a kubernetes resource field.
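To make the proposal concrete, here is a minimal sketch of what "ignore the difference between an empty and null object" could mean. It is an illustration only, not how Argo CD computes diffs: the helper names `prune_empty` and `equivalent` are made up, and a blanket rule like this would be wrong for fields where an empty object carries meaning (for example `emptyDir: {}` on a volume).

```python
def prune_empty(value):
    """Recursively drop dict keys whose value is None or an empty dict."""
    if isinstance(value, dict):
        pruned = {}
        for key, child in value.items():
            child = prune_empty(child)
            if child is None or child == {}:
                continue  # treat "missing", null and {} as the same thing
            pruned[key] = child
        return pruned
    if isinstance(value, list):
        return [prune_empty(item) for item in value]
    return value


def equivalent(desired, live):
    """Compare two manifests after pruning empty and null objects."""
    return prune_empty(desired) == prune_empty(live)


# Desired manifest omits "resources"; the live object has the injected empty object.
desired = {"spec": {"template": {"spec": {"containers": [{"name": "app"}]}}}}
live = {
    "spec": {"template": {"spec": {"containers": [{"name": "app", "resources": {}}]}}}
}

print(equivalent(desired, live))
```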
Related
#15554