ApplicationSet suddenly deletes applications #18780

Closed
audrey-mux opened this issue Jun 23, 2024 · 4 comments · Fixed by #18781
Labels
bug/in-triage This issue needs further triage to be correctly classified bug Something isn't working component:application-sets Bulk application management related type:bug

Comments

@audrey-mux

Checklist:

  • [ x ] I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • [ x ] I've included steps to reproduce the bug.
  • [ x ] I've pasted the output of argocd version.

Describe the bug

We had a sudden deletion of a handful of applications created by appsets. The applicationset controller looks like it lost its connection to the kube-api service for less than a second. This caused errors in the application generation. The connection issue resolved quickly, but within a few seconds of the event the affected applications were deleted by the applicationset controller.

They were recreated a few seconds later, but the damage was done.

Since then, we have set the applicationset controller policy to create-update and added the following to all applicationset manifests:

metadata:
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  syncPolicy:
    preserveResourcesOnDeletion: true

Will that be enough to prevent deletion if this sort of error were to happen again?

To Reproduce

Break the applicationset controller's access to the local kube-api service.

Expected behavior

Expected at a minimum a retry, not application deletion.

Screenshots

Version

❯ argocd version
argocd: v2.11.3+3f344d5
  BuildDate: 2024-06-06T12:33:08Z
  GitCommit: 3f344d54a4e0bbbb4313e1c19cfe1e544b162598
  GitTreeState: clean
  GoVersion: go1.22.4
  Compiler: gc
  Platform: darwin/arm64
argocd-server: v2.11.2+25f7504
  BuildDate: 2024-05-23T13:32:13Z
  GitCommit: 25f7504ecc198e7d7fdc055fdb83ae50eee5edd0
  GitTreeState: clean
  GoVersion: go1.21.9
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v5.2.1 2023-10-19T20:13:51Z
  Helm Version: v3.14.4+g81c902a
  Kubectl Version: v0.26.11
  Jsonnet Version: v0.20.0

Logs

The API connection errors

2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating application from params" applicationset=argocd/gcp-pd-csi-driver error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="{nil &ClusterGenerator{Selector:{map[csi_driver:enabled environment:staging provider:gcp] []},Template:ApplicationSetTemplate{ApplicationSetTemplateMeta:ApplicationSetTemplateMeta{Name:,Namespace:,Labels:map[string]string{},Annotations:map[string]string{},Finalizers:[],},Spec:ApplicationSpec{Source:nil,Destination:ApplicationDestination{Server:,Namespace:,Name:,},Project:,SyncPolicy:nil,IgnoreDifferences:[]ResourceIgnoreDifferences{},Info:[]Info{},RevisionHistoryLimit:nil,Sources:[]ApplicationSource{},},},Values:map[string]string{environment: staging,},} nil nil nil nil nil nil nil nil}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating params" error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="&{0xc0014ea060 {{}} 0xc00109a4e0 argocd 0xc000fa8240}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error occurred during application validation: Get \"https://172.16.192.1:443/apis/argoproj.io/v1alpha1/namespaces/argocd/appprojects/aws-eu-central-1-dop1\": http2: client connection lost" applicationset=argocd/flink-kubernetes-operator
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating application from params" applicationset=argocd/access error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="{nil &ClusterGenerator{Selector:{map[environment:staging] []},Template:ApplicationSetTemplate{ApplicationSetTemplateMeta:ApplicationSetTemplateMeta{Name:,Namespace:,Labels:map[string]string{},Annotations:map[string]string{},Finalizers:[],},Spec:ApplicationSpec{Source:nil,Destination:ApplicationDestination{Server:,Namespace:,Name:,},Project:,SyncPolicy:nil,IgnoreDifferences:[]ResourceIgnoreDifferences{},Info:[]Info{},RevisionHistoryLimit:nil,Sources:[]ApplicationSource{},},},Values:map[string]string{environment: staging,},} nil nil nil nil nil nil nil nil}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating params" error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="&{0xc0014ea060 {{}} 0xc00109a4e0 argocd 0xc000fa8240}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating application from params" applicationset=argocd/autoscaler error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="{nil &ClusterGenerator{Selector:{map[environment:staging] []},Template:ApplicationSetTemplate{ApplicationSetTemplateMeta:ApplicationSetTemplateMeta{Name:,Namespace:,Labels:map[string]string{},Annotations:map[string]string{},Finalizers:[],},Spec:ApplicationSpec{Source:nil,Destination:ApplicationDestination{Server:,Namespace:,Name:,},Project:,SyncPolicy:nil,IgnoreDifferences:[]ResourceIgnoreDifferences{},Info:[]Info{},RevisionHistoryLimit:nil,Sources:[]ApplicationSource{},},},Values:map[string]string{environment: staging,},} nil nil nil nil nil nil nil nil}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating params" error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="&{0xc0014ea060 {{}} 0xc00109a4e0 argocd 0xc000fa8240}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating params" error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="&{0xc0014ea060 {{}} 0xc00109a4e0 argocd 0xc000fa8240}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating application from params" applicationset=argocd/coredns error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="{nil &ClusterGenerator{Selector:{map[environment:staging] []},Template:ApplicationSetTemplate{ApplicationSetTemplateMeta:ApplicationSetTemplateMeta{Name:,Namespace:,Labels:map[string]string{},Annotations:map[string]string{},Finalizers:[],},Spec:ApplicationSpec{Source:nil,Destination:ApplicationDestination{Server:,Namespace:,Name:,},Project:,SyncPolicy:nil,IgnoreDifferences:[]ResourceIgnoreDifferences{},Info:[]Info{},RevisionHistoryLimit:nil,Sources:[]ApplicationSource{},},},Values:map[string]string{environment: staging,},} nil nil nil nil nil nil nil nil}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating application from params" applicationset=argocd/storage error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="{nil &ClusterGenerator{Selector:{map[environment:staging] []},Template:ApplicationSetTemplate{ApplicationSetTemplateMeta:ApplicationSetTemplateMeta{Name:,Namespace:,Labels:map[string]string{},Annotations:map[string]string{},Finalizers:[],},Spec:ApplicationSpec{Source:nil,Destination:ApplicationDestination{Server:,Namespace:,Name:,},Project:,SyncPolicy:nil,IgnoreDifferences:[]ResourceIgnoreDifferences{},Info:[]Info{},RevisionHistoryLimit:nil,Sources:[]ApplicationSource{},},},Values:map[string]string{environment: staging,},} nil nil nil nil nil nil nil nil}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating application from params" applicationset=argocd/gce-addons error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="{nil &ClusterGenerator{Selector:{map[environment:staging] []},Template:ApplicationSetTemplate{ApplicationSetTemplateMeta:ApplicationSetTemplateMeta{Name:,Namespace:,Labels:map[string]string{},Annotations:map[string]string{},Finalizers:[],},Spec:ApplicationSpec{Source:nil,Destination:ApplicationDestination{Server:,Namespace:,Name:,},Project:,SyncPolicy:nil,IgnoreDifferences:[]ResourceIgnoreDifferences{},Info:[]Info{},RevisionHistoryLimit:nil,Sources:[]ApplicationSource{},},},Values:map[string]string{environment: staging,},} nil nil nil nil nil nil nil nil}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating application from params" applicationset=argocd/rbac error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="{nil &ClusterGenerator{Selector:{map[environment:staging] []},Template:ApplicationSetTemplate{ApplicationSetTemplateMeta:ApplicationSetTemplateMeta{Name:,Namespace:,Labels:map[string]string{},Annotations:map[string]string{},Finalizers:[],},Spec:ApplicationSpec{Source:nil,Destination:ApplicationDestination{Server:,Namespace:,Name:,},Project:,SyncPolicy:nil,IgnoreDifferences:[]ResourceIgnoreDifferences{},Info:[]Info{},RevisionHistoryLimit:nil,Sources:[]ApplicationSource{},},},Values:map[string]string{environment: staging,},} nil nil nil nil nil nil nil nil}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating params" error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="&{0xc0014ea060 {{}} 0xc00109a4e0 argocd 0xc000fa8240}"
2024-06-21 16:30:14.089	time="2024-06-21T23:30:14Z" level=error msg="error generating application from params" applicationset=argocd/vault-csi-controller error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="{nil &ClusterGenerator{Selector:{map[environment:staging] []},Template:ApplicationSetTemplate{ApplicationSetTemplateMeta:ApplicationSetTemplateMeta{Name:,Namespace:,Labels:map[string]string{},Annotations:map[string]string{},Finalizers:[],},Spec:ApplicationSpec{Source:nil,Destination:ApplicationDestination{Server:,Namespace:,Name:,},Project:,SyncPolicy:nil,IgnoreDifferences:[]ResourceIgnoreDifferences{},Info:[]Info{},RevisionHistoryLimit:nil,Sources:[]ApplicationSource{},},},Values:map[string]string{environment: staging,},} nil nil nil nil nil nil nil nil}"
2024-06-21 16:30:14.088	time="2024-06-21T23:30:14Z" level=error msg="error generating params" error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="&{0xc0014ea060 {{}} 0xc00109a4e0 argocd 0xc000fa8240}"
2024-06-21 16:30:14.088	W0621 23:30:14.088600       7 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.11/tools/cache/reflector.go:169: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2024-06-21 16:30:14.088	W0621 23:30:14.088583       7 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.11/tools/cache/reflector.go:169: watch of *v1.Secret ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2024-06-21 16:30:14.088	W0621 23:30:14.088563       7 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.11/tools/cache/reflector.go:169: watch of *v1.Secret ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2024-06-21 16:30:14.088	W0621 23:30:14.088528       7 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.11/tools/cache/reflector.go:169: watch of *v1alpha1.ApplicationSet ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2024-06-21 16:30:14.088	time="2024-06-21T23:30:14Z" level=error msg="error generating application from params" applicationset=argocd/tracing error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="{nil &ClusterGenerator{Selector:{map[environment:staging] []},Template:ApplicationSetTemplate{ApplicationSetTemplateMeta:ApplicationSetTemplateMeta{Name:,Namespace:,Labels:map[string]string{},Annotations:map[string]string{},Finalizers:[],},Spec:ApplicationSpec{Source:nil,Destination:ApplicationDestination{Server:,Namespace:,Name:,},Project:,SyncPolicy:nil,IgnoreDifferences:[]ResourceIgnoreDifferences{},Info:[]Info{},RevisionHistoryLimit:nil,Sources:[]ApplicationSource{},},},Values:map[string]string{environment: staging,},} nil nil nil nil nil nil nil nil}"
2024-06-21 16:30:14.088	time="2024-06-21T23:30:14Z" level=error msg="error generating params" error="error listing clusters: Get \"https://172.16.192.1:443/api/v1/namespaces/argocd/secrets?labelSelector=argocd.argoproj.io%2Fsecret-type%3Dcluster\": http2: client connection lost" generator="&{0xc0014ea060 {{}} 0xc00109a4e0 argocd 0xc000fa8240}"
2024-06-21 16:30:14.088	W0621 23:30:14.088376       7 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.11/tools/cache/reflector.go:169: watch of *v1alpha1.Application ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding

and the deletion (app names partially redacted)

2024-06-21 16:30:19.565	time="2024-06-21T23:30:19Z" level=info msg="Deleted application" app=argocd/eastus-vos1-rbac applicationset=argocd/rbac
2024-06-21 16:30:19.515	time="2024-06-21T23:30:19Z" level=info msg="Deleted application" app=argocd/us-west1-vos1-coredns applicationset=argocd/coredns
2024-06-21 16:30:19.461	time="2024-06-21T23:30:19Z" level=info msg="Deleted application" app=argocd/us-east-1-dos1-vault-csi-controller applicationset=argocd/vault-csi-controller
2024-06-21 16:30:19.411	time="2024-06-21T23:30:19Z" level=info msg="Deleted application" app=argocd/us-west1-vos1-access applicationset=argocd/access
2024-06-21 16:30:19.356	time="2024-06-21T23:30:19Z" level=info msg="Deleted application" app=argocd/eastus-vos1-gce-addons applicationset=argocd/gce-addons
2024-06-21 16:30:19.309	time="2024-06-21T23:30:19Z" level=info msg="Deleted application" app=argocd/us-west4-vos1-storage applicationset=argocd/storage
2024-06-21 16:30:19.259	time="2024-06-21T23:30:19Z" level=info msg="Deleted application" app=argocd/us-west4-vos1-tracing applicationset=argocd/tracing
2024-06-21 16:30:19.166	time="2024-06-21T23:30:19Z" level=info msg="Deleted application" app=argocd/eastus-vos1-autoscaler applicationset=argocd/autoscaler
2024-06-21 16:30:19.076	time="2024-06-21T23:30:19Z" level=info msg="Deleted application" app=argocd/us-east-1-tos1-rbac applicationset=argocd/rbac
2024-06-21 16:30:19.008	time="2024-06-21T23:30:19Z" level=info msg="Deleted application" app=argocd/us-east-1-dos1-coredns applicationset=argocd/coredns
2024-06-21 16:30:18.961	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-west4-vos1-vault-csi-controller applicationset=argocd/vault-csi-controller
2024-06-21 16:30:18.910	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/eastus-vos1-access applicationset=argocd/access
2024-06-21 16:30:18.861	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-west2-ves3-gce-addons applicationset=argocd/gce-addons
2024-06-21 16:30:18.808	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-west2-ves3-storage applicationset=argocd/storage
2024-06-21 16:30:18.758	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-west2-ves3-tracing applicationset=argocd/tracing
2024-06-21 16:30:18.661	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-east-1-tos1-autoscaler applicationset=argocd/autoscaler
2024-06-21 16:30:18.562	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-west4-vos1-rbac applicationset=argocd/rbac
2024-06-21 16:30:18.509	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-west2-ves3-coredns applicationset=argocd/coredns
2024-06-21 16:30:18.465	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-west1-vos1-vault-csi-controller applicationset=argocd/vault-csi-controller
2024-06-21 16:30:18.410	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-east-1-tos1-access applicationset=argocd/access
2024-06-21 16:30:18.359	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-east-1-dos1-gce-addons applicationset=argocd/gce-addons
2024-06-21 16:30:18.308	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-west1-vos1-storage applicationset=argocd/storage
2024-06-21 16:30:18.262	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/eastus-vos1-tracing applicationset=argocd/tracing
2024-06-21 16:30:18.161	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-west4-vos1-autoscaler applicationset=argocd/autoscaler
2024-06-21 16:30:18.066	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-east-1-dos1-rbac applicationset=argocd/rbac
2024-06-21 16:30:18.012	time="2024-06-21T23:30:18Z" level=info msg="Deleted application" app=argocd/us-east-1-tos1-coredns applicationset=argocd/coredns
2024-06-21 16:30:17.962	time="2024-06-21T23:30:17Z" level=info msg="Deleted application" app=argocd/eastus-vos1-vault-csi-controller applicationset=argocd/vault-csi-controller
2024-06-21 16:30:17.911	time="2024-06-21T23:30:17Z" level=info msg="Deleted application" app=argocd/us-west2-ves3-access applicationset=argocd/access
2024-06-21 16:30:17.857	time="2024-06-21T23:30:17Z" level=info msg="Deleted application" app=argocd/us-east-1-tos1-gce-addons applicationset=argocd/gce-addons
2024-06-21 16:30:17.811	time="2024-06-21T23:30:17Z" level=info msg="Deleted application" app=argocd/-us-east-1-dos1-storage applicationset=argocd/storage
@audrey-mux audrey-mux added the bug Something isn't working label Jun 23, 2024
@crenshaw-dev
Member

I think this is probably another example of this bug: #18212

I think the comment by @todaywasawesome here was prescient: a generator failure might mean that it's time to stop the world.

I'm going to revert #17062 until the author has time to make it safer.

@audrey-mux
Author

Ah, yep, that’s likely it. It’s happening even with cluster generators.

@audrey-mux
Author

audrey-mux commented Jun 24, 2024

Hey @crenshaw-dev

Was wondering if there's an ETA on #18781 getting merged and a new release cut?

@alexmt alexmt added bug/in-triage This issue needs further triage to be correctly classified component:application-sets Bulk application management related type:bug labels Jun 24, 2024
@crenshaw-dev
Member

@audrey-mux I'll cherry-pick the change to 2.11 and 2.12 and plan to cut a release today or tomorrow.
