Unable to delete an Application if its target cluster is deleted, Argo CD enters infinite app deletion reconciliation loop #5817

Closed
jgwest opened this issue Mar 19, 2021 · 1 comment · Fixed by #6557
Labels
bug Something isn't working

Comments

@jgwest
Member

jgwest commented Mar 19, 2021

If you delete an application whose cluster no longer exists, the application delete will fail (not great). But, worse, after 3-4 minutes, Argo CD enters an infinite deletion reconciliation loop, flooding the k8s server with requests (see logs below).

There are two problems here:

  1. I don't think we should prevent Applications from being deleted when the targeted cluster doesn't exist anymore. The cluster being deleted seems like a pretty clear signal of user intent.
  2. We probably shouldn't be entering an infinite deletion reconciliation loop 😄 .

This was originally reproduced with the ApplicationSet controller (and is causing issues with the cluster generator), but here is a quick set of steps to reproduce that doesn't require the controller.

Reproduced with Argo CD master branch (as of this writing), but was also reproducible with v1.8.x.

Steps to reproduce:

  1. Create a cluster named 'cluster1' (for example)
  2. Create an application against that cluster
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cluster1-guestbook
  namespace: argocd
spec:
  destination:
    name: cluster1  # must match cluster name above, of course
    namespace: guestbook
  project: default
  source:
    path: guestbook
    repoURL: https://github.com/argoproj/argocd-example-apps.git
    targetRevision: HEAD

(you don't need to sync it)

  3. Ensure that the application appears within Argo CD.

  4. Begin tailing the logs of the application controller, e.g. kubectl logs -f pod/argocd-application-controller-0 -n argocd

  5. Delete that cluster (either delete the cluster within the web UI, or delete the cluster secret)

  6. Delete the application within the web UI or CLI (foreground deletion, though background has the same behaviour)

Within the application controller logs, you will see:

time="2021-03-19T11:49:22Z" level=info msg="Deleting resources" application=cluster1-guestbook
time="2021-03-19T11:49:22Z" level=info msg="Unable to delete application resources: unable to find destination server: there are no clusters with this name: cluster6" application=cluster1-guestbook dest-namespace=guestbook dest-server=

Which isn't great (IMHO we should allow deletion if the cluster doesn't exist), but keep going...

  7. IMPORTANT: wait ~3-4 minutes, and watch the application controller logs.

After about 3-4 minutes, you will see Argo CD get stuck in what appears to be a deletion reconciliation loop. It gets into a state where it keeps trying to process that Application over and over, and since the cluster doesn't exist, it will never succeed.
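
To illustrate the timing, here is a minimal Go sketch of the kind of expiry check suggested by the "comparison expired" message in the raw logs below. It is not Argo CD's actual code: comparisonExpired, at and expiry are hypothetical names, and the only real inputs are the values printed in the logs (reconciledAt 11:47:54, expiry 3m0s, and the two log timestamps).

```go
package main

import (
	"fmt"
	"time"
)

// expiry mirrors the "expiry: 3m0s" value printed in the controller logs.
const expiry = 3 * time.Minute

// at builds a timestamp on 2021-03-19 (UTC), the day of the logs.
// Hypothetical helper, only here to keep the example short.
func at(h, m, s int) time.Time {
	return time.Date(2021, 3, 19, h, m, s, 0, time.UTC)
}

// comparisonExpired reports whether the last reconciliation is older than
// the expiry window. Hypothetical name; an assumption about the check
// behind the "comparison expired" log message.
func comparisonExpired(reconciledAt, now time.Time, expiry time.Duration) bool {
	return now.Sub(reconciledAt) > expiry
}

func main() {
	reconciledAt := at(11, 47, 54) // value shown in every "Refreshing app status" line

	for _, now := range []time.Time{at(11, 50, 25), at(11, 53, 25)} {
		fmt.Printf("%s age=%s expired=%t\n",
			now.Format("15:04:05"),
			now.Sub(reconciledAt),
			comparisonExpired(reconciledAt, now, expiry))
	}
	// prints:
	// 11:50:25 age=2m31s expired=false
	// 11:53:25 age=5m31s expired=true
}
```

reconciledAt stays at 11:47:54 in every log line, so a check like this would presumably keep reporting "expired" on every pass once the 3 minutes have elapsed, which is consistent with the repeated refreshes in the logs.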

IMHO it seems like the correct behaviour is for Argo CD to allow invalid Applications (Applications that point to clusters w/o a corresponding secret) to be deleted. The deletion finalizer and/or reconciler shouldn't prevent this operation.
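
A rough Go sketch of the behaviour proposed above, assuming a finalizer step that first resolves the destination cluster. Everything here is hypothetical and for illustration only (errClusterNotFound, clusterLookup, lookupIn and finalizeDeletion are made-up names, not Argo CD APIs): if the cluster cannot be found, there is nothing left to clean up, so the finalizer is allowed to complete instead of failing and retrying.

```go
package main

import (
	"errors"
	"fmt"
)

// errClusterNotFound is a hypothetical sentinel error meaning that no
// cluster secret exists for the requested destination name.
var errClusterNotFound = errors.New("destination cluster not found")

// clusterLookup resolves a destination cluster name to a server URL.
// Hypothetical type, standing in for whatever the controller really uses.
type clusterLookup func(name string) (string, error)

// lookupIn returns a clusterLookup backed by a simple name -> server map.
func lookupIn(known map[string]string) clusterLookup {
	return func(name string) (string, error) {
		if server, ok := known[name]; ok {
			return server, nil
		}
		return "", fmt.Errorf("%w: %s", errClusterNotFound, name)
	}
}

// finalizeDeletion returns true when the deletion finalizer may be removed.
// A missing cluster is treated as "nothing to clean up" rather than an error.
func finalizeDeletion(dest string, lookup clusterLookup, deleteResources func(server string) error) (bool, error) {
	server, err := lookup(dest)
	if errors.Is(err, errClusterNotFound) {
		// The target cluster is gone, so its resources cannot be deleted;
		// let the Application itself be removed.
		return true, nil
	}
	if err != nil {
		return false, err // any other lookup failure is still retried
	}
	if err := deleteResources(server); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	// Only cluster2 still has a (made-up) cluster entry; cluster1 was deleted.
	lookup := lookupIn(map[string]string{"cluster2": "https://cluster2.example.com"})
	noop := func(string) error { return nil }

	done, err := finalizeDeletion("cluster1", lookup, noop)
	fmt.Println(done, err) // prints: true <nil>
}
```

Only the "not found" case is short-circuited here; other errors (for example a cluster that is temporarily unreachable) still block deletion, which seems like the safer default.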

Raw logs:

# Initial deletion error
time="2021-03-19T11:49:22Z" level=info msg="Deleting resources" application=cluster1-guestbook
time="2021-03-19T11:49:22Z" level=info msg="Unable to delete application resources: unable to find destination server: there are no clusters with this name: cluster6" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=StatusRefreshed type=Warning
time="2021-03-19T11:49:22Z" level=info msg="Deleting resources" application=cluster1-guestbook
time="2021-03-19T11:49:22Z" level=info msg="Unable to delete application resources: unable to find destination server: there are no clusters with this name: cluster6" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=StatusRefreshed type=Warning


# A minute later, still OK
time="2021-03-19T11:50:25Z" level=info msg="Deleting resources" application=cluster1-guestbook
time="2021-03-19T11:50:25Z" level=info msg="Unable to delete application resources: unable to find destination server: there are no clusters with this name: cluster6" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=StatusRefreshed type=Warning


# We enter the infinite loop
time="2021-03-19T11:53:25Z" level=info msg="Deleting resources" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Refreshing app status (comparison expired. reconciledAt: 2021-03-19 11:47:54 +0000 UTC, expiry: 3m0s), level (2)" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Updated sync status: OutOfSync -> Unknown" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=ResourceUpdated type=Normal
time="2021-03-19T11:53:25Z" level=info msg="Unable to delete application resources: unable to find destination server: there are no clusters with this name: cluster6" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=StatusRefreshed type=Warning
time="2021-03-19T11:53:25Z" level=info msg="Updated health status: Missing -> Unknown" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=ResourceUpdated type=Normal
time="2021-03-19T11:53:25Z" level=info msg="Update successful" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Reconciliation completed" application=cluster1-guestbook dest-name=cluster6 dest-namespace=guestbook dest-server= fields.level=2 time_ms=69
time="2021-03-19T11:53:25Z" level=info msg="Deleting resources" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Refreshing app status (comparison expired. reconciledAt: 2021-03-19 11:47:54 +0000 UTC, expiry: 3m0s), level (2)" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Unable to delete application resources: unable to find destination server: there are no clusters with this name: cluster6" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=StatusRefreshed type=Warning
time="2021-03-19T11:53:25Z" level=info msg="Update successful" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Reconciliation completed" application=cluster1-guestbook dest-name=cluster6 dest-namespace=guestbook dest-server= fields.level=2 time_ms=21
time="2021-03-19T11:53:25Z" level=info msg="Refreshing app status (comparison expired. reconciledAt: 2021-03-19 11:47:54 +0000 UTC, expiry: 3m0s), level (2)" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Update successful" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Reconciliation completed" application=cluster1-guestbook dest-name=cluster6 dest-namespace=guestbook dest-server= fields.level=2 time_ms=19
time="2021-03-19T11:53:25Z" level=info msg="Deleting resources" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Refreshing app status (comparison expired. reconciledAt: 2021-03-19 11:47:54 +0000 UTC, expiry: 3m0s), level (2)" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Unable to delete application resources: unable to find destination server: there are no clusters with this name: cluster6" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=StatusRefreshed type=Warning
time="2021-03-19T11:53:25Z" level=info msg="Update successful" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Reconciliation completed" application=cluster1-guestbook dest-name=cluster6 dest-namespace=guestbook dest-server= fields.level=2 time_ms=19
time="2021-03-19T11:53:25Z" level=info msg="Refreshing app status (comparison expired. reconciledAt: 2021-03-19 11:47:54 +0000 UTC, expiry: 3m0s), level (2)" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Update successful" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Reconciliation completed" application=cluster1-guestbook dest-name=cluster6 dest-namespace=guestbook dest-server= fields.level=2 time_ms=25
time="2021-03-19T11:53:25Z" level=info msg="Deleting resources" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Refreshing app status (comparison expired. reconciledAt: 2021-03-19 11:47:54 +0000 UTC, expiry: 3m0s), level (2)" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Unable to delete application resources: unable to find destination server: there are no clusters with this name: cluster6" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=StatusRefreshed type=Warning
time="2021-03-19T11:53:25Z" level=info msg="Update successful" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Reconciliation completed" application=cluster1-guestbook dest-name=cluster6 dest-namespace=guestbook dest-server= fields.level=2 time_ms=24
time="2021-03-19T11:53:25Z" level=info msg="Refreshing app status (comparison expired. reconciledAt: 2021-03-19 11:47:54 +0000 UTC, expiry: 3m0s), level (2)" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Update successful" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Reconciliation completed" application=cluster1-guestbook dest-name=cluster6 dest-namespace=guestbook dest-server= fields.level=2 time_ms=16
time="2021-03-19T11:53:25Z" level=info msg="Deleting resources" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Refreshing app status (comparison expired. reconciledAt: 2021-03-19 11:47:54 +0000 UTC, expiry: 3m0s), level (2)" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Unable to delete application resources: unable to find destination server: there are no clusters with this name: cluster6" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=StatusRefreshed type=Warning
time="2021-03-19T11:53:25Z" level=info msg="Update successful" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Reconciliation completed" application=cluster1-guestbook dest-name=cluster6 dest-namespace=guestbook dest-server= fields.level=2 time_ms=21
time="2021-03-19T11:53:25Z" level=info msg="Refreshing app status (comparison expired. reconciledAt: 2021-03-19 11:47:54 +0000 UTC, expiry: 3m0s), level (2)" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Deleting resources" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Update successful" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Reconciliation completed" application=cluster1-guestbook dest-name=cluster6 dest-namespace=guestbook dest-server= fields.level=2 time_ms=20
time="2021-03-19T11:53:25Z" level=info msg="Refreshing app status (comparison expired. reconciledAt: 2021-03-19 11:47:54 +0000 UTC, expiry: 3m0s), level (2)" application=cluster1-guestbook
time="2021-03-19T11:53:25Z" level=info msg="Unable to delete application resources: unable to find destination server: there are no clusters with this name: cluster6" application=cluster1-guestbook dest-namespace=guestbook dest-server= reason=StatusRefreshed type=Warning
(... this keeps going ...)
@jgwest added the bug label Mar 19, 2021
@jgwest changed the title from "Unable to delete Application if its cluster is deleted, Argo CD enters infinite app deletion reconciliation loop" to "Unable to delete an Application if its target cluster is deleted, Argo CD enters infinite app deletion reconciliation loop" Mar 19, 2021
@sbose78
Contributor

sbose78 commented Mar 26, 2021

This should be tracked for the next patch release of 2.0, that is, 2.0.1.
