Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support preventing hook deletion when Application sync is terminated #3502

Closed
chancez opened this issue Apr 28, 2020 · 14 comments
Closed

Support preventing hook deletion when Application sync is terminated #3502

chancez opened this issue Apr 28, 2020 · 14 comments
Labels
component:core Syncing, diffing, cluster state cache enhancement New feature or request type:usability Enhancement of an existing feature
Milestone

Comments

@chancez
Copy link
Contributor

chancez commented Apr 28, 2020

#2678 Summary

ArgoCD converts helm hooks into argocd hooks, which get deleted if an Application sync is canceled. This can cause problems if your using hooks that require a namespace to be created, as you must also set your namespace to be created as a hook, such that the namespace is created prior to your job hooks, otherwise they fail to be created because the namespace doesn't exist yet. This causes an issue though when you cancel a running sync, as it will cause ArgoCD to delete the namespace, which isn't usually a desired behavior.

Create a chart with a template for creating the namespace as a helm hook. If you cancel a running sync the entire namespace will be deleted.

apiVersion: v1
kind: Namespace
metadata:
  name: {{ .Release.Namespace }}
  annotations:
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "-10"
    "helm.sh/resource-policy": keep

Motivation

Initially, I only added the helm hook annotations to my namespace because I have other hooks which run in a namespace (jobs), and argocd failed to deploy my Application without these annotations on my namespace because the jobs would be ordered prior to my namespace being created without it (the jobs run in tje preSync phase, while the namespace is created in the sync phase).

Proposal

I see one of two options:

  • Allow configuring some resources, or all of them to not be deleted if a sync is terminated. This could potentially be done by adding an option to argocd.argoproj.io/sync-options to prevent deleting running hooks in the general case. For helm based apps, it could use "helm.sh/resource-policy": keep as a synonym for the new sync-option to prevent deleting of running hooks.
  • Allow configuring how a helm based Applications's helm hooks are converted into argocd hooks/sync-waves. I believe is many cases that you could instead convert helm hooks into sync waves instead of sync hooks, and that would also prevent the issue as sync-waves only change the ordering, and resources aren't deleted if a sync is terminated unless they're in the sync hooks phase.

I also think it's quite likely that it might make sense to always order namespaces before preSync, sync, and postSync by default. That may also help, but if people are using hooks already, it won't help prevent the deletion issue, only the ordering issue.

@jessesuen we discussed this a bit in slack https://argoproj.slack.com/archives/CASHNF6MS/p1588095718030500

@chancez chancez added the enhancement New feature or request label Apr 28, 2020
@jessesuen
Copy link
Member

I don't think we want to encourage a pattern where normal application resources are made into a hook, simply to achieve ordering.

Even in Helm, a resource which has the helm.sh/hook annotation is not considered part of the helm release, and has the potential to be GC'ed automatically in the future.

https://helm.sh/docs/topics/charts_hooks/#hook-resources-are-not-managed-with-corresponding-releases

The resources that a hook creates are currently not tracked or managed as part of the release. Once Helm verifies that the hook has reached its ready state, it will leave the hook resource alone. Garbage collection of hook resources when the corresponding release is deleted may be added to Helm 3 in the future, so any hook resources that must never be deleted should be annotated with helm.sh/resource-policy: keep.

Similarly, Argo CD hooks are not considered part of Argo CD application resources, and features like drift detection and health assessments do not work. So the solution should involve not having to resort to adding helm.sh/hook": pre-install to a Namespace resource.

My current thought is that we specially treat Namespaces as something that needs to be installed before sync hooks.

@jessesuen
Copy link
Member

As a reference point, it's generally considered an anti-pattern in Helm to include Namspace resource as part of a chart. The decision in issue helm/helm#6794, introduced an option to create namespaces on-the-fly as part of helm install so that Chart authors could continue to write namespace-agnostic charts.

However, in the case of GitOps, "packaging" namespace-agnostic charts is less of a requirement, and Namespaces are often declared in git, adjacent to the application resources, and acceptable/expected to be created as part of the deploy. The problem is that it does not allow PreSync hooks to run, which generally need to run in the namespace of the Application.

This lends itself to the argument for Argo CD to have special treatment of Namespace resources, as being created at the beginning of a deploy before hooks. This makes the behavior more consistent with the --create-namespace flag of helm install, which happens before any hooks/resources are run/created.

@alexmt
Copy link
Collaborator

alexmt commented Apr 28, 2020

#2737 describes the related issue. Helm deletes hooks after all hooks after done. This allows having hooks with dependencies e.g. Job + ServiceAccount

@jessesuen
Copy link
Member

Initially, I only added the helm hook annotations to my namespace because I have other hooks which run in a namespace (jobs)

@chancez i'm not sure if this is an option for you, but a workaround is that instead of PreSync hooks, hooks could be converted to be a Sync hooks. Sync hooks run in the same phase that application resources are deployed, at which point waves could be used to interweave the ordering of hooks vs. application resources. For example, you could do something like:

  1. Namespace (argocd.argoproj.io/sync-wave: "-2")
  2. Job (argocd.argoproj.io/hook: Sync, argocd.argoproj.io/sync-wave: "-1")
  3. All other resources

@chancez
Copy link
Contributor Author

chancez commented Apr 28, 2020

I don't think we want to encourage a pattern where normal application resources are made into a hook, simply to achieve ordering.

I completely agree. I dislike it a lot, but I'm consuming a chart I didn't write (jupyterhub), which makes it difficult to adapt without diverging from upstream. That said, I've considering using a variation of kustomize & helm to better support this, but it's simply not something I'm ready to commit to just yet. I've also just disabled the hook it was creating as I didn't need it anymore for other unrelated reasons, so I was able to remove the helm hook annotations on my namespace and it still succeeds.

My current thought is that we specially treat Namespaces as something that needs to be installed before sync hooks.

I think this makes sense.

I also think it might make sense to simply have the ability to create a namespace if it doesn't exist from the ArgoCD application configuration in the spec.destination field. Eg: spec.destination.createNamespace=true creates spec.destintation.namespace if it doesn't already exist and isn't managed by the application's chart/kustomize/directory. That said, I don't mind defining namespaces as YAML, so I don't think that's entirely necessary, and namespaces created in the application's manifests will still need the special order handling.

a workaround is that instead of PreSync hooks, hooks could be converted to be a Sync hooks

@jessesuen I didn't write the chart, so I can't change the annotations on the resources being created without forking it or going down a custom config management plugin route (which I'm still considering).

I wish there was a better way to support modifying off-the-shelf helm charts without having to fork them or use a custom config management plugin, but I'm not sure how much could be done about that.

Having first class support for json patches via argocd could be a potential approach, or even potentially reusing kustomize transformers for it. Ideally it would be possible though even if you're not using kustomize.

@amkartashov
Copy link

@chancez
There is a solution for creating namespace before hooks: use app-of-apps approach. Personally, I don't like it, because in my case it will create chain of objects just to deploy a helm chart:

app1: contains ns and app2 -> app2: contains values-app.yaml and requirements.yaml with helm chart as a dep -> helm package in external helm repo.

What I want is a single app which has only values.yaml for a helm package w/o extra indentation and reference to this helm package, which will create ns automatically. Because in my case gitops approach means just keeping helm chart version and helm values. See also #2789

@jdfalk
Copy link
Contributor

jdfalk commented Apr 30, 2020

This doesn't just affect namespace creation, the prometheus operator creates a service account, role, and role binding to run an application then removes them. However, argocd just purges them immediately after they are created. I couldn't care less about namespaces, but for other things it just has to match the helm behavior.

@chancez
Copy link
Contributor Author

chancez commented Apr 30, 2020

@jdfalk I recall hitting that as well, I can't even remember how I got around it because I eventually just moved to using kustomize for prometheus-operator and at that point I had everything as raw YAML files so I could just tweak the hook annotations or remove them. Good point bringing that up.

@jannfis jannfis added component:core Syncing, diffing, cluster state cache type:usability Enhancement of an existing feature labels May 14, 2020
@jdfalk
Copy link
Contributor

jdfalk commented Jun 11, 2020

This needs to be resolved, it's causing more and more issues for us and pretty soon we are going to have to switch to flux because this isn't production ready.

@alexmt alexmt added this to the v1.7 milestone Jun 16, 2020
@alexmt
Copy link
Collaborator

alexmt commented Aug 14, 2020

The namespace auto-creation feature should improve the situation. Argo CD can automatically create namespace before start executing hooks.

This does not seem to be enough, since namespace auto-creation does not allow to configure namespace details such as annotation and labels. We've discussed several ways to address it:

  • introduce 'don't delete' hook deletion policy.
  • introduce the way to create the resource in pre-sync phase with hooks without converting it to hook

First approach looks more like a hack. The main difference between hooks and resources is that hooks are ephemeral and supposed to be created/delete on the fly during the syncing process. The namespace should not be marked as a hook just because it should be created before hooks. Second looks better since it is solving the root cause limitation - ability to create namespace before everything else, but kind of compete with waves feature. We want to try to find a way to avoid a new annotation.

So the final proposal is to treat Namespace and CRD as exceptions and execute both before any other resources that reference Namespace/CRD.

The way it would be implemented opens the door to introduce user specified dependencies between resources. I'm hoping we won't need it to avoid make synchronization more difficult.

@jessesuen
Copy link
Member

the prometheus operator creates a service account, role, and role binding to run an application then removes them. However, argocd just purges them immediately after they are created.

Two changes to Argo CD are being made:

  1. In v1.7, hook deletion now happens at the end of a sync operation
  2. In v1.8, Namespaces and CRDs are being created at the beginning of the sync operation, so that PreSync resources can live in the namespace or be of the CRD Kind.

@igorcezar
Copy link

the prometheus operator creates a service account, role, and role binding to run an application then removes them. However, argocd just purges them immediately after they are created.

Two changes to Argo CD are being made:

1. In v1.7, hook deletion now happens at the end of a sync operation

2. In v1.8, Namespaces and CRDs are being created at the beginning of the sync operation, so that PreSync resources can live in the namespace or be of the CRD Kind.

For now there is no version of ArgoCD that allows the creation of CRD using PreSync, is that right?

At least in v1.5 when CRD has PreSync or helm hook, CRD is created in the cluster but the release keeps syncing forever, never creating the others resources.

The enhancement in v1.8 (#4354) will allow CRD to be installed even if there is some hook defined on it? Or it will be needed some feature like #4331?

@jessesuen
Copy link
Member

For now there is no version of ArgoCD that allows the creation of CRD using PreSync, is that right?

correct

The enhancement in v1.8 (#4354) will allow CRD to be installed even if there is some hook defined on it? Or it will be needed some feature like #4331?

For CRDs (i'm not talking about CR instances), we ignore hooks completely and they are installed up front.

These features have been implemented in v1.8. Closing

@ghost
Copy link

ghost commented May 11, 2021

Just to validate that I read this correctly, the fixes made so far do not provide a fix voor de 'delete prevention' but only provide a solution to sync the resource at the right time in the sync. Am I correct? @jessesuen @igorcezar @alexandrfox

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:core Syncing, diffing, cluster state cache enhancement New feature or request type:usability Enhancement of an existing feature
Projects
None yet
Development

No branches or pull requests

7 participants