Helm should delete job after it is successfully finished #1769
I would love to see this, and I will try to get around to submitting a PR for it if I have the time. In the meantime, as a workaround, I tag a random suffix onto the end of the job name.
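The random-suffix workaround mentioned above (and echoed by later commenters in this thread) can be sketched as a Helm template along these lines; the chart name, job name, and image are hypothetical:

```yaml
# Hypothetical hook Job: the key idea is the random suffix, which gives
# each release a uniquely named Job so `helm upgrade` never hits a name
# conflict with an already-completed Job.
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ template "mychart.fullname" . }}-migrate-{{ randAlphaNum 5 | lower }}
  annotations:
    "helm.sh/hook": pre-upgrade
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: myapp:1.0.0
          command: ["./run-migrations.sh"]
```

The trade-off, as later comments point out, is that every deploy leaves behind another completed Job object.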
I'm currently using the workaround from @thomastaylor312, and it is a very bad way to do this. As a result, I've just run into a performance issue: a storm of kill-container / start-container events.
@dzavalkinolx: We should probably bring this up in the Kubernetes issues. This isn't so much a problem with Helm as it is with how Kubernetes Jobs work. Have you also tried setting …?
This is a tricky one. Helm doesn't implicitly manage the lifecycle of things like this. In particular, it never deletes a user-created resource without an explicit request from a user. Now, we recently introduced an annotation called …. Since Tiller does not actively monitor resources once they are deployed, I'm not sure this would be a terribly powerful annotation, but it could work on hooks, because we do watch hooks for lifecycle events.
@dzavalkinolx Have you tried …?
@longseespace I tried that too, but it just kills the job after the set amount of time and spins up a new one.
I ran into this problem today too. I found out about Helm hooks recently and thought they would be perfect for implementing pre-upgrade database migrations. I'm probably missing something, but what's the point of having a job executed in a pre-upgrade / post-upgrade hook if, once the job has been executed the first time, every subsequent upgrade fails with a name conflict?
That's why we recommend appending a random string to the job name if you know for sure you are going to re-run the job.
I see. I'm not sure I like the approach of leaving behind a new completed job for every deployment. Right now what I've done is add a new step to our deployment script that deletes the old job before upgrading.
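The delete-before-upgrade step described above can be sketched as a couple of lines in a deployment script; the job and release names here are hypothetical, and `--ignore-not-found` keeps the very first deploy from failing when no such job exists yet:

```shell
# Delete the completed hook Job (if any) so the upcoming upgrade can
# recreate it without a name conflict; names are hypothetical.
kubectl delete job myapp-migrate --ignore-not-found
helm upgrade myapp ./mychart
```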
It feels like Helm should be managing the full lifecycle of its hooks, as they're documented as the approach to take for executing lifecycle events such as migrations. Someone without a strong understanding of Kubernetes could end up with hundreds of useless jobs.

I almost feel like the ideal solution here would be to have Tiller delete an old job if it hits a name conflict with a hook job it attempts to create. You could verify that the job was created by Helm by ensuring that the correct hook annotation is present. You could also put this behavior behind a command-line flag, such as ….
I am having the exact same challenges with a Python / Django app. Does anyone have a mechanism such that Helm will abort an upgrade if the pre-upgrade job fails?
I may take this once I get some other work done for 2.5. Assigning it to myself for now; if someone else wants to take it before I get to it, let me know.
@thomastaylor312 Have you finished this feature yet? If not, I would like to take this work.
@DoctorZK Feel free to take it. Thank you for offering to do it!
What about a simple annotation, …?
Good suggestion. I have thought of two approaches to solve this problem:

1. Add a simple annotation in the hook templates. This is the easiest to implement, but it does not cover one case: Helm fails during the pre-install/pre-upgrade process, and the user then tries to install/upgrade the release with the same chart again, which causes a resource-name conflict in Kubernetes. So with this approach we would need another annotation as well.
2. Add flags to the install/upgrade/rollback/delete commands (e.g., `upgrade $release_name --include-hooks`). This solves the name-conflict problem, but it would also delete kinds of hooks that users do not intend to delete, such as ConfigMaps and Secrets that are meant to be reused across versions of the same release.

I prefer the first one, which can control hooks at a finer granularity.
@DoctorZK As you suggest, I would also like to have another annotation, …. Are you willing to work on this? If not, I can take care of it.
Thanks for your help. I have finished the coding, and it is now under test. I will submit the pull request as soon as possible.
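For context, the annotation approach discussed above is what eventually shipped in Helm as the `helm.sh/hook-delete-policy` annotation. A minimal sketch of a hook that cleans itself up on success (the chart name, job name, and image are hypothetical):

```yaml
# Hypothetical hook Job: the hook-delete-policy annotation tells Helm to
# delete the hook's resources once the hook has succeeded, so the next
# upgrade can recreate a Job with the same name.
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ template "mychart.fullname" . }}-migrate
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: myapp:1.0.0
```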
For those who want a workaround for this until the final solution is implemented: …
@thomastaylor312 Did this land in 2.7?
Yes, everything currently in master landed in 2.7.
Just to be clear... if I add the annotation, the job should be deleted and run again on subsequent upgrades?
@macropin Is your hook defined as a … hook?
@thomastaylor312 It's defined as ….
It will create a new object (generally a …) each time it runs.
The job has the following annotations: …
The job only runs once, on the first install, and never again on subsequent upgrades. Running Helm v2.7.0. Should I create a separate issue for this?
@macropin Yes. Could you please create another issue with details about your cluster and, if possible, an example chart that reproduces the issue?
* Update job.yaml: without those annotations, `helm upgrade` fails because of helm/helm#1769
* Increase version number
@macropin How did you solve it? I'm facing a similar issue: the job is never created on subsequent upgrades. I'm using the same annotations as yours. Helm version: v2.14.3
@sohel2020 and @macropin: same boat, Helm v3. The job is never re-run on subsequent upgrades.
What is the solution for re-running jobs on Helm v3?
Same problem with Helm version: version.BuildInfo{Version:"v3.1.2", GitCommit:"d878d4d45863e42fd5cff6743294a11d28a9abce", GitTreeState:"clean", GoVersion:"go1.13.8"}. Jobs are neither deleted nor re-run on subsequent upgrade commands.
We regularly hit this one too in Helm 3.1. |
`vol-clean-{{ template "alfresco-identity.fullname" . }}` updated to `vol-clean-{{ template "alfresco-identity.fullname" . }}-{{ randAlphaNum 5 | lower }}` based on helm/helm#1769. Our issue is https://issues.alfresco.com/jira/browse/AAE-3212
In the future, I believe we will also be able to rely on the Job TTL mechanism.
@paologallinaharbur Only partially: for example, if the Job TTL is 5 minutes and the job from the previous commit failed (so it is still there), a new commit within that window will hit the same issue. The two options that have worked for me: …
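The Job TTL mechanism mentioned above is the `ttlSecondsAfterFinished` field on the Job spec, handled by Kubernetes' TTL-after-finished controller. A minimal sketch (names and image are hypothetical):

```yaml
# Hypothetical Job: Kubernetes deletes the Job (and its Pods) this many
# seconds after it finishes. Note the caveat above: a *failed* Job also
# lingers until its TTL expires, so a quick follow-up upgrade can still
# hit a name conflict.
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-migrate
spec:
  ttlSecondsAfterFinished: 300
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: myapp:1.0.0
```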
The best option would be to have a hook policy that is applied before a hook is run, e.g. something like ….
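The policy being asked for here exists in current Helm as the `before-hook-creation` value of the `helm.sh/hook-delete-policy` annotation (it is the default in Helm 3): the previous hook resource is deleted just before a new one is created. A minimal sketch of the relevant metadata fragment (hook events taken from this issue's example):

```yaml
annotations:
  "helm.sh/hook": post-install,post-upgrade,post-rollback
  # Delete the previous hook Job (even a failed one) immediately before
  # creating the new one, avoiding "already exists" upgrade failures.
  "helm.sh/hook-delete-policy": before-hook-creation
```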
@thecrazzymouse Were you able to find a solution?
I have a job bound to

post-install,post-upgrade,post-rollback

When I'm upgrading charts, I get:

Error: UPGRADE FAILED: jobs.batch "opining-quetzal-dns-upsert" already exists

`kubectl get jobs` returns ….

So, how are we supposed to use jobs as hooks if they are not deleted after finishing successfully? There is no way to upgrade a chart that has such a job.