Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Helm hook does not wait for hook to complete before proceeding with upgrade #6874

Closed
vmalloc opened this issue Nov 3, 2019 · 5 comments
Labels
Milestone

Comments

@vmalloc
Copy link

@vmalloc vmalloc commented Nov 3, 2019

I have a chart with a pre-upgrade hook that is supposed to run database migrations. This is more or less how it is written at the moment:

apiVersion: batch/v1
kind: Job
metadata:
  name: "{{.Release.Name}}-migrate"
  labels:
    app.kubernetes.io/managed-by: {{.Release.Service | quote }}
    app.kubernetes.io/instance: {{.Release.Name | quote }}
    app.kubernetes.io/version: {{ .Chart.AppVersion | quote}}
    helm.sh/chart: "{{.Chart.Name}}-{{.Chart.Version}}"
  annotations:
    "helm.sh/hook": "pre-upgrade,pre-install"
    "helm.sh/hook-delete-policy": "before-hook-creation"
spec:
  activeDeadlineSeconds: 300
  template:
    metadata:
      name: "{{.Release.Name}}"
      labels:
        app.kubernetes.io/managed-by: {{.Release.Service | quote }}
        app.kubernetes.io/instance: {{.Release.Name | quote }}
        helm.sh/chart: "{{.Chart.Name}}-{{.Chart.Version}}"
    spec:
      restartPolicy: Never
      containers:
      - name: migrate-db
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: Always
        command: ["/bin/migrate-db"]
        env:
          ...

The migrate-db tool might take a while to run, so I expected the hook to pause the upgrade process until the upgrade is complete. However it seems that this is not the case. Even when I replace the command for the migration script with ["/bin/sleep", "60"], the upgrade proceeds and updates all resources, causing backends to boot up with an incomplete migration and ending up as failed...

Output of helm version:

lient: &version.Version{SemVer:"v2.15.2", GitCommit:"8dce272473e5f2a7bf58ce79bb5c3691db54c96b", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.15.2", GitCommit:"8dce272473e5f2a7bf58ce79bb5c3691db54c96b", GitTreeState:"clean"}

Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T23:42:50Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.10-eks-5ac0f1", GitCommit:"5ac0f1d9ab2c254ea2b0ce3534fd72932094c6e1", GitTreeState:"clean", BuildDate:"2019-08-20T22:39:46Z", GoVersion:"go1.11.13", Compiler:"gc", Platform:"linux/amd64"}

Cloud Provider/Platform (AKS, GKE, Minikube etc.): EKS (AWS)

@lkusnadi

This comment has been minimized.

Copy link

@lkusnadi lkusnadi commented Nov 4, 2019

I can confirm this. Upgraded Tiller and client to 2.15.2. Hooks with pre-install, pre-upgrade are no longer blocking. Hooks and Deployment are happening at the same time.
I have to roll back to version 2.14.1 to get the blocking behavior of Hooks on pre-install and pre-upgrade.

@bacongobbler bacongobbler added this to the 2.16.1 milestone Nov 6, 2019
@tmirks

This comment has been minimized.

Copy link

@tmirks tmirks commented Nov 7, 2019

This may be a duplicate / same root cause as #6832.

@thomastaylor312

This comment has been minimized.

Copy link
Collaborator

@thomastaylor312 thomastaylor312 commented Nov 7, 2019

So many of these reported hook issues may all be resolved by a recently merged fix #6897. Can anyone give the dev-2 branch a try and see if it fixes it for them?

@bacongobbler

This comment has been minimized.

Copy link
Member

@bacongobbler bacongobbler commented Nov 7, 2019

While I don't have a chart which I can use to verify the behaviour locally, I'm fairly certain this has been fixed in #6897. That patch fixes an issue with the kube client unable to wait for Jobs to run to completion as seen in #6894. Kubernetes 1.16 introduced a breaking change to the legacy schema so we had to update the code to reflect that change.

I'm going to close this as tentatively resolved via #6897, which will be in Helm 2.16.1. Please feel free to report your findings once 2.16.1 has been released with the likely fix mentioned above and we'll re-open if this is still an issue in 2.16.1+. I'm interested to hear your findings!

@thomastaylor312

This comment has been minimized.

Copy link
Collaborator

@thomastaylor312 thomastaylor312 commented Nov 7, 2019

I am pretty certain that this should be solved by #6907. Tiller wasn't waiting for certain jobs (see the explanation in the linked PR) due to a bug. Going to leave this closed for now as the PR is now open. If that PR does not resolve the issue, please let us know

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.