
helm upgrade --install istio-init fails when upgrading from 1.1.0 to 1.1.1 #12855

Closed
JayG-EBSCO opened this issue Mar 27, 2019 · 28 comments · Fixed by #15677

@JayG-EBSCO

commented Mar 27, 2019

Describe the bug
I've been testing upgrades via Helm from 1.1.0 to 1.1.1. Running helm upgrade --install istio-init returns an error when applying the upgrade over an existing 1.1.0 deployment of the CRDs. It only succeeds if I add the --force flag.

The following error text is received:

Istalled Istio version is: istio-1.1.0
Istio version is different: Upgrading...
Error: UPGRADE FAILED: Job.batch "istio-init-crd-10" is invalid: spec.template: Invalid value: core.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:"", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"controller-uid":"a5c14ec7-50cc-11e9-9528-124033095ef8", "job-name":"istio-init-crd-10"}, Annotations:map[string]string{"sidecar.istio.io/inject":"false"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:core.PodSpec{Volumes:[]core.Volume{core.Volume{Name:"crd-10", VolumeSource:core.VolumeSource{HostPath:(*core.HostPathVolumeSource)(nil), EmptyDir:(*core.EmptyDirVolumeSource)(nil), GCEPersistentDisk:(*core.GCEPersistentDiskVolumeSource)(nil), AWSElasticBlockStore:(*core.AWSElasticBlockStoreVolumeSource)(nil), GitRepo:(*core.GitRepoVolumeSource)(nil), Secret:(*core.SecretVolumeSource)(nil), NFS:(*core.NFSVolumeSource)(nil), ISCSI:(*core.ISCSIVolumeSource)(nil), Glusterfs:(*core.GlusterfsVolumeSource)(nil), PersistentVolumeClaim:(*core.PersistentVolumeClaimVolumeSource)(nil), RBD:(*core.RBDVolumeSource)(nil), Quobyte:(*core.QuobyteVolumeSource)(nil), FlexVolume:(*core.FlexVolumeSource)(nil), Cinder:(*core.CinderVolumeSource)(nil), CephFS:(*core.CephFSVolumeSource)(nil), Flocker:(*core.FlockerVolumeSource)(nil), DownwardAPI:(*core.DownwardAPIVolumeSource)(nil), FC:(*core.FCVolumeSource)(nil), AzureFile:(*core.AzureFileVolumeSource)(nil), ConfigMap:(*core.ConfigMapVolumeSource)(0xc42c3513c0), VsphereVolume:(*core.VsphereVirtualDiskVolumeSource)(nil), AzureDisk:(*core.AzureDiskVolumeSource)(nil), PhotonPersistentDisk:(*core.PhotonPersistentDiskVolumeSource)(nil), Projected:(*core.ProjectedVolumeSource)(nil), PortworxVolume:(*core.PortworxVolumeSource)(nil), ScaleIO:(*core.ScaleIOVolumeSource)(nil), StorageOS:(*core.StorageOSVolumeSource)(nil)}}}, InitContainers:[]core.Container(nil), Containers:[]core.Container{core.Container{Name:"istio-init-crd-10", Image:"docker.io/istio/kubectl:1.1.1", Command:[]string{"kubectl", "apply", "-f", "/etc/istio/crd-10/crd-10.yaml"}, Args:[]string(nil), WorkingDir:"", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource(nil), Env:[]core.EnvVar(nil), Resources:core.ResourceRequirements{Limits:core.ResourceList(nil), Requests:core.ResourceList(nil)}, VolumeMounts:[]core.VolumeMount{core.VolumeMount{Name:"crd-10", ReadOnly:true, MountPath:"/etc/istio/crd-10", SubPath:"", MountPropagation:(*core.MountPropagationMode)(nil)}}, VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:"/dev/termination-log", TerminationMessagePolicy:"File", ImagePullPolicy:"IfNotPresent", SecurityContext:(*core.SecurityContext)(nil), Stdin:false, StdinOnce:false, TTY:false}}, RestartPolicy:"OnFailure", TerminationGracePeriodSeconds:(*int64)(0xc42e246848), ActiveDeadlineSeconds:(*int64)(nil), DNSPolicy:"ClusterFirst", NodeSelector:map[string]string(nil), ServiceAccountName:"istio-init-service-account", AutomountServiceAccountToken:(*bool)(nil), NodeName:"", SecurityContext:(*core.PodSecurityContext)(0xc4356f0fc0), ImagePullSecrets:[]core.LocalObjectReference(nil), Hostname:"", Subdomain:"", Affinity:(*core.Affinity)(nil), SchedulerName:"default-scheduler", 
Tolerations:[]core.Toleration(nil), HostAliases:[]core.HostAlias(nil), PriorityClassName:"", Priority:(*int32)(nil), DNSConfig:(*core.PodDNSConfig)(nil), ReadinessGates:[]core.PodReadinessGate(nil)}}: field is immutable && Job.batch "istio-init-crd-11" is invalid: spec.template: Invalid value: core.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:"", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"controller-uid":"a5c24bb0-50cc-11e9-9528-124033095ef8", "job-name":"istio-init-crd-11"}, Annotations:map[string]string{"sidecar.istio.io/inject":"false"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:core.PodSpec{Volumes:[]core.Volume{core.Volume{Name:"crd-11", VolumeSource:core.VolumeSource{HostPath:(*core.HostPathVolumeSource)(nil), EmptyDir:(*core.EmptyDirVolumeSource)(nil), GCEPersistentDisk:(*core.GCEPersistentDiskVolumeSource)(nil), AWSElasticBlockStore:(*core.AWSElasticBlockStoreVolumeSource)(nil), GitRepo:(*core.GitRepoVolumeSource)(nil), Secret:(*core.SecretVolumeSource)(nil), NFS:(*core.NFSVolumeSource)(nil), ISCSI:(*core.ISCSIVolumeSource)(nil), Glusterfs:(*core.GlusterfsVolumeSource)(nil), PersistentVolumeClaim:(*core.PersistentVolumeClaimVolumeSource)(nil), RBD:(*core.RBDVolumeSource)(nil), Quobyte:(*core.QuobyteVolumeSource)(nil), FlexVolume:(*core.FlexVolumeSource)(nil), Cinder:(*core.CinderVolumeSource)(nil), CephFS:(*core.CephFSVolumeSource)(nil), Flocker:(*core.FlockerVolumeSource)(nil), DownwardAPI:(*core.DownwardAPIVolumeSource)(nil), FC:(*core.FCVolumeSource)(nil), AzureFile:(*core.AzureFileVolumeSource)(nil), ConfigMap:(*core.ConfigMapVolumeSource)(0xc430ee2740), VsphereVolume:(*core.VsphereVirtualDiskVolumeSource)(nil), AzureDisk:(*core.AzureDiskVolumeSource)(nil), PhotonPersistentDisk:(*core.PhotonPersistentDiskVolumeSource)(nil), Projected:(*core.ProjectedVolumeSource)(nil), PortworxVolume:(*core.PortworxVolumeSource)(nil), ScaleIO:(*core.ScaleIOVolumeSource)(nil), StorageOS:(*core.StorageOSVolumeSource)(nil)}}}, InitContainers:[]core.Container(nil), Containers:[]core.Container{core.Container{Name:"istio-init-crd-11", Image:"docker.io/istio/kubectl:1.1.1", Command:[]string{"kubectl", "apply", "-f", "/etc/istio/crd-11/crd-11.yaml"}, Args:[]string(nil), WorkingDir:"", Ports:[]core.ContainerPort(nil), EnvFrom:[]core.EnvFromSource(nil), Env:[]core.EnvVar(nil), Resources:core.ResourceRequirements{Limits:core.ResourceList(nil), Requests:core.ResourceList(nil)}, VolumeMounts:[]core.VolumeMount{core.VolumeMount{Name:"crd-11", ReadOnly:true, MountPath:"/etc/istio/crd-11", SubPath:"", MountPropagation:(*core.MountPropagationMode)(nil)}}, VolumeDevices:[]core.VolumeDevice(nil), LivenessProbe:(*core.Probe)(nil), ReadinessProbe:(*core.Probe)(nil), Lifecycle:(*core.Lifecycle)(nil), TerminationMessagePath:"/dev/termination-log", TerminationMessagePolicy:"File", ImagePullPolicy:"IfNotPresent", SecurityContext:(*core.SecurityContext)(nil), Stdin:false, StdinOnce:false, TTY:false}}, RestartPolicy:"OnFailure", TerminationGracePeriodSeconds:(*int64)(0xc43b148058), ActiveDeadlineSeconds:(*int64)(nil), DNSPolicy:"ClusterFirst", NodeSelector:map[string]string(nil), ServiceAccountName:"istio-init-service-account", AutomountServiceAccountToken:(*bool)(nil), NodeName:"", 
SecurityContext:(*core.PodSecurityContext)(0xc4388a1650), ImagePullSecrets:[]core.LocalObjectReference(nil), Hostname:"", Subdomain:"", Affinity:(*core.Affinity)(nil), SchedulerName:"default-scheduler", Tolerations:[]core.Toleration(nil), HostAliases:[]core.HostAlias(nil), PriorityClassName:"", Priority:(*int32)(nil), DNSConfig:(*core.PodDNSConfig)(nil), ReadinessGates:[]core.PodReadinessGate(nil)}}: field is immutable

Expected behavior
Expected the CRD deployment of istio-init to be upgraded to 1.1.1, since the same process successfully upgrades Istio itself.

Steps to reproduce the bug

  1. Download the package for 1.1.0
  2. Install 1.1.0 via: helm upgrade --install istio-init install/kubernetes/helm/istio-init --namespace istio-system
  3. Download the package for 1.1.1
  4. Repeat the same command to try and upgrade: helm upgrade --install istio-init install/kubernetes/helm/istio-init --namespace istio-system

The error shown above is returned.
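Condensed into shell form, the reproduction is the same command run from each release package:

# From the extracted 1.1.0 release package:
helm upgrade --install istio-init install/kubernetes/helm/istio-init --namespace istio-system

# From the extracted 1.1.1 release package:
helm upgrade --install istio-init install/kubernetes/helm/istio-init --namespace istio-system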

Version
Kubernetes 1.11.5 on AWS EKS

Installation
The failure is reproducible both from the terminal on macOS and via Python automation deployed by Jenkins.

Environment
AWS EKS

@JayG-EBSCO

Author

commented Mar 28, 2019

Probably worth noting: this has been the case using Helm 2.10.

@joberdick


commented Apr 3, 2019

Same issue here. The only difference in my setup from @JayG-EBSCO's is that I am using Helm 2.11 and coming from istio-1.1.0-snapshot.5 to 1.1.1.

@Morriz


commented Apr 5, 2019

I believe this has to do with default values of null, like in https://github.com/istio/istio/blob/master/install/kubernetes/helm/istio/charts/grafana/values.yaml, which are not allowed to be overridden with a map type...
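For illustration only (someSetting is a hypothetical key, not taken from the Istio charts), the pattern being described is a chart default of null that a later override tries to replace with a map:

# values.yaml shipped with the chart (default is null):
someSetting: null

# A user override supplying a map for the same key,
# which is the kind of override this comment says Helm rejects:
someSetting:
  enabled: true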

@meganwalker-ibm


commented Apr 9, 2019

This is also the case for Istio 1.1.1 -> 1.1.2 using Helm 2.9.1.

@mhotan


commented Apr 10, 2019

I was able to reproduce this as well, with Helm 2.13, upgrading 1.1.0 -> 1.1.1 or 1.1.2.

@meganwalker-ibm


commented Apr 10, 2019

Having done some more testing, we can share the following results for this issue (Helm 2.9.1; going from Istio 1.1.1 to 1.1.2):

  • helm upgrade --install istio-init -> fails with the error in the OP
  • helm template install/kubernetes/helm/istio --name istio --namespace istio-system | kubectl apply -f - -> also fails with the error in the OP
  • for i in install/kubernetes/helm/istio-init/files/crd*yaml; do kubectl apply -f $i; done -> appears to work. At least it doesn't blow up, and given the diff between 1.1.1 and 1.1.2 I suspect none of the CRDs actually changed. Naturally this means Helm is out of the loop on any upgrades done this way.

@linsun

Member

commented Apr 10, 2019

@morvencao could you check this out when you have some time?

@morvencao

Member

commented Apr 11, 2019

Unlike Deployments, DaemonSets, etc., we need to add --force to recreate the CRD creation Jobs during upgrade, so that all the new CRDs can be applied to the API server.

@hobbytp


commented Apr 18, 2019

I hit the same problem when trying to upgrade between two istio-init releases of the same version (1.1.3), with only a small change to the second release's values.yaml file.

@morvencao

Member

commented Apr 18, 2019

Currently there is no better solution than adding the --force parameter to force the CRD creation Jobs to run again.
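Concretely, that is the command from the reproduction steps with --force appended:

helm upgrade --install istio-init install/kubernetes/helm/istio-init \
  --namespace istio-system --force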

@hobbytp


commented Apr 20, 2019

Can we use a post-install hook with a hook-delete policy to delete the Job when it finishes, and then re-create the Job to run again on upgrade?

@morvencao

Member

commented Apr 20, 2019

@hobbytp I'm afraid not.
We will not introduce more Helm hooks: Istio was disrupted by Helm hooks in the past, and we want Istio to use Helm as a templating engine only, free from Tiller.

@Pluies


commented May 7, 2019

@morvencao It's damned if you do, damned if you don't, because as this issue shows, Istio is currently disrupted by not having Helm hooks :(

I understand not wanting to have hooks in the main chart, but if this can help other people, this is how we currently patch the istio-init chart to get around this issue:

diff --git a/helm/istio-init/templates/job-crd-10.yaml b/helm/istio-init/templates/job-crd-10.yaml
index 87d6469..0bc0a9d 100644
--- a/helm/istio-init/templates/job-crd-10.yaml
+++ b/helm/istio-init/templates/job-crd-10.yaml
@@ -3,6 +3,9 @@ kind: Job
 metadata:
   namespace: {{ .Release.Namespace }}
   name: istio-init-crd-10
+  annotations:
+    "helm.sh/hook": "pre-rollback, pre-upgrade, pre-install"
+    "helm.sh/hook-delete-policy": "before-hook-creation"
 spec:
   template:
     metadata:
diff --git a/helm/istio-init/templates/job-crd-11.yaml b/helm/istio-init/templates/job-crd-11.yaml
index 0f3a4b8..a97326e 100644
--- a/helm/istio-init/templates/job-crd-11.yaml
+++ b/helm/istio-init/templates/job-crd-11.yaml
@@ -3,6 +3,9 @@ kind: Job
 metadata:
   namespace: {{ .Release.Namespace }}
   name: istio-init-crd-11
+  annotations:
+    "helm.sh/hook": "pre-rollback, pre-upgrade, pre-install"
+    "helm.sh/hook-delete-policy": "before-hook-creation"
 spec:
   template:
     metadata:
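
Applied to a local copy of the chart (the paths in the diff above), the usual upgrade command then works without --force:

helm upgrade --install istio-init ./helm/istio-init --namespace istio-system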

Edit May 13th: ⚠️ adding these hooks fixes the upgrade issue, but breaks the initial Istio install. See my comment downthread for an explanation of our current fix.

@pviniciusfm


commented May 7, 2019

@Pluies this solution worked for me. Thank you

@APuertaSales


commented May 8, 2019

@Pluies, thank you. Adding this annotation to istio-init-crd-10 and istio-init-crd-11 did the trick.

@meganwalker-ibm


commented May 10, 2019

If --force is going to be the recommended upgrade path, then I suggest updating the https://istio.io/docs/setup/kubernetes/upgrade/steps/ docs for the Helm upgrade option of the control plane upgrade, along with understanding whether force-recreating the CRD Jobs would cause any downtime of a running Istio system during the command.

@Pluies


commented May 13, 2019

Since writing my previous comment, we've realised that our end-to-end infrastructure builds were failing: these new annotations break a fresh install of istio-init, because the Job needs its ServiceAccount to be available, and the ServiceAccount won't exist pre-install:

Events:
  Type     Reason        Age                From            Message
  ----     ------        ----               ----            -------
  Warning  FailedCreate  3m (x10 over 31m)  job-controller  Error creating: pods "istio-init-crd-10-" is forbidden: error looking up service account istio-system/istio-init-service-account: serviceaccount "istio-init-service-account" not found

After thinking about the situation a bit more, we came to the following conclusion:

  • The whole point of the istio-init chart is to install Istio's CRDs before the main istio chart can be installed.
  • This is a lot of complexity to install a few CRDs, and leads to annoying side-effects when upgrading Istio (cf this thread)
  • We also had to use an extra script to block the automated install process and wait until the istio-init Jobs completed before installing Istio. Sequentially using Helm to install istio-init and then istio will fail: helm install returns as soon as the Job objects have been added to the cluster, with no guarantee that the corresponding Pods have actually finished running, so Istio's install fails, expecting CRDs that haven't been installed yet. In our experience on a small EKS cluster, it can take up to a minute from Job submission to completion.
  • Helm has recently (well, "recently" - a year ago) introduced a new hook type specifically for CRDs, which negates the need for the istio-init chart altogether when using Helm hooks.

And our solution therefore was to remove the istio-init chart from our infrastructure entirely, and instead move the CRD definitions into the main istio chart with the hook annotation from Helm:

    "helm.sh/hook": crd-install

This makes the upgrade process from one version of Istio to another a bit more involved, as we have to cherry-pick the CRDs from istio-init and add them to istio, but it makes our repo simpler.
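As a sketch of what that looks like (using one of Istio's networking CRDs for illustration; the spec fields should be copied from the istio-init files rather than from here):

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: virtualservices.networking.istio.io
  annotations:
    "helm.sh/hook": crd-install
spec:
  group: networking.istio.io
  names:
    kind: VirtualService
    plural: virtualservices
    singular: virtualservice
  scope: Namespaced
  version: v1alpha3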

It's been smooth sailing so far. I'll report on this thread if we encounter more issues.

@morvencao

Member

commented May 14, 2019

@Pluies Thanks. Actually we introduced crd-install in 1.0.x, and then switched to the independent istio-init chart in 1.1.x to preserve continuity of custom resource data during the upgrade process, and to further enable Istio to evolve beyond a Helm-based installation.
I think @sdake knows more about the context.

@Pluies


commented May 21, 2019

That's super interesting, @morvencao - I'd be very keen to learn more about the reasoning behind this decision. Could you shed some light, @sdake?

@Pluies


commented Jun 21, 2019

Sigh... It looks like I'm going through the same journey as you, but as a slow-motion car crash.

Our update from Istio 1.1.x to 1.2.0 has failed despite adding the new 1.2 CRDs with the crd-install hook annotation, because Helm only runs these hooks on chart install, not on upgrade, and therefore won't install new CRDs during a Helm upgrade.

Looks like we'll be reverting to istio + istio-init charts and the extra "wait script" for our cluster installs.

@Pluies


commented Jun 21, 2019

And for completeness, here's the thread where @sdake commented on the issue, explaining the exact same problem we just encountered (and describing it better than I did :) ) and providing the rationale for splitting into two Helm charts: #9604 (comment)

@joberdick


commented Jun 21, 2019

It seems the best approach is to give up on Helm's Tiller and just use helm template for everything, then use scripts to manage any operations outside of that.
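The command form of that approach (note @meganwalker-ibm's comment above: re-applying the rendered Jobs can still hit the same immutable-field error):

helm template install/kubernetes/helm/istio-init --name istio-init --namespace istio-system | kubectl apply -f -
helm template install/kubernetes/helm/istio --name istio --namespace istio-system | kubectl apply -f -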

@Pluies


commented Jun 28, 2019

The Istio install guide documents both approaches. We like using Helm w/ Tiller (for all its faults :) ), so we go for option 2, which recommends (paraphrased):

  • helm install istio-init
  • Verify that all 23 Istio CRDs were committed to the Kubernetes api-server using the following command: kubectl get crds | grep 'istio.io\|certmanager.k8s.io' | wc -l
  • helm install istio

... which doesn't actually work for upgrades, as our cluster (which went from Istio 1.0 to 1.1 to 1.2) now has 54 CRDs matching istio|certmanager.

Instead, we wait until the CRD jobs created by istio-init are successful, using a script as follows:

helm upgrade --install istio-init istio-init

# Check whether a Kubernetes Job has succeeded
function succeeded() {
  [ "$(kubectl get job "$1" --no-headers -o custom-columns=:status.succeeded)" == "1" ]
}

echo "Waiting for Istio CRDs to be available..."
MAX_TRIES=20
i=0
while true
do
  if succeeded istio-init-crd-10 \
  && succeeded istio-init-crd-11 \
  && succeeded istio-init-crd-12
  then
    echo "All CRDs installed"
    break
  fi
  echo -n '.'
  # Give up after MAX_TRIES polls
  i=$((i + 1))
  if [ "$i" -eq "${MAX_TRIES}" ]; then
    echo
    echo "Istio CRD jobs have not succeeded after ${MAX_TRIES} tries, giving up."
    exit 1
  fi
  sleep 5
done

helm upgrade --install istio istio

So far, this seems to work really well, both for 1.2 re-applies and 1.1 -> 1.2 upgrades.

@sdake

Member

commented Jul 20, 2019

@morvencao the correct way to handle this Job failure without the need for --force is to rename the Job (via sed) on every release. We have a script that does this in T&R, although I am not sure of the specifics.

Can you look into using this approach? (We do this for other Jobs in the istio chart.)

@utka FYI this may need your assistance.

@sdake sdake modified the milestones: 1.1, 1.3 Jul 20, 2019
@sdake

Member

commented Jul 20, 2019

@morvencao I'd use the naming convention istio-init-crd-10-x.y.z, where x.y.z is the current release. The implementation will just run a new Job, which will create the per-version CRDs if needed.

@sdake

Member

commented Jul 20, 2019

@morvencao the key is to get rid of --force here.

@sdake

Member

commented Jul 20, 2019

Marking as a P0 for 1.3. This is a blocking bug.

@morvencao

Member

commented Jul 20, 2019

@sdake Yes, we're already using that solution for hook Jobs in the Istio chart, though not through sed; we just add a version suffix to the Job name, see:
https://github.com/istio/istio/blob/master/install/kubernetes/helm/istio/charts/security/templates/create-custom-resources-job.yaml#L63
https://github.com/istio/istio/blob/master/install/kubernetes/helm/istio/charts/security/templates/cleanup-secrets.yaml#L80

I have filed PR #15677 for this, and local testing is positive.
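
A sketch of that version-suffix pattern applied to the istio-init Jobs (assuming the chart's .Values.global.tag carries the release version, as in the linked templates; the actual change is in PR #15677):

apiVersion: batch/v1
kind: Job
metadata:
  namespace: {{ .Release.Namespace }}
  # Suffixing the name with the version means each upgrade creates a
  # brand-new Job instead of mutating the existing Job's immutable pod template.
  name: istio-init-crd-10-{{ .Values.global.tag | printf "%v" | trunc 32 }}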
