
Fails to discover master and form a cluster #341

Closed

demisx opened this issue Oct 22, 2019 · 6 comments


demisx commented Oct 22, 2019

Chart version:
7.3.0

Kubernetes version:
1.14.6

Kubernetes provider:
AWS

Helm Version:
v2.15.0

helm get release output

REVISION: 1
RELEASED: Tue Oct 22 11:12:21 2019
CHART: elasticsearch-7.3.0
USER-SUPPLIED VALUES:
image: docker.elastic.co/elasticsearch/elasticsearch-oss
imageTag: 7.3.2
roles:
  ingest: false

COMPUTED VALUES:
antiAffinity: hard
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=green&timeout=1s
clusterName: elasticsearch
esConfig: {}
esJavaOpts: -Xmx1g -Xms1g
esMajorVersion: ""
extraEnvs: []
extraInitContainers: []
extraVolumeMounts: []
extraVolumes: []
fsGroup: ""
fullnameOverride: ""
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch-oss
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 7.3.2
ingress:
  annotations: {}
  enabled: false
  hosts:
  - chart-example.local
  path: /
  tls: []
initResources: {}
labels: {}
lifecycle: {}
masterService: ""
masterTerminationFix: false
maxUnavailable: 1
minimumMasterNodes: 2
nameOverride: ""
networkHost: 0.0.0.0
nodeAffinity: {}
nodeGroup: master
nodeSelector: {}
persistence:
  annotations: {}
  enabled: true
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
  fsGroup: 1000
priorityClassName: ""
protocol: http
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5
replicas: 3
resources:
  limits:
    cpu: 1000m
    memory: 2Gi
  requests:
    cpu: 100m
    memory: 2Gi
roles:
  data: "true"
  ingest: false
  master: "true"
schedulerName: ""
secretMounts: []
securityContext:
  capabilities:
    drop:
    - ALL
  runAsNonRoot: true
  runAsUser: 1000
service:
  annotations: {}
  nodePort: null
  type: ClusterIP
sidecarResources: {}
sysctlInitContainer:
  enabled: true
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi

HOOKS:
---
# elasticsearch-qxaid-test
apiVersion: v1
kind: Pod
metadata:
  name: "elasticsearch-qxaid-test"
  annotations:
    "helm.sh/hook": test-success
spec:
  containers:
  - name: "elasticsearch-dxggn-test"
    image: "docker.elastic.co/elasticsearch/elasticsearch-oss:7.3.2"
    command:
      - "sh"
      - "-c"
      - |
        #!/usr/bin/env bash -e
        curl -XGET --fail 'elasticsearch-master:9200/_cluster/health?wait_for_status=green&timeout=1s'
  restartPolicy: Never
MANIFEST:

---
# Source: elasticsearch/templates/poddisruptionbudget.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: "elasticsearch-master-pdb"
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: "elasticsearch-master"
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.0"
    app: "elasticsearch-master"
  annotations:
    {}
    
spec:
  type: ClusterIP
  selector:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.0"
    app: "elasticsearch-master"
  ports:
  - name: http
    protocol: TCP
    port: 9200
  - name: transport
    protocol: TCP
    port: 9300
---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master-headless
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.0"
    app: "elasticsearch-master"
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  clusterIP: None # This is needed for statefulset hostnames like elasticsearch-0 to resolve
  # Create endpoints also if the related pod isn't ready
  publishNotReadyAddresses: true
  selector:
    app: "elasticsearch-master"
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300
---
# Source: elasticsearch/templates/statefulset.yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Tiller"
    release: "elasticsearch"
    chart: "elasticsearch-7.3.0"
    app: "elasticsearch-master"
  annotations:
    esMajorVersion: "7"
spec:
  serviceName: elasticsearch-master-headless
  selector:
    matchLabels:
      app: "elasticsearch-master"
  replicas: 3
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-master
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi
      
  template:
    metadata:
      name: "elasticsearch-master"
      labels:
        heritage: "Tiller"
        release: "elasticsearch"
        chart: "elasticsearch-7.3.0"
        app: "elasticsearch-master"
      annotations:
        
    spec:
      securityContext:
        fsGroup: 1000
        
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - "elasticsearch-master"
            topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 120
      volumes:
      initContainers:
      - name: configure-sysctl
        securityContext:
          runAsUser: 0
          privileged: true
        image: "docker.elastic.co/elasticsearch/elasticsearch-oss:7.3.2"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        resources:
          {}
          
      containers:
      - name: "elasticsearch"
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
          
        image: "docker.elastic.co/elasticsearch/elasticsearch-oss:7.3.2"
        imagePullPolicy: "IfNotPresent"
        readinessProbe:
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5
          
          exec:
            command:
              - sh
              - -c
              - |
                #!/usr/bin/env bash -e
                # If the node is starting up wait for the cluster to be ready (request params: 'wait_for_status=green&timeout=1s' )
                # Once it has started only check that the node itself is responding
                START_FILE=/tmp/.es_start_file

                http () {
                    local path="${1}"
                    if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                      BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                    else
                      BASIC_AUTH=''
                    fi
                    curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
                }

                if [ -f "${START_FILE}" ]; then
                    echo 'Elasticsearch is already running, lets check the node is healthy'
                    http "/"
                else
                    echo 'Waiting for elasticsearch cluster to become cluster to be ready (request params: "wait_for_status=green&timeout=1s" )'
                    if http "/_cluster/health?wait_for_status=green&timeout=1s" ; then
                        touch ${START_FILE}
                        exit 0
                    else
                        echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
                        exit 1
                    fi
                fi
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          limits:
            cpu: 1000m
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 2Gi
          
        env:
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: cluster.initial_master_nodes
            value: ""
          - name: discovery.seed_hosts
            value: "elasticsearch-master-headless"
          - name: cluster.name
            value: "elasticsearch"
          - name: network.host
            value: "0.0.0.0"
          - name: ES_JAVA_OPTS
            value: "-Xmx1g -Xms1g"
          - name: node.data
            value: "true"
          - name: node.ingest
            value: "false"
          - name: node.master
            value: "true"
        volumeMounts:
          - name: "elasticsearch-master"
            mountPath: /usr/share/elasticsearch/data

Describe the bug:
We've been deploying the Elasticsearch 7.3.0 Helm chart to freshly built kops Kubernetes clusters without issues for many weeks. However, when I rebuilt one environment yesterday, the deployment now always fails at the Elasticsearch step with the error below, and I can't figure out why.

{"type": "server", "timestamp": "2019-10-22T18:10:22,660+0000", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-0",  "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{elasticsearch-master-0}{jaY1E2suRxqrLK_HZxzI6w}{ZKO_C0-QSeSD2r5ZpY2wfA}{100.96.5.4}{100.96.5.4:9300}{dm}, {elasticsearch-master-2}{-xgmZe-2Q-GaZ_-x80LMhQ}{cd3IWwklR8GRAapYdoiISQ}{100.96.4.3}{100.96.4.3:9300}{dm}, {elasticsearch-master-1}{h2qFhHx8SHuwbzWyWM7Xvw}{awYThFu8RNGOilyZfqF9Xg}{100.96.3.6}{100.96.3.6:9300}{dm}]; discovery will continue using [100.96.4.3:9300, 100.96.3.6:9300] from hosts providers and [{elasticsearch-master-0}{jaY1E2suRxqrLK_HZxzI6w}{ZKO_C0-QSeSD2r5ZpY2wfA}{100.96.5.4}{100.96.5.4:9300}{dm}] from last-known cluster state; node term 0, last-accepted version 0 in term 0"  }
{"type": "server", "timestamp": "2019-10-22T18:10:27,653+0000", "level": "DEBUG", "component": "o.e.a.a.c.h.TransportClusterHealthAction", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-0",  "message": "no known master node, scheduling a retry"  }
{"type": "server", "timestamp": "2019-10-22T18:10:28,654+0000", "level": "DEBUG", "component": "o.e.a.a.c.h.TransportClusterHealthAction", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-0",  "message": "timed out while retrying [cluster:monitor/health] after failure (timeout [1s])"  }
{"type": "server", "timestamp": "2019-10-22T18:10:28,654+0000", "level": "WARN", "component": "r.suppressed", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-0",  "message": "path: /_cluster/health, params: {wait_for_status=green, timeout=1s}" , 
"stacktrace": ["org.elasticsearch.discovery.MasterNotDiscoveredException: null",
"at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$3.onTimeout(TransportMasterNodeAction.java:251) [elasticsearch-7.3.2.jar:7.3.2]",
"at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:325) [elasticsearch-7.3.2.jar:7.3.2]",
"at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:252) [elasticsearch-7.3.2.jar:7.3.2]",
"at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:572) [elasticsearch-7.3.2.jar:7.3.2]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688) [elasticsearch-7.3.2.jar:7.3.2]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]",
"at java.lang.Thread.run(Thread.java:835) [?:?]"] }

Steps to reproduce:

  1. Create a new k8s cluster using the latest kops 1.14.0 release.
  2. Deploy the Elasticsearch Helm chart with:
helm install elastic/elasticsearch --name elasticsearch --set image=docker.elastic.co/elasticsearch/elasticsearch-oss --set imageTag=7.3.2 --set roles.ingest=false --version 7.3.0 --atomic
  3. Check each master pod log. It fails to locate a master and form a cluster (see the verification sketch after these steps).
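
One quick way to confirm the symptom on a broken release (a sketch; it assumes the resource names shown above and a working kubectl context) is to check what cluster.initial_master_nodes was actually rendered to:

# Inspect the rendered StatefulSet for the bootstrap list
kubectl get statefulset elasticsearch-master -o yaml | grep -A 1 'cluster.initial_master_nodes'

# Or check the environment inside one of the running pods
kubectl exec elasticsearch-master-0 -- env | grep cluster.initial_master_nodes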

Expected behavior:
Successful deployment of the Elasticsearch Helm chart, as before.

Provide logs and/or server output (if relevant):

$ kubectl describe service/elasticsearch-master 
---
Name:              elasticsearch-master
Namespace:         default
Labels:            app=elasticsearch-master
                   chart=elasticsearch-7.3.0
                   heritage=Tiller
                   release=elasticsearch
Annotations:       <none>
Selector:          app=elasticsearch-master,chart=elasticsearch-7.3.0,heritage=Tiller,release=elasticsearch
Type:              ClusterIP
IP:                100.67.88.213
Port:              http  9200/TCP
TargetPort:        9200/TCP
Endpoints:         
Port:              transport  9300/TCP
TargetPort:        9300/TCP
Endpoints:         
Session Affinity:  None
Events:            <none>
$ kubectl describe service/elasticsearch-master-headless
---
Name:              elasticsearch-master-headless
Namespace:         default
Labels:            app=elasticsearch-master
                   chart=elasticsearch-7.3.0
                   heritage=Tiller
                   release=elasticsearch
Annotations:       service.alpha.kubernetes.io/tolerate-unready-endpoints: true
Selector:          app=elasticsearch-master
Type:              ClusterIP
IP:                None
Port:              http  9200/TCP
TargetPort:        9200/TCP
Endpoints:         100.96.3.7:9200,100.96.4.5:9200,100.96.5.5:9200
Port:              transport  9300/TCP
TargetPort:        9300/TCP
Endpoints:         100.96.3.7:9300,100.96.4.5:9300,100.96.5.5:9300
Session Affinity:  None
Events:            <none>
$ kubectl get all
---
NAME                         READY   STATUS    RESTARTS   AGE
pod/busybox                  1/1     Running   0          14m
pod/elasticsearch-master-0   0/1     Running   0          9m49s
pod/elasticsearch-master-1   0/1     Running   0          9m49s
pod/elasticsearch-master-2   0/1     Running   0          9m49s

NAME                                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/elasticsearch-master            ClusterIP   100.67.88.213   <none>        9200/TCP,9300/TCP   9m49s
service/elasticsearch-master-headless   ClusterIP   None            <none>        9200/TCP,9300/TCP   9m49s
service/kubernetes                      ClusterIP   100.64.0.1      <none>        443/TCP             47m

NAME                                    READY   AGE
statefulset.apps/elasticsearch-master   0/3     9m49s

demisx commented Oct 22, 2019

These are the events written during helm install:

$ kubectl get events --sort-by=.metadata.creationTimestamp
---
3m2s        Normal    SuccessfulCreate         statefulset/elasticsearch-master                                    create Claim elasticsearch-master-elasticsearch-master-0 Pod elasticsearch-master-0 in StatefulSet elasticsearch-master success
3m2s        Normal    SuccessfulCreate         statefulset/elasticsearch-master                                    create Pod elasticsearch-master-0 in StatefulSet elasticsearch-master successful
3m2s        Normal    NoPods                   poddisruptionbudget/elasticsearch-master-pdb                        No matching pods found
3m1s        Normal    ProvisioningSucceeded    persistentvolumeclaim/elasticsearch-master-elasticsearch-master-1   Successfully provisioned volume pvc-2d94baf3-f503-11e9-ad08-0ec4353e481e using kubernetes.io/aws-ebs
3m1s        Normal    ProvisioningSucceeded    persistentvolumeclaim/elasticsearch-master-elasticsearch-master-2   Successfully provisioned volume pvc-2d9b1dc2-f503-11e9-ad08-0ec4353e481e using kubernetes.io/aws-ebs
3m1s        Normal    SuccessfulCreate         statefulset/elasticsearch-master                                    create Pod elasticsearch-master-1 in StatefulSet elasticsearch-master successful
2m47s       Warning   FailedScheduling         pod/elasticsearch-master-0                                          pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
3m1s        Normal    SuccessfulCreate         statefulset/elasticsearch-master                                    create Claim elasticsearch-master-elasticsearch-master-2 Pod elasticsearch-master-2 in StatefulSet elasticsearch-master success
3m1s        Normal    SuccessfulCreate         statefulset/elasticsearch-master                                    create Pod elasticsearch-master-2 in StatefulSet elasticsearch-master successful
3m1s        Normal    SuccessfulCreate         statefulset/elasticsearch-master                                    create Claim elasticsearch-master-elasticsearch-master-1 Pod elasticsearch-master-1 in StatefulSet elasticsearch-master success
3m1s        Warning   FailedScheduling         pod/elasticsearch-master-1                                          pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
3m1s        Warning   FailedScheduling         pod/elasticsearch-master-2                                          pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
2m58s       Warning   FailedAttachVolume       pod/elasticsearch-master-2                                          AttachVolume.Attach failed for volume "pvc-2d9b1dc2-f503-11e9-ad08-0ec4353e481e" : "Error attaching EBS volume \"vol-0517939ba50d54333\"" to instance "i-0d551a72d7465bf04" since volume is in "creating" state
3m          Normal    Scheduled                pod/elasticsearch-master-1                                          Successfully assigned default/elasticsearch-master-1 to ip-172-20-54-204.us-west-2.compute.internal
2m58s       Warning   FailedAttachVolume       pod/elasticsearch-master-1                                          AttachVolume.Attach failed for volume "pvc-2d94baf3-f503-11e9-ad08-0ec4353e481e" : "Error attaching EBS volume \"vol-0f9cde9034c310eeb\"" to instance "i-0f5ef1528c98a14dc" since volume is in "creating" state
3m          Normal    Scheduled                pod/elasticsearch-master-2                                          Successfully assigned default/elasticsearch-master-2 to ip-172-20-91-45.us-west-2.compute.internal
2m54s       Normal    SuccessfulAttachVolume   pod/elasticsearch-master-2                                          AttachVolume.Attach succeeded for volume "pvc-2d9b1dc2-f503-11e9-ad08-0ec4353e481e"
2m54s       Normal    SuccessfulAttachVolume   pod/elasticsearch-master-1                                          AttachVolume.Attach succeeded for volume "pvc-2d94baf3-f503-11e9-ad08-0ec4353e481e"
2m46s       Normal    ProvisioningSucceeded    persistentvolumeclaim/elasticsearch-master-elasticsearch-master-0   Successfully provisioned volume pvc-2d90c274-f503-11e9-ad08-0ec4353e481e using kubernetes.io/aws-ebs
2m45s       Normal    Scheduled                pod/elasticsearch-master-0                                          Successfully assigned default/elasticsearch-master-0 to ip-172-20-124-210.us-west-2.compute.internal
2m43s       Warning   FailedAttachVolume       pod/elasticsearch-master-0                                          AttachVolume.Attach failed for volume "pvc-2d90c274-f503-11e9-ad08-0ec4353e481e" : "Error attaching EBS volume \"vol-05ed42b8e9d0aa92a\"" to instance "i-08d5acff318c6059a" since volume is in "creating" state
2m42s       Normal    Started                  pod/elasticsearch-master-2                                          Started container configure-sysctl
2m42s       Normal    Started                  pod/elasticsearch-master-1                                          Started container configure-sysctl
2m42s       Normal    Created                  pod/elasticsearch-master-2                                          Created container configure-sysctl
2m42s       Normal    Created                  pod/elasticsearch-master-1                                          Created container configure-sysctl
2m42s       Normal    Pulled                   pod/elasticsearch-master-1                                          Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:7.3.0" already present on machine
2m42s       Normal    Created                  pod/elasticsearch-master-2                                          Created container elasticsearch
2m42s       Normal    Pulled                   pod/elasticsearch-master-2                                          Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:7.3.0" already present on machine
2m42s       Normal    Pulled                   pod/elasticsearch-master-2                                          Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:7.3.0" already present on machine
2m41s       Normal    Started                  pod/elasticsearch-master-2                                          Started container elasticsearch
2m41s       Normal    Created                  pod/elasticsearch-master-1                                          Created container elasticsearch
2m41s       Normal    Pulled                   pod/elasticsearch-master-1                                          Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:7.3.0" already present on machine
2m41s       Normal    Started                  pod/elasticsearch-master-1                                          Started container elasticsearch
2m39s       Normal    SuccessfulAttachVolume   pod/elasticsearch-master-0                                          AttachVolume.Attach succeeded for volume "pvc-2d90c274-f503-11e9-ad08-0ec4353e481e"
9s          Warning   Unhealthy                pod/elasticsearch-master-2                                          Readiness probe failed: Waiting for elasticsearch cluster to become cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )
2m27s       Normal    Started                  pod/elasticsearch-master-0                                          Started container configure-sysctl
2m27s       Normal    Pulled                   pod/elasticsearch-master-0                                          Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:7.3.0" already present on machine
2m27s       Normal    Created                  pod/elasticsearch-master-0                                          Created container configure-sysctl
2m26s       Normal    Created                  pod/elasticsearch-master-0                                          Created container elasticsearch
2m26s       Normal    Started                  pod/elasticsearch-master-0                                          Started container elasticsearch
2m26s       Normal    Pulled                   pod/elasticsearch-master-0                                          Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:7.3.0" already present on machine
2s          Warning   Unhealthy                pod/elasticsearch-master-1                                          Readiness probe failed: Waiting for elasticsearch cluster to become cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )
2s          Warning   Unhealthy                pod/elasticsearch-master-0                                          Readiness probe failed: Waiting for elasticsearch cluster to become cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )


demisx commented Oct 22, 2019

Believe it or not, the problem turned out to be the recent Helm v2.15.0 upgrade that I picked up from brew. Downgrading back to v2.14.3 fixed the problem.

# Remove Tiller from the cluster and uninstall the Helm client
kubectl delete all -l app=helm -n kube-system 
kubectl -n kube-system delete serviceaccount/tiller
kubectl delete clusterrolebinding/tiller
brew uninstall kubernetes-helm

# Install older version of helm and tiller
brew install https://raw.githubusercontent.com/Homebrew/homebrew-core/0a17b8e50963de12e8ab3de22e53fccddbe8a226/Formula/kubernetes-helm.rb
kubectl -n kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
helm init --wait --service-account=tiller --history-max 200
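
To double-check that the downgrade took effect before retrying the release (not part of the original steps, just a quick sanity check):

# Client and Tiller should both report v2.14.3 now
helm version

# Then retry the install that previously failed
helm install elastic/elasticsearch --name elasticsearch --set image=docker.elastic.co/elasticsearch/elasticsearch-oss --set imageTag=7.3.2 --set roles.ingest=false --version 7.3.0 --atomic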

demisx closed this as completed Oct 22, 2019

jmlrt commented Oct 23, 2019

Hi @demisx, yes, we are working on Helm 2.15 support and have also hit some issues with it that should be addressed soon in #338. Meanwhile we recommend staying on Helm 2.14.3.
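
If you prefer to pin an exact client version rather than rely on whatever brew currently ships, one option (a sketch, assuming the standard get.helm.sh download layout; adjust the platform suffix for your machine) is to install the 2.14.3 binary directly:

# Download the Helm v2.14.3 client binary (Linux amd64 shown)
curl -fsSL -o helm-v2.14.3-linux-amd64.tar.gz https://get.helm.sh/helm-v2.14.3-linux-amd64.tar.gz
tar -zxf helm-v2.14.3-linux-amd64.tar.gz
sudo mv linux-amd64/helm /usr/local/bin/helm
helm version --client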


demisx commented Oct 23, 2019

@jmlrt Thank you for confirming. It would be really helpful if this issue were disclosed somewhere; it would have saved us a big chunk of dev time. Maybe a short warning sentence in the README, something that indicates the chart's dependency on a particular Helm version? My apologies if this is already mentioned somewhere. I am still learning the ins and outs of Helm.

demisx changed the title from "Fails to discover master and form cluster" to "Fails to discover master and form a cluster" Oct 23, 2019

jmlrt commented Oct 23, 2019

In theory, the Elastic Helm charts should be compatible with every Helm v2 release (disclaimer: with the current code we certainly won't be compatible with Helm v3), since we don't have code specific to any particular release.

However, Helm 2.15.0 brought a lot of changes, including a breaking change to the chart apiVersion and two regressions, which forced an emergency 2.15.1 release a few hours ago (one change was fixed and the other was reverted).

If you take a look at these issues, you'll see that it impacted not only the Elastic charts but many other charts as well.

Overall, I'd advise always testing Helm version upgrades in a sandbox environment first (this holds for almost all software, but especially for Helm).

In addition, I can add a mention of the Helm version that we currently test against to the README.

Oh, and by the way, #338 has since been merged, so you should now be able to use Helm 2.15.1 with the Elastic charts if your other charts have no issues with it.


demisx commented Oct 23, 2019

@jmlrt Understood. Thank you for sharing your thoughts. I think another good practice may be to simply stick with the Helm version that is currently used in your tests/ folder. It just never occurred to me that going from one Helm release to another could result in a cluster formation error like this. I guess I've learned it the hard way! Properly upgrading apps is a small science in itself. 😄
