
Unable to run v0.2.0 in minikube. "pod has unbound PersistentVolumeClaims" #13

Closed
kohv opened this issue Sep 29, 2018 · 6 comments
Labels
bug Something isn't working


@kohv

kohv commented Sep 29, 2018

Running helm install --name consul --namespace=pkr -f dev-consul.yaml ./consul-helm with these custom values:

syncCatalog:
  enabled: true
server:
  storage: "8Gi"

It completes successfully, but when running kubectl get pods -n pkr, I see that consul-server-1 & consul-server-2 are Pending.

NAME                                   READY     STATUS    RESTARTS   AGE
consul-4vstb                           0/1       Running   0          15m
consul-server-0                        0/1       Running   0          15m
consul-server-1                        0/1       Pending   0          15m
consul-server-2                        0/1       Pending   0          15m
consul-sync-catalog-587b6859f6-dpj5v   1/1       Running   0          15m

Closer inspection of pod consul-server-0:

kubectl describe pods consul-server-0 -n pkr
Name:           consul-server-0
Namespace:      pkr
Node:           minikube/10.0.2.15
Start Time:     Sat, 29 Sep 2018 22:39:09 +0300
Labels:         app=consul
                chart=consul-0.1.0
                component=server
                controller-revision-hash=consul-server-66479c5df5
                hasDNS=true
                release=consul
                statefulset.kubernetes.io/pod-name=consul-server-0
Annotations:    consul.hashicorp.com/connect-inject=false
Status:         Running
IP:             172.17.0.9
Controlled By:  StatefulSet/consul-server
Containers:
  consul:
    Container ID:  docker://7de44c6027bdb78ba4b7bc73643701aa9e0bbb55abce8ce2c7b8e12e2adf82b0
    Image:         consul:1.2.3
    Image ID:      docker-pullable://consul@sha256:ea66d17d8c8c1f1afb2138528d62a917093fcd2e3b3a7b216a52c253189ea980
    Ports:         8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
    Command:
      /bin/sh
      -ec
      CONSUL_FULLNAME="consul"

exec /bin/consul agent \
  -advertise="${POD_IP}" \
  -bind=0.0.0.0 \
  -bootstrap-expect=3 \
  -client=0.0.0.0 \
  -config-dir=/consul/config \
  -datacenter=dc1 \
  -data-dir=/consul/data \
  -domain=consul \
  -hcl="connect { enabled = true }" \
  -ui \
  -retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
  -retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
  -retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
  -server

    State:          Running
      Started:      Sat, 29 Sep 2018 22:39:10 +0300
    Ready:          False
    Restart Count:  0
    Readiness:      exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=5s timeout=5s period=3s #success=1 #failure=2
    Environment:
      POD_IP:      (v1:status.podIP)
      NAMESPACE:  pkr (v1:metadata.namespace)
    Mounts:
      /consul/config from config (rw)
      /consul/data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-fd6r9 (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-consul-server-0
    ReadOnly:   false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      consul-server-config
    Optional:  false
  default-token-fd6r9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-fd6r9
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age                  From               Message
  ----     ------                 ----                 ----               -------
  Warning  FailedScheduling       31m (x2 over 31m)    default-scheduler  pod has unbound PersistentVolumeClaims
  Normal   Scheduled              31m                  default-scheduler  Successfully assigned consul-server-0 to minikube
  Normal   SuccessfulMountVolume  31m                  kubelet, minikube  MountVolume.SetUp succeeded for volume "pvc-55188bbe-c41f-11e8-b65d-080027750557"
  Normal   SuccessfulMountVolume  31m                  kubelet, minikube  MountVolume.SetUp succeeded for volume "config"
  Normal   SuccessfulMountVolume  31m                  kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-fd6r9"
  Normal   Pulled                 31m                  kubelet, minikube  Container image "consul:1.2.3" already present on machine
  Normal   Created                31m                  kubelet, minikube  Created container
  Normal   Started                31m                  kubelet, minikube  Started container
  Warning  Unhealthy              16m (x299 over 30m)  kubelet, minikube  Readiness probe failed:

This "pod has unbound PersistentVolumeClaims" error is same for all consul servers. Yet, when running kubectl get pvc & kubectl get pv, I see persistent volumes fine:

kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                           STORAGECLASS   REASON    AGE
pvc-55188bbe-c41f-11e8-b65d-080027750557   8Gi        RWO            Delete           Bound     pkr/data-consul-server-0   standard                 37m
pvc-55245df6-c41f-11e8-b65d-080027750557   8Gi        RWO            Delete           Bound     pkr/data-consul-server-1   standard                 37m
pvc-5533ebc2-c41f-11e8-b65d-080027750557   8Gi        RWO            Delete           Bound     pkr/data-consul-server-2   standard                 37m

kubectl get pvc -n pkr
NAME                   STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-consul-server-0   Bound     pvc-55188bbe-c41f-11e8-b65d-080027750557   8Gi        RWO            standard       38m
data-consul-server-1   Bound     pvc-55245df6-c41f-11e8-b65d-080027750557   8Gi        RWO            standard       38m
data-consul-server-2   Bound     pvc-5533ebc2-c41f-11e8-b65d-080027750557   8Gi        RWO            standard       38m
postgresql             Bound     pvc-56367c46-c41f-11e8-b65d-080027750557   8Gi        RWO            standard       38m

So I don't understand the error, since the pv and pvc outputs look fine to me.
How should I debug this further? I've tried deleting the minikube cluster and starting over from scratch, but I get the same result every time.

kubectl version

Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.3", GitCommit:"a4529464e4629c21224b3d52edfe0ea91b072862", GitTreeState:"clean", BuildDate:"2018-09-10T11:44:36Z", GoVersion:"go1.11", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

helm version
Client: &version.Version{SemVer:"v2.11.0", GitCommit:"2e55dbe1fdb5fdb96b75ff144a339489417b146b", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.11.0", GitCommit:"2e55dbe1fdb5fdb96b75ff144a339489417b146b", GitTreeState:"clean"}```

@kohv
Author

kohv commented Sep 30, 2018

I had to remove the affinity rules on the StatefulSet, here.

After removing that anti-affinity rule, I was able to successfully launch the Consul server and consul-ui. I'm very new to k8s, but it seems to me that the anti-affinity rules prevent multiple Consul server agents from being scheduled on the same node?
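
For reference, the rule being removed is the pod anti-affinity on the server StatefulSet; it looks roughly like the block below (a reconstruction based on the default quoted later in this thread, not the exact chart source):

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: {{ template "consul.name" . }}
            release: "{{ .Release.Name }}"
            component: server
        topologyKey: kubernetes.io/hostname

Because the rule is "required" and minikube only has one node, only one server pod can be scheduled; the remaining replicas stay Pending.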

@ervikrant06

Yes, I also hit the same issue. It is happening because of the affinity rules: for redundancy, the chart wants to start a single Consul server on each node.

@mitchellh
Contributor

This seems like two things:

  • We need docs to state that you'll need at least N distinct nodes available, where N is the number of Consul servers.
  • OR: you need to get rid of the affinity rules. We need to add configuration to allow this.
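
Until that configuration exists, a minimal single-node workaround (a sketch, assuming the chart's server.replicas and server.bootstrapExpect values do what their names suggest) is to run just one server so the anti-affinity rule has nothing to conflict with:

server:
  replicas: 1
  bootstrapExpect: 1
  storage: "8Gi"

A single server has no redundancy, so this only makes sense for local development clusters such as minikube.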

@s3than
Contributor

s3than commented Oct 4, 2018

We resolved this issue by making the affinity a variable:

    {{- if .Values.affinity }}
      affinity:
{{ tpl .Values.affinity . | indent 8 }}
{{- end }}

And then in the values:

### Consul Settings
affinity: |
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: {{ template "consul.name" . }}
            release: "{{ .Release.Name }}"
            component: server
        topologyKey: kubernetes.io/hostname

This then means that in minikube you can override it with:

## Affinity settings, this allows us to configure and run in Minikube
affinity: |
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      podAffinityTerm:
        topologyKey: kubernetes.io/hostname
        labelSelector:
          matchExpressions:
          - key: component
            operator: In
            values:
            - "{{ .Release.Name }}-{{ .Values.Component }}"

@s3than
Contributor

s3than commented Oct 21, 2018

@mitchellh Is there anything I can provide to get this item resolved?

Making the affinity a variable allows testing on Minikube.

Thanks

@adilyse
Contributor

adilyse commented Nov 8, 2018

Thanks to @s3than, server affinity is now configurable in the values file! Additional documentation will be added once this PR in the Consul repository is merged.
