This repository has been archived by the owner on Sep 21, 2023. It is now read-only.

etcd Cluster fails to start #353

Open
loomsen opened this issue Oct 22, 2018 · 1 comment

loomsen commented Oct 22, 2018

Hi guys,
thank you for your effort. I tried to get started by following the example, but my etcd cluster fails to start.

At first it looks like this:

kubectl -n default get pods
NAME                              READY     STATUS     RESTARTS   AGE
etcd-operator-779446c7d8-t2hm9    3/3       Running    0          28m
example-etcd-f4rhsm64d4           0/1       Init:0/1   0          9s
vault-operator-7dc8b55b4d-mkz5p   1/1       Running    0          28m

Then it errors out, without providing any output:

 nvarz:~/playground/vault-operator (master *%=)$ kubectl -n default get pods -w
NAME                              READY     STATUS    RESTARTS   AGE
etcd-operator-779446c7d8-t2hm9    3/3       Running   0          29m
example-etcd-9x68tmdxl7           0/1       Error     0          40s
example-etcd-f4rhsm64d4           0/1       Running   0          56s
vault-operator-7dc8b55b4d-mkz5p   1/1       Running   0          29m
^C nvarz:~/playground/vault-operator (master *%=)$ kubectl describe -n default example-etcd-9x68tmdxl7
the server doesn't have a resource type "example-etcd-9x68tmdxl7"
 nvarz:~/playground/vault-operator (master *%=)$ kubectl describe -n default example-etcd-f4rhsm64d4
the server doesn't have a resource type "example-etcd-f4rhsm64d4"
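
The "doesn't have a resource type" error above is what kubectl reports when `describe` gets a name without a resource type, so the pod name is read as the type. A minimal sketch of the intended inspection follows, wrapped in a hypothetical helper called `inspect_pod` (the function name is an assumption for illustration, not part of kubectl or the operator):

```shell
# Hypothetical helper (not part of kubectl): describe a pod and print its logs.
# Usage: inspect_pod <namespace> <pod-name>
inspect_pod() {
  ns="$1"
  pod="$2"
  # "pod" is the resource type that was missing from the commands above.
  kubectl describe pod -n "$ns" "$pod"
  # Container logs; for a pod in Error state these may show why it exited.
  kubectl logs -n "$ns" "$pod"
}
```

For the failing member this would be called as `inspect_pod default example-etcd-9x68tmdxl7`.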

Container logs of example-etcd-f4rhsm64d4:

WARNING: 2018/10/22 17:34:08 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp 0.0.0.0:2379: getsockopt: connection refused"; Reconnecting to {0.0.0.0:2379 0  <nil>}
2018-10-22 17:34:08.883317 I | raft: 850a5ffb901b8769 is starting a new election at term 119
2018-10-22 17:34:08.883345 I | raft: 850a5ffb901b8769 became candidate at term 120
2018-10-22 17:34:08.883354 I | raft: 850a5ffb901b8769 received MsgVoteResp from 850a5ffb901b8769 at term 120
2018-10-22 17:34:08.883362 I | raft: 850a5ffb901b8769 [logterm: 2, index: 5] sent MsgVote request to 3922346fe0f3212c at term 120
2018-10-22 17:34:09.883322 I | raft: 850a5ffb901b8769 is starting a new election at term 120
2018-10-22 17:34:09.883353 I | raft: 850a5ffb901b8769 became candidate at term 121
2018-10-22 17:34:09.883361 I | raft: 850a5ffb901b8769 received MsgVoteResp from 850a5ffb901b8769 at term 121
2018-10-22 17:34:09.883369 I | raft: 850a5ffb901b8769 [logterm: 2, index: 5] sent MsgVote request to 3922346fe0f3212c at term 121
2018-10-22 17:34:10.168368 W | rafthttp: health check for peer 3922346fe0f3212c could not connect: dial tcp 10.42.8.6:2380: getsockopt: connection refused
WARNING: 2018/10/22 17:34:10 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp 0.0.0.0:2379: getsockopt: connection refused"; Reconnecting to {0.0.0.0:2379 0  <nil>}
2018-10-22 17:34:10.578366 I | etcdserver: skipped leadership transfer for stopping non-leader member
WARNING: 2018/10/22 17:34:10 grpc: addrConn.transportMonitor exits due to: context canceled
2018-10-22 17:34:10.578447 I | rafthttp: stopping peer 3922346fe0f3212c...
2018-10-22 17:34:10.578471 I | rafthttp: stopped streaming with peer 3922346fe0f3212c (writer)
2018-10-22 17:34:10.578485 I | rafthttp: stopped streaming with peer 3922346fe0f3212c (writer)
2018-10-22 17:34:10.578522 I | rafthttp: stopped HTTP pipelining with peer 3922346fe0f3212c
2018-10-22 17:34:10.578533 I | rafthttp: stopped streaming with peer 3922346fe0f3212c (stream MsgApp v2 reader)
2018-10-22 17:34:10.578538 I | rafthttp: stopped streaming with peer 3922346fe0f3212c (stream Message reader)
2018-10-22 17:34:10.578544 I | rafthttp: stopped peer 3922346fe0f3212c
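
One possible reading of the log above, offered as an assumption rather than a confirmed diagnosis: member 850a5ffb901b8769 only ever requests a vote from a single peer (3922346fe0f3212c), which suggests the cluster had two members at this point, and that peer refuses connections on 10.42.8.6:2380. Raft needs a majority of members to elect a leader, so a two-member cluster with one unreachable member cannot elect one and the terms keep climbing. The majority arithmetic can be sketched with a hypothetical helper named `quorum`:

```shell
# Hypothetical helper: majority (quorum) size for an N-member Raft cluster.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

quorum 1   # prints 1
quorum 2   # prints 2: both members must be reachable
quorum 3   # prints 2: one member may be down
```

If that reading is right, the logs of the second member (example-etcd-9x68tmdxl7) are the more interesting ones, since its failure would be what leaves the first member without a majority.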

What I end up with

kubectl -n default get pods -w
NAME                              READY     STATUS      RESTARTS   AGE
etcd-operator-779446c7d8-t2hm9    3/3       Running     0          33m
example-668f9f8f7d-76mh7          1/2       Running     0          3m
example-668f9f8f7d-7m4vd          1/2       Running     0          3m
example-668f9f8f7d-n7vgr          1/2       Running     0          3m
example-etcd-9x68tmdxl7           0/1       Error       0          4m
example-etcd-f4rhsm64d4           0/1       Completed   0          4m
vault-operator-7dc8b55b4d-mkz5p   1/1       Running     0          33m

Nothing is listed under sealed:

kubectl -n default get vault example -o yaml
apiVersion: vault.security.coreos.com/v1alpha1
kind: VaultService
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"vault.security.coreos.com/v1alpha1","kind":"VaultService","metadata":{"annotations":{},"name":"example","namespace":"default"},"spec":{"nodes":3,"version":"0.9.1-0"}}
  creationTimestamp: 2018-10-22T17:31:07Z
  generation: 1
  name: example
  namespace: default
  resourceVersion: "2377045"
  selfLink: /apis/vault.security.coreos.com/v1alpha1/namespaces/default/vaultservices/example
  uid: 42467cbb-d620-11e8-8fcd-0050568b2ddd
spec:
  TLS:
    static:
      clientSecret: example-default-vault-client-tls
      serverSecret: example-default-vault-server-tls
  baseImage: quay.io/coreos/vault
  configMapName: ""
  nodes: 3
  version: 0.9.1-0
status:
  clientPort: 8200
  initialized: false
  phase: Running
  serviceName: example
  vaultStatus:
    active: ""
    sealed: null
    standby: null
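
The VaultService status only reflects the Vault side. The etcd operator tracks the backing store in a separate `EtcdCluster` custom resource (`etcdcluster.etcd.database.coreos.com`), whose status may say more about the failed member. A minimal sketch for dumping it, using a hypothetical helper `show_etcd_cluster`; the cluster name `example-etcd` used below is an assumption inferred from the pod names, not something confirmed in this report:

```shell
# Hypothetical helper (not part of kubectl): dump an EtcdCluster resource as YAML.
# Usage: show_etcd_cluster <namespace> <cluster-name>
show_etcd_cluster() {
  # The cluster name is an assumption; list candidates with
  # "kubectl get etcdcluster -n <namespace>" if unsure.
  kubectl get etcdcluster -n "$1" "$2" -o yaml
}
```

Here that would be `show_etcd_cluster default example-etcd`.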

ledroide commented Aug 1, 2019

Same issue here.

$ kubectl get all,vault,etcd -l 'app in (etcd,vault)' 
NAME                             READY   STATUS    RESTARTS   AGE
pod/vault-svc-6bc5678fcc-l7t2r   1/2     Running   1          24m
pod/vault-svc-6bc5678fcc-znmwj   1/2     Running   1          24m
pod/vault-svc-etcd-9s974hwgj5    1/1     Running   0          25m
pod/vault-svc-etcd-gvschtkphl    0/1     Error     0          25m
NAME                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
service/vault-svc               ClusterIP   10.233.15.50   <none>        8200/TCP,8201/TCP,9102/TCP   24m
service/vault-svc-etcd          ClusterIP   None           <none>        2379/TCP,2380/TCP            25m
service/vault-svc-etcd-client   ClusterIP   10.233.10.40   <none>        2379/TCP                     25m
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/vault-svc   0/2     2            0           24m
replicaset.apps/vault-svc-6bc5678fcc   2         2         0       24m
etcdcluster.etcd.database.coreos.com/vault-svc-etcd   25m

@loomsen: were you able to solve it?
