This repository was archived by the owner on Mar 26, 2020. It is now read-only.

GlusterD kubernetes: systemctl start glusterd silent failures.  #1496

@jayunit100

Description

Note: I didn't set up an etcd URL. I assume that either way glusterd should fail fast and obviously if etcd isn't working; instead, it fails silently.

Observed behavior

Running the kube cluster recipes, the Gluster pods are running and healthy, but systemctl status glusterd2 tells another story: the service has completely failed.

Expected/desired behavior

Pods should exit if glusterd can't start up, or at least log the failure to stderr. Right now there are no logs, and the only way to know it's broken is to run glustercli peer status (or similar) inside the pod.
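
One possible way to make the failure visible, as a sketch only (the probe command and timings are assumptions, not part of this report), would be a liveness probe in the container spec that runs glustercli peer status and restarts the container when glusterd2 stops responding:

          # Hypothetical liveness probe (not in the manifest below): restart the
          # container when glusterd2 stops answering instead of failing silently.
          livenessProbe:
            exec:
              command: ["glustercli", "peer", "status"]
            initialDelaySeconds: 30
            periodSeconds: 30
            failureThreshold: 3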

Details on how to reproduce (minimal and precise)

Create the following file:

---
apiVersion: v1
kind: Namespace
metadata:
  name: gluster-storage
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gluster
  namespace: gluster-storage
  labels:
    gluster-storage: glusterd2
spec:
  selector:
    matchLabels:
      name: glusterd2-daemon
  template:
    metadata:
      labels:
        name: glusterd2-daemon
    spec:
      containers:
        - name: glusterd2
          image: docker.io/gluster/glusterd2-nightly:20190204
# TODO: Enable the below once passing environment variables to the containers is fixed
#          env:
#            - name: GD2_RESTAUTH
#              value: "false"
# Enable if an external etcd cluster has been set up
#            - name: GD2_ETCDENDPOINTS
#              value: "http://gluster-etcd:2379"
# Generate and set a random uuid here
#            - name: GD2_CLUSTER_ID
#              value: "9610ec0b-17e7-405e-82f7-5f78d0b22463"
          securityContext:
            capabilities: {}
            privileged: true
          volumeMounts:
            - name: gluster-dev
              mountPath: "/dev"
            - name: gluster-cgroup
              mountPath: "/sys/fs/cgroup"
              readOnly: true
            - name: gluster-lvm
              mountPath: "/run/lvm"
            - name: gluster-kmods
              mountPath: "/usr/lib/modules"
              readOnly: true

      volumes:
        - name: gluster-dev
          hostPath:
            path: "/dev"
        - name: gluster-cgroup
          hostPath:
            path: "/sys/fs/cgroup"
        - name: gluster-lvm
          hostPath:
            path: "/run/lvm"
        - name: gluster-kmods
          hostPath:
            path: "/usr/lib/modules"

---
apiVersion: v1
kind: Service
metadata:
  name: glusterd2-service
  namespace: gluster-storage
spec:
  selector:
    name: glusterd2-daemon
  ports:
    - protocol: TCP
      port: 24007
      targetPort: 24007
# GD2 will be available on kube-host:31007 externally
      nodePort: 31007
  type: NodePort
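
For completeness, a rough sketch of the commands to apply the manifest and inspect a pod (the file name and pod name are placeholders, not from the original report):

# Apply the manifest above (file name is illustrative)
kubectl apply -f glusterd2-daemonset.yaml

# List the DaemonSet pods
kubectl -n gluster-storage get pods -l name=glusterd2-daemon

# Exec into one of them and check the glusterd2 unit
kubectl -n gluster-storage exec -it <pod-name> -- systemctl status glusterd2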

Exec -t -i into one of the pods: the pod appears healthy, but running systemctl status glusterd2 will show error logs. Re-running the command manually, you will then see the following logs:

WARNING: 2019/02/04 19:43:51 grpc: addrConn.createTransport failed to connect to {[fe80::345c:baff:fefe:edc6]:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp [fe80::345c:baff:fefe:edc6]:2379: connect: invalid argument". Reconnecting...
WARNING: 2019/02/04 19:43:51 grpc: addrConn.createTransport failed to connect to {[fe80::345c:baff:fefe:edc6]:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp [fe80::345c:baff:fefe:edc6]:2379: connect: invalid argument". Reconnecting...
