
Brick goes offline when a new PVC is created #24

Closed
ksandha opened this issue Oct 11, 2018 · 9 comments

@ksandha

ksandha commented Oct 11, 2018

  1. Created a GCS cluster
[vagrant@kube1 ~]$ kubectl get pods -n gcs
NAME                                   READY     STATUS    RESTARTS   AGE
csi-attacher-glusterfsplugin-0         2/2       Running   0          1h
csi-nodeplugin-glusterfsplugin-4cvkd   2/2       Running   0          1h
csi-nodeplugin-glusterfsplugin-m9z9n   2/2       Running   0          1h
csi-nodeplugin-glusterfsplugin-wclwr   2/2       Running   0          1h
csi-provisioner-glusterfsplugin-0      2/2       Running   0          1h
etcd-chvm79wqr4                        1/1       Running   0          1h
etcd-ndccs6pkq7                        1/1       Running   0          1h
etcd-operator-54bbdfc55d-vkxh6         1/1       Running   0          1h
etcd-rrfgwq5xkd                        1/1       Running   0          3m
kube1-0                                1/1       Running   0          1h
kube2-0                                1/1       Running   0          1h
kube3-0                                1/1       Running   0          1h

  2. Created a PVC and mounted it on an app pod (a sketch of the manifests is included after the status output below).
[root@kube1-0 /]# glustercli volume status --endpoints="http://kube2-0.glusterd2.gcs:24007"
Volume : pvc-350277cfcd3111e8
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
|               BRICK ID               |         HOST          |                                PATH                                 | ONLINE | PORT  | PID |
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
| 1e9711eb-6ab9-4381-8f37-4fe929ae8e36 | kube3-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-350277cfcd3111e8/subvol1/brick1/brick | true   | 49152 |  53 |
| 044ae8e2-4dcb-45f1-9e17-7fa0b1b8084b | kube1-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-350277cfcd3111e8/subvol1/brick2/brick | true   | 49152 |  53 |
| ddc9610e-4216-4d96-a4e1-5558703d2f1a | kube2-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-350277cfcd3111e8/subvol1/brick3/brick | true   | 49152 |  53 |
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
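For reference, a minimal sketch of the PVC and app pod used in this step. Only the names gcs-pvc1, glusterfs-csi and redis1 appear later in this thread; every other field is an assumption, not the exact manifest that was applied.

# Hypothetical manifests, reconstructed from names seen in this thread.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gcs-pvc1
spec:
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 2Gi
  storageClassName: glusterfs-csi
---
apiVersion: v1
kind: Pod
metadata:
  name: redis1
spec:
  containers:
  - name: redis
    image: redis
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: gcs-pvc1
EOF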
  3. Killed a brick process from inside the gd2 pod (a process-check sketch follows this step's output).
[root@kube1-0 /]# kill -9 53
[root@kube1-0 /]# 
[root@kube1-0 /]# glustercli volume status --endpoints="http://kube2-0.glusterd2.gcs:24007"
Volume : pvc-350277cfcd3111e8
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
|               BRICK ID               |         HOST          |                                PATH                                 | ONLINE | PORT  | PID |
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
| 1e9711eb-6ab9-4381-8f37-4fe929ae8e36 | kube3-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-350277cfcd3111e8/subvol1/brick1/brick | true   | 49152 |  53 |
| 044ae8e2-4dcb-45f1-9e17-7fa0b1b8084b | kube1-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-350277cfcd3111e8/subvol1/brick2/brick | false  |     0 |   0 |
| ddc9610e-4216-4d96-a4e1-5558703d2f1a | kube2-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-350277cfcd3111e8/subvol1/brick3/brick | true   | 49152 |  53 |
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
[root@kube1-0 /]# 
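A small sketch of how the brick process can be confirmed inside the gd2 pod before and after the kill. The glusterfsd process name is an assumption about how the brick shows up in the process list; the PID itself comes from the PID column of glustercli volume status.

# Inside the gd2 pod (kube1-0 here): confirm which process backs the brick,
# kill it, then re-check the volume status.
ps -ef | grep glusterfsd | grep -v grep    # assumed brick process name
kill -9 53                                 # PID taken from the PID column above
glustercli volume status --endpoints="http://kube2-0.glusterd2.gcs:24007"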

  4. Deleted the app pod and the PVC:
[vagrant@kube1 ~]$ kubectl delete pod redis1
pod "redis1" deleted
[vagrant@kube1 ~]$ kubectl -n gcs -it exec kube1-0 -- /bin/bash
[root@kube1-0 /]# glustercli volume status --endpoints="http://kube2-0.glusterd2.gcs:24007"
Volume : pvc-350277cfcd3111e8
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
|               BRICK ID               |         HOST          |                                PATH                                 | ONLINE | PORT  | PID |
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
| 1e9711eb-6ab9-4381-8f37-4fe929ae8e36 | kube3-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-350277cfcd3111e8/subvol1/brick1/brick | true   | 49152 |  53 |
| 044ae8e2-4dcb-45f1-9e17-7fa0b1b8084b | kube1-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-350277cfcd3111e8/subvol1/brick2/brick | false  |     0 |   0 |
| ddc9610e-4216-4d96-a4e1-5558703d2f1a | kube2-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-350277cfcd3111e8/subvol1/brick3/brick | true   | 49152 |  53 |
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
[root@kube1-0 /]# exit
[vagrant@kube1 ~]$ kubectl get pods
No resources found.
[vagrant@kube1 ~]$ kubectl get pvc
NAME       STATUS    VOLUME                 CAPACITY   ACCESS MODES   STORAGECLASS    AGE
gcs-pvc1   Bound     pvc-350277cfcd3111e8   2Gi        RWX            glusterfs-csi   19m
[vagrant@kube1 ~]$ kubectl delete pvc gcs-pvc1
persistentvolumeclaim "gcs-pvc1" deleted
[vagrant@kube1 ~]$ kubectl get pvc
No resources found.
[vagrant@kube1 ~]$ kubectl -n gcs -it exec kube1-0 -- /bin/bash
[root@kube1-0 /]# glustercli volume status --endpoints="http://kube2-0.glusterd2.gcs:24007"
No volumes found
[root@kube1-0 /]#

  5. Deleted the gd2 pod and waited for a new pod to spin up (a watch sketch follows the command below).

[vagrant@kube1 ~]$ kubectl delete -n gcs pods kube1-0 --grace-period=0
pod "kube1-0" deleted

  6. Created a new PVC and checked the volume status in the gd2 pod (a brick-path check is sketched after the output).
[root@kube1-0 /]# glustercli volume status --endpoints="http://kube3-0.glusterd2.gcs:24007"
Volume : pvc-044c90d4cd3411e8
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
|               BRICK ID               |         HOST          |                                PATH                                 | ONLINE | PORT  | PID |
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
| 981ca2e3-e282-4073-b540-82a1bd849a5c | kube2-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-044c90d4cd3411e8/subvol1/brick3/brick | true   | 49152 | 175 |
| 22752a52-4be3-4ec2-beec-ad53290dfd3e | kube3-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-044c90d4cd3411e8/subvol1/brick1/brick | true   | 49152 | 173 |
| 81d97c58-4f76-4c10-bcb7-1d64c552e515 | kube1-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-044c90d4cd3411e8/subvol1/brick2/brick | false  |     0 |   0 |
+--------------------------------------+-----------------------+---------------------------------------------------------------------+--------+-------+-----+
[root@kube1-0 /]# 
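A quick check that ties into the RCA below: inside the restarted kube1-0 pod, see whether the brick path reported above actually exists and is mounted. The path comes from the PATH column; the expectation that it is missing is only an assumption based on the RCA in the comments.

# Inside the restarted kube1-0 pod:
ls -ld /var/run/glusterd2/bricks/pvc-044c90d4cd3411e8/subvol1/brick2/brick
mount | grep pvc-044c90d4cd3411e8 || echo "brick path not mounted"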

@Madhu-1
Member

Madhu-1 commented Oct 11, 2018

@ksandha can you paste the kubectl get po -n gcs output after the brick goes offline?

@ksandha
Author

ksandha commented Oct 11, 2018

[vagrant@kube1 ~]$ kubectl get po -n gcs
NAME                                   READY     STATUS    RESTARTS   AGE
csi-attacher-glusterfsplugin-0         2/2       Running   0          1h
csi-nodeplugin-glusterfsplugin-4cvkd   2/2       Running   0          1h
csi-nodeplugin-glusterfsplugin-m9z9n   2/2       Running   0          1h
csi-nodeplugin-glusterfsplugin-wclwr   2/2       Running   0          1h
csi-provisioner-glusterfsplugin-0      2/2       Running   0          1h
etcd-chvm79wqr4                        1/1       Running   0          1h
etcd-ndccs6pkq7                        1/1       Running   0          1h
etcd-operator-54bbdfc55d-vkxh6         1/1       Running   0          1h
etcd-rrfgwq5xkd                        1/1       Running   0          29m
kube1-0                                1/1       Running   2          19m
kube2-0                                1/1       Running   0          1h
kube3-0                                1/1       Running   0          1h

@Madhu-1
Member

Madhu-1 commented Oct 11, 2018

RCA:

We are not persisting the run directory in the glusterd2 stateful set, which is where the bricks get created. Once the glusterd2 pod restarts, the contents of this directory vanish; because of this the bricks do not come back up, and we are not even able to delete the volume.
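If persisting the run directory turns out to be the fix here, a rough sketch of what that could look like in the glusterd2 StatefulSet is below. This is a hypothetical fragment, not the actual GCS manifest; the container name, volume name and size are assumptions.

# Hypothetical StatefulSet fragment: back /var/run/glusterd2 with a per-pod
# persistent volume so brick directories/mounts survive a pod restart.
spec:
  template:
    spec:
      containers:
      - name: glusterd2                # assumed container name
        volumeMounts:
        - name: glusterd2-run
          mountPath: /var/run/glusterd2
  volumeClaimTemplates:
  - metadata:
      name: glusterd2-run
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi                 # assumed size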

@JohnStrunk JohnStrunk added the bug Something isn't working label Oct 11, 2018
@aravindavk
Member

We are not persisting the run directory in the glusterd2 stateful set, which is where the bricks get created. Once the glusterd2 pod restarts, the contents of this directory vanish; because of this the bricks do not come back up, and we are not even able to delete the volume.

As per my understanding, bricks are mounted in the run directory, and a glusterd2 restart will remount these bricks on start. Are there any failures while mounting the bricks? We don't need the run dir to be persistent; am I missing something here?
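One way to check the remount behaviour described here (generic commands, not from the original report; the log file path is an assumption about the glusterd2 defaults):

# Inside the restarted gd2 pod: are the brick paths mounted again?
mount | grep /var/run/glusterd2/bricks || echo "no brick mounts present"
# Look for remount errors during glusterd2 startup (assumed log location).
grep -i mount /var/log/glusterd2/glusterd2.log | tail -n 20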

@Madhu-1
Member

Madhu-1 commented Oct 25, 2018

I need to analyze the bug again; this may take time. Moving out of GCS/0.2.

@Madhu-1 Madhu-1 added GCS/0.3 GCS 0.3 release and removed GCS/0.2 GCS 0.2 release labels Oct 25, 2018
@Madhu-1
Member

Madhu-1 commented Nov 13, 2018

Not able to reproduce with the latest build. @ksandha, please verify this bug with the latest build.

@Madhu-1 Madhu-1 added GCS/0.4 GCS 0.4 release and removed bug Something isn't working GCS/0.3 GCS 0.3 release labels Nov 13, 2018
@Madhu-1
Member

Madhu-1 commented Nov 15, 2018

@ksandha PTAL

@ksandha
Author

ksandha commented Nov 20, 2018

Couldn't hit the issue with the latest build. @Madhu-1, please take appropriate action.

@Madhu-1
Member

Madhu-1 commented Nov 20, 2018

Closing as per @ksandha's comment.

@Madhu-1 Madhu-1 closed this as completed Nov 20, 2018