
Pod is not able to mount openebs PVC #1688

Closed
rtnpro opened this issue Jul 25, 2018 · 47 comments

@rtnpro

rtnpro commented Jul 25, 2018

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

What happened:

While deploying a pod with an RWO PVC from an OpenEBS StorageClass, the pod got stuck in the ContainerCreating stage indefinitely because it could not mount the OpenEBS volume.

Events:
  Type     Reason                  Age                From                     Message
  ----     ------                  ----               ----                     -------
  Warning  FailedScheduling        58s (x2 over 59s)  default-scheduler        pod has unbound PersistentVolumeClaims (repeated 4 times)
  Normal   Scheduled               58s                default-scheduler        Successfully assigned redis-master-0 to node0
  Normal   SuccessfulAttachVolume  58s                attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-a036d681-8fd4-11e8-ad96-de1a202c9007"
  Normal   SuccessfulMountVolume   55s                kubelet, node0           MountVolume.SetUp succeeded for volume "default-token-ngjhh"
  Warning  FailedMount             24s (x4 over 43s)  kubelet, node0           MountVolume.WaitForAttach failed for volume "pvc-a036d681-8fd4-11e8-ad96-de1a202c9007" : failed to get any path for iscsi disk, last err seen:
iscsi: failed to sendtargets to portal 10.233.27.8:3260 output: iscsiadm: cannot make connection to 10.233.27.8: Connection refused
iscsiadm: cannot make connection to 10.233.27.8: Connection refused
iscsiadm: cannot make connection to 10.233.27.8: Connection refused
iscsiadm: cannot make connection to 10.233.27.8: Connection refused
iscsiadm: cannot make connection to 10.233.27.8: Connection refused
iscsiadm: cannot make connection to 10.233.27.8: Connection refused
iscsiadm: connection login retries (reopen_max) 5 exceeded
iscsiadm: No portals found
, err exit status 21
  Warning  FailedMount  8s (x2 over 17s)  kubelet, node0  MountVolume.MountDevice failed for volume "pvc-a036d681-8fd4-11e8-ad96-de1a202c9007" : executable file not found in $PATH

Please note that 10.233.27.8 is the IP of a service:

pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-ctrl-svc   ClusterIP   10.233.27.8     <none>        3260/TCP,9501/TCP   22m       openebs/controller=jiva-controller,vsm=pvc-a036d681-8fd4-11e8-ad96-de1a202c9007

I am able to access 10.233.27.8:3260 from my k8s worker nodes:

[root@node0 ~]# telnet 10.233.27.8 3260
Trying 10.233.27.8...
Connected to 10.233.27.8.
Escape character is '^]'.
[root@node1 ~]# telnet 10.233.27.8 3260
Trying 10.233.27.8...
Connected to 10.233.27.8.
Escape character is '^]'.

What you expected to happen:

The pod should have been able to mount the OpenEBS volume properly and start.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

kubectl get nodes -o wide

NAME      STATUS    ROLES     AGE       VERSION   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
master0   Ready     master    22h       v1.10.4   <none>        CentOS Linux 7 (Core)   4.4.127-mainline-rev1   docker://1.13.1
node0     Ready     node      22h       v1.10.4   <none>        CentOS Linux 7 (Core)   4.4.127-mainline-rev1   docker://1.13.1
node1     Ready     node      22h       v1.10.4   <none>        CentOS Linux 7 (Core)   4.4.127-mainline-rev1   docker://1.13.1
proxy0    Ready     node      22h       v1.10.4   <none>        CentOS Linux 7 (Core)   4.4.127-mainline-rev1   docker://1.13.1
proxy1    Ready     node      22h       v1.10.4   <none>        CentOS Linux 7 (Core)   4.4.127-mainline-rev1   docker://1.13.1

kubectl get pods --all-namespaces -o wide

NAMESPACE          NAME                                                              READY     STATUS              RESTARTS   AGE       IP              NODE
cert-manager       cert-manager-7cdb98dbb5-94hx7                                     1/1       Running             0          16h       10.233.77.4     node0
external-ingress   external-ingress-nginx-ingress-controller-5bcc4f94c-n2xdm         1/1       Running             0          15h       172.18.1.12     proxy1
external-ingress   external-ingress-nginx-ingress-default-backend-56b446bb8b-bhlxw   1/1       Running             0          15h       10.233.127.10   node1
internal-ingress   internal-ingress-nginx-ingress-controller-75bbc77f64-xkjnv        1/1       Running             0          16h       172.18.1.11     proxy0
internal-ingress   internal-ingress-nginx-ingress-default-backend-699dddcfd4-kkzw9   1/1       Running             0          16h       10.233.127.7    node1
kube-system        calico-kube-controllers-bf8b6897b-fls7v                           1/1       Running             0          22h       172.18.1.20     node0
kube-system        canal-node-f8ljz                                                  2/2       Running             0          22h       172.18.1.21     node1
kube-system        canal-node-m5qjb                                                  2/2       Running             0          22h       172.18.1.1      master0
kube-system        canal-node-mjp4q                                                  2/2       Running             0          22h       172.18.1.12     proxy1
kube-system        canal-node-nzdvs                                                  2/2       Running             0          22h       172.18.1.11     proxy0
kube-system        canal-node-vgvmt                                                  2/2       Running             0          22h       172.18.1.20     node0
kube-system        heapster-heapster-76579d7f67-6wpn8                                2/2       Running             0          16h       10.233.127.9    node1
kube-system        kube-apiserver-master0                                            1/1       Running             0          22h       172.18.1.1      master0
kube-system        kube-controller-manager-master0                                   1/1       Running             0          22h       172.18.1.1      master0
kube-system        kube-dns-7bd4d5fbb6-dc8f9                                         3/3       Running             0          22h       10.233.127.5    node1
kube-system        kube-dns-7bd4d5fbb6-hb8s8                                         3/3       Running             0          22h       10.233.77.2     node0
kube-system        kube-proxy-7mzcn                                                  1/1       Running             0          22h       172.18.1.1      master0
kube-system        kube-proxy-gw2sc                                                  1/1       Running             0          22h       172.18.1.21     node1
kube-system        kube-proxy-m8s77                                                  1/1       Running             0          22h       172.18.1.20     node0
kube-system        kube-proxy-q6mrf                                                  1/1       Running             0          22h       172.18.1.11     proxy0
kube-system        kube-proxy-tnqlk                                                  1/1       Running             0          22h       172.18.1.12     proxy1
kube-system        kube-scheduler-master0                                            1/1       Running             0          22h       172.18.1.1      master0
kube-system        kubedns-autoscaler-679b8b455-mrb42                                1/1       Running             0          22h       10.233.127.4    node1
kube-system        kubernetes-dashboard-567bcb664f-j4rvx                             1/1       Running             0          16h       10.233.127.8    node1
kube-system        nginx-proxy-node0                                                 1/1       Running             0          22h       172.18.1.20     node0
kube-system        nginx-proxy-node1                                                 1/1       Running             0          22h       172.18.1.21     node1
kube-system        nginx-proxy-proxy0                                                1/1       Running             0          22h       172.18.1.11     proxy0
kube-system        nginx-proxy-proxy1                                                1/1       Running             0          22h       172.18.1.12     proxy1
kube-system        tiller-deploy-5c688d5f9b-knh62                                    1/1       Running             0          16h       10.233.77.3     node0
odoo-rtnpro-com    pvc-9b69e197-8f73-11e8-ad96-de1a202c9007-ctrl-766f75858b-jlkvz    2/2       Running             0          12h       10.233.77.31    node0
odoo-rtnpro-com    pvc-9b69e197-8f73-11e8-ad96-de1a202c9007-rep-7f5ddd46cd-htp5x     1/1       Running             0          12h       10.233.127.22   node1
odoo-rtnpro-com    pvc-9b69e197-8f73-11e8-ad96-de1a202c9007-rep-7f5ddd46cd-qf9h8     1/1       Running             0          12h       10.233.77.33    node0
odoo-rtnpro-com    pvc-9b6cfeeb-8f73-11e8-ad96-de1a202c9007-ctrl-865db58655-dc64c    1/1       Running             0          12h       10.233.77.30    node0
odoo-rtnpro-com    pvc-9b6cfeeb-8f73-11e8-ad96-de1a202c9007-rep-6f8f47bc5f-gtx4m     1/1       Running             0          12h       10.233.127.23   node1
odoo-rtnpro-com    pvc-9b6cfeeb-8f73-11e8-ad96-de1a202c9007-rep-6f8f47bc5f-t2cgm     1/1       Running             0          12h       10.233.77.32    node0
openebs            openebs-apiserver-6989d8765c-2hw6r                                1/1       Running             0          37m       10.233.77.34    node0
openebs            openebs-provisioner-65c98fd5d4-8wmjt                              1/1       Running             0          37m       10.233.127.24   node1
redis              pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-ctrl-6c77c545d9-zsl9v    2/2       Running             0          30m       10.233.127.25   node1
redis              pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-rep-dc78bb788-khbfs      1/1       Running             0          30m       10.233.77.36    node0
redis              pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-rep-dc78bb788-nrflc      1/1       Running             0          30m       10.233.127.26   node1
redis              redis-master-0                                                    0/1       ContainerCreating   0          30m       <none>          node0
redis              redis-slave-b99c475c-mwvx6                                        0/1       CrashLoopBackOff    10         30m       10.233.77.35    node0

kubectl get services --all-namespaces -o wide

NAMESPACE          NAME                                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE       SELECTOR
default            kubernetes                                          ClusterIP   10.233.0.1      <none>        443/TCP                      22h       <none>
external-ingress   external-ingress-nginx-ingress-controller           NodePort    10.233.13.107   <none>        80:31347/TCP,443:32306/TCP   15h       app=nginx-ingress,component=controller,release=external-ingress
external-ingress   external-ingress-nginx-ingress-default-backend      ClusterIP   10.233.19.172   <none>        80/TCP                       15h       app=nginx-ingress,component=default-backend,release=external-ingress
internal-ingress   internal-ingress-nginx-ingress-controller           NodePort    10.233.34.153   <none>        80:31990/TCP,443:31770/TCP   16h       app=nginx-ingress,component=controller,release=internal-ingress
internal-ingress   internal-ingress-nginx-ingress-default-backend      ClusterIP   10.233.57.149   <none>        80/TCP                       16h       app=nginx-ingress,component=default-backend,release=internal-ingress
kube-system        heapster                                            ClusterIP   10.233.43.111   <none>        8082/TCP                     16h       app=heapster-heapster
kube-system        kube-dns                                            ClusterIP   10.233.0.3      <none>        53/UDP,53/TCP                22h       k8s-app=kube-dns
kube-system        kubernetes-dashboard                                ClusterIP   10.233.44.112   <none>        443/TCP                      16h       app=kubernetes-dashboard,release=kubernetes-dashboard
kube-system        tiller-deploy                                       ClusterIP   10.233.60.11    <none>        44134/TCP                    16h       app=helm,name=tiller
odoo-rtnpro-com    pvc-9b69e197-8f73-11e8-ad96-de1a202c9007-ctrl-svc   ClusterIP   10.233.10.112   <none>        3260/TCP,9501/TCP            12h       openebs/controller=jiva-controller,vsm=pvc-9b69e197-8f73-11e8-ad96-de1a202c9007
odoo-rtnpro-com    pvc-9b6cfeeb-8f73-11e8-ad96-de1a202c9007-ctrl-svc   ClusterIP   10.233.34.232   <none>        3260/TCP,9501/TCP            12h       openebs/controller=jiva-controller,vsm=pvc-9b6cfeeb-8f73-11e8-ad96-de1a202c9007
openebs            openebs-apiservice                                  ClusterIP   10.233.30.254   <none>        5656/TCP                     38m       app=openebs,component=apiserver,release=openebs
redis              pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-ctrl-svc   ClusterIP   10.233.27.8     <none>        3260/TCP,9501/TCP            31m       openebs/controller=jiva-controller,vsm=pvc-a036d681-8fd4-11e8-ad96-de1a202c9007
redis              redis-master                                        ClusterIP   10.233.35.110   <none>        6379/TCP                     31m       app=redis,release=redis,role=master
redis              redis-slave                                         ClusterIP   10.233.59.128   <none>        6379/TCP                     31m       app=redis,release=redis,role=slave

kubectl get sc

NAME          PROVISIONER                    AGE
openebs-app   openebs.io/provisioner-iscsi   33m
openebs-db    openebs.io/provisioner-iscsi   33m

kubectl get pv

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                        STORAGECLASS   REASON    AGE
pvc-9b69e197-8f73-11e8-ad96-de1a202c9007   4Gi        RWO            Delete           Released   odoo-rtnpro-com/odoo-rtnpro-com-postgresql   openebs-db               12h
pvc-9b6cfeeb-8f73-11e8-ad96-de1a202c9007   2Gi        RWO            Delete           Released   odoo-rtnpro-com/odoo-rtnpro-com-odoo         openebs-app              12h
pvc-a036d681-8fd4-11e8-ad96-de1a202c9007   2Gi        RWO            Delete           Bound      redis/redis-data-redis-master-0              openebs-db               33m

kubectl get pvc --all-namespaces

NAMESPACE   NAME                        STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
redis       redis-data-redis-master-0   Bound     pvc-a036d681-8fd4-11e8-ad96-de1a202c9007   2Gi        RWO            openebs-db     33m

cat /etc/os-release

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

Kernel

Build from Scaleway: 4.4.127-mainline-rev1

Installed the Helm chart stable/redis with the following values.yaml:

master:
  persistence:
    storageClass: openebs-db
    size: 2Gi

OpenEBS StorageClass and StoragePool definitions:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: openebs-db
provisioner: openebs.io/provisioner-iscsi
parameters:
  openebs.io/storage-pool: "default"
  openebs.io/jiva-replica-count: "2"
  openebs.io/volume-monitor: "true"
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: openebs-app
provisioner: openebs.io/provisioner-iscsi
parameters:
  openebs.io/storage-pool: "default"
  openebs.io/jiva-replica-count: "2"
---
apiVersion: openebs.io/v1alpha1
kind: StoragePool
metadata:
  name: default
  type: hostdir
spec:
  path: /var/openebs
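
For reference, the PVC that the Redis chart created against the openebs-db class looks roughly like this (a sketch reconstructed from the kubectl get pvc output above, not copied from the cluster):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data-redis-master-0
  namespace: redis
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: openebs-db
  resources:
    requests:
      storage: 2Gi
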
@kmova
Member

kmova commented Jul 25, 2018

Thanks @rtnpro for creating the issue with these detailed steps!

Could you run the following commands on both nodes (node0 and node1) and share the output, please?

iscsiadm -V
iscsiadm -m discovery -t sendtargets -p 10.233.27.8:3260
sudo docker ps | grep kubelet

@rtnpro
Author

rtnpro commented Jul 25, 2018

@kmova Here you go:

[root@node0 ~]# iscsiadm -V
iscsiadm version 6.2.0.874-7
[root@node0 ~]# iscsiadm -m discovery -t sendtargets -p 10.233.27.8:3260
10.233.27.8:3260,1 iqn.2016-09.com.openebs.jiva:pvc-a036d681-8fd4-11e8-ad96-de1a202c9007
[root@node0 ~]# docker ps | grep kubelet

However, kubespray runs the kubelet binary directly on the host instead of in a container.

[root@node0 ~]# cat /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Wants=docker.socket

[Service]
User=root
EnvironmentFile=-/etc/kubernetes/kubelet.env
ExecStartPre=-/bin/mkdir -p /var/lib/kubelet/volume-plugins
ExecStart=/usr/local/bin/kubelet \
		$KUBE_LOGTOSTDERR \
		$KUBE_LOG_LEVEL \
		$KUBELET_API_SERVER \
		$KUBELET_ADDRESS \
		$KUBELET_PORT \
		$KUBELET_HOSTNAME \
		$KUBE_ALLOW_PRIV \
		$KUBELET_ARGS \
		$DOCKER_SOCKET \
		$KUBELET_NETWORK_PLUGIN \
		$KUBELET_VOLUME_PLUGIN \
		$KUBELET_CLOUDPROVIDER
Restart=always
RestartSec=10s

[Install]
WantedBy=multi-user.target
[root@node0 ~]# ls -lah /usr/local/bin/kubelet
-rwxr-xr-x 1 root root 255M Jul 24 08:42 /usr/local/bin/kubelet

Here are the containers running on the node:

[root@node0 ~]# docker ps
CONTAINER ID        IMAGE                                                                                                              COMMAND                  CREATED             STATUS              PORTS               NAMES
bbea14133d60        5523573c02e4                                                                                                       "/app-entrypoint.s..."   3 hours ago         Up 3 hours                              k8s_odoo-rtnpro-com-odoo_odoo-rtnpro-com-odoo-7557967476-cnxg8_odoo-rtnpro-com_e9070b49-8fe4-11e8-ad96-de1a202c9007_0
29bfc37c71f4        b3b8a2229953                                                                                                       "docker-entrypoint..."   3 hours ago         Up 3 hours                              k8s_odoo-rtnpro-com-postgresql_odoo-rtnpro-com-postgresql-867fcf89cb-4s6v8_odoo-rtnpro-com_e903c527-8fe4-11e8-ad96-de1a202c9007_0
3bfc6c184b30        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 3 hours ago         Up 3 hours                              k8s_POD_odoo-rtnpro-com-odoo-7557967476-cnxg8_odoo-rtnpro-com_e9070b49-8fe4-11e8-ad96-de1a202c9007_0
8a3dab8b5202        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 3 hours ago         Up 3 hours                              k8s_POD_odoo-rtnpro-com-postgresql-867fcf89cb-4s6v8_odoo-rtnpro-com_e903c527-8fe4-11e8-ad96-de1a202c9007_0
340175aa5701        7b1a9e8f1860                                                                                                       "launch replica --..."   5 hours ago         Up 5 hours                              k8s_pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-rep-con_pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-rep-dc78bb788-khbfs_redis_a09dff44-8fd4-11e8-ad96-de1a202c9007_0
32de9fbf0d8d        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 5 hours ago         Up 5 hours                              k8s_POD_pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-rep-dc78bb788-khbfs_redis_a09dff44-8fd4-11e8-ad96-de1a202c9007_0
195e30b48101        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 5 hours ago         Up 5 hours                              k8s_POD_redis-slave-b99c475c-mwvx6_redis_a0356037-8fd4-11e8-ad96-de1a202c9007_0
531e0a5617d6        a5cd60514b80                                                                                                       "/bin/sh -c 'entry..."   5 hours ago         Up 5 hours                              k8s_openebs-apiserver_openebs-apiserver-6989d8765c-2hw6r_openebs_a5d1f2b3-8fd3-11e8-ad96-de1a202c9007_0
7a2d8512262a        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 5 hours ago         Up 5 hours                              k8s_POD_openebs-apiserver-6989d8765c-2hw6r_openebs_a5d1f2b3-8fd3-11e8-ad96-de1a202c9007_0
ef6d09816af5        quay.io/jetstack/cert-manager-controller@sha256:3ff035464d9349f5b1d23a2fef9c9b8419026671506a5be5cbae6e958ac46802   "/usr/bin/cert-man..."   20 hours ago        Up 20 hours                             k8s_cert-manager_cert-manager-7cdb98dbb5-94hx7_cert-manager_3a7a4394-8f50-11e8-ad96-de1a202c9007_0
0004839437f1        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 20 hours ago        Up 20 hours                             k8s_POD_cert-manager-7cdb98dbb5-94hx7_cert-manager_3a7a4394-8f50-11e8-ad96-de1a202c9007_0
2020087e5ee2        gcr.io/kubernetes-helm/tiller@sha256:417aae19a0709075df9cc87e2fcac599b39d8f73ac95e668d9627fec9d341af2              "/tiller"                20 hours ago        Up 20 hours                             k8s_tiller_tiller-deploy-5c688d5f9b-knh62_kube-system_c54712a1-8f4f-11e8-ad96-de1a202c9007_0
dca7447d4871        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 20 hours ago        Up 20 hours                             k8s_POD_tiller-deploy-5c688d5f9b-knh62_kube-system_c54712a1-8f4f-11e8-ad96-de1a202c9007_0
92b0042ccbcb        8a7739f672b4                                                                                                       "/sidecar --v=2 --..."   26 hours ago        Up 26 hours                             k8s_sidecar_kube-dns-7bd4d5fbb6-hb8s8_kube-system_226b66f0-8f1f-11e8-ad96-de1a202c9007_0
8eff9c763ad3        6816817d9dce                                                                                                       "/dnsmasq-nanny -v..."   26 hours ago        Up 26 hours                             k8s_dnsmasq_kube-dns-7bd4d5fbb6-hb8s8_kube-system_226b66f0-8f1f-11e8-ad96-de1a202c9007_0
6e50a30fc7f3        55ffe31ac578                                                                                                       "/kube-dns --domai..."   26 hours ago        Up 26 hours                             k8s_kubedns_kube-dns-7bd4d5fbb6-hb8s8_kube-system_226b66f0-8f1f-11e8-ad96-de1a202c9007_0
c9449c280010        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 26 hours ago        Up 26 hours                             k8s_POD_kube-dns-7bd4d5fbb6-hb8s8_kube-system_226b66f0-8f1f-11e8-ad96-de1a202c9007_0
94538aca12e4        34aebe64326d                                                                                                       "/usr/bin/kube-con..."   26 hours ago        Up 26 hours                             k8s_calico-kube-controllers_calico-kube-controllers-bf8b6897b-fls7v_kube-system_925282c3-8f1e-11e8-ad96-de1a202c9007_0
f4548e012fe5        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 26 hours ago        Up 26 hours                             k8s_POD_calico-kube-controllers-bf8b6897b-fls7v_kube-system_925282c3-8f1e-11e8-ad96-de1a202c9007_0
9e047f7b5909        e96a297310fd                                                                                                       "start_runit"            26 hours ago        Up 26 hours                             k8s_calico-node_canal-node-vgvmt_kube-system_79f2a701-8f1e-11e8-ad96-de1a202c9007_0
f42692adf0d8        f0fad859c909                                                                                                       "/bin/sh -c '/opt/..."   26 hours ago        Up 26 hours                             k8s_flannel_canal-node-vgvmt_kube-system_79f2a701-8f1e-11e8-ad96-de1a202c9007_0
d7aafb82bfcb        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 26 hours ago        Up 26 hours                             k8s_POD_canal-node-vgvmt_kube-system_79f2a701-8f1e-11e8-ad96-de1a202c9007_0
5d2751aa42ee        3ea2e2baccf4                                                                                                       "/usr/local/bin/ku..."   26 hours ago        Up 26 hours                             k8s_kube-proxy_kube-proxy-m8s77_kube-system_3a1f5678-8f1e-11e8-ad96-de1a202c9007_0
18b8d3dfb056        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 26 hours ago        Up 26 hours                             k8s_POD_kube-proxy-m8s77_kube-system_3a1f5678-8f1e-11e8-ad96-de1a202c9007_0
8f40f07cef28        ae513a47849c                                                                                                       "nginx -g 'daemon ..."   26 hours ago        Up 26 hours                             k8s_nginx-proxy_nginx-proxy-node0_kube-system_d915dfa38e37492e76f37f0f115895fd_0
3ba9057a068d        gcr.io/google_containers/pause-amd64:3.0                                                                           "/pause"                 26 hours ago        Up 26 hours                             k8s_POD_nginx-proxy-node0_kube-system_d915dfa38e37492e76f37f0f115895fd_0

@kmova
Member

kmova commented Jul 25, 2018

This shows that iscsiadm is able to log in from node0 and that kubelet is running directly on the node.

A couple more troubleshooting steps from node0:

iscsiadm -m node session
iscsiadm -m node -l 
iscsiadm -m node session
kubectl logs -f -n redis pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-ctrl-6c77c545d9-zsl9v
iscsiadm -m node -u
iscsiadm -m node session

@rtnpro
Author

rtnpro commented Jul 25, 2018

@kmova Here you go:

[root@node0 ~]# iscsiadm -m node session

10.233.27.8:3260,1 iqn.2016-09.com.openebs.jiva:pvc-a036d681-8fd4-11e8-ad96-de1a202c9007
10.233.41.130:3260,1 iqn.2016-09.com.openebs.jiva:pvc-07712a22-8f61-11e8-ad96-de1a202c9007
10.233.2.248:3260,1 iqn.2016-09.com.openebs.jiva:pvc-62ced7c5-8f6d-11e8-ad96-de1a202c9007
10.233.35.211:3260,1 iqn.2016-09.com.openebs.jiva:pvc-278fdd6c-8f72-11e8-ad96-de1a202c9007
10.233.12.71:3260,1 iqn.2016-09.com.openebs.jiva:pvc-62d19f5d-8f6d-11e8-ad96-de1a202c9007
10.233.10.112:3260,1 iqn.2016-09.com.openebs.jiva:pvc-9b69e197-8f73-11e8-ad96-de1a202c9007

[root@node0 ~]# iscsiadm -m node -l

[root@node0 ~]# iscsiadm -m node session

10.233.27.8:3260,1 iqn.2016-09.com.openebs.jiva:pvc-a036d681-8fd4-11e8-ad96-de1a202c9007
10.233.41.130:3260,1 iqn.2016-09.com.openebs.jiva:pvc-07712a22-8f61-11e8-ad96-de1a202c9007
10.233.2.248:3260,1 iqn.2016-09.com.openebs.jiva:pvc-62ced7c5-8f6d-11e8-ad96-de1a202c9007
10.233.35.211:3260,1 iqn.2016-09.com.openebs.jiva:pvc-278fdd6c-8f72-11e8-ad96-de1a202c9007
10.233.12.71:3260,1 iqn.2016-09.com.openebs.jiva:pvc-62d19f5d-8f6d-11e8-ad96-de1a202c9007
10.233.10.112:3260,1 iqn.2016-09.com.openebs.jiva:pvc-9b69e197-8f73-11e8-ad96-de1a202c9007

[root@node0 ~]# iscsiadm -m node -u

Logging out of session [sid: 1, target: iqn.2016-09.com.openebs.jiva:pvc-07712a22-8f61-11e8-ad96-de1a202c9007, portal: 10.233.41.130,3260]
Logging out of session [sid: 2, target: iqn.2016-09.com.openebs.jiva:pvc-62d19f5d-8f6d-11e8-ad96-de1a202c9007, portal: 10.233.12.71,3260]
Logging out of session [sid: 3, target: iqn.2016-09.com.openebs.jiva:pvc-62ced7c5-8f6d-11e8-ad96-de1a202c9007, portal: 10.233.2.248,3260]
Logging out of session [sid: 4, target: iqn.2016-09.com.openebs.jiva:pvc-278fdd6c-8f72-11e8-ad96-de1a202c9007, portal: 10.233.35.211,3260]
Logging out of session [sid: 5, target: iqn.2016-09.com.openebs.jiva:pvc-9b69e197-8f73-11e8-ad96-de1a202c9007, portal: 10.233.10.112,3260]
Logging out of session [sid: 6, target: iqn.2016-09.com.openebs.jiva:pvc-a036d681-8fd4-11e8-ad96-de1a202c9007, portal: 10.233.27.8,3260]
iscsiadm: Could not logout of [sid: 1, target: iqn.2016-09.com.openebs.jiva:pvc-07712a22-8f61-11e8-ad96-de1a202c9007, portal: 10.233.41.130,3260].
iscsiadm: initiator reported error (9 - internal error)
iscsiadm: Could not logout of [sid: 2, target: iqn.2016-09.com.openebs.jiva:pvc-62d19f5d-8f6d-11e8-ad96-de1a202c9007, portal: 10.233.12.71,3260].
iscsiadm: initiator reported error (2 - session not found)
iscsiadm: Could not logout of [sid: 3, target: iqn.2016-09.com.openebs.jiva:pvc-62ced7c5-8f6d-11e8-ad96-de1a202c9007, portal: 10.233.2.248,3260].
iscsiadm: initiator reported error (2 - session not found)
Logout of [sid: 4, target: iqn.2016-09.com.openebs.jiva:pvc-278fdd6c-8f72-11e8-ad96-de1a202c9007, portal: 10.233.35.211,3260] successful.
Logout of [sid: 5, target: iqn.2016-09.com.openebs.jiva:pvc-9b69e197-8f73-11e8-ad96-de1a202c9007, portal: 10.233.10.112,3260] successful.
Logout of [sid: 6, target: iqn.2016-09.com.openebs.jiva:pvc-a036d681-8fd4-11e8-ad96-de1a202c9007, portal: 10.233.27.8,3260] successful.
iscsiadm: Could not logout of all requested sessions

[root@node0 ~]# iscsiadm -m node session

10.233.27.8:3260,1 iqn.2016-09.com.openebs.jiva:pvc-a036d681-8fd4-11e8-ad96-de1a202c9007
10.233.41.130:3260,1 iqn.2016-09.com.openebs.jiva:pvc-07712a22-8f61-11e8-ad96-de1a202c9007
10.233.2.248:3260,1 iqn.2016-09.com.openebs.jiva:pvc-62ced7c5-8f6d-11e8-ad96-de1a202c9007
10.233.35.211:3260,1 iqn.2016-09.com.openebs.jiva:pvc-278fdd6c-8f72-11e8-ad96-de1a202c9007
10.233.12.71:3260,1 iqn.2016-09.com.openebs.jiva:pvc-62d19f5d-8f6d-11e8-ad96-de1a202c9007
10.233.10.112:3260,1 iqn.2016-09.com.openebs.jiva:pvc-9b69e197-8f73-11e8-ad96-de1a202c9007

kubectl logs -f -n redis pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-ctrl-6c77c545d9-zsl9v pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-ctrl-con

mount: permission denied
time="2018-07-25T06:33:29Z" level=info msg="Listening on :9501"
STATE = closed
10.233.77.36 - - [25/Jul/2018:06:34:04 +0000] "POST /v1/register HTTP/1.1" 200 0
time="2018-07-25T06:34:04Z" level=info msg="Connecting to remote: 10.233.77.36:9502"
time="2018-07-25T06:34:04Z" level=info msg="Opening: 10.233.77.36:9502"
time="2018-07-25T06:34:04Z" level=info msg="Adding backend: tcp://10.233.77.36:9502"
time="2018-07-25T06:34:04Z" level=info msg="Set replica tcp://10.233.77.36:9502 to mode RW"
time="2018-07-25T06:34:04Z" level=info msg="Start monitoring tcp://10.233.77.36:9502"
time="2018-07-25T06:34:04Z" level=info msg="Get backend tcp://10.233.77.36:9502 revision counter 0"
time="2018-07-25T06:34:04Z" level=info msg="SCSI device created"
time="2018-07-25T06:34:04Z" level=info msg="Listening ..."
10.233.77.36 - - [25/Jul/2018:06:34:04 +0000] "POST /v1/volumes/cHZjLWEwMzZkNjgxLThmZDQtMTFlOC1hZDk2LWRlMWEyMDJjOTAwNw==?action=start HTTP/1.1" 200 631
time="2018-07-25T06:34:04Z" level=info msg="Connecting to remote: 10.233.127.26:9502"
time="2018-07-25T06:34:04Z" level=info msg="Opening: 10.233.127.26:9502"
time="2018-07-25T06:34:04Z" level=info msg="Snapshot: 10.233.77.36:9502 fc88b457-309c-419f-b63d-302424a64bd2 UserCreated false Created at 2018-07-25T06:34:04Z"
time="2018-07-25T06:34:04Z" level=info msg="Snapshot: 10.233.127.26:9502 fc88b457-309c-419f-b63d-302424a64bd2 UserCreated false Created at 2018-07-25T06:34:04Z"
time="2018-07-25T06:34:04Z" level=info msg="Adding backend: tcp://10.233.127.26:9502"
time="2018-07-25T06:34:04Z" level=info msg="Start monitoring tcp://10.233.127.26:9502"
10.233.127.26 - - [25/Jul/2018:06:34:04 +0000] "POST /v1/replicas HTTP/1.1" 200 426
10.233.127.26 - - [25/Jul/2018:06:34:04 +0000] "GET /v1/replicas/dGNwOi8vMTAuMjMzLjEyNy4yNjo5NTAy HTTP/1.1" 200 426
time="2018-07-25T06:34:04Z" level=info msg="Set revision counter of 10.233.127.26:9502 to : 0"
time="2018-07-25T06:34:04Z" level=info msg="Set backend tcp://10.233.127.26:9502 revision counter to 0"
time="2018-07-25T06:34:04Z" level=info msg="Synchronizing volume-head-001.img.meta to volume-head-001.img.meta@10.233.127.26:9700"
time="2018-07-25T06:34:04Z" level=info msg="10.233.127.25:3260"
time="2018-07-25T06:34:04Z" level=info msg="Accepting ..."
time="2018-07-25T06:34:04Z" level=info msg="connection is connected from 10.233.77.0:33236...\n"
time="2018-07-25T06:34:04Z" level=info msg="Listening ..."
time="2018-07-25T06:34:04Z" level=warning msg="unexpected connection state: full feature"
time="2018-07-25T06:34:04Z" level=error msg="read tcp 10.233.127.25:3260->10.233.77.0:33236: read: connection reset by peer"
time="2018-07-25T06:34:04Z" level=info msg="10.233.127.25:3260"
time="2018-07-25T06:34:04Z" level=info msg="Accepting ..."
time="2018-07-25T06:34:04Z" level=info msg="connection is connected from 10.233.77.0:33238...\n"
time="2018-07-25T06:34:04Z" level=info msg="Listening ..."
time="2018-07-25T06:34:04Z" level=info msg="New Session initiator name:iqn.1994-05.com.redhat:62a78a2fe1,target name:iqn.2016-09.com.openebs.jiva:pvc-a036d681-8fd4-11e8-ad96-de1a202c9007,ISID:0x23d000006"
10.233.127.26 - - [25/Jul/2018:06:34:04 +0000] "POST /v1/replicas/dGNwOi8vMTAuMjMzLjEyNy4yNjo5NTAy?action=preparerebuild HTTP/1.1" 200 235
time="2018-07-25T06:34:04Z" level=info msg="Done synchronizing volume-head-001.img.meta to volume-head-001.img.meta@10.233.127.26:9700"
time="2018-07-25T06:34:04Z" level=warning msg="check condition"
time="2018-07-25T06:34:05Z" level=error msg="non support"
time="2018-07-25T06:34:05Z" level=warning msg="check condition"
10.233.127.26 - - [25/Jul/2018:06:34:05 +0000] "GET /v1/replicas/dGNwOi8vMTAuMjMzLjEyNy4yNjo5NTAy HTTP/1.1" 200 426
time="2018-07-25T06:34:05Z" level=info msg="Get backend tcp://10.233.77.36:9502 revision counter 0"
time="2018-07-25T06:34:05Z" level=info msg="Set revision counter of 10.233.127.26:9502 to : 0"
time="2018-07-25T06:34:05Z" level=info msg="Set backend tcp://10.233.127.26:9502 revision counter to 0"
time="2018-07-25T06:34:05Z" level=info msg="Set replica tcp://10.233.127.26:9502 to mode RW"
time="2018-07-25T06:34:05Z" level=info msg="Update peer details of 10.233.77.36:9502 "
time="2018-07-25T06:34:05Z" level=info msg="Update peer details of 10.233.127.26:9502 "
10.233.127.26 - - [25/Jul/2018:06:34:05 +0000] "POST /v1/replicas/dGNwOi8vMTAuMjMzLjEyNy4yNjo5NTAy?action=verifyrebuild HTTP/1.1" 200 426
time="2018-07-25T06:57:57Z" level=info msg="10.233.127.25:3260"
time="2018-07-25T06:57:57Z" level=info msg="Accepting ..."
time="2018-07-25T06:57:57Z" level=info msg="connection is connected from 10.233.77.0:36692...\n"
time="2018-07-25T06:57:57Z" level=info msg="Listening ..."
time="2018-07-25T06:58:08Z" level=error msg=EOF
time="2018-07-25T06:59:52Z" level=info msg="10.233.127.25:3260"
time="2018-07-25T06:59:52Z" level=info msg="Accepting ..."
time="2018-07-25T06:59:52Z" level=info msg="connection is connected from 10.1.177.99:60176...\n"
time="2018-07-25T06:59:52Z" level=info msg="Listening ..."
time="2018-07-25T09:11:39Z" level=error msg="unexpected EOF"
time="2018-07-25T11:36:42Z" level=info msg="10.233.127.25:3260"
time="2018-07-25T11:36:42Z" level=info msg="Accepting ..."
time="2018-07-25T11:36:42Z" level=info msg="connection is connected from 10.233.77.0:33750...\n"
time="2018-07-25T11:36:42Z" level=info msg="Listening ..."
time="2018-07-25T11:36:42Z" level=warning msg="unexpected connection state: full feature"
time="2018-07-25T11:36:42Z" level=error msg="read tcp 10.233.127.25:3260->10.233.77.0:33750: read: connection reset by peer"
time="2018-07-25T11:36:55Z" level=info msg="10.233.127.25:3260"
time="2018-07-25T11:36:55Z" level=info msg="Accepting ..."
time="2018-07-25T11:36:55Z" level=info msg="connection is connected from 10.233.77.0:33794...\n"
time="2018-07-25T11:36:55Z" level=info msg="Listening ..."
time="2018-07-25T11:36:58Z" level=error msg=EOF
time="2018-07-25T11:37:01Z" level=info msg="10.233.127.25:3260"
time="2018-07-25T11:37:01Z" level=info msg="Accepting ..."
time="2018-07-25T11:37:01Z" level=info msg="connection is connected from 10.233.77.0:33820...\n"
time="2018-07-25T11:37:01Z" level=info msg="Listening ..."
time="2018-07-25T11:37:01Z" level=warning msg="unexpected connection state: full feature"
time="2018-07-25T11:37:01Z" level=error msg="read tcp 10.233.127.25:3260->10.233.77.0:33820: read: connection reset by peer"
time="2018-07-25T11:37:21Z" level=info msg="10.233.127.25:3260"
time="2018-07-25T11:37:21Z" level=info msg="Accepting ..."
time="2018-07-25T11:37:21Z" level=info msg="connection is connected from 10.233.77.0:33882...\n"
time="2018-07-25T11:37:21Z" level=info msg="Listening ..."
time="2018-07-25T11:37:21Z" level=warning msg="unexpected connection state: full feature"
time="2018-07-25T11:37:21Z" level=error msg="read tcp 10.233.127.25:3260->10.233.77.0:33882: read: connection reset by peer"
time="2018-07-25T11:53:36Z" level=error msg=EOF
time="2018-07-25T11:55:34Z" level=info msg="10.233.127.25:3260"
time="2018-07-25T11:55:34Z" level=info msg="Accepting ..."
time="2018-07-25T11:55:34Z" level=info msg="connection is connected from 10.233.77.0:37718...\n"
time="2018-07-25T11:55:34Z" level=info msg="Listening ..."
time="2018-07-25T11:55:34Z" level=warning msg="unexpected connection state: full feature"
time="2018-07-25T11:55:34Z" level=error msg="read tcp 10.233.127.25:3260->10.233.77.0:37718: read: connection reset by peer"
time="2018-07-25T11:55:34Z" level=info msg="10.233.127.25:3260"
time="2018-07-25T11:55:34Z" level=info msg="Accepting ..."
time="2018-07-25T11:55:34Z" level=info msg="connection is connected from 10.233.77.0:37720...\n"
time="2018-07-25T11:55:34Z" level=info msg="Listening ..."
time="2018-07-25T11:55:34Z" level=info msg="New Session initiator name:iqn.1994-05.com.redhat:62a78a2fe1,target name:iqn.2016-09.com.openebs.jiva:pvc-a036d681-8fd4-11e8-ad96-de1a202c9007,ISID:0x23d000007"
time="2018-07-25T11:55:34Z" level=error msg="non support"
time="2018-07-25T11:55:34Z" level=warning msg="check condition"
time="2018-07-25T11:55:35Z" level=error msg="non support"
time="2018-07-25T11:55:35Z" level=warning msg="check condition"

kubectl logs -f -n redis pvc-a036d681-8fd4-11e8-ad96-de1a202c9007-ctrl-6c77c545d9-zsl9v maya-volume-exporter

I0725 06:33:29.722991       1 command.go:70] Starting maya-volume-exporter ...
I0725 06:33:29.723186       1 logs.go:43] Starting Server: :9500

@kmova
Member

kmova commented Jul 25, 2018

Thanks @rtnpro, will check on these logs and get back (cc @payes).
Btw, is it possible to start using 0.6 and change the configuration (storage class) to 3 replicas?

@rtnpro
Author

rtnpro commented Jul 25, 2018

@kmova I will try out version 0.6 and get back to you.

@rtnpro
Author

rtnpro commented Jul 25, 2018

@kmova I updated the OpenEBS setup to the 0.6.0 tag, cleaned up my Redis setup, and deployed it afresh. The redis-master pod still fails to mount the OpenEBS volume, as mentioned above.

Events:
  Type     Reason                  Age                From                     Message
  ----     ------                  ----               ----                     -------
  Warning  FailedScheduling        1m (x2 over 1m)    default-scheduler        pod has unbound PersistentVolumeClaims (repeated 4 times)
  Normal   Scheduled               1m                 default-scheduler        Successfully assigned redis-master-0 to node0
  Normal   SuccessfulAttachVolume  1m                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-db7f33d1-9024-11e8-ad96-de1a202c9007"
  Normal   SuccessfulMountVolume   1m                 kubelet, node0           MountVolume.SetUp succeeded for volume "default-token-6hvrg"
  Warning  FailedMount             15s (x5 over 43s)  kubelet, node0           MountVolume.WaitForAttach failed for volume "pvc-db7f33d1-9024-11e8-ad96-de1a202c9007" : failed to get any path for iscsi disk, last err seen:
iscsi: failed to sendtargets to portal 10.233.11.146:3260 output: iscsiadm: cannot make connection to 10.233.11.146: Connection refused
iscsiadm: cannot make connection to 10.233.11.146: Connection refused
iscsiadm: cannot make connection to 10.233.11.146: Connection refused
iscsiadm: cannot make connection to 10.233.11.146: Connection refused
iscsiadm: cannot make connection to 10.233.11.146: Connection refused
iscsiadm: cannot make connection to 10.233.11.146: Connection refused
iscsiadm: connection login retries (reopen_max) 5 exceeded
iscsiadm: No portals found
, err exit status 21
  Warning  FailedMount  5s  kubelet, node0  MountVolume.MountDevice failed for volume "pvc-db7f33d1-9024-11e8-ad96-de1a202c9007" : executable file not found in $PATH

kubectl describe storageclass openebs-db

Name:            openebs-db
IsDefaultClass:  No
Annotations:     kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"openebs-db","namespace":""},"parameters":{"openebs.io/jiva-replica-count":"3","openebs.io/storage-pool":"default","openebs.io/volume-monitor":"true"},"provisioner":"openebs.io/provisioner-iscsi"}

Provisioner:           openebs.io/provisioner-iscsi
Parameters:            openebs.io/jiva-replica-count=3,openebs.io/storage-pool=default,openebs.io/volume-monitor=true
AllowVolumeExpansion:  <unset>
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>

@kmova
Member

kmova commented Jul 26, 2018

executable file not found in $PATH does indicate an issue with the initiator. Can you share the output of this command, please: lsmod

I may have to set this up locally to debug further. I will use the gist you provided above.
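
For completeness, a quick way to check the pieces kubelet shells out to on a node (a sketch; the package name assumes CentOS 7 as above, and mkfs.xfs only matters if the volume's fsType is xfs):

rpm -q iscsi-initiator-utils          # package that provides iscsiadm on CentOS/RHEL
which iscsiadm mkfs.ext4 mkfs.xfs     # binaries kubelet needs for iSCSI attach and formatting
lsmod | grep iscsi                    # the iscsi_tcp module should be loaded
systemctl status iscsid               # initiator daemon; iscsiadm starts it on demand if it is not running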

@rtnpro
Author

rtnpro commented Jul 26, 2018

@kmova Here's the output of lsmod on node0:

Module                  Size  Used by
ipip                    4940  0
tunnel4                 2325  1 ipip
ip_tunnel              12243  1 ipip
iscsi_tcp               8686  5
libiscsi_tcp           12154  1 iscsi_tcp
libiscsi               35958  2 libiscsi_tcp,iscsi_tcp
scsi_transport_iscsi    68251  3 iscsi_tcp,libiscsi
xt_statistic            1383  2
nf_conntrack_netlink    28053  0
ip6t_rpfilter           1780  1
xt_multiport            1830  36
xt_set                  7692  2
iptable_raw             1452  1
ip6table_raw            1392  1
ip6table_nat            1883  1
nf_conntrack_ipv6       7781  38
nf_defrag_ipv6         22966  1 nf_conntrack_ipv6
ip6table_filter         1532  1
nf_nat_ipv6             5155  1 ip6table_nat
ip6table_mangle         1660  1
ip6_tables             14904  4 ip6table_filter,ip6table_mangle,ip6table_nat,ip6table_raw
iptable_mangle          1592  1
ip_set_hash_net        21616  4
vxlan                  31941  0
ip6_udp_tunnel          1903  1 vxlan
udp_tunnel              2355  1 vxlan
xt_nat                  1913  19
xt_recent               8186  2
ipt_REJECT              1521  2
nf_reject_ipv4          2611  1 ipt_REJECT
ip_set                 23623  2 ip_set_hash_net,xt_set
nfnetlink               5488  2 ip_set,nf_conntrack_netlink
ip_vs                 124500  0
xt_comment              1099  662
xt_mark                 1317  152
xfs                   540100  2
ipt_MASQUERADE          1213  5
nf_nat_masquerade_ipv4     1929  1 ipt_MASQUERADE
xfrm_user              24053  1
iptable_nat             1895  1
nf_conntrack_ipv4       7466  43
nf_defrag_ipv4          1539  1 nf_conntrack_ipv4
nf_nat_ipv4             4891  1 iptable_nat
xt_addrtype             2829  3
iptable_filter          1592  1
xt_conntrack            3337  79
nf_nat                 11792  4 nf_nat_ipv4,nf_nat_ipv6,xt_nat,nf_nat_masquerade_ipv4
nf_conntrack           67486  9 ip_vs,nf_nat,nf_nat_ipv4,nf_nat_ipv6,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_netlink,nf_conntrack_ipv4,nf_conntrack_ipv6
br_netfilter           12776  0
bridge                 98869  1 br_netfilter
stp                     1629  1 bridge
llc                     3633  2 stp,bridge
overlay                29307  30
dm_mod                 83625  6
kvm_intel             149282  0
kvm                   307552  1 kvm_intel
fjes                   19630  0
irqbypass               2744  1 kvm
autofs4                22384  2

@rtnpro
Author

rtnpro commented Jul 30, 2018

@kmova @payes Any update on this?

@alexzimmer96

@kmova @payes I'm facing the same issue.

@halcyon

halcyon commented Oct 4, 2018

I just built a new cluster and had exactly the same iscsiadm errors in my pods. On my system I had set my default storage class to use xfs. It turned out that my new node did not have xfsprogs installed, so it failed to create the filesystem - which then failed to mount. The crucial clue for my situation appeared in my node's dmesg. It might be worth looking to see if you have anything similar in your node's dmesg.

[ 2208.921535] scsi host9: iSCSI Initiator over TCP/IP
[ 2208.927151] scsi 9:0:0:0: Direct-Access     CLOUDBYT OPENEBS          0.2  PQ: 0 ANSI: 5
[ 2208.966391] scsi 9:0:0:0: Attached scsi generic sg0 type 0
[ 2208.968752] sd 9:0:0:0: [sda] 209715200 512-byte logical blocks: (107 GB/100 GiB)
[ 2208.968754] sd 9:0:0:0: [sda] 4096-byte physical blocks
[ 2208.969050] sd 9:0:0:0: [sda] Write Protect is off
[ 2208.969052] sd 9:0:0:0: [sda] Mode Sense: 03 00 10 08
[ 2208.969099] sd 9:0:0:0: [sda] No Caching mode page found
[ 2208.969104] sd 9:0:0:0: [sda] Assuming drive cache: write through
[ 2208.971207] sd 9:0:0:0: [sda] Attached SCSI disk
[ 2210.027376] SGI XFS with ACLs, security attributes, realtime, no debug enabled
[ 2210.029449] XFS (sda): Invalid superblock magic number
[ 2210.095686] XFS (sda): Invalid superblock magic number
[ 2226.219257] XFS (sda): Invalid superblock magic number
[ 2226.283051] XFS (sda): Invalid superblock magic number
[ 2258.396192] XFS (sda): Invalid superblock magic number
[ 2258.461262] XFS (sda): Invalid superblock magic number
[ 2322.631336] XFS (sda): Invalid superblock magic number
[ 2322.690916] XFS (sda): Invalid superblock magic number
[ 2342.166615]  connection1:0: detected conn error (1020)
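
A minimal sketch of the fix for that situation, assuming the node is simply missing xfsprogs (Debian and CentOS package managers shown):

sudo apt-get install -y xfsprogs    # Debian/Ubuntu
sudo yum install -y xfsprogs        # CentOS/RHEL
mkfs.xfs -V                         # confirm the binary kubelet needs is now on PATH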

@payes
Member

payes commented Oct 5, 2018

Thanks @halcyon for sharing the update.
@alexanderzimmer @rtnpro We have started working on this issue and will update you with the analysis and fix ASAP.

@halcyon

halcyon commented Oct 5, 2018

I'd like to add that my comment was based on version 0.6 with jiva. Unfortunately I had timeouts using 0.7 with jiva and it failed to mount. I'm using Debian 9.5 (stretch).

@BalaBalaYi

BalaBalaYi commented Nov 27, 2018

Same issue (Using 0.6):
(screenshot of the error attached in the original issue)

Is there any workaround?

@rhugga

rhugga commented Dec 17, 2018

Anyone have a fix for this? I am seeing these exact same errors using Rancher 2.13, K8s 1.11.5, Docker 18.6, and OpenEBS 0.8.

  Normal   SuccessfulAttachVolume  46s                attachdetach-controller          AttachVolume.Attach succeeded for volume "pvc-be91f5c6-0215-11e9-9f7a-008cfae6ae7c"
  Warning  FailedMount             31s                kubelet, st13p01if-ztds15205101  MountVolume.WaitForAttach failed for volume "pvc-be91f5c6-0215-11e9-9f7a-008cfae6ae7c" : failed to get any path for iscsi disk, last err seen:
iscsi: failed to sendtargets to portal 172.31.63.15:3260 output: iscsiadm: cannot make connection to 172.31.63.15: Connection refused
iscsiadm: cannot make connection to 172.31.63.15: Connection refused
iscsiadm: cannot make connection to 172.31.63.15: Connection refused
iscsiadm: cannot make connection to 172.31.63.15: Connection refused
iscsiadm: cannot make connection to 172.31.63.15: Connection refused
iscsiadm: cannot make connection to 172.31.63.15: Connection refused
iscsiadm: connection login retries (reopen_max) 5 exceeded
iscsiadm: No portals found
, err exit status 21
My kubelet config:

  kubelet: 
    cluster_dns_server: "172.31.0.10"
    cluster_domain: "cluster.local"
    extra_binds: 
      - "/etc/iscsi:/etc/iscsi"
      - "/sbin/iscsiadm:/sbin/iscsiadm"
      - "/var/lib/iscsi:/var/lib/iscsi"
      - "/lib/modules:/lib/modules"

@utkarshmani1997
Member

utkarshmani1997 commented Dec 19, 2018

Hi @rhugga, @BalaBalaYi, @halcyon, @rtnpro, here is the blog for setting up OpenEBS on a custom Rancher cluster; let us know if it helps. Most probably the Docker version is the problem, since Rancher only supports specific Docker versions. A quick check for each node is sketched below.
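
A quick check (a sketch) to compare what each node is actually running against the supported list:

docker version --format '{{.Server.Version}}'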

@muratkars
Member

muratkars commented Dec 20, 2018

Hi @rhugga, @BalaBalaYi, @halcyon, @rtnpro, I was able to reproduce this in my lab. Here is what I did to fix it, with help from @sonasingh46 @shan.
First, my environment:
k8s: 1.11.5
rke: 2.1.3
kernel: 4.14.35-1844.0.7.el7uek.x86_64
docker: 18.06.1-ce
openebs: 0.8
os: OracleLinux 7.6

Kubelet is running with this config:

kubelet: 
   extra_binds: 
     - "/etc/iscsi:/etc/iscsi"
     - "/sbin/iscsiadm:/sbin/iscsiadm"
     - "/var/lib/iscsi:/var/lib/iscsi"
     - "/lib/modules"

iscsi-initiator-utils was installed on all nodes using the command below, but that does not load the iscsi_tcp driver by default:

$ yum install iscsi-initiator-utils -y 

Separately loaded iscsi_tcp on all worker nodes:
$ modprobe iscsi_tcp

Here is how it looks after loading iscsi_tcp

$ lsmod |grep iscsi
iscsi_tcp              20480  0
libiscsi_tcp           28672  1 iscsi_tcp
libiscsi               57344  2 libiscsi_tcp,iscsi_tcp
scsi_transport_iscsi   110592  4 libiscsi_tcp,iscsi_tcp,libiscsi
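
As a side note, a manual modprobe does not survive a reboot. A small sketch to make the module load persistent on systemd-based nodes (the file name here is illustrative):

$ echo iscsi_tcp | sudo tee /etc/modules-load.d/iscsi-tcp.conf
$ sudo modprobe iscsi_tcp
$ lsmod | grep iscsi_tcp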

After this, make sure you are using the correct storage classes with the CAS config from here if you are using the additional SCs from the repo. I accidentally used some older (<0.6) storage classes and had issues. A sketch of such a storage class follows below.
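
Something along these lines (a sketch; the class name and replica count are illustrative, and the annotation keys follow the 0.7+ CAS config convention):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-jiva-3rep
  annotations:
    openebs.io/cas-type: jiva
    cas.openebs.io/config: |
      - name: ReplicaCount
        value: "3"
provisioner: openebs.io/provisioner-iscsi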

After the above steps:

NAME                                                      READY     STATUS    RESTARTS   AGE
default-demo-vol1-claim-593994264-ctrl-85787b8d45-pkjxv   2/2       Running   0          17m
default-demo-vol1-claim-593994264-rep-699b75f776-f2z8n    1/1       Running   0          17m
default-demo-vol1-claim-593994264-rep-699b75f776-lxsfw    1/1       Running   0          17m
default-demo-vol1-claim-593994264-rep-699b75f776-n7fs5    1/1       Running   0          17m
percona-7999fdd9cf-h6t48                                  1/1       Running   0          7m

Here is the description of a percona app deployed with 3 replicas:

$ kubectl describe pod percona-7999fdd9cf-h6t48
...
...
Events:
  Type    Reason                  Age   From                     Message
  ----    ------                  ----  ----                     -------
  Normal  Scheduled               22m   default-scheduler        Successfully assigned default/percona-7999fdd9cf-h6t48 to olnode5
  Normal  SuccessfulAttachVolume  22m   attachdetach-controller  AttachVolume.Attach succeeded for volume "default-demo-vol1-claim-593994264"
  Normal  Pulling                 22m   kubelet, olnode5         pulling image "openebs/tests-custom-percona:latest"
  Normal  Pulled                  15m   kubelet, olnode5         Successfully pulled image "openebs/tests-custom-percona:latest"
  Normal  Created                 15m   kubelet, olnode5         Created container
  Normal  Started                 15m   kubelet, olnode5         Started container

@rhugga

rhugga commented Dec 20, 2018

I was using CAS... what is the correct storage class for a cStor pool?

The storage class doesn't seem to have been my problem. I get past the LUN allocation, and it fails when trying to mount. Outside of the storage class, my config is exactly the same as yours. Not sure how/why yours is working.

---
apiVersion: openebs.io/v1alpha1
kind: StoragePoolClaim
metadata:
  name: cstor-auto-pool
spec:
  name: cstor-auto-pool
  type: disk
  maxPools: 3
  poolSpec:
    poolType: striped
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-cstor-auto-pool
  annotations:
    openebs.io/cas-type: cstor
    cas.openebs.io/config: |
      - name: StoragePoolClaim
        value: "cstor-auto-pool"
provisioner: openebs.io/provisioner-iscsi
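
For what it's worth, a quick way to confirm the claim above actually produced pools before pointing workloads at the class (a sketch; the resource kinds and the openebs namespace assume OpenEBS 0.8 defaults):

kubectl get storagepoolclaim
kubectl get cstorpool                  # expect one pool per matched node, up to maxPools
kubectl get pods -n openebs -o wide    # pool pods (and the volume's target pod) should be Running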

@rhugga

rhugga commented Dec 20, 2018

Can you please give more detail on your environment? I'm still seeing the exact same issues.

Events:
  Type     Reason                  Age                 From                             Message
  ----     ------                  ----                ----                             -------
  Warning  FailedScheduling        43s (x2 over 43s)   default-scheduler                persistentvolumeclaim "demo-vol-claim" not found
  Warning  FailedScheduling        38s (x23 over 43s)  default-scheduler                pod has unbound PersistentVolumeClaims (repeated 2 times)
  Normal   SuccessfulAttachVolume  31s                 attachdetach-controller          AttachVolume.Attach succeeded for volume "pvc-d5dff055-0460-11e9-a0c0-008cfae6ae7c"
  Warning  FailedMount             5s (x2 over 11s)    kubelet, st13p01if-ztds01243201  MountVolume.WaitForAttach failed for volume "pvc-d5dff055-0460-11e9-a0c0-008cfae6ae7c" : failed to get any path for iscsi disk, last err seen:
iscsi: failed to sendtargets to portal 172.31.36.102:3260 output: iscsiadm: cannot make connection to 172.31.36.102: Connection refused
iscsiadm: cannot make connection to 172.31.36.102: Connection refused
iscsiadm: cannot make connection to 172.31.36.102: Connection refused
iscsiadm: cannot make connection to 172.31.36.102: Connection refused
iscsiadm: cannot make connection to 172.31.36.102: Connection refused
iscsiadm: cannot make connection to 172.31.36.102: Connection refused
iscsiadm: connection login retries (reopen_max) 5 exceeded
iscsiadm: No portals found
, err exit status 21

@rcoh

rcoh commented Jan 25, 2019

I got into this state when I lost a node in my cluster...I'm confused -- I thought this was the exact problem OpenEBS was supposed to solve?

kmova added this to the 0.8.2 milestone Feb 24, 2019
@Yaxian

Yaxian commented Feb 25, 2019

Hi @kmova, thanks very much for replying!
The node iZ2zegckpmdcwq51bxc71vZ is the master.
The mongo app was scheduled on node2 (iz2ze9ockjuosa8l5k8aydz), and the PV was scheduled on node1 (iz2zef6lmymbaovo4qdl84z).

On the master node (kubectl describe pv output for the mongo volume):

Name:            pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac
Labels:          openebs.io/cas-type=jiva
                 openebs.io/storageclass=mongo-pv-az
Annotations:     openEBSProvisionerIdentity: iz2zef6lmymbaovo4qdl84z
                 openebs.io/cas-type: jiva
                 pv.kubernetes.io/provisioned-by: openebs.io/provisioner-iscsi
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    mongo-pv-az
Status:          Bound
Claim:           prod/mongo-persistent-storage-mongodb-2
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        200G
Node Affinity:   <none>
Message:
Source:
    Type:               ISCSI (an ISCSI Disk resource that is attached to a kubelet's host machine and then exposed to the pod)
    TargetPortal:       10.107.222.18:3260
    IQN:                iqn.2016-09.com.openebs.jiva:pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac
    Lun:                0
    ISCSIInterface      default
    FSType:             xfs
    ReadOnly:           false
    Portals:            []
    DiscoveryCHAPAuth:  false
    SessionCHAPAuth:    false
    SecretRef:          nil
    InitiatorName:      <none>
Events:                 <none>
service iscsid status

● iscsid.service - iSCSI initiator daemon (iscsid)
   Loaded: loaded (/lib/systemd/system/iscsid.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2019-01-06 01:28:18 CST; 1 months 19 days ago
     Docs: man:iscsid(8)
 Main PID: 1057 (iscsid)
    Tasks: 2
   Memory: 4.1M
      CPU: 10min 38.777s
   CGroup: /system.slice/iscsid.service
           ├─1056 /sbin/iscsid
           └─1057 /sbin/iscsid

Feb 24 00:15:48 iZ2zegckpmdcwq51bxc71vZ iscsid[1056]: connect to 10.96.135.56:3260 failed (Connection refused)
Feb 24 00:16:03 iZ2zegckpmdcwq51bxc71vZ iscsid[1056]: connection10:0 is operational after recovery (5 attempts)
Feb 24 00:19:27 iZ2zegckpmdcwq51bxc71vZ iscsid[1056]: Kernel reported iSCSI connection 10:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
Feb 24 00:19:34 iZ2zegckpmdcwq51bxc71vZ iscsid[1056]: connect to 10.96.135.56:3260 failed (Connection refused)
Feb 24 00:19:49 iZ2zegckpmdcwq51bxc71vZ iscsid[1056]: connect to 10.96.135.56:3260 failed (Connection refused)
Feb 24 00:20:04 iZ2zegckpmdcwq51bxc71vZ iscsid[1056]: connect to 10.96.135.56:3260 failed (Connection refused)
Feb 24 00:20:20 iZ2zegckpmdcwq51bxc71vZ iscsid[1056]: connect to 10.96.135.56:3260 failed (Connection refused)
Feb 24 00:20:35 iZ2zegckpmdcwq51bxc71vZ iscsid[1056]: connect to 10.96.135.56:3260 failed (Connection refused)
Feb 24 00:26:45 iZ2zegckpmdcwq51bxc71vZ iscsid[1056]: Connection18:0 to [target: iqn.2016-09.com.openebs.jiva:pvc-b4a23e03-3787-11e9-a3ee-00163e0ec7ac, portal: 10.105.33.49,3260] through [iface: default]
Feb 24 11:52:19 iZ2zegckpmdcwq51bxc71vZ iscsid[1056]: Connection19:0 to [target: iqn.2016-09.com.openebs.jiva:pvc-77810b0b-37e7-11e9-a3ee-00163e0ec7ac, portal: 10.110.136.128,3260] through [iface: default
lsmod | grep tcp

xt_tcpudp              16384  212
x_tables               36864  12 xt_statistic,xt_mark,xt_comment,ip_tables,xt_tcpudp,ipt_MASQUERADE,xt_conntrack,xt_nat,iptable_filter,ipt_REJECT,ip6_tables,xt_addrtype
iscsi_tcp              20480  11
libiscsi_tcp           24576  1 iscsi_tcp
libiscsi               53248  3 libiscsi_tcp,iscsi_tcp,ib_iser
scsi_transport_iscsi   102400  4 iscsi_tcp,ib_iser,libiscsi
  • Is kubelet running in a container?

No, it is installed via apt.
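
(For anyone checking the same thing on their own nodes, a quick way to tell, assuming a systemd host:)

systemctl status kubelet    # shows an active host service when kubelet is installed via apt
docker ps | grep kubelet    # would list a kubelet container on containerized (e.g. Rancher) setups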


node1 iZ2zef6lmymbaovo4qdl84Z

service iscsid status

● iscsid.service - iSCSI initiator daemon (iscsid)
   Loaded: loaded (/lib/systemd/system/iscsid.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-01-14 01:12:36 CST; 1 months 11 days ago
     Docs: man:iscsid(8)
 Main PID: 682 (iscsid)
    Tasks: 2
   Memory: 2.6M
      CPU: 5min 10.159s
   CGroup: /system.slice/iscsid.service
           ├─681 /sbin/iscsid
           └─682 /sbin/iscsid

Jan 15 22:49:36 iZ2zef6lmymbaovo4qdl84Z iscsid[681]: connect to 10.107.185.110:3260 failed (Connection refused)
Jan 15 22:49:40 iZ2zef6lmymbaovo4qdl84Z iscsid[681]: connection10:0 is operational after recovery (4 attempts)
Jan 17 21:41:26 iZ2zef6lmymbaovo4qdl84Z iscsid[681]: Connection10:0 to [target: iqn.2016-09.com.openebs.jiva:pvc-eee3c714-18c2-11e9-a3ee-00163e0ec7ac, portal: 10.107.185.110,3260] through [iface: default]
Jan 17 21:41:26 iZ2zef6lmymbaovo4qdl84Z iscsid[681]: Connection4:0 to [target: iqn.2016-09.com.openebs.jiva:pvc-5d3f77f7-176c-11e9-a3ee-00163e0ec7ac, portal: 10.97.226.142,3260] through [iface: default] i
Jan 17 21:41:26 iZ2zef6lmymbaovo4qdl84Z iscsid[681]: Connection3:0 to [target: iqn.2016-09.com.openebs.jiva:pvc-12ba56f6-176c-11e9-a3ee-00163e0ec7ac, portal: 10.111.208.105,3260] through [iface: default]
Jan 17 21:41:26 iZ2zef6lmymbaovo4qdl84Z iscsid[681]: Connection2:0 to [target: iqn.2016-09.com.openebs.jiva:pvc-3c33bb77-176a-11e9-a3ee-00163e0ec7ac, portal: 10.106.78.94,3260] through [iface: default] is
Jan 17 21:59:38 iZ2zef6lmymbaovo4qdl84Z iscsid[681]: Connection11:0 to [target: iqn.2016-09.com.openebs.jiva:pvc-3c33bb77-176a-11e9-a3ee-00163e0ec7ac, portal: 10.106.78.94,3260] through [iface: default] i
Jan 18 06:59:54 iZ2zef6lmymbaovo4qdl84Z iscsid[681]: Connection12:0 to [target: iqn.2016-09.com.openebs.jiva:pvc-5d3f77f7-176c-11e9-a3ee-00163e0ec7ac, portal: 10.97.226.142,3260] through [iface: default]
Feb 23 23:38:15 iZ2zef6lmymbaovo4qdl84Z iscsid[681]: Kernel reported iSCSI connection 11:0 error (1022 - Invalid or unknown error code) state (3)
Feb 23 23:38:36 iZ2zef6lmymbaovo4qdl84Z iscsid[681]: connection11:0 is operational after recovery (2 attempts)
lsmod | grep tcp 

xt_tcpudp              16384  212
x_tables               36864  12 xt_statistic,xt_mark,xt_comment,ip_tables,xt_tcpudp,ipt_MASQUERADE,xt_conntrack,xt_nat,iptable_filter,ipt_REJECT,ip6_tables,xt_addrtype
iscsi_tcp              20480  4
libiscsi_tcp           24576  1 iscsi_tcp
libiscsi               53248  3 libiscsi_tcp,iscsi_tcp,ib_iser
scsi_transport_iscsi   102400  4 iscsi_tcp,ib_iser,libiscsi


node2 iZ2ze9ockjuosa8l5k8aydZ

lsmod | grep tcp

xt_tcpudp              16384  212
iscsi_tcp              20480  16
libiscsi_tcp           24576  1 iscsi_tcp
libiscsi               53248  3 libiscsi_tcp,iscsi_tcp,ib_iser
scsi_transport_iscsi   102400  4 iscsi_tcp,ib_iser,libiscsi
x_tables               36864  12 xt_statistic,xt_mark,xt_comment,ip_tables,xt_tcpudp,ipt_MASQUERADE,xt_conntrack,xt_nat,iptable_filter,ipt_REJECT,ip6_tables,xt_addrtype
service iscsid status

● iscsid.service - iSCSI initiator daemon (iscsid)
   Loaded: loaded (/lib/systemd/system/iscsid.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2019-01-05 08:52:02 CST; 1 months 20 days ago
     Docs: man:iscsid(8)
 Main PID: 15732 (iscsid)
    Tasks: 2
   Memory: 5.9M
      CPU: 26min 7.091s
   CGroup: /system.slice/iscsid.service
           ├─15731 /sbin/iscsid
           └─15732 /sbin/iscsid

Feb 24 11:48:31 iZ2ze9ockjuosa8l5k8aydZ iscsid[15731]: connect to 10.110.225.223:3260 failed (Connection refused)
Feb 24 11:48:46 iZ2ze9ockjuosa8l5k8aydZ iscsid[15731]: connect to 10.110.225.223:3260 failed (Connection refused)
Feb 24 11:49:02 iZ2ze9ockjuosa8l5k8aydZ iscsid[15731]: connect to 10.110.225.223:3260 failed (Connection refused)
Feb 24 11:49:18 iZ2ze9ockjuosa8l5k8aydZ iscsid[15731]: connect to 10.110.225.223:3260 failed (Connection refused)
Feb 24 11:49:33 iZ2ze9ockjuosa8l5k8aydZ iscsid[15731]: connect to 10.110.225.223:3260 failed (Connection refused)
Feb 24 11:49:49 iZ2ze9ockjuosa8l5k8aydZ iscsid[15731]: connect to 10.110.225.223:3260 failed (Connection refused)
Feb 24 11:50:05 iZ2ze9ockjuosa8l5k8aydZ iscsid[15731]: connect to 10.110.225.223:3260 failed (Connection refused)
Feb 24 11:50:20 iZ2ze9ockjuosa8l5k8aydZ iscsid[15731]: connect to 10.110.225.223:3260 failed (Connection refused)
Feb 24 11:50:28 iZ2ze9ockjuosa8l5k8aydZ iscsid[15731]: Connection12:0 to [target: iqn.2016-09.com.openebs.jiva:pvc-dccb21b0-175c-11e9-a3ee-00163e0ec7ac, portal: 10.110.225.223,3260] through [iface: defaul
Feb 24 11:52:27 iZ2ze9ockjuosa8l5k8aydZ iscsid[15731]: Connection23:0 to [target: iqn.2016-09.com.openebs.jiva:pvc-77809c70-37e7-11e9-a3ee-00163e0ec7ac, portal: 10.104.55.148,3260] through [iface: default
iscsiadm --mode node --targetname iqn.2016-09.com.openebs.jiva:pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac  --portal 10.107.222.18:3260 

node.name = iqn.2016-09.com.openebs.jiva:pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac
node.tpgt = 1
node.startup = manual
node.leading_login = No
iface.hwaddress = <empty>
iface.ipaddress = <empty>
iface.iscsi_ifacename = default
iface.net_ifacename = <empty>
iface.transport_name = tcp
iface.initiatorname = <empty>
iface.bootproto = <empty>
iface.subnet_mask = <empty>
iface.gateway = <empty>
iface.ipv6_autocfg = <empty>
iface.linklocal_autocfg = <empty>
iface.router_autocfg = <empty>
iface.ipv6_linklocal = <empty>
iface.ipv6_router = <empty>
iface.state = <empty>
iface.vlan_id = 0
iface.vlan_priority = 0
iface.vlan_state = <empty>
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
node.discovery_address = 10.107.222.18
node.discovery_port = 3260
node.discovery_type = send_targets
node.session.initial_cmdsn = 0
node.session.initial_login_retry_max = 8
node.session.xmit_thread_priority = -20
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.nr_sessions = 1
node.session.auth.authmethod = None
node.session.auth.username = <empty>
node.session.auth.password = <empty>
node.session.auth.username_in = <empty>
node.session.auth.password_in = <empty>
node.session.timeo.replacement_timeout = 120
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.tgt_reset_timeout = 30
node.session.err_timeo.host_reset_timeout = 60
node.session.iscsi.FastAbort = Yes
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.session.iscsi.DefaultTime2Retain = 0
node.session.iscsi.DefaultTime2Wait = 2
node.session.iscsi.MaxConnections = 1
node.session.iscsi.MaxOutstandingR2T = 1
node.session.iscsi.ERL = 0
node.conn[0].address = 10.107.222.18
node.conn[0].port = 3260
node.conn[0].startup = manual
node.conn[0].tcp.window_size = 524288
node.conn[0].tcp.type_of_service = 0
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.conn[0].iscsi.MaxXmitDataSegmentLength = 0
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
node.conn[0].iscsi.HeaderDigest = None
node.conn[0].iscsi.DataDigest = None
node.conn[0].iscsi.IFMarker = No
node.conn[0].iscsi.OFMarker = No
# END RECORD

Thanks very much.

@kmova
Member

kmova commented Feb 25, 2019

@Yaxian .. from the above, I can confirm that the iSCSI initiator is installed and the required kernel modules are loaded on the host.

In the iscsid logs, I still don't see any connection attempt for pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac.

From the logs provided above, I see that some of the volumes are logged in. Did this mount error start occurring recently? Were there any recent changes/upgrades done to the cluster? How was the K8s cluster built? kubeadm? Rancher?

Would it be possible to connect the cluster to https://mayaonline.io to help with debugging? If not, I will reach out for more logs and configuration.
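
For reference, two quick checks that can be run from the worker node (a sketch, assuming iscsid logs to the systemd journal; the PVC name and portal IP are the ones from the PV pasted above):

journalctl -u iscsid | grep pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac
iscsiadm -m discovery -t sendtargets -p 10.107.222.18:3260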

@Yaxian

Yaxian commented Feb 26, 2019

@kmova Thanks! I don't know how to check the logs of pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac. Could you tell me how to do it when you are free?
Thanks again!

@utkarshmani1997
Member

utkarshmani1997 commented Feb 26, 2019

@Yaxian you can run kubectl logs -f <replica_pod> to get the logs.
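
For example (a sketch; the controller container name follows the pvc-<uid>-ctrl-con pattern used by jiva, and the placeholders need to be replaced with the actual pod names in your cluster):

kubectl get pods --all-namespaces -o wide | grep pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac
kubectl logs -f <ctrl-pod> pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac-ctrl-con
kubectl logs -f <replica-pod>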

@Yaxian

Yaxian commented Feb 26, 2019

@utkarshmani1997 Thanks very much. Here are the PV logs:

kl logs -f pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac-ctrl-bf59c68c-kksg7 pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac-ctrl-con

127.0.0.1 - - [25/Feb/2019:06:25:07 +0000] "GET /v1/stats HTTP/1.1" 200 495
time="2019-02-25T06:25:37Z" level=error msg=EOF
127.0.0.1 - - [25/Feb/2019:06:26:07 +0000] "GET /v1/stats HTTP/1.1" 200 495
127.0.0.1 - - [25/Feb/2019:06:27:07 +0000] "GET /v1/stats HTTP/1.1" 200 495
time="2019-02-25T06:27:25Z" level=info msg="10.244.0.157:3260"
time="2019-02-25T06:27:25Z" level=info msg="Accepting ..."
time="2019-02-25T06:27:25Z" level=info msg="connection is connected from 10.244.1.0:60528...\n"
time="2019-02-25T06:27:25Z" level=info msg="Listening ..."
time="2019-02-25T06:27:25Z" level=warning msg="unexpected connection state: full feature"
time="2019-02-25T06:27:25Z" level=error msg=EOF
time="2019-02-25T06:27:25Z" level=info msg="10.244.0.157:3260"
time="2019-02-25T06:27:25Z" level=info msg="Accepting ..."
time="2019-02-25T06:27:25Z" level=info msg="connection is connected from 10.244.1.0:60530...\n"
time="2019-02-25T06:27:25Z" level=info msg="Listening ..."
time="2019-02-25T06:27:26Z" level=info msg="New Session initiator name:iqn.1993-08.org.debian:01:51999edb23e9,target name:iqn.2016-09.com.openebs.jiva:pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac,ISID:0x23d18
0000"
time="2019-02-25T06:27:26Z" level=error msg="non support"
time="2019-02-25T06:27:26Z" level=warning msg="check condition"
127.0.0.1 - - [25/Feb/2019:06:28:07 +0000] "GET /v1/stats HTTP/1.1" 200 566
127.0.0.1 - - [25/Feb/2019:06:29:07 +0000] "GET /v1/stats HTTP/1.1" 200 566
127.0.0.1 - - [25/Feb/2019:06:30:07 +0000] "GET /v1/stats HTTP/1.1" 200 567
127.0.0.1 - - [25/Feb/2019:06:31:07 +0000] "GET /v1/stats HTTP/1.1" 200 567
127.0.0.1 - - [25/Feb/2019:06:32:07 +0000] "GET /v1/stats HTTP/1.1" 200 567
127.0.0.1 - - [25/Feb/2019:06:33:07 +0000] "GET /v1/stats HTTP/1.1" 200 567
127.0.0.1 - - [25/Feb/2019:06:34:07 +0000] "GET /v1/stats HTTP/1.1" 200 571
127.0.0.1 - - [25/Feb/2019:06:35:07 +0000] "GET /v1/stats HTTP/1.1" 200 571
127.0.0.1 - - [25/Feb/2019:06:36:07 +0000] "GET /v1/stats HTTP/1.1" 200 571
127.0.0.1 - - [25/Feb/2019:06:37:07 +0000] "GET /v1/stats HTTP/1.1" 200 571
127.0.0.1 - - [25/Feb/2019:06:38:07 +0000] "GET /v1/stats HTTP/1.1" 200 572

kl logs -f pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac-ctrl-bf59c68c-kksg7 maya-volume-exporter

I0226 06:31:07.784840       1 jivacollector.go:59] Got response: {"Name":"pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac","ReadIOPS":"200262","ReplicaCounter":1,"Replicas":[{"Address":"tcp://10.244.1.49:9502","Mode":"RW"}],"RevisionCounter":0,"SCSIIOCount":{"0":16176,"158":1421,"160":1,"163":1,"18":9944,"26":2842,"40":200262},"SectorSize":"4096","Size":"214748364800","Status":"RW","TotalReadBlockCount":"4872684544","TotalReadTime":"84741101511","TotalWriteBlockCount":"0","TotalWriteTime":"0","UpTime":"220167.905011","UsedBlocks":"4","UsedLogicalBlocks":"0","WriteIOPS":"0","actions":{},"links":{"self":"http://127.0.0.1:9501/v1/stats"},"type":"stats"}
I0226 06:32:07.784962       1 jivacollector.go:59] Got response: {"Name":"pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac","ReadIOPS":"200262","ReplicaCounter":1,"Replicas":[{"Address":"tcp://10.244.1.49:9502","Mode":"RW"}],"RevisionCounter":0,"SCSIIOCount":{"0":16176,"158":1421,"160":1,"163":1,"18":9944,"26":2842,"40":200262},"SectorSize":"4096","Size":"214748364800","Status":"RW","TotalReadBlockCount":"4872684544","TotalReadTime":"84741101511","TotalWriteBlockCount":"0","TotalWriteTime":"0","UpTime":"220227.905098","UsedBlocks":"4","UsedLogicalBlocks":"0","WriteIOPS":"0","actions":{},"links":{"self":"http://127.0.0.1:9501/v1/stats"},"type":"stats"}
I0226 06:33:07.785105       1 jivacollector.go:59] Got response: {"Name":"pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac","ReadIOPS":"200544","ReplicaCounter":1,"Replicas":[{"Address":"tcp://10.244.1.49:9502","Mode":"RW"}],"RevisionCounter":0,"SCSIIOCount":{"0":16198,"158":1423,"160":1,"163":1,"18":9958,"26":2846,"40":200544},"SectorSize":"4096","Size":"214748364800","Status":"RW","TotalReadBlockCount":"4879697920","TotalReadTime":"84855223327","TotalWriteBlockCount":"0","TotalWriteTime":"0","UpTime":"220287.905172","UsedBlocks":"4","UsedLogicalBlocks":"0","WriteIOPS":"0","actions":{},"links":{"self":"http://127.0.0.1:9501/v1/stats"},"type":"stats"}
I0226 06:34:07.785522       1 jivacollector.go:59] Got response: {"Name":"pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac","ReadIOPS":"200544","ReplicaCounter":1,"Replicas":[{"Address":"tcp://10.244.1.49:9502","Mode":"RW"}],"RevisionCounter":0,"SCSIIOCount":{"0":16198,"158":1423,"160":1,"163":1,"18":9958,"26":2846,"40":200544},"SectorSize":"4096","Size":"214748364800","Status":"RW","TotalReadBlockCount":"4879697920","TotalReadTime":"84855223327","TotalWriteBlockCount":"0","TotalWriteTime":"0","UpTime":"220347.905247","UsedBlocks":"4","UsedLogicalBlocks":"0","WriteIOPS":"0","actions":{},"links":{"self":"http://127.0.0.1:9501/v1/stats"},"type":"stats"}
I0226 06:35:07.785154       1 jivacollector.go:59] Got response: {"Name":"pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac","ReadIOPS":"200830","ReplicaCounter":1,"Replicas":[{"Address":"tcp://10.244.1.49:9502","Mode":"RW"}],"RevisionCounter":0,"SCSIIOCount":{"0":16222,"158":1425,"160":1,"163":1,"18":9972,"26":2850,"40":200830},"SectorSize":"4096","Size":"214748364800","Status":"RW","TotalReadBlockCount":"4886391808","TotalReadTime":"84971401226","TotalWriteBlockCount":"0","TotalWriteTime":"0","UpTime":"220407.905270","UsedBlocks":"4","UsedLogicalBlocks":"0","WriteIOPS":"0","actions":{},"links":{"self":"http://127.0.0.1:9501/v1/stats"},"type":"stats"}
I0226 06:36:07.784945       1 jivacollector.go:59] Got response: {"Name":"pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac","ReadIOPS":"200830","ReplicaCounter":1,"Replicas":[{"Address":"tcp://10.244.1.49:9502","Mode":"RW"}],"RevisionCounter":0,"SCSIIOCount":{"0":16222,"158":1425,"160":1,"163":1,"18":9972,"26":2850,"40":200830},"SectorSize":"4096","Size":"214748364800","Status":"RW","TotalReadBlockCount":"4886391808","TotalReadTime":"84971401226","TotalWriteBlockCount":"0","TotalWriteTime":"0","UpTime":"220467.905080","UsedBlocks":"4","UsedLogicalBlocks":"0","WriteIOPS":"0","actions":{},"links":{"self":"http://127.0.0.1:9501/v1/stats"},"type":"stats"}
I0226 06:37:07.785160       1 jivacollector.go:59] Got response: {"Name":"pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac","ReadIOPS":"201109","ReplicaCounter":1,"Replicas":[{"Address":"tcp://10.244.1.49:9502","Mode":"RW"}],"RevisionCounter":0,"SCSIIOCount":{"0":16244,"158":1427,"160":1,"163":1,"18":9986,"26":2854,"40":201109},"SectorSize":"4096","Size":"214748364800","Status":"RW","TotalReadBlockCount":"4893347840","TotalReadTime":"85079534376","TotalWriteBlockCount":"0","TotalWriteTime":"0","UpTime":"220527.905305","UsedBlocks":"4","UsedLogicalBlocks":"0","WriteIOPS":"0","actions":{},"links":{"self":"http://127.0.0.1:9501/v1/stats"},"type":"stats"}
kl logs -f pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac-rep-64b77d465c-582z8

time="2019-02-26T06:27:07Z" level=info msg="Get Volume Usage"
time="2019-02-26T06:27:07Z" level=info msg=GetUsage
10.244.0.157 - - [26/Feb/2019:06:27:07 +0000] "GET /v1/replicas/1/volusage HTTP/1.1" 200 181
time="2019-02-26T06:28:07Z" level=info msg="Get Volume Usage"
time="2019-02-26T06:28:07Z" level=info msg=GetUsage
10.244.0.157 - - [26/Feb/2019:06:28:07 +0000] "GET /v1/replicas/1/volusage HTTP/1.1" 200 181
time="2019-02-26T06:29:07Z" level=info msg="Get Volume Usage"
time="2019-02-26T06:29:07Z" level=info msg=GetUsage
10.244.0.157 - - [26/Feb/2019:06:29:07 +0000] "GET /v1/replicas/1/volusage HTTP/1.1" 200 181
time="2019-02-26T06:30:07Z" level=info msg="Get Volume Usage"
time="2019-02-26T06:30:07Z" level=info msg=GetUsage
10.244.0.157 - - [26/Feb/2019:06:30:07 +0000] "GET /v1/replicas/1/volusage HTTP/1.1" 200 181
time="2019-02-26T06:31:07Z" level=info msg="Get Volume Usage"
time="2019-02-26T06:31:07Z" level=info msg=GetUsage
10.244.0.157 - - [26/Feb/2019:06:31:07 +0000] "GET /v1/replicas/1/volusage HTTP/1.1" 200 181
time="2019-02-26T06:32:07Z" level=info msg="Get Volume Usage"
time="2019-02-26T06:32:07Z" level=info msg=GetUsage
10.244.0.157 - - [26/Feb/2019:06:32:07 +0000] "GET /v1/replicas/1/volusage HTTP/1.1" 200 181
time="2019-02-26T06:33:07Z" level=info msg="Get Volume Usage"
time="2019-02-26T06:33:07Z" level=info msg=GetUsage
10.244.0.157 - - [26/Feb/2019:06:33:07 +0000] "GET /v1/replicas/1/volusage HTTP/1.1" 200 181
time="2019-02-26T06:34:07Z" level=info msg="Get Volume Usage"
time="2019-02-26T06:34:07Z" level=info msg=GetUsage
10.244.0.157 - - [26/Feb/2019:06:34:07 +0000] "GET /v1/replicas/1/volusage HTTP/1.1" 200 181
10.244.0.157 - - [26/Feb/2019:06:35:07 +0000] "GET /v1/replicas/1/volusage HTTP/1.1" 200 181
time="2019-02-26T06:35:07Z" level=info msg="Get Volume Usage"
time="2019-02-26T06:35:07Z" level=info msg=GetUsage

@utkarshmani1997
Member

@Yaxian is it possible to get the complete logs? The logs above are not helping us.

@Yaxian

Yaxian commented Feb 26, 2019

@utkarshmani1997
Member

utkarshmani1997 commented Feb 27, 2019

@Yaxian Thank you for the logs. I just looked at the controller logs: the replica got disconnected, but the controller could not remove it. This is a known issue: #2403. BTW, which version of OpenEBS are you using? We recently made a few improvements related to the RPC connection in 0.8.1.

Can you provide us the stack trace of the controller, if possible?
curl http://<CONTROLLER_IP>:9501/debug/pprof/goroutine?debug=2
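
To locate the controller for that curl (a sketch; the controller pod IP, or the ctrl service ClusterIP shown in the PV, should both reach port 9501):

kubectl get pods -o wide | grep pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac-ctrl
curl http://<CONTROLLER_IP>:9501/debug/pprof/goroutine?debug=2 > goroutine.log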

@Yaxian

Yaxian commented Feb 27, 2019

@utkarshmani1997 Thank you for replying.

The version is 'openebs-0.8.1', installed via Helm.

Here is the goroutine.log.

Thank you again.

@utkarshmani1997
Member

utkarshmani1997 commented Feb 27, 2019

@Yaxian did you try restarting the controller? Just let me know if that helps.
Also, the logs indicate that you are using a version older than 0.8.1, since new log messages were added in 0.8.1. Are you providing the correct version details of the faulty pods?
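
Restarting the controller is just a pod delete (a sketch; the controller is managed by a Deployment/ReplicaSet, so a new pod is created automatically):

kubectl delete pod pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac-ctrl-bf59c68c-kksg7
kubectl get pods -w | grep pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac-ctrl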

@Yaxian

Yaxian commented Feb 27, 2019

Sorry, I made a mistake. The version is 0.8.0. Here are the images.

quay.io/openebs/cstor-pool-mgmt                                                  0.8.0                     851450de4386        2 months ago        363 MB
quay.io/openebs/m-exporter                                                       0.8.0                     efaf0ed0fbc4        2 months ago        24 MB
quay.io/openebs/snapshot-provisioner                                             0.8.0                     7af50c8bbc82        2 months ago        80.2 MB
quay.io/openebs/snapshot-controller                                              0.8.0                     2d7c1f622221        2 months ago        79.1 MB
quay.io/openebs/openebs-k8s-provisioner                                          0.8.0                     8d2a2fba2a3e        2 months ago        158 MB
quay.io/openebs/jiva                                                             0.8.0                     9760e1813d75        2 months ago        229 MB
quay.io/openebs/cstor-pool                                                       0.8.0                     1d8d23939061        2 months ago        327 MB
quay.io/openebs/node-disk-manager-amd64                                          v0.2.0                    4e876a9db129        4 months ago        154 MB
quay.io/openebs/openebs-tools                                                    3.8                       196d12cf6ab1        5 months ago        4.41 MB

I will try updating to 0.8.1.

@utkarshmani1997
Member

No problem @Yaxian

@Yaxian

Yaxian commented Feb 28, 2019

Hi @utkarshmani1997,

I have updated the ctrl version to 0.8.1:

Controlled By:      ReplicaSet/pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac-ctrl-68cfc79cf6
Containers:
  pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac-ctrl-con:
    Container ID:  docker://f578c821f99a4ee5b69e2773d0fff6ec854feb1e3c8dc787376e875a9c9e370a
    Image:         quay.io/openebs/jiva:0.8.1
    Image ID:      docker-pullable://quay.io/openebs/jiva@sha256:b40197d519ef0fa0dbcee47afa091ef46744705813553f774debaf2d75561814
    Ports:         3260/TCP, 9501/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      launch
    Args:
      controller
      --frontend
      gotgt
      --clusterIP
      10.107.222.18
      pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac
    State:          Running
      Started:      Fri, 01 Mar 2019 07:12:15 +0800
    Ready:          True
    Restart Count:  0
    Environment:
      REPLICATION_FACTOR:  1
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-gcrkp (ro)
  maya-volume-exporter:
    Container ID:  docker://13cf1f9c0edd0d2900d5a13a195a520dc7c5f0fea256bb3256529c064b2de493
    Image:         quay.io/openebs/m-exporter:0.8.1
    Image ID:      docker-pullable://quay.io/openebs/m-exporter@sha256:d260f02ecbaac41a1b6d27bf6f1f68f09b0c5ca3968bbeea5653c55e448df745
    Port:          9500/TCP
    Host Port:     0/TCP
    Command:
      maya-exporter
    Args:
      -c=http://127.0.0.1:9501
    State:          Running
      Started:      Fri, 01 Mar 2019 07:12:15 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-gcrkp (ro)

and deleted the ctrl pod.

Here is the goroutine.log.

Should I delete the deployment, svc, pv, pvc of pvc-d9810127-378b-11e9-a3ee-00163e0ec7ac?

Thank you.

@kmova
Member

kmova commented Mar 1, 2019

@Yaxian - Did you upgrade using the steps mentioned here: https://docs.openebs.io/docs/next/upgrade.html

(You don't need to delete anything. The steps mentioned in the above docs will update the 0.8.0 components to 0.8.1.)

@Yaxian

Yaxian commented Mar 3, 2019

Yes.

./cstor_pool_upgrade.sh cstor-sparse-pool openebs

Patching the csp resource
Patching Pool Deployment with new image
Patching the SP resource
Successfully upgraded cstor-sparse-pool to 0.8.1
Running post pool upgrade scripts for cstor-sparse-pool...
error: unable to upgrade connection: container not found ("cstor-pool-mgmt")
Error: failed to set quorum for pool cstor-f10eac0f-175b-11e9-a3ee-00163e0ec7ac
error: unable to upgrade connection: container not found ("cstor-pool-mgmt")
ERROR: while executing zfs get quorum for pool cstor-f10eac0f-175b-11e9-a3ee-00163e0ec7ac, error: 1
error: unable to upgrade connection: container not found ("cstor-pool-mgmt")
Error: failed to set quorum for pool cstor-f10f5276-175b-11e9-a3ee-00163e0ec7ac
error: unable to upgrade connection: container not found ("cstor-pool-mgmt")
ERROR: while executing zfs get quorum for pool cstor-f10f5276-175b-11e9-a3ee-00163e0ec7ac, error: 1
error: unable to upgrade connection: container not found ("cstor-pool-mgmt")
Error: failed to set quorum for pool cstor-f10e1684-175b-11e9-a3ee-00163e0ec7ac
error: unable to upgrade connection: container not found ("cstor-pool-mgmt")
ERROR: while executing zfs get quorum for pool cstor-f10e1684-175b-11e9-a3ee-00163e0ec7ac, error: 1
Post upgrade for cstor-sparse-pool is failed.
Please retry by runnig ././cstor_pool_post_upgrade.sh cstor-sparse-pool openebs
Post upgrade of cstor-sparse-pool is done successfully to 0.8.1 Please run volume upgrade scripts.

@vishnuitta
Member

hi @Yaxian .. sorry for the delayed response. I notice that this got missed.
We do have a Slack channel https://openebs-community.slack.com/messages where the community helps, and responses there tend to be faster than here. Would you like to join?

@AmitKumarDas
Member

@Yaxian can you please join the Slack by following the link below:

https://github.com/openebs/openebs/blob/master/community/README.md

@kmova
Member

kmova commented Mar 15, 2019

Thanks @Yaxian for helping resolve this issue. (Thanks @halcyon for the tip above.)

The issue got resolved after installing xfsprogs on the application node. This was with OpenEBS 0.8.0.

apt install xfsprogs

The kubelet had the following errors during the mount process:

kubelet[687]: I0315 15:14:54.179765     687 mount_linux.go:453] `fsck` error fsck from util-linux 2.27.1
kubelet[687]: fsck.ext2: Bad magic number in super-block while trying to open /dev/sdn
kubelet[687]: /dev/sdn:
kubelet[687]: The superblock could not be read or does not describe a valid ext2/ext3/ext4
kubelet[687]: filesystem.  If the device is valid and it really contains an ext2/ext3/ext4

And dmesg was showing errors like:

[5985377.220132] XFS (sdn): Invalid superblock magic number
[5985377.306931] XFS (sdn): Invalid superblock magic number
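
A quick way to confirm the node can handle an xfs volume (a sketch; run it on the node where the application pod is scheduled):

which mkfs.xfs fsck.xfs     # both binaries come from xfsprogs
apt install xfsprogs        # install them if they are missing, then retry the mount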

@Yaxian

Yaxian commented Mar 15, 2019

It may happen when FSType is set to xfs in the StorageClass.
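
For context, this is roughly how the filesystem type is requested in a jiva StorageClass (a minimal sketch; the cas-config keys may differ slightly between OpenEBS versions, and mongo-pv-az is the StorageClass name used earlier in this thread):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mongo-pv-az
  annotations:
    openebs.io/cas-type: jiva
    # FSType and ReplicaCount are CAS policies read by the OpenEBS provisioner
    cas.openebs.io/config: |
      - name: FSType
        value: xfs
      - name: ReplicaCount
        value: "1"
provisioner: openebs.io/provisioner-iscsi

Whatever sets FSType to xfs, mkfs.xfs/fsck.xfs from xfsprogs must be present on every node that can mount the volume.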

kmova added a commit to openebs-archive/openebs-docs that referenced this issue Mar 20, 2019
@ranjithwingrider
Member

@riponbanik

Thanks to everyone who did the troubleshooting. In my case, I found that if the iSCSI tool is not pre-installed on the node, the controller pod fails to listen on port 3260 even after the tool is installed.

root@pvc-20149e4c-292e-4d8f-b9f3-139c2a03b2a3-ctrl-5bfb77ccf5-5gbjt:/# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp6 0 0 :::9500 :::* LISTEN -
tcp6 0 0 :::9501 :::* LISTEN 1/longhorn
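
If you hit this, one recovery path (a sketch, assuming Debian/Ubuntu nodes and that the controller runs under a Deployment) is to install the initiator on the node and then recreate the controller pod so it can bind 3260 again:

apt install open-iscsi        # iscsi-initiator-utils on RHEL/CentOS
systemctl enable --now iscsid
kubectl delete pod pvc-20149e4c-292e-4d8f-b9f3-139c2a03b2a3-ctrl-5bfb77ccf5-5gbjt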

@noyessie

Please, I have the same issue. When I run dmesg, this is my output:

[12497.347553] blk_update_request: I/O error, dev sdc, sector 70920 op 0x1:(WRITE) flags 0x800 phys_seg 6 prio class 0
[12497.354007] Buffer I/O error on dev sdc, logical block 8865, lost async page write
[12497.358397] Buffer I/O error on dev sdc, logical block 8866, lost async page write
[12497.362752] Buffer I/O error on dev sdc, logical block 8867, lost async page write
[12497.367255] Buffer I/O error on dev sdc, logical block 8868, lost async page write
[12497.371691] Buffer I/O error on dev sdc, logical block 8869, lost async page write
[12498.344212] sd 4:0:0:0: [sdc] tag#6 timing out command, waited 180s
[12498.348436] sd 4:0:0:0: [sdc] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[12498.348446] sd 4:0:0:0: [sdc] tag#6 CDB: Write(10) 2a 00 00 04 00 00 00 00 10 00
[12498.348454] blk_update_request: I/O error, dev sdc, sector 262144 op 0x1:(WRITE) flags 0x800 phys_seg 2 prio class 0
[12498.354100] buffer_io_error: 1 callbacks suppressed
[12498.354113] Buffer I/O error on dev sdc, logical block 32768, lost async page write
[12498.358328] Buffer I/O error on dev sdc, logical block 32769, lost async page write
[12499.344950] sd 4:0:0:0: [sdc] tag#7 timing out command, waited 180s
[12499.350506] sd 4:0:0:0: [sdc] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[12499.350514] sd 4:0:0:0: [sdc] tag#7 CDB: Write(10) 2a 00 00 0c 00 00 00 00 10 00
[12499.350519] blk_update_request: I/O error, dev sdc, sector 786432 op 0x1:(WRITE) flags 0x800 phys_seg 2 prio class 0
[12499.357107] Buffer I/O error on dev sdc, logical block 98304, lost async page write
[12499.361597] Buffer I/O error on dev sdc, logical block 98305, lost async page write
[12500.346264] sd 4:0:0:0: [sdc] tag#8 timing out command, waited 180s
[12500.351507] sd 4:0:0:0: [sdc] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[12500.351518] sd 4:0:0:0: [sdc] tag#8 CDB: Write(10) 2a 00 00 14 00 00 00 00 10 00
[12500.351528] blk_update_request: I/O error, dev sdc, sector 1310720 op 0x1:(WRITE) flags 0x800 phys_seg 2 prio class 0
[12500.358211] Buffer I/O error on dev sdc, logical block 163840, lost async page write
[12500.363707] Buffer I/O error on dev sdc, logical block 163841, lost async page write
[12501.347321] sd 4:0:0:0: [sdc] tag#9 timing out command, waited 180s
[12501.352639] sd 4:0:0:0: [sdc] tag#9 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[12501.352656] sd 4:0:0:0: [sdc] tag#9 CDB: Write(10) 2a 00 00 1c 00 00 00 00 10 00
[12501.352671] blk_update_request: I/O error, dev sdc, sector 1835008 op 0x1:(WRITE) flags 0x800 phys_seg 2 prio class 0
[12501.359322] Buffer I/O error on dev sdc, logical block 229376, lost async page write
[12501.364827] Buffer I/O error on dev sdc, logical block 229377, lost async page write
[12502.349078] sd 4:0:0:0: [sdc] tag#10 timing out command, waited 180s
[12502.354925] sd 4:0:0:0: [sdc] tag#10 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[12502.354949] sd 4:0:0:0: [sdc] tag#10 CDB: Write(10) 2a 00 00 24 00 00 00 00 10 00
[12502.354960] blk_update_request: I/O error, dev sdc, sector 2359296 op 0x1:(WRITE) flags 0x800 phys_seg 2 prio class 0
[12502.364084] Buffer I/O error on dev sdc, logical block 294912, lost async page write
[12502.370920] Buffer I/O error on dev sdc, logical block 294913, lost async page write
[12503.369999] sd 4:0:0:0: [sdc] tag#1 timing out command, waited 180s
[12503.389548] sd 4:0:0:0: [sdc] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[12503.389558] sd 4:0:0:0: [sdc] tag#1 CDB: Write(10) 2a 00 00 00 00 00 00 0a 00 00
[12503.389572] blk_update_request: I/O error, dev sdc, sector 0 op 0x1:(WRITE) flags 0x4800 phys_seg 320 prio class 0
[12503.394808] Buffer I/O error on dev sdc, logical block 0, lost async page write
[12503.399033] Buffer I/O error on dev sdc, logical block 1, lost async page write
[12503.403316] Buffer I/O error on dev sdc, logical block 2, lost async page write
[12503.407690] Buffer I/O error on dev sdc, logical block 3, lost async page write
[12503.411903] Buffer I/O error on dev sdc, logical block 4, lost async page write
[12503.416210] Buffer I/O error on dev sdc, logical block 5, lost async page write
[12503.421254] Buffer I/O error on dev sdc, logical block 6, lost async page write
[12503.425704] Buffer I/O error on dev sdc, logical block 7, lost async page write
[12503.429887] Buffer I/O error on dev sdc, logical block 8, lost async page write
[12503.433855] Buffer I/O error on dev sdc, logical block 9, lost async page write
[12504.387007] sd 4:0:0:0: [sdc] tag#16 timing out command, waited 180s
[12504.391792] sd 4:0:0:0: [sdc] tag#16 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[12504.391810] sd 4:0:0:0: [sdc] tag#16 CDB: Write(10) 2a 00 00 00 0a 00 00 0a 00 00
[12504.391820] blk_update_request: I/O error, dev sdc, sector 2560 op 0x1:(WRITE) flags 0x4800 phys_seg 320 prio class 0

Any idea?
