This repository has been archived by the owner on Oct 16, 2020. It is now read-only.

[1632.2.1] kernel deathspin on iSCSI session logout without deleting scsi devices before #2357

Closed
redbaron opened this issue Feb 13, 2018 · 35 comments

Comments

@redbaron

Bug

Container Linux Version

$ cat /etc/os-release
DISTRIB_ID="Container Linux by CoreOS"
DISTRIB_RELEASE=1632.2.1
DISTRIB_CODENAME="Ladybug"
DISTRIB_DESCRIPTION="Container Linux by CoreOS 1632.2.1 (Ladybug)"

Environment

OEM=vmware_raw

iSCSI mounted disks

Actual Behavior

No new containers can be started. Already-running containers seem to be fine.

Other Information

$ docker ps -a
CONTAINER ID        IMAGE                                      COMMAND                  CREATED              STATUS                     PORTS               NAMES
b52baae24f14        gcr.io/google_containers/pause-amd64:3.0   "/pause"                 About a minute ago   Created                                        k8s_POD_apps-sigpage-dev-email-templat-vgt40i-minio-7458565f84-zxhd4_dev_e1f31859-10d8-11e8-9e43-005056bced2e_5
$ dmesg
[18310.079477] Workqueue: netns cleanup_net
[18310.079551] Call Trace:
[18310.079628]  ? __schedule+0x2e3/0x890
[18310.079699]  schedule+0x32/0x80
[18310.079761]  schedule_preempt_disabled+0xa/0x10
[18310.079844]  __mutex_lock.isra.4+0x29a/0x510
[18310.079958]  ? cleanup_net+0x99/0x2c0
[18310.080029]  cleanup_net+0x99/0x2c0
[18310.080099]  process_one_work+0x144/0x350
[18310.080174]  worker_thread+0x4d/0x3e0
[18310.080254]  kthread+0xfc/0x130
[18310.080317]  ? rescuer_thread+0x310/0x310
[18310.080405]  ? kthread_park+0x60/0x60
[18310.080476]  ? do_syscall_64+0x59/0x1c0
[18310.080551]  ? SyS_exit_group+0x10/0x10
[18310.080623]  ret_from_fork+0x35/0x40
[18678.714037] INFO: task kworker/u24:4:13862 blocked for more than 120 seconds.
[18678.714176]       Not tainted 4.14.16-coreos #1
[18678.714258] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[18678.714396] kworker/u24:4   D    0 13862      2 0x80000080
[18678.714500] Workqueue: netns cleanup_net
[18678.714576] Call Trace:
[18678.714635]  ? __schedule+0x2e3/0x890
[18678.714707]  schedule+0x32/0x80
[18678.714771]  schedule_preempt_disabled+0xa/0x10
[18678.714854]  __mutex_lock.isra.4+0x29a/0x510
[18678.714935]  ? cleanup_net+0x99/0x2c0
[18678.715027]  cleanup_net+0x99/0x2c0
[18678.715097]  process_one_work+0x144/0x350
[18678.715173]  worker_thread+0x4d/0x3e0
[18678.715243]  kthread+0xfc/0x130
[18678.715305]  ? rescuer_thread+0x310/0x310
[18678.715380]  ? kthread_park+0x60/0x60
[18678.715450]  ? do_syscall_64+0x59/0x1c0
[18678.715523]  ? SyS_exit_group+0x10/0x10
[18678.715595]  ret_from_fork+0x35/0x40
[18801.593885] INFO: task kworker/u24:4:13862 blocked for more than 120 seconds.
[18801.594195]       Not tainted 4.14.16-coreos #1
[18801.594400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[18801.594570] kworker/u24:4   D    0 13862      2 0x80000080
[18801.594665] Workqueue: netns cleanup_net
[18801.594758] Call Trace:
[18801.594810]  ? __schedule+0x2e3/0x890
[18801.594877]  schedule+0x32/0x80
[18801.594935]  schedule_preempt_disabled+0xa/0x10
[18801.595011]  __mutex_lock.isra.4+0x29a/0x510
[18801.595083]  ? cleanup_net+0x99/0x2c0
[18801.595146]  cleanup_net+0x99/0x2c0
[18801.595676]  process_one_work+0x144/0x350
[18801.595861]  worker_thread+0x4d/0x3e0
[18801.595928]  kthread+0xfc/0x130
[18801.596547]  ? rescuer_thread+0x310/0x310
[18801.596617]  ? kthread_park+0x60/0x60
[18801.596692]  ? do_syscall_64+0x59/0x1c0
[18801.596761]  ? SyS_exit_group+0x10/0x10
[18801.596829]  ret_from_fork+0x35/0x40

$ top

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                       
21248 root      20   0       0      0      0 R  99.1  0.0  17:00.98 kworker/u24:2                                                                                                                                 
 4098 root      20   0 1527328 169428  69072 S   2.2  0.5   7:58.11 kubelet                                                                                                                                       
  855 root      20   0 1214240  25560   8160 S   1.1  0.1   0:21.64 containerd                                                                                                                                    
 4839 root      20   0   47324  34792  21292 S   1.1  0.1   3:01.51 calico-felix   

Other Information

I'll add more info once I have it. The server is live and I am not rebooting it, so if you want me to collect diagnostics, I can run commands for you.

@redbaron
Author

redbaron commented Feb 13, 2018

$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
$ grep '21248' /sys/kernel/debug/tracing/trace_pipe 
   kworker/u24:2-21248 [003] d.s. 19533.130008: workqueue_queue_work: work struct=ffff9e306be24400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e3069befa00 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19533.641997: workqueue_queue_work: work struct=ffff9e3069a9bc00 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6c1ed800 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19533.642005: workqueue_queue_work: work struct=ffff9e306bcec400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6ca18000 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19538.249949: workqueue_queue_work: work struct=ffff9e306be24400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e3069befa00 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19538.761944: workqueue_queue_work: work struct=ffff9e3069a9bc00 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6c1ed800 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19538.761956: workqueue_queue_work: work struct=ffff9e306bcec400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6ca18000 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19538.761960: workqueue_queue_work: work struct=ffff9e30691a9400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6ca1b600 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.h. 19540.445322: workqueue_queue_work: work struct=ffff9e296eeaf918 function=cfq_kick_queue workqueue=ffff9e2c6da09a00 req_cpu=512 cpu=3
   kworker/u24:2-21248 [003] d.s. 19543.369902: workqueue_queue_work: work struct=ffff9e306be24400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e3069befa00 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19543.881890: workqueue_queue_work: work struct=ffff9e3069a9bc00 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6c1ed800 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19543.881901: workqueue_queue_work: work struct=ffff9e30691a9400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6ca1b600 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19543.881903: workqueue_queue_work: work struct=ffff9e306bcec400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6ca18000 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.h. 19544.457798: workqueue_queue_work: work struct=ffff9e296eeaf918 function=cfq_kick_queue workqueue=ffff9e2c6da09a00 req_cpu=512 cpu=3
   kworker/u24:2-21248 [003] d.s. 19546.441862: workqueue_queue_work: work struct=ffffffffc0206580 function=dm_bufio_prefetch [dm_bufio] workqueue=ffff9e296e06f200 req_cpu=512 cpu=3
   kworker/u24:2-21248 [003] d.s. 19548.489844: workqueue_queue_work: work struct=ffff9e306be24400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e3069befa00 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19549.001838: workqueue_queue_work: work struct=ffff9e3069a9bc00 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6c1ed800 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19549.001855: workqueue_queue_work: work struct=ffff9e30691a9400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6ca1b600 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19549.001858: workqueue_queue_work: work struct=ffff9e306bcec400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6ca18000 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19553.609797: workqueue_queue_work: work struct=ffff9e306be24400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e3069befa00 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19554.121785: workqueue_queue_work: work struct=ffff9e3069a9bc00 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6c1ed800 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19554.121797: workqueue_queue_work: work struct=ffff9e30691a9400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6ca1b600 req_cpu=512 cpu=4294967295
   kworker/u24:2-21248 [003] d.s. 19554.121798: workqueue_queue_work: work struct=ffff9e306bcec400 function=iscsi_requeue_task [libiscsi] workqueue=ffff9e2c6ca18000 req_cpu=512 cpu=4294967295

@redbaron
Author

$ echo l > /proc/sysrq-trigger

It shows that PID 21248 is the one spinning:

[19806.968294] CPU: 3 PID: 21248 Comm: kworker/u24:2 Not tainted 4.14.16-coreos #1
[19806.968295] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015
[19806.968302] Workqueue: scsi_wq_3 iscsi_free_session [scsi_transport_iscsi]
[19806.968302] task: ffff9e2c6de19e40 task.stack: ffffbd9711270000
[19806.968305] RIP: 0010:mutex_lock+0x1e/0x30
[19806.968305] RSP: 0018:ffffbd9711273de0 EFLAGS: 00000246
[19806.968306] RAX: 0000000000000000 RBX: ffff9e2c6d827040 RCX: 0000000000000000
[19806.968307] RDX: ffff9e2c6de19e40 RSI: 0000000000000206 RDI: ffff9e2c6d827040
[19806.968307] RBP: ffff9e2c6c3a9000 R08: 0000000000000001 R09: 0000000000000000
[19806.968308] R10: ffffbd9703263e38 R11: 0000000000000358 R12: ffff9e2c6d827000
[19806.968308] R13: ffff9e306be25128 R14: ffff9e2c6c3a9000 R15: ffff9e2c6d827010
[19806.968309] FS:  0000000000000000(0000) GS:ffff9e2c6fcc0000(0000) knlGS:0000000000000000
[19806.968309] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[19806.968310] CR2: 000000c420276748 CR3: 000000012d00a003 CR4: 00000000001606e0
[19806.968333] Call Trace:
[19806.968347]  scsi_remove_device+0x19/0x60 [scsi_mod]
[19806.968354]  scsi_remove_target+0x167/0x1b0 [scsi_mod]
[19806.968355]  iscsi_free_session+0x383/0x430 [scsi_transport_iscsi]
[19806.968366]  process_one_work+0x144/0x350
[19806.968367]  worker_thread+0x4d/0x3e0
[19806.968369]  kthread+0xfc/0x130
[19806.968370]  ? rescuer_thread+0x310/0x310
[19806.968371]  ? kthread_park+0x60/0x60
[19806.968372]  ? do_syscall_64+0xe9/0x1c0
[19806.968374]  ret_from_fork+0x35/0x40

@redbaron
Author

Somewhat related https://patchwork.kernel.org/patch/7987711/

@redbaron
Author

Differences from the "stable" nodes:

  1. Container Linux 1632.2.1 vs 1576.3.0
  2. Kubelet 1.9.3 vs 1.8.4

It seems I can reliably reproduce the test case. All that's needed is to run a pod with a PersistentVolumeClaim, which dynamically provisions a PersistentVolume (we use netapp/trident for that), then delete the pod so that no more iSCSI volumes are in use on the server. AFAIK the kubelet then triggers an iSCSI session disconnect, which might be triggering a bug in the kernel.
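The repro steps above could be driven by a minimal manifest along these lines (all names are hypothetical, and the Trident-backed storage class is an assumption — adjust to the cluster's actual class):

```yaml
# Hypothetical repro: a dynamically provisioned PVC plus a pod that
# mounts it. Creating both, then deleting them, exercises kubelet's
# iSCSI attach/logout path described above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: iscsi-repro-pvc            # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
  storageClassName: trident-iscsi  # assumed Trident-backed class
---
apiVersion: v1
kind: Pod
metadata:
  name: iscsi-repro-pod            # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: iscsi-repro-pvc
```

Apply it, wait for the PV to bind and the pod to start, then delete the pod and PVC while watching dmesg on the node.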

@redbaron
Author

$ echo 1 > /sys/module/libiscsi/parameters/debug_libiscsi_conn
$ echo 1 > /sys/module/libiscsi/parameters/debug_libiscsi_eh
$ echo 1 > /sys/module/libiscsi/parameters/debug_libiscsi_session

This doesn't show any iSCSI commands that differ from a freshly started and cordoned node (i.e. no iSCSI volumes mounted, but iSCSI discovery completed and the disks visible in multipath -ll):

$ dmesg -wH
...

[  +5.110978]  connection3:0: iscsi_check_transport_timeouts Sending nopout as ping
[  +0.000157]  connection3:0: iscsi_check_transport_timeouts Setting next tmo 4296848776
[  +0.000155]  connection1:0: iscsi_check_transport_timeouts Sending nopout as ping
[  +0.000136]  connection1:0: iscsi_check_transport_timeouts Setting next tmo 4296848776
[  +0.000140]  connection4:0: iscsi_check_transport_timeouts Sending nopout as ping
[  +0.000282]  connection4:0: iscsi_check_transport_timeouts Setting next tmo 4296848776
[  +0.000169]  connection2:0: iscsi_check_transport_timeouts Sending nopout as ping
[  +0.000174]  connection2:0: iscsi_check_transport_timeouts Setting next tmo 4296848777
[  +0.000166]  session3: iscsi_prep_mgmt_task mgmtpdu [op 0x0 hdr->itt 0x1e datalen 0]
[  +0.000008]  session1: iscsi_prep_mgmt_task mgmtpdu [op 0x0 hdr->itt 0x21 datalen 0]
[  +0.000008]  session4: iscsi_prep_mgmt_task mgmtpdu [op 0x0 hdr->itt 0x59 datalen 0]
[  +0.000005]  session2: iscsi_prep_mgmt_task mgmtpdu [op 0x0 hdr->itt 0x2e datalen 0]
[  +0.000665]  session4: __iscsi_complete_pdu [op 0x20 cid 0 itt 0x59 len 0]
[  +0.000143]  session4: iscsi_complete_task complete task itt 0x59 state 3 sc           (null)
[  +0.000152]  session4: iscsi_free_task freeing task itt 0x59 state 1 sc           (null)
[  +0.000170]  session1: __iscsi_complete_pdu [op 0x20 cid 0 itt 0x21 len 0]
[  +0.000123]  session1: iscsi_complete_task complete task itt 0x21 state 3 sc           (null)
[  +0.000151]  session1: iscsi_free_task freeing task itt 0x21 state 1 sc           (null)
[  +0.000201]  session2: __iscsi_complete_pdu [op 0x20 cid 0 itt 0x2e len 0]
[  +0.000154]  session2: iscsi_complete_task complete task itt 0x2e state 3 sc           (null)
[  +0.000154]  session2: iscsi_free_task freeing task itt 0x2e state 1 sc           (null)
[  +0.000237]  session3: __iscsi_complete_pdu [op 0x20 cid 0 itt 0x1e len 0]
[  +0.000123]  session3: iscsi_complete_task complete task itt 0x1e state 3 sc           (null)
[  +0.000152]  session3: iscsi_free_task freeing task itt 0x1e state 1 sc           (null)
[  +0.004162]  connection1:0: iscsi_check_transport_timeouts Setting next tmo 4296848778
[  +0.000148]  connection4:0: iscsi_check_transport_timeouts Setting next tmo 4296848778
[  +0.000850]  connection3:0: iscsi_check_transport_timeouts Setting next tmo 4296848780
[  +0.000144]  connection2:0: iscsi_check_transport_timeouts Setting next tmo 4296848779

@redbaron
Author

Enabling full SCSI logging (from https://github.com/hreinecke/sg3_utils/blob/master/scripts/scsi_logging_level):

$ echo 460175067 > /sys/module/scsi_mod/parameters/scsi_logging_level

shows no commands going to the iSCSI devices, just occasional sda writes, so this is not some sort of retry or SAN error.
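For reference, 460175067 is octal 03333333333 — ten 3-bit fields all set to level 3, one per SCSI logging facility. A quick decode sketch (the facility order is my assumption, following the shift order I recall from the kernel's include/scsi/scsi_logging.h):

```shell
#!/bin/bash
# Decode scsi_mod's scsi_logging_level parameter: ten 3-bit fields,
# one per logging facility, starting at bit 0.
level=460175067
names=(error timeout scan mlqueue mlcomplete llqueue llcomplete hlqueue hlcomplete ioctl)
for i in "${!names[@]}"; do
  printf '%-11s level %d\n' "${names[$i]}" $(( (level >> (i * 3)) & 7 ))
done
```

Every facility comes out at level 3, i.e. the "log everything" setting the sg3_utils script produces.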

@redbaron
Author

kubelet errors:

$ journalctl -b0  -u kubelet |grep iscsi.*go:
Feb 13 19:39:38 erepnk13 kubelet-wrapper[4119]: E0213 19:39:38.596868    4119 iscsi_util.go:338] iscsi: failed to mount iscsi volume /dev/dm-9 [ext4] to /var/lib/kubelet/plugins/kubernetes.io/iscsi/iface-default/10.1.72.201:3260-iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13-lun-9, error mount failed: exit status 32
Feb 13 19:39:54 erepnk13 kubelet-wrapper[4119]: E0213 19:39:54.355119    4119 iscsi_util.go:338] iscsi: failed to mount iscsi volume /dev/dm-14 [ext4] to /var/lib/kubelet/plugins/kubernetes.io/iscsi/iface-default/10.1.72.201:3260-iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13-lun-12, error mount failed: exit status 32
Feb 13 19:42:26 erepnk13 kubelet-wrapper[4119]: W0213 19:42:26.055943    4119 iscsi_util.go:353] Warning: Unmount skipped because path does not exist: /var/lib/kubelet/plugins/kubernetes.io/iscsi/iface-/dev-apps-sigpage-dev-email-templat-qbf8m5-minio-9ec04:3260-dev-apps-sigpage-dev-email-templat-qbf8m5-minio-9ec04-lun-0
Feb 13 19:42:29 erepnk13 kubelet-wrapper[4119]: I0213 19:42:29.127381    4119 iscsi_util.go:408] iscsi: log out target 10.1.72.201:3260 iqn iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13 iface default
Feb 13 19:42:29 erepnk13 kubelet-wrapper[4119]: W0213 19:42:29.664023    4119 iscsi_util.go:353] Warning: Unmount skipped because path does not exist: /var/lib/kubelet/plugins/kubernetes.io/iscsi/iface-/dev-apps-sigpage-dev-email-templat-qbf8m5-postgresql-9ec8c:3260-dev-apps-sigpage-dev-email-templat-qbf8m5-postgresql-9ec8c-lun-0
Feb 13 19:43:00 erepnk13 kubelet-wrapper[4119]: E0213 19:43:00.953910    4119 iscsi_util.go:338] iscsi: failed to mount iscsi volume /dev/dm-9 [ext4] to /var/lib/kubelet/plugins/kubernetes.io/iscsi/iface-default/10.1.72.201:3260-iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13-lun-9, error mount failed: exit status 32

@redbaron
Author

redbaron commented Feb 13, 2018

Remaining mounts on the host:

/dev/mapper/mpathr /var/lib/rkt/pods/run/9ef1244e-c5c3-4379-ab5a-ab2a88fb03f9/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/plugins/kubernetes.io/iscsi/iface-default/10.1.72.201:3260-iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13-lun-9 ext4 rw,seclabel,relatime,stripe=16,data=ordered 0 0
/dev/mapper/mpathr /var/lib/kubelet/plugins/kubernetes.io/iscsi/iface-default/10.1.72.201:3260-iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13-lun-9 ext4 rw,seclabel,relatime,stripe=16,data=ordered 0 0
/dev/mapper/mpathr /var/lib/rkt/pods/run/9ef1244e-c5c3-4379-ab5a-ab2a88fb03f9/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/pods/13f7b778-10f6-11e8-9e43-005056bced2e/volumes/kubernetes.io~iscsi/dev-apps-sigpage-dev-email-templat-qbf8m5-minio-13f41 ext4 rw,seclabel,relatime,stripe=16,data=ordered 0 0
/dev/mapper/mpathr /var/lib/kubelet/pods/13f7b778-10f6-11e8-9e43-005056bced2e/volumes/kubernetes.io~iscsi/dev-apps-sigpage-dev-email-templat-qbf8m5-minio-13f41 ext4 rw,seclabel,relatime,stripe=16,data=ordered 0 0

The rkt pod is the kubelet.

@redbaron
Author

Running the command which kubelet tries to run:

$ iscsiadm -m node  -p 10.1.72.201 3260 -T iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13 --logout -I default
Logging out of session [sid: 1, target: iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13, portal: 10.1.72.201,3260]
iscsiadm: Could not logout of [sid: 1, target: iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13, portal: 10.1.72.201,3260].
iscsiadm: initiator reported error (9 - internal error)
iscsiadm: Could not logout of all requested sessions

iscsid logs:

Feb 13 20:42:20 erepnk13 iscsid[884]: session in invalid state for logout. Try again later

I deleted all scsi devices from that session manually:

for d in $(iscsiadm -m session -P3 |grep -A 1 scsi3  |awk '/scsi disk/ {print $4}'); do echo 1 > /sys/block/$d/device/delete; done

Now the session looks like the following:

iscsiadm -m session -r 1 -P3
iSCSI Transport Class version 2.0-870
version 2.0-873
Target: iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13
        Current Portal: 10.1.72.201:3260,1046
        Persistent Portal: 10.1.72.201:3260,1046
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.2016-04.com.coreos.iscsi:erepnk13
                Iface IPaddress: 10.1.72.7
                Iface HWaddress: <empty>
                Iface Netdev: <empty>
                SID: 1
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 5
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 65536
                FirstBurstLength: 65536
                MaxBurstLength: 65536
                ImmediateData: Yes
                InitialR2T: No
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 3  State: running

@redbaron
Author

redbaron commented Feb 14, 2018

It seems that nothing even needs to be mounted. I masked kubelet and rebooted the node. A simple logout, iscsiadm -m node -p 10.1.72.201 3260 -T iqn.1992-08.com.netapp:sn.440d6a6ad6a311e79b3500a0981e9162:vs.13 --logout -I default, sends it into a death spiral.

iscsid debug output on logout:

Feb 14 17:21:36 erepnk13 iscsid[872]: poll result 1
Feb 14 17:21:36 erepnk13 iscsid[872]: in kunbind_session
Feb 14 17:21:36 erepnk13 iscsid[872]: in __kipc_call
Feb 14 17:21:36 erepnk13 iscsid[872]: in kwritev
Feb 14 17:21:36 erepnk13 iscsid[872]: in nlpayload_read
Feb 14 17:21:36 erepnk13 iscsid[872]: in nlpayload_read

@redbaron
Author

Logging out from all targets sends 4 kworkers into an infinite spin; same with iscsiadm -m node --logoutall=all.

Found a safe logout sequence which doesn't cause problems and succeeds:

multipath -F
for d in $(iscsiadm -m session -P3 |awk '/scsi disk/ {print $4}'); do echo 1 > /sys/block/$d/device/delete; done
iscsiadm -m node --logoutall=all
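The device-name extraction in step 2 above is just an awk match on the "Attached scsi disk" lines of `iscsiadm -m session -P3` output. As a reusable sketch (the function name is mine, and it reads captured -P3 output from stdin rather than calling iscsiadm):

```shell
#!/bin/bash
# Sketch: pull sd* device names out of `iscsiadm -m session -P3`
# output, the same pattern the deletion loop above relies on.
list_session_disks() {
  awk '/scsi disk/ {print $4}'
}

# Example against a captured fragment of -P3 output:
list_session_disks <<'EOF'
                Host Number: 3  State: running
                        Attached scsi disk sdb          State: running
                        Attached scsi disk sdc          State: running
EOF
# prints: sdb
#         sdc
```

Each printed name can then be fed to /sys/block/$d/device/delete as in the loop above.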

@redbaron redbaron changed the title [1632.2.1] Docker containers stuck in "Created" + "Workqueue: netns cleanup_net" hung task [1632.2.1] kernel deathspin on iSCSI session logout without deleting scsi devices before Feb 15, 2018
@redbaron
Author

Asked for help in open-iscsi maillist: https://groups.google.com/forum/#!topic/open-iscsi/Tc6ERb1QOBg

@redbaron
Author

"Bisecting":

  • 1576.3.0 (4.13.16-coreos-r1) - OK
  • 1576.4.0 (4.14.11-coreos) - FAIL

Interesting observation: on 1632.2.1 (didn't try others), after a clean logout (multipath -F, delete the scsi devices, iscsiadm logout):

$ lsmod |grep '^scsi_dh_alua'
scsi_dh_alua           20480  92

which is exactly 23 (number of LUNs) * 4 (number of portals), so even though the scsi devices are gone, the alua reference count didn't change.
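The arithmetic behind that observation, as a one-liner sanity check (the LUN and portal counts are the ones from this setup):

```shell
#!/bin/sh
# 23 LUNs visible through each of 4 portals should account for the
# lsmod use count of 92 seen on scsi_dh_alua above.
luns=23
portals=4
echo $(( luns * portals ))  # prints: 92
```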

@redbaron
Author

Hm, interesting! After masking multipathd, rebooting and coming back, logout works just fine. Without multipathd, the iscsi_tcp module reference count on startup is 4 (== number of iSCSI portals), but with it running it is 92...

@redbaron
Author

With multipathd running, while the kworkers are spinning on CPU, the iscsi_tcp reference count slowly goes down (starting from 188 for some reason :( ):

iscsi_tcp              20480  151

@redbaron
Author

"Bisecting":
1576.3.0 (4.13.16-coreos-r1) - OK
1590.0.0 (4.14.0-rc8-coreos) - FAIL

@redbaron
Author

Reproduced on Fedora. Fedora kernels print more info to dmesg:

https://bugzilla.redhat.com/show_bug.cgi?id=1546228

@redbaron
Author

git bisect points to fbce4d97fd ("scsi: fixup kernel warning during rmmod()"); posted findings to LKML: https://lkml.org/lkml/2018/2/18/89

@redbaron
Author

redbaron commented Feb 19, 2018

Cherry-picking commit 81b6c9998979 ("scsi: core: check for device state in __scsi_remove_target()") on top of 4.14.20 fixes the problem for me.

@dm0-

dm0- commented Feb 19, 2018

Thanks for tracking this down. I'll cherry-pick the commit onto our 4.14.20 branch so it should be in the next releases.

@redbaron
Author

I suppose there's no reason why this patch can't be cherry-picked into the 4.14 stable branch by upstream. I'll ask them; let's see what they say.

@redbaron
Author

@dm0- the patch was pulled into the 4.14-stable branch and will be released in 4.14.21: https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/?id=a9ca15212fc81273f4f9f5efdececda0d4c06808

@dm0-

dm0- commented Feb 21, 2018

Thanks, it looks like it will be released Friday, so I will update to it then to get it into next week's alpha.

@redbaron
Author

Any chance the kernel can be updated in stable without the rest of the updates?

@dm0-

dm0- commented Feb 21, 2018

Yes, we update the LTS kernel in all channels. The next stable minor release isn't scheduled, though, so the kernel update will be queued for release whenever there is an important security issue or bug fix that warrants it. (Unless a beta gets promoted to stable first, which happens in about five weeks.)

@redbaron
Author

@dm0- , is there an easy way to rebuild stable image with 4.14.21 kernel locally?

@dm0-

dm0- commented Feb 26, 2018

You could run this in the SDK:

repo init -b build-1632
repo sync
./build_packages
./build_image prod

Then use the image_to_vm script if necessary to build a platform-specific image format.

@dm0-

dm0- commented Feb 26, 2018

Oh, and I forgot our test images are public, so you could download this image (which also has a signature): http://builds.developer.core-os.net/boards/amd64-usr/1632.3.0%2Bjenkins2-build-1632%2Blocal-1195/coreos_production_image.bin.bz2 . Note that it is stable with 4.14.22.

@redbaron
Author

Interesting! So I don't need to build anything? :) Where is the full index of test images? http://builds.developer.core-os.net/boards/amd64-usr/ doesn't list 1632.3.0, even though your full URL works.

@dm0-

dm0- commented Feb 26, 2018

Sorry, indexes are not generated for the development bucket, and unfortunately the directory names are unpredictable due to Jenkins build numbers being included in the OS version. There also is no easy way to determine the changes in a development build without access to an internal CI interface. (You could technically look up the build number in https://github.com/coreos/manifest-builds/tags and trace back manifest commits if you were so determined.)

Images for production releases such as 1632.3.0 are in a different bucket (which does generate indexes): https://stable.release.core-os.net/amd64-usr/1632.3.0

@euank
Contributor

euank commented Feb 27, 2018

The bucket doesn't have up-to-date html indexes generated, but it's a google-cloud-storage bucket that allows anonymous read.

It can be accessed via the google cloud web console (https://console.cloud.google.com/storage/browser/builds.developer.core-os.net/boards/amd64-usr) or via gsutil ls and such commands.

The console link requires some form of google login, but the api doesn't (e.g. the json list of objects works with no auth).

Of course, none of the above is documented... maybe we should generate indexes again...

@dm0-

dm0- commented Feb 27, 2018

I'm going to close this issue since the fixed kernel version is queued in all channels, and beta and alpha are releasing tomorrow.

@r7vme

r7vme commented Apr 13, 2018

We are hitting exactly the same issue. Big thanks to @redbaron for tracking this down.

@dm0-

dm0- commented Apr 13, 2018

@r7vme Fixed kernels are in every channel now, so you might be seeing a different problem. Do you have error messages or backtraces?

@r7vme

r7vme commented Apr 16, 2018

@dm0- We are on the older kernel 4.14.11-coreos. Yes, we are planning to upgrade to the latest stable.
