self-hosting hang/deadlock (zpool on zvol) #4694

Closed
nigoroll opened this issue May 24, 2016 · 14 comments
Labels
Component: Test Suite Indicates an issue with the test framework or a test case

Comments

@nigoroll

nigoroll commented May 24, 2016

On Solaris-ish systems, ZFS is capable of self-hosting, i.e. creating zpools within other zpools.

For me, this leads to a hang/deadlock:

zfs create -ps -V 1t pool/t/vol
zpool create newpool /dev/zvol/pool/t/vol &
sleep 5;
zpool list
  • debian 8.4
  • Linux haggis 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux
  • zfs-dkms 0.6.5.7-8-jessie
May 24 16:07:53 haggis kernel: [23285.684420] systemd-udevd   D ffff880468c9d908     0   371      1 0x00000000
May 24 16:07:53 haggis kernel: [23285.684423]  ffff880468c9d4b0 0000000000000086 0000000000012f00 ffff88046c06bfd8
May 24 16:07:53 haggis kernel: [23285.684425]  0000000000012f00 ffff880468c9d4b0 ffffffffa0d7cb20 ffff88046c06ba90
May 24 16:07:53 haggis kernel: [23285.684427]  ffffffffa0d7cb24 ffff880468c9d4b0 00000000ffffffff ffffffffa0d7cb28
May 24 16:07:53 haggis kernel: [23285.684428] Call Trace:
May 24 16:07:53 haggis kernel: [23285.684440]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:07:53 haggis kernel: [23285.684442]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:07:53 haggis kernel: [23285.684445]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:07:53 haggis kernel: [23285.684465]  [<ffffffffa0bf4e6e>] ? spa_open_common+0x4e/0x460 [zfs]
May 24 16:07:53 haggis kernel: [23285.684470]  [<ffffffffa0b622ef>] ? tsd_hash_dtor+0x6f/0x80 [spl]
May 24 16:07:53 haggis kernel: [23285.684483]  [<ffffffffa0bd860e>] ? dsl_pool_hold+0x1e/0x50 [zfs]
May 24 16:07:53 haggis kernel: [23285.684494]  [<ffffffffa0bb5842>] ? dmu_objset_hold+0x22/0xb0 [zfs]
May 24 16:07:53 haggis kernel: [23285.684505]  [<ffffffffa0bd8d9e>] ? dsl_prop_get+0x2e/0x80 [zfs]
May 24 16:07:53 haggis kernel: [23285.684517]  [<ffffffffa0c54a8a>] ? zvol_open+0x15a/0x290 [zfs]
May 24 16:07:53 haggis kernel: [23285.684521]  [<ffffffff811dd43c>] ? __blkdev_get+0xcc/0x480
May 24 16:07:53 haggis kernel: [23285.684523]  [<ffffffff811ddb40>] ? blkdev_get_by_dev+0x40/0x40
May 24 16:07:53 haggis kernel: [23285.684524]  [<ffffffff811dd9a6>] ? blkdev_get+0x1b6/0x310
May 24 16:07:53 haggis kernel: [23285.684526]  [<ffffffff811ddb40>] ? blkdev_get_by_dev+0x40/0x40
May 24 16:07:53 haggis kernel: [23285.684529]  [<ffffffff811a62a2>] ? do_dentry_open+0x1f2/0x330
May 24 16:07:53 haggis kernel: [23285.684531]  [<ffffffff811a65ad>] ? finish_open+0x2d/0x40
May 24 16:07:53 haggis kernel: [23285.684533]  [<ffffffff811b72ca>] ? do_last+0xaaa/0x1200
May 24 16:07:53 haggis kernel: [23285.684534]  [<ffffffff811b38a6>] ? link_path_walk+0x286/0x8a0
May 24 16:07:53 haggis kernel: [23285.684536]  [<ffffffff811b7adb>] ? path_openat+0xbb/0x680
May 24 16:07:53 haggis kernel: [23285.684538]  [<ffffffff8117abb5>] ? free_pages_and_swap_cache+0x95/0xb0
May 24 16:07:53 haggis kernel: [23285.684539]  [<ffffffff811b884a>] ? do_filp_open+0x3a/0x90
May 24 16:07:53 haggis kernel: [23285.684542]  [<ffffffff811c48ac>] ? __alloc_fd+0x7c/0x120
May 24 16:07:53 haggis kernel: [23285.684544]  [<ffffffff811a7ae9>] ? do_sys_open+0x129/0x220
May 24 16:07:53 haggis kernel: [23285.684546]  [<ffffffff81514a0d>] ? system_call_fast_compare_end+0x10/0x15
May 24 16:07:53 haggis kernel: [23285.684710] qemu-system-x86 D ffff88044ef746e8     0 15748      1 0x00000000
May 24 16:07:53 haggis kernel: [23285.684712]  ffff88044ef74290 0000000000000082 0000000000012f00 ffff8803d055bfd8
May 24 16:07:53 haggis kernel: [23285.684713]  0000000000012f00 ffff88044ef74290 ffffffffa0e00ce0 ffff8803d055be58
May 24 16:07:53 haggis kernel: [23285.684715]  ffffffffa0e00ce4 ffff88044ef74290 00000000ffffffff ffffffffa0e00ce8
May 24 16:07:53 haggis kernel: [23285.684716] Call Trace:
May 24 16:07:53 haggis kernel: [23285.684720]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:07:53 haggis kernel: [23285.684722]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:07:53 haggis kernel: [23285.684724]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:07:53 haggis kernel: [23285.684737]  [<ffffffffa0c55d36>] ? zvol_release+0x36/0x90 [zfs]
May 24 16:07:53 haggis kernel: [23285.684739]  [<ffffffff811dd32d>] ? __blkdev_put+0x15d/0x1a0
May 24 16:07:53 haggis kernel: [23285.684741]  [<ffffffff811ddd81>] ? blkdev_close+0x21/0x30
May 24 16:07:53 haggis kernel: [23285.684743]  [<ffffffff811aa1ea>] ? __fput+0xca/0x1d0
May 24 16:07:53 haggis kernel: [23285.684745]  [<ffffffff810852dc>] ? task_work_run+0x8c/0xb0
May 24 16:07:53 haggis kernel: [23285.684747]  [<ffffffff81012e99>] ? do_notify_resume+0x69/0xa0
May 24 16:07:53 haggis kernel: [23285.684749]  [<ffffffff81514cca>] ? int_signal+0x12/0x17
May 24 16:07:53 haggis kernel: [23285.684755] zpool           D ffff8802865d70b8     0 15932  15869 0x00000004
May 24 16:07:53 haggis kernel: [23285.684756]  ffff8802865d6c60 0000000000000082 0000000000012f00 ffff88037bcbbfd8
May 24 16:07:53 haggis kernel: [23285.684758]  0000000000012f00 ffff8802865d6c60 ffff8803e4930a98 ffff88037bcbbb28
May 24 16:07:53 haggis kernel: [23285.684759]  ffff8803e4930a9c ffff8802865d6c60 00000000ffffffff ffff8803e4930aa0
May 24 16:07:53 haggis kernel: [23285.684761] Call Trace:
May 24 16:07:53 haggis kernel: [23285.684762]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:07:53 haggis kernel: [23285.684764]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:07:53 haggis kernel: [23285.684767]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:07:53 haggis kernel: [23285.684768]  [<ffffffff811dd3d3>] ? __blkdev_get+0x63/0x480
May 24 16:07:53 haggis kernel: [23285.684770]  [<ffffffff811dd9fb>] ? blkdev_get+0x20b/0x310
May 24 16:07:53 haggis kernel: [23285.684772]  [<ffffffff811be3df>] ? dput+0x1f/0x170
May 24 16:07:53 haggis kernel: [23285.684774]  [<ffffffff811ddd24>] ? blkdev_get_by_path+0x54/0x90
May 24 16:07:53 haggis kernel: [23285.684786]  [<ffffffffa0c0aad5>] ? vdev_disk_open+0x355/0x3c0 [zfs]
May 24 16:07:53 haggis kernel: [23285.684798]  [<ffffffffa0c07cd8>] ? vdev_open+0xe8/0x500 [zfs]
May 24 16:07:53 haggis kernel: [23285.684810]  [<ffffffffa0c55dad>] ? zvol_is_zvol+0x1d/0x40 [zfs]
May 24 16:07:53 haggis kernel: [23285.684820]  [<ffffffffa0c08144>] ? vdev_open_children+0x54/0x170 [zfs]
May 24 16:07:53 haggis kernel: [23285.684831]  [<ffffffffa0c11853>] ? vdev_root_open+0x43/0xe0 [zfs]
May 24 16:07:53 haggis kernel: [23285.684840]  [<ffffffffa0c07cd8>] ? vdev_open+0xe8/0x500 [zfs]
May 24 16:07:53 haggis kernel: [23285.684849]  [<ffffffffa0c082cd>] ? vdev_create+0x1d/0xa0 [zfs]
May 24 16:07:53 haggis kernel: [23285.684852]  [<ffffffffa0a0313e>] ? zfs_allocatable_devs+0x5e/0x80 [zcommon]
May 24 16:07:53 haggis kernel: [23285.684865]  [<ffffffffa0bf5f3f>] ? spa_create+0x36f/0x9a0 [zfs]
May 24 16:07:53 haggis kernel: [23285.684876]  [<ffffffffa0c27309>] ? zfs_fill_zplprops_impl+0x1d9/0x380 [zfs]
May 24 16:07:53 haggis kernel: [23285.684886]  [<ffffffffa0c2b3a4>] ? zfs_ioc_pool_create+0x144/0x260 [zfs]
May 24 16:07:53 haggis kernel: [23285.684897]  [<ffffffffa0c29669>] ? zfsdev_ioctl+0x4a9/0x4e0 [zfs]
May 24 16:07:53 haggis kernel: [23285.684898]  [<ffffffff811bacdf>] ? do_vfs_ioctl+0x2cf/0x4b0
May 24 16:07:53 haggis kernel: [23285.684900]  [<ffffffff811baf41>] ? SyS_ioctl+0x81/0xa0
May 24 16:07:53 haggis kernel: [23285.684902]  [<ffffffff81516a28>] ? page_fault+0x28/0x30
May 24 16:07:53 haggis kernel: [23285.684904]  [<ffffffff81514a0d>] ? system_call_fast_compare_end+0x10/0x15
May 24 16:07:53 haggis kernel: [23285.684936] zpool           D ffff8803a5306568     0 16377  15894 0x00000000
May 24 16:07:53 haggis kernel: [23285.684938]  ffff8803a5306110 0000000000000082 0000000000012f00 ffff88037de3ffd8
May 24 16:07:53 haggis kernel: [23285.684939]  0000000000012f00 ffff8803a5306110 ffffffffa0d7cb20 ffff88037de3fe28
May 24 16:07:53 haggis kernel: [23285.684941]  ffffffffa0d7cb24 ffff8803a5306110 00000000ffffffff ffffffffa0d7cb28
May 24 16:07:53 haggis kernel: [23285.684942] Call Trace:
May 24 16:07:53 haggis kernel: [23285.684945]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:07:53 haggis kernel: [23285.684946]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:07:53 haggis kernel: [23285.684949]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:07:53 haggis kernel: [23285.684961]  [<ffffffffa0bf94f9>] ? spa_all_configs+0x49/0x180 [zfs]
May 24 16:07:53 haggis kernel: [23285.684971]  [<ffffffffa0c26906>] ? zfs_ioc_pool_configs+0x16/0x40 [zfs]
May 24 16:07:53 haggis kernel: [23285.684981]  [<ffffffffa0c29669>] ? zfsdev_ioctl+0x4a9/0x4e0 [zfs]
May 24 16:07:53 haggis kernel: [23285.684983]  [<ffffffff811bacdf>] ? do_vfs_ioctl+0x2cf/0x4b0
May 24 16:07:53 haggis kernel: [23285.684984]  [<ffffffff811baf41>] ? SyS_ioctl+0x81/0xa0
May 24 16:07:53 haggis kernel: [23285.684985]  [<ffffffff81516a28>] ? page_fault+0x28/0x30
May 24 16:07:53 haggis kernel: [23285.684987]  [<ffffffff81514a0d>] ? system_call_fast_compare_end+0x10/0x15
May 24 16:09:53 haggis kernel: [23405.713235] systemd-udevd   D ffff880468c9d908     0   371      1 0x00000000
May 24 16:09:53 haggis kernel: [23405.713244]  ffff880468c9d4b0 0000000000000086 0000000000012f00 ffff88046c06bfd8
May 24 16:09:53 haggis kernel: [23405.713250]  0000000000012f00 ffff880468c9d4b0 ffffffffa0d7cb20 ffff88046c06ba90
May 24 16:09:53 haggis kernel: [23405.713256]  ffffffffa0d7cb24 ffff880468c9d4b0 00000000ffffffff ffffffffa0d7cb28
May 24 16:09:53 haggis kernel: [23405.713262] Call Trace:
May 24 16:09:53 haggis kernel: [23405.713290]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:09:53 haggis kernel: [23405.713298]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:09:53 haggis kernel: [23405.713308]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:09:53 haggis kernel: [23405.713375]  [<ffffffffa0bf4e6e>] ? spa_open_common+0x4e/0x460 [zfs]
May 24 16:09:53 haggis kernel: [23405.713391]  [<ffffffffa0b622ef>] ? tsd_hash_dtor+0x6f/0x80 [spl]
May 24 16:09:53 haggis kernel: [23405.713441]  [<ffffffffa0bd860e>] ? dsl_pool_hold+0x1e/0x50 [zfs]
May 24 16:09:53 haggis kernel: [23405.713479]  [<ffffffffa0bb5842>] ? dmu_objset_hold+0x22/0xb0 [zfs]
May 24 16:09:53 haggis kernel: [23405.713522]  [<ffffffffa0bd8d9e>] ? dsl_prop_get+0x2e/0x80 [zfs]
May 24 16:09:53 haggis kernel: [23405.713566]  [<ffffffffa0c54a8a>] ? zvol_open+0x15a/0x290 [zfs]
May 24 16:09:53 haggis kernel: [23405.713580]  [<ffffffff811dd43c>] ? __blkdev_get+0xcc/0x480
May 24 16:09:53 haggis kernel: [23405.713587]  [<ffffffff811ddb40>] ? blkdev_get_by_dev+0x40/0x40
May 24 16:09:53 haggis kernel: [23405.713594]  [<ffffffff811dd9a6>] ? blkdev_get+0x1b6/0x310
May 24 16:09:53 haggis kernel: [23405.713601]  [<ffffffff811ddb40>] ? blkdev_get_by_dev+0x40/0x40
May 24 16:09:53 haggis kernel: [23405.713609]  [<ffffffff811a62a2>] ? do_dentry_open+0x1f2/0x330
May 24 16:09:53 haggis kernel: [23405.713616]  [<ffffffff811a65ad>] ? finish_open+0x2d/0x40
May 24 16:09:53 haggis kernel: [23405.713621]  [<ffffffff811b72ca>] ? do_last+0xaaa/0x1200
May 24 16:09:53 haggis kernel: [23405.713626]  [<ffffffff811b38a6>] ? link_path_walk+0x286/0x8a0
May 24 16:09:53 haggis kernel: [23405.713633]  [<ffffffff811b7adb>] ? path_openat+0xbb/0x680
May 24 16:09:53 haggis kernel: [23405.713640]  [<ffffffff8117abb5>] ? free_pages_and_swap_cache+0x95/0xb0
May 24 16:09:53 haggis kernel: [23405.713646]  [<ffffffff811b884a>] ? do_filp_open+0x3a/0x90
May 24 16:09:53 haggis kernel: [23405.713653]  [<ffffffff811c48ac>] ? __alloc_fd+0x7c/0x120
May 24 16:09:53 haggis kernel: [23405.713660]  [<ffffffff811a7ae9>] ? do_sys_open+0x129/0x220
May 24 16:09:53 haggis kernel: [23405.713669]  [<ffffffff81514a0d>] ? system_call_fast_compare_end+0x10/0x15
May 24 16:09:53 haggis kernel: [23405.714001] qemu-system-x86 D ffff88044ef746e8     0 15748      1 0x00000000
May 24 16:09:53 haggis kernel: [23405.714008]  ffff88044ef74290 0000000000000082 0000000000012f00 ffff8803d055bfd8
May 24 16:09:53 haggis kernel: [23405.714013]  0000000000012f00 ffff88044ef74290 ffffffffa0e00ce0 ffff8803d055be58
May 24 16:09:53 haggis kernel: [23405.714018]  ffffffffa0e00ce4 ffff88044ef74290 00000000ffffffff ffffffffa0e00ce8
May 24 16:09:53 haggis kernel: [23405.714024] Call Trace:
May 24 16:09:53 haggis kernel: [23405.714037]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:09:53 haggis kernel: [23405.714043]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:09:53 haggis kernel: [23405.714052]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:09:53 haggis kernel: [23405.714095]  [<ffffffffa0c55d36>] ? zvol_release+0x36/0x90 [zfs]
May 24 16:09:53 haggis kernel: [23405.714102]  [<ffffffff811dd32d>] ? __blkdev_put+0x15d/0x1a0
May 24 16:09:53 haggis kernel: [23405.714109]  [<ffffffff811ddd81>] ? blkdev_close+0x21/0x30
May 24 16:09:53 haggis kernel: [23405.714114]  [<ffffffff811aa1ea>] ? __fput+0xca/0x1d0
May 24 16:09:53 haggis kernel: [23405.714121]  [<ffffffff810852dc>] ? task_work_run+0x8c/0xb0
May 24 16:09:53 haggis kernel: [23405.714129]  [<ffffffff81012e99>] ? do_notify_resume+0x69/0xa0
May 24 16:09:53 haggis kernel: [23405.714136]  [<ffffffff81514cca>] ? int_signal+0x12/0x17
May 24 16:09:53 haggis kernel: [23405.714152] zpool           D ffff8802865d70b8     0 15932  15869 0x00000004
May 24 16:09:53 haggis kernel: [23405.714157]  ffff8802865d6c60 0000000000000082 0000000000012f00 ffff88037bcbbfd8
May 24 16:09:53 haggis kernel: [23405.714163]  0000000000012f00 ffff8802865d6c60 ffff8803e4930a98 ffff88037bcbbb28
May 24 16:09:53 haggis kernel: [23405.714168]  ffff8803e4930a9c ffff8802865d6c60 00000000ffffffff ffff8803e4930aa0
May 24 16:09:53 haggis kernel: [23405.714173] Call Trace:
May 24 16:09:53 haggis kernel: [23405.714179]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:09:53 haggis kernel: [23405.714184]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:09:53 haggis kernel: [23405.714192]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:09:53 haggis kernel: [23405.714198]  [<ffffffff811dd3d3>] ? __blkdev_get+0x63/0x480
May 24 16:09:53 haggis kernel: [23405.714204]  [<ffffffff811dd9fb>] ? blkdev_get+0x20b/0x310
May 24 16:09:53 haggis kernel: [23405.714212]  [<ffffffff811be3df>] ? dput+0x1f/0x170
May 24 16:09:53 haggis kernel: [23405.714218]  [<ffffffff811ddd24>] ? blkdev_get_by_path+0x54/0x90
May 24 16:09:53 haggis kernel: [23405.714265]  [<ffffffffa0c0aad5>] ? vdev_disk_open+0x355/0x3c0 [zfs]
May 24 16:09:53 haggis kernel: [23405.714309]  [<ffffffffa0c07cd8>] ? vdev_open+0xe8/0x500 [zfs]
May 24 16:09:53 haggis kernel: [23405.714350]  [<ffffffffa0c55dad>] ? zvol_is_zvol+0x1d/0x40 [zfs]
May 24 16:09:53 haggis kernel: [23405.714389]  [<ffffffffa0c08144>] ? vdev_open_children+0x54/0x170 [zfs]
May 24 16:09:53 haggis kernel: [23405.714430]  [<ffffffffa0c11853>] ? vdev_root_open+0x43/0xe0 [zfs]
May 24 16:09:53 haggis kernel: [23405.714466]  [<ffffffffa0c07cd8>] ? vdev_open+0xe8/0x500 [zfs]
May 24 16:09:53 haggis kernel: [23405.714501]  [<ffffffffa0c082cd>] ? vdev_create+0x1d/0xa0 [zfs]
May 24 16:09:53 haggis kernel: [23405.714512]  [<ffffffffa0a0313e>] ? zfs_allocatable_devs+0x5e/0x80 [zcommon]
May 24 16:09:53 haggis kernel: [23405.714559]  [<ffffffffa0bf5f3f>] ? spa_create+0x36f/0x9a0 [zfs]
May 24 16:09:53 haggis kernel: [23405.714601]  [<ffffffffa0c27309>] ? zfs_fill_zplprops_impl+0x1d9/0x380 [zfs]
May 24 16:09:53 haggis kernel: [23405.714641]  [<ffffffffa0c2b3a4>] ? zfs_ioc_pool_create+0x144/0x260 [zfs]
May 24 16:09:53 haggis kernel: [23405.714679]  [<ffffffffa0c29669>] ? zfsdev_ioctl+0x4a9/0x4e0 [zfs]
May 24 16:09:53 haggis kernel: [23405.714686]  [<ffffffff811bacdf>] ? do_vfs_ioctl+0x2cf/0x4b0
May 24 16:09:53 haggis kernel: [23405.714692]  [<ffffffff811baf41>] ? SyS_ioctl+0x81/0xa0
May 24 16:09:53 haggis kernel: [23405.714698]  [<ffffffff81516a28>] ? page_fault+0x28/0x30
May 24 16:09:53 haggis kernel: [23405.714705]  [<ffffffff81514a0d>] ? system_call_fast_compare_end+0x10/0x15
May 24 16:09:53 haggis kernel: [23405.714775] zpool           D ffff8803a5306568     0 16377  15894 0x00000000
May 24 16:09:53 haggis kernel: [23405.714781]  ffff8803a5306110 0000000000000082 0000000000012f00 ffff88037de3ffd8
May 24 16:09:53 haggis kernel: [23405.714786]  0000000000012f00 ffff8803a5306110 ffffffffa0d7cb20 ffff88037de3fe28
May 24 16:09:53 haggis kernel: [23405.714791]  ffffffffa0d7cb24 ffff8803a5306110 00000000ffffffff ffffffffa0d7cb28
May 24 16:09:53 haggis kernel: [23405.714797] Call Trace:
May 24 16:09:53 haggis kernel: [23405.714807]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:09:53 haggis kernel: [23405.714813]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:09:53 haggis kernel: [23405.714821]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:09:53 haggis kernel: [23405.714868]  [<ffffffffa0bf94f9>] ? spa_all_configs+0x49/0x180 [zfs]
May 24 16:09:53 haggis kernel: [23405.714905]  [<ffffffffa0c26906>] ? zfs_ioc_pool_configs+0x16/0x40 [zfs]
May 24 16:09:53 haggis kernel: [23405.714942]  [<ffffffffa0c29669>] ? zfsdev_ioctl+0x4a9/0x4e0 [zfs]
May 24 16:09:53 haggis kernel: [23405.714949]  [<ffffffff811bacdf>] ? do_vfs_ioctl+0x2cf/0x4b0
May 24 16:09:53 haggis kernel: [23405.714955]  [<ffffffff811baf41>] ? SyS_ioctl+0x81/0xa0
May 24 16:09:53 haggis kernel: [23405.714960]  [<ffffffff81516a28>] ? page_fault+0x28/0x30
May 24 16:09:53 haggis kernel: [23405.714967]  [<ffffffff81514a0d>] ? system_call_fast_compare_end+0x10/0x15
May 24 16:11:53 haggis kernel: [23525.741944] systemd-udevd   D ffff880468c9d908     0   371      1 0x00000000
May 24 16:11:53 haggis kernel: [23525.741953]  ffff880468c9d4b0 0000000000000086 0000000000012f00 ffff88046c06bfd8
May 24 16:11:53 haggis kernel: [23525.741959]  0000000000012f00 ffff880468c9d4b0 ffffffffa0d7cb20 ffff88046c06ba90
May 24 16:11:53 haggis kernel: [23525.741965]  ffffffffa0d7cb24 ffff880468c9d4b0 00000000ffffffff ffffffffa0d7cb28
May 24 16:11:53 haggis kernel: [23525.741971] Call Trace:
May 24 16:11:53 haggis kernel: [23525.742001]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:11:53 haggis kernel: [23525.742008]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:11:53 haggis kernel: [23525.742018]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:11:53 haggis kernel: [23525.742086]  [<ffffffffa0bf4e6e>] ? spa_open_common+0x4e/0x460 [zfs]
May 24 16:11:53 haggis kernel: [23525.742102]  [<ffffffffa0b622ef>] ? tsd_hash_dtor+0x6f/0x80 [spl]
May 24 16:11:53 haggis kernel: [23525.742153]  [<ffffffffa0bd860e>] ? dsl_pool_hold+0x1e/0x50 [zfs]
May 24 16:11:53 haggis kernel: [23525.742190]  [<ffffffffa0bb5842>] ? dmu_objset_hold+0x22/0xb0 [zfs]
May 24 16:11:53 haggis kernel: [23525.742234]  [<ffffffffa0bd8d9e>] ? dsl_prop_get+0x2e/0x80 [zfs]
May 24 16:11:53 haggis kernel: [23525.742277]  [<ffffffffa0c54a8a>] ? zvol_open+0x15a/0x290 [zfs]
May 24 16:11:53 haggis kernel: [23525.742293]  [<ffffffff811dd43c>] ? __blkdev_get+0xcc/0x480
May 24 16:11:53 haggis kernel: [23525.742300]  [<ffffffff811ddb40>] ? blkdev_get_by_dev+0x40/0x40
May 24 16:11:53 haggis kernel: [23525.742306]  [<ffffffff811dd9a6>] ? blkdev_get+0x1b6/0x310
May 24 16:11:53 haggis kernel: [23525.742313]  [<ffffffff811ddb40>] ? blkdev_get_by_dev+0x40/0x40
May 24 16:11:53 haggis kernel: [23525.742323]  [<ffffffff811a62a2>] ? do_dentry_open+0x1f2/0x330
May 24 16:11:53 haggis kernel: [23525.742330]  [<ffffffff811a65ad>] ? finish_open+0x2d/0x40
May 24 16:11:53 haggis kernel: [23525.742337]  [<ffffffff811b72ca>] ? do_last+0xaaa/0x1200
May 24 16:11:53 haggis kernel: [23525.742342]  [<ffffffff811b38a6>] ? link_path_walk+0x286/0x8a0
May 24 16:11:53 haggis kernel: [23525.742348]  [<ffffffff811b7adb>] ? path_openat+0xbb/0x680
May 24 16:11:53 haggis kernel: [23525.742356]  [<ffffffff8117abb5>] ? free_pages_and_swap_cache+0x95/0xb0
May 24 16:11:53 haggis kernel: [23525.742362]  [<ffffffff811b884a>] ? do_filp_open+0x3a/0x90
May 24 16:11:53 haggis kernel: [23525.742369]  [<ffffffff811c48ac>] ? __alloc_fd+0x7c/0x120
May 24 16:11:53 haggis kernel: [23525.742377]  [<ffffffff811a7ae9>] ? do_sys_open+0x129/0x220
May 24 16:11:53 haggis kernel: [23525.742385]  [<ffffffff81514a0d>] ? system_call_fast_compare_end+0x10/0x15
May 24 16:11:53 haggis kernel: [23525.742712] qemu-system-x86 D ffff88044ef746e8     0 15748      1 0x00000000
May 24 16:11:53 haggis kernel: [23525.742719]  ffff88044ef74290 0000000000000082 0000000000012f00 ffff8803d055bfd8
May 24 16:11:53 haggis kernel: [23525.742725]  0000000000012f00 ffff88044ef74290 ffffffffa0e00ce0 ffff8803d055be58
May 24 16:11:53 haggis kernel: [23525.742730]  ffffffffa0e00ce4 ffff88044ef74290 00000000ffffffff ffffffffa0e00ce8
May 24 16:11:53 haggis kernel: [23525.742735] Call Trace:
May 24 16:11:53 haggis kernel: [23525.742747]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:11:53 haggis kernel: [23525.742753]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:11:53 haggis kernel: [23525.742762]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:11:53 haggis kernel: [23525.742803]  [<ffffffffa0c55d36>] ? zvol_release+0x36/0x90 [zfs]
May 24 16:11:53 haggis kernel: [23525.742811]  [<ffffffff811dd32d>] ? __blkdev_put+0x15d/0x1a0
May 24 16:11:53 haggis kernel: [23525.742817]  [<ffffffff811ddd81>] ? blkdev_close+0x21/0x30
May 24 16:11:53 haggis kernel: [23525.742823]  [<ffffffff811aa1ea>] ? __fput+0xca/0x1d0
May 24 16:11:53 haggis kernel: [23525.742830]  [<ffffffff810852dc>] ? task_work_run+0x8c/0xb0
May 24 16:11:53 haggis kernel: [23525.742839]  [<ffffffff81012e99>] ? do_notify_resume+0x69/0xa0
May 24 16:11:53 haggis kernel: [23525.742846]  [<ffffffff81514cca>] ? int_signal+0x12/0x17

Note that this case is different from #2469

@nigoroll
Author

It is reproducible:

May 24 16:58:53 haggis kernel: [  360.419136] zpool           D ffff88044e712df8     0  8316   6409 0x00000000
May 24 16:58:53 haggis kernel: [  360.419139]  ffff88044e7129a0 0000000000000082 0000000000012f00 ffff880062647fd8
May 24 16:58:53 haggis kernel: [  360.419141]  0000000000012f00 ffff88044e7129a0 ffff88046c65f218 ffff880062647ad8
May 24 16:58:53 haggis kernel: [  360.419142]  ffff88046c65f21c ffff88044e7129a0 00000000ffffffff ffff88046c65f220
May 24 16:58:53 haggis kernel: [  360.419144] Call Trace:
May 24 16:58:53 haggis kernel: [  360.419150]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:58:53 haggis kernel: [  360.419153]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:58:53 haggis kernel: [  360.419159]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:58:53 haggis kernel: [  360.419163]  [<ffffffff811dd3d3>] ? __blkdev_get+0x63/0x480
May 24 16:58:53 haggis kernel: [  360.419165]  [<ffffffff811dd62d>] ? __blkdev_get+0x2bd/0x480
May 24 16:58:53 haggis kernel: [  360.419167]  [<ffffffff811dd9fb>] ? blkdev_get+0x20b/0x310
May 24 16:58:53 haggis kernel: [  360.419170]  [<ffffffff811be3df>] ? dput+0x1f/0x170
May 24 16:58:53 haggis kernel: [  360.419172]  [<ffffffff811ddd24>] ? blkdev_get_by_path+0x54/0x90
May 24 16:58:53 haggis kernel: [  360.419195]  [<ffffffffa0beaad5>] ? vdev_disk_open+0x355/0x3c0 [zfs]
May 24 16:58:53 haggis kernel: [  360.419208]  [<ffffffffa0be7cd8>] ? vdev_open+0xe8/0x500 [zfs]
May 24 16:58:53 haggis kernel: [  360.419219]  [<ffffffffa0c35dad>] ? zvol_is_zvol+0x1d/0x40 [zfs]
May 24 16:58:53 haggis kernel: [  360.419229]  [<ffffffffa0be8144>] ? vdev_open_children+0x54/0x170 [zfs]
May 24 16:58:53 haggis kernel: [  360.419240]  [<ffffffffa0bf1853>] ? vdev_root_open+0x43/0xe0 [zfs]
May 24 16:58:53 haggis kernel: [  360.419250]  [<ffffffffa0be7cd8>] ? vdev_open+0xe8/0x500 [zfs]
May 24 16:58:53 haggis kernel: [  360.419260]  [<ffffffffa0be82cd>] ? vdev_create+0x1d/0xa0 [zfs]
May 24 16:58:53 haggis kernel: [  360.419264]  [<ffffffffa09ec13e>] ? zfs_allocatable_devs+0x5e/0x80 [zcommon]
May 24 16:58:53 haggis kernel: [  360.419276]  [<ffffffffa0bd5f3f>] ? spa_create+0x36f/0x9a0 [zfs]
May 24 16:58:53 haggis kernel: [  360.419288]  [<ffffffffa0c07309>] ? zfs_fill_zplprops_impl+0x1d9/0x380 [zfs]
May 24 16:58:53 haggis kernel: [  360.419299]  [<ffffffffa0c0b3a4>] ? zfs_ioc_pool_create+0x144/0x260 [zfs]
May 24 16:58:53 haggis kernel: [  360.419310]  [<ffffffffa0c09669>] ? zfsdev_ioctl+0x4a9/0x4e0 [zfs]
May 24 16:58:53 haggis kernel: [  360.419311]  [<ffffffff811bacdf>] ? do_vfs_ioctl+0x2cf/0x4b0
May 24 16:58:53 haggis kernel: [  360.419313]  [<ffffffff811baf41>] ? SyS_ioctl+0x81/0xa0
May 24 16:58:53 haggis kernel: [  360.419315]  [<ffffffff81514a0d>] ? system_call_fast_compare_end+0x10/0x15
May 24 16:58:53 haggis kernel: [  360.419319] systemd-udevd   D ffff880459b6d748     0  8334    376 0x00000004
May 24 16:58:53 haggis kernel: [  360.419321]  ffff880459b6d2f0 0000000000000086 0000000000012f00 ffff880449bd3fd8
May 24 16:58:53 haggis kernel: [  360.419322]  0000000000012f00 ffff880459b6d2f0 ffffffffa0d5cb20 ffff880449bd3a90
May 24 16:58:53 haggis kernel: [  360.419324]  ffffffffa0d5cb24 ffff880459b6d2f0 00000000ffffffff ffffffffa0d5cb28
May 24 16:58:53 haggis kernel: [  360.419325] Call Trace:
May 24 16:58:53 haggis kernel: [  360.419328]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:58:53 haggis kernel: [  360.419329]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:58:53 haggis kernel: [  360.419332]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:58:53 haggis kernel: [  360.419344]  [<ffffffffa0bd4e6e>] ? spa_open_common+0x4e/0x460 [zfs]
May 24 16:58:53 haggis kernel: [  360.419348]  [<ffffffffa0b722ef>] ? tsd_hash_dtor+0x6f/0x80 [spl]
May 24 16:58:53 haggis kernel: [  360.419361]  [<ffffffffa0bb860e>] ? dsl_pool_hold+0x1e/0x50 [zfs]
May 24 16:58:53 haggis kernel: [  360.419370]  [<ffffffffa0b95842>] ? dmu_objset_hold+0x22/0xb0 [zfs]
May 24 16:58:53 haggis kernel: [  360.419382]  [<ffffffffa0bb8d9e>] ? dsl_prop_get+0x2e/0x80 [zfs]
May 24 16:58:53 haggis kernel: [  360.419393]  [<ffffffffa0c34a8a>] ? zvol_open+0x15a/0x290 [zfs]
May 24 16:58:53 haggis kernel: [  360.419396]  [<ffffffff811dd43c>] ? __blkdev_get+0xcc/0x480
May 24 16:58:53 haggis kernel: [  360.419397]  [<ffffffff811ddb40>] ? blkdev_get_by_dev+0x40/0x40
May 24 16:58:53 haggis kernel: [  360.419399]  [<ffffffff811dd9a6>] ? blkdev_get+0x1b6/0x310
May 24 16:58:53 haggis kernel: [  360.419401]  [<ffffffff811ddb40>] ? blkdev_get_by_dev+0x40/0x40
May 24 16:58:53 haggis kernel: [  360.419403]  [<ffffffff811a62a2>] ? do_dentry_open+0x1f2/0x330
May 24 16:58:53 haggis kernel: [  360.419405]  [<ffffffff811a65ad>] ? finish_open+0x2d/0x40
May 24 16:58:53 haggis kernel: [  360.419407]  [<ffffffff811b72ca>] ? do_last+0xaaa/0x1200
May 24 16:58:53 haggis kernel: [  360.419408]  [<ffffffff811b3691>] ? link_path_walk+0x71/0x8a0
May 24 16:58:53 haggis kernel: [  360.419409]  [<ffffffff811b7adb>] ? path_openat+0xbb/0x680
May 24 16:58:53 haggis kernel: [  360.419412]  [<ffffffff8117abb5>] ? free_pages_and_swap_cache+0x95/0xb0
May 24 16:58:53 haggis kernel: [  360.419413]  [<ffffffff811b884a>] ? do_filp_open+0x3a/0x90
May 24 16:58:53 haggis kernel: [  360.419415]  [<ffffffff811c48ac>] ? __alloc_fd+0x7c/0x120
May 24 16:58:53 haggis kernel: [  360.419417]  [<ffffffff811a7ae9>] ? do_sys_open+0x129/0x220
May 24 16:58:53 haggis kernel: [  360.419419]  [<ffffffff81514a0d>] ? system_call_fast_compare_end+0x10/0x15
May 24 16:58:53 haggis kernel: [  360.419453] zpool           D ffff88044629cdb8     0  8737   8697 0x00000000
May 24 16:58:53 haggis kernel: [  360.419454]  ffff88044629c960 0000000000000086 0000000000012f00 ffff88044e81bfd8
May 24 16:58:53 haggis kernel: [  360.419456]  0000000000012f00 ffff88044629c960 ffffffffa0d5cb20 ffff88044e81be28
May 24 16:58:53 haggis kernel: [  360.419457]  ffffffffa0d5cb24 ffff88044629c960 00000000ffffffff ffffffffa0d5cb28
May 24 16:58:53 haggis kernel: [  360.419459] Call Trace:
May 24 16:58:53 haggis kernel: [  360.419461]  [<ffffffff815115f5>] ? schedule_preempt_disabled+0x25/0x70
May 24 16:58:53 haggis kernel: [  360.419463]  [<ffffffff815130a3>] ? __mutex_lock_slowpath+0xd3/0x1c0
May 24 16:58:53 haggis kernel: [  360.419465]  [<ffffffff815131ab>] ? mutex_lock+0x1b/0x2a
May 24 16:58:53 haggis kernel: [  360.419478]  [<ffffffffa0bd94f9>] ? spa_all_configs+0x49/0x180 [zfs]
May 24 16:58:53 haggis kernel: [  360.419489]  [<ffffffffa0c06906>] ? zfs_ioc_pool_configs+0x16/0x40 [zfs]
May 24 16:58:53 haggis kernel: [  360.419499]  [<ffffffffa0c09669>] ? zfsdev_ioctl+0x4a9/0x4e0 [zfs]
May 24 16:58:53 haggis kernel: [  360.419501]  [<ffffffff811bacdf>] ? do_vfs_ioctl+0x2cf/0x4b0
May 24 16:58:53 haggis kernel: [  360.419503]  [<ffffffff811baf41>] ? SyS_ioctl+0x81/0xa0
May 24 16:58:53 haggis kernel: [  360.419505]  [<ffffffff81516a28>] ? page_fault+0x28/0x30
May 24 16:58:53 haggis kernel: [  360.419507]  [<ffffffff81514a0d>] ? system_call_fast_compare_end+0x10/0x15

@nigoroll nigoroll reopened this May 24, 2016
@behlendorf
Contributor

@nigoroll actually, on Solaris-ish systems this isn't 100% safe either and may deadlock. It just happens to be much more likely with ZoL due to the greater degree of concurrency allowed in the Linux block layer. But you're definitely right that this can deadlock and should be resolved, patches welcome. :)
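
For readers following the traces, here is a minimal user-space sketch of the kind of lock-order inversion they appear to show. It is a model, not kernel code, and it rests on an assumption: my reading of the traces is that `zpool create` holds `spa_namespace_lock` while opening the block device, and `systemd-udevd` holds the block device mutex (called `bd_mutex` here) while `zvol_open()` waits for `spa_namespace_lock`. Both lock names and the helper `simulate_inversion` are illustrative. The timeout exists only so the demonstration terminates.

```python
import threading


def simulate_inversion(timeout=0.2):
    """Run two tasks that take the same two locks in opposite order.

    Returns a dict mapping each task name to whether it managed to
    take its second lock before the timeout expired.
    """
    # Lock names are assumptions taken from one reading of the traces.
    spa_namespace_lock = threading.Lock()
    bd_mutex = threading.Lock()

    both_hold_first = threading.Barrier(2)    # each task now owns its first lock
    both_tried_second = threading.Barrier(2)  # each task has finished its attempt
    got_second = {}

    def task(name, first, second):
        with first:
            both_hold_first.wait()
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            # Keep the first lock until the other task has also tried.
            both_tried_second.wait()
            if acquired:
                second.release()

    threads = [
        threading.Thread(target=task,
                         args=("zpool create", spa_namespace_lock, bd_mutex)),
        threading.Thread(target=task,
                         args=("udevd open", bd_mutex, spa_namespace_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


if __name__ == "__main__":
    print(simulate_inversion())
```

Neither task obtains its second lock. Kernel mutexes have no timeout, so the equivalent situation leaves both tasks blocked in the D state, and anything else that needs one of the two locks (such as `zpool list`) queues up behind them.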

@nigoroll
Author

@behlendorf yes, you are right, and I've experienced a similar thing shortly after logging this issue: https://www.illumos.org/issues/6994
I thought I remembered a blog post from the early days of ZFS where one of the core devs proudly talked about the self-hosting capabilities, but yesterday I failed to dig it up again. I'd love to spend some time on this, let's see if I can.

@behlendorf behlendorf added Bug Component: Test Suite Indicates an issue with the test framework or a test case labels Jun 7, 2016
@richardelling
Contributor

Disagree: the deadlocks are below the volume interface (in the ZFS module), not above it. So changing the wiring above the volume interface makes no difference.

@FransUrbo
Contributor

Yeah, I've had it hang via a VirtualBox and iSCSI (physical disk->ZFS->ZVolume->iSCSI->Network->Physical Machine->VirtualBox VM->ZFS on VM disks).

@behlendorf
Contributor

@kpande no, it's not a regression; this has never been 100% safe. Over the years we've addressed some of the possible deadlocks, but we never got all of them.

@behlendorf behlendorf removed the Bug label Sep 30, 2016
@behlendorf
Contributor

@kpande nothing comes to mind. But if you wanted to add such a thing, all you'd need to do is update vdev_open_children() so it always takes the first conditional, where vdev_uses_zvols is true. This serializes the opening of the child devs, which slows down importing a pool but removes the risk of deadlocking during import when layering pools on zvols.
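
The idea can be sketched in user space as follows. This is a rough Python model, not the actual `vdev_open_children()` code: the `force_serial` parameter is a hypothetical knob, `open_child` is a placeholder for the per-child open, and a thread pool stands in for the parallel path.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def open_children(children, open_child, uses_zvols, force_serial=False):
    """Open every child, serially or in parallel.

    The serial branch models the "first conditional" described above:
    all opens run one after another on the calling thread. The
    force_serial flag is a hypothetical knob that always selects it.
    """
    if force_serial or uses_zvols:
        return [open_child(child) for child in children]

    # Parallel branch: one worker per child, results in input order.
    with ThreadPoolExecutor(max_workers=max(1, len(children))) as pool:
        return list(pool.map(open_child, children))


def open_child(name):
    """Placeholder open: report which thread handled the child."""
    return (name, threading.get_ident())


if __name__ == "__main__":
    caller = threading.get_ident()
    for name, tid in open_children(["vdev0", "vdev1"], open_child,
                                   uses_zvols=False, force_serial=True):
        print(name, "opened on calling thread:", tid == caller)
```

With the knob set, every open runs on the caller's thread, so whatever locks the caller already holds are never requested from a second thread. The cost is that slow devices are opened one at a time.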

@behlendorf
Contributor

@kpande your error looks more like #5118 than a deadlock. During normal operation I don't think we have any reported deadlocks for layering a pool on a pool.

@ryao
Contributor

ryao commented Oct 16, 2016

@behlendorf @kpande's description of his problem in IRC is the ZFS stacking problem, where a zvol is the backing device for another device that is the vdev of another pool. zvol_is_zvol() didn't handle that because it is impossible to detect. #5286 should provide a knob that can be used to work around the problem as @kpande described it in IRC.

@nigoroll Would you try reproducing with this patch applied and the parallel open explicitly disabled? Either we disabled parallel vdev open and things deadlocked anyway, or our detection code for disabling it failed to realize the device is a zvol due to the symlink. It is somewhat hard to read the stack traces without kernel debug symbols, but my guess is that the problem is one of those two. If the symlink caused the problem, then it should be an easy fix.
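
To illustrate the symlink guess, here is a small user-space sketch that resolves a device path before deciding whether it points at a zvol. It is not the kernel's detection logic. It assumes the usual ZoL layout, where `/dev/zvol/<pool>/<vol>` is a udev symlink to a `/dev/zdN` node, and the name-based check and the helper `looks_like_zvol` are illustrative only.

```python
import os
import re

# Assumption: zvol device nodes are named zdN, with an optional
# partition suffix pM (for example zd0 or zd16p1).
_ZD_NODE = re.compile(r"^zd\d+(p\d+)?$")


def looks_like_zvol(path):
    """Return True if path, after following symlinks, names a zd node."""
    real = os.path.realpath(path)
    return bool(_ZD_NODE.match(os.path.basename(real)))


if __name__ == "__main__":
    print(looks_like_zvol("/dev/zvol/pool/t/vol"))
```

A check that looks only at the path as typed would see `vol`, not `zdN`, and miss the zvol. As noted above, no check of this kind can catch the stacked case where another device sits between the zvol and the pool.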

@behlendorf
Contributor

Closing as duplicate of #3484.


6 participants
@behlendorf @FransUrbo @richardelling @ryao @nigoroll and others