PANIC at zvol.c:1165:zvol_resume() #6263
Comments
Potentially the locking changes in #6226 will resolve this, although it's difficult to say for certain.
I have hit the same issue even with a more recent zfs version that includes that PR. version: 0.7.0-rc4_75_g5b7bb9838
@MyPod thanks for posting. We should get @tuxoko and @bprotopopov's thoughts on this.
I took a look at the code, and I believe the locks are held across suspend/resume properly, such that a suspended zvol cannot be disowned as a result of […]. Also, while reviewing the […]
@MyPod can you please describe the structure of the filesystem that you are replicating, in particular w.r.t. zvols?
In the first case, the sending side had several zvols under […]. For the zvols, the only parameter that is set is volblocksize, at 8K. The pools themselves have the following properties deviating from the default values:
Sample zvol:
All the zvols were pretty small at the time they were sent, being fresh installations of minimal Debian Jessie/Stretch, so 1-5 GB each.
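For reference, a zvol matching the configuration described in this comment could be created along the following lines; the pool/dataset name (tank/vm/disk0) is hypothetical and this is only an illustrative sketch, not the reporter's actual commands:

# Hypothetical example: small zvol with volblocksize=8K, everything else left at defaults
zfs create -V 5G -o volblocksize=8K tank/vm/disk0
zfs get volblocksize,volsize tank/vm/disk0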
While ZFS allows renaming of in-use ZVOLs at the DSL level without issues, the ZVOL layer does not correctly update the renamed dataset if the device node is open (zv->zv_open_count > 0): trying to access the stale dataset name, for instance during a zfs receive, will cause the following failure:

VERIFY3(zv->zv_objset->os_dsl_dataset->ds_owner == zv) failed ((null) == ffff8800dbb6fc00)
PANIC at zvol.c:1255:zvol_resume()
Showing stack for process 1390
CPU: 0 PID: 1390 Comm: zfs Tainted: P O 3.16.0-4-amd64 #1 Debian 3.16.51-3
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 0000000000000000 ffffffff8151ea00 ffffffffa0758a80 ffff88028aefba30
 ffffffffa0417219 ffff880037179220 ffffffff00000030 ffff88028aefba40
 ffff88028aefb9e0 2833594649524556 6f5f767a3e2d767a 6f3e2d7465736a62
Call Trace:
 [<0>] ? dump_stack+0x5d/0x78
 [<0>] ? spl_panic+0xc9/0x110 [spl]
 [<0>] ? mutex_lock+0xe/0x2a
 [<0>] ? zfs_refcount_remove_many+0x1ad/0x250 [zfs]
 [<0>] ? rrw_exit+0xc8/0x2e0 [zfs]
 [<0>] ? mutex_lock+0xe/0x2a
 [<0>] ? dmu_objset_from_ds+0x9a/0x250 [zfs]
 [<0>] ? dmu_objset_hold_flags+0x71/0xc0 [zfs]
 [<0>] ? zvol_resume+0x178/0x280 [zfs]
 [<0>] ? zfs_ioc_recv_impl+0x88b/0xf80 [zfs]
 [<0>] ? zfs_refcount_remove_many+0x1ad/0x250 [zfs]
 [<0>] ? zfs_ioc_recv+0x1c2/0x2a0 [zfs]
 [<0>] ? dmu_buf_get_user+0x13/0x20 [zfs]
 [<0>] ? __alloc_pages_nodemask+0x166/0xb50
 [<0>] ? zfsdev_ioctl+0x896/0x9c0 [zfs]
 [<0>] ? handle_mm_fault+0x464/0x1140
 [<0>] ? do_vfs_ioctl+0x2cf/0x4b0
 [<0>] ? __do_page_fault+0x177/0x410
 [<0>] ? SyS_ioctl+0x81/0xa0
 [<0>] ? async_page_fault+0x28/0x30
 [<0>] ? system_call_fast_compare_end+0x10/0x15

Reviewed by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6263
Closes #8371
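Based on the root cause described in this commit message (the zvol layer keeps the stale dataset name when an open zvol is renamed, and the next suspend/resume cycle during a receive trips the VERIFY3), a reproducer sketch might look like the following. The pool and dataset names (tank, src, vol0, vol1) are hypothetical, and the sequence is only an illustration of the scenario, not a verified test case:

# Create a source zvol and seed a destination zvol with a full send stream
zfs create -V 1G -o volblocksize=8K tank/src
zfs snapshot tank/src@s1
zfs send tank/src@s1 | zfs receive tank/vol0

# Keep the destination's device node open so that zv->zv_open_count > 0
exec 3<>/dev/zvol/tank/vol0

# Rename the zvol while the device node is open; the DSL accepts the rename,
# but on affected versions the zvol layer still refers to the old name
zfs rename tank/vol0 tank/vol1

# An incremental receive suspends and resumes the renamed zvol; resuming
# through the stale name is what hits the VERIFY3 in zvol_resume()
zfs snapshot tank/src@s2
zfs send -i @s1 tank/src@s2 | zfs receive tank/vol1

# Release the device node
exec 3<&-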
System information
Describe the problem you're observing
While receiving a zvol from another machine, I obtained this:
[…]
Describe how to reproduce the problem
This seems to happen infrequently on zfs receive. I don't have a 100% reliable way to reproduce it, but it has happened a few times and forced me to reboot the machine involved, as the affected pool would become unresponsive (zfs and zpool commands hang, and no I/O is possible from VMs running on the same pool).
The command used was
zfs send -Rv mov/stretch-git@send | ssh root@192.168.2.7 zfs recv -uv data/virt/git
However, I have had this happen even locally (between two different pools on the same machine).
If anything else is needed: I have not yet rebooted this system, but I will have to soon.