ZFS hang on 'zfs create' #3212

Closed
tuomari opened this issue Mar 23, 2015 · 4 comments
tuomari commented Mar 23, 2015

Got this while running latest HEAD ( bc88866 ) with #2351

There are two 6x3T raidz1 vdevs 30T in total, 550G free. No l2arc, no zil.
System load is mostly constant writes from zoneminder ( ~6MB/s )

When I created a 100G zvol and tried to format it as ext4, I got the following error:

Mar 23 13:38:51 helvi kernel: [50787.429600] blk_update_request: I/O error, dev zd608, sector 105119744
Mar 23 13:38:51 helvi kernel: [50787.429713] Buffer I/O error on dev zd608, logical block 13139968, lost async page write
Mar 23 13:38:51 helvi kernel: [50787.429875] Buffer I/O error on dev zd608, logical block 13139969, lost async page write
Mar 23 13:38:51 helvi kernel: [50787.429996] Buffer I/O error on dev zd608, logical block 13139970, lost async page write
Mar 23 13:38:51 helvi kernel: [50787.430184] Buffer I/O error on dev zd608, logical block 13139971, lost async page write
Mar 23 13:38:51 helvi kernel: [50787.430303] Buffer I/O error on dev zd608, logical block 13139972, lost async page write
Mar 23 13:38:51 helvi kernel: [50787.430473] Buffer I/O error on dev zd608, logical block 13139973, lost async page write
Mar 23 13:38:51 helvi kernel: [50787.430648] Buffer I/O error on dev zd608, logical block 13139974, lost async page write
Mar 23 13:38:51 helvi kernel: [50787.430775] Buffer I/O error on dev zd608, logical block 13139975, lost async page write
Mar 23 13:38:51 helvi kernel: [50787.430988] Buffer I/O error on dev zd608, logical block 13139976, lost async page write
Mar 23 13:38:51 helvi kernel: [50787.431090] Buffer I/O error on dev zd608, logical block 13139977, lost async page write
Mar 23 13:38:53 helvi kernel: [50790.250944] ------------[ cut here ]------------
Mar 23 13:38:53 helvi kernel: [50790.250999] WARNING: CPU: 1 PID: 29524 at fs/block_dev.c:67 bdev_inode_switch_bdi+0x7e/0x90()
Mar 23 13:38:53 helvi kernel: [50790.251083] Modules linked in: w83795 w83627ehf hwmon_vid btrfs vfat msdos fat jfs xfs reiserfs vhost_net vhost macvtap macvlan ebtable_nat ebtables ip6table_mangle ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_tcpudp ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG iptable_filter xt_state xt_connmark iptable_mangle xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables 8021q garp bridge stp llc deflate ctr twofish_generic twofish_x86_64_3way xts lrw gf128mul glue_helper twofish_x86_64 twofish_common camellia_generic serpent_generic blowfish_generic blowfish_x86_64 blowfish_common cast5_generic cast_common des_generic cbc xcbc rmd160 sha512_generic sha256_generic sha1_ssse3 sha1_generic hmac crypto_null af_key xfrm_algo nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc coretemp loop bttv tveeprom btcx_risc videobuf_dma_sg videobuf_core rc_core v4l2_common 
Mar 23 13:38:53 helvi kernel: videodev media iTCO_wdt snd_hda_codec_realtek iTCO_vendor_support gpio_ich snd_hda_codec_generic ast syscopyarea snd_hda_intel sysfillrect sysimgblt ttm drm_kms_helper snd_hda_controller drm snd_hda_codec snd_pcm snd_timer snd psmouse pcspkr lpc_ich mfd_core shpchp soundcore i7core_edac agpgart serio_raw ioatdma ipmi_si i2c_i801 edac_core ehci_pci dca i5500_temp ipmi_msghandler evdev acpi_cpufreq hid_generic usbhid hid uhci_hcd ehci_hcd usbcore usb_common sg sd_mod netxen_nic [last unloaded: i2c_dev]
Mar 23 13:38:53 helvi kernel: [50790.252333] CPU: 1 PID: 29524 Comm: mkfs.ext4 Not tainted 3.19.2-kvm-ovs-zfs-iudex #2
Mar 23 13:38:53 helvi kernel: [50790.253363] Hardware name: System manufacturer System Product Name/Z8NA-D6(C), BIOS 1303    05/10/2012
Mar 23 13:38:53 helvi kernel: [50790.253451]  ffffffff81aba71c ffff8807855e3d78 ffffffff81788f9f 0000000000000000
Mar 23 13:38:53 helvi kernel: [50790.253538]  0000000000000000 ffff8807855e3db8 ffffffff8108ed02 0000000000000000
Mar 23 13:38:53 helvi kernel: [50790.253665]  ffff8800be41b130 ffff8800be41b1b8 ffffffff81c50640 ffff880071e94c00
Mar 23 13:38:53 helvi kernel: [50790.253753] Call Trace:
Mar 23 13:38:53 helvi kernel: [50790.253794]  [<ffffffff81788f9f>] dump_stack+0x45/0x57
Mar 23 13:38:53 helvi kernel: [50790.253842]  [<ffffffff8108ed02>] warn_slowpath_common+0x92/0xd0
Mar 23 13:38:53 helvi kernel: [50790.253893]  [<ffffffff8108ed55>] warn_slowpath_null+0x15/0x20
Mar 23 13:38:53 helvi kernel: [50790.253942]  [<ffffffff8120c92e>] bdev_inode_switch_bdi+0x7e/0x90
Mar 23 13:38:53 helvi kernel: [50790.253993]  [<ffffffff8120ccd8>] __blkdev_put+0x78/0x1c0
Mar 23 13:38:53 helvi kernel: [50790.254054]  [<ffffffff8120ce76>] blkdev_put+0x56/0x160
Mar 23 13:38:53 helvi kernel: [50790.254105]  [<ffffffff8120cfa0>] blkdev_close+0x20/0x30
Mar 23 13:38:53 helvi kernel: [50790.254154]  [<ffffffff811d7575>] __fput+0xe5/0x1f0
Mar 23 13:38:53 helvi kernel: [50790.254200]  [<ffffffff811d76c9>] ____fput+0x9/0x10
Mar 23 13:38:53 helvi kernel: [50790.254248]  [<ffffffff810a829f>] task_work_run+0xaf/0xf0
Mar 23 13:38:53 helvi kernel: [50790.254298]  [<ffffffff81049b09>] do_notify_resume+0x59/0x80
Mar 23 13:38:53 helvi kernel: [50790.254348]  [<ffffffff81790b5f>] int_signal+0x12/0x17
Mar 23 13:38:53 helvi kernel: [50790.254394] ---[ end trace 81a3f9538ffe05af ]---
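
As a cross-check on the log above, the failing sector and the first failing logical block appear to describe the same location on the zvol. The sketch below is only an illustration: it assumes 512-byte sectors and a 4096-byte buffer block size (the log states neither), and the names `ASSUMED_BLOCK_SIZE` and `sector_to_block` are hypothetical helpers, not part of any kernel or ZFS interface.

```python
# Relate the "sector" in blk_update_request to the "logical block"
# in the Buffer I/O error lines from the log above.

SECTOR_SIZE = 512          # bytes; the unit the Linux block layer reports sectors in
ASSUMED_BLOCK_SIZE = 4096  # bytes; ASSUMPTION, not stated anywhere in the log


def sector_to_block(sector, block_size=ASSUMED_BLOCK_SIZE):
    """Convert a 512-byte sector number to a logical block number."""
    return sector * SECTOR_SIZE // block_size


def sector_to_byte_offset(sector):
    """Byte offset of a sector from the start of the device."""
    return sector * SECTOR_SIZE


failed_sector = 105119744        # from the blk_update_request line
first_logged_block = 13139968    # from the first Buffer I/O error line

computed_block = sector_to_block(failed_sector)
offset_gib = sector_to_byte_offset(failed_sector) / 2**30

print(computed_block == first_logged_block)
print(round(offset_gib, 1))
```

Under these assumptions the two numbers match, and the failed write sits roughly halfway into the 100G zvol.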

I then removed the zvol and tried to create it again. ZFS stopped all reads and writes, and the 'zfs create' command never completed.

echo w > /proc/sysrq-trigger produced the following: http://pastebin.com/YZ75AgfJ

I will keep the system running for a few hours. I will gladly provide additional information if needed.

@dweeezil
Contributor

Broken by torvalds/linux@34b48db first appearing in 3.19.

@dweeezil
Contributor

See #3214.

@tuomari
Author

tuomari commented Mar 24, 2015

Thank you very much for the quick reply!
I have the patch running and there have been no lockups or other errors so far.

@behlendorf
Contributor

@tuomari @dweeezil Thanks for the quick fix and testing on this one. The fix itself looks good to me, but we'll wait and see what the buildbots have to say.

behlendorf added this to the 0.6.4 milestone Mar 24, 2015
DeHackEd pushed a commit to DeHackEd/zfs that referenced this issue Apr 4, 2015
ZoL had been setting max_sectors to UINT_MAX, but until Linux 3.19,
the kernel artificially capped it at 1024 (BLK_DEF_MAX_SECTORS).
This cap was removed in torvalds/linux@34b48db.  This patch changes
it to DMU_MAX_ACCESS (in sectors) and also changes the ASSERT in
dmu_tx_hold_write() to allow the maximum transfer size.

Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#3212
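
To put the numbers in the commit message in perspective, the following sketch converts each limit between sectors and bytes. It is arithmetic only, not the patch itself. The 512-byte sector size and the 32-bit `UINT_MAX` are standard; the value used for `ASSUMED_DMU_MAX_ACCESS` is a placeholder chosen for illustration and should be checked against the DMU_MAX_ACCESS definition in the ZFS tree being built. The helper names are hypothetical.

```python
# Compare the request-size limits mentioned in the commit message.

SECTOR_SIZE = 512            # bytes per sector in the Linux block layer
BLK_DEF_MAX_SECTORS = 1024   # pre-3.19 cap quoted in the commit message
UINT_MAX = 2**32 - 1         # what ZoL had been passing as max_sectors

# ASSUMPTION: illustrative placeholder, not taken from the issue or the patch.
ASSUMED_DMU_MAX_ACCESS = 10 * 1024 * 1024  # bytes


def sectors_to_bytes(sectors):
    """Size in bytes of a request spanning the given number of sectors."""
    return sectors * SECTOR_SIZE


def bytes_to_sectors(nbytes):
    """Number of whole sectors that fit in the given byte count."""
    return nbytes // SECTOR_SIZE


old_cap_bytes = sectors_to_bytes(BLK_DEF_MAX_SECTORS)        # cap before 3.19
uncapped_bytes = sectors_to_bytes(UINT_MAX)                  # once the cap was removed
new_limit_sectors = bytes_to_sectors(ASSUMED_DMU_MAX_ACCESS)  # limit expressed in sectors

print(old_cap_bytes)      # 524288, i.e. 512 KiB
print(uncapped_bytes)     # just under 2 TiB
print(new_limit_sectors)  # 20480 under the assumed placeholder value
```

With the old cap, no single request could exceed 512 KiB regardless of what the driver asked for. Without it, a UINT_MAX setting permits requests far larger than any plausible DMU transfer limit, which is why the patch ties the setting to DMU_MAX_ACCESS instead.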
DeHackEd pushed a commit to DeHackEd/zfs that referenced this issue Apr 5, 2015