General protection fault - userquota_updates_task #8048

Closed
rigred opened this issue Oct 20, 2018 · 6 comments

Comments

@rigred

rigred commented Oct 20, 2018

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Arch Linux |
| Distribution Version | Current (up to date) |
| Linux Kernel | 4.18.14 |
| Architecture | x86-64 |
| ZFS Version | 0.7.11-1 |
| SPL Version | 0.7.11-1 |

The code is built from the archzfs repo, which tracks the main release versions only.

Describe the problem you're observing

Occasionally and seemingly at random, possibly triggered by starting a scrub (though not consistently),
ZFS locks up with a general protection fault.

Reads from all datasets and pools still work, but writes do not.
Nothing seems to get written to disk either; everything appears to be stuck in a spinlock.

I noticed that another user previously reported a very similar issue in #7147.

Describe how to reproduce the problem

It's intermittent and hard to pin down the trigger.

Include any warning/errors/backtraces from the system logs

[Sat Oct 20 21:31:12 2018] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
[Sat Oct 20 21:31:12 2018] CPU: 3 PID: 3004 Comm: dp_sync_taskq Tainted: P           OE     4.18.14-arch1-1-ARCH #1
[Sat Oct 20 21:31:12 2018] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X370 Gaming K4, BIOS P4.70 04/18/2018
[Sat Oct 20 21:31:12 2018] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[Sat Oct 20 21:31:12 2018] Code: 48 89 06 48 89 56 08 48 89 32 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48 05 
[Sat Oct 20 21:31:12 2018] RSP: 0018:ffff96e00961bd30 EFLAGS: 00010282
[Sat Oct 20 21:31:12 2018] RAX: ffff8fb7b79f8140 RBX: ffff8fb7c17b2800 RCX: 0000000000000001
[Sat Oct 20 21:31:12 2018] RDX: dead000000000100 RSI: ffff8fb7b2bc4b98 RDI: ffff8fb7b79f8100
[Sat Oct 20 21:31:12 2018] RBP: ffff8fb7b2bc4ac0 R08: 0000000000000000 R09: ffff96e00961bbed
[Sat Oct 20 21:31:12 2018] R10: 000000000000000f R11: ffff96e00961bd01 R12: ffff8fb7b2bc4bb8
[Sat Oct 20 21:31:12 2018] R13: ffff8fb7b2bc4bd8 R14: ffff8fb7b79f8100 R15: ffff8fb6190a5b00
[Sat Oct 20 21:31:12 2018] FS:  0000000000000000(0000) GS:ffff8fbabecc0000(0000) knlGS:0000000000000000
[Sat Oct 20 21:31:12 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Sat Oct 20 21:31:12 2018] CR2: 00007f77edd2e700 CR3: 0000000522c98000 CR4: 00000000003406e0
[Sat Oct 20 21:31:12 2018] Call Trace:
[Sat Oct 20 21:31:12 2018]  userquota_updates_task+0xd0/0x4c0 [zfs]
[Sat Oct 20 21:31:12 2018]  ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[Sat Oct 20 21:31:12 2018]  ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[Sat Oct 20 21:31:12 2018]  taskq_thread+0x2ca/0x490 [spl]
[Sat Oct 20 21:31:12 2018]  ? wake_up_q+0x70/0x70
[Sat Oct 20 21:31:12 2018]  ? taskq_thread_should_stop+0x70/0x70 [spl]
[Sat Oct 20 21:31:12 2018]  kthread+0x112/0x130
[Sat Oct 20 21:31:12 2018]  ? kthread_flush_work_fn+0x10/0x10
[Sat Oct 20 21:31:12 2018]  ret_from_fork+0x22/0x40
[Sat Oct 20 21:31:12 2018] Modules linked in: cfg80211 rfkill scsi_transport_iscsi dm_mod vhost_net vhost macvtap macvlan tap xt_conntrack ipt_REJECT tun veth devlink nf_tables_set nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject ipt_MASQUERADE nft_ct xt_CHECKSUM xt_comment bridge nft_chain_nat_ipv6 stp llc nft_chain_nat_ipv4 nf_tables ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables fuse xt_pkttype xt_tcpudp iptable_filter nct6775 hwmon_vid nls_iso8859_1 nls_cp437 amdkfd amd_iommu_v2 wmi_bmof mxm_wmi
[Sat Oct 20 21:31:12 2018]  amdgpu snd_usb_audio edac_mce_amd kvm_amd snd_usbmidi_lib snd_hwdep kvm snd_rawmidi chash gpu_sched snd_seq_device ttm uvcvideo drm_kms_helper crct10dif_pclmul videobuf2_vmalloc videobuf2_memops crc32_pclmul snd_pcm videobuf2_v4l2 ghash_clmulni_intel videobuf2_common pcbc drm aesni_intel mousedev snd_timer videodev aes_x86_64 crypto_simd igb cryptd joydev input_leds led_class glue_helper agpgart ccp sp5100_tco syscopyarea r8169 pcspkr sysfillrect i2c_algo_bit sysimgblt k10temp dca fb_sys_fops rng_core i2c_piix4 snd mii media soundcore gpio_amdpt pinctrl_amd evdev mac_hid wmi pcc_cpufreq acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace usbip_host sunrpc usbip_core cpuid msr sg crypto_user ip_tables x_tables zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) sd_mod
[Sat Oct 20 21:31:12 2018]  ahci libahci libata scsi_mod vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio vfat fat ext4 crc32c_generic crc32c_intel crc16 mbcache jbd2 fscrypto
[Sat Oct 20 21:31:12 2018] ---[ end trace f2f506a013033ee7 ]---
[Sat Oct 20 21:31:12 2018] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[Sat Oct 20 21:31:12 2018] Code: 48 89 06 48 89 56 08 48 89 32 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48 05 
[Sat Oct 20 21:31:12 2018] RSP: 0018:ffff96e00961bd30 EFLAGS: 00010282
[Sat Oct 20 21:31:12 2018] RAX: ffff8fb7b79f8140 RBX: ffff8fb7c17b2800 RCX: 0000000000000001
[Sat Oct 20 21:31:12 2018] RDX: dead000000000100 RSI: ffff8fb7b2bc4b98 RDI: ffff8fb7b79f8100
[Sat Oct 20 21:31:12 2018] RBP: ffff8fb7b2bc4ac0 R08: 0000000000000000 R09: ffff96e00961bbed
[Sat Oct 20 21:31:12 2018] R10: 000000000000000f R11: ffff96e00961bd01 R12: ffff8fb7b2bc4bb8
[Sat Oct 20 21:31:12 2018] R13: ffff8fb7b2bc4bd8 R14: ffff8fb7b79f8100 R15: ffff8fb6190a5b00
[Sat Oct 20 21:31:12 2018] FS:  0000000000000000(0000) GS:ffff8fbabecc0000(0000) knlGS:0000000000000000
[Sat Oct 20 21:31:12 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Sat Oct 20 21:31:12 2018] CR2: 00007f77edd2e700 CR3: 0000000522c98000 CR4: 00000000003406e0

This completely locks all writes on the pool. I can only read from it.
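
One detail in the register dump may help when triaging: RDX holds dead000000000100. On x86-64 kernels of this era that value matches LIST_POISON1 from the kernel's include/linux/poison.h, the marker written into a list node's next pointer when the node is unlinked. If the Code: bytes are read correctly, the faulting instruction stores through RDX, which would be consistent with removing a node that is not currently on a list. The sketch below is only an illustration of how to spot this in a log. The poison constants are assumptions taken from the kernel headers, not from this report, and the helper names (parse_registers, find_poison, is_canonical) are hypothetical.

```python
import re

# Assumed constants: x86-64 default CONFIG_ILLEGAL_POINTER_VALUE plus the
# LIST_POISON offsets used by 4.x kernels (include/linux/poison.h).
POISON_DELTA = 0xDEAD000000000000
LIST_POISON1 = POISON_DELTA + 0x100
LIST_POISON2 = POISON_DELTA + 0x200

# Matches "RAX: ffff8fb7b79f8140" style fields. RSP and RIP are skipped
# because their values carry a segment prefix such as "0018:".
REG_RE = re.compile(r"\b(R[A-Z0-9]{2}):\s+([0-9a-f]{16})\b")

SAMPLE = """\
[Sat Oct 20 21:31:12 2018] RAX: ffff8fb7b79f8140 RBX: ffff8fb7c17b2800 RCX: 0000000000000001
[Sat Oct 20 21:31:12 2018] RDX: dead000000000100 RSI: ffff8fb7b2bc4b98 RDI: ffff8fb7b79f8100
[Sat Oct 20 21:31:12 2018] RBP: ffff8fb7b2bc4ac0 R08: 0000000000000000 R09: ffff96e00961bbed
[Sat Oct 20 21:31:12 2018] R10: 000000000000000f R11: ffff96e00961bd01 R12: ffff8fb7b2bc4bb8
[Sat Oct 20 21:31:12 2018] R13: ffff8fb7b2bc4bd8 R14: ffff8fb7b79f8100 R15: ffff8fb6190a5b00
"""


def parse_registers(log_text):
    """Return {register name: integer value} for every match in the log."""
    return {name: int(value, 16) for name, value in REG_RE.findall(log_text)}


def is_canonical(addr):
    """True if bits 63..47 are all equal (48-bit virtual addressing assumed)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


def find_poison(regs):
    """Return {register name: poison label} for registers holding a poison value."""
    labels = {LIST_POISON1: "LIST_POISON1", LIST_POISON2: "LIST_POISON2"}
    return {reg: labels[val] for reg, val in regs.items() if val in labels}


if __name__ == "__main__":
    regs = parse_registers(SAMPLE)
    for reg, label in find_poison(regs).items():
        print(f"{reg} = {regs[reg]:#018x} ({label}), canonical: {is_canonical(regs[reg])}")
```

A poison value is a non-canonical address, so any store through it raises a general protection fault rather than a page fault, which would fit the first line of the log. The same script can be pointed at saved dmesg output in place of SAMPLE.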

@bernie1995
Contributor

Duplicate of #7997

@bernie1995
Contributor

Duplicate of #8027

@bernie1995
Contributor

Duplicate of #7933

@rigred
Author

rigred commented Oct 20, 2018

Oh my, it's good to know that this has already gotten attention.
I see that #8005 might be a fix?

@behlendorf
Contributor

@rigred yes, #8005 is the backported fix you can apply to resolve the issue.

@bunder2015
Contributor

Closing as #8005 has been merged.
