[0.8.2] General protection fault in user access. Non-canonical address? #9417

Closed
fling- opened this issue Oct 6, 2019 · 12 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@fling-
Contributor

fling- commented Oct 6, 2019

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | gentoo |
| Distribution Version | 17.1 |
| Linux Kernel | 5.2.18 |
| Architecture | amd64 |
| ZFS Version | 0.8.2-1 |
| SPL Version | - |

Describe the problem you're observing

After updating to 0.8.2 I onlined a drive and got the following kernel BUG.
Resilver completed and everything works as expected, system looks stable.

Describe how to reproduce the problem

Unable to reproduce.

Include any warning/errors/backtraces from the system logs

[41579.556291]  zd48: p1 p2 < p5 >
[41586.406824] ------------[ cut here ]------------
[41586.406827] General protection fault in user access. Non-canonical address?
[41586.406840] WARNING: CPU: 27 PID: 18195 at arch/x86/mm/extable.c:126 ex_handler_uaccess+0x4d/0x58
[41586.406842] Modules linked in: ip6table_filter ip6_tables tun veth fuse bridge stp llc ipt_REJECT nf_reject_ipv4 iptable_filter xt_MASQUERADE xt_nat xt_comment xt_tcpudp iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c crc32c_generic iptable_mangle iptable_security ip_tables x_tables amd64_edac_mod edac_mce_amd kvm_amd ccp rng_core kvm nouveau hid_a4tech hid_generic usbhid hid snd_hda_codec_hdmi raid1 binfmt_misc video ast mxm_wmi uvcvideo wmi irqbypass md_mod ttm videobuf2_vmalloc snd_hda_intel snd_usb_audio videobuf2_memops videobuf2_v4l2 drm_kms_helper videobuf2_common snd_hda_codec snd_usbmidi_lib crct10dif_pclmul videodev snd_rawmidi drm snd_hda_core crc32_pclmul snd_seq_device media snd_hwdep w83795 crc32c_intel snd_pcm w83627ehf firewire_ohci e1000e hwmon_vid snd_timer firewire_core ghash_clmulni_intel evdev sp5100_tco snd serio_raw pcspkr k10temp crc_itu_t fam15h_power i2c_piix4 ohci_pci soundcore i2c_algo_bit pcc_cpufreq acpi_cpufreq button
[41586.406911] CPU: 27 PID: 18195 Comm: kworker/u68:0 Not tainted 5.2.18-gentoo-gnu-zfs-0.8.2 #2
[41586.406913] Hardware name: ASUS KGPE-D16/KGPE-D16, BIOS 4.9-499-ge583dd3d51-dirty 01/24/2019
[41586.406917] RIP: 0010:ex_handler_uaccess+0x4d/0x58
[41586.406921] Code: 83 c4 08 b8 01 00 00 00 5b c3 80 3d 42 a9 6b 01 00 75 dc 48 c7 c7 88 f8 33 8c 48 89 34 24 c6 05 2e a9 6b 01 01 e8 5d 8a 06 00 <0f> 0b 48 8b 34 24 eb bd 0f 1f 00 0f 1f 44 00 00 80 3d 11 a9 6b 01
[41586.406922] RSP: 0018:ffffb7121ecb7cf0 EFLAGS: 00010286
[41586.406925] RAX: 0000000000000000 RBX: ffffffff8be02404 RCX: 0000000000000000
[41586.406926] RDX: 0000000000000007 RSI: ffffffff8cd38c1f RDI: 0000000000000246
[41586.406927] RBP: 000000000000000d R08: 0000000000000002 R09: 000000000002aa00
[41586.406928] R10: 0000629556ad5d6b R11: 0000000000004713 R12: 0000000000000000
[41586.406929] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[41586.406931] FS:  0000000000000000(0000) GS:ffff9a28f7cc0000(0000) knlGS:0000000000000000
[41586.406932] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[41586.406934] CR2: 000001ac68200000 CR3: 00000003d4604000 CR4: 00000000000406e0
[41586.406935] Call Trace:
[41586.406942]  fixup_exception+0x43/0x56
[41586.406946]  do_general_protection+0x4b/0x158
[41586.406951]  general_protection+0x1e/0x30
[41586.406955] RIP: 0010:strnlen_user+0x48/0x110
[41586.406957] Code: 0f 86 de 00 00 00 48 29 f8 45 31 c9 0f 1f 00 0f ae e8 48 39 c6 49 89 fa 48 0f 46 c6 41 83 e2 07 48 83 e7 f8 4e 8d 04 10 31 d2 <4c> 8b 1f 85 d2 0f 85 94 00 00 00 42 8d 0c d5 00 00 00 00 ba 01 00
[41586.406958] RSP: 0018:ffffb7121ecb7e00 EFLAGS: 00010246
[41586.406960] RAX: 0000000000020000 RBX: 197a1e3a12cc6400 RCX: 0000000000000000
[41586.406961] RDX: 0000000000000000 RSI: 0000000000020000 RDI: 197a1e3a12cc6400
[41586.406962] RBP: 00007fffffffefef R08: 0000000000020000 R09: 0000000000000000
[41586.406963] R10: 0000000000000000 R11: 0000000000000067 R12: ffff9a289c532fef
[41586.406964] R13: ffff9a28dc441e00 R14: 0000000000000000 R15: fffff1e1bf714c80
[41586.406969]  copy_strings.isra.31+0x98/0x3a0
[41586.406973]  __do_execve_file.isra.38+0x5ac/0x9c8
[41586.406977]  ? yield_to+0x188/0x1a8
[41586.406979]  do_execve+0x21/0x28
[41586.406983]  call_usermodehelper_exec_async+0x189/0x1a8
[41586.406987]  ? recalc_sigpending+0x17/0x50
[41586.406990]  ? call_usermodehelper+0xa0/0xa0
[41586.406991]  ret_from_fork+0x22/0x40
[41586.406994] ---[ end trace 8bf348913b10f9a4 ]---
@Toolybird

Getting something similar here on Arch Linux, kernel-5.3.4, zfs-0.8.2. Happens on every boot:

Starting Import ZFS pools by cache file...
------------[ cut here ]------------
General protection fault in user access. Non-canonical address?
WARNING: CPU: 0 PID: 424 at arch/x86/mm/extable.c:126 ex_handler_uaccess+0x4d/0x60
Modules linked in: bridge stp llc radeon amd64_edac_mod edac_mce_amd i2c_algo_bit ttm kvm_amd ccp drm_kms_helper rng_core kvm drm tg3 irqbypass agpgart pcspkr syscopyarea sysfillrect libphy sysimgblt fb_sys_fops evdev mac_hid sp5100_tco i2c_piix4 k10temp acpi_cpufreq zfs(POE) zunicode(POE) zavl(POE) icp(POE) zlua(POE) nfsd auth_rpcgss nfs_acl lockd grace zcommon(POE) sunrpc znvpair(POE) spl(OE) ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sd_mod ohci_pci ahci libahci libata scsi_mod ehci_pci ohci_hcd ehci_hcd
CPU: 0 PID: 424 Comm: kworker/u8:6 Tainted: P           OE     5.3.4-arch1-1-ARCH #1
Hardware name: HP ProLiant MicroServer, BIOS O41     10/01/2013
RIP: 0010:ex_handler_uaccess+0x4d/0x60
Code: 83 c4 08 b8 01 00 00 00 5b c3 80 3d c5 f8 29 01 00 75 dc 48 c7 c7 a8 17 2c 95 48 89 34 24 c6 05 b1 f8 29 01 01 e8 52 70 01 00 <0f> 0b 48 8b 34 24 eb bd 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44
RSP: 0018:ffffa5e580b3bce8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffff94e02294 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff95ab2a7f RDI: 0000000000000246
RBP: ffffa5e580b3bd48 R08: 00000001a59d2549 R09: 000000000000003f
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff919cd5c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000018ca8f20 CR3: 00000002889d4000 CR4: 00000000000006f0
Call Trace:
 fixup_exception+0x45/0x58
 do_general_protection+0x48/0x170
 general_protection+0x32/0x40
RIP: 0010:strnlen_user+0x47/0x100
Code: 86 db 00 00 00 55 49 29 f9 45 31 c0 53 0f 1f 00 0f ae e8 4c 39 ce 49 89 fa 4c 0f 46 ce 41 83 e2 07 48 83 e7 f8 31 c0 4d 01 d1 <4c> 8b 1f 85 c0 0f 85 93 00 00 00 42 8d 0c d5 00 00 00 00 b8 01 00
RSP: 0018:ffffa5e580b3bdf8 EFLAGS: 00010206
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000044
RDX: 54a8781ff36bf100 RSI: 0000000000020000 RDI: 54a8781ff36bf100
RBP: ffffa5e580b3bf08 R08: 0000000000000000 R09: 0000000000020000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 54a8781ff36bf100 R14: 0000000000000000 R15: ffffa5e580b23cb0
 ? copy_strings.isra.0+0x2b1/0x320
 copy_strings.isra.0+0xc3/0x320
 __do_execve_file.isra.0+0x4ce/0x8a0
 ? umh_complete+0x40/0x40
 do_execve+0x21/0x30
 call_usermodehelper_exec_async+0x17b/0x1a0
 ? recalc_sigpending+0x17/0x50
 ret_from_fork+0x22/0x40
---[ end trace 918b5e00d7fe891f ]---

@Lalufu
Contributor

Lalufu commented Oct 7, 2019

This blows up trying to dereference RDI, which indeed holds a non-canonical address, i.e. one that falls outside the valid address space on x86-64.
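
To make the "non-canonical" point concrete, here is a minimal sketch of the canonical-address rule on x86-64. It assumes 48-bit virtual addresses (4-level paging), which is an assumption about the affected machines rather than something stated in the logs; `is_canonical` is a hypothetical helper name. The RDI values are copied from the traces in this thread.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if a 64-bit address is canonical for the given width.

    An address is canonical when bits 63..(va_bits - 1) are all zeros or
    all ones. va_bits=48 is an assumption (4-level paging).
    """
    addr &= (1 << 64) - 1
    top = addr >> (va_bits - 1)               # bit va_bits-1 and everything above
    all_ones = (1 << (64 - va_bits + 1)) - 1  # the same bits, all set
    return top == 0 or top == all_ones


# RDI at the faulting strnlen_user instruction, taken from three of the traces
for rdi in (0x197A1E3A12CC6400, 0x54A8781FF36BF100, 0x0633D5510E4FFE00):
    print(f"{rdi:#018x} canonical: {is_canonical(rdi)}")

# For contrast, two values from the first trace that are canonical
print(is_canonical(0x00007FFFFFFFEFEF))  # RBP, top of the user address range
print(is_canonical(0xFFFFB7121ECB7E00))  # RSP, a kernel stack address
```

Under that assumption all three RDI values come out as non-canonical, which matches the warning text; they look more like garbage than like a pointer into either half of the address space.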

@behlendorf behlendorf added the Type: Defect Incorrect behavior (e.g. crash, hang) label Oct 7, 2019
@aerusso
Contributor

aerusso commented Oct 8, 2019

I'm also reproducing this every boot. This started only (and immediately) after I upgraded firmware on my X470 Taichi Ultimate (with Ryzen 7 3700X) from P3.2 to P3.4 (addressing an issue unrelated to ZFS).

Oct 07 19:58:41 REDACTED systemd[1]: Starting Import ZFS frostssdpool...
Oct 07 19:58:41 REDACTED systemd[1]: Starting Import ZFS frosthugepool...
Oct 07 19:58:42 REDACTED kernel: ------------[ cut here ]------------
Oct 07 19:58:42 REDACTED kernel: General protection fault in user access. Non-canonical address?
Oct 07 19:58:42 REDACTED kernel: WARNING: CPU: 14 PID: 1949 at arch/x86/mm/extable.c:126 ex_handler_uaccess+0x4d/0x60
Oct 07 19:58:42 REDACTED kernel: Modules linked in: dm_crypt dm_mod edac_mce_amd kvm_amd nls_ascii nls_cp437 vfat fat joydev kvm irqbypass zfs(POE) zunicode(POE) zlua(POE) arc4 crct10dif_pclmul crc32_pclmul ghash>
Oct 07 19:58:42 REDACTED kernel:  gpu_sched i2c_algo_bit ttm drm_kms_helper ahci xhci_pci libahci drm xhci_hcd libata crc32c_intel nvme usbcore scsi_mod nvme_core usb_common wmi i2c_dev parport_pc ppdev lp parpor>
Oct 07 19:58:42 REDACTED kernel: CPU: 14 PID: 1949 Comm: kworker/u64:13 Tainted: P           OE     5.2.0-3-amd64 #1 Debian 5.2.17-1
Oct 07 19:58:42 REDACTED kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi Ultimate, BIOS P3.40 08/15/2019
Oct 07 19:58:42 REDACTED kernel: RIP: 0010:ex_handler_uaccess+0x4d/0x60
Oct 07 19:58:42 REDACTED kernel: Code: 83 c4 08 b8 01 00 00 00 5b c3 80 3d 0e 3f 09 01 00 75 dc 48 c7 c7 28 17 47 93 48 89 34 24 c6 05 fa 3e 09 01 01 e8 7d 74 01 00 <0f> 0b 48 8b 34 24 eb bd 66 66 2e 0f 1f 84 00 >
Oct 07 19:58:42 REDACTED kernel: RSP: 0018:ffffb84d49a57cf0 EFLAGS: 00010286
Oct 07 19:58:42 REDACTED kernel: RAX: 0000000000000000 RBX: ffffffff930023f0 RCX: 0000000000000000
Oct 07 19:58:42 REDACTED kernel: RDX: 0000000000000007 RSI: ffffffff93c0fb9f RDI: 0000000000000246
Oct 07 19:58:42 REDACTED kernel: RBP: 000000000000000d R08: ffffffff93c0fb60 R09: 0000000000029940
Oct 07 19:58:42 REDACTED kernel: R10: 0000003df67db3e8 R11: 000000000000079d R12: 0000000000000000
Oct 07 19:58:42 REDACTED kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Oct 07 19:58:42 REDACTED kernel: FS:  0000000000000000(0000) GS:ffff9672fe980000(0000) knlGS:0000000000000000
Oct 07 19:58:42 REDACTED kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 07 19:58:42 REDACTED kernel: CR2: 00007f706f9860a4 CR3: 00000007e053a000 CR4: 0000000000340ee0
Oct 07 19:58:42 REDACTED kernel: Call Trace:
Oct 07 19:58:42 REDACTED kernel:  fixup_exception+0x43/0x56
Oct 07 19:58:42 REDACTED kernel:  do_general_protection+0x4b/0x160
Oct 07 19:58:42 REDACTED kernel:  general_protection+0x1e/0x30
Oct 07 19:58:42 REDACTED kernel: RIP: 0010:strnlen_user+0x47/0x110
Oct 07 19:58:42 REDACTED kernel: Code: f8 0f 86 df 00 00 00 48 29 f8 45 31 c9 0f 01 cb 0f ae e8 48 39 c6 49 89 fa 48 0f 46 c6 41 83 e2 07 48 83 e7 f8 31 c9 4c 01 d0 <4c> 8b 1f 85 c9 0f 85 96 00 00 00 42 8d 0c d5 >
Oct 07 19:58:42 REDACTED kernel: RSP: 0018:ffffb84d49a57e00 EFLAGS: 00050206
Oct 07 19:58:42 REDACTED kernel: RAX: 0000000000020000 RBX: 0633d5510e4ffe00 RCX: 0000000000000000
Oct 07 19:58:42 REDACTED kernel: RDX: 0633d5510e4ffe00 RSI: 0000000000020000 RDI: 0633d5510e4ffe00
Oct 07 19:58:42 REDACTED kernel: RBP: 00007fffffffefef R08: 0000000000000000 R09: 0000000000000000
Oct 07 19:58:42 REDACTED kernel: R10: 0000000000000000 R11: 000ffffffffff000 R12: ffff9672eb735fef
Oct 07 19:58:42 REDACTED kernel: R13: ffff9672df4e0600 R14: 0000000000000000 R15: ffffe97d5fadcd40
Oct 07 19:58:42 REDACTED kernel:  copy_strings.isra.31+0x98/0x390
Oct 07 19:58:42 REDACTED kernel:  __do_execve_file.isra.38+0x5b5/0x9d0
Oct 07 19:58:42 REDACTED kernel:  ? yield_to+0x150/0x1a0
Oct 07 19:58:42 REDACTED kernel:  do_execve+0x21/0x30
Oct 07 19:58:42 REDACTED kernel:  call_usermodehelper_exec_async+0x189/0x1b0
Oct 07 19:58:42 REDACTED kernel:  ? recalc_sigpending+0x17/0x50
Oct 07 19:58:42 REDACTED kernel:  ? call_usermodehelper+0xa0/0xa0
Oct 07 19:58:42 REDACTED kernel:  ret_from_fork+0x22/0x40
Oct 07 19:58:42 REDACTED kernel: ---[ end trace 3d6d32f3ee4dce49 ]---
Oct 07 19:58:47 REDACTED systemd[1]: Started Import ZFS frostssdpool.
Oct 07 19:58:47 REDACTED systemd[1]: Started Import ZFS frosthugepool.

Can anyone comment on whether this is safe to use?

@matschi-klickme

I'm getting a similar error message on an X270.

Since an update, my system also reboots instead of resuming after suspend. Might this be related?

journalctl -b:

Oct 22 13:39:08 errors kernel: spl: loading out-of-tree module taints kernel.
Oct 22 13:39:08 errors kernel: spl: module verification failed: signature and/or required key missing - tainting kernel
Oct 22 13:39:08 errors kernel: icp: module license 'CDDL' taints kernel.
Oct 22 13:39:08 errors kernel: Disabling lock debugging due to kernel taint
Oct 22 13:39:09 errors kernel: ZFS: Loaded module v0.8.2-2, ZFS pool version 5000, ZFS filesystem version 5
Oct 22 13:39:09 errors systemd[1]: Started Cryptography Setup for datazfs.
Oct 22 13:39:09 errors systemd[1]: Started Cryptography Setup for nvme0n1p5_crypt.
Oct 22 13:39:09 errors systemd[1]: Reached target Local Encrypted Volumes.
Oct 22 13:39:09 errors systemd[1]: Starting Import ZFS pools by cache file...
Oct 22 13:39:09 errors kernel: ------------[ cut here ]------------
Oct 22 13:39:09 errors kernel: General protection fault in user access. Non-canonical address?
Oct 22 13:39:09 errors kernel: WARNING: CPU: 2 PID: 1003 at arch/x86/mm/extable.c:126 ex_handler_uaccess+0x4d/0x60
Oct 22 13:39:09 errors kernel: Modules linked in: zfs(POE) zunicode(POE) zlua(POE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) nls_ascii nls_cp437 vfat fat arc4 msr snd_hda_>
Oct 22 13:39:09 errors kernel:  rfkill tpm_crb battery ac pcc_cpufreq tpm_tis tpm_tis_core tpm acpi_pad evdev rng_core coretemp parport_pc ppdev lp parport efivarfs ip_tables x_table>
Oct 22 13:39:09 errors kernel: CPU: 2 PID: 1003 Comm: kworker/u8:5 Tainted: P           OE     5.2.0-3-amd64 #1 Debian 5.2.17-1
Oct 22 13:39:09 errors kernel: Hardware name: LENOVO 20HN002UGE/20HN002UGE, BIOS R0IET55W (1.33 ) 09/14/2018
Oct 22 13:39:09 errors kernel: RIP: 0010:ex_handler_uaccess+0x4d/0x60
Oct 22 13:39:09 errors kernel: Code: 83 c4 08 b8 01 00 00 00 5b c3 80 3d 0e 3f 09 01 00 75 dc 48 c7 c7 28 17 c7 b3 48 89 34 24 c6 05 fa 3e 09 01 01 e8 7d 74 01 00 <0f> 0b 48 8b 34 24>
Oct 22 13:39:09 errors kernel: RSP: 0018:ffffab90824f7cf0 EFLAGS: 00010286
Oct 22 13:39:09 errors kernel: RAX: 0000000000000000 RBX: ffffffffb38023f0 RCX: 0000000000000000
Oct 22 13:39:09 errors kernel: RDX: 000000000000003f RSI: ffffffffb440fb9f RDI: 0000000000000246
Oct 22 13:39:09 errors kernel: RBP: 000000000000000d R08: ffffffffb440fb60 R09: 0000000000029940
Oct 22 13:39:09 errors kernel: R10: 0000001d5dac5785 R11: 00000000000003eb R12: 0000000000000000
Oct 22 13:39:09 errors kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Oct 22 13:39:09 errors kernel: FS:  0000000000000000(0000) GS:ffff9e8f32500000(0000) knlGS:0000000000000000
Oct 22 13:39:09 errors kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 22 13:39:09 errors kernel: CR2: 00007faf98e034a0 CR3: 000000041f590003 CR4: 00000000003606e0
Oct 22 13:39:09 errors kernel: Call Trace:
Oct 22 13:39:09 errors kernel:  fixup_exception+0x43/0x56
Oct 22 13:39:09 errors kernel:  do_general_protection+0x4b/0x160
Oct 22 13:39:09 errors kernel:  general_protection+0x1e/0x30
Oct 22 13:39:09 errors kernel: RIP: 0010:strnlen_user+0x47/0x110
Oct 22 13:39:09 errors kernel: Code: f8 0f 86 df 00 00 00 48 29 f8 45 31 c9 0f 01 cb 0f ae e8 48 39 c6 49 89 fa 48 0f 46 c6 41 83 e2 07 48 83 e7 f8 31 c9 4c 01 d0 <4c> 8b 1f 85 c9 0f>
Oct 22 13:39:09 errors kernel: RSP: 0018:ffffab90824f7e00 EFLAGS: 00050206
Oct 22 13:39:09 errors kernel: RAX: 0000000000020000 RBX: 36cac46053b3b800 RCX: 0000000000000000
Oct 22 13:39:09 errors kernel: RDX: 36cac46053b3b800 RSI: 0000000000020000 RDI: 36cac46053b3b800
Oct 22 13:39:09 errors kernel: RBP: 00007fffffffefef R08: 0000000000000000 R09: 0000000000000000
Oct 22 13:39:09 errors kernel: R10: 0000000000000000 R11: 000ffffffffff000 R12: ffff9e8f18737fef
Oct 22 13:39:09 errors kernel: R13: ffff9e8f1edb3e00 R14: 0000000000000000 R15: ffffe8eed061cdc0
Oct 22 13:39:09 errors kernel:  copy_strings.isra.31+0x98/0x390
Oct 22 13:39:09 errors kernel:  __do_execve_file.isra.38+0x5b5/0x9d0
Oct 22 13:39:09 errors kernel:  ? yield_to+0x150/0x1a0
Oct 22 13:39:09 errors kernel:  do_execve+0x21/0x30
Oct 22 13:39:09 errors kernel:  call_usermodehelper_exec_async+0x189/0x1b0
Oct 22 13:39:09 errors kernel:  ? recalc_sigpending+0x17/0x50
Oct 22 13:39:09 errors kernel:  ? call_usermodehelper+0xa0/0xa0
Oct 22 13:39:09 errors kernel:  ret_from_fork+0x35/0x40
Oct 22 13:39:09 errors kernel: ---[ end trace 831fb37686de9ac2 ]---
Oct 22 13:39:09 errors systemd[1]: Started Import ZFS pools by cache file.
Oct 22 13:39:09 errors systemd[1]: Reached target ZFS pool import target.
Oct 22 13:39:09 errors systemd[1]: Starting Wait for ZFS Volume (zvol) links in /dev...
Oct 22 13:39:09 errors systemd[1]: Starting Mount ZFS filesystems...
Oct 22 13:39:09 errors zvol_wait[1253]: Testing 1 zvol links
Oct 22 13:39:09 errors zvol_wait[1253]: All zvol links are now present.
Oct 22 13:39:09 errors systemd[1]: Started Wait for ZFS Volume (zvol) links in /dev.
Oct 22 13:39:09 errors systemd[1]: Reached target ZFS volumes are ready.
Oct 22 13:39:09 errors systemd[1]: Started Mount ZFS filesystems.
Oct 22 13:39:09 errors systemd[1]: Reached target Local File Systems.

@ReimuNotMoe

I have the same issue.

| Type | Version/Name |
| --- | --- |
| Distribution Name | Ubuntu |
| Distribution Version | 19.04 |
| Linux Kernel | 5.3.7 |
| Architecture | amd64 |
| ZFS Version | 0.8.2-1 |
| SPL Version | - |

Although my data looks fine.

Oct 27 07:09:07 laptop systemd-modules-load[617]: Inserted module 'zfs'
Oct 27 07:09:07 laptop kernel: ZFS: Loaded module v0.8.2-1, ZFS pool version 5000, ZFS filesystem version 5
Oct 27 07:09:08 laptop systemd[1]: Starting Import ZFS pools by cache file...
Oct 27 07:09:08 laptop systemd[1]: Condition check resulted in Import ZFS pools by device scanning being skipped.
Oct 27 07:09:08 laptop kernel: ------------[ cut here ]------------
Oct 27 07:09:08 laptop kernel: General protection fault in user access. Non-canonical address?
Oct 27 07:09:08 laptop kernel: WARNING: CPU: 9 PID: 1372 at arch/x86/mm/extable.c:126 ex_handler_uaccess+0x52/0x60
Oct 27 07:09:08 laptop kernel: Modules linked in: uvcvideo videobuf2_vmalloc videobuf2_memops btusb videobuf2_v4l2 btrtl btbcm videobuf2_common btintel bluetooth videodev mc ecdh_generic ecc sch_fq_codel nls_iso8859_1 zfs(POE) snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_skl_ipc intel_rapl_msr snd_soc_sst_ipc intel_rapl_common snd_soc_sst_dsp x86_pkg_temp_thermal snd_soc_acpi_intel_match intel_powerclamp snd_soc_acpi coretemp snd_hda_codec_hdmi snd_soc_core zunicode(POE) kvm_intel iwlmvm joydev kvm snd_hda_codec_realtek snd_compress zavl(POE) snd_seq_midi ac97_bus irqbypass icp(POE) mac80211 snd_seq_midi_event snd_pcm_dmaengine snd_hda_codec_generic libarc4 ledtrig_audio intel_cstate zlua(POE) snd_rawmidi snd_hda_intel input_leds snd_hda_codec intel_rapl_perf asus_wmi snd_hda_core sparse_keymap input_polldev snd_hwdep iwlwifi serio_raw snd_seq snd_pcm wmi_bmof snd_seq_device hid_multitouch mxm_wmi 8250_dw snd_timer cfg80211 idma64 mei_me snd virt_dma mei soundcore intel_pch_thermal acpi_tad acpi_pad
Oct 27 07:09:08 laptop kernel:  mac_hid nvidia_uvm(OE) nfsd auth_rpcgss nfs_acl lockd grace zcommon(POE) znvpair(POE) spl(OE) sunrpc parport_pc ppdev lp parport ip_tables x_tables autofs4 xfs libcrc32c algif_skcipher af_alg dm_crypt hid_logitech_hidpp hid_logitech_dj usbhid hid_generic nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i915 i2c_algo_bit aesni_intel drm_kms_helper ahci r8169 syscopyarea aes_x86_64 sysfillrect crypto_simd sysimgblt cryptd fb_sys_fops glue_helper nvme drm realtek libahci intel_lpss_pci ipmi_devintf i2c_i801 nvme_core i2c_hid intel_lpss i2c_nvidia_gpu ipmi_msghandler hid pinctrl_cannonlake wmi video pinctrl_intel
Oct 27 07:09:08 laptop kernel: CPU: 9 PID: 1372 Comm: kworker/u24:3 Tainted: P           OE     5.3.7-reimu #1
Oct 27 07:09:08 laptop kernel: Hardware name: Shinelon Computer T3 Ti/GK5CP6V-S, BIOS N.1.00 07/03/2019
Oct 27 07:09:08 laptop kernel: RIP: 0010:ex_handler_uaccess+0x52/0x60
Oct 27 07:09:08 laptop kernel: Code: c4 08 b8 01 00 00 00 5b 5d c3 80 3d 12 59 b8 01 00 75 db 48 c7 c7 d8 91 ca 94 48 89 75 f0 c6 05 fe 58 b8 01 01 e8 0f 99 01 00 <0f> 0b 48 8b 75 f0 eb bc 66 0f 1f 44 00 00 0f 1f 44 00 00 80 3d de
Oct 27 07:09:08 laptop kernel: RSP: 0000:ffffac9081493cc0 EFLAGS: 00010282
Oct 27 07:09:08 laptop kernel: RAX: 0000000000000000 RBX: ffffffff94402264 RCX: 0000000000000001
Oct 27 07:09:08 laptop kernel: RDX: 0000000080000001 RSI: ffffffff9555cf5f RDI: 0000000000000246
Oct 27 07:09:08 laptop kernel: RBP: ffffac9081493cd0 R08: ffffffff9555cf20 R09: 0000000000029f00
Oct 27 07:09:08 laptop kernel: R10: 00000015cf056f9c R11: 000000000000055c R12: 000000000000000d
Oct 27 07:09:08 laptop kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Oct 27 07:09:08 laptop kernel: FS:  0000000000000000(0000) GS:ffff9e6acda40000(0000) knlGS:0000000000000000
Oct 27 07:09:08 laptop kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 27 07:09:08 laptop kernel: CR2: 00007fc44a44a4b4 CR3: 0000000c36c9b002 CR4: 00000000003606e0
Oct 27 07:09:08 laptop kernel: Call Trace:
Oct 27 07:09:08 laptop kernel:  fixup_exception+0x48/0x5f
Oct 27 07:09:08 laptop kernel:  do_general_protection+0x4e/0x180
Oct 27 07:09:08 laptop kernel:  general_protection+0x28/0x30
Oct 27 07:09:08 laptop kernel: RIP: 0010:strnlen_user+0x4c/0x110
Oct 27 07:09:08 laptop kernel: Code: f8 0f 86 e1 00 00 00 48 29 f8 45 31 c9 0f 01 cb 0f ae e8 48 39 c6 49 89 fa 48 0f 46 c6 41 83 e2 07 48 83 e7 f8 31 c9 4c 01 d0 <4c> 8b 1f 85 c9 0f 85 96 00 00 00 42 8d 0c d5 00 00 00 00 41 b8 01
Oct 27 07:09:08 laptop kernel: RSP: 0000:ffffac9081493de8 EFLAGS: 00050206
Oct 27 07:09:08 laptop kernel: RAX: 0000000000020000 RBX: 04c6d0769d251800 RCX: 0000000000000000
Oct 27 07:09:08 laptop kernel: RDX: 04c6d0769d251800 RSI: 0000000000020000 RDI: 04c6d0769d251800
Oct 27 07:09:08 laptop kernel: RBP: ffffac9081493df8 R08: 8080808080808080 R09: 0000000000000000
Oct 27 07:09:08 laptop kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 00007fffffffefe7
Oct 27 07:09:08 laptop kernel: R13: ffff9e6aa1715fe7 R14: ffffebf5f085c540 R15: ffffebf5f085c540
Oct 27 07:09:08 laptop kernel:  ? _copy_from_user+0x3e/0x60
Oct 27 07:09:08 laptop kernel:  copy_strings.isra.32+0x91/0x380
Oct 27 07:09:08 laptop kernel:  __do_execve_file.isra.41+0x597/0x970
Oct 27 07:09:08 laptop kernel:  do_execve+0x25/0x30
Oct 27 07:09:08 laptop kernel:  call_usermodehelper_exec_async+0x17a/0x1a0
Oct 27 07:09:08 laptop kernel:  ? umh_complete+0x40/0x40
Oct 27 07:09:08 laptop kernel:  ret_from_fork+0x1f/0x30
Oct 27 07:09:08 laptop kernel: ---[ end trace bf6b6ef3c898a9fb ]---
Oct 27 07:09:08 laptop systemd[1]: Started Import ZFS pools by cache file.
Oct 27 07:09:08 laptop systemd[1]: Reached target ZFS pool import target.
Oct 27 07:09:08 laptop systemd[1]: Reached target ZFS startup target.
Oct 27 07:09:08 laptop systemd[1]: Starting Mount ZFS filesystems...
Oct 27 07:09:08 laptop systemd[1]: Started Mount ZFS filesystems.
Oct 27 07:09:08 laptop systemd[1]: Reached target Local File Systems.

@ReimuNotMoe

ReimuNotMoe commented Oct 27, 2019

More info: The GPF problem from 0.8.2-1 has never occurred on my home server (at least not in the last 5 reboots), which is a dual E5-2648L v2 machine with kernel 5.2.11.

Edit: The mobo is Supermicro X9DRI-F with latest BIOS.

@aerusso
Contributor

aerusso commented Oct 27, 2019

@ReimuNotMoe What CPU/motherboard/firmware revision do you have? I didn't start experiencing this until I did a firmware upgrade.

@ReimuNotMoe

ReimuNotMoe commented Oct 27, 2019

> @ReimuNotMoe What CPU/motherboard/firmware revision do you have? I didn't start experiencing this until I did a firmware upgrade.

It's a lesser-known Chinese brand laptop with an AMI BIOS, an Intel i7 9750H CPU, and a Cannon Lake PCH.

The BIOS says version N.1.00.

Edit:

  1. I have package intel-microcode installed on both machines (my laptop & home server), version 3.20190618.0ubuntu0.19.04.1.
  2. I've disabled mitigations for Meltdown and Spectre V2 on both machines for performance considerations. But this shouldn't be the problem.

@behlendorf
Contributor

You should be able to avoid this issue by setting the zfs_vdev_scheduler module option to "none". Due to changes made to the Linux kernel, this module option will be removed in a feature release.

If you need to set a custom scheduler you'll want to do so via the standard /sys/block/<block>/queue/scheduler interface.
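
As a rough illustration of the sysfs interface mentioned above, here is a minimal sketch that reads and sets the scheduler for one block device. The helper names are hypothetical, and the parser assumes the usual layout of that file, where the available schedulers are listed on one line and the active one is wrapped in square brackets. Writing to the file requires root.

```python
from __future__ import annotations

from pathlib import Path


def parse_schedulers(text: str) -> tuple[str | None, list[str]]:
    """Split the contents of queue/scheduler into (active, available)."""
    active = None
    available = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # the bracketed entry is the active scheduler
            active = token
        available.append(token)
    return active, available


def read_scheduler(block: str) -> tuple[str | None, list[str]]:
    """Read the scheduler line for one block device, e.g. block='sda'."""
    path = Path("/sys/block") / block / "queue" / "scheduler"
    return parse_schedulers(path.read_text())


def set_scheduler(block: str, name: str) -> None:
    """Select a scheduler by writing its name to the sysfs file (needs root)."""
    path = Path("/sys/block") / block / "queue" / "scheduler"
    path.write_text(name)


# Parsing a sample line; the scheduler names here are only an example
print(parse_schedulers("mq-deadline kyber [none]\n"))
```

A setting written this way does not survive a reboot, so a persistent configuration would still need something like a udev rule or a boot-time script.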

@ReimuNotMoe

So here are the configurations I've tested so far:

| CPU | Mobo | Distro | Kernel | Pools | Has this issue |
| --- | --- | --- | --- | --- | --- |
| Intel E5-2648L v2 | Supermicro X9DRI-F | Ubuntu 18.04 | 5.2.11 | raidz2 x 3 | No |
| Intel E5-2648L v2 | Supermicro X9DRI-F | Ubuntu 18.04 | 5.3.8 | raidz2 x 3 | No |
| AMD Opteron 6366 HE | Supermicro H8DGI-F | Ubuntu 18.04 | 5.3.8 | raidz2 x 2 | Yes |
| Intel i7-9750H | Tongfang GK5CP6V-S | Ubuntu 19.04 | 5.3.7 | single x 2 | Yes |

All machines have the latest BIOS and microcode, and have all CPU bug mitigations (Meltdown, etc.) disabled.

@behlendorf
Contributor

Thanks for the additional information. I should have included a little more information in my previous comment.

This issue was introduced by PR #9321, which resolved a deadlock that could occur when scrubbing root pools. The original fix for that deadlock was to remove the zfs_vdev_scheduler module option entirely, but at the time we wanted to avoid removing it mid-0.8 release if possible.

PR #9422 contains a fix for this as long as you're not using a root pool. Setting the module option zfs_vdev_scheduler to "none" should prevent it as well.
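
One way to make the "none" setting persistent is an `options` line in a modprobe configuration file. The sketch below only edits the text of such a file; the helper name is hypothetical, and the file location (commonly something like /etc/modprobe.d/zfs.conf) is an assumption that depends on the distribution.

```python
def set_module_option(conf: str, module: str, key: str, value: str) -> str:
    """Return conf with `options <module> <key>=<value>` set.

    An existing value for the same key is replaced in place; other
    parameters and unrelated lines are kept. If the key is not present,
    a new options line is appended.
    """
    out = []
    found = False
    for line in conf.splitlines():
        parts = line.split()
        if parts[:2] == ["options", module]:
            params = []
            for param in parts[2:]:
                if param.split("=", 1)[0] == key:
                    param = f"{key}={value}"
                    found = True
                params.append(param)
            line = " ".join(parts[:2] + params)
        out.append(line)
    if not found:
        out.append(f"options {module} {key}={value}")
    return "\n".join(out) + "\n"


# Starting from an empty file
print(set_module_option("", "zfs", "zfs_vdev_scheduler", "none"), end="")
```

Module options are read when the module loads, so the change takes effect on the next load of the zfs module. On systems that load zfs from an initramfs, the initramfs may also need to be regenerated.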

@smopucilowski

Also hit this on fresh boot:

Linux 5.3.11-gentoo #2 SMP Wed Nov 13 23:11:33 AEDT 2019 x86_64 Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz GenuineIntel GNU/Linux
[   26.068915] ZFS: Loaded module v0.8.2-r0-gentoo, ZFS pool version 5000, ZFS filesystem version 5
[   31.770001] ------------[ cut here ]------------
[   31.770004] General protection fault in user access. Non-canonical address?
[   31.770015] WARNING: CPU: 3 PID: 3993 at ex_handler_uaccess+0x48/0x50
[   31.770016] Modules linked in: zfs(PO) zunicode(PO) zavl(PO) icp(PO) zlua(PO) zcommon(PO) znvpair(PO) spl(O) mpt3sas
[   31.770027] CPU: 3 PID: 3993 Comm: kworker/u8:6 Tainted: P           O      5.3.11-gentoo #2
[   31.770029] Hardware name: Intel Corporation S1200RP/S1200RP, BIOS S1200RP.86B.03.04.0002.110820161604 11/08/2016
[   31.770032] RIP: 0010:ex_handler_uaccess+0x48/0x50
[   31.770035] Code: 83 c4 08 b8 01 00 00 00 5b c3 80 3d be 9c e4 00 00 75 dc 48 c7 c7 08 57 d4 9f 48 89 34 24 c6 05 aa 9c e4 00 01 e8 5f f3 01 00 <0f> 0b 48 8b 34 24 eb bd 80 3d 95 9c e4 00 00 55 48 89 fd 53 48 89
[   31.770037] RSP: 0018:ffffa9d802d03ce8 EFLAGS: 00010282
[   31.770039] RAX: 0000000000000000 RBX: ffffffff9fa0311c RCX: 0000000000000000
[   31.770041] RDX: 000000000000003f RSI: ffffffffa025303f RDI: ffffffffa025343f
[   31.770042] RBP: ffffa9d802d03d48 R08: 0000000765a3ca2a R09: 000000000000003f
[   31.770044] R10: 0000000000013f7c R11: 0000000000000004 R12: 000000000000000d
[   31.770045] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   31.770047] FS:  0000000000000000(0000) GS:ffff9c982fd80000(0000) knlGS:0000000000000000
[   31.770049] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   31.770050] CR2: 000000000001f2c0 CR3: 0000000819886006 CR4: 00000000001606e0
[   31.770051] Call Trace:
[   31.770058]  fixup_exception+0x40/0x53
[   31.770064]  do_general_protection+0x3d/0x110
[   31.770068]  general_protection+0x28/0x30
[   31.770075] RIP: 0010:strnlen_user+0x47/0x100
[   31.770078] Code: 86 db 00 00 00 55 49 29 f9 45 31 c0 53 0f 1f 00 0f ae e8 4c 39 ce 49 89 fa 4c 0f 46 ce 41 83 e2 07 48 83 e7 f8 31 c0 4d 01 d1 <4c> 8b 1f 85 c0 0f 85 93 00 00 00 42 8d 0c d5 00 00 00 00 b8 01 00
[   31.770079] RSP: 0018:ffffa9d802d03df8 EFLAGS: 00010206
[   31.770081] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000054
[   31.770083] RDX: 15803b9a0d399500 RSI: 0000000000020000 RDI: 15803b9a0d399500
[   31.770084] RBP: ffffa9d802d03f08 R08: 0000000000000000 R09: 0000000000020000
[   31.770085] R10: 0000000000000000 R11: ffff9c982a0b5000 R12: ffff9c981b970000
[   31.770087] R13: 15803b9a0d399500 R14: 0000000000000000 R15: ffff9c9819185000
[   31.770095]  ? copy_strings.isra.0+0x280/0x2e0
[   31.770099]  copy_strings.isra.0+0xb7/0x2e0
[   31.770104]  __do_execve_file.isra.0+0x4b4/0x7d0
[   31.770110]  ? syscall_return_via_sysret+0x1f/0x7f
[   31.770116]  ? umh_complete+0x30/0x30
[   31.770119]  do_execve+0x1c/0x20
[   31.770124]  call_usermodehelper_exec_async+0x170/0x190
[   31.770128]  ret_from_fork+0x35/0x40
[   31.770132] ---[ end trace 9e2e2e0635d70166 ]---

@behlendorf behlendorf added this to To do in 0.8-release Nov 21, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Dec 26, 2019
As described in commit f81d5ef the zfs_vdev_elevator module
option is being removed.  Users who require this functionality
should update their systems to set the disk scheduler using a
udev rule.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#8664
Closes openzfs#9417
Closes openzfs#9609
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Dec 27, 2019
tonyhutter pushed a commit that referenced this issue Jan 23, 2020
necrose99 pushed a commit to necrose99/redcore-desktop that referenced this issue Feb 22, 2020
necrose99 pushed a commit to necrose99/redcore-desktop that referenced this issue Feb 22, 2020
@behlendorf behlendorf removed this from To do in 0.8-release Dec 18, 2020