Fedora 31 - Kernel 5.5.5 ZFS crash on startup #10043
Comments
@greg-hydrogen kernels 5.5 are not supported yet, I think:
We'll need to determine if this is related to the updated kernel.
FYI: I'm running zfs 0.8.3 on kernel 5.5.6 (vanilla) on top of a linux mint 19.3 without any crashes.
running 5.5.5-200.fc31.x86_64, works for me
interesting... I get the panic every time I boot so is there any additional information I can provide?
kernel-5.5.5-200.fc31.x86_64 with zfs-0.8.3 works without problems here.
zfs-0.8.3 - I just updated to kernel-5.5.7, EFI, boot+root on ZFS. It all seems to be working.
Just FYI, zfs 0.8.3 compiled well with kernel 5.6-rc5, arch amd64. Did not test it more, because the nvidia drivers did not compile.
Hello everyone, I decided to upgrade to Fedora 32 and test a new kernel (5.6). I am getting the same crash as before, but this time I narrowed it down to auditd. If auditd doesn't start, I am able to boot and there is no crash; the moment I start auditd I get the following: Apr 11 20:17:54 workstn kernel: kernel BUG at fs/inode.c:1588! This issue is very reproducible, so if there is anything I can provide to help, please let me know. Thank you!
In our environment this only happens with
Fixes openzfs#10043 The value of zp is used without having been initialized under some conditions. Initialize the pointer to NULL. Add a regression test case using chown in acl/posix. However, this is not enough because the setup sets xattr=sa, which means zfs_setattr_dir will not be called. Create a second group of acl tests in acl/posix-sa duplicating the acl/posix tests with symlinks, and remove xattr=sa from the original acl/posix tests. This provides more coverage for the default xattr=on code. Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
This is trivially reproducible in openZFS 2.0.0-RC3 and latest master. This pr fixes it for me: #11025 Reproduction is easy:
The value of zp is used without having been initialized under some conditions. Initialize the pointer to NULL. Add a regression test case using chown in acl/posix. However, this is not enough because the setup sets xattr=sa, which means zfs_setattr_dir will not be called. Create a second group of acl tests in acl/posix-sa duplicating the acl/posix tests with symlinks, and remove xattr=sa from the original acl/posix tests. This provides more coverage for the default xattr=on code. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10043 Closes #11025
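The failure mode described in the commit message (a pointer read on a shared cleanup path before anything was assigned to it) can be illustrated with a minimal standalone C sketch. This is not the ZFS code: `struct node`, `node_release`, and `process_entries` are hypothetical names invented for illustration, and the loop only mimics the general shape of such a bug.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted object; a stand-in, not the ZFS znode_t. */
struct node {
    int refs;
};

/* Drop one reference. Calling this on a garbage pointer is what crashes. */
static void node_release(struct node *n)
{
    assert(n->refs > 0);
    n->refs--;
}

/*
 * Walk `count` entries. Entries flagged in `skip` never acquire a node,
 * but the cleanup code at the bottom of the loop still inspects `n`.
 * Returns how many nodes were acquired and released.
 */
int process_entries(struct node *table, const int *skip, size_t count)
{
    int released = 0;

    for (size_t i = 0; i < count; i++) {
        /*
         * The fix: initialise to NULL. Without "= NULL", the check in
         * the cleanup path reads an indeterminate value whenever the
         * entry is skipped, and may release a bogus pointer.
         */
        struct node *n = NULL;

        if (!skip[i]) {
            n = &table[i];
            n->refs++;
        }

        /* Shared cleanup path, reached on both branches. */
        if (n != NULL) {
            node_release(n);
            released++;
        }
    }
    return released;
}
```

With three entries and `skip = {0, 1, 0}`, the function releases two nodes and leaves the skipped one untouched.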
Distribution Name | Fedora
Distribution Version | 31
Linux Kernel | 5.5.5-200.fc31.x86_64
Architecture | x86_64
ZFS Version | 0.8.0-629_g24fcd9fc5 (recent master 21-feb-2020)
SPL Version | 0.8.0-629_g24fcd9fc5 (recent master 21-feb-2020)
Describe the problem you're observing
ZFS crashes on startup. When attempting to boot into the latest kernel, I get the following stack trace:
Feb 23 10:15:11 lnxworkstn kernel: BUG: unable to handle page fault for address: 0000000000881bb0
Feb 23 10:15:11 lnxworkstn kernel: #PF: supervisor write access in kernel mode
Feb 23 10:15:11 lnxworkstn kernel: #PF: error_code(0x0002) - not-present page
Feb 23 10:15:11 lnxworkstn kernel: PGD 0 P4D 0
Feb 23 10:15:11 lnxworkstn kernel: Oops: 0002 [#1] SMP NOPTI
Feb 23 10:15:11 lnxworkstn kernel: CPU: 4 PID: 3267 Comm: auditd Tainted: P OE 5.5.5-200.fc31.x86_64 #1
Feb 23 10:15:11 lnxworkstn kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi Ultimate, BIOS P2.00 12/17/2018
Feb 23 10:15:11 lnxworkstn kernel: RIP: 0010:mutex_lock+0x19/0x30
Feb 23 10:15:11 lnxworkstn kernel: Code: 00 0f 1f 44 00 00 be 02 00 00 00 e9 21 fb ff ff 90 0f 1f 44 00 00 55 48 89 fd e8 c2 dc ff ff 31 c0 65 48 8b 14 25 c0 8b 01 00 48 0f b1 55 00 74 06 48 89 ef 5d e>
Feb 23 10:15:11 lnxworkstn kernel: RSP: 0018:ffffa89c79183910 EFLAGS: 00010246
Feb 23 10:15:11 lnxworkstn kernel: RAX: 0000000000000000 RBX: 0000000000881ba8 RCX: 0000000000000002
Feb 23 10:15:11 lnxworkstn kernel: RDX: ffff9c4ed22e4d00 RSI: ffff9c4f0ad25280 RDI: 0000000000881bb0
Feb 23 10:15:11 lnxworkstn kernel: RBP: 0000000000881bb0 R08: 0000000000000000 R09: 0000000000000000
Feb 23 10:15:11 lnxworkstn kernel: R10: ffffa89c79183988 R11: ffff9c4eedba1a00 R12: ffff9c4ed268f780
Feb 23 10:15:11 lnxworkstn kernel: R13: 0000000000881bb0 R14: 0000000000000002 R15: ffff9c4f0ad25000
Feb 23 10:15:11 lnxworkstn kernel: FS: 00007f23ea736880(0000) GS:ffff9c4f1ed00000(0000) knlGS:0000000000000000
Feb 23 10:15:11 lnxworkstn kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 23 10:15:11 lnxworkstn kernel: CR2: 0000000000881bb0 CR3: 0000000f930ec000 CR4: 00000000003406e0
Feb 23 10:15:11 lnxworkstn kernel: Call Trace:
Feb 23 10:15:11 lnxworkstn kernel: zfs_dirent_unlock+0x20/0x140 [zfs]
Feb 23 10:15:11 lnxworkstn kernel: zfs_setattr_dir+0x405/0x470 [zfs]
Feb 23 10:15:11 lnxworkstn kernel: ? _cond_resched+0x15/0x30
Feb 23 10:15:11 lnxworkstn kernel: ? mutex_lock+0xe/0x30
Feb 23 10:15:11 lnxworkstn kernel: ? txg_list_add+0x73/0x90 [zfs]
Feb 23 10:15:11 lnxworkstn kernel: ? dsl_dataset_dirty+0x4f/0xc0 [zfs]
Feb 23 10:15:11 lnxworkstn kernel: ? dbuf_dirty+0x525/0xa00 [zfs]
Feb 23 10:15:11 lnxworkstn kernel: ? _cond_resched+0x15/0x30
Feb 23 10:15:11 lnxworkstn kernel: ? mutex_lock+0xe/0x30
Feb 23 10:15:11 lnxworkstn kernel: ? sa_attr_op+0x29c/0x3c0 [zfs]
Feb 23 10:15:11 lnxworkstn kernel: ? _cond_resched+0x15/0x30
Feb 23 10:15:11 lnxworkstn kernel: ? dnode_rele_and_unlock+0x53/0xb0 [zfs]
Feb 23 10:15:11 lnxworkstn kernel: zfs_setattr+0x1cc8/0x2240 [zfs]
Feb 23 10:15:11 lnxworkstn kernel: ? lookup_fast+0x107/0x280
Feb 23 10:15:11 lnxworkstn kernel: ? _cond_resched+0x15/0x30
Feb 23 10:15:11 lnxworkstn kernel: ? __kmalloc_node+0x1ff/0x310
Feb 23 10:15:11 lnxworkstn kernel: zpl_setattr+0xff/0x170 [zfs]
Feb 23 10:15:11 lnxworkstn kernel: notify_change+0x333/0x4b0
Feb 23 10:15:11 lnxworkstn kernel: chown_common.isra.0+0xec/0x1a0
Feb 23 10:15:11 lnxworkstn kernel: do_fchownat+0x8f/0xf0
Feb 23 10:15:11 lnxworkstn kernel: __x64_sys_chown+0x1e/0x30
Feb 23 10:15:11 lnxworkstn kernel: do_syscall_64+0x5b/0x1c0
Feb 23 10:15:11 lnxworkstn kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Feb 23 10:15:11 lnxworkstn kernel: RIP: 0033:0x7f23eac4c79b
Feb 23 10:15:11 lnxworkstn kernel: Code: 00 00 00 75 a5 48 89 ef e8 e2 c7 f9 ff eb a4 e8 eb dd 01 00 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 5c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b>
Feb 23 10:15:11 lnxworkstn kernel: RSP: 002b:00007ffef926ff38 EFLAGS: 00000246 ORIG_RAX: 000000000000005c
Feb 23 10:15:11 lnxworkstn kernel: RAX: ffffffffffffffda RBX: 0000564a6490e100 RCX: 00007f23eac4c79b
Feb 23 10:15:11 lnxworkstn kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000564a656c4320
Feb 23 10:15:11 lnxworkstn kernel: RBP: 0000564a656c4320 R08: 0000564a656c4320 R09: 00007ffef926fe10
Feb 23 10:15:11 lnxworkstn kernel: R10: 0000000000000030 R11: 0000000000000246 R12: 0000564a656c4320
Feb 23 10:15:11 lnxworkstn kernel: R13: 0000000000000028 R14: 0000564a656c2900 R15: 0000000000000000
Feb 23 10:15:11 lnxworkstn kernel: Modules linked in: vfat fat edac_mce_amd snd_hda_codec_realtek btusb kvm_amd btrtl snd_hda_codec_generic btbcm iwlmvm ledtrig_audio snd_hda_codec_hdmi btintel snd_hda_inte>
Feb 23 10:15:11 lnxworkstn kernel: CR2: 0000000000881bb0
Feb 23 10:15:11 lnxworkstn kernel: ---[ end trace d9880de7770176e7 ]---
Feb 23 10:15:11 lnxworkstn kernel: RIP: 0010:mutex_lock+0x19/0x30
Feb 23 10:15:11 lnxworkstn kernel: Code: 00 0f 1f 44 00 00 be 02 00 00 00 e9 21 fb ff ff 90 0f 1f 44 00 00 55 48 89 fd e8 c2 dc ff ff 31 c0 65 48 8b 14 25 c0 8b 01 00 48 0f b1 55 00 74 06 48 89 ef 5d e>
Feb 23 10:15:11 lnxworkstn kernel: RSP: 0018:ffffa89c79183910 EFLAGS: 00010246
Feb 23 10:15:11 lnxworkstn kernel: RAX: 0000000000000000 RBX: 0000000000881ba8 RCX: 0000000000000002
Feb 23 10:15:11 lnxworkstn kernel: RDX: ffff9c4ed22e4d00 RSI: ffff9c4f0ad25280 RDI: 0000000000881bb0
Feb 23 10:15:11 lnxworkstn kernel: RBP: 0000000000881bb0 R08: 0000000000000000 R09: 0000000000000000
Feb 23 10:15:11 lnxworkstn kernel: R10: ffffa89c79183988 R11: ffff9c4eedba1a00 R12: ffff9c4ed268f780
Feb 23 10:15:11 lnxworkstn kernel: R13: 0000000000881bb0 R14: 0000000000000002 R15: ffff9c4f0ad25000
Feb 23 10:15:11 lnxworkstn kernel: FS: 00007f23ea736880(0000) GS:ffff9c4f1ed00000(0000) knlGS:0000000000000000
Feb 23 10:15:11 lnxworkstn kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 23 10:15:11 lnxworkstn kernel: CR2: 0000000000881bb0 CR3: 0000000f930ec000 CR4: 00000000003406e0
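The two `#PF` lines near the top of the trace are the kernel's decoding of the x86 page-fault error code `0x0002`. As a reading aid, the sketch below decodes the three low bits the same way: bit 0 distinguishes a not-present page from a protection violation, bit 1 marks a write, and bit 2 marks user mode. The names `pf_info`, `decode_pf_error`, and `check_trace_value` are made up for this example and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Decoded view of the low bits of an x86 page-fault error code. */
struct pf_info {
    bool protection; /* bit 0: 1 = protection violation, 0 = not-present page */
    bool write;      /* bit 1: 1 = write access, 0 = read access */
    bool user;       /* bit 2: 1 = user mode, 0 = supervisor (kernel) mode */
};

struct pf_info decode_pf_error(unsigned long code)
{
    struct pf_info info;

    info.protection = (code & 0x1UL) != 0;
    info.write = (code & 0x2UL) != 0;
    info.user = (code & 0x4UL) != 0;
    return info;
}

/* Usage check against the value reported in the trace above (0x0002). */
void check_trace_value(void)
{
    struct pf_info info = decode_pf_error(0x0002UL);

    /* Matches "supervisor write access" to a "not-present page". */
    assert(!info.protection && info.write && !info.user);
}
```

The faulting address in `CR2` (`0000000000881bb0`) also appears in `RDI`, the first-argument register on x86_64, which fits `mutex_lock` being handed a bogus pointer by its caller.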
Describe how to reproduce the problem
This occurs on every reboot with kernel 5.5.5 (Fedora 31).
Include any warning/errors/backtraces from the system logs