assertion failure: bad sa_magic in zpl_get_file_info() #11433
I would assume that the problem is the assertion failure, and the "blocked" threads are a result of trying to continue on after the assertion failure.
On Tue, Jan 05, 2021 at 10:40:02AM -0800, Matthew Ahrens wrote:
> I would assume that the problem is the assertion failure, and the "blocked" threads are a result of trying to continue on after the assertion failure.
Maybe. I didn't look much through old bug reports (there are many).
Should I try a scrub? Anything else I should look for?
I could maybe try to reproduce the problem, but please send a list of what to collect. I can get a stack trace for the postgres backend, the kernel's wchan, and a list of /proc/pid FDs and inodes. Anything else?
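For reference, a rough collection checklist along those lines (a sketch only: the pid and pool name below are examples taken from later in this thread, and the exact set of items worth gathering is a guess rather than an upstream-prescribed list):

zpool status -v zfs2               # pool health and last scrub result
dmesg | tail -n 200                # recent PANIC / hung-task messages
sudo cat /proc/22365/wchan; echo   # kernel wait channel of the stuck process
sudo cat /proc/22365/stack         # kernel-side stack, readable even when gdb cannot attach
sudo ls -l /proc/22365/fd          # open file descriptors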
Since this happened twice within ~2 hours (while migrating data to zstd after upgrading to v2.0), I think that rules out a RAM issue. However, I checked and the machine appears to use two RAM models, both of which are ECC:
EBJ81RF4ECFA-DJ-F
36JSF1G72PZ-1G6M1
--
Justin
Is this likely to be relevant?
Last night I ran:
scan: scrub repaired 0B in 00:58:02 with 0 errors on Wed Jan 6 03:45:22 2021
Today this happened:
[94445.128283] VERIFY3(sa.sa_magic == SA_MAGIC) failed (1609959521 == 3100762)
[94445.133896] PANIC at zfs_quota.c:89:zpl_get_file_info()
[94445.135180] Showing stack for process 672
[94445.136079] CPU: 3 PID: 672 Comm: dp_sync_taskq Kdump: loaded Tainted: P OE ------------ 3.10.0-1127.10.1.el7.x86_64 #1
[94445.138971] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[94445.140733] Call Trace:
[94445.141328] [<ffffffff8377ffa5>] dump_stack+0x19/0x1b
[94445.142542] [<ffffffffc04e6c5b>] spl_dumpstack+0x2b/0x30 [spl]
[94445.144693] [<ffffffffc04e6d29>] spl_panic+0xc9/0x110 [spl]
[94445.146537] [<ffffffffc09e571a>] ? dsl_dir_diduse_space+0x17a/0x190 [zfs]
[94445.148055] [<ffffffff83225e9d>] ? __slab_free+0x9d/0x290
[94445.149296] [<ffffffffc09995b6>] ? dbuf_rele_and_unlock+0x286/0x600 [zfs]
[94445.150819] [<ffffffff83784022>] ? mutex_lock+0x12/0x2f
[94445.152073] [<ffffffffc0a7e8ef>] zpl_get_file_info+0x9f/0x250 [zfs]
[94445.153510] [<ffffffffc09ace4d>] dmu_objset_userquota_get_ids+0x11d/0x4e0 [zfs]
[94445.155249] [<ffffffffc09c544b>] dnode_sync+0x11b/0xac0 [zfs]
[94445.157228] [<ffffffff830c77ab>] ? autoremove_wake_function+0x2b/0x40
[94445.158699] [<ffffffffc09aa6e1>] sync_dnodes_task+0x61/0x150 [zfs]
[94445.162223] [<ffffffffc04ec516>] taskq_thread+0x2c6/0x520 [spl]
[94445.163907] [<ffffffff830db990>] ? wake_up_state+0x20/0x20
[94445.165701] [<ffffffffc04ec250>] ? taskq_thread_spawn+0x60/0x60 [spl]
[94445.167272] [<ffffffff830c6691>] kthread+0xd1/0xe0
[94445.168351] [<ffffffff830c65c0>] ? insert_kthread_work+0x40/0x40
[94445.170341] [<ffffffff83792d37>] ret_from_fork_nospec_begin+0x21/0x21
[94445.171876] [<ffffffff830c65c0>] ? insert_kthread_work+0x40/0x40
[94682.228593] INFO: task dp_sync_taskq:672 blocked for more than 120 seconds.
This process would have been rewriting tables (about one per second, the same as during the previous two freezes), but it has now been stuck for 10 minutes:
postgres 22365 3825 27 02:25 ? Ds 213:10 postgres: pryzbyj ts [local] ALTER TABLE
22365 do_last D ? 03:33:10 postgres: pryzbyj ts [local] ALTER TABLE
Its open files:
[pryzbyj@database7 ~]$ sudo ls -l /proc/22365/fd
total 0
lr-x------. 1 postgres postgres 64 Jan 6 02:26 0 -> /dev/null
l-wx------. 1 postgres postgres 64 Jan 6 02:26 1 -> pipe:[18047]
lrwx------. 1 postgres postgres 64 Jan 6 02:26 10 -> socket:[21852]
lrwx------. 1 postgres postgres 64 Jan 6 02:26 11 -> socket:[1604876]
lr-x------. 1 postgres postgres 64 Jan 6 02:26 12 -> pipe:[1603123]
l-wx------. 1 postgres postgres 64 Jan 6 02:26 13 -> pipe:[1603123]
lrwx------. 1 postgres postgres 64 Jan 6 10:10 14 -> /var/lib/pgsql/13/data/base/16630/946923787
lrwx------. 1 postgres postgres 64 Jan 6 10:10 15 -> /var/lib/pgsql/13/data/base/16630/946923745
lrwx------. 1 postgres postgres 64 Jan 6 10:10 16 -> /var/lib/pgsql/13/data/base/16630/421266385
lrwx------. 1 postgres postgres 64 Jan 6 10:10 17 -> /var/lib/pgsql/13/data/base/16630/946925850
lrwx------. 1 postgres postgres 64 Jan 6 10:10 18 -> /var/lib/pgsql/13/data/base/16630/946925842
lrwx------. 1 postgres postgres 64 Jan 6 10:10 19 -> /var/lib/pgsql/13/data/global/946927403
l-wx------. 1 postgres postgres 64 Jan 6 02:26 2 -> pipe:[18047]
lrwx------. 1 postgres postgres 64 Jan 6 10:10 20 -> /var/lib/pgsql/13/data/base/16630/946925890
lrwx------. 1 postgres postgres 64 Jan 6 10:10 21 -> /var/lib/pgsql/13/data/base/16630/946925894
lrwx------. 1 postgres postgres 64 Jan 6 10:10 22 -> /var/lib/pgsql/13/data/base/16630/946925893
lrwx------. 1 postgres postgres 64 Jan 6 10:10 23 -> /var/lib/pgsql/13/data/base/16630/946925831
lrwx------. 1 postgres postgres 64 Jan 6 10:10 24 -> /var/lib/pgsql/13/data/base/16630/946925831_fsm
lrwx------. 1 postgres postgres 64 Jan 6 10:10 25 -> /zfs2/metric_table/PG_13_202007201/16630/981098328
lrwx------. 1 postgres postgres 64 Jan 6 10:10 26 -> /zfs2/oldtables/PG_13_202007201/16630/443019002
lr-x------. 1 postgres postgres 64 Jan 6 02:26 3 -> pipe:[18046]
lrwx------. 1 postgres postgres 64 Jan 6 02:26 4 -> anon_inode:[eventpoll]
lrwx------. 1 postgres postgres 64 Jan 6 10:10 5 -> /var/lib/pgsql/13/data/base/16630/946925827
lrwx------. 1 postgres postgres 64 Jan 6 10:10 6 -> /var/lib/pgsql/13/data/base/16630/946925777
lrwx------. 1 postgres postgres 64 Jan 6 10:10 7 -> /var/lib/pgsql/13/data/pg_wal/000000010000510A00000081
lrwx------. 1 postgres postgres 64 Jan 6 10:10 8 -> /var/lib/pgsql/13/data/base/16630/946925827_vm
lrwx------. 1 postgres postgres 64 Jan 6 10:10 9 -> /var/lib/pgsql/13/data/base/16630/946925827_fsm
Unfortunately, I cannot get its stacktrace (gdb freezes).
[pryzbyj@database7 ~]$ sudo ls -ldi /zfs2/metric_table/PG_13_202007201/16630/981098328 /zfs2/oldtables/PG_13_202007201/16630/443019002
65904 -rw-------. 1 postgres postgres 34111488 Jan 6 14:00 /zfs2/metric_table/PG_13_202007201/16630/981098328
209018 -rw-r-----. 1 postgres postgres 34111488 Dec 2 00:38 /zfs2/oldtables/PG_13_202007201/16630/443019002
zdb fails like this:
[pryzbyj@database7 ~]$ sudo zdb zfs2 65904 >zdb1.out
zdb: dmu_object_info() failed, errno 2
[pryzbyj@database7 ~]$ sudo zdb zfs2 209018 >zdb2.out
zdb: dmu_object_info() failed, errno 2
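For what it's worth, errno 2 is ENOENT, and with zdb that can simply mean the object was looked up in an objset other than the one holding the file; the inode numbers from ls -i are per-filesystem object numbers, and "zdb zfs2 <object>" does not point zdb at the child filesystems these files live in. Assuming the datasets are named after their mountpoints (my assumption), naming them explicitly might get further:

sudo zdb -dddd zfs2/metric_table 65904
sudo zdb -dddd zfs2/oldtables 209018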
I've saved the full output but now have to reboot.
--
Justin
Something very similar to this has been triggered by
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
I've twice today seen an issue where ZFS freezes (the process cannot be killed and the VM had to be rebooted). This happened while moving files from one postgres tablespace on a ZFS filesystem to another filesystem on the same zpool (ALTER TABLE .. SET TABLESPACE ..). This involves first copying the file and then unlinking it; I think we would've been doing about one file per second. The original files were written with compression=gzip-1, and the new files are being written with compression=zstd.
This is a CentOS 7 QEMU/KVM VM with ZFS. The VM runs on an Ubuntu hypervisor, which uses LVM; the LVs are exposed as block devices to QEMU with cache=none, and those block devices are the zpool's vdevs. I realize this storage configuration is discouraged.
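As a sanity check on the setup described above, the compression settings of the source and destination filesystems could be confirmed with something like the following (the dataset names are guesses inferred from the mountpoints that appear elsewhere in this thread):

zfs get compression zfs2/oldtables zfs2/metric_table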
System information
Type | Linux
Distribution Name | CentOS
Distribution Version | 7.8
Linux Kernel | kernel-3.10.0-1127.10.1.el7.x86_64
Architecture | x86_64
ZFS Version | kmod-zfs-0.8.4-1.el7.x86_64
SPL Version | 2.0.0-1