Description
The call trace is here:
[24313.528339] PANIC: blkptr at ffffffdd81f6c240 has invalid COMPRESS 30
[24313.529681] Showing stack for process 1431
[24313.530980] CPU: 6 PID: 1431 Comm: txg_sync Tainted: P OE 4.4.6-2.el7.aarch64 #2
[24313.532315] Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0 Feb 22 2016
[24313.533646] Call trace:
[24313.534948] [] dump_backtrace+0x0/0x17c
[24313.536258] [] show_stack+0x24/0x2c
[24313.537564] [] dump_stack+0x90/0xb4
[24313.538886] [] spl_dumpstack+0x44/0x60 [spl]
[24313.540199] [] vcmn_err+0xb8/0x108 [spl]
[24313.541650] [] zfs_panic_recover+0x88/0x9c [zfs]
[24313.543094] [] zfs_blkptr_verify+0x2b4/0x348 [zfs]
[24313.544535] [] zio_read+0x54/0xec [zfs]
[24313.545977] [] dsl_scan_scrub_cb+0x40c/0x4bc [zfs]
[24313.547439] [] dsl_scan_visitbp+0x318/0x878 [zfs]
[24313.548908] [] dsl_scan_visitbp+0x4a8/0x878 [zfs]
[24313.550367] [] dsl_scan_visitbp+0x284/0x878 [zfs]
[24313.551819] [] dsl_scan_visitbp+0x284/0x878 [zfs]
[24313.553257] [] dsl_scan_visitbp+0x284/0x878 [zfs]
[24313.554694] [] dsl_scan_visitbp+0x284/0x878 [zfs]
[24313.556131] [] dsl_scan_visitbp+0x284/0x878 [zfs]
[24313.557566] [] dsl_scan_visitbp+0x284/0x878 [zfs]
[24313.558999] [] dsl_scan_visitbp+0x618/0x878 [zfs]
[24313.560437] [] dsl_scan_visitds+0xd4/0x458 [zfs]
[24313.561855] [] dsl_scan_sync+0x338/0xb2c [zfs]
[24313.563259] [] spa_sync+0x324/0x9e0 [zfs]
[24313.564653] [] txg_sync_thread+0x32c/0x5b8 [zfs]
[24313.565887] [] thread_generic_wrapper+0x74/0x88 [spl]
[24313.567101] [] kthread+0xe8/0xfc
[24313.568302] [] ret_from_fork+0x10/0x40
Loading the zfs module with zfs_recover=1 downgrades the panic to a warning, and everything seems to continue normally. With the pool in question, this happens on both:
aarch64, Kernel 4.4.6, ZoL 0.6.5.5
x86-64, Kernel 3.18.29, ZoL 0.6.5.6
In both cases with zfs_recover=1, scrub completes successfully without any errors detected on any of the disks.
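To confirm which mode a given machine is actually running in, the current value of the parameter can be read back from sysfs. The following is a minimal sketch, assuming the module exposes the parameter at the usual ZFS on Linux path `/sys/module/zfs/parameters/zfs_recover`; the function name `zfs_recover_enabled` is made up for illustration.

```python
from pathlib import Path

# Usual sysfs location for ZFS on Linux module parameters (assumed here).
ZFS_RECOVER_PARAM = Path("/sys/module/zfs/parameters/zfs_recover")


def zfs_recover_enabled(param_path=ZFS_RECOVER_PARAM):
    """Return True or False for the current zfs_recover value, or None if unknown."""
    try:
        raw = Path(param_path).read_text().strip()
    except OSError:
        # Module not loaded, or the parameter is not exposed on this system.
        return None
    try:
        # The parameter is an integer flag: 0 is off, anything else is on.
        return int(raw) != 0
    except ValueError:
        return None


if __name__ == "__main__":
    print("zfs_recover enabled:", zfs_recover_enabled())
```

A result of `None` means the value could not be read, which is different from the parameter being off.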
There seem to be 130 block pointers in total with the invalid COMPRESS=30. I have not been able to detect any actual data or metadata damage with zfs_recover=1, but I would prefer not to run this pool with zfs_recover=1 permanently enabled.
- Is there a way to identify which files / file systems are affected by the block pointers listed in the logs? I am hoping I could simply remove them and restore those files from a backup to cure the issue.
- If this is a sanity check failure, should the scrub not fix it by resetting the COMPRESS value to a sane value? At the moment scrub happily completes without detecting any errors.
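For reference, a count like the one above can be reproduced from the kernel log with a small script. This is a sketch only, assuming the message keeps the format shown in the trace (`blkptr at <address> has invalid <FIELD> <value>`) and that only the severity prefix changes when zfs_recover=1 is set; the function name `collect_invalid_blkptrs` is made up for illustration.

```python
import re
import sys

# Matches lines such as:
#   [24313.528339] PANIC: blkptr at ffffffdd81f6c240 has invalid COMPRESS 30
# The severity word is not matched, on the assumption that the rest of the
# message is unchanged when it is reported as a warning instead of a panic.
BLKPTR_RE = re.compile(
    r"blkptr at (?P<addr>[0-9a-fA-F]+) has invalid (?P<field>[A-Z_]+) (?P<value>\d+)"
)


def collect_invalid_blkptrs(lines):
    """Map each reported blkptr address to the set of (field, value) pairs seen."""
    found = {}
    for line in lines:
        m = BLKPTR_RE.search(line)
        if m:
            found.setdefault(m["addr"], set()).add((m["field"], int(m["value"])))
    return found


if __name__ == "__main__":
    report = collect_invalid_blkptrs(sys.stdin)
    for addr, problems in sorted(report.items()):
        for field, value in sorted(problems):
            print(f"{addr} {field}={value}")
    print(f"{len(report)} distinct block pointer addresses", file=sys.stderr)
```

It could be fed with something like `dmesg | python3 blkptr_report.py`, where the script name is arbitrary. Note that the address in the message appears to be a kernel memory address rather than an on-disk location, so it may differ between runs and does not by itself identify a file or dataset. Counting distinct addresses is therefore only a rough estimate.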