PANIC: rpool: blkptr at ... DVA 0 has invalid OFFSET 18388167655883276288 #12019
Comments
Dmesg when scrub hits this block:
I see a similar message on FreeBSD 13. A scrub results in some errors, but the pool is reported as ONLINE.
/var/log/messages:
@deem0n Your situation is clear: errors on the underlying devices. My devices are fine, and my errors are spurious. I think my hardware is not stable.
Sometimes, during
I also found that the ZFS driver code doesn't handle PANIC as it looks like. It contains what appears to be leftover development code that halts the thread, which later causes I/O to lock up:
    printk(KERN_EMERG "PANIC: %s\n", msg);
    spl_dumpstack();
    /* Halt the thread to facilitate further debugging */
    set_current_state(TASK_UNINTERRUPTIBLE);
    while (1)
        schedule();
I believe zdb does not use ARC or L2ARC, no. And what do you mean, "as it looks like"? Are you expecting it to actually kill the system when triggering? Because if that's what you want, I believe the tunable
Thanks for pointing me to
    if (spl_panic_halt)
        panic("%s", msg);
Is this intentional, or should it perhaps be corrected in SPL?
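The two behaviours discussed in this exchange can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the actual SPL code: `handle_ce_panic` and the `panic_outcome` enum are hypothetical names, the kernel calls (`panic()`, `spl_dumpstack()`, `schedule()`) are represented by comments, and `spl_panic_halt` is an ordinary variable standing in for the module parameter of the same name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical outcomes, used only to make the control flow observable. */
enum panic_outcome {
	OUTCOME_KERNEL_PANIC,	/* whole system panics; a panic= timeout can reboot it */
	OUTCOME_THREAD_HALTED	/* only the calling thread stops; locks it holds stay held */
};

/* Stand-in for the spl_panic_halt tunable mentioned in the thread. */
static int spl_panic_halt = 0;

/* Model of a CE_PANIC handler that honours spl_panic_halt. */
static enum panic_outcome
handle_ce_panic(const char *msg)
{
	assert(msg != NULL);
	fprintf(stderr, "PANIC: %s\n", msg);

	if (spl_panic_halt) {
		/* In the kernel this would be: panic("%s", msg); */
		return (OUTCOME_KERNEL_PANIC);
	}

	/*
	 * In the kernel this would be: spl_dumpstack(), then
	 * set_current_state(TASK_UNINTERRUPTIBLE) and an endless
	 * schedule() loop, so the thread never returns.
	 */
	return (OUTCOME_THREAD_HALTED);
}
```

With the tunable left at 0 the model reports a halted thread, which matches the hang described in this issue. With it set to 1 the model reports a full kernel panic, which is what allows a reboot-on-panic setting to take effect.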
Seems like a not unreasonable patch to me, but that might be deliberate for some reason; feel free to try it and see if the world burns down.
MR created: #12120. I know that the root cause is somewhere else (I have ZFS running stably on other systems), but at least this change helps me maintain the system remotely.
Hi, interestingly enough I see no
I assume that my ZFS pool has some internal inconsistency that cannot be fixed by a scrub. Should I rebuild the pool to fix it?
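To put the number from the issue title in perspective, the sketch below checks the reported offset against a device size. It assumes the check behind the "invalid OFFSET" message is essentially a bounds comparison of offset plus allocated size against the vdev's allocatable size. That is an assumption here, and `dva_range_fits` and `bytes_to_eib` are hypothetical helper names, not OpenZFS functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset reported in the issue title, in bytes. */
#define	REPORTED_OFFSET	18388167655883276288ULL

/*
 * Hypothetical helper: true if the range [offset, offset + asize)
 * lies inside a device of vdev_asize bytes. Written so that the
 * unsigned arithmetic cannot wrap around.
 */
static bool
dva_range_fits(uint64_t offset, uint64_t asize, uint64_t vdev_asize)
{
	if (asize > vdev_asize)
		return (false);
	return (offset <= vdev_asize - asize);
}

/* Whole exbibytes contained in a byte count (1 EiB = 2^60 bytes). */
static uint64_t
bytes_to_eib(uint64_t bytes)
{
	return (bytes >> 60);
}
```

The reported offset is just under 16 EiB, close to the top of the 64-bit range, so it cannot fit on any real device. It is still a multiple of 512, which is at least consistent with a block pointer field that was corrupted rather than a misaligned read.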
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
System information
Describe the problem you're observing
The system partially stops responding during normal workload (periodic snapshots, send/receive).
The kernel reports a PANIC, and all local filesystem operations (on ZFS) hang. Services already in RAM keep working, but I cannot log in remotely over SSH (because it needs to access the filesystem).
The node is running now (all services have been migrated away from it), so I can do any tests or experiments. I'm ready for your suggestions on how to clean up this error.
Questions:
panic=30 is set?
Describe how to reproduce the problem
Include any warning/errors/backtraces from the system logs
A similar error occurred over a month ago (probably with OpenZFS v0.8.4):