PANIC at zfeature.c:294:feature_get_enabled_txg() during import #6543
Comments
|
Command that PANICed was: Attached is output of: |
|
This panic can be "reproduced" in a couple of minutes; output from one of my debian test boxes: The real question here is: "how did the MOS get corrupted"? EDIT: spelling |
|
@loli10K how were you able to reproduce this? |
|
@behlendorf i just wrote garbage on all the 3 |
|
Is it possible to at least handle this kind of situation in some non-blocking fashion with an error from userspace rather than having the userspace command hang indefinitely? The latter is much more difficult to detect/debug, particularly when the commands are being driven by a non-human. |
|
@loli10K I see. @brianjmurrell you can set the module option |
|
@behlendorf As per the comments in whamcloud/integrated-manager-for-lustre#86, this is probably caused by dd'ing the underlying disk after failing to properly remove the zpool. If we are trying to recreate zpools between automated tests and 'zpool destroy -f ...' fails to remove the pool (pool still reported https://github.com/intel-hpdd/intel-manager-for-lustre/pull/282 ), what would be the recommended approach to clearing state? |
|
@tanabarr You can use wipefs on the underlying disks to remove any trace of zpool and get a clean slate from which to work. That's probably the easiest way to clear any old data. |
|
You could also use |
|
Thanks, wipefs seems to be working nicely. It also seems to refuse to remove signatures from imported pools. |
|
@utopiabound @tanabarr then as I understand it, the issue here was that the remnants of a previous pool were being used during the import which resulted in the panic. We can work on adding additional sanity checking for block pointers and known object types, but I'd like to tackle that in a different issue. Given that, if there's nothing else to do in this issue can you please close it out. |
|
Happy to close. @behlendorf, labelclear worked nicely to resolve my issue, and we have root-caused why we needed it in the first place. Much appreciated.
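For anyone scripting the cleanup discussed in this thread between automated test runs, here is a minimal sketch of the command sequence. It only builds the commands and prints them (a dry run); it does not execute anything. The pool name `testpool`, the device paths, and the helper name `cleanup_commands` are hypothetical placeholders, and the ordering (destroy, then labelclear, then wipefs) is an assumption rather than a procedure confirmed by the maintainers. `zpool destroy -f`, `zpool labelclear -f`, and `wipefs -a` are existing invocations, but check the man pages for your versions before running them against real disks.

```python
from typing import List


def cleanup_commands(pool: str, devices: List[str]) -> List[List[str]]:
    """Build (but do not run) commands intended to clear old pool state.

    Assumption: `pool` and `devices` are supplied by the test harness.
    """
    # Try to destroy the pool first; in this thread that step sometimes failed,
    # so a real runner should continue with the per-device steps regardless.
    commands = [["zpool", "destroy", "-f", pool]]
    for dev in devices:
        # Remove ZFS label information from the device.
        commands.append(["zpool", "labelclear", "-f", dev])
        # Erase any remaining signatures that wipefs can detect.
        commands.append(["wipefs", "-a", dev])
    return commands


if __name__ == "__main__":
    # Hypothetical pool and device names, printed for review only.
    for cmd in cleanup_commands("testpool", ["/dev/sdX", "/dev/sdY"]):
        print(" ".join(cmd))
```

Printing the commands first makes it easier to review exactly which devices would be wiped. As noted above, wipefs appeared to refuse to touch devices belonging to an imported pool, so a failure at that step may indicate the pool is still imported.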
System information
Describe the problem you're observing
Kernel panic on zfs import of pool created during automated testing. Bug is also being tracked:
https://jira.hpdd.intel.com/browse/LU-9901
Describe how to reproduce the problem
IML tests: https://github.com/intel-hpdd/intel-manager-for-lustre
Will try to narrow down exact reproduction case.
Include any warning/errors/backtraces from the system logs