ZFS invalid mode: 0x38 #2316
Comments
This happened after I set xattr=sa on my pool and all datasets.
I was able to import the pool from another system without a problem. Upgrading to the latest HEAD now.
I was able to reproduce it every time. It happens when the FS has SELinux labels, even when I boot with selinux=0.
I'm not really sure.
Yes, current HEAD.
I was able to boot with init=/bin/bash and run zfs set xattr=off on my datasets.
This is very likely another instance of #2228 as @tuxoko pointed out or possibly #2214. @maci0 Have you got a reproducing scenario starting with a fresh pool creation? It might be useful to follow the debugging steps of #2228 (apply the small patch to zfs_znode.c) and find the inode of a corrupted file/directory and then try dumping it with the hacked zdb of dweeezil/zfs@9888f3c.
I can always reproduce this on a fresh pool. The steps are rather lengthy though:
Just to make sure I'm clear on these steps, you're doing a fresh centos6 install for the purpose of creating the pool and then copying a complete, existing rhel7 system on to that pool which you then set up for direct ZFS boot? Then the corruption of the pool happens during the selinux labeling by the rhel7 system? If this is the case, I wonder if I can duplicate it simply by running
It's a server in a managed datacenter, so you have several installation options. centos6 is one of them, so I install centos6 without RAID (the server has 2 HDDs). Here are the steps from my notes:
I may have something I can work with. I'm able to reliably trigger an assertion when copying a bunch of files with typical
@maci0 Could you please try the patch in #2321? My test system now survives a complete xattr-copying rsync of my root filesystem onto ZFS with all the selinux xattrs. The typical trouble spot turned out to be /etc/ssl/certs (or equivalent depending on your distro; likely /etc/pki/ under redhat) due to all the big symlinks.
I was able to reproduce the issue in a VM. You are also right about /etc/pki. It doesn't properly relabel those files. After applying the patch on a faulting system and updating the initrd to include the latest module versions I still get a crash though. Attached are screenshots with and without selinux enabled respectively.
@maci0 By "After applying the patch on a faulting system", do you mean you applied the patch once the filesystem had already been corrupted? If so, it won't fix anything. Once the filesystem is corrupted your only recourse would be to destroy the filesystem and re-create it. The patch must already be applied when the files are being written to the filesystem. If you're running stock 0.6.2 when bootstrapping the system under centos6, it likely has a different but related bug which affects symlinks and was fixed by 472e7c6. You should be running current master code with the patch during both phases of your test: while you're using your initial centos6 for bootstrapping and also while you're running rhel7.
I see. Sorry, I was not aware the bug causes actual corruption.
It's 200 commits ahead of HIS master branch, which looks like it's currently 33 commits ahead of the 0.6.2 release. Don't worry about that. You're not using it. His issue-2316 branch is fine to use straight up.
After applying your patches it still breaks, but without a stack trace.
@maci0 Unfortunately, I don't have a problem I can replicate any more. The next step would be to get a stack trace from your system. Have you tried using sysrq to get stack traces?
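For anyone following along, getting stack traces via sysrq could look roughly like the sketch below. It uses the standard /proc/sys/kernel/sysrq and /proc/sysrq-trigger files; the function names are made up for this example, and the optional path argument is only there so the helpers can be dry-run against an ordinary file.

```shell
#!/bin/sh
# Sketch only; the real /proc files need root. Function names are hypothetical.

# Allow all SysRq functions (writing 1 enables everything).
sysrq_enable() {
    echo 1 > "${1:-/proc/sys/kernel/sysrq}"
}

# Send one SysRq command key to the kernel:
#   t = dump the state and stack of every task
#   w = dump tasks in uninterruptible (blocked) state
#   l = backtrace of all active CPUs
sysrq_send() {
    printf '%s\n' "$1" > "${2:-/proc/sysrq-trigger}"
}
```

As root, something like `sysrq_enable; sysrq_send w; dmesg | tail -n 200` should then show the blocked tasks in the kernel log. If the console is hung but the keyboard still works, Alt+SysRq+t does the same as `sysrq_send t`.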
In the case where a variable-sized SA overlaps the spill block pointer and a new variable-sized SA is being added, the header size was improperly calculated to include the to-be-moved SA. This problem could be reproduced when xattr=sa enabled as follows:

    ln -s $(perl -e 'print "x" x 120') blah
    setfattr -n security.selinux -v blahblah -h blah

The symlink is large enough to interfere with the spill block pointer and has a typical SA registration as follows (shown in modified "zdb -dddd" <SA attr layout obj> format):

    [ ... ZPL_DACL_COUNT ZPL_DACL_ACES ZPL_SYMLINK ]

Adding the SA xattr will attempt to extend the registration to:

    [ ... ZPL_DACL_COUNT ZPL_DACL_ACES ZPL_SYMLINK ZPL_DXATTR ]

but since the ZPL_SYMLINK SA interferes with the spill block pointer, it must also be moved to the spill block which will have a registration of:

    [ ZPL_SYMLINK ZPL_DXATTR ]

This commit updates extra_hdrsize when this condition occurs, allowing hdrsize to be subsequently decreased appropriately.

Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Issue #2214
Issue #2228
Issue #2316
Issue #2343
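The reproducer from the commit message can be wrapped into small helpers for repeated testing. This is a minimal sketch, assuming a scratch directory on a dataset with xattr=sa; the function names and the directory argument are hypothetical, and `head`/`tr` stand in for the perl one-liner so the script has no perl dependency. Only the `ln -s` and `setfattr` invocations come from the commit message.

```shell
#!/bin/sh
# Sketch only. Helper names are made up; run it on a throwaway test dataset.

# Print a string of $1 'x' characters (the commit message uses 120).
long_target() {
    head -c "$1" /dev/zero | tr '\0' 'x'
}

# Create a symlink named "blah" in directory $1 with a 120-byte target,
# large enough to reach the spill block pointer per the commit message.
make_long_symlink() {
    ln -s "$(long_target 120)" "$1/blah"
}

# Add an xattr to the symlink itself (-h: do not follow the link).
# Needs setfattr from the attr package and usually root for security.*.
label_symlink() {
    setfattr -n security.selinux -v blahblah -h "$1/blah"
}
```

Calling `make_long_symlink /path/to/testdir` followed by `label_symlink /path/to/testdir` repeats the two steps; on code without the fix, the second step is where the bad header size would be written.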
A potential fix for this has been merged. Please let us know if you're able to recreate this issue using a pool created from the latest master source which includes commit 83021b4.
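Before retesting, it may be worth confirming that the source tree you build from really contains that commit. A rough sketch using `git merge-base --is-ancestor`; the function name and the checkout path are assumptions for illustration:

```shell
#!/bin/sh
# Sketch only. tree_has_commit is a hypothetical helper name.

# Succeed (exit status 0) if commit $2 is an ancestor of HEAD in the git
# checkout at $1, i.e. the checked-out source already includes that commit.
tree_has_commit() {
    git -C "$1" merge-base --is-ancestor "$2" HEAD
}
```

For example, `tree_has_commit /usr/src/zfs 83021b4 && echo "fix included"`, where /usr/src/zfs is just a placeholder for wherever your checkout lives.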
I was not able to reproduce it anymore with 0.6.3, so it seems fixed.
@maci0 Thanks for the update. Then let's close this out as fixed for now. If it ever resurfaces we can (and should) open a new issue.
woops