listxattr EFAULT #503
I was getting an EFAULT when trying to read the xattrs on a specific file, which first showed up as an error when using "rsync -X" to copy into the ZFS fs. I went to have a closer look at the file:
...and tried setting an xattr, which it happily did without any apparent error, but reading it back again failed:
Changing the fs to "set xattr=dir" didn't help (it was then changed back to 'sa'). I was able to successfully set xattrs on other files in the fs. E.g.:
Augh. So I've now lost the ability to reproduce the problem on that specific file. When it happens again, is there anything in particular I should be doing to help track down the problem?
It's not immediately clear how this can happen. If you can come up with a consistent reproducer, that would help a lot. Alternatively, if you install systemtap on your system you can get a detailed kernel stack the next time it occurs by running the following. That will show exactly how the EFAULT is occurring.
# stap /usr/share/doc/systemtap-1.4/examples/general/para-callgraph.stp \
    'module("zfs").function("*@module/zfs/*.c")' \
    'module("zfs").function("zpl_xattr_list")' -c "getfattr -d aaaa"
Of course, in the grand tradition of heisenbugs, once I booted the box into a systemtap-enabled kernel, I'm no longer seeing the problem. I imagine that's due to the reboot rather than deliberate mischievousness from the FSM. I'll let you know if it pops up again.
(I hadn't looked at systemtap previously - WOW!)
OK, I have another xattr "Bad address" case. Interestingly, this one survived a reboot. Stap output:
In contrast with one that works:
Hmmm, am I reading that right? The one that works is a directory-based xattr, whereas the one that doesn't work is an sa-based xattr?
Interestingly, 'getfattr -Rd .' tells me that of the 1107 files in that directory, which would all have been created in the same rsync run, 4 come back with the 'Bad address' problem.
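For anyone wanting to count the affected files without eyeballing getfattr output, the recursive check above can be approximated in a few lines of Python. This is only a sketch: `find_xattr_failures` and its `lister` parameter are names made up for illustration, and it assumes a Linux box where `os.listxattr` is available.

```python
import errno
import os
from typing import Callable, Iterator, List, Optional, Tuple


def find_xattr_failures(
    root: str,
    lister: Optional[Callable[[str], List[str]]] = None,
) -> Iterator[Tuple[str, int]]:
    """Yield (path, errno) for every file under root whose xattr listing fails.

    `lister` is a hypothetical hook so the walk can be exercised without a
    real failing filesystem; by default it is os.listxattr (Linux only).
    """
    if lister is None:
        lister = os.listxattr
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                lister(path)
            except OSError as exc:
                # EFAULT is what the shell tools report as "Bad address".
                yield path, exc.errno


def describe(err: int) -> str:
    """Render an errno the way the command-line tools would."""
    return f"{errno.errorcode.get(err, err)}: {os.strerror(err)}"
```

Running `for path, err in find_xattr_failures("."): print(path, describe(err))` in the directory in question should print one line per problem file.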
If I create a new file and set the xattr:
Stap on that file shows a similar output to the older file that worked, i.e. starting with zpl_xattr_filldir rather than zfs_sa_get_xattr.
The file system was created with, and is now definitely set for xattr=sa:
...so, aside from the "bad address" issue with the sa xattrs, why is it using directory xattrs (if that is indeed what it's doing)?
Looking at one of the files with the xattr problem, and with "getfattr -d" vs "getfattr -n":
I'm confident the 'user.rsync.%stat' xattr would have been set on the file as it's a part of the rsync transfer. Then, explicitly setting that xattr:
The stap output for the '-d' case:
The stap output for the '-n' case:
Huh? No zfs usage at all? ...or, more likely, stap simply isn't being told to trace the relevant path. At this point my knowledge of what the relevant path might be, how to point stap towards it, and whether it's even relevant, all fail me!
The stap output is very helpful. The problem appears to be that nvlist_unpack() in zfs_sa_get_xattr() is returning EFAULT. My understanding is that this indicates an encoding/decoding error, although why that's the case isn't clear. If you add the
Whether sa or dir based xattrs are used depends on the exact situation. Even when you have xattr=sa set, individual xattrs which exceed 32k will be created as dir xattrs instead. Additionally, if the xattrs on a single file total more than 64k, new xattrs will be created as dir xattrs. Both cases should be really unlikely on a Linux box since, for example, ext4 has a hard xattr limit of about 4k. This spill functionality is just provided for additional compatibility.
Also, because directory xattrs, once created, never get automatically migrated to sa xattrs, we always check both the dir and sa xattr namespaces when listing them.
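The placement rules described above can be written down as a small model. To be clear, this is not the ZFS code: the function names, the exact boundary handling (`>` versus `>=`) and the dict-based stores are assumptions made purely to illustrate the 32k/64k spill behaviour and the merged listing.

```python
from typing import Dict, List

# Limits as quoted in the thread; how ZFS treats the exact boundary
# is an assumption here.
SA_PER_XATTR_LIMIT = 32 * 1024  # a single xattr larger than this spills to dir
SA_TOTAL_LIMIT = 64 * 1024      # past this per-file total, new xattrs spill to dir


def choose_store(mode: str, value_size: int, sa_bytes_used: int) -> str:
    """Return "sa" or "dir" for a new xattr (hypothetical helper)."""
    if mode != "sa":
        return "dir"
    if value_size > SA_PER_XATTR_LIMIT:
        return "dir"
    if sa_bytes_used + value_size > SA_TOTAL_LIMIT:
        return "dir"
    return "sa"


def list_xattrs(sa_store: Dict[str, bytes], dir_store: Dict[str, bytes]) -> List[str]:
    """Listing consults both namespaces, since dir xattrs are never migrated."""
    return sorted(set(sa_store) | set(dir_store))
```

Under this model a short value such as `user.rsync.%stat="100700 0,0 123:123"` on a file with no other xattrs would be expected to land in the sa store when xattr=sa is set.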
Below is the stap incorporating module("znvpair").function("*@module/nvpair/*.c") for a failing listxattr.
I'm confident the filesystem in question was originally created with xattr=sa, and the only xattr set on a file is like user.rsync.%stat="100700 0,0 123:123", i.e. significantly smaller than 32k. So I'm surprised that where I've looked at a few files where the listxattr works, they all appear to have dir xattrs. Is there any way of telling, other than the stap fingerprint, whether a file has sa or dir xattrs?
As an aside, are there any suggestions as to what to do to enable the stap module to unload? I'm using systemtap 1.6 and per previous dumps, every stap run ends like:
...leaving the module installed. Doing an "rmmod -f" on that module causes a kernel NULL pointer dereference. Google is surprisingly unforthcoming (or perhaps it's my google-fu). As I'm unsure of the consequences of having multiple stap modules loaded, I'm forced to reboot between every stap run - a tad annoying!
The stap output:
Make sure your stap kernel has CONFIG_MODULE_UNLOAD=y. This script might help identify the peculiarity of your setup: http://sourceware.org/git/?p=systemtap.git;a=blob_plain;f=stap-report
Yes, I definitely have CONFIG_MODULE_UNLOAD=y. Thanks for the info on removing the @filename part. I haven't tried that yet, as the problem seems to be fundamental to systemtap: even a simple script like:
...fails the same way (can't unload the module) and the people over at the systemtap mailing list are helping me with this issue: http://thread.gmane.org/gmane.linux.systemtap/18785
referenced this issue
Jan 13, 2012
From the stap output above, the first sign of failure comes from nvs_xdr_nvpair:
...so I put some printk statements in there to see which part was returning EFAULT:
Then I used stap to trace all spl/zfs calls (script below) whilst running the failing getfattr. The kernel spat out:
The stap output was:
@chrisrd Thanks for doing the leg work on this. Your debugging suggests there may be some flaw in the xdr decoding implementation. This error case in particular indicates that we ran out of bytes in the stream before decoding the expected length. Could you try commenting out this specific error case and running the test again to see if you're able to access the full xattr?
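To make the "ran out of bytes" condition concrete, here is a toy decoder for one XDR variable-length opaque item (a 4-byte big-endian length, the data, then padding to a 4-byte boundary). It is only an analogue of the kind of check being discussed, not the nvpair code; `decode_xdr_opaque` and `ShortBufferError` are names invented for this sketch.

```python
import struct
from typing import Tuple


class ShortBufferError(Exception):
    """The declared length is larger than the bytes left in the stream."""


def decode_xdr_opaque(buf: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Decode one XDR variable-length opaque; return (data, next_offset)."""
    if len(buf) - offset < 4:
        raise ShortBufferError("no room for the length word")
    (length,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    available = len(buf) - offset
    if length > available:
        # Analogous in spirit to the error return discussed above: the
        # stream ends before the promised number of bytes.
        raise ShortBufferError(
            f"declared {length} bytes, only {available} available"
        )
    data = buf[offset:offset + length]
    padded = (length + 3) & ~3  # XDR pads items to a multiple of 4 bytes
    return data, offset + padded
```

A complete buffer decodes cleanly, while the same buffer cut short (as a truncated on-disk value would be) trips the length check rather than returning partial data.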
On Wed, Jan 25, 2012 at 07:11:21AM -0800, Frank Ch. Eigler wrote:
...now to start reading the systemtap manual to see how to dig into the XDR structure.
No, with the "*size > NVS_XDR_MAX_LEN(bytesrec.xc_num_avail)" error return patched out it's getting a little further but still failing:
Note 1: added to previous stap script:
Note 2: the above stap output was for the second run. The first run produced a 53k line output (with all spl/zfs functions monitored) - I imagine the difference between accessing uncached vs cached. If interested the output for the uncached case is available here: http://www.onthe.net.au/private/stap.out.bz2
referenced this issue
Feb 11, 2012
added a commit
Mar 2, 2012
Apologies for the delay in responding, I've just had a chance to look at this again...
Is it expected that the xattrs added to files whilst running with
I tried to read the xattrs on a "known problematic" file from previous issues, using
Doing this a few times several minutes apart:
...shows a modest amount of activity within zfs:
Actually, these are the symptoms of #539, which I thought might have been due to #513. However this is on a freshly booted system without any NULL pointer dereferences so it looks like #513 wasn't down to #539 after all. More's the pity!
Oh, I've just seen that my
Chris, I believe it would have been possible (but rare) that some of the xattrs would have been only half written. That could make the subsequent getxattr fail as you described, and it would be persistent since it's now saved on disk. With the patch that will no longer be possible, since the handle is destroyed before the tx is committed.