SPLError: 324:0:(zfs_vfsops.c:390:zfs_space_delta_cb()) SPL PANIC #2025
Sorry, stupid me hitting the wrong buttons. Reopening. I just wanted to add my kernel config for reference https://projects.archlinux.org/svntogit/packages.git/tree/trunk/config.x86_64?h=packages/linux-lts |
The user/group used facility does manage the bonus area on its own, so this could certainly be a newly-discovered case of SA corruption. Unfortunately, it's not currently possible to dump those dnodes directly with zdb; they are objects -1 and -2. Could you please run a full zdb dump of the dataset and pull those two objects out of the output?
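A sketch of the kind of invocation meant here; tank/fs stands in for the affected filesystem:

```sh
# Dump all objects of the dataset at high verbosity, then pull out the
# user/group space-accounting objects (the ones referred to as -1 and -2).
# Assumption: zdb labels their type with something like "ZFS user/group used".
zdb -dddd tank/fs > /tmp/zdb-tank-fs.txt
grep -n -i -A 25 'user/group used' /tmp/zdb-tank-fs.txt
```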
Once you get that output, please post it to this issue (or to a gist or paste somewhere). We want to see the entire output for Object -1 and -2. It wouldn't be surprising if zdb trips an assertion while doing this. My initial suspicion is that there's a problem extending into a spill block when a file is created with a brand new user/group ID. For my part, I'll try a bit of testing on my own to see whether I can reproduce it. |
Do I understand you correctly that there are only two such objects per filesystem and I can stop the zdb grepping once these objects are found? I have a couple of file systems; here are three that are most likely involved: https://gist.github.com/clefru/8256771 I can't really interpret them. The varying sizes (1K, 1.5K) while all show 100% full don't make much sense to me. (Thanks for your speedy reply) |
Yes, you're correct. Actually, it turns out I'm wrong: you can dump those objects directly by specifying them as large unsigned 64-bit integers:
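A sketch, with tank/fs as a placeholder; the two values are simply objects -1 and -2 reinterpreted as unsigned 64-bit integers:

```sh
# (uint64_t)-1 = 18446744073709551615  -> user-used accounting object
# (uint64_t)-2 = 18446744073709551614  -> group-used accounting object
zdb -dddd tank/fs 18446744073709551615 18446744073709551614
```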
For some reason, it didn't dawn on me at the time that you could do that. I just took a peek at your group/user-used output and there doesn't seem to be anything too crazy going on. You mentioned you were using NFS, so I'm wondering if some of the "special" NFS "nobody/nogroup" uid/gids might be involved in this. I don't suppose you've got any idea what type of file operation caused this? I've got a few ideas and will see if I can reproduce this on a test system. As for interpreting the output, I was interested in how many different uid/gids were being tracked, as well as their values. I was also interested in whether a spill block was involved (there is none; it would show up after the "dnode flags:" line). |
@clefru It looks like I was on the wrong track. Can you reproduce this problem? |
Sorry, no idea on the specific NFS operations. Also, the NFS nobody/nogroup UID doesn't seem to have an issue; I can happily create nobody/nogroup files. I will just wait a couple of days to see if I can reproduce it with more information. I will also run a zpool scrub overnight. I found the following other panic reports: http://lists.freebsd.org/pipermail/freebsd-fs/2012-December/thread.html#15945 Interestingly, the last one relates the problem to NFS sharing. |
I would like to point out that the value found in place of the SA_MAGIC is a plausible timestamp of 4 Jan 2014. |
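Both constants are easy to sanity-check from a shell; the bogus value here is the one from the VERIFY3 failure quoted in this issue (GNU date assumed):

```sh
printf '%d\n' 0x2F505A       # the expected SA magic
# 3100762
date -u -d @1388795800       # the value actually found in its place
# Sat Jan  4 00:36:40 UTC 2014
```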
It'd be nice if @clefru could run with debugging enabled (configure with --enable-debug) for a while. There are a bunch of ASSERTs related to SAs which might get hit a bit earlier. |
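A minimal sketch of such a build, assuming release source trees of spl and zfs and the usual autotools flow; the paths are placeholders:

```sh
# Build and install a debug-enabled spl, then a debug-enabled zfs.
cd "$HOME/src/spl"
./configure --enable-debug && make && sudo make install

cd "$HOME/src/zfs"
# --with-spl points the zfs build at the spl source tree if it isn't found automatically.
./configure --enable-debug --with-spl="$HOME/src/spl" && make && sudo make install
```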
@maxximino I think you're on the right track with the timestamp observation. I did examine the bogus value but didn't consider it might be a timestamp. |
@clefru Does the system on which your zfs/spl was compiled have gcc 4.8 or newer? If so, I think I've just discovered a whole raft of other problems that its aggressive loop optimization can cause. See #2010 for details. In particular, it looks like the sa_lengths[] array in sa_hdr_phys is also going to be a problem. |
@clefru Never mind the question regarding gcc 4.8. I had forgotten that the problem doesn't occur when the [1] array is the last member of the structure (which it is in this case). |
Thinking more about a timestamp winding up where the SA's magic should be makes me think this may, in fact, be another potential manifestation of #2010. I've not yet thought through the ramifications for every loop involved, but anything that causes dn_nblkptr to be mis-managed could easily cause SAs to be clobbered. I will be looking over each of the likely over-optimized loops to try to determine the impact. With that in mind, I would like to know whether your system, @clefru, is using gcc 4.8 or newer. I'm also curious about the history of gcc 4.8 adoption in various distros. |
@clefru What is the version of the zfs filesystem? v5 or older? Was it upgraded, or was this fs "born" as v5? (If it was received, consider also its life on the other pool.) |
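For anyone following along, the relevant versions can be checked like this (dataset and pool names are placeholders):

```sh
zfs get version tank/fs     # ZPL (filesystem) version, e.g. 3, 4 or 5
zpool get version tank      # pool version, a separate thing from the ZPL version above
```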
@maxximino Great catch on the timestamp! One potentially involved file system was upgraded v3->v5 before this happened, without a reboot in between. @dweeezil Positive, I am on gcc (GCC) 4.8.2 20131219 (prerelease). My CFLAGS -- if config.status is authoritative -- are just "-g -O2", but I assume the kernel build process adds a ton. I recompiled spl/zfs with --enable-debug. Is there any spl_debug modparam I should pass in? |
@clefru Thanks for checking the GCC; that's likely to be causing some problems in its own right (unless your distro's kernel build process adds -fno-aggressive-loop-optimizations; you can do a "make clean; make V=1" in the modules directory to check)... but... yes! The v3->v5 upgrade may explain everything. That upgrade enables a pair of features that are at the core of this particular problem and your original stack trace. I never considered the possibility that the filesystem in question may have been created with a ZPL version < 5. Under ZPL version 3, rather than an SA bundle, a znode_phys_t exists in the bonus buffer and is of type DMU_OT_ZNODE. The very first members of that structure are the timestamps. In ZPL 5, however, where the bonus would be a DMU_OT_SA, the beginning of the bonus buffer is the magic number (0x2F505A, or ASCII "ZP/\0"), which is the 3100762 in your assert. As to running with --enable-debug, you don't need to do anything else; I had mainly suggested it to get the ASSERTs activated. I'm going to try some experiments with ZPL 3 pools that have been upgraded to version 5. I have a feeling that may explain everything. EDIT: before I dive into my 3->5 upgrade testing, I must mention that I've not done a pool upgrade in a long time and that during my initial tests while writing this original message, a 3->5 upgrade actually hit an ASSERT and failed. |
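A sketch of that verbose-build check; the directory name and log path are assumptions:

```sh
cd module                      # the modules directory of the zfs (or spl) source tree
make clean
make V=1 2>&1 | tee /tmp/build.log
grep -c 'fno-aggressive-loop-optimizations' /tmp/build.log   # 0 means the flag is not being passed
```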
Thanks Tim, Thanks Massimo for hunting that down so quickly! Can I somehow dump the bonus area of the file system in question, so that I can figure out whether the bonus area was properly upgraded to the new layout? |
@clefru The bonus buffer is per-file (in the tail of the dnode) and it is not changed by performing the 3->5 upgrade. Only certain filesystem operations will cause a file's bonus buffer to be converted to a DMU_OT_SA. You can check a particular file with a per-object zdb dump of its object number.
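A sketch, assuming tank/fs is the filesystem; on ZFS, the inode number reported by ls -i is the object number zdb expects:

```sh
ls -i /tank/fs/path/to/file            # prints the object (inode) number, e.g. 12345
zdb -dddd tank/fs 12345 | grep -A 2 'bonus'
```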
The "bonus ZFS znode" line indicates version < 5; files created on version 5 will have the bonus type listed as "bonus System attributes". I have already been able to reproduce some corruption on 3->5 upgraded filesystems when running stock GCC 4.8-compiled code (without the #2010 hack). I'm working on tracking down the nature of the corruption and its cause. |
Just FYI, I have been running a debug build for a week now and I haven't seen this issue again.
Below is my guess about what could be wrong. A thread changing file attributes can end up calling zfs_sa_upgrade() to convert the file's bonus from DMU_OT_ZNODE to DMU_OT_SA. The conversion is achieved in two steps, one of which is dmu_set_bonustype().
dmu_set_bonustype() calls dnode_setbonus_type(), which updates dn_bonustype in the in-core dnode. Concurrently, the sync thread can run into the dnode if it was dirtied in an earlier txg. The sync thread calls dmu_objset_userquota_get_ids() via dnode_sync(). dmu_objset_userquota_get_ids() uses dn_bonustype, which already holds the new value, but the data corresponding to the txg being synced is still in the old format. As I understand it, dmu_objset_userquota_get_ids() already uses dmu_objset_userquota_find_data() when before == B_FALSE to find the proper copy of the data corresponding to the txg being synced. |
As requested by @dweeezil, closing openzfs/spl#352 and copy/pasting the info here: SPLError: 2657:0:(zfs_vfsops.c:351:zfs_space_delta_cb()) SPL PANIC. From production servers (HPC center, /home NFS servers), running for about a year and a half without problems, we recently got these messages (one of which I was able to save): Mar 31 19:30:59 r720data3 kernel: [ 7563.266511] VERIFY3(sa.sa_magic == 0x2F505A) failed (1383495966 == 3100762) Then all txg_sync threads hang, all knfsd threads hang and are uninterruptible, the load skyrockets => hard reboot. I can't force the bug to reproduce, but it appears randomly (on 3 different servers, all with the same hardware and software configuration) as NFS usage goes on (once a week on one server, twice a day on another). I scrubbed all pools after the hangs/reboots => "No known data errors". Data were imported from older pools (v3, v4, from Solaris x86 to Debian x86_64/ZoL) via zfs send/recv, then upgraded (via zpool/zfs upgrade) to v5. Maybe related to #1303 and #2025?
(The original comment attached the output of gcc -v, zpool list, zpool status baie1, zpool get all baie1, zfs list, and zfs get all baie1/users/phys; the pool again showed "errors: No known data errors".)
These are production servers, so I cannot play with debug builds, but if you need any additional data, please ask and I'll do what I can. |
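For what it's worth, the bogus value in this report also decodes to a plausible timestamp, consistent with @maxximino's observation above (GNU date assumed):

```sh
date -u -d @1383495966
# Sun Nov  3 16:26:06 UTC 2013
```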
This is believed to be resolved. If anyone observes an instance of a 'zfs_space_delta_cb()) SPL PANIC' in 0.6.3 or newer, please open a new issue.
Unfortunately, I saw this again today on heavy IO work over NFS:
|
OK, I'm reopening this issue. |
I saw this for the first time today with 0.6.3-2. This is a production machine and I can't really do a huge amount of debugging. Is there anything I could do that would allow me to modify the affected files without it panicking?
|
Same here:
I upgraded from zfs v4 to v5 a few days ago. How do I get rid of this?
Closing as stale. If anyone is still hitting this, let us know and we'll reopen it.
After the following SPL panic
[402052.331476] VERIFY3(sa.sa_magic == 0x2F505A) failed (1388795800 == 3100762)
[402052.331514] SPLError: 324:0:(zfs_vfsops.c:390:zfs_space_delta_cb()) SPL PANIC
[402052.331545] SPL: Showing stack for process 324
[402052.331548] CPU: 2 PID: 324 Comm: txg_sync Tainted: P O 3.10.25-1-lts #1
[402052.331550] Hardware name: /DQ45CB, BIOS CBQ4510H.86A.0079.2009.0414.1340 04/14/2009
[402052.331552] ffff880069446fa8 ffff880221d8b9d8 ffffffff814b1e2d ffff880221d8b9e8
[402052.331555] ffffffffa0187897 ffff880221d8ba10 ffffffffa018890f ffffffffa019c531
[402052.331558] ffff8801e6697600 ffff8801e6697600 ffff880221d8ba50 ffffffffa02ce816
[402052.331561] Call Trace:
[402052.331568] [] dump_stack+0x19/0x1b
[402052.331589] [] spl_debug_dumpstack+0x27/0x40 [spl]
[402052.331595] [] spl_debug_bug+0x7f/0xe0 [spl]
[402052.331613] [] zfs_space_delta_cb+0x186/0x190 [zfs]
[402052.331628] [] dmu_objset_userquota_get_ids+0xd4/0x470 [zfs]
[402052.331643] [] dnode_sync+0xde/0xb80 [zfs]
[402052.331647] [] ? __mutex_lock_slowpath+0x24c/0x320
[402052.331661] [] dmu_objset_sync_dnodes+0xd2/0xf0 [zfs]
[402052.331676] [] dmu_objset_sync+0x1c7/0x2f0 [zfs]
[402052.331689] [] ? secondary_cache_changed_cb+0x20/0x20 [zfs]
[402052.331703] [] ? dmu_objset_sync+0x2f0/0x2f0 [zfs]
[402052.331720] [] dsl_dataset_sync+0x41/0x50 [zfs]
[402052.331737] [] dsl_pool_sync+0x98/0x470 [zfs]
[402052.331754] [] spa_sync+0x427/0xb30 [zfs]
[402052.331772] [] txg_sync_thread+0x374/0x5e0 [zfs]
[402052.331790] [] ? txg_delay+0x120/0x120 [zfs]
[402052.331796] [] thread_generic_wrapper+0x7a/0x90 [spl]
[402052.331802] [] ? __thread_exit+0xa0/0xa0 [spl]
[402052.331806] [] kthread+0xc0/0xd0
[402052.331809] [] ? kthread_create_on_node+0x120/0x120
[402052.331812] [] ret_from_fork+0x7c/0xb0
[402052.331815] [] ? kthread_create_on_node+0x120/0x120
my NFS server stayed partially unresponsive.
I am running 5d862cb zfs and 921a35a spl on linux-lts 3.10.25
I never enabled xattr=sa on this pool. I have xattr=on.
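For completeness, the per-dataset xattr setting can be confirmed like this (dataset name is a placeholder):

```sh
zfs get xattr tank/fs    # "on" = directory-based xattrs, "sa" = xattrs stored as system attributes
```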
Another SA corruption issue?