Thread overran stack, or stack corrupted #1354
@behlendorf Out of paranoia I'm still using 16KiB stacks. The deepest stacks I see:
(likely the ZPL, not a Linux filesystem over a zvol). That leaves about 2K free, which I can imagine is not hard to exhaust. For stack use where zvols are involved:
That is well under 3K when you consider that not all 8K would be available.
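For anyone wanting to reproduce these per-function numbers, the kernel tree ships a script for static stack-usage analysis. A sketch of how it might be run (paths assume a built kernel tree and module; adjust to your setup):

```
# Rough per-function stack usage from a built kernel image:
objdump -d /path/to/vmlinux | perl scripts/checkstack.pl x86_64 | head -20

# The same analysis can be pointed at the zfs module:
objdump -d module/zfs/zfs.ko | perl scripts/checkstack.pl x86_64 | head -20
```

This only reports static frame sizes, not actual runtime depth, but it is a quick way to spot the worst offenders in a call chain.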
FWIW, I experienced "Thread overran stack, or stack corrupted" as well while testing a "zfs send -D -Rv $SNAP | zfs receive -uF $DEST/$SOURCE-test". This occurred with the source pool containing the $SNAP snapshot being backed by LVM on top of mdadm, and the destination pool $DEST being backed by a single, full disk attached to the same system. The system was running an up-to-date Debian Squeeze with a Proxmox kernel 2.6.32-17-pve and spl 0.6.0.98-0ubuntu1. So far, the above send | receive has never failed to crash the system eventually, but the trace varied. Please find below the messages from a hard lockup:
Curiously enough, the very same system sometimes crashes differently during the same test. The occurrence listed below even allowed the kernel to issue a reset, so no hardware reset was required:
@florianernst To determine the root cause of your stack overflow you're going to need to rebuild your kernel with the following patch to enable 16k stacks. This will prevent the overflow from occurring and allow you to grab a detailed stack trace like the one below.
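The change amounts to bumping the stack allocation from order-1 (8K) to order-2 (16K) pages. A minimal sketch of such a patch, assuming a mainline x86_64 kernel; the exact file and macro name vary by kernel version (older RHEL6-based kernels spell it THREAD_ORDER, as noted later in this thread):

```
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@
-#define THREAD_SIZE_ORDER	1
+#define THREAD_SIZE_ORDER	2
```

After rebuilding and booting the patched kernel, each kernel thread gets a 16K stack, so workloads that previously overran 8K will instead leave a complete trace for debugging.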
This detailed stack call graph is for the case I'm observing. Somewhat frustratingly, zfs is responsible for less than 2k of the total stack usage. However, since we probably only need to save a hundred bytes or so, we may still be able to do something about this.
After giving this some thought, I think the right thing to do here isn't to shave a few bytes off the stack. Instead we should rework the file vdev I/O path so the synchronous I/O runs asynchronously on its own stack.
Two ways to go about the above change occur to me, and both should be investigated. We could either a) use Linux's aio interfaces for this, if they support a callback on the kernel side. The downside here is that we'll be dependent on the underlying file system to correctly support aio, and not all filesystems do. Or b) we could use a taskq in the ZFS code to make the I/O asynchronous, with each individual work item being synchronous. This has the advantage of buying us the most possible stack and being insensitive to the underlying aio implementation. Whichever we end up picking, I'm deferring this work until 0.6.2 since this isn't a new issue in the code.
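Option (b) might look roughly like the following kernel-side sketch. All names here (the struct, the helper functions, the taskq variable) are illustrative assumptions, not the actual implementation:

```c
/*
 * Sketch of option (b): hand each synchronous vn_rdwr() off to a
 * taskq thread so it runs on a fresh, nearly empty stack.
 */
typedef struct vdev_file_io {
	zio_t *vfio_zio;		/* the I/O to service */
} vdev_file_io_t;

static void
vdev_file_io_task(void *arg)
{
	vdev_file_io_t *vfio = arg;
	zio_t *zio = vfio->vfio_zio;

	/* Still a synchronous read/write, but on the taskq thread's stack. */
	zio->io_error = vn_rdwr(/* ... as in the existing code ... */);

	zio_interrupt(zio);		/* complete the zio asynchronously */
	kmem_free(vfio, sizeof (*vfio));
}

static int
vdev_file_io_start(zio_t *zio)
{
	vdev_file_io_t *vfio = kmem_alloc(sizeof (*vfio), KM_SLEEP);

	vfio->vfio_zio = zio;
	(void) taskq_dispatch(vdev_file_taskq, vdev_file_io_task,
	    vfio, TQ_SLEEP);
	return (ZIO_PIPELINE_STOP);
}
```

The key property is that the deep caller's stack is abandoned at the dispatch point: however deep the ZIO pipeline already is, the file-level I/O starts over from the top of a worker thread's stack.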
@behlendorf Thanks for the hint. I recompiled my kernel (for anyone who might want to try this as well: for the Proxmox RHEL6-based kernel I had to adjust THREAD_ORDER instead of THREAD_SIZE_ORDER), and ever since booting the recompiled kernel I have failed to reproduce the previously described crashes. Please advise whether there is anything I should try or provide while the system is running. I did have some hung tasks, though, when trying to "zfs destroy -r" a dataset:
But this might be entirely unrelated, as the test system was under heavy IO load at that time ...
@florianernst Now that you're not overrunning the stack, we can get the debugging information which is needed. Here's what you'll need to do.
If the value in …
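For reference, the standard way to collect this kind of data is the kernel's built-in stack tracer (requires a kernel built with CONFIG_STACK_TRACER=y and a mounted debugfs; run as root). A sketch of the usual sequence:

```
# Enable the stack tracer, then run the workload that stresses the stack:
echo 1 > /proc/sys/kernel/stack_tracer_enabled

# Maximum stack depth observed so far, in bytes:
cat /sys/kernel/debug/tracing/stack_max_size

# Per-frame breakdown of the deepest stack seen:
cat /sys/kernel/debug/tracing/stack_trace
```

The stack_trace output is exactly the kind of detailed call graph quoted elsewhere in this thread, with each function's frame size listed alongside it.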
@behlendorf Thanks a lot for the info, much appreciated. There you go:
As a side note, colleagues mentioned that on my test system they had another crash when cd'ing into .zfs/snapshot directories. I'll see whether I can find a reproducible test case for this and will open another issue if I can.
@florianernst Thanks, that's what we needed. We'll give a little thought to how to get the stack usage under control; in the meantime there's really no downside to running with the 16k kernel until we sort this out.
Hi, I ran into this issue too (CentOS 6.4, kernel-2.6.32-358.2.1.el6.x86_64, spl/zfs 0.6.1), just running zfs send pool@snapshot > /nfs/file.zfs over a 10GbE network (MTU 9k). I have rebuilt a 16k stack kernel and here is a stack trace: Thu Apr 18 11:10:13 CEST 2013 ==> /sys/kernel/debug/tracing/stack_trace <==
3306 zdb should be able to issue reads in parallel
3321 'zpool reopen' command should be documented in the man page and help

Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com>
Reviewed by: Christopher Siden <chris.siden@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>

References:
illumos/illumos-gate@31d7e8f
https://www.illumos.org/issues/3306
https://www.illumos.org/issues/3321

The vdev_file.c implementation in this patch diverges significantly from the upstream version. For consistency with the vdev_disk.c code, the upstream version leverages the Illumos bio interfaces. This makes sense for Illumos but not for ZoL for two reasons. 1) The vdev_disk.c code in ZoL has been rewritten to use the Linux block device interfaces, which differ significantly from those in Illumos. Therefore, updating vdev_file.c to use the Illumos interfaces doesn't get you consistency with vdev_disk.c. 2) Using the upstream patch as is would require implementing compatibility code for those Solaris block device interfaces in user and kernel space. That additional complexity could lead to confusion and doesn't buy us anything.

For these reasons I've opted to simply move the existing vn_rdwr() as is into the taskq function. This has the advantage of being low risk and easy to understand. Moving the vn_rdwr() function into its own taskq thread also neatly avoids the possibility of a stack overflow. Finally, because of the additional work which is being handled by the free taskq, the number of threads has been increased. The thread count under Illumos defaults to 100 but was decreased to 2 in commit 08d08e due to contention. We increase it to 8 until the contention can be addressed by porting Illumos openzfs#3581.

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#1354
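The "increase it to 8" in the commit message above would correspond to a taskq created with the SPL taskq API roughly as follows. This is only a sketch of the pattern, with assumed names and arguments, not the ported code itself:

```c
/* Sketch: a file vdev taskq with 8 threads (names and flags assumed) */
static taskq_t *vdev_file_taskq;

void
vdev_file_init(void)
{
	vdev_file_taskq = taskq_create("vdev_file_taskq", 8,
	    minclsyspri, 8, INT_MAX, TASKQ_DYNAMIC);
}

void
vdev_file_fini(void)
{
	taskq_destroy(vdev_file_taskq);
}
```

Raising the thread count trades a little memory for parallelism: with only 2 threads, the taskq that now also carries the synchronous vn_rdwr() work would become a bottleneck under contention.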
Observed when running filebench on Fedora 18 using file vdevs. Based on the trace, it looks like ext4 was responsible for a good chunk of the stack usage. I suspect this wouldn't have been an issue on block vdevs.