Premature free() in sftp_readdir_async() #7
Comments
ShapeShifter499
referenced this issue
in libfuse/libfuse
Feb 16, 2016
Closed
Segmentation fault with fuse/sshfs in multi-threaded mode #14
|
Sorry, that backtrace doesn't contain debugging symbols and is thus pretty much useless. Could you please install non-stripped versions of libfuse and sshfs (the procedure depends on your distribution, but compiling from source should always work)? Ideally, please try to reproduce this when running under valgrind. The problem is that the glibc error only appears long after the damage has been done; valgrind should tell us exactly when things go wrong. |
ShapeShifter499
commented
Feb 16, 2016
|
Non-stripped? Sorry, I'm more of an 'end user' of Linux. Would compiling straight from git be enough? EDIT: Um, duh, you already said it in the comment above and I missed it while reading. Oops. |
ShapeShifter499
commented
Feb 16, 2016
|
It seems like my Linux distro, Arch Linux, already pulls from git for its packages. I'm not sure what they did differently that I wouldn't do. SSHFS: https://projects.archlinux.org/svntogit/community.git/tree/trunk/PKGBUILD?h=packages/sshfs EDIT: I suppose I'll try running what I have through Valgrind. |
|
Try to run
not this:
|
ShapeShifter499
commented
Feb 16, 2016
|
I have stripped, now what? SSHFS
LibFuse
|
|
Now install unstripped packages using the appropriate means for your distribution (on Debian, that would be |
ShapeShifter499
commented
Feb 17, 2016
|
I'll compile later and get things going. It seems my distro provides some dev and dbg packages, but not for fuse and sshfs. Expect an update on that sometime tomorrow. P.S. I'm currently running a load over sshfs through Valgrind with the versions I got from the distro. It's exactly the same command as the one I used for the gdb log, just with valgrind in place of gdb. However, this time around I have yet to hit a crash. I don't know enough to understand why. Any ideas? |
ShapeShifter499
commented
Feb 17, 2016
|
Just to add: before running sshfs through valgrind, I would experience the crash hours after I told Digikam to organize over 22k photos and videos into a year/month/date folder structure, based on when each was taken or created, using files on the SSHFS-mounted file system from the remote machine. Sometime into this, SSHFS would core dump, and shortly after, Digikam would core dump. I only saw an empty desktop when I came back, with no Digikam running. Below is one of the core dumps I saw in "journalctl -b"
|
ShapeShifter499
commented
Feb 17, 2016
|
Tried to trigger the bug again without Valgrind and got a different gdb output but same issue, SSHFS is dying without a known reason.
|
|
On Feb 16 2016, Lance notifications@github.com wrote:
Sorry, without debugging symbols all the information you're providing Best, GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
|
ShapeShifter499
commented
Feb 17, 2016
|
Ok so couldn't sleep, compiled the correct packages. Hopefully what you need is below. SSHFS
LibFUSE
Now for the crash log with the updated non stripped binaries
|
|
Yep, this backtrace is useful. But unfortunately it only tells us that you're most likely seeing the same issue that you referenced above. Something corrupts libc's malloc pool, and this corruption later causes the segfault for which you got the stacktrace. Valgrind should be able to tell us when the actual corruption occurs (unless it's a multithreading issue after all, but I don't think so). |
Nikratio
added
the
bug
label
Jun 5, 2016
|
I'm closing this issue for now, since I can't locally reproduce it and without a valgrind log there is little I can do to fix it. Please feel free to re-open with additional information. |
Nikratio
closed this
Jun 3, 2017
Nikratio
added
the
needs-info
label
Jun 3, 2017
mikemol
commented
Jul 7, 2017
|
I started encountering this earlier this week. Running with sshfs now. If anyone has a good way to do a systemd+fstab-driven sshfs mount under valgrind automatically, I'm all ears. For now, here's my command line, for anyone else who stumbles across the same problem:
If there are any additional flags I ought to be using, let me know. |
mikemol
commented
Jul 10, 2017
|
|
@mikemol Thanks! What sshfs version is this? |
Nikratio
reopened this
Jul 12, 2017
mikemol
commented
Jul 12, 2017
|
fuse-sshfs-2.5-1.el7.x86_64 Latest version from EPEL on CentOS 7. |
Nikratio
self-assigned this
Jul 12, 2017
Nikratio
removed
the
needs-info
label
Jul 12, 2017
Nikratio
changed the title from
sshfs' fuse_fs_readdir bug
to
Premature free() in sftp_readdir_async()
Jul 12, 2017
|
Thanks! Looks like sftp_readdir_async() tries to access req->want_reply. However, sftp_process_one_request() frees req if req->want_reply under the same conditions. I'll have to think about how to best fix this. |
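The hazard described above, one thread reading a field of a request that another thread may already have freed, can be illustrated with a minimal sketch. This is not the sshfs code and not the patch discussed in this thread: `struct request`, `process_one()`, `submit()` and `request_destroy()` are hypothetical names, and only `want_reply` is borrowed from the discussion. It assumes a rule where the worker frees any request nobody waits on, and shows one way to stay safe under that rule: snapshot the field before handing the request off, and never dereference the request afterwards unless the snapshot says this thread still owns it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical request type; only the want_reply name comes from the thread. */
struct request {
    bool want_reply;        /* true: submitter waits and frees; false: worker frees */
    bool done;              /* set by the worker once result is valid */
    int result;
    pthread_mutex_t lock;
    pthread_cond_t cond;
};

static void request_destroy(struct request *req)
{
    pthread_cond_destroy(&req->cond);
    pthread_mutex_destroy(&req->lock);
    free(req);
}

/* Worker side: completes the request and frees it if nobody waits for it. */
static void *process_one(void *arg)
{
    struct request *req = arg;
    assert(req != NULL);

    if (req->want_reply) {
        pthread_mutex_lock(&req->lock);
        req->result = 42;       /* placeholder for a real reply */
        req->done = true;
        pthread_cond_signal(&req->cond);
        pthread_mutex_unlock(&req->lock);
    } else {
        request_destroy(req);   /* fire-and-forget: the worker owns the request */
    }
    return NULL;
}

/* Submitter side: returns the reply, 0 if none was requested, -1 on error. */
static int submit(bool want_reply)
{
    struct request *req = calloc(1, sizeof(*req));
    if (req == NULL)
        return -1;
    req->want_reply = want_reply;
    pthread_mutex_init(&req->lock, NULL);
    pthread_cond_init(&req->cond, NULL);

    /* Snapshot taken while this thread is still the only owner. */
    bool owns_req = req->want_reply;

    pthread_t tid;
    if (pthread_create(&tid, NULL, process_one, req) != 0) {
        request_destroy(req);
        return -1;
    }

    /* From here on, req may already be freed unless owns_req is true,
     * so req->want_reply must not be read again. */
    int result = 0;
    if (owns_req) {
        pthread_mutex_lock(&req->lock);
        while (!req->done)
            pthread_cond_wait(&req->cond, &req->lock);
        result = req->result;
        pthread_mutex_unlock(&req->lock);
    }

    pthread_join(tid, NULL);    /* the worker no longer touches req after this */
    if (owns_req)
        request_destroy(req);
    return result;
}
```

In this sketch, `submit(true)` waits for the worker and returns its reply, while `submit(false)` returns 0 without touching the request again. Replacing the `owns_req` checks with fresh reads of `req->want_reply` would reintroduce the use-after-free in the fire-and-forget case. Build with `-pthread`.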
mikemol
commented
Jul 12, 2017
|
Added to EPEL's bugzilla so they're aware when a fix is available, and can incorporate it: https://bugzilla.redhat.com/show_bug.cgi?id=1470193 |
|
Could you test if the following patch solves the problem?
|
mikemol
commented
Jul 12, 2017
|
That's the solution I was going to suggest, and would fix the race condition. I'm not in a position to test it right now; my build host is unavailable. However, given that it's clearly a race condition, you should be able to exacerbate it by dropping a |
ShapeShifter499 commented Feb 16, 2016
Hi,
With some help from the IRC channel ##networking on Freenode, it seems there may be a bug in how sshfs uses fuse, maybe with 6a2d06e. Someone suggested running sshfs with the flag "-o sync_readdir", and it seems to help with the crash I was experiencing. Log of the original crash below.