Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extremely poor performance with latest stable, I/O error, random crashes on folder creation #54

Closed
xnoreq opened this issue Aug 14, 2018 · 36 comments

Comments

@xnoreq
Copy link

xnoreq commented Aug 14, 2018

Using:

  • sshfs-win-2.7.17334-x64.msi
  • winfsp-1.3.18160.msi

Connecting to an Android phone (running primitive ftpd) over wifi results in extremely poor performance.
Copying a folder with 10 files with explorer takes like a minute just to calculate the files to copy.

Using WinSCP the files start copying instantly and with over 5 MB/s.

Using:

  • winfsp-1.4.18211.msi
  • sshfs-win-3.2.18213-x64.msi

first resulted in an I/O error, then the drive got disconnected and stayed that way (there was no possibility to remove the faulty drive).
After mapping another drive, I again got I/O errors even just trying to create a new folder.

@billziss-gh
Copy link
Collaborator

@xnoreq how are you launching the file system?

@xnoreq
Copy link
Author

xnoreq commented Aug 15, 2018

Explorer - Connect network drive: \\sshfs\user@ipv4!port

@xnoreq
Copy link
Author

xnoreq commented Aug 15, 2018

With the stable version it actually doesn't even copy the files. It's stuck in calculating... for a long time (the Explorer file copy dialog) and then when it finally starts copying the first file it fails the error element not found.

@xnoreq
Copy link
Author

xnoreq commented Aug 16, 2018

Is there any way to debug/log what is going on?

@billziss-gh
Copy link
Collaborator

@xnoreq you can start sshfs from the command line and give it the option -d.

@xnoreq
Copy link
Author

xnoreq commented Aug 17, 2018

>"C:\Program Files\SSHFS-Win\bin\ssh.exe" user@192.168.1.56 -p 1234 -x -a -s -vvv sftp

debug1: Next authentication method: keyboard-interactive
debug2: userauth_kbdint
debug3: send packet: type 50
debug2: we sent a keyboard-interactive packet, wait for reply
debug3: receive packet: type 60
debug2: input_userauth_info_req
Password authentication
debug2: input_userauth_info_req: num_prompts 1
Password:
debug3: send packet: type 61
debug3: receive packet: type 52
debug1: Authentication succeeded (keyboard-interactive).
Authenticated to 192.168.1.56 ([192.168.1.56]:1234).
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug3: send packet: type 90
debug1: Entering interactive session.
debug1: pledge: network
debug3: receive packet: type 91
debug2: callback start
debug2: fd 3 setting TCP_NODELAY
debug3: ssh_packet_set_tos: set IP_TOS 0x08
debug2: client_session2_setup: id 0
debug1: Sending subsystem: sftp
debug2: channel 0: request subsystem confirm 1
debug3: send packet: type 98
debug2: callback done
debug2: channel 0: open confirm rwindow 2097152 rmax 32768
debug3: receive packet: type 99
debug2: channel_input_status_confirm: type 99 id 0
debug2: subsystem request accepted on channel 0
>"C:\Program Files\SSHFS-Win\bin\sshfs.exe" user@192.168.1.56: z: -p 1234 -d -o LogLevel=DEBUG3

debug1: Next authentication method: keyboard-interactive
debug2: userauth_kbdint
debug3: send packet: type 50
debug2: we sent a keyboard-interactive packet, wait for reply
debug3: receive packet: type 60
debug2: input_userauth_info_req
Password authentication
debug2: input_userauth_info_req: num_prompts 1
debug3: failed to open file:/dev/tty error:3
debug1: read_passphrase: can't open /dev/tty: No such file or directory
Password:
debug3: send packet: type 61
debug3: receive packet: type 52
debug1: Authentication succeeded (keyboard-interactive).
Authenticated to 192.168.1.56 ([192.168.1.56]:1234).
debug2: fd 4 setting O_NONBLOCK
debug2: fd 5 setting O_NONBLOCK
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug3: send packet: type 90
debug1: Entering interactive session.
debug1: pledge: network
debug3: receive packet: type 91
debug2: channel_input_open_confirmation: channel 0: callback start
debug2: fd 3 setting TCP_NODELAY
debug2: client_session2_setup: id 0
debug1: Sending subsystem: sftp
debug2: channel 0: request subsystem confirm 1
debug3: send packet: type 98
debug2: channel_input_open_confirmation: channel 0: callback done
debug2: channel 0: open confirm rwindow 2097152 rmax 32768
debug3: read - ERROR from cb :87, io:0000022944AD0C60
debug2: channel 0: read<=0 rfd 4 len 4294967295
debug2: channel 0: read failed
debug2: channel 0: close_read
debug2: channel 0: input open -> drain
debug2: channel 0: ibuf empty
debug2: channel 0: send eof
debug3: send packet: type 96
debug2: channel 0: input drain -> closed
debug3: receive packet: type 99
debug2: channel_input_status_confirm: type 99 id 0
debug2: subsystem request accepted on channel 0
debug3: receive packet: type 96
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
debug2: channel 0: obuf empty
debug2: channel 0: close_write
debug2: channel 0: output drain -> closed
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug3: receive packet: type 97
debug2: channel 0: rcvd close
debug3: channel 0: will not send data after close
debug2: channel 0: almost dead
debug2: channel 0: gc: notify user
debug2: channel 0: gc: user detached
debug2: channel 0: send close
debug3: send packet: type 97
debug2: channel 0: is dead
debug2: channel 0: garbage collecting
debug1: channel 0: free: client-session, nchannels 1
debug3: channel 0: status: The following connections are open:
  #0 client-session (t4 r0 i3/0 o3/0 fd -1/-1 cc -1)

debug3: send packet: type 1
debug3: fd 0 is not O_NONBLOCK
debug3: fd 1 is not O_NONBLOCK
Transferred: sent 1720, received 1736 bytes, in 0.7 seconds
Bytes per second: sent 2316.9, received 2338.4
debug1: Exit status 0
read: Connection reset by peer

@billziss-gh
Copy link
Collaborator

@xnoreq try to set the path before executing SSHFS:

set PATH=C:\Program Files\SSHFS-Win\bin

Also when you start SSHFS from the command-line use the options -o uid=-1,gid=-1 to make the file system accessible to the user that starts the file system.

@xnoreq
Copy link
Author

xnoreq commented Aug 17, 2018

>sshfs.exe user@192.168.1.56:/ z: -p 1234 -d -o uid=-1,gid=-1
Password authentication
Password:
read: Connection reset by peer

Still the same error.
It looks like it is doing some operation that is not supported by the server.

What am I doing wrong? Why does mounting work through Explorer?

@xnoreq
Copy link
Author

xnoreq commented Aug 17, 2018

I switched the way primftpd accesses the filesystem on the android phone (the option is called "plain old filesystem"), and now it connects also in using sshfs.exe, but the resulting drive cannot be accessed.

The security tab in explorer shows my user, nobody and everyone, but none of them have any permissions.

Has anyone ever tried this with an Android phone?

@billziss-gh
Copy link
Collaborator

@xnoreq did you try setting the PATH environment variable before startings SSHFS? Do you have another ssh.exe in the PATH somewhere?

@xnoreq
Copy link
Author

xnoreq commented Aug 18, 2018

@billziss-gh Of course I have adjusted PATH, and no, I don't have another ssh.exe in the PATH.

@4O4
Copy link
Contributor

4O4 commented Aug 18, 2018

@xnoreq What about native sshfs under linux, have you maybe tested it before? If not, can you setup some minimal linux virtual machine and check if it works better with your android server?

@xnoreq
Copy link
Author

xnoreq commented Nov 5, 2018

After some changes (using termux now on my phone) and testing using sshfs on a linux box (works fine) I get the following problems:

  1. Mapping a network drive doesn't seem to like pubkey authentication, so I have to start sshfs manually
  2. sshfs connects but when I create a folder in explorer I get a nasty error popup with error code: 0x800703E3

The folder is created however, because after closing the error and refreshing it appears in Explorer.
Additionally, when pressing retry in the error popup sshfs crashes.

Here is the log from creating a folder:

[00063] LSTAT
  [00063]         STATUS       33bytes (212ms)
[00064] LSTAT
  [00064]         STATUS       33bytes (3ms)
[00065] MKDIR
  [00065]         STATUS       28bytes (822ms)
[00066] LSTAT
[00067] LSTAT
  [00066]          ATTRS       41bytes (38ms)
  [00067]          ATTRS       41bytes (39ms)
[00068] EXTENDED
  [00068] EXTENDED_REPLY       97bytes (5ms)

Retry:

[00069] EXTENDED
  [00069] EXTENDED_REPLY       97bytes (88ms)
[00070] LSTAT
  [00070]          ATTRS       41bytes (30ms)
[00071] LSTAT
  [00071]          ATTRS       41bytes (5ms)
[00072] LSTAT
  [00072]          ATTRS       41bytes (7ms)
[00073] LSTAT
  [00073]          ATTRS       41bytes (5ms)
[00074] LSTAT
  [00074]          ATTRS       41bytes (6ms)
[00075] LSTAT
  [00075]          ATTRS       41bytes (4ms)
[00076] LSTAT
  [00076]         STATUS       33bytes (7ms)
[00077] LSTAT
  [00077]         STATUS       33bytes (5ms)
[00078] MKDIR
  [00078]         STATUS       28bytes (828ms)
Exception: STATUS_ACCESS_VIOLATION at rip=001800BE3A7
rax=000006FE7FD20010 rbx=0000000600081000 rcx=000000003003F7F5
rdx=00000001802DBFA8 rsi=00000006000E0000 rdi=000006FE7FE00010
r8 =000006F87FD20010 r9 =00000001802DFFE0 r10=0000000002F80000
r11=00000001004014C1 r12=00000000000000A0 r13=0000000000122B60
r14=0000000002F7CA80 r15=0000000000000000
rbp=0000000000000000 rsp=0000000002F7C7C8
program=C:\Program Files\SSHFS-Win\bin\sshfs.exe, pid 5596, thread
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame        Function    Args
00000000000  001800BE3A7 (00002F7C820, 00180140131, 00180141CFF, 006000C9518)
00000000000  0010040600B (0018011973B, 500000000000401, 00600081000, 006000C9390)
00002F7C8F0  0010040A04E (00600078860, 00100410177, 001800BA6D3, 00000000A02)
00002F7C9E1  7FF9EC9C2E2C (00002F7CA80, 00000000000, 00000000000, 00000000000)
00002F7C9E1  7FF9EC9C5E3F (00600080FD0, 00000000000, 00002F7CA74, 00002F7CA7C)
00002F7CAE1  7FF9EC9C699C (0000012EF90, 00000000080, 00000000000, 00000000010)
00000122B60  7FF9EC9D2C4E (00000132F60, 00000000370, 00000100001, 0000012EF90)
00000122B60  7FF9EC9D3B35 (00000000000, 00000132F60, 000000303E9, 0000012EF90)
00000122B60  7FF9EC9D6141 (00000000000, 00000000080, 7FF9EC9D5F40, 00000000000)
00000122B60  00180045653 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  00180045704 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  7FFA2C843034 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  7FFA2F361471 (00000000000, 00000000000, 00000000000, 00000000000)
End of stack trace
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: channel 0: free: client-session, nchannels 1
Transferred: sent 9116, received 8020 bytes, in 27.4 seconds
Bytes per second: sent 332.1, received 292.2
debug1: Exit status 0

@billziss-gh
Copy link
Collaborator

@xnoreq sshfs-win does not currently support pubkey authentication without having to resort to the command line and/or batch files or special launchers (e.g. sirikali).

Error 0x800703E3 is the "wrapped" Win32 ERROR_OPERATION_ABORTED. This happens in rare occasions, but having the file system fail with an access violation (as per your second log) is one of them.

I am assuming this happens with the latest SSHFS beta and not the stable version?

@xnoreq
Copy link
Author

xnoreq commented Nov 6, 2018

This happens in rare occasions

What if it happens every time?

I am assuming this happens with the latest SSHFS beta and not the stable version?

Correct.

@billziss-gh
Copy link
Collaborator

This happens in rare occasions

What if it happens every time?

This points to a bug (access violation) in the latest beta that is easily reproducible in your environment.

I am currently swamped with other work, but my intent is to sync with upstream SSHFS v3.5 before year's end; this might fix your problem. The work for this should be straightforward as all Cygwin related changes have now been incorporated upstream. So if anyone wants to take this on as a PR I would really appreciate it.

@xnoreq
Copy link
Author

xnoreq commented May 17, 2019

Updated to latest versions today and tried again.
Creating a folder still crashes:

sshfs.exe options after mount point: -d -o LogLevel=DEBUG3 -o uid=-1 -o gid=-1

[00034] MKDIR
  [00034]         STATUS       28bytes (2950ms)
Exception: STATUS_ACCESS_VIOLATION at rip=001800CC607
rax=000006FE7FCF0010 rbx=00000006000C57C0 rcx=0000000030045415
rdx=000000018030A0A8 rsi=00000006000E0000 rdi=000006FE7FDD0010
r8 =000006F87FCF0010 r9 =0000000180310000 r10=00000000029B0000
r11=00000001004014C1 r12=00000000000000A0 r13=00000000000F43A0
r14=00000000029ACA30 r15=0000000000000000
rbp=0000000000000000 rsp=00000000029AC778
program=c:\Program Files\SSHFS-Win\bin\sshfs.exe, pid 874, thread
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame        Function    Args
00000000000  001800CC607 (000029AC7D0, 00180151341, 003FF32F746, 006000912F8)
00000000000  0010040610B (001800C88DE, 500000000000401, 006000810D0, 006000810F0)
000029AC8A0  0010040A15E (00600078860, 00100410197, 001800C88DE, 00000000A02)
000029AC991  7FF916C92E2C (000029ACA30, 00000000000, 00180152ECE, 00000000000)
000029AC991  7FF916C95E4F (006000914B0, 00000000000, 000029ACA24, 000029ACA2C)
000029ACAA1  7FF916C96A1A (00000109AD0, 00000000001, 00000000000, 00000000010)
000029ACB79  7FF916CA3778 (00000000078, 00000109AD0, 00000100001, 00000109AD0)
000000F43A0  7FF916CA4899 (00000000000, 0000010DAA0, 000000303E9, 00000109AD0)
000000F43A0  7FF916CA7201 (00000000000, 00000000078, 7FF916CA7000, 00000000000)
000000F43A0  00180048353 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  00180048404 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  7FF944317974 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  7FF946BCA271 (00000000000, 00000000000, 00000000000, 00000000000)
End of stack trace
debug2: channel 0: read<=0 rfd 4 len 0
debug2: channel 0: read failed
debug2: channel 0: chan_shutdown_read (i0 o0 sock -1 wfd 4 efd 6 [write])
debug2: channel 0: input open -> drain
debug2: channel 0: ibuf empty
debug2: channel 0: send eof
debug3: send packet: type 96
debug2: channel 0: input drain -> closed
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug3: receive packet: type 96
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
debug2: channel 0: obuf empty
debug2: channel 0: chan_shutdown_write (i3 o1 sock -1 wfd 5 efd 6 [write])
debug2: channel 0: output drain -> closed
debug3: receive packet: type 97
debug2: channel 0: rcvd close
debug3: channel 0: will not send data after close
debug2: channel 0: almost dead
debug2: channel 0: gc: notify user
debug2: channel 0: gc: user detached
debug2: channel 0: send close
debug3: send packet: type 97
debug2: channel 0: is dead
debug2: channel 0: garbage collecting
debug1: channel 0: free: client-session, nchannels 1
debug3: channel 0: status: The following connections are open:
  #0 client-session (t4 r0 i3/0 o3/0 e[write]/0 fd -1/-1/6 sock -1 cc -1)

debug3: send packet: type 1
debug3: fd 0 is not O_NONBLOCK
debug3: fd 1 is not O_NONBLOCK
Transferred: sent 4860, received 5060 bytes, in 20.6 seconds
Bytes per second: sent 235.7, received 245.4
debug1: Exit status 0

@xnoreq
Copy link
Author

xnoreq commented May 17, 2019

Btw, it works fine with sshfs 3.5.2 on a linux machine.
Here's the same logs of creating a folder:

[00192] LSTAT
  [00192]         STATUS       33bytes (15ms)
[00193] MKDIR
  [00193]         STATUS       28bytes (88ms)
[00194] LSTAT
  [00194]          ATTRS       41bytes (16ms)

Options are the same as above but without uid, gid.

@xnoreq
Copy link
Author

xnoreq commented May 18, 2019

Here's the same stack trace with the x86 binary:

[00085] MKDIR
  [00085]         STATUS       28bytes (43ms)
Exception: STATUS_ACCESS_VIOLATION at eip=61161878
eax=9E7A0008 ebx=61297A38 ecx=184A5E8E edx=00000000 esi=00000000 edi=9E7A0008
ebp=02D3CA14 esp=02D3CA08 program=C:\Program Files (x86)\SSHFS-Win\bin\sshfs.exe, pid 503, thread
cs=0023 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame     Function  Args
02D3CA14  61161878 (9E7A0008, 00000000, 61297A38, 00000000)
02D3CACC  00405F17 (00738000, 02D3CB00, 02D3CBD0, 00000000)
02D3CB84  70EC48D0 (02D3CBD0, 00000000, 02D3CBC8, 02D3CBC4)
02D3CC38  70EC531E (006DF068, 006F2058, 00000001, 00100001)
02D3CCF4  70ECF4FE (00712028, 006F2000, 00712028, 006DF068)
02D3CD14  70ED015D (006DF068, 006F2000, 00712028, 02D3CDA4)
02D3CD48  70ED23B2 (006DF068, 02D3CDA4, 70ED2200, 00000000)
006DF068  610085D2 (0044005C, 00760065, 00630069, 005C0065)
End of stack trace

@xnoreq
Copy link
Author

xnoreq commented May 18, 2019

Compiled sshfs master on cygwin with winfsp ... same crash.

@xnoreq xnoreq changed the title Extremely poor performance with latest stable, I/O error with latest beta Extremely poor performance with latest stable, I/O error, random crashes on folder creation May 19, 2019
@xnoreq
Copy link
Author

xnoreq commented May 20, 2019

@billziss-gh If there's any way to debug this I'd love to know how to figure out where it crashes and what exactly causes the crash.
As you can see from other issues this is not an isolate case but happens to a lot of other users as well.

@billziss-gh
Copy link
Collaborator

@xnoreq yes, I am aware of this problem and have seen it myself with latest SSHFS-Win beta. Unfortunately I am currently very busy with implementing low-level FUSE in WinFsp and possibly in WSLinux and have not had the chance to look at it myself.

This appears to be a problem with the SSHFS code. The way I would approach it is to compile upstream SSHFS on Cygwin (this is now possible as all necessary patches have been included in the upstream project) with debug information enabled and run it under GDB until it crashes. GDB should be able to show you a complete stack trace with symbols and local variables, which would be tremendously helpful to pinpoint the problem.

In the event that the problem is actually with the WinFsp FUSE3 layer rather than SSHFS, we may have to switch to WinDbg and/or Visual Studio. But I think the SSHFS/GDB investigation would give us a good idea where the problem actually lies.

@xnoreq
Copy link
Author

xnoreq commented May 24, 2019

As I said before I've already tried upstream sshfs on cygwin. It crashes. It says it wrote a stackdump but it doesn't.
It crashes the same with gdb. No backtrace.

I tried again with meson --buildtype debug. Same result.

Either it didn't properly compile in debug mode or the crash is in winfsp.
Can you provide builds with debug symbols?

@billziss-gh
Copy link
Collaborator

billziss-gh commented May 24, 2019

Using addr2line -f -C -e ./sshfs.exe ADDRESS I was able to extract a useful stack frame with sshfs compiled with debug symbols.

buf_add_buf     sshfs.c:615
sshfs_getattr   sshfs.c:3164
cache_getattr   cache.c:272

Setting a breakpoint at sshfs.c:3163 breaks before the failure, which happens inside buf_add_buf. I do not have the time investigate this further right now, but there may be an incompatibility in how the WinFsp-FUSE3 layer passes information to sshfs.

EDIT: line numbers are from the sshfs commit included as a git submodule to this project.

@xnoreq
Copy link
Author

xnoreq commented May 25, 2019

@billziss-gh Thanks.
I've debugged a bit.

After creating the folder, the first call to sshfs_getattr returns -EIO sshfs.c#L3152 because the fuse_file_info->fh (which is a struct sshfs_file *) seems to have been corrupted.

The second time (e.g. when clicking retry after the I/O error dialog pops up in Explorer) sshfs_file_is_conn will succeed but sf is still corrupt.
So buf_add_buf(&buf, &sf->handle) crashes because sf->handle is corrupted:
{p = 0x100000000 <error: Cannot access memory at address 0x100000000>, len = 6445641896, size = 32}

The whole backtrace:

#0  sshfs_getattr (path=0x725ad0 "/New Folder", stbuf=0x318c900, fi=0x318c8a0) at ../sshfs.c:3163
#1  0x0000000100409b9b in cache_getattr (path=0x725ad0 "/New Folder", stbuf=0x318c900, fi=0x318c8a0) at ../cache.c:272
#2  0x00007ffca2e12e2c in winfsp-x64!fuse_set_signal_handlers () from /cygdrive/c/Program Files (x86)/WinFsp/bin/winfsp-x64.dll
#3  0x00007ffca2e15e4f in winfsp-x64!fuse_invalidate () from /cygdrive/c/Program Files (x86)/WinFsp/bin/winfsp-x64.dll
#4  0x00007ffca2e16a1a in winfsp-x64!fuse_invalidate () from /cygdrive/c/Program Files (x86)/WinFsp/bin/winfsp-x64.dll
#5  0x00007ffca2e23778 in winfsp-x64!FspFileSystemOpLeave () from /cygdrive/c/Program Files (x86)/WinFsp/bin/winfsp-x64.dll
#6  0x00007ffca2e24899 in winfsp-x64!FspFileSystemOpCreate () from /cygdrive/c/Program Files (x86)/WinFsp/bin/winfsp-x64.dll
#7  0x00007ffca2e27201 in winfsp-x64!FspFileSystemRemoveMountPoint () from /cygdrive/c/Program Files (x86)/WinFsp/bin/winfsp-x64.dll
#8  0x0000000180048353 in _cygtls::call2(unsigned int (*)(void*, void*), void*, void*) () from /usr/bin/cygwin1.dll
#9  0x0000000180048404 in _cygtls::call(unsigned int (*)(void*, void*), void*) () from /usr/bin/cygwin1.dll
#10 0x00007ffcc1117974 in KERNEL32!BaseThreadInitThunk () from /cygdrive/c/WINDOWS/System32/KERNEL32.DLL
#11 0x00007ffcc3a8a271 in ntdll!RtlUserThreadStart () from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#12 0x0000000000000000 in ?? ()

So unless sshfs randomly mangles its sshfs_file structs it looks like winfsp mangles struct fuse_file_info (at least the fh field) which it passes to cache_getattr.

@xnoreq
Copy link
Author

xnoreq commented May 25, 2019

@billziss-gh
Copy link
Collaborator

The whole backtrace:

Unfortunately the backtraces that GDB prints for WinFsp appear bogus.

This line seems fishy

Thanks for tracking this. It might be that there is a subtle difference between the rules for FUSE2 vs FUSE3 in this case. FUSE2 had both getattr and fgetattr, but in FUSE3 they have been merged into a single getattr call.

@bes-internal
Copy link

I got STATUS_ACCESS_VIOLATION on create dir at windows 10 x86_64, WinFsp-2019.2 both with new sshfs-win-3.2.18213-x64 and sshfs-win-3.5.19106-x64. But not with sshfs-win-2.7.17334-x64

@billziss-gh
Copy link
Collaborator

Yes, this looks like this is a problem with the FUSE3 layer that newer SSHFS uses, but not with the FUSE2 layer that older SSHFS uses.

@temintyd
Copy link

temintyd commented Jun 29, 2019

Passing "-o workaround=fstat" fixes error 0x800703E3 #54 for me (WinFsp 2019.3 B1 + SSHFS-Win 3.5 BETA)

@billziss-gh
Copy link
Collaborator

billziss-gh commented Jul 1, 2019

@xnoreq and @temintyd thanks for the investigation and workaround respectively.

I finally did some more digging into this using the Visual Studio debugger attached to sshfs.exe together with a debug version of the WinFsp DLL. I include full steps to recreate my debugging session at the end of this post.

During directory creation a "CREATE DIRECTORY" call is being made (fsp_fuse_intf_Create with FILE_DIRECTORY_FILE). When creating a directory in Windows the semantics are to create the directory and open it. Therefore this code calls mkdir and then opendir. The fh value returned by opendir is recorded in fi.fh. Later the code calls fsp_fuse_intf_GetFileInfoFunnel, which sees a valid value for fi->fh and calls fgetattr on it. This results in an Access Violation.

There is a clear incompatibility here. WinFsp-FUSE treats values returned in fh either by open or opendir (or create) as valid values that can be passed to fgetattr. However SSHFS seems to not like that.

Digging into the SSHFS code we find sshfs_open and sshfs_opendir. Notice that fi->fh is assigned different kinds of objects in the 2 functions:

I believe the correct action here is to take this to the libfuse and sshfs maintainer @ nikratio and have a discussion with him. While I would like to argue that open and opendir should be returning fh values from the same "object space" and therefore fgetattr should work on both kinds of fh values (and I certainly design my file systems this way), the inconvenient truth is that what libfuse and sshfs does is correct by default and I will likely have to change WinFsp-FUSE to avoid calling fgetattr on an fh returned by opendir.


  • Clone the SSHFS-Win repository.
  • Build sshfs:
    cd sshfs-win/sshfs
    mkdir build
    meson --buildtype debug ..
    ninja
    
  • Clone the WinFsp repository.
  • Build WinFsp from source:
    • Use solution build\VStudio\winfsp.sln and configuration Debug|x64.
  • Copy sshfs.exe into the WinFsp build directory.
    cp sshfs-win/sshfs/build/sshfs.exe winfsp/build/VStudio/build/Debug/
    
  • Select winfsp.dll as startup project (the default) and open Project Properties > Debugging
    • Command: $(OutDir)sshfs.exe
    • Command Arguments: -f -ouid=-1,gid=-1 USER@SERVER: Y:
    • Working Directory: $(OutDir)
    • Environment: PATH=C:\Cygwin64\bin (or wherever your Cygwin lives)
  • Place a breakpoint at fsp_fuse_intf_Create.
  • Start debugging winfsp.dll.
  • Enter the USER@SERVER password.
    • You should see the message: The service sshfs has been started.
  • Open a Command Prompt window and enter:
    Y:
    mkdir foobar
    
  • The breakpoint on fsp_fuse_intf_Create should trigger.
  • Step through the function through the mkdir call and into the opendir call.
  • Record the fi.fh value returned by opendir.
  • Continue stepping through until the fsp_fuse_intf_GetFileInfoEx call. Then step into the function call.
  • You are now in fsp_fuse_intf_GetFileInfoFunnel.
  • Step through until just before the call to fgetattr.
  • Record the value of fi->fh. Notice that the value is the same as the one returned by opendir.
  • Step into fgetattr.
  • Access Violation.
    • Capture2
    • Capture1

@billziss-gh
Copy link
Collaborator

I have created experimental commit winfsp/winfsp@3a12d92 which seems to fix this problem.

@xnoreq
Copy link
Author

xnoreq commented Jul 10, 2019

Nikratio is right in that the current behavior of SSHFS is correct because FUSE guarantees:

Optionally opendir may also return an arbitrary filehandle in the fuse_file_info structure, which will be passed to readdir, releasedir and fsyncdir.

https://libfuse.github.io/doxygen/structfuse__operations.html#a1813889bc5e6e0087a936b7abe8b923f

So it doesn't need to be an actual handle to a file and it should never be passed to any other functions than readdir, releasedir and fsyncdir.

So here:
readdir:
https://github.com/billziss-gh/winfsp/blob/master/src/dll/fuse/fuse_intf.c#L1577
https://github.com/billziss-gh/winfsp/blob/master/src/dll/fuse/fuse_intf.c#L1899

releasedir:
https://github.com/billziss-gh/winfsp/blob/master/src/dll/fuse/fuse_intf.c#L953
https://github.com/billziss-gh/winfsp/blob/master/src/dll/fuse/fuse_intf.c#L1197

fsyncdir:
https://github.com/billziss-gh/winfsp/blob/master/src/dll/fuse/fuse_intf.c#L1338

are the only places where the fh set by opendir should be passed to.

I've looked at your commit. It looks like you just pass the fuse_file_info* as null whenever it's a directory.
Wouldn't it have been easier to just not set the FileHandle if we open a directory, i.e. just don't do this if its a directory:
https://github.com/billziss-gh/winfsp/blob/c97f2cb66023eccd2f87b21f5d5297ed8f1a9cd6/src/dll/fuse/fuse_intf.c#L1056

And lastly, wouldn't it be more correct to add a "DirHandle" to struct fsp_fuse_file_desc and pass that as fh to readdir, releasedir and fsyncdir if we're dealing with a directory?

@billziss-gh
Copy link
Collaborator

@xnoreq

Nikratio is right in that the current behavior of SSHFS is correct because FUSE guarantees:

Optionally opendir may also return an arbitrary filehandle in the fuse_file_info structure, which will be passed to readdir, releasedir and fsyncdir.

I agree. It looks like I misunderstood how the fh field is supposed to be used for directories.

And lastly, wouldn't it be more correct to add a "DirHandle" to struct fsp_fuse_file_desc and pass that as fh to readdir, releasedir and fsyncdir if we're dealing with a directory?

The existing code already had FileHandle and IsDirectory fields in fsp_fuse_file_desc, so I found it easier to make the change (hack) that I made. I agree that introducing a DirHandle would have worked as well, and might lead to better code. OTOH it might require code changes in more places and I am averse to making large code changes when I am fixing bugs. Still I may consider refactoring this code in the future.

BTW, in the next day or two I will be releasing a new beta of WinFsp with this fix and other improvements, so this issue should be finally resolved. Thank you for your long patience.

@xnoreq
Copy link
Author

xnoreq commented Jul 11, 2019 via email

@billziss-gh
Copy link
Collaborator

The SSHFS-Win Beta was actually crashing for me on mkdir (but not on any other operations). The stable SSHFS-Win releases never crashed on me (or anyone else AFAIK).

Unfortunately this (WinFsp) problem was not caught by the test suites, likely because my FUSE test file systems are always written in such a way that open and opendir return file handles from the same "set / space" and can therefore be used with fgetattr.

In any case I have released the latest WinFsp beta. Give it a try if you want and let me know.

https://github.com/billziss-gh/winfsp/releases

I am closing this, but please reopen if the problem persists.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants