Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fstat returns error "Folder not found: c:\xxx\yyy.csv" (SSH_FX_FAILURE) on a file #288

Closed
ncw opened this issue May 7, 2019 · 18 comments · Fixed by #289
Closed

Fstat returns error "Folder not found: c:\xxx\yyy.csv" (SSH_FX_FAILURE) on a file #288

ncw opened this issue May 7, 2019 · 18 comments · Fixed by #289

Comments

@ncw
Copy link
Contributor

ncw commented May 7, 2019

An rclone user has reported that rclone with a certain sftp server reports the error "Folder not found: c:\xxx\yyy.csv" (SSH_FX_FAILURE). As far as I can work out this is being generated by the Read() call on an open file, though looking at the trace below it seems to come from Fstat on the open file - not sure what is calling that.

The user reports that other sftp clients work just fine.

I turned on debugging and sent the user a binary to try.

Here is rclone logs interleaved with sftp debugs

./rclone copyto --low-level-retries 1 --retries 1 -vv aiep:aaaaaaa_valley_hhhhh_ccp/CCP_Valley_hhhhh_190429.csv ~/Downloads/z.csv
2019/05/02 13:00:24 DEBUG : rclone: Version “v1.47.0-032-g5c0574ea-fix-sftp-folder-beta” starting with parameters ["./rclone" “copyto” “–low-level-retries” “1” “–retries” “1” “-vv” “aiep:aaaaaaa_valley_hhhhh_ccp/CCP_Valley_hhhhh_190429.csv” “/Users/auser/Downloads/z.csv”]
2019/05/02 13:00:24 DEBUG : Using config file from “/Users/auser/.config/rclone/rclone.conf”
2019/05/02 13:00:24 send packet: SSH_FXP_INIT 5 bytes 00000003
2019/05/02 13:00:24 recv packet: SSH_FXP_VERSION 5 bytes 00000003
2019/05/02 13:00:24 send packet: SSH_FXP_STAT 63 bytes 0000000100000036616c65646164655f76616c6c65795f6865616c74685f6363702f4343505f56616c6c65795f4865616c74685f3139303432392e637376
2019/05/02 13:00:24 recv packet: SSH_FXP_ATTRS 29 bytes 000000010000003d000000000024230c000081ef5cc6dc835cc6dc83
2019/05/02 13:00:24 send packet: SSH_FXP_STAT 63 bytes 0000000200000036616c65646164655f76616c6c65795f6865616c74685f6363702f4343505f56616c6c65795f4865616c74685f3139303432392e637376
2019/05/02 13:00:24 recv packet: SSH_FXP_ATTRS 29 bytes 000000020000003d000000000024230c000081ef5cc6dc835cc6dc83

Rclone has identified the file as needing to transfer

2019/05/02 13:00:24 DEBUG : CCP_Valley_hhhhh_190429.csv: Couldn’t find file - need to transfer

Now we open it

2019/05/02 13:00:25 send packet: SSH_FXP_OPEN 71 bytes 0000000300000036616c65646164655f76616c6c65795f6865616c74685f6363702f4343505f56616c6c65795f4865616c74685f3139303432392e6373760000000100000000
2019/05/02 13:00:25 recv packet: SSH_FXP_HANDLE 12 bytes 0000000300000003323535
2019/05/02 13:00:25 send packet: SSH_FXP_FSTAT 12 bytes 0000000400000003323535
2019/05/02 13:00:25 recv packet: SSH_FXP_STATUS 94 bytes 00000004000000040000004d466f6c646572206e6f7420666f756e643a20633a5c66747073697465735c706569615c76616c6c65795f6865616c74685c4343505f56616c6c65795f4865616c74685f3139303432392e63737600000000
2019/05/02 13:00:25 send packet: SSH_FXP_CLOSE 12 bytes 0000000500000003323535
2019/05/02 13:00:25 recv packet: SSH_FXP_STATUS 17 bytes 00000005000000000000000000000000

However the SSH_FXP_FSTAT returned an error. I don't know why! Rclone doesn't directly call Stat() on the handle returned - I'm not sure where that gets called.

2019/05/02 13:00:25 DEBUG : CCP_Valley_hhhhh_190429.csv: Reopening on read failure after 0 bytes: retry 1/1: sftp: “Folder not found: c:\ftpsites\aiep\valley_hhhhh\CCP_Valley_hhhhh_190429.csv” (SSH_FX_FAILURE)
2019/05/02 13:00:25 DEBUG : CCP_Valley_hhhhh_190429.csv: Reopen failed after 0 bytes read: failed to reopen: too many retries
2019/05/02 13:00:25 NOTICE: z.csv: Removing partially written file on error: sftp: “Folder not found: c:\ftpsites\aiep\valley_hhhhh\CCP_Valley_hhhhh_190429.csv” (SSH_FX_FAILURE)
2019/05/02 13:00:25 ERROR : CCP_Valley_hhhhh_190429.csv: Failed to copy: sftp: “Folder not found: c:\ftpsites\aiep\valley_hhhhh\CCP_Valley_hhhhh_190429.csv” (SSH_FX_FAILURE)
2019/05/02 13:00:25 ERROR : Attempt 1/1 failed with 2 errors and: sftp: “Folder not found: c:\ftpsites\aiep\valley_hhhhh\CCP_Valley_hhhhh_190429.csv” (SSH_FX_FAILURE)
2019/05/02 13:00:25 Failed to copyto: sftp: “Folder not found: c:\ftpsites\aiep\valley_hhhhh\CCP_Valley_hhhhh_190429.csv” (SSH_FX_FAILURE)

I attempted to decode the sftp debug using the RFC but it wasn't making sense, so I'm asking for help here.

Thanks in advance!

@ncw
Copy link
Contributor Author

ncw commented May 8, 2019

I discovered who calls the Fstat (with a bit of debugging by panic!), it is File.WriteTo in pkg/sftp

sftp/client.go

Lines 886 to 895 in 5bcd86d

func (f *File) WriteTo(w io.Writer) (int64, error) {
fi, err := f.Stat()
if err != nil {
return 0, err
}
inFlight := 0
desiredInFlight := 1
offset := f.offset
writeOffset := offset
fileSize := uint64(fi.Size())

Which is called by rclone to maximise the throughput of the download.

@eikenb
Copy link
Member

eikenb commented May 9, 2019

Hey @ncw, thanks for reporting the issue. I skimmed through the forum discussion you linked and to be sure of the situation, let me re-iterate and clarify a few points.

  1. I think they said the server is WIN_SCP. That seems like WinSCP which is a client. So I'm thinking this isn't correct. Knowing the server would be useful.
  2. This error is happening when the file isn't on the destination yet and is being uploaded.
  3. All the parent directories already exist for the file in question.
  4. In your beta branch you had the user test, was the only change that you removed the f.Stat() call from the sftp library?

@ncw
Copy link
Contributor Author

ncw commented May 12, 2019

Hey @ncw, thanks for reporting the issue. I skimmed through the forum discussion you linked and to be sure of the situation, let me re-iterate and clarify a few points.

Sure

  1. I think they said the server is WIN_SCP. That seems like WinSCP which is a client. So I'm thinking this isn't correct. Knowing the server would be useful.

In a recent beta I put a log in to discover the name of the server - I'll ask the user to run that.

  1. This error is happening when the file isn't on the destination yet and is being uploaded.

The file is being copied from the SFTP server to local disk, so it is being downloaded.

  1. All the parent directories already exist for the file in question.

Yes and the file exists too. The file can be Open-ed just fine, but running Fstat on the open read handle gives the very confusing error in the title!

  1. In your beta branch you had the user test, was the only change that you removed the f.Stat() call from the sftp library?

The workaround I attempted (which didn't work) was to use io.Copy to download the file rather than use use the WriteTo method from the sftp handle (which does call Fstat). The commit is here. However the workaround failed since io.Copy will call WriteTo if it is available which I had forgotten.

@eikenb
Copy link
Member

eikenb commented May 15, 2019

It seems like I misread parts of that thread pretty badly. Thanks @ncw for clarifying things and giving more details. Sorry it's taking me a while to look into this. I just started a new job and have been pretty busy. I'll try to spend more time on it soon.

@eikenb
Copy link
Member

eikenb commented May 16, 2019

@ncw Sorry for being obtuse, but...

All the parent directories already exist for the file in question.

Yes and the file exists too. The file can be Open-ed just fine, but running Fstat on the open read handle gives the very confusing error in the title!

The directories and file already exist on the client where it is being downloaded to, on the server where it is being downloaded from or both?

@ncw
Copy link
Contributor Author

ncw commented May 16, 2019

The directories and file already exist on the client where it is being downloaded to, on the server where it is being downloaded from or both?

The directories and files exist on the server. Rclone will create directories as needed on the client before it downloads the file.

PS While investigating this problem I had a deeper look at the source and discovered it could be an sftp server! That encouraged me to write rclone serve sftp which can server any of rclone's supported cloud providers (including local disk) as sftp. The library was a joy to work with, so thank you very much for that :-)

@caduceus313
Copy link

@eikenb @ncw The server identifies itself as WS FTP:
2019/05/16 09:30:25 DEBUG : sftp://auser@ftp.wvaiep.com:22/: New connection 192.168.1.176:65436->xxx.xx.xxx.xx:22 to "SSH-2.0-WS_FTP-SSH_7.5.1"

@jfautley
Copy link

I am seeing a very similar sounding issue, but on downloading files from an external SFTP server (over which I have no control). This is also using rclone but the issue appears to be with the behavior of the underlying sftp module.

The SFTP server in question ("SSH-2.0-SSHD-CORE-0.13.0", for those playing along at home) returns an SSH_FX_FAILURE error for the stat call. A quick-n-dirty hack was to remove the call to stat and set fileSize to f.c.maxPacket. I would suggest that if the stat call returns an error, just set the fileSize=maxPacket and let the download continue.

If this is an appropriate fix, I'm happy to raise a PR.

@ncw
Copy link
Contributor Author

ncw commented May 20, 2019

A quick-n-dirty hack was to remove the call to stat and set fileSize to f.c.maxPacket. I would suggest that if the stat call returns an error, just set the fileSize=maxPacket and let the download continue.

Interesting... Can you paste the diff?

@jfautley
Copy link

Here you go (I did say it was quick-n-hacky... 😄 )

diff --git a/vendor/github.com/pkg/sftp/client.go b/vendor/github.com/pkg/sftp/client.go
index ef885d801..0045ccd4e 100644
--- a/vendor/github.com/pkg/sftp/client.go
+++ b/vendor/github.com/pkg/sftp/client.go
@@ -884,15 +884,15 @@ func (f *File) Read(b []byte) (int, error) {
 // maximise throughput for transferring the entire file (especially
 // over high latency links).
 func (f *File) WriteTo(w io.Writer) (int64, error) {
-       fi, err := f.Stat()
-       if err != nil {
-               return 0, err
-       }
+       //fi, err := f.Stat()
+       //if err != nil {
+       //      return 0, err
+       //}
        inFlight := 0
        desiredInFlight := 1
        offset := f.offset
        writeOffset := offset
-       fileSize := uint64(fi.Size())
+        fileSize := uint64(f.c.maxPacket)
        // see comment on same line in Read() above
        ch := make(chan result, f.c.maxConcurrentRequests+1)
        type inflightRead struct {

@ncw
Copy link
Contributor Author

ncw commented May 20, 2019

Here you go (I did say it was quick-n-hacky.

Thanks! That is exactly the same bit of code that was causing the original issue.

Does this slightly less hacky patch work for you?

diff --git a/vendor/github.com/pkg/sftp/client.go b/vendor/github.com/pkg/sftp/client.go
index ef885d801..11a81b5e0 100644
--- a/vendor/github.com/pkg/sftp/client.go
+++ b/vendor/github.com/pkg/sftp/client.go
@@ -884,7 +884,7 @@ func (f *File) Read(b []byte) (int, error) {
 // maximise throughput for transferring the entire file (especially
 // over high latency links).
 func (f *File) WriteTo(w io.Writer) (int64, error) {
-	fi, err := f.Stat()
+	fi, err := f.c.Stat(f.path)
 	if err != nil {
 		return 0, err
 	}

@jfautley
Copy link

jfautley commented May 20, 2019

Does this slightly less hacky patch work for you?

Yep it does indeed (at least as well as my hacky patch did).

I'm seeing some other issues relating to copy not working (but sync is fine) but I don't think these are related to this issue specifically.
Ignore that, I was being a tool...

@ncw
Copy link
Contributor Author

ncw commented May 20, 2019

@eikenb Do you think the above patch would be acceptable if I sent a PR with it in?

I think it

  • should have the same result
  • doesn't use any more resources

However it is a work-around for what is probably a broken SFTP server. Stat is a much more used operation than FStat so using Stat may be preferable for compatibility.

@jfautley
Copy link

@ncw FYI I've tested this patch against a known-problematic WS_FTP server (same as the original rclone problem report) and it resolves the issue there, too.

@eikenb
Copy link
Member

eikenb commented May 20, 2019

While I think this should be fine I'm now thinking the fix is to just remove that Stat check entirely. None of the other read/write methods (Read, Write, ReadFrom) have that check, only WriteTo does, and I'm not sure I see any reason it needs it while the other's do not.

ncw added a commit to ncw/sftp that referenced this issue May 21, 2019
Before this change in File.WriteTo() we used Fstat to discover the
length of the file being transferred.  It appears that some SFTP
servers do not implement this properly perhaps because it is a seldom
used call.

After this change we replace the Fstat on the file handle with a Stat
of the path.  Stat is commonly used function and implemented correctly
in both the servers that had the problem with Fstat, thus working
around the problem.

The code before and after uses the same number of SFTP roundtrips so
performance should be identical.

Fixes pkg#288
@ncw
Copy link
Contributor Author

ncw commented May 21, 2019

I created a pull request for the one line fix above #289

@ncw
Copy link
Contributor Author

ncw commented May 21, 2019

@eikenb sorry missed your message!

While I think this should be fine I'm now thinking the fix is to just remove that Stat check entirely. None of the other read/write methods (Read, Write, ReadFrom) have that check, only WriteTo does, and I'm not sure I see any reason it needs it while the other's do not.

The Stat is used read the file size, which in turn in used only to control the desiredInFlight parameter two places like this

				if offset > fileSize {
					desiredInFlight = 1
				}

We'd need to know that the read had finished and turn down the window. I'm not sure how to accomplish that since there doesn't seem to be an EOF message when reading the data from the remote end.

@eikenb
Copy link
Member

eikenb commented May 23, 2019

Of course you are right about the filesize and I think your solution is fine. I'll take it for now to resolve the issue. It still bugs me that it is needed though as it doesn't really seem like it should be, it should receive an EOF and finish up just like ReadFrom. But I don't have the time to play with it right now and your PR fixes it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants