File truncated when uploading through proxy #30

Closed
farnsy opened this issue Jan 11, 2017 · 27 comments

Comments

@farnsy

farnsy commented Jan 11, 2017

I am attempting to upload a file to an external FTP server (i.e. a third-party server that I don't control). When I do this on my dev machine it does not go through a proxy. However, from our other environments it goes through an HTTP 1.1 proxy.

I have found that when the file is uploaded through the proxy, the file is always truncated slightly. The file being uploaded is a CSV with a header, and the header includes a count of the total rows of data (not including the header), and each row in the file includes an index field, so it's very easy to see that the file is truncated. The files are ~15-16MB in size. Some of the files that are being truncated indicate that there are 206,818 records in the file, however the last line of data is record 206,223, and that line is cut off halfway.

Are there any known issues with file truncation through a proxy? Is there anything else I should be doing here? I'm transmitting using ASCII format, as that's what the third party specifies. I'll test using Binary, but will have to see if this will cause issues at the other end. Any tips appreciated.

@farnsy
Author

farnsy commented Jan 11, 2017

The following is from the squid logs, showing connection details:

11/Jan/2017:15:51:27 AUS Eastern Daylight Time.820     47 10.64.208.20 TCP_MISS/200 557 CONNECT x.y.253.145:57383 - DIRECT/x.y.253.145 -
11/Jan/2017:15:51:30 AUS Eastern Daylight Time.367     47 10.64.208.20 TCP_MISS/200 557 CONNECT x.y.253.145:55248 - DIRECT/x.y.253.145 -
11/Jan/2017:15:52:09 AUS Eastern Daylight Time.103 6153172 10.64.208.20 TCP_MISS/200 19926 CONNECT ftp.x.y.z:21 - DIRECT/x.y.253.145 -
11/Jan/2017:15:52:09 AUS Eastern Daylight Time.103 6121608 10.64.208.20 TCP_MISS/200 19902 CONNECT ftp.x.y.z:21 - DIRECT/x.y.253.145 -

I'm not an expert, but it looks a bit unusual that it's starting on a random port and then settling on port 21. I thought 21 would be the connect port and would then hand off to a random port. (again not an expert, clutching at straws...)

I've tried changing the FtpDataType to binary, but it made no difference. I also tried enabling keepalives, but no joy there either.
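On the port question above: if the client is using passive mode (an assumption, the thread does not say so), the high-numbered ports in the log would be data connections whose port the server announced in its reply to `PASV`, while port 21 stays open as the control connection for the whole session. Squid writes an access log entry when a connection finishes, which would explain why the short-lived data connections are listed before the long-lived control connection. A minimal Python sketch of how a passive-mode reply encodes the data port (the reply string and address are hypothetical):

```python
import re

def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a '227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(g) for g in match.groups())
    # The data port is sent as two bytes: high byte p1, low byte p2.
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2

# Hypothetical reply; 192.0.2.145 is a documentation address, not the real server.
host, port = parse_pasv_reply("227 Entering Passive Mode (192,0,2,145,224,39)")
print(host, port)  # 192.0.2.145 57383
```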

@robinrodricks
Owner

robinrodricks commented Jan 11, 2017

Thanks for the well documented issue. If there is an issue with a proxy that does not occur with direct connection, then the issue is not in FtpClient but somewhere in the proxy set of classes. There are 2 classes in your case:

Since I haven't authored those classes I'm a bit unsure how to fix this. I'll look into it and see if I can spot anything. Mostly I think the issue is that the command being sent is not accounting for the proxy header, and therefore truncates data.

robinrodricks changed the title from "Issue uploading through proxy - file truncated" to "File truncated when uploading through proxy" on Jan 12, 2017
@farnsy
Author

farnsy commented Jan 25, 2017

I just wondered if you've had a chance to have a look at this? I had a look but couldn't see anything obvious in the proxy implementation. Otherwise I'll probably have to look at another FTP implementation if possible.

@robinrodricks
Owner

I tried to look at it, but as you said, nothing obvious. I was esp. looking for a part of the code involving uploading of files and the length measurement. I could not find that and since I did not implement the proxy I don't know what else to do.

@Zoltan666 @Cocotus @elmar69 @zharris6 - Do you guys know where the problem could be? If you just give me a hint I can take a better look in that region...

@robinrodricks
Owner

robinrodricks commented Jan 26, 2017

Firstly, can you try the new UploadFile() API, and secondly if that does not make any difference, can you lower the TransferChunkSize and then using UploadFile() to see if that fixes your problem? Try with chunk sizes of 65 KB, 32 KB, 16 KB and so on to see which works best for you.

I've just published a version to nuget which supports modifying the TransferChunkSize : https://www.nuget.org/packages/FluentFTP/16.0.19
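As an illustration of what a transfer chunk size controls, here is a minimal Python sketch of a chunked stream copy. It is not FluentFTP's implementation, and `copy_in_chunks` is a hypothetical helper. It assumes the chunk size is expressed in bytes, so 32 KB would be 32 * 1024 = 32768 rather than 32.

```python
import io

def copy_in_chunks(source, destination, chunk_size: int) -> int:
    """Copy source to destination chunk_size bytes at a time; return bytes copied."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive number of bytes")
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:          # an empty read means end of stream
            break
        destination.write(chunk)
        total += len(chunk)
    destination.flush()        # push any buffered tail before the stream is closed
    return total

payload = b"x" * 100_000
for kb in (32, 16, 8):
    out = io.BytesIO()
    copied = copy_in_chunks(io.BytesIO(payload), out, kb * 1024)
    assert copied == len(payload) and out.getvalue() == payload
```

With a correct copy loop the chunk size should only affect throughput, never the number of bytes delivered, which is why a result that varies with chunk size points at something outside the loop itself.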

@farnsy
Author

farnsy commented Jan 30, 2017

Thanks a lot for your help on this. Turns out I read your previous post a little too literally. However, it helped. Initially I tried uploading a ~16MB file using UploadFile(), which resulted in a ~14MB file. I then tried using (literally) 32, 16, 8, 4, and what I found was that the lower I went, the closer I got to the 16MB file. I then realised I was not using 32K, so I changed the chunk size to (e.g.) 16384 - this caused the uploaded file to go closer to ~14MB again. So, I finally tried a chunk size of 1 - i.e. 1 byte. It provided the best result, but still doesn't end up with the whole file. A file that starts out as 16,664,816 bytes ends up as 16,663,564 bytes. Or, in relation to the data in the file, a file that should have 212613 records only has 212596 records.

Do you have any other suggestions?

Thanks again for your help here.
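For reference, the shortfall in the 1-byte-chunk case works out as follows. This is plain arithmetic on the figures reported in this comment, written as a short Python snippet:

```python
local_bytes, remote_bytes = 16_664_816, 16_663_564
expected_records, received_records = 212_613, 212_596

missing_bytes = local_bytes - remote_bytes               # bytes that never arrived
missing_records = expected_records - received_records    # records that never arrived

print(missing_bytes, missing_records)                    # 1252 17
print(round(missing_bytes / missing_records, 1))         # 73.6 bytes per missing record
```

So roughly 1.2 KB is missing from the tail, which is under 0.01% of the file.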

@robinrodricks
Owner

robinrodricks commented Jan 30, 2017

Sounds good that you were able to get closer to the final file size. This information definitely helps me debug. Can you try the latest version and see if that works any better with the default chunk size? I have totally changed the upload/download code and so that might help you?

https://www.nuget.org/packages/FluentFTP/

@robinrodricks
Owner

As of now my working theory is that some chunks are being skipped (lost packets maybe?) and therefore the smaller chunk size you have the more likely you have a full file on the server.

@farnsy
Author

farnsy commented Jan 31, 2017

Thanks for your prompt response.

I've updated to 16.2 and tried both the default chunk size and a transferChunkSize of "1", and the results are similar to yesterday's: with the default chunk size, transferring a file of ~16MB results in a file of ~14MB; with a transferChunkSize of "1", the result is ~16MB but a bit short. Transfer of a 210336 record file results in a 210331 record file.

Just to be sure it's definitely proxy related (or most likely) I've tried the transfer from my local machine, using no proxy and default chunk size and the file was transferred correctly.

@farnsy
Author

farnsy commented Jan 31, 2017

Given your mention of lost packets, and in case it helps, the missing data is always at the end - i.e. the file has an 8 line header which includes the total number of data records in the file. The header is followed by the records. In the file I'm currently looking at, the header indicates 210336 records. The last 2 lines of the file are as below:

210330,"RN",2,"SSR","579026-3","2017-01-30T17:00:00Z",0.0,"mm","",,1,""
210331,"RN",2,"SSR","57

i.e. the last line in the file has been truncated halfway through, and the file is only 5 (or 4.5) lines short. So if "lost packets" they are always the last few packets.
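A check like the one described above can be automated. The following is a minimal Python sketch, assuming the input holds only the data records (header already stripped), that each record starts with its index, and that complete lines end with a newline. `find_tail_truncation` is a hypothetical helper, and the expected count is passed in because the header layout is not shown in this thread.

```python
def find_tail_truncation(data: bytes, expected_records: int) -> dict:
    """Report the last complete record index and whether the final line was cut off."""
    lines = data.split(b"\n")
    # After a clean final newline, split() leaves an empty last element.
    tail_is_partial = lines[-1] != b""
    complete = [ln for ln in lines[:-1] if ln.strip()]
    last_index = int(complete[-1].split(b",", 1)[0]) if complete else 0
    return {
        "last_complete_record": last_index,
        # Counts every record after the last complete one, including a partial line.
        "records_not_fully_received": expected_records - last_index,
        "tail_is_partial": tail_is_partial,
    }

# The last two lines quoted above; the second one is cut mid-field.
sample = (
    b'210330,"RN",2,"SSR","579026-3","2017-01-30T17:00:00Z",0.0,"mm","",,1,""\r\n'
    b'210331,"RN",2,"SSR","57'
)
report = find_tail_truncation(sample, expected_records=210336)
print(report)
```

For the two lines quoted above it reports record 210330 as the last complete one and flags the tail as partial.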

@robinrodricks
Owner

Wow, great info for debugging. Thanks for the extensive research! I'll see what I can do.

@BobEntwhistle
Contributor

@hgupta9 ,

This code doesn't look right to me in public virtual Stream OpenAppend(string path, FtpDataType type)

length = client.GetFileSize(path);
stream = client.OpenDataStream(string.Format("APPE {0}", path.GetFtpPath()), 0);

if (length > 0 && stream != null) {
	stream.SetLength(length);
	stream.SetPosition(length);  <======
}

The current position in the unfinished stream is almost certainly not the length of the original file.

This method is called if there's an interruption to the transfer.

@robinrodricks
Owner

@Zoltan666 - Thank you, but the author has not mentioned using OpenAppend. Even a simple UploadFile is failing for him. Only the last part is missing. UploadFile uses this code: https://github.com/hgupta9/FluentFTP/blob/master/FluentFTP/FtpClient.cs#L1910 - Does anything in that code look wrong to you?

@BobEntwhistle
Contributor

UploadFileInternal calls that method. However, I see that @farnsy wasn't using the UploadFile API when he had the original problem.

@robinrodricks
Owner

@Zoltan666 - Great observation! I'll work on a solution and send you to check.

@BobEntwhistle
Contributor

I take it back. I don't think there is a problem here.

@farnsy
Author

farnsy commented Feb 14, 2017

Hi, I just wondered if you have any other thoughts on this? I'm unable to see anywhere that can be causing this, but am happy to have a closer look at anything if you are able to give any pointers? I'll also look into whether there's any way we can use a box that doesn't need to be behind the proxy to do the public upload.

@robinrodricks
Owner

robinrodricks commented Feb 14, 2017

@farnsy - I was working on a fix but I could not find reference code and was unable to complete it. Essentially as Zoltan has pointed out there is a bug in OpenAppend, as this #46 issue also possibly indicates. Maybe fixing this bug will solve both issues at once? What makes it complicated is this:

  1. At the start of the upload we need to read the length of the file on the server, and then seek the local file and the remote file stream to that exact position
  2. During upload, I'm not sure if we need to use the length returned by the server, or blindly increment a counter and upload data from the local to the remote stream
  3. After upload, should a check be performed? I think in your case it may solve your issue of having missing bytes at the end. If yes, then we will check the server file length vs the local file length and then upload those missing bytes

Not sure how all this will fit in, ie. what the code will look like or how many loops are required, but you are free to fork FluentFTP and experiment with the OpenAppend method. I'm a bit busy this week so although I can help you do it I don't think I'll find time to do it myself.
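The three steps above can be sketched as a verify-and-resume loop. This is a language-neutral illustration in Python, not FluentFTP code: `FlakyRemote` is a hypothetical in-memory stand-in for the server file that drops the tail of the first transfer, and `upload_with_verify` is a hypothetical helper. It assumes a binary transfer, since in ASCII mode line-ending conversion can make the server-side size differ from the local size.

```python
class FlakyRemote:
    """Hypothetical in-memory 'server file' that loses the tail of the first append."""

    def __init__(self, lose_on_first_append: int):
        self.data = bytearray()
        self._lose = lose_on_first_append

    def size(self) -> int:
        return len(self.data)

    def append(self, chunk: bytes) -> None:
        kept = chunk[: max(0, len(chunk) - self._lose)]
        self._lose = 0          # only the first transfer is truncated
        self.data.extend(kept)

def upload_with_verify(local: bytes, remote, max_rounds: int = 3) -> bool:
    """Append whatever the remote is missing until sizes match or rounds run out."""
    for _ in range(max_rounds):
        offset = remote.size()          # step 1: how much does the server already have?
        if offset == len(local):        # step 3: sizes match, nothing left to send
            return True
        remote.append(local[offset:])   # step 2: send only the bytes past that offset
    return remote.size() == len(local)

local_file = bytes(range(256)) * 100
remote_file = FlakyRemote(lose_on_first_append=1252)
assert upload_with_verify(local_file, remote_file)
assert bytes(remote_file.data) == local_file
```

A matching size does not prove the content is intact; comparing a checksum would be a stronger check where the server offers one.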

@farnsy
Author

farnsy commented Feb 14, 2017 via email

@robinrodricks
Owner

robinrodricks commented Feb 22, 2017

@farnsy I've attempted to fix this in 16.2.3. Can you check and see if that works? Essentially I'm checking the file length after upload using the upStream.Position property. If that does not work I'll try using another FTP command to re-get the file length and then OpenAppend and append the missing data.

https://www.nuget.org/packages/FluentFTP/16.2.3

robinrodricks pushed a commit that referenced this issue Feb 22, 2017
@farnsy
Author

farnsy commented Feb 23, 2017

I've just attempted with the latest version, and the symptoms I am experiencing remain. With a chunk size of 1 byte, the file is still a few bytes short. With the default chunk size it is substantially short (more like 2MB). Thanks for the update.

@robinrodricks
Owner

robinrodricks commented Feb 23, 2017

Can you try this build to see if it works? https://we.tl/8yn1JA5nt9

And are you able to copy the FluentFTP files into your project so you can debug this more carefully, OR are you able to build a debug version of FluentFTP and include that DLL in your project? It's very hard for me to fix something I can't even step through, although you can.

@farnsy
Author

farnsy commented Feb 24, 2017

Thanks for the updated release, however still no joy.

I've attempted to use the linked build, and the symptoms I'm seeing are the same. With a chunksize of 1 I transferred a 210401 record file, and 210385 lines were transferred to the remote server.

The issue I have with your other points is that the application is only exhibiting the issue when used on one of our servers in a deployed environment - i.e. not in a development environment, and therefore not where development tools can be installed. I'm unable to set up my development machine to work in the same way - i.e. to go through the same proxy. I'll have a think about how I may be able to debug this further.

@dlazzarino

+1 Same issue for me

@robinrodricks
Owner

I've tried everything I can think of and it still fails. If anyone has any ideas on how to get the file length such that the missing bytes can be calculated correctly, tell me.

@robinrodricks
Owner

@farnsy There are fixes in the HTTP 1.1 proxy. Can you try the latest nuget and see if it works for you?

https://www.nuget.org/packages/FluentFTP/16.5.0

@robinrodricks
Owner

Closing this since no response received from OP, and it seems to be solved in the latest release.
