EOFError when getting a big file (900MB) via SFTP. #151
Comments
I think this might be related to issue #124 in which I'm also accessing a Windows hosted SFTP server (GlobalSCAPE) and downloading pretty large files (hundreds of MBs to tens of GBs). Do you have access to the SFTP server itself as well as its logs? That's the one thing I don't have access to with our current vendor. Otherwise, all of the symptoms you mention above match what we're seeing. |
Yes, it's possibly related. There are also at least two posts on StackOverflow that are probably related.
About the SFTP server logs, it could be possible for me to get access to it, but not easily. But the admin running the server told me that there's nothing abnormal in the logs? |
I'm in a similar situation. We can get access to the logs, but the admin in charge is rather difficult to deal with and will most likely intentionally ignore our tickets. To add insult to injury, while we have managed to get logs from them before, GlobalSCAPE is a multi-protocol solution, so it mangles the logs into a standard "FTP-ish" format before saving them. As a result, any information specific to SFTP is lost. I don't think they'd give me the time of day if I asked for debug logs (if the product even supports them). We were also told that there's "nothing abnormal" about our logs, but when they actually gave them to us, the logs indicated that multiple download attempts finished successfully, each after transferring differing numbers of bytes (far smaller than the actual size of the file) for the same file. Additionally, they never recorded the disconnection. If you can get a few chunks of log to scrutinize yourself, it might be more helpful. If you can get protocol-specific logs, even better. I'm not expecting a lot from these Windows SFTP server solutions though. |
Was any traction made with this? I am hitting the same roadblock with a file over 1GB in size. |
Unfortunately not. I've been using Perl + Net::SFTP::Foreign as a replacement for the time being. Mainly because it piggybacks off the openssh binary which has proven to be far more reliable. |
Any updates on this? Running into the same error |
Not that I'm aware of. I assume the current maintainer mostly uses it for his deployment solution (fabric), so issues with large files on platforms he doesn't (officially?) support anyway don't seem to attract much interest. It's also a pain to reproduce this bug without an appropriate file and server combination, and nobody has come forward to try to troubleshoot it, so I've just avoided using paramiko since it can't be relied on for my purposes. |
Any suggestions on other modules I can use instead of paramiko? |
for large files ie |
@kaorihinata is mostly right, though I do my best to look at things from a "pure" paramiko standpoint (i.e. I won't ignore an issue just because it doesn't impact Fabric, even if Fabric-related issues do get more love). The problem here is much more the difficulty in reproducing & the nonstandard platform :( Always open to merging patches that users say "this fixes my problem X" and which can be proven to not break eg POSIX platforms, but this ticket's not at that stage yet unless I'm missing something. |
To be honest, I'd probably blame the vendor for their loose interpretation of the standard and dubious definition of "production ready" when it comes to code. It may be the case that OpenSSH works with these servers due to workarounds for broken servers. If I come across the issue again, I will try to determine the cause. |
@kaorihinata so what do you currently use to transfer files > 1GB? Fabric? If so, can you point me to a link where I can find sample code to implement a similar get operation using it? |
@rsheshadri I wasn't required to use Python as long as I had a working solution and since the script was pretty simple, I switched to using Net::SFTP::Foreign with Perl. It uses the OpenSSH client on the backend so compatibility is the best you're going to get. |
Running into this issue with a significantly smaller file too. I've tracked it down to the fact that the max packet_size/window_size are very small after connecting (4096 and 32759). If I manually override these values I can get to 2.1 MB before the upload stalls. This only occurs on a handful of remotes. The only way of overriding that worked so far was:
@bitprophet would love to work with you on debugging this. |
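For illustration only, a minimal sketch of that kind of override, applied to the transport before the SFTP channel is opened (host, credentials and the chosen sizes are placeholders, not the values from the comment above):

```python
import paramiko

transport = paramiko.Transport(("sftp.example.com", 22))  # placeholder host
# Raise the defaults used when the SFTP channel is opened; values are illustrative.
transport.default_window_size = 2097152      # 2 MiB window
transport.default_max_packet_size = 32768    # 32 KiB packets
transport.connect(username="user", password="secret")
sftp = paramiko.SFTPClient.from_transport(transport)
print(sftp.listdir("."))
transport.close()
```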
I recently ran into a similar problem when downloading files larger than 100MB from a CrushSFTP server. What happened was that the CrushSFTP server closed the socket as soon as paramiko requested a packet beyond the file size. This is actually what happens in that code path. The following changes in the code solved the problem for me:
Furthermore, in SFTPClient.getfo, an extra break condition so that read is not called again:
|
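The patched code itself isn't shown here, but the break condition described above reads roughly like this sketch (function and variable names are illustrative): stop issuing read requests once the reported file size has been transferred, rather than relying on a final zero-length read that some servers answer by dropping the connection.

```python
def copy_exact(remote_file, local_handle, file_size, chunk_size=32768):
    """Copy exactly file_size bytes, never requesting past the end of the file."""
    transferred = 0
    while transferred < file_size:
        data = remote_file.read(min(chunk_size, file_size - transferred))
        if not data:  # server closed early or the file shrank
            break
        local_handle.write(data)
        transferred += len(data)
    return transferred
```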
@horida can you send them as pull requests and I can look into them and see if I can figure out why it's misbehaving? (That pull request probably won't be merged, but I'll know where in the code the issues are.) |
It would be great to see a review of pull request #564 and to create a full fix for this issue. |
I am getting this issue communicating with an OpenSSH-6.2 server btw, so the "Nonstandard platforms" tag isn't relevant to me.
|
Has this been fixed? I'm getting the same error. @rsheshadri did you find a good workaround (using Python)? |
This is becoming more and more of a problem with duplicity and bigger backups. Luckily duplicity has a "legacy" ssh backend that can be used successfully instead of the paramiko one. |
The proposed patch in #564 did not work for me. Talking to a wheezy openssh server. |
FYI: I actually ended up implementing a combo of the normal |
FYI, I abandoned Paramiko and am sending things successfully using the system scp executable. |
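For reference, shelling out to the system scp binary from Python can be as small as the sketch below (host and paths are placeholders):

```python
import subprocess

# Let the OpenSSH scp binary handle the transfer instead of paramiko.
subprocess.run(
    ["scp", "user@sftp.example.com:/remote/big_file.zip", "/tmp/big_file.zip"],
    check=True,
)
```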
Any solution to this problem yet? I am still facing this error when downloading files over 500MB. |
I'm unaware of any solution to this. It's also a bit difficult to reproduce with any consistency. It would make it a lot easier to solve if someone was able to reproduce it consistently, and someone else attached to the ticket was able to confirm the method. In my case it was random, so it was difficult to pin down what was happening. |
Run into the same problem with 600MB files. Is there any proper fix for this? |
My workarounds: https://stackoverflow.com/a/48170689/501765 |
```python
with paramiko.Transport((server, 22)) as transport:
    # SFTP FIXES
    transport.default_window_size = paramiko.common.MAX_WINDOW_SIZE // 2
    # transport.default_max_packet_size = paramiko.common.MAX_WINDOW_SIZE
    transport.packetizer.REKEY_BYTES = pow(2, 40)    # 1TB max, this is a security degradation!
    transport.packetizer.REKEY_PACKETS = pow(2, 40)  # 1TB max, this is a security degradation!
    # / SFTP FIXES
    transport.connect(username=username, password=pw)
    with paramiko.SFTPClient.from_transport(transport) as sftp:
        sftp.get_channel().in_window_size = 2097152
        sftp.get_channel().out_window_size = 2097152
        sftp.get_channel().in_max_packet_size = 2097152
        sftp.get_channel().out_max_packet_size = 2097152
        files = sftp.listdir()
        files = list(filter(lambda x: x.endswith(".zip"), files))
        print(files)
        if len(files) > 2:
            for f in files:
                target = str(dst / f)
                print(f"Downloading {f} to {target}")
                sftp.get(f, target)
            for f in files:
                sftp.remove(f)
```

This fixes it for me for files > 600MB (not sure what exactly I did there but it works ¯\_(ツ)_/¯) |
This fix works for us. Now we're able to download that stupid file of 3.2Gb. |
Where does one add this piece when using pysftp, for example, which uses paramiko? |
From RFC4253: "It is RECOMMENDED that the keys be changed after each gigabyte of transmitted data or after each hour of connection time, whichever comes sooner."
It seems that the right fix here is perhaps to adjust (increase) the data and/or time limits for rekeying, but also to fix whatever bug is causing the connection to drop when it happens? |
Has anyone been able to track down the issue with this? I have been able to successfully recreate the error repeatedly during a school project attempting to write a Python script to brute-force SSH. Looking at the packetizer docs for Paramiko packet handling, I notice that write_all is not included, but there is an EOF error for read_all. All the above Stack Overflow links seem to discuss issues with file size. Traceback:
This errors out every time at that exact spot in the file (line 7), no matter what is written there. Code below:
|
It's not documented AFAIK, but this will work for pysftp in a pinch:
|
We have been using the following code for quite a long time and it worked for the majority of large files, up to 10GB. Today we started seeing errors when downloading 21GB files. Is there any way to fix it? What's the biggest file you are able to download with paramiko? Env: Error:
Code: `tr = client.get_transport()` |
Any update on this? If this can't be resolved with paramiko, we have to look for a different solution. Has anyone been able to download files larger than 10GB with paramiko? What's the largest file that has been downloaded or tested with paramiko? If there is an alternative solution, please let me know. |
I noticed that using `prefetch=False` avoids the issue. With a larger, configurable request size, the sequential download becomes a viable alternative. For those in need of a short-term solution:

```python
from typing import BinaryIO, Callable

from paramiko import SFTPClient, SFTPFile


def paramiko_sftp_get(
        sftp_client: SFTPClient,
        sftp_file: str,
        local_file: str,
        callback: Callable[[int, int], None],
        max_request_size: int = 2 ** 20) -> int:
    """
    A copy of paramiko's sftp.get() function that allows for sequential download
    of large chunks.

    This is a workaround for https://github.com/paramiko/paramiko/issues/151.
    The issue does not occur when prefetch=False (i.e. sequential download), indicating
    that there seems to be an error with the parallel approach. However, the sequential
    version in paramiko does not allow a customizable request size, and instead hardcodes
    a small value that is known to work with many SFTP implementations.
    With the possibility of large chunks, the sequential download's RTT overhead becomes
    less of a pain and a viable alternative.

    :param sftp_client: Paramiko's SFTPClient.
    :param sftp_file: The remote file in sftp.
    :param local_file: The local file.
    :param callback: A function that is invoked on every chunk.
    :param max_request_size: The max request size, defaults to 2**20.
    :return: The size of the file in bytes.
    """
    with open(local_file, "wb") as local_handle:
        file_size = sftp_client.stat(sftp_file).st_size
        assert file_size is not None
        with sftp_client.open(sftp_file, "rb") as remote_handle:
            paramiko_transfer_with_callback(
                remote_handle,
                local_handle,
                file_size,
                callback,
                max_request_size
            )
    return file_size


def paramiko_transfer_with_callback(
        reader: SFTPFile,
        writer: BinaryIO,
        file_size: int,
        callback: Callable[[int, int], None],
        max_request_size: int):
    """
    A copy of paramiko's sftp_client._transfer_with_callback with max_request_size support.

    :param reader: The reader file handle.
    :param writer: The writer file handle.
    :param file_size: The size of the file to be downloaded.
    :param callback: A function that is invoked on every chunk.
    :param max_request_size: The max request size, defaults to 2**20.
    """
    size = 0
    while True:
        remaining = file_size - size
        chunk = min(max_request_size, remaining)
        data = reader.read(chunk)
        writer.write(data)
        size += len(data)
        if len(data) == 0:
            break
        if callback is not None:
            callback(size, file_size)
    assert size == file_size
```

I believe this value, along with `SFTPFile.MAX_REQUEST_SIZE`, could be made configurable in paramiko itself. |
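Hypothetical usage of the helper above (host, credentials and paths are placeholders):

```python
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("sftp.example.com", username="user", password="secret")
sftp = client.open_sftp()

def progress(done: int, total: int) -> None:
    print(f"{done}/{total} bytes")

paramiko_sftp_get(sftp, "/remote/big_file.zip", "/tmp/big_file.zip", progress)
client.close()
```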
We're going to merge #2058 which will likely solve this; please open new issues after the next feature release (should be Paramiko 3.3) if you continue to reproduce the issue. Thanks! |
Thanks, @bitprophet ! When should we expect to see Paramiko 3.3 on PyPi? |
My guess is it will be sometime in the next couple of months, but it could be longer. I believe bitprophet wants to give time for the experimental key/auth features in v3.2 to work their way out into the wild & have some feedback come in before taking the next step on those in v3.3. Plus, it's all dependent on the pace bitprophet can reach with his open source allocation, given that his time is split among multiple projects. |
I'm trying to use duplicity to back up my server to a CrushFTP server (Windows) over SFTP, but it always drops the connection when getting some big files.

So I tried to get a problematic file with paramiko directly and got the same error, "Server connection dropped", meaning an `EOFError` was raised (bytes received == 0). I put a print statement in the `_read_all` method to show how many bytes were received. On one thread it gets rapidly to 32777 and then stays there. Another thread goes up to 7049, then back to 0, and then never more than 200. After a long, long time (many minutes) I receive the `EOFError`.

If I try to get the same file with `lftp` it works without any problems.

Using paramiko 1.10.0, Python 2.6.5 on Ubuntu Server 10.04 LTS amd64.
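A minimal sketch of the call path described above (host, credentials and file name are placeholders):

```python
import paramiko

transport = paramiko.Transport(("crushftp.example.com", 22))
transport.connect(username="user", password="secret")
sftp = paramiko.SFTPClient.from_transport(transport)
# On the affected server this get() eventually fails with
# "Server connection dropped" (an EOFError), as described above.
sftp.get("big_file.bin", "/tmp/big_file.bin")
transport.close()
```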