
SFTP: Downloading Large Files Hangs / Stalls #926

Open
joshuamcginnis opened this issue Mar 23, 2017 · 23 comments


@joshuamcginnis joshuamcginnis commented Mar 23, 2017

I'm using paramiko-2.1.2 and I'm running into an issue where Paramiko appears to hang or stall when downloading a large file (in my case a 4 GB zip file) after downloading only 3 MB of the file.

I've seen related issues on Stack Overflow and here, but I have yet to see a clear resolution.

self.__transport = paramiko.Transport((hostname, self.PORT))
self.__transport.connect(username=username, password=password)
self.client = paramiko.SFTPClient.from_transport(self.__transport)

self.client.get(remote_file_path, local_file_path, callback=progress)
DEB [20170323-12:28:40.307] thr=1   paramiko.transport: starting thread (client mode): 0x76d2910L
DEB [20170323-12:28:40.308] thr=1   paramiko.transport: Local version/idstring: SSH-2.0-paramiko_2.1.2
DEB [20170323-12:28:40.468] thr=1   paramiko.transport: Remote version/idstring: SSH-2.0-1.82_sshlib REMOTE SFTP
INF [20170323-12:28:40.469] thr=1   paramiko.transport: Connected (version 2.0, client 1.82_sshlib)
DEB [20170323-12:28:40.625] thr=1   paramiko.transport: kex algos:[u'diffie-hellman-group14-sha1', u'diffie-hellman-group-exchange-sha1', u'diffie-hellman-group1-sha1'] server key:[u'ssh-rsa'] client encrypt:[u'twofish256-cbc', u'twofish-cbc', u'twofish128-cbc', u'blowfish-cbc', u'3des-cbc', u'arcfour', u'cast128-cbc', u'aes256-cbc', u'aes128-cbc', u'aes256-ctr', u'aes128-ctr'] server encrypt:[u'twofish256-cbc', u'twofish-cbc', u'twofish128-cbc', u'blowfish-cbc', u'3des-cbc', u'arcfour', u'cast128-cbc', u'aes256-cbc', u'aes128-cbc', u'aes256-ctr', u'aes128-ctr'] client mac:[u'hmac-sha1', u'hmac-md5', u'hmac-sha1-96', u'hmac-md5-96'] server mac:[u'hmac-sha1', u'hmac-md5', u'hmac-sha1-96', u'hmac-md5-96'] client compress:[u'zlib', u'none'] server compress:[u'zlib', u'none'] client lang:[u''] server lang:[u''] kex follows?False
DEB [20170323-12:28:40.625] thr=1   paramiko.transport: Kex agreed: diffie-hellman-group1-sha1
DEB [20170323-12:28:40.625] thr=1   paramiko.transport: Cipher agreed: aes128-ctr
DEB [20170323-12:28:40.625] thr=1   paramiko.transport: MAC agreed: hmac-md5
DEB [20170323-12:28:40.626] thr=1   paramiko.transport: Compression agreed: none
DEB [20170323-12:28:41.035] thr=1   paramiko.transport: kex engine KexGroup1 specified hash_algo <built-in function openssl_sha1>
DEB [20170323-12:28:41.036] thr=1   paramiko.transport: Switch to new keys ...
DEB [20170323-12:28:41.038] thr=2   paramiko.transport: Attempting password auth...
DEB [20170323-12:28:41.416] thr=1   paramiko.transport: userauth is OK
INF [20170323-12:28:41.417] thr=1   paramiko.transport: Auth banner: EFT Server Enterprise 7.1.3.5
INF [20170323-12:28:41.698] thr=1   paramiko.transport: Authentication (password) successful!
DEB [20170323-12:28:41.722] thr=2   paramiko.transport: [chan 0] Max packet in: 32768 bytes
DEB [20170323-12:28:41.882] thr=1   paramiko.transport: [chan 0] Max packet out: 35840 bytes
DEB [20170323-12:28:41.882] thr=1   paramiko.transport: Secsh channel 0 opened.
DEB [20170323-12:28:42.058] thr=1   paramiko.transport: [chan 0] Sesch channel 0 request ok
INF [20170323-12:28:42.218] thr=2   paramiko.transport.sftp: [chan 0] Opened sftp connection (server version 3)
DEB [20170323-12:28:42.219] thr=2   paramiko.transport.sftp: [chan 0] stat('FOLDER/large_file.zip')
DEB [20170323-12:28:42.394] thr=2   paramiko.transport.sftp: [chan 0] open('FOLDER/large_file.zip', 'rb')
DEB [20170323-12:28:42.570] thr=2   paramiko.transport.sftp: [chan 0] open('FOLDER/large_file.zip', 'rb') -> 31
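For context, the `progress` callback passed to `.get()` above is invoked by paramiko with the running byte count and the total file size. A minimal sketch (the function body here is illustrative, not the OP's actual code) might look like:

```python
def progress(bytes_transferred, total_bytes):
    # paramiko calls this with (bytes so far, total size) as the transfer proceeds
    pct = 100.0 * bytes_transferred / total_bytes if total_bytes else 0.0
    print(f"\r{bytes_transferred}/{total_bytes} bytes ({pct:.1f}%)", end="")
    return pct
```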

The only way I've been able to resolve this is by forgoing the use of .get and use open with shutils to copy the remote_file to local disk. I find this to be really slow.

with self.client.open(remote_file_path, 'rb') as remote_file:
    with open(local_file_path, 'wb') as local_file:
        shutil.copyfileobj(remote_file, local_file)
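If you go this route, passing an explicit buffer size to `copyfileobj` (its default buffer is small) may reduce the number of SFTP round trips. A sketch, with a hypothetical 4 MB chunk size to tune for your link:

```python
import shutil

CHUNK = 4 * 1024 * 1024  # hypothetical 4 MB buffer; tune for your connection

def copy_in_chunks(src, dst, chunk=CHUNK):
    # src/dst are any file-like objects, e.g. a paramiko SFTPFile and a local file
    shutil.copyfileobj(src, dst, chunk)

# usage sketch:
# with self.client.open(remote_file_path, 'rb') as rf, open(local_file_path, 'wb') as lf:
#     copy_in_chunks(rf, lf)
```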

Any ideas here?


@bitprophet bitprophet commented Mar 24, 2017

I feel like this has come up a lot before; if you search for SFTP you'll likely find a few, e.g. #462. Going to mark this as a duplicate of those, but thanks for the report and details! If you can't find another ticket that exhibits the same behavior (even the "stalls around a given number of bytes transferred" angle sounds familiar, tbh) I might reopen this one.

@bitprophet bitprophet closed this Mar 24, 2017
@bitprophet bitprophet added the SFTP label Mar 24, 2017

@joshuamcginnis joshuamcginnis commented Mar 24, 2017

@bitprophet

I mentioned:

I've seen related issues in Stackoverflow and here, but I've yet to see a clear resolution.

#462 is a different issue than mine:

  • My issue is not sporadic, whereas the OP's is
  • My issue is not in authentication, but in file transfer

If there is an existing open issue that is the same as this one, then I'm happy to close this and add my details there. But every issue I've seen regarding this problem has ended with no resolution.


@dordadush dordadush commented Dec 17, 2017

I'm experiencing the exact same issue using paramiko-2.4.0.

DEB [20171217-14:31:59.473] thr=1   paramiko.transport: Switch to new keys ...
DEB [20171217-14:31:59.474] thr=2   paramiko.transport: Attempting password auth...
DEB [20171217-14:31:59.677] thr=1   paramiko.transport: userauth is OK
INF [20171217-14:31:59.684] thr=1   paramiko.transport: Authentication (password) successful!
DEB [20171217-14:31:59.690] thr=2   paramiko.transport: [chan 0] Max packet in: 32768 bytes
DEB [20171217-14:31:59.693] thr=1   paramiko.transport: [chan 0] Max packet out: 35840 bytes
DEB [20171217-14:31:59.693] thr=1   paramiko.transport: Secsh channel 0 opened.
DEB [20171217-14:31:59.697] thr=1   paramiko.transport: [chan 0] Sesch channel 0 request ok
INF [20171217-14:31:59.700] thr=2   paramiko.transport.sftp: [chan 0] Opened sftp connection (server version 3)
DEB [20171217-14:31:59.700] thr=2   paramiko.transport.sftp: [chan 0] stat('path_to/big_file.zip')
DEB [20171217-14:31:59.703] thr=2   paramiko.transport.sftp: [chan 0] open('path_to/big_file.zip', 'rb')
DEB [20171217-14:31:59.711] thr=2   paramiko.transport.sftp: [chan 0] open('path_to/big_file.zip', 'rb') -> 31

Is there any fix for this problem?

I'd really appreciate any workaround or fix.

Thanks


@billcrook billcrook commented Dec 19, 2017

Ditto, I'm having the same issue.


@SamuelRamond SamuelRamond commented Dec 20, 2017

We have the same issue; we had to use the open + shutil.copyfileobj combo as well.


@billcrook billcrook commented Dec 20, 2017

That was way too slow. Since I can't wait for this issue, I abandoned paramiko and call sftp directly via pexpect:

child = pexpect.spawn(
    f'/usr/bin/sftp -r -P {port} {username}@{host}:{remote_path} {local_path}',
    timeout=14400)

Followed by a bunch of expects. If anyone is interested in the full working code DM me.
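For non-interactive setups (key-based auth, so there is no password prompt to expect), a plain subprocess call can do the same job. This is a sketch with a hypothetical helper name, not @billcrook's actual code:

```python
import subprocess

def build_sftp_cmd(port, username, host, remote_path, local_path):
    # hypothetical helper mirroring the pexpect invocation above
    return [
        "/usr/bin/sftp", "-r", "-P", str(port),
        f"{username}@{host}:{remote_path}", local_path,
    ]

# usage sketch (assumes key-based auth so no interactive prompt appears):
# subprocess.run(build_sftp_cmd(22, "user", "example.com", "/data/big.zip", "."),
#                check=True, timeout=14400)
```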


@mcmanustfj mcmanustfj commented Jan 25, 2018

@billcrook I'm interested, but GitHub hasn't had DMs for a few years.


@zachliu zachliu commented Apr 4, 2018

@billcrook I'm interested too.



@zachliu zachliu commented Apr 12, 2018

https://stackoverflow.com/a/18969354/8529250
Why don't we just make the constant MAX_REQUEST_SIZE smaller?
Or we could make it dynamic based on the file size.
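To see why that constant matters: as I understand the linked answer, `.get()` prefetches the whole file by issuing one read request per `MAX_REQUEST_SIZE` chunk up front, so a large file floods the server with requests. A rough back-of-the-envelope calculation (32768 is paramiko's value at the time of these reports):

```python
MAX_REQUEST_SIZE = 32768   # paramiko's per-read chunk size
file_size = 4 * 2 ** 30    # the OP's ~4 GB zip

# approximate number of read requests prefetch() queues up front
outstanding = file_size // MAX_REQUEST_SIZE
print(outstanding)  # → 131072
```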


@lapygithub lapygithub commented Dec 17, 2018

.put() (after authentication) seems to hang in a very similar way with pysftp 0.2.9 and paramiko 2.4.2.
Trying to write to a brand-new AWS Transfer SFTP server.
A 1 MB file works great; anything larger than 3.2 MB hangs with the log and code below.
I'm interested in a fix...

DEB [20181217-11:41:29.095] thr=2   paramiko.transport: [chan 0] Max packet in: 32768 bytes
DEB [20181217-11:41:29.380] thr=1   paramiko.transport: Received global request "hostkeys-00@openssh.com"
DEB [20181217-11:41:29.380] thr=1   paramiko.transport: Rejecting "hostkeys-00@openssh.com" global request from server.
DEB [20181217-11:41:29.381] thr=1   paramiko.transport: Debug msg: /apollo/env/NeccoSftpServer/bin/get-user-config:8: key options: agent-forwarding port-forwarding pty user-rc x11-forwarding
DEB [20181217-11:41:29.625] thr=1   paramiko.transport: [chan 0] Max packet out: 32768 bytes
DEB [20181217-11:41:29.625] thr=1   paramiko.transport: Secsh channel 0 opened.
DEB [20181217-11:41:29.930] thr=1   paramiko.transport: [chan 0] Sesch channel 0 request ok
INF [20181217-11:41:31.495] thr=2   paramiko.transport.sftp: [chan 0] Opened sftp connection (server version 3)
DEB [20181217-11:41:31.503] thr=2   paramiko.transport.sftp: [chan 0] open('test_file_1gb', 'wb')
DEB [20181217-11:41:33.034] thr=2   paramiko.transport.sftp: [chan 0] open('test_file_1gb', 'wb') -> 30
import pysftp

cnopts = pysftp.CnOpts()
cnopts.hostkeys = None
cnopts.log = True
srv = pysftp.Connection(host=sftp_host,
                        username=sftp_username,
                        port=int(sftp_port),
                        private_key=private_key,
                        cnopts=cnopts)

print('SFTP cwd=', srv.getcwd())
print('SFTP logfile=', srv.logfile)

srv.put(file_name, callback=lambda x, y: printProgressDecimal(x, y))

# Closes the connection
srv.close()

@GiacomoP GiacomoP commented Dec 20, 2018

I have exactly the same issue here... Googled everywhere and tried every single thing I found, but nothing worked.


@cudmore cudmore commented Dec 25, 2018

I have the same issue. When using paramiko ssh/sftp to transfer files on the order of 1-2 GB, it is far too slow. Can someone working on paramiko just tell its users "yes, it is slow and no, it will not be fixed"?

I am abandoning paramiko and resorting to calling ssh/sftp/scp from the command line. This is unfortunate after all the hard work that went into paramiko. In the end, paramiko is not useful for many real-world uses.


@armona armona commented Dec 25, 2018

I also encountered the same issue and had to resort to calling sftp via the shell as well.


@svanscho svanscho commented Jun 11, 2019

Why is it slow? Would C bindings be of any use? I like to use smart_open since I can use stream interfaces for accessing the data; it uses paramiko under the hood. Its performance, however, makes it too slow for our usage (copying 30 GB+ files).


@macrovve macrovve commented Aug 15, 2019

Increasing MAX_PACKET_SIZE and WINDOW_SIZE would help:

MAX_TRANSFER_SIZE = 2 ** 30
with paramiko.Transport((host, port)) as transport:
    transport.connect(username=user, password=password)
    with paramiko.SFTPClient.from_transport(transport, window_size=MAX_TRANSFER_SIZE, max_packet_size=MAX_TRANSFER_SIZE) as sftp:
        sftp.put(local_path, remote_path)

@BedivereZero BedivereZero commented Oct 15, 2019

Increasing MAX_PACKET_SIZE and WINDOW_SIZE would help:

MAX_TRANSFER_SIZE = 2 ** 30
with paramiko.Transport((host, port)) as transport:
    transport.connect(username=user, password=password)
    with paramiko.SFTPClient.from_transport(transport, window_size=MAX_TRANSFER_SIZE, max_packet_size=MAX_TRANSFER_SIZE) as sftp:
        sftp.put(local_path, remote_path)

still only 4 MB/s


@macrovve macrovve commented Oct 15, 2019

Increasing MAX_PACKET_SIZE and WINDOW_SIZE would help:

MAX_TRANSFER_SIZE = 2 ** 30
with paramiko.Transport((host, port)) as transport:
    transport.connect(username=user, password=password)
    with paramiko.SFTPClient.from_transport(transport, window_size=MAX_TRANSFER_SIZE, max_packet_size=MAX_TRANSFER_SIZE) as sftp:
        sftp.put(local_path, remote_path)

still only 4 MB/s

It will allow you to download a large file; it won't increase the download speed.


@vznncv vznncv commented Jan 17, 2020

I got the same error with paramiko-2.4.2 (downloading 7 GB+ files). MAX_PACKET_SIZE and WINDOW_SIZE changes didn't help me. But some monkey patching that limits the number of concurrent requests (the default paramiko implementation sends read requests for the whole file without any limit) fixed my problem.
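The idea behind that patch can also be approximated without touching any private API, by reading the remote file sequentially so that only one request is outstanding at a time. A hypothetical sketch (function name mine; slower than `get()`, but it bounds the in-flight requests):

```python
def sequential_get(sftp, remote_path, local_path, chunk=1024 * 1024):
    # one bounded read at a time instead of get()'s whole-file prefetch
    with sftp.open(remote_path, "rb") as rf, open(local_path, "wb") as lf:
        while True:
            data = rf.read(chunk)
            if not data:
                break
            lf.write(data)
```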


@lapygithub lapygithub commented Jan 17, 2020

@vznncv


@vznncv vznncv commented Jan 18, 2020

Hi @lapygithub

Yes, I have published this workaround as a GitHub gist:
https://gist.github.com/vznncv/cb454c21d901438cc228916fbe6f070f

Notes: I have rewritten it as a separate function and tested it with the latest paramiko version (2.7.1). But the implementation relies on some private paramiko APIs, so it may break if paramiko's SFTP implementation changes.


@jr14marquez jr14marquez commented Feb 7, 2020

@vznncv Stumbled upon the gist you published for the monkey patch and it's great! How would I go about adding a retry to your script? I have a 600 GB file that I need to download, and it got to about 550 GB when I received a socket exception (connection timed out) along with an EOFError and "Server connection dropped". It looks like the connection was down for about 15 minutes. Is there a way to add a retry without having to redownload the full file?
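One way to resume without redownloading (not from the gist; a hypothetical sketch, assuming the remote file is unchanged between attempts) is to seek past the bytes already on disk and append; a caller would wrap this in a retry loop:

```python
import os

def resume_get(sftp, remote_path, local_path, chunk=1024 * 1024):
    # append to whatever was already downloaded instead of starting over
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    with sftp.open(remote_path, "rb") as rf, open(local_path, "ab") as lf:
        rf.seek(offset)
        while True:
            data = rf.read(chunk)
            if not data:
                break
            lf.write(data)
```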
