
Transfer fails at 1GB: rekey window too small, hard-coded #49

Closed

chrisegner opened this issue Oct 17, 2011 · 17 comments

Comments

@chrisegner

First off, thanks for the great work with paramiko.

At 1GB of data transferred over sftp, paramiko initiates an SSH rekey request and then sets a limit on the number of packets that may be exchanged before the remote side answers the rekey request. This limit is hard-coded to 20. If you have fat, long pipes, it's pretty easy to consistently exceed this threshold; as a consequence, paramiko cannot transfer files larger than 1GB in this situation. OpenSSH does not exhibit this behavior. I manually patched the source to change the 20 to something more reasonable based on our RTT and bandwidth, and the exceptions immediately went away.

  • What does OpenSSH do in this situation?
  • If doing what OpenSSH does is not suitable or convenient, can this be made a configurable parameter, accessible from the higher-level API (i.e. SSHClient)?
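
For readers unfamiliar with the failure mode, here is a simplified sketch (NOT paramiko's actual code; the class and method names are invented for illustration) of the behavior described above: once a rekey request goes out, incoming packets are counted, and the transport gives up as soon as a hard-coded window of 20 is exceeded.

```python
# Simplified sketch of the failure mode; not paramiko's actual code.
REKEY_PACKET_WINDOW = 20  # the hard-coded limit this issue is about

class RekeyWindow:
    def __init__(self):
        self.packets_since_request = None  # None means no rekey pending

    def request_rekey(self):
        self.packets_since_request = 0     # start counting inbound packets

    def rekey_completed(self):
        self.packets_since_request = None  # peer answered; stop counting

    def on_packet_received(self):
        if self.packets_since_request is None:
            return
        self.packets_since_request += 1
        if self.packets_since_request > REKEY_PACKET_WINDOW:
            raise Exception("Remote transport is ignoring rekey requests")
```

On a high-bandwidth, medium-latency link, far more than 20 data packets can arrive before the peer's KEXINIT reply does, so the window is exceeded even though the peer is behaving correctly.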
@chrisegner
Author

Exception info, if helpful:

Backtrace on logger:
2011-08-26 08:52:37,406 30818 paramiko.transport ERROR Exception: Remote transport is ignoring rekey requests
2011-08-26 08:52:37,407 30818 paramiko.transport ERROR Traceback (most recent call last):
2011-08-26 08:52:37,407 30818 paramiko.transport ERROR File "build/bdist.linux-x86_64/egg/paramiko/transport.py", line 1524, in run
2011-08-26 08:52:37,407 30818 paramiko.transport ERROR ptype, m = self.packetizer.read_message()
2011-08-26 08:52:37,407 30818 paramiko.transport ERROR File "build/bdist.linux-x86_64/egg/paramiko/packet.py", line 378, in read_message
2011-08-26 08:52:37,408 30818 paramiko.transport ERROR raise SSHException('Remote transport is ignoring rekey requests')
2011-08-26 08:52:37,408 30818 paramiko.transport ERROR SSHException: Remote transport is ignoring rekey requests
2011-08-26 08:52:37,408 30818 paramiko.transport ERROR

Backtrace on exception:
File "build/bdist.linux-x86_64/egg/paramiko/sftp_client.py", line 573, in put
fr.write(data)
File "build/bdist.linux-x86_64/egg/paramiko/file.py", line 314, in write
self._write_all(data)
File "build/bdist.linux-x86_64/egg/paramiko/file.py", line 435, in _write_all
count = self._write(data)
File "build/bdist.linux-x86_64/egg/paramiko/sftp_file.py", line 165, in _write
t, msg = self.sftp._read_response(req)
File "build/bdist.linux-x86_64/egg/paramiko/sftp_client.py", line 667, in _read_response
raise SSHException('Server connection dropped: %s' % (str(e),))
SSHException: Server connection dropped:

@btimby

btimby commented Oct 18, 2011

Chris, ironically, I ran into this issue today as well.

What value did you end up picking instead of 20 packets?

@chrisegner
Author

The value is entirely dependent on your situation, but basically, it goes like this:

  1. Determine your bandwidth between end points. Run scp with a large file (>1GB) to get a good average.
  2. Determine your round trip time (rtt). Run ping from one end point to the other; it reports this.
  3. Note that most of the "packets" paramiko is using are 24KB.
  4. Estimate the number of packets that go out from this side in the time it takes one rekey packet to come back from the other side. Note this is a lower bound, since it assumes the remote side puts a rekey packet out on the wire the moment it receives one.
    num_packets = (bandwidth_MB * 1024 / 24) * (rtt_msec / 1000) / 2
    The factor of two gets it down from a round trip to a one-way trip time. The 1024 converts from MB (which bandwidth is in) to KB, which the packet size (24) is in. The 1000 converts from milliseconds (which ping reports for rtt) into seconds, which everything else is in.
  5. Take num_packets and multiply it by some safety margin, at least 2 or 3x. I believe the downside here is that paramiko will buffer more packets the larger the number is, but it's been a while since I read the code. Unless you're heavily memory constrained, you can get away with a large number here.
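
The steps above can be wrapped in a small helper. (The function name and default safety factor are illustrative; the 24KB packet size is the estimate from step 3.)

```python
def rekey_window_packets(bandwidth_mb_per_s, rtt_ms, packet_kb=24, safety=3):
    """Estimate a rekey packet window for a given link.

    bandwidth_mb_per_s: measured throughput in MB/s (e.g. from a large scp run)
    rtt_ms: round-trip time in milliseconds (from ping)
    packet_kb: approximate paramiko packet size in KB
    safety: multiplier to absorb variance (step 5 above)
    """
    packets_per_s = (bandwidth_mb_per_s * 1024.0) / packet_kb  # MB -> KB -> packets
    one_way_s = (rtt_ms / 1000.0) / 2                          # ms round trip -> s one-way
    return int(packets_per_s * one_way_s * safety)
```

For example, a 10 MB/s link with a 100 ms round trip gives a window in the mid-60s with the 3x safety margin, already well above the hard-coded 20.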

@btimby

btimby commented Oct 18, 2011

I was looking for a ballpark figure. I don't have the luxury of doing this calculation: I am using paramiko on the server side, so my clients will all have different optimal values. I am using paramiko as an SSH server to expose access to rsync. The client I am using is OpenSSH.

First I tried 120, I got to 3GB transferred before dying.

Now I am trying 1024. We will see how that works.

I reviewed the code of the dropbear SSH server. It disallows sending packets during rekeying: instead, it enqueues them to a linked list, with no limit enforced (except by available memory). The relevant code is in packet.c; see:

enqueue_reply_packet()
maybe_flush_reply_queue()
encrypt_packet()

The queue is activated when ses.dataallowed == 0. kex.c handles rekeying and sets the dataallowed member to 0 after sending/receiving a rekey request.

The OpenSSH implementation is similar. packet.c enqueues outbound packets in packet_send2() when active_state->rekeying == 1.

My thoughts are that the 20-packet receive limit should simply disappear; I don't see any similar limit on receiving packets in either implementation. I also don't see any queuing of outbound packets in paramiko. That's just a difference I noticed; I'm not sure whether it is a problem.
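
The dropbear pattern described above can be sketched in Python. (Illustrative only; the class and method names are hypothetical, loosely mirroring enqueue_reply_packet() and maybe_flush_reply_queue().)

```python
from collections import deque

class RekeyAwareSender:
    """Sketch of dropbear-style outbound queuing: hold packets while a
    rekey is in progress, flush them once it completes."""

    def __init__(self, wire_send):
        self._wire_send = wire_send   # callable that puts bytes on the wire
        self._rekeying = False
        self._queue = deque()

    def start_rekey(self):
        self._rekeying = True         # ses.dataallowed = 0, in dropbear terms

    def finish_rekey(self):
        self._rekeying = False
        while self._queue:            # maybe_flush_reply_queue()
            self._wire_send(self._queue.popleft())

    def send(self, packet):
        if self._rekeying:
            self._queue.append(packet)  # enqueue_reply_packet()
        else:
            self._wire_send(packet)
```

Because the queue is bounded only by memory, no arbitrary packet count can be exceeded during the key exchange.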

@thinred

thinred commented Feb 29, 2012

The new version of duplicity (http://duplicity.nongnu.org/) uses paramiko as a transport backend and seems to suffer from this bug as well. It will usually crash at some point during a full backup with the following stacktrace:

ssh: Exception: Remote transport is ignoring rekey requests
ssh: Traceback (most recent call last):
ssh:   File "/usr/lib/python2.7/dist-packages/paramiko/transport.py", line 1524, in run
ssh:     ptype, m = self.packetizer.read_message()
ssh:   File "/usr/lib/python2.7/dist-packages/paramiko/packet.py", line 378, in read_message
ssh:     raise SSHException('Remote transport is ignoring rekey requests')
ssh: SSHException: Remote transport is ignoring rekey requests
ssh: 
BackendException: sftp put of /tmp/duplicity-3KymQx-tempdir/mktemp-3Ehncv-42 (as duplicity-full.20120229T160640Z.vol41.difftar.gpg) failed: Server connection dropped: 

Any ideas for a real fix to this (instead of enlarging the number of packets allowed to receive)?

@chrisegner
Author

The problem boils down to a sort of timeout (measured in packets received, not seconds) that is too short. The options are then to remove the timeout or make it longer, perhaps by some fancy algorithm. Rekeying is part of the ssh protocol and is there for security reasons. So any "real fix" necessarily involves enlarging the number of packets allowed.

That's not to say we can't do better than a hard-coded limit. Something that auto-tunes based on the formula above is probably a decent idea. I'd also be curious what the OpenSSH codebase does, since it does not suffer from this issue as far as I can tell.

@dlitz
Contributor

dlitz commented Mar 23, 2012

One way to fix this, without disabling transmission during re-keying and without the need for a complex (and possibly error-prone) time-based algorithm, would be to initiate the re-keying really early (at 500 MB, for example) and then raise the exception after 1 GB has been received like we do now. This weekend, I'll see if I can put together a patch that does this.
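
A minimal sketch of that threshold logic, assuming the 500 MB / 1 GB split proposed above (the function and constant names are hypothetical):

```python
REKEY_REQUEST_BYTES = 500 * 1024 * 1024   # ask for a rekey early
REKEY_FAIL_BYTES = 1024 * 1024 * 1024     # give up only at the old 1 GB point

def rekey_action(bytes_since_last_kex, rekey_requested):
    """Decide what the transport should do as traffic accumulates."""
    if bytes_since_last_kex >= REKEY_FAIL_BYTES:
        return "fail"        # peer really is ignoring the request
    if bytes_since_last_kex >= REKEY_REQUEST_BYTES and not rekey_requested:
        return "request"     # initiate re-keying now
    return "continue"
```

This leaves roughly 500 MB of in-flight traffic as headroom for the peer to answer the rekey request, rather than a fixed 20 packets.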

@dlitz
Contributor

dlitz commented Mar 25, 2012

This patch seems to work: #63

Anyone want to try it out?

@joshtriplett

> This patch seems to work: #63
>
> Anyone want to try it out?

Works for me; with it, my duplicity backups manage to get past 1GB without throwing an exception.

@adiroiban

I think that this is related to this Twisted Conch bug http://twistedmatrix.com/trac/ticket/4395

@chrisegner
Author

So what's the process for getting this merged to master?

bendavis78 pushed a commit to bendavis78/paramiko that referenced this issue Jan 30, 2014
When Paramiko initiates a re-key request over a high-bandwidth, medium-latency
connection, it erroneously terminates the connection with the error,
"SSHException: Remote transport is ignoring rekey requests".  This is due to
the hard-coded limit of 20 packets that may be received after a re-key request
has been sent.

See, for example, this bug report:

    "Transfer fails at 1GB: rekey window too small, hard-coded"
        paramiko#49

This patch changes paramiko's behaviour as follows:

- Decrease the threshold for starting re-keying from 2**30 to 2**29 bytes.
- Decrease the threshold for starting re-keying from 2**30 to 2**29 packets.
- Increase the limit of received packets between re-key request & completion
  from 20 packets to 2**29 packets.
- Add a limit of 2**29 received bytes between re-key request & completion.

In other words, we re-key more often in order to allow more data to be
in-transit during re-keying.

NOTE: It looks like Paramiko disables the keep-alive mechanism during
re-keying.  This patch does not change that behaviour.
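
Expressed as constants, the thresholds in the commit message above look roughly like this. (The names are illustrative, not necessarily paramiko's actual identifiers; see PR #63 for the real patch.)

```python
# Illustrative constants mirroring the commit message above.
REKEY_BYTES = 2 ** 29                  # start re-keying after 512 MB sent...
REKEY_PACKETS = 2 ** 29                # ...or after this many packets
REKEY_BYTES_OVERFLOW_MAX = 2 ** 29     # extra bytes tolerated mid-rekey
REKEY_PACKETS_OVERFLOW_MAX = 2 ** 29   # extra packets tolerated mid-rekey
```

The sum of the trigger threshold and the overflow allowance stays at the original 2**30 (1 GB), so total exposure per key is unchanged while far more data may be in flight during the exchange.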
ajlanghorn added a commit to alphagov/govuk_offsitebackups-puppet that referenced this issue Sep 26, 2014
When backing up assets and Whitehall using Duplicity, I originally used SCP
which makes use of the Paramiko back-end (as it's all Python) for SSH
access. Due to paramiko/paramiko#49 and
paramiko/paramiko#63, we can't use Paramiko.

Changing the SSH backend in the version of Duplicity we run (0.6.18) isn't
supported. It is in later versions, but they aren't available from Ubuntu
for Precise machines. As a result, we need to ensure that rssh allows access
for rsync.

This does that.
@simon-online

Is this still an issue with the latest release v1.15.2?
I think this can be marked as closed now.

@bitprophet
Member

Given the age, it's at least a little likely, @simon-online - though I know there are still outstanding issues re: "large" file transfers (also with their own PRs, though...). Not sure if same or different.

Overall, we really need a thorough cleaning of the ticket queue (even if it is simply "close anything w/o activity in the last year" on the assumption that critical issues will get re-upped). Something I hope to find time for in the mid term.

@simon-online

For what it's worth, I can confirm that I was regularly hitting this issue with v1.7.7.1, but since I updated to v1.15.2 over a month ago the issue has stopped.

@bitprophet
Member

Thanks a lot for double checking, @simon-online!

intgr pushed a commit to intgr/paramiko that referenced this issue Nov 27, 2019
kex: fix server-side of kex-group-14-sha256 and kex-group16-sha512
@fthiery

fthiery commented Oct 6, 2023

For what it's worth, I'm on 2.12.0 and the problem is (still?) present. The workaround provided here #151 (comment) worked wonders for me.

@bskinn
Contributor

bskinn commented Oct 8, 2023

If you're still observing this problem, @fthiery, then likely your root cause is different from that discussed in this ticket, at least to some degree.

Please test also on the most recent 3.x release, and if you're seeing the problem there open a new ticket to report as such. Thanks!
