
Transfer fails at 1GB: rekey window too small, hard-coded #49

Closed

chrisegner opened this issue Oct 17, 2011 · 17 comments

Comments

@chrisegner

First off, thanks for the great work with paramiko.

At 1GB of data transferred over sftp, paramiko initiates an SSH rekey request and then sets a limit on the number of packets that may be exchanged before the remote side answers the rekey request. This limit is hard-coded to 20. If you have fat, long pipes, it's pretty easy to consistently exceed this threshold; as a consequence, paramiko cannot transfer files larger than 1GB in this situation. OpenSSH does not exhibit this behavior. I manually patched the source to change the 20 to something more reasonable based on our RTT and bandwidth, and the exceptions immediately went away.

  • What does OpenSSH do in this situation?
  • If doing what OpenSSH does is not suitable or convenient, can this be made a configurable parameter, accessible from the higher-level API (i.e. SSHClient)?
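
For readers unfamiliar with the failure mode, here is a simplified sketch (NOT paramiko's actual code; the class and method names are invented for illustration) of the behavior described above: once a rekey request goes out, incoming packets are counted, and the transport gives up as soon as a hard-coded window of 20 is exceeded.

```python
# Simplified sketch of the failure mode; not paramiko's actual code.
REKEY_PACKET_WINDOW = 20  # the hard-coded limit this issue is about

class RekeyWindow:
    def __init__(self):
        self.packets_since_request = None  # None means no rekey pending

    def request_rekey(self):
        self.packets_since_request = 0     # start counting inbound packets

    def rekey_completed(self):
        self.packets_since_request = None  # peer answered; stop counting

    def on_packet_received(self):
        if self.packets_since_request is None:
            return
        self.packets_since_request += 1
        if self.packets_since_request > REKEY_PACKET_WINDOW:
            raise Exception("Remote transport is ignoring rekey requests")
```

On a high-bandwidth, medium-latency link, far more than 20 data packets can arrive before the peer's KEXINIT reply does, so the window is exceeded even though the peer is behaving correctly.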
@chrisegner
Author

Exception info, if helpful:

Backtrace on logger:
2011-08-26 08:52:37,406 30818 paramiko.transport ERROR Exception: Remote transport is ignoring rekey requests
2011-08-26 08:52:37,407 30818 paramiko.transport ERROR Traceback (most recent call last):
2011-08-26 08:52:37,407 30818 paramiko.transport ERROR File "build/bdist.linux-x86_64/egg/paramiko/transport.py", line 1524, in run
2011-08-26 08:52:37,407 30818 paramiko.transport ERROR ptype, m = self.packetizer.read_message()
2011-08-26 08:52:37,407 30818 paramiko.transport ERROR File "build/bdist.linux-x86_64/egg/paramiko/packet.py", line 378, in read_message
2011-08-26 08:52:37,408 30818 paramiko.transport ERROR raise SSHException('Remote transport is ignoring rekey requests')
2011-08-26 08:52:37,408 30818 paramiko.transport ERROR SSHException: Remote transport is ignoring rekey requests
2011-08-26 08:52:37,408 30818 paramiko.transport ERROR

Backtrace on exception:
File "build/bdist.linux-x86_64/egg/paramiko/sftp_client.py", line 573, in put
fr.write(data)
File "build/bdist.linux-x86_64/egg/paramiko/file.py", line 314, in write
self._write_all(data)
File "build/bdist.linux-x86_64/egg/paramiko/file.py", line 435, in _write_all
count = self._write(data)
File "build/bdist.linux-x86_64/egg/paramiko/sftp_file.py", line 165, in _write
t, msg = self.sftp._read_response(req)
File "build/bdist.linux-x86_64/egg/paramiko/sftp_client.py", line 667, in _read_response
raise SSHException('Server connection dropped: %s' % (str(e),))
SSHException: Server connection dropped:

@btimby

btimby commented Oct 18, 2011

Chris, ironically, I ran into this issue today as well.

What value did you end up picking instead of 20 packets?

@chrisegner
Author

The value is entirely dependent on your situation, but basically, it goes like this:

  1. Determine your bandwidth between end points. Run scp with a large file (>1GB) to get a good average.
  2. Determine your round trip time (rtt). Run ping from one end point to the other; it reports this.
  3. Note that most of the "packets" paramiko is using are 24KB.
  4. Estimate the number of packets that go out from this side in the time it takes one rekey packet to come back from the other side. Note this is a lower bound, since it assumes the remote side puts a rekey packet out on the wire the moment it receives one.
    num_packets = (bandwidth_MB * 1024 / 24) * (rtt_msec / 1000) / 2
    The factor of two gets it down from a round trip to a one-way trip time. The 1024 converts from MB (which bandwidth is in) to KB, which the packet size (24) is in. The 1000 converts from milliseconds (which ping reports for rtt) into seconds, which everything else is in.
  5. Take num_packets and multiply it by some safety margin, at least 2 or 3x. I believe the downside here is that paramiko will buffer more packets the larger the number is, but it's been a while since I read the code. Unless you're heavily memory constrained, you can get away with a large number here.
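
The steps above can be wrapped in a small helper. (The function name and default safety factor are illustrative; the 24KB packet size is the estimate from step 3.)

```python
def rekey_window_packets(bandwidth_mb_per_s, rtt_ms, packet_kb=24, safety=3):
    """Estimate a rekey packet window for a given link.

    bandwidth_mb_per_s: measured throughput in MB/s (e.g. from a large scp run)
    rtt_ms: round-trip time in milliseconds (from ping)
    packet_kb: approximate paramiko packet size in KB
    safety: multiplier to absorb variance (step 5 above)
    """
    packets_per_s = (bandwidth_mb_per_s * 1024.0) / packet_kb  # MB -> KB -> packets
    one_way_s = (rtt_ms / 1000.0) / 2                          # ms round trip -> s one-way
    return int(packets_per_s * one_way_s * safety)
```

For example, a 10 MB/s link with a 100 ms round trip gives a window in the mid-60s with the 3x safety margin, already well above the hard-coded 20.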

@btimby

btimby commented Oct 18, 2011

I was looking for a ballpark figure. I don't have the luxury of doing this calculation: I am using paramiko on the server side, so my clients will all have different optimal values. I am using paramiko as an SSH server to expose access to rsync. The client I am using is OpenSSH.

First I tried 120, I got to 3GB transferred before dying.

Now I am trying 1024. We will see how that works.

I reviewed the code of the dropbear SSH server. It disallows sending packets during rekeying: instead, it enqueues them to a linked list, with no limit enforced (except by available memory). The relevant code is in packet.c; see:

enqueue_reply_packet()
maybe_flush_reply_queue()
encrypt_packet()

The queue is activated when ses.dataallowed == 0. kex.c handles rekeying and sets the dataallowed member to 0 after sending/receiving a rekey request.

The OpenSSH implementation is similar. packet.c enqueues outbound packets in packet_send2() when active_state->rekeying == 1.

My thoughts are that the 20-packet receive limit should simply disappear; I don't see any similar limit on receiving packets in either implementation. I also don't see any queuing of outbound packets in paramiko. That's just a difference I noticed; I'm not sure whether it is a problem.
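
The dropbear pattern described above can be sketched in Python. (Illustrative only; the class and method names are hypothetical, loosely mirroring enqueue_reply_packet() and maybe_flush_reply_queue().)

```python
from collections import deque

class RekeyAwareSender:
    """Sketch of dropbear-style outbound queuing: hold packets while a
    rekey is in progress, flush them once it completes."""

    def __init__(self, wire_send):
        self._wire_send = wire_send   # callable that puts bytes on the wire
        self._rekeying = False
        self._queue = deque()

    def start_rekey(self):
        self._rekeying = True         # ses.dataallowed = 0, in dropbear terms

    def finish_rekey(self):
        self._rekeying = False
        while self._queue:            # maybe_flush_reply_queue()
            self._wire_send(self._queue.popleft())

    def send(self, packet):
        if self._rekeying:
            self._queue.append(packet)  # enqueue_reply_packet()
        else:
            self._wire_send(packet)
```

Because the queue is bounded only by memory, no arbitrary packet count can be exceeded during the key exchange.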

@thinred

thinred commented Feb 29, 2012

The new version of duplicity (http://duplicity.nongnu.org/) uses paramiko as a transport backend and seems to suffer from this bug as well. It will usually crash at some point during a full backup with the following stacktrace:

ssh: Exception: Remote transport is ignoring rekey requests
ssh: Traceback (most recent call last):
ssh:   File "/usr/lib/python2.7/dist-packages/paramiko/transport.py", line 1524, in run
ssh:     ptype, m = self.packetizer.read_message()
ssh:   File "/usr/lib/python2.7/dist-packages/paramiko/packet.py", line 378, in read_message
ssh:     raise SSHException('Remote transport is ignoring rekey requests')
ssh: SSHException: Remote transport is ignoring rekey requests
ssh: 
BackendException: sftp put of /tmp/duplicity-3KymQx-tempdir/mktemp-3Ehncv-42 (as duplicity-full.20120229T160640Z.vol41.difftar.gpg) failed: Server connection dropped: 

Any ideas for a real fix to this (instead of enlarging the number of packets allowed to receive)?

@chrisegner
Author

The problem boils down to a sort of timeout (measured in packets received, not seconds) that is too short. The options are then to remove the timeout or make it longer, perhaps by some fancy algorithm. Rekeying is part of the ssh protocol and is there for security reasons. So any "real fix" necessarily involves enlarging the number of packets allowed.

That's not to say we can't do better than a hard-coded limit. Something that auto-tunes based on the formula above is probably a decent idea. I'd also be curious what the OpenSSH codebase does, since it does not suffer from this issue as far as I can tell.

@dlitz
Contributor

dlitz commented Mar 23, 2012

One way to fix this, without disabling transmission during re-keying and without the need for a complex (and possibly error-prone) time-based algorithm, would be to initiate the re-keying really early (at 500 MB, for example) and then raise the exception after 1 GB has been received like we do now. This weekend, I'll see if I can put together a patch that does this.
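
A minimal sketch of that threshold logic, assuming the 500 MB / 1 GB split proposed above (the function and constant names are hypothetical):

```python
REKEY_REQUEST_BYTES = 500 * 1024 * 1024   # ask for a rekey early
REKEY_FAIL_BYTES = 1024 * 1024 * 1024     # give up only at the old 1 GB point

def rekey_action(bytes_since_last_kex, rekey_requested):
    """Decide what the transport should do as traffic accumulates."""
    if bytes_since_last_kex >= REKEY_FAIL_BYTES:
        return "fail"        # peer really is ignoring the request
    if bytes_since_last_kex >= REKEY_REQUEST_BYTES and not rekey_requested:
        return "request"     # initiate re-keying now
    return "continue"
```

This leaves roughly 500 MB of in-flight traffic as headroom for the peer to answer the rekey request, rather than a fixed 20 packets.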

@dlitz
Contributor

dlitz commented Mar 25, 2012

This patch seems to work: #63

Anyone want to try it out?

@joshtriplett

> This patch seems to work: #63
>
> Anyone want to try it out?

Works for me; with it, my duplicity backups manage to get past 1GB without throwing an exception.

@adiroiban

I think that this is related to this Twisted Conch bug http://twistedmatrix.com/trac/ticket/4395

@chrisegner
Author

So what's the process for getting this merged to master?

bendavis78 pushed a commit to bendavis78/paramiko that referenced this issue Jan 30, 2014
When Paramiko initiates a re-key request over a high-bandwidth, medium-latency
connection, it erroneously terminates the connection with the error,
"SSHException: Remote transport is ignoring rekey requests".  This is due to
the hard-coded limit of 20 packets that may be received after a re-key request
has been sent.

See, for example, this bug report:

    "Transfer fails at 1GB: rekey window too small, hard-coded"
        paramiko#49

This patch changes paramiko's behaviour as follows:

- Decrease the threshold for starting re-keying from 2**30 to 2**29 bytes.
- Decrease the threshold for starting re-keying from 2**30 to 2**29 packets.
- Increase the limit of received packets between re-key request & completion
  from 20 packets to 2**29 packets.
- Add a limit of 2**29 received bytes between re-key request & completion.

In other words, we re-key more often in order to allow more data to be
in-transit during re-keying.

NOTE: It looks like Paramiko disables the keep-alive mechanism during
re-keying.  This patch does not change that behaviour.
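
Expressed as constants, the thresholds in the commit message above look roughly like this. (The names are illustrative, not necessarily paramiko's actual identifiers; see PR #63 for the real patch.)

```python
# Illustrative constants mirroring the commit message above.
REKEY_BYTES = 2 ** 29                  # start re-keying after 512 MB sent...
REKEY_PACKETS = 2 ** 29                # ...or after this many packets
REKEY_BYTES_OVERFLOW_MAX = 2 ** 29     # extra bytes tolerated mid-rekey
REKEY_PACKETS_OVERFLOW_MAX = 2 ** 29   # extra packets tolerated mid-rekey
```

The sum of the trigger threshold and the overflow allowance stays at the original 2**30 (1 GB), so total exposure per key is unchanged while far more data may be in flight during the exchange.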
ajlanghorn added a commit to alphagov/govuk_offsitebackups-puppet that referenced this issue Sep 26, 2014
When backing up assets and Whitehall using Duplicity, I originally used SCP
which makes use of the Paramiko back-end (as it's all Python) for SSH
access. Due to paramiko/paramiko#49 and
paramiko/paramiko#63, we can't use Paramiko.

Changing the SSH backend in the version of Duplicity we run (0.6.18) isn't
supported. It is in later versions, but they aren't available from Ubuntu
for Precise machines. As a result, we need to ensure that rssh allows access
for rsync.

This does that.
@simon-online

Is this still an issue with the latest release v1.15.2?
I think this can be marked as closed now.

@bitprophet
Member

Given the age, it's at least a little likely, @simon-online - though I know there are still outstanding issues re: "large" file transfers (also with their own PRs, though...). Not sure if same or different.

Overall, we really need a thorough cleaning of the ticket queue (even if it is simply "close anything w/o activity in the last year" on the assumption that critical issues will get re-upped). Something I hope to find time for in the mid term.

@simon-online

For what it's worth, I can confirm that I was regularly hitting this issue with v1.7.7.1, but since I updated to v1.15.2 over a month ago the issue has stopped.

@bitprophet
Member

Thanks a lot for double checking, @simon-online!

intgr pushed a commit to intgr/paramiko that referenced this issue Nov 27, 2019
kex: fix server-side of kex-group-14-sha256 and kex-group16-sha512
@fthiery

fthiery commented Oct 6, 2023

For what it's worth, I'm on 2.12.0 and the problem is (still?) present. The workaround provided here #151 (comment) worked wonders for me.

@bskinn
Contributor

bskinn commented Oct 8, 2023

If you're still observing this problem, @fthiery, then likely your root cause is different from that discussed in this ticket, at least to some degree.

Please test also on the most recent 3.x release, and if you're seeing the problem there open a new ticket to report as such. Thanks!
