Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hang on large initial window size #524

Conversation

oleksiy-markovets
Copy link

Hi,

I'm using paramiko for file upload to SFTP server, but I faced problem that with some SFTP
servers it just hangs in putfo. Nevertheless openssh's sftp uploads file without any probles.
After hours of debuging of paramiko and openssh I found out that it happens when servers
returns MAX_UINT in intial window size (which complies SSHv2 standard). Because of strage
packing/unpacking logic paramiko gets wrong window size and hangs in a while.

I studied specifications (https://wiki.filezilla-project.org/SFTP_specifications) to find explanation of this
logic, but didn't find any.

Anyway my change fixes my problem, but I have strong feeling that this int packing/unpacking logic was implemented on purpose. Could you advice what the purpose of it is?

best regards
Oleksiy

@Russell-Jones
Copy link

It was added in 0e4ce37 @scottkmaxwell do you remember why you added "int handling to use mpint with a flag whenever the int is greater than 32 bits" ? Also, unless I'm misunderstanding it, the modified get_int() is looking for 0xff as a lead byte whilst in RFC 4251 an mpint is zero (00 00 00 00) or a 32-bit length followed by that number of bytes to be taken as a twos compliment integer unless the most significant bit of the first byte is 1, in which case a leading zero byte should be added, which appears quite different. Is this a workaround for something, perhaps? A non-standard extension used by some servers?

It appears to relate to the code in the same commit where add_int() does the reverse, adding 0xff when an int is > Message.big_int ( long(0xff000000) ) Both appear to be related to Python 3 compatibility fixes.

@oleksiy-markovets
Copy link
Author

Yes, I saw that commit too. But it doesn't make sense to me, I didn't find any mention of such data format in SSHv2 specification. Also, I undestand that it was implemented like this intentionaly, but I don't understand why.

@Russell-Jones
Copy link

My suspicion is that it's used internally within paramiko and the message gets rewritten (the rewind(), get_so_far() and get_remainder() methods makes me suspect this). I should probably just wait for Scott, but I find this intriguing and want to follow through the code of message.py and the calls to it now :)

I notice that get_size() is defined twice, too, once in the way you did it (the second time it's defined), and once in the altered way.

@scottkmaxwell
Copy link
Contributor

I believe this is what I fixed in #482

@oleksiy-markovets
Copy link
Author

@Russell-Jones I also thought that it may be some kind of paramiko's extension of SSH protocol to address some internal issues, but it brakes compatibility with others SSH implementations.
@scottkmaxwell Yes, #482 should fix my problem, but I'm just wondering what is the purpose of get_adaptive_int/add_adaptive_int? I mean this datatype is not defined in SSH specification so potentially dangerous

@Russell-Jones
Copy link

There's no int type in rfc4251. So add_int(), get_int() and _add() (via the previous two functions) are meant to inspect the input type to determine what to do. One problem with this is that there're ambiguities between the ranges

The ssh spec cases are (edited, I had this wrong)

x < 0 mpint
0 <= x < 2**32 uint32, uint64 or mpint
2**32 <= x < 2**64 mpint or uint64
x >= 2**64 mpint

The python 2 cases are

-sys.maxint - 1 < x <= sys.maxint int
x < -sys.maxint - 1 long
x > sys.maxint long

64-bit
sys.maxint = 0x7fffffffffffffff
32-bit
sys.maxint = 0x7fffffff

The python 3 cases are
int for all x

So either Scott's explanation on the 27th of Jan was wrong or the code was, perhaps because 32-bit ints were assumed. The ranges for long and int in python 2 are not necessarily the same as for the ssh spec, and relying on that in any way seems fragile. I guess there are (at least) three ways forward:

1a) Try to decide what to do with if statements and inequalities in Message.{get,add}_int()
This would need to deal with the ambiguities above, which would require careful consideration of the set of possible input values and things like the consequences of a malicious server sending the wrong one.

1b) Do that, but have some way to disambiguate the intended ssh type. Using Python integer handling would not be sufficient that I can see. I suggest using simple classes that wrap a Python 3 int or Python 2 long, along with isinstance(). This would require some reworking, but I suspect it could work OK, maybe needing something like add_int() in the context of those callers where it's not clear what the ssh type should be. If these end up (or predictably would be) the same, I guess that's when Scott's approach would be best.

  1. Start using add_int32(), add_int64() and add_mpint() explicitly, and never add() for integers
    This would require changes in the rest of the code, and could be quite complex, requiring significant refactoring to make everything explicit. It might be worth it, if it's possible. Does openssh, for example, use explicit typing everywhere? I'd expect so, but I don't know.

@oleksiy-markovets
Copy link
Author

@Russell-Jones thank you for comprehensive explanation.
IMHO, for sure, the best approach would be 2). Since SSHv2 specs operates only with uint32, uint64 and mpint which just don't map to python's native int and long. So having generic add() method with isinstance magic inside is a very bad idea.

After days of debuging of openssh code I became more or less familiar with its code. And yes, they use explicit functions for uint32, uint64 and mpint.

ps. As I can see 3 different pull requests which fix this problem, so I hope it will be fixed in master shortly....

@bitprophet
Copy link
Member

Thanks for the detailed explanation, @Russell-Jones.

@oleksiy-markovets I haven't unpacked this problem in my head in a while but I think merging #482 is the best initial step and I hope to get to that bugfix milestone relatively soon.

Given a lot of these problems are due to an earlier "quite complex, requiring significant refactoring" change - the Python 3 support merge - I'm a bit scared to jump into another. Though I recognize it may be the best way forward depending on how #482 is received post-release.

@scottkmaxwell
Copy link
Contributor

Yes, I'd say get #482 merged, then look at removing the magic int code. Since that code only appears to be used in tests and would be totally invalid in an actual SSH context, best to change or remove those tests and drop the magic code. I am guessing that code was a nice shortcut back when it was originally written but (fortunately) it was either never used in any actual code or those usages were later made explicit.

@oleksiy-markovets
Copy link
Author

agree

@bitprophet
Copy link
Member

Good point, @scottkmaxwell - thanks.

@corbinbs
Copy link

Hi - I encountered this issue earlier today with sftp uploads using paramiko to ProFTPD running on ubuntu. This patch with the paramiko 1.15.2 release fixed it up for me. In my case, I wasn't having any problems until the file size became larger than 32KB or so. When transmitting really small files, I didn't encounter it. Just wanted to pass that along in case that helps develop more test scenarios for it.
Thanks for doing the research on this and offering a solution until it can be worked into an official release

@Tatsh
Copy link

Tatsh commented Jan 25, 2016

This really works. All of a sudden downloading large files works correctly with SFTPFile.get(), downloading at full speed.

Server I'm downloading from uses ProFTPd with mod_sftp 0.9.9.

I don't know what to do until this gets merged or the fix in general gets into Paramiko, except to vendorise Paramiko 😢 (via Git submodule or some other decent way).

@oleksiy-markovets
Copy link
Author

I would suggest you to do what I did for our project, just download python-paramiko debian source package (assuming that you are using debian/ubuntu), patch it, rebuild and install on all machines that you need.

@Tatsh
Copy link

Tatsh commented Jan 25, 2016

At first I manually patched the file, but then...

It seems that Paramiko 1.16 already has the duplicate methods removed, and the get_int() patch. I removed paramiko completely from my virtualenv even by deleting it outright, and then ran pip install --upgrade paramiko and everything still worked.

The result is here: https://github.com/Tatsh/xirvik-tools/blob/master/xirvik/sftp.py#L102 <- the except socket.timeout is almost unnecessary as I have not encountered it since I upgraded Paramiko. I believe I was on 1.15.1 and it was not working at all. Got the hanging issue as described in this bug.

@bskinn
Copy link
Contributor

bskinn commented Jul 25, 2023

Hi! I'm helping bitprophet by going through the issues and PRs backlog, and for anything more than a few years old, we're just closing them out for now.

If this is still a live bug, please re-test and reproduce using a current version of Python (3.9+) and paramiko (3.0+), check to be sure the change still solves the problem, and create a new PR for it.

Thanks, and apologies for the long wait!

@bskinn bskinn closed this Jul 25, 2023
@scottkmaxwell
Copy link
Contributor

Pretty sure I fixed that in #482. Glad to see you going and closing out these old issues Brian.

@bskinn
Copy link
Contributor

bskinn commented Jul 26, 2023

Pretty sure I fixed that in #482. Glad to see you going and closing out these old issues Brian.

👍, thanks for following up, @scottkmaxwell.

No problem - going to be a gradual process, but I think we'll get there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants