Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix sftp stalls #578

Closed
wants to merge 2 commits into from

Conversation

Projects
None yet
4 participants
@stephencpope
Copy link
Contributor

stephencpope commented Aug 24, 2015

Proposed fix for #577. Implemented as a branch off 1.15 HEAD as I was not interested in the extra credit of determining exactly when the issue arose earlier in the history. However, it would apply in principal beginning at the point where sftp prefetch was implemented (sometime after 1.9.0).

Stephen C. Pope
@stephencpope

This comment has been minimized.

Copy link
Contributor Author

stephencpope commented Aug 24, 2015

Bah! I don't do no stinkin' Python 3. My submission is failing the various Python 3 variations (no dict.has_key() method). I am sure some simple syntactic change can fix that.

Stephen C. Pope
@stephencpope

This comment has been minimized.

Copy link
Contributor Author

stephencpope commented Aug 24, 2015

So I fixed the has_key problem. Python 3.2 passes. Python 3.3 and Python 3.4 both timeout, apparently hanging up in the SSHClient tests, which do not involve SFTPClient in any way; I suspect the problem is not caused by me. I got and built Python 3.4.3, ran the tests manually myself, and they worked fine. Don't know what to say ...

@bitprophet

This comment has been minimized.

Copy link
Member

bitprophet commented Sep 2, 2015

This is most likely covered by some older tickets as well, need to dig those up sometime to make sure use case coverage is better understood. (I'm paranoid about fixes breaking other areas of code...:D)

@stephencpope

This comment has been minimized.

Copy link
Contributor Author

stephencpope commented Sep 2, 2015

we've been using the fix here for lots of very large downloads (files as big as 35G thanks to recent market turmoil) as well as plenty of small ones with no adverse effects. Not that that will give you warm fuzzies!

@bitprophet

This comment has been minimized.

Copy link
Member

bitprophet commented Sep 2, 2015

It's still helpful though! So thanks :) and mostly I just want to make sure we link related tickets for closure or noting that "hey we're merging this and so um watch out for side effects".

@stephencpope

This comment has been minimized.

Copy link
Contributor Author

stephencpope commented Sep 2, 2015

So one thing to look out for: it seemed to me that the SFTP code wasn't safe to use for both getting and putting simultaneously. While my fixes address some locking issues, I didn't even attempt to address this since it is not a use case of mine.

@Tatsh

This comment has been minimized.

Copy link

Tatsh commented Feb 6, 2016

Status on this? I'm getting 'Garbage packet' exceptions when retrieving some large files (like anything above 50 MiB it seems, but sometimes things randomly work).

Other times the connection is extremely slow. It might be the server (as always can be the case), but lftp is able to retrieve such files very easily at around 2 MiB/s.

Is this a useful fix for that? I am using Python 3 and it appears this needs patching for 3.4?

@shuhaowu

This comment has been minimized.

Copy link

shuhaowu commented Mar 10, 2016

Any status on this? I think I just ran into this problem, although I am not 100% positive. I used pdb to trace it to hang on: https://github.com/paramiko/paramiko/blob/1.16/paramiko/sftp_file.py#L477.

PDB shows:

(Pdb) p self._prefetch_lock.locked()
True

When I step over that line it hangs.

When I straced one of the processes, I get

poll([{fd=3, events=POLLIN}], 1, 100)   = 0 (Timeout)
poll([{fd=3, events=POLLIN}], 1, 100)   = 0 (Timeout)
poll([{fd=3, events=POLLIN}], 1, 100)   = 0 (Timeout)
poll([{fd=3, events=POLLIN}], 1, 100)   = 0 (Timeout)
...

fd=3 is the socket to the SFTP server.

@Tatsh

This comment has been minimized.

Copy link

Tatsh commented Mar 10, 2016

I apply this patch on the version of Paramiko I'm using now (in my virtualenv). It works really well. Using Python 3.4. My project really depends on it. So I hope it does get merged at some point.

@shuhaowu

This comment has been minimized.

Copy link

shuhaowu commented Mar 10, 2016

Applied this patch on top of 1.16 today. Seems to have fixed this issue.

Is it possible we get some project member to look into merging this? I can help solve the merge conflict and re PR if OP is not going to do that.

@bitprophet bitprophet modified the milestones: 1.16.2, 1.16.1 Apr 25, 2016

@bitprophet

This comment has been minimized.

Copy link
Member

bitprophet commented Apr 25, 2016

Noting for the record that I wasn't able to recreate this (using the description in #577) even across a (not slow...) WAN connection, but thanks to @stephencpope's excellent analysis (thank you!) and reading the code, I'm reasonably sure I grok what it's doing.

More importantly, I'm not able to detect any serious regression in my integration branch :)

@stephencpope

This comment has been minimized.

Copy link
Contributor Author

stephencpope commented Apr 25, 2016

Thank you!

@bitprophet

This comment has been minimized.

Copy link
Member

bitprophet commented Apr 25, 2016

@stephencpope it's remotely possible this patchset is causing #730. I'll be trying to confirm/deny that tomorrow; if you happen to have time before then & want to try reproducing it (it's probably intermittent and mostly affecting Python 3.5, hooray) please comment on #730 with any findings.

@bitprophet

This comment has been minimized.

Copy link
Member

bitprophet commented Apr 26, 2016

As seen on #730 I was wholly unable to recreate the occurrence, so, hopefully a false alarm. glares suspiciously at all computers everywhere

@Tatsh

This comment has been minimized.

Copy link

Tatsh commented Apr 26, 2016

I am glad because I have wanted this merged for a long time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.