New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fabric is slow because stdout/stderr read byte per byte #715
Comments
Thanks a lot for all the details :) First off, the bytewise behavior needs to remain the default for now, as it's what enables the interactivity feature (docs) and changing it would break backwards compatibility. (We might change this in 2.x, hard to say right now. It may well remain the default. However I agree that allowing control over this would be useful for situations where users A) have lots of output and B) explicitly acknowledge that they don't need the ability to respond to prompts, etc inline (usually fine for non-newbie users who've already worked out the interactivity kinks in their recipes.) If we can find a low-impact way to change this I'll merge it sometime soon; if the changes feel too drastic I will probably punt to 2.x. FYI. What probably needs to happen:
|
If I'm not mistaken, the interactivity feature is only active (edit: seems like I'm mistaken as capture is always used even if there is no interactive behaviour expected. How sad :/) |
Created a pull request that differ slightly from what we discussed. The diff however is quite small. |
Hrm. I'm -0 on that approach. Capturing shouldn't be considered orthogonal to interactivity, even if it makes the code changes easier to handle (though your PR mentions that the line prompt remains to be fixed -- that's important too =/). That + your earlier comment make it sound like you might not realize how important capturing is for most users -- it's a definitely a core use case for Put another way, it's never OK to turn off capturing -- IMO this feature/change should be limited to a tradeoff between interactivity (small byte-read size) and speed (larger/configurable byte-read size). That probably will make it tougher to implement :( Final, mostly unrelated note -- why the mutation into a class? I'm not a fan of doing that to existing code unless one strongly needs "classy" behavior, e.g. subclass+override or keeping a handle on frequently referenced state. Sorry for being a negative nancy :) I appreciate you taking the time to try and hash this out! |
Hello Jeff, Thanks for the feedback. As I'm about to go in holidays, this will have to wait until I'm back. I understand your concern and will try for propose a changeset that preserves the interactivity and capturability of the stream while reading it by chunk. I'm almost certain this can be achieved without too much headaches. I've mutated the loop into a class for two reasons. First, I found it was difficult to understand the different states of the small state machine that's embedded inside without actually creating additional functions (or methods) encapsulating the prompt, try again, etc behaviour. Secondly, I actually captured some heavily used internal state (such as the Another important reason : This is definitely _the_ part of the code where lies the solution of this issue so I really need to be able to isolate all its moving parts so I can be sure it will still work after I've rewritten it... (my co-worker told me that I'm actually the kind of guy who mutate script in class sometimes a bit too early, I might be the problem in this case :)) |
That all sounds fair. Thanks again -- will wait to hear back. (I'm going on vaca for a few days as well, actually. Enjoy your own :)) |
I'd love to know the results from this discussion! I also noticed that fabric can be slow, when a command generates a lot of output. I'd like to fetch a
Interestingly enough, the following also worked really well, within 1 second:
|
@vdboor Yea, it's not really a global "run within Fabric == SLOWMG", but rather that using the Python SSH library and/or the way we're buffering network traffic with it, is what's slow. Using @octplane Eagerly awaiting your iteration on the pull request whenever you're back, & thanks again. |
Ohai, Pull request is there: #733 |
Merging into #733 |
Hi,
(Useless) introduction and other might-be bugs
Origin of the issue
I use fabric to build a maven artefact. When run in console, this build process takes around 30s and spits around 15000 lines of log. However, when using
fabric
it takes around _1m30s_ to complete the same process. Painfully slow.I use it for all my deployment and control (in particular, I run chef-client inside fab) and I started wondering if using fabric was not making all the stuff slower than it was already... 😢
First missed attempt at fixing the issue
Being not very good at optimizing, I started suspecting the pure python ssh library to be a bit slow and wanted to thrown in a pure C implementation of ssh to see how it performed much better. I ended up trying
libssh2
which has an incomplete binding for Python. However, it's complete enough to replacessh
for in the fabric code. Replacingssh
withpylibssh2
is almost straightforward although I feel there miss a "ssh" interface between fabric and its ssh library. I'm not sure having the opportunity to be able to replace the ssh backend simply would be good feature for fabric but It would have helped me testing faster.My first tests indicate that fabric+libssh2 is as slow as fabric+ssh. This is when I suspected I was hunting for the wrong duck.
Actual Issue
The actual issue lies in the fact that fabric read its channel's stdout/stderr byte by byte. Changing line 60 of
io.py
from func(1) to func(1024) and ignoring all side effect issues boots the overall performance in my case.I guess this should be rewritten so that:
ssh
and in particular, handle EOF correctly_prefix
Also:
prompt
andtry_again
features are active on both stdout and stderr and I'm not sure this is actually needed in these two cases,Because this is something very annoying for me right now, I'm ready to spend some time to try and fix that (after all, I've spent 2 days a digging into pylibssh2 and libssh2 to understand how to use their public key feature properly). Note that I'll be doing that at work as my company (fotopedia) uses fabric in production.
I'm not a python expert, but provided you give me a heads up about the pitfalls I should be aware of, I'd love to help.
A so finishes this bug report. My thanks for fabric !
The text was updated successfully, but these errors were encountered: