Stream control in run() #235

Closed
bitprophet opened this Issue Apr 20, 2015 · 9 comments

Projects

None yet

2 participants

@bitprophet
Member

This is re: replicating an existing Fabric feature allowing users to insert their own stream objects to be used in place of the default writing to sys.std(out|err).

It's tougher here because while the "direct", non-pty-using use case isn't hard, pty use is currently difficult as we use pexpect right now and it's a reasonably gross, not written to be extended, etc.

There's a couple clear options:

  • Try extending pexpect anyway - the core spot that MUST be overridden isn't actually that long of a function and while it is __sekrit that still should be overrideable within a subclass.
    • Testing will still be gross but I've done most of the work for that already, so eh.
  • Attempt to identify only the bits of pexpect we truly need, and replicate those in our own module (which can then be written to be more testable and of course, to have explicit stream control).
    • Going by the investigation I did for test mocking, it boils down to use of the stdlib pty module, forking, running os.execve, and then select.select on the FDs involved.
    • Maybe not easy, and we'd lose (or eventually reimplement) a lot of the corner/edge cases that pexpect handles (such as lots of nasty Solaris stuff) - but maybe worthwhile?
@sophacles

I seriously recommend looking at the builtin subprocess module, if getting rid of pexpect is being considered. It really nicely handles a lot of the very gross things under the hood, and using select and pty with it doesn't appear very hard.

pexpect is very old - it was written against python 2.1, which was many releases before subprocess was introduced (2.4). A lot of the behavior it includes/encapsulates is now just handled by subprocess nicely.

edit because I was curious, here's some stuff on pty with subprocess:

@bitprophet
Member

stalker

You haven't actually looked at how that stuff is implemented, huh? ;) We are using subprocess for the non PTY case. pexpect is only for the PTY use case.

Also, in looking around I just found out somebody picked up pexpect sometime in the last couple years (since we vendored it) and have done at least some work on it - including making it py3 compat (something I had to do myself for our vendored copy a while ago). So now I get to see whether they made it any more extensible.

But if I have to put that work in, yea, I may as well just see how little we can condense the useful bits via dropping pexpect and just using pty ourselves. If I can get away with shoving even more of the logic from the run_direct method into something shared between it and run_pty, that would be rad.

@bitprophet
Member

Of generic note, in trying to mock out the pexpect junk (partly with an eye towards making the tests not slow - 3-5s runtimes for <500 tests is kind of shitty) I found that much of the 'slowness' is due to pexpect sleeping for 0.1s at the end of every command run.

I'd been wondering why the pty tests were SO much slower than the non-pty ones, since I'd been assuming most of the slow was the spawning of real subprocesses/shells. God bless cProfile.

This is what led to even more thoughts of extensibility, since that sleep value is defined via literal at instantiation time, and cannot be overridden at the module or class level - makes test-only overriding a PITA.

@sophacles

Yeah, haven't looked at all the run implementation details - a huge chunk of picking inv for my cases is that I don't have to. I just figured if pexpect was being used, subprocess wasn't.

stalker

Totally, invoke is now in the critical path of my work, so I have filters in place to follow it very closely.

@bitprophet
Member

Honestly that's probably a good thing, since that is the point of such a tool!

Anyway, I am poking at pty and friends and will see where it goes. It would be very nice to remove this other half of the gross underpinnings of run (we recently got rid of a bunch of Popen monkeypatching dating from the same era.)

@bitprophet
Member

Neat, the pexpect work resulted in splitting out the PTY related stuff into https://github.com/pexpect/ptyprocess - may still be less extensible than we want, but good to see either way. Poking briefly.

EDIT: It's a bit cleaner and has Unicode and Python 3 support, and explicitly drops a bunch of the legacy Solaris/etc junk. Unfortunately, the specific bits I would've had to override (some with copypasta), like the read loop & the init (re: sleep times) are still there. In fact, the read "loop" is actually gone and is just a straight up read method, the loop itself (including aliveness handling) is still in the main pexpect package.

Think building our own module is sadly the "right" way to go here for testability/etc, I do not have the energy to try working with upstream to make things work right for my use case. At least we can still use pexpect/ptyprocess as a reference for edge/corner cases, and the "new" codebase is easier to read (it got a reformatting pass apparently) and Python 3 / Unicode compliant.


Re: that stuff, the pty module offers a spawn method which is extremely similar to what both the old and new pexpect code does (forks a child process, does a lot of low level magic to create a pty and rearrange all the pipes between parent/pty/child, then the child does some variant of os.exec with the command being run).

Will break it down and try to build it back up in a useful-to-us manner.

@bitprophet
Member

On the way home I got a basic draft working based on run_direct w/ just the specific bits swapped out, and it actually seems to...just work! All tests pass & it works well enough to run spec/nose/etc like the pexpect implementation does.

This is without any of the apparently random assortment of tweaks pexpect.spawn and/or pty.spawn do to the environment the child runs in - no tty module function calls, no setwinsize, etc. Right now I'm hoping to get away with not adding those back until we have proven, specific tests for them (many of which will probably need to go in the integration suite, or be run on a per-platform basis).

Currently playing with refactoring, which shouldn't take too crazy long given how I wrote this (tho trying to keep fabric.runners.Remote in mind).

@bitprophet
Member

Have been buried in a big arse refactor of Runner and related things now that the barrier between run_direct and run_local has been destroyed. Makes execution a lot more overridable/customizable and should still work for Remote, hopefully - will find out relatively soon.

As a side benefit it has also helped me continue moving older, crappy, full-integration-style tests from one test module, into another one using mocking or the newly chopped-up class structure (testable code ❤️).

@bitprophet
Member

pexpect is dead, long live pexpect! this is done now, including the enormous thorny rewrite of Runner's innards. Finally time to figure out what exactly I was going to do with this new feature over in Fabric land.

@bitprophet bitprophet added a commit that closed this issue May 6, 2015
@bitprophet bitprophet Changelog, closes #235 6f017fe
@bitprophet bitprophet closed this in 6f017fe May 6, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment