Windows support #17

Closed
takluyver opened this issue Oct 18, 2013 · 42 comments

Comments

@takluyver
Member

There are at least two ports of pexpect to Windows APIs: wexpect (by Chris Gorecki) and winpexpect (by Geert Jansen).

We could look at integrating this support into pexpect. Both Chris and Geert have said they'd be willing to help out with some initial work, but I think this would need someone committed to maintaining it and fixing the inevitable problems. I don't really use Windows myself, so I can't give it much support.

@fhoech

fhoech commented Apr 6, 2014

I may be interested in taking this up. I maintain a modified version of Chris Gorecki's wexpect, which I've added to over the years (mostly bugfixes) and which can currently be found here:
https://sourceforge.net/p/dispcalgui/code/HEAD/tree/trunk/dispcalGUI/wexpect.py
The code could probably use some clean-up, but it is quite well tested and in active use (at least by me ;)).

@takluyver
Member Author

Awesome, it would be great to end the balkanisation of pexpect derivatives. Let's both start to think about how the Windows support could be integrated with minimal disruption.

@fhoech

fhoech commented Apr 8, 2014

The code I have currently works with Python 2.5 (maybe even earlier, but I have not tested it and ultimately we don't need compatibility for anything earlier than 2.6 if I'm correct) up to 2.7, but needs some changes for Python 3. I'm thinking of integrating the current code in a separate branch or fork first (should be fairly straightforward), then doing any necessary Python 3 changes, and when it's all been tested and working we could merge back into master.

@takluyver
Member Author

Thanks, that makes sense. I'm thinking, though, about how to structure the resulting code so that it both works as seamlessly as possible for new users writing cross-platform code, and breaks as little as possible for existing users on POSIX systems.

For instance, wexpect makes spawn a function that instantiates either spawn_unix or spawn_windows. But that breaks subclassing spawn. Some other possibilities:

  1. spawn remains the Unix-only class, and another class for Windows has to be invoked explicitly.
  2. spawn is a base class, but its __new__ method automatically instantiates an appropriate subclass.
  3. spawn is defined at import time as a reference to either spawn_unix or spawn_windows, depending on platform.
  4. spawn remains a single class and the primary interface, and the platform-specific details are separated out at a lower level.
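
A minimal sketch of option 2, for comparison with the factory-function approach. The class names mirror the ones used in this thread, but the bodies, the `backend` attribute and the constructor arguments are hypothetical and are not taken from pexpect or wexpect.

```python
import sys


class spawn:
    """Hypothetical base class: instantiating it yields a platform subclass."""

    def __new__(cls, *args, **kwargs):
        # Only redirect when the base class itself is requested, so that
        # explicit subclasses are still instantiated as themselves.
        if cls is spawn:
            cls = spawn_windows if sys.platform == "win32" else spawn_unix
        return super().__new__(cls)

    def __init__(self, command, timeout=30):
        self.command = command
        self.timeout = timeout


class spawn_unix(spawn):
    backend = "pty"        # illustrative marker only


class spawn_windows(spawn):
    backend = "console"    # illustrative marker only


child = spawn("ls -l")
```

With this layout `isinstance(child, spawn)` still holds, but a user class that subclasses `spawn` directly would bypass the platform-specific subclass, which is one reason this option may be less attractive than it first looks.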

@fhoech

fhoech commented Apr 8, 2014

I'm thinking a variation of 3 and 4 maybe. E.g. we could have a module unix.py (or posix.py?) which would have the contents of the current __init__.py with minimal changes to allow importing on Windows without throwing an ImportError, and a win32.py with all Windows-specific stuff (would import and subclass the spawn class and other parts from unix.py). The __init__.py would then just be a wrapper with something along the lines of the following:

import sys

if sys.platform == 'win32':
    from pexpect.win32 import *
else:
    from pexpect.unix import *

Existing code which uses pexpect should continue to work normally.

@jquast
Member

jquast commented Jun 3, 2014

Python's selectors.py has a good model.
https://github.com/python/cpython/blob/master/Lib/selectors.py

@takluyver
Member Author

Oh nice, I hadn't seen that new module.

That's using approach 3, which is the same thing @fhoech is thinking of. That variant would add a BaseSpawn class, which the unix and win32 implementations would inherit from. Then spawn would just be a reference to spawn_unix or spawn_win32, as appropriate. I quite like that - it preserves using the API, subclassing spawn and doing isinstance checks with pexpect.spawn. The only downside I can think of is duplicating inapplicable initialisation parameters (like ignore_sighup) to ensure that spawn() has a predictable signature. That doesn't seem too problematic.

I guess another question is what to do for methods that don't make sense - don't implement them, implement them but raise an error, or return a default value. It looks like wexpect mostly implements them but raises (e.g. sendcontrol()), but in some cases it returns a default value (e.g. fileno()).
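
The BaseSpawn arrangement, and the two ways of handling inapplicable methods, could look roughly like the sketch below. Everything in it is illustrative: the method bodies are stand-ins, and the `-1` default for `fileno()` is an arbitrary choice, not what wexpect actually returns.

```python
import sys


class BaseSpawn:
    """Hypothetical shared base holding the platform-independent parts."""

    def __init__(self, command, timeout=30, ignore_sighup=True):
        # ignore_sighup is accepted everywhere so the signature is uniform,
        # even though it only means something on Unix.
        self.command = command
        self.timeout = timeout
        self.ignore_sighup = ignore_sighup


class spawn_unix(BaseSpawn):
    def sendcontrol(self, char):
        # Stand-in: compute the control character instead of sending it.
        # Control characters sit 64 below their uppercase letter in ASCII.
        return chr(ord(char.upper()) - 64)


class spawn_win32(BaseSpawn):
    def sendcontrol(self, char):
        # Strategy A: implement the method but raise.
        raise NotImplementedError("sendcontrol() is not supported on Windows")

    def fileno(self):
        # Strategy B: return a default value (arbitrary here).
        return -1


# Bound once, at import time, so subclassing and isinstance checks keep working.
spawn = spawn_win32 if sys.platform == "win32" else spawn_unix
```

Because `spawn` is a real class rather than a factory function, `class MySpawn(pexpect.spawn)` and `isinstance(child, pexpect.spawn)` behave as existing users would expect.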

@takluyver
Member Author

Also, we should work out how the Windows stuff interacts with Unicode support. My understanding is that Windows has two copies of every function concerning text - the 'A' for ANSI variant which deals with bytes, and the 'W' for wide character variant which deals with (BMP) unicode code points. It's not clear to me, looking at wexpect, which variant it's dealing with. Ideally, we'd want to use the A functions for spawn_win32, and the W functions for spawnu_win32.

@fhoech : For comparison, because Unix deals with bytes and encodings, we currently have a spawn class which handles bytes-level interaction, and a spawnu class which handles encoding and decoding to provide a Unicode API to streams of bytes.
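
To illustrate the bytes/Unicode split described above, here is a small sketch of a text layer sitting on top of a bytes layer. `ByteStream` and `TextStream` are hypothetical names invented for this example; they are not the actual spawn and spawnu classes.

```python
import codecs


class ByteStream:
    """Stand-in for a bytes-level reader (hypothetical, not pexpect's class)."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def read(self):
        return self._chunks.pop(0) if self._chunks else b""


class TextStream:
    """Wraps a ByteStream and decodes incrementally."""

    def __init__(self, raw, encoding="utf-8", errors="strict"):
        self._raw = raw
        # An incremental decoder buffers incomplete multi-byte sequences
        # until the remaining bytes arrive in a later read.
        self._decoder = codecs.getincrementaldecoder(encoding)(errors)

    def read(self):
        return self._decoder.decode(self._raw.read())


# "é" is two bytes in UTF-8; here it is split across two reads.
raw = ByteStream([b"caf\xc3", b"\xa9\n"])
text = TextStream(raw)
first = text.read()
second = text.read()
```

The incremental decoder matters because a read from a child process can end in the middle of a multi-byte character; decoding each chunk independently would raise or produce replacement characters.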

@takluyver
Member Author

@jquast I think we should aim for a 3.3 release with the fixes and small additions we've made recently, and then push for a 4.0 release. Like 3.0, this would be mostly backwards compatible, but we could do some or all of:

  • Windows support
  • asyncio support (/other ways to wait for multiple spawned processes at once)
  • Drop Python 3.2 support (and maybe 2.6?)
  • Drop the psh module

@jquast
Member

jquast commented Jun 4, 2014

Yes, let us cut pexpect 3.3 within a few weeks.
Agreed with regard to 4.0 including Windows.

@dperkins3600

It would be VERY desirable for me to have support for just fdpexpect on Windows. I'm using winpexpect to fork the "command line version of PuTTY" (which is called plink) to use the Windows COM ports (serial ports) to communicate with a device with a serial interface. The serial interface is running at 115200 baud, and if Windows and the device could run at "line speed", I should be able to get approximately 11520 bytes per second through it. However, I'm seeing less than 1000 bytes per second (that is, less than 1/10th of the expected rate). Reading and writing to a COM port using the serial library gives a speedup of more than a factor of 5. So, if using fdpexpect can provide the same speedup, it would be very useful.
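
The throughput figures above can be checked with a few lines of arithmetic. The 10 bits per byte is an assumption (8N1 framing: one start bit, eight data bits, one stop bit); the post does not state the framing, but it is consistent with the 11520 figure quoted.

```python
BAUD = 115200          # bits per second on the wire
BITS_PER_BYTE = 10     # assumed 8N1 framing: 1 start + 8 data + 1 stop

line_rate = BAUD // BITS_PER_BYTE   # bytes per second at line speed
observed = 1000                     # upper bound on the observed rate
fraction = observed / line_rate     # share of line speed actually achieved
```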

@takluyver
Member Author

@dperkins3600 - with a bit of care, it should be possible to pull out fdpexpect so that it will run on Windows. As far as I know, the Unix specific bits are all about creating a new terminal and running a process inside it, which fdpexpect doesn't do. However, fdspawn does subclass from spawn. I would pull out both of those classes together, and then remove from spawn all the methods overridden by fdspawn.

@dperkins3600

After the above, I looked at the fdpexpect code, and saw that I could just "import and use it". So I did.

Original:

t = winpexpect.winspawn(plinkCmd, timeout=10)

Replacement:

import fdpexpect
import serial
...
s = serial.Serial(port="COM3", baudrate=115200)
fd = s.fileno()
t = fdpexpect.fdspawn(fd, timeout=10)

But "s.fileno()" fails on Windows, so I couldn't test it out.
Looking in fdpexpect, I see the following lines in __init__:

if type(fd) != type(0) and hasattr(fd, 'fileno'):
    fd = fd.fileno()
if type(fd) != type(0):
    raise ExceptionPexpect('The fd argument is not an int. If this is a command string then maybe you want to use pexpect.spawn.')
try: # make sure fd is a valid file descriptor
    os.fstat(fd)
except OSError:
    raise ExceptionPexpect('The fd argument is not a valid file descriptor.')

So, it does look like fdpexpect would "work", but it is blocked in my usage by serial. I haven't looked to see if there is another way to open a Windows COM port and get a file descriptor.

@takluyver
Member Author

That's annoying, it doesn't use a file descriptor. Looking at the pyserial source code, I think you would have to customise the implementation to wrap a Win32Serial instance, calling its read and write methods. It also looks like you might need to find some way of checking when there is data to be read, but maybe that's already there somewhere. The relevant code is here:
http://sourceforge.net/p/pyserial/code/HEAD/tree/trunk/pyserial/serial/serialwin32.py

@dperkins3600

Doing the wrapping is something I'm not qualified to do (at least now). I did find the following that looks like it would make testing very nice if support is added: http://com0com.sourceforge.net/.
Skipping over the wrapping approach, and instead looking in pexpect, it seems like all that fdpexpect is doing is saving the FD as the value for attribute child_fd. Then this is used in methods read_nonblocking, send, isalive, and __select. These in turn call os.read, os.write, os.close, select.select. These will not work with a serial object. It seems like read_nonblocking and send could be overridden for serial ports and it might work. I haven't done this in Python, but it does seem like a new class, say, called serialpexpect could be written that subclasses pexpect.spawn and includes read_nonblocking, send, isalive, etc. How does this sound?

@takluyver
Member Author

Yep, that was what I meant by wrapping - subclass spawn, and override all the methods that use child_fd to call the corresponding Serial methods instead.
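
A rough sketch of that wrapping idea follows. To keep it self-contained it does not subclass pexpect.spawn; `SerialExpect` is a hypothetical adapter showing only the two overridden methods, and `FakeSerial` is an in-memory stand-in. It assumes a pyserial-style object exposing `read()`, `write()` and an `in_waiting` property.

```python
import time


class FakeSerial:
    """In-memory stand-in for a pyserial-style port (for demonstration only)."""

    def __init__(self, incoming=b""):
        self._incoming = bytearray(incoming)
        self.written = bytearray()

    @property
    def in_waiting(self):
        return len(self._incoming)

    def read(self, size=1):
        data = bytes(self._incoming[:size])
        del self._incoming[:size]
        return data

    def write(self, data):
        self.written += data
        return len(data)


class SerialExpect:
    """Hypothetical adapter exposing spawn-like read/send over a serial port."""

    def __init__(self, port, timeout=10):
        self.port = port
        self.timeout = timeout

    def read_nonblocking(self, size=1, timeout=None):
        # Poll until data is available or the timeout expires.
        limit = self.timeout if timeout is None else timeout
        deadline = time.monotonic() + limit
        while not self.port.in_waiting:
            if time.monotonic() >= deadline:
                raise TimeoutError("no data received from serial port")
            time.sleep(0.01)
        return self.port.read(min(size, self.port.in_waiting))

    def send(self, data):
        return self.port.write(data)


port = FakeSerial(b"login: ")
child = SerialExpect(port, timeout=1)
child.send(b"\r\n")
banner = child.read_nonblocking(size=100)
```

In a real implementation the same two methods would replace the `os.read`/`os.write` calls on `child_fd`, and `isalive()` would need an equivalent based on whether the port is still open.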

@fhoech

fhoech commented Jul 25, 2014

@takluyver Unicode support should be easy to add to wexpect, because the way it's handled is that it already uses Unicode internally and converts to byte strings after reading from the child.
@dperkins3600 I'm not sure yet how to add support for fdpexpect to wexpect because wexpect currently doesn't support file descriptors. I'm looking into that.

@takluyver
Member Author

I've just been looking at what winpexpect and wexpect do to set up processes with a console, and it's pretty hairy. Searching on GitHub and nullege, I suspect that most of the processes people want to communicate with can be dealt with using straightforward stdin/stdout pipes, as created by subprocess; though at least PowerShell is an exception to that - starting it using subprocess makes very odd things happen.

I realise that pexpect is really conflating two separate-ish concerns: starting and controlling processes in a pseudoterminal, and waiting for specific patterns to appear in a pipe, which is often used to programmatically interact with some kind of prompt-based interface. Those features are often useful together, but they're technically quite separate, and the pattern-waiting can certainly be useful on Windows without the pty-control mechanisms. There are also cases, like terminado, which I'm working on for IPython, where we want the pty-control without the pattern-waiting.

So, my new plan is:

  • Separate out a ptysubprocess module which can be released on its own. Not being bound by backwards compatibility, this can expose a cleaner API, probably roughly inspired by the standard library's subprocess module. This will be Unix-only.
  • The code in Pexpect will focus on the pattern-waiting; it will depend on ptysubprocess on Unix, for backwards compatibility, exposing as much of its existing API as practical, and, for the time being, it will use subprocess.Popen to control processes on Windows. But the API should be sufficiently general that you could plug in a Windows analogue of ptysubprocess to communicate with a process in a Console.
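
The separation described in this plan can be pictured as a pattern-waiting core that only needs a "give me more data" callable and knows nothing about ptys, pipes or consoles. The sketch below is a hypothetical illustration of that idea; `PatternWaiter` is an invented name, not a pexpect class.

```python
import re
import time


class PatternWaiter:
    """Hypothetical pattern-waiting core that knows nothing about processes."""

    def __init__(self, read_chunk, timeout=5.0):
        # read_chunk() returns newly available text ('' when none is ready).
        self._read_chunk = read_chunk
        self.timeout = timeout
        self.buffer = ""
        self.before = ""
        self.match = None

    def expect(self, pattern):
        regex = re.compile(pattern)
        deadline = time.monotonic() + self.timeout
        while True:
            self.match = regex.search(self.buffer)
            if self.match:
                # Keep the text before the match; drop everything consumed.
                self.before = self.buffer[:self.match.start()]
                self.buffer = self.buffer[self.match.end():]
                return self.match
            if time.monotonic() >= deadline:
                raise TimeoutError("pattern %r not seen" % pattern)
            chunk = self._read_chunk()
            if chunk:
                self.buffer += chunk
            else:
                time.sleep(0.01)


# Any data source works: here a canned list stands in for a child process.
chunks = iter(["Pass", "word: ", "ok\n$ "])
waiter = PatternWaiter(lambda: next(chunks, ""))
m = waiter.expect(r"Password: ")
waiter.expect(r"\$ ")
```

A pty-backed reader on Unix, a `subprocess.Popen` pipe on Windows, or a serial port could each supply the `read_chunk` callable without the matching code changing.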

@fhoech

fhoech commented Oct 3, 2014

> Separate out a ptysubprocess module which can be released on its own. [...] The code in Pexpect will focus on the pattern-waiting.

Sounds good!

> it will use subprocess.Popen to control processes on Windows.

Pattern waiting will not work in conjunction with subprocess/pipes at all though because of the buffered stdio you'll get when a process isn't connected to an actual console.

@takluyver
Member Author

Hmm, that's a bummer. How much of the buffering can we disable? At least in my tests, I managed to briefly drive cmd through the subprocess interface, but I only tried one command. I'll play around with this more when I'm back at my Windows VM. I have made a start at separating out ptyprocess, but I'm still working on the details of what belongs where.

I also spoke to someone who wanted to use pexpect to talk to a remote device using the pyserial interface. That should be possible without the unix-specific stuff, it's just unfortunate that the base class of pexpect (spawn) deals with the most complex case, rather than having a simple base class and a specialisation for extra details like pseudo terminals.

Of course, pexpect on Windows could use, or even depend on, an analogue of ptyprocess that handles the details of running processes in an invisible console. But I'd rather not have to maintain that piece, since I don't really use Windows myself.

@fhoech

fhoech commented Nov 3, 2014

> How much of the buffering can we disable?

I don't think there's a way to control it.

@blink1073
Contributor

This gist reflects the hours of head-bashing I spent getting Octave and Python to talk on Windows. What say you, @takluyver?

@blink1073
Contributor

This approach can be made to work with bash on Windows by forcing a sentinel echo after each command since bash does not print the prompt over an os.pipe.
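
The sentinel trick can be sketched as follows. This is not the gist's code: to stay portable, a small Python child that evaluates one expression per line and prints no prompt stands in for bash, and the `__DONE__` marker and `run_command` helper are hypothetical names.

```python
import subprocess
import sys

# Child that evaluates one expression per line and prints no prompt,
# standing in for a shell attached to a pipe.
CHILD = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    print(eval(line))\n"
    "    sys.stdout.flush()\n"
)
SENTINEL = "__DONE__"


def run_command(proc, command):
    """Send a command plus a sentinel echo; collect output up to the sentinel."""
    proc.stdin.write(command + "\n")
    proc.stdin.write(repr(SENTINEL) + "\n")  # the forced "echo"
    proc.stdin.flush()
    lines = []
    for line in iter(proc.stdout.readline, ""):
        text = line.rstrip("\r\n")
        if text == SENTINEL:
            break
        lines.append(text)
    return lines


proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
result = run_command(proc, "6 * 7")
proc.stdin.close()
proc.stdout.close()
proc.wait()
```

Since no prompt ever arrives to mark the end of a command's output, the sentinel line plays that role instead.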

@blink1073
Contributor

That's what we do for metakernel.

@takluyver
Member Author

That gist is more or less what I'm thinking of for initial Windows support, with a couple of tweaks:

  • I'd redirect stderr into stdout, and avoid running separate threads to read from the child.
  • I think passing stdout=subprocess.PIPE to Popen will deal with creating the pipe for you, and then you can use proc.stdout.read() instead of os.read(fd).

PR #123 contains the necessary refactoring that will make it possible to add Windows support, and then I'll look at doing something like that.
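
The two tweaks listed above map onto standard `subprocess` arguments. A minimal sketch, using a throwaway Python child rather than the program from the gist:

```python
import subprocess
import sys

# Child that writes to both streams; the flushes keep the ordering deterministic.
CHILD = (
    "import sys\n"
    "sys.stdout.write('out\\n'); sys.stdout.flush()\n"
    "sys.stderr.write('err\\n'); sys.stderr.flush()\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdout=subprocess.PIPE,     # Popen creates the pipe itself
    stderr=subprocess.STDOUT,   # stderr is merged into the same pipe
)
data = proc.stdout.read()       # blocks until the child closes the pipe
proc.stdout.close()
proc.wait()
merged = data.decode().splitlines()
```

Note that `proc.stdout.read()` with no size argument waits for end of file, so an interactive session would need smaller reads, which is where the blocking question discussed next comes in.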

@blink1073
Contributor

I agree on all points except for the PIPE. By using os.pipe we are able to use non-blocking reads (I tried it with PIPEs to be sure). I've updated the gist, and I'll take a look at #123 tomorrow.

@takluyver
Member Author

It looks like the implementation in the gist will block if the pipe is empty, but maybe I'm missing something.

@blink1073
Contributor

Actually, we can use os.read with the created PIPE's fileno, updated...

@blink1073
Contributor

Yes, you're right. The only way around that is to use a thread.
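
For reference, the thread-based workaround usually takes the shape below: a background thread performs the blocking reads and hands the data over through a queue, so the caller can wait with a timeout. The helper names are hypothetical and this is a sketch, not the gist's implementation.

```python
import queue
import subprocess
import sys
import threading


def start_reader(stream):
    """Pump a blocking stream into a queue from a daemon thread."""
    q = queue.Queue()

    def pump():
        # read(1) blocks, but only this helper thread is held up by it.
        for chunk in iter(lambda: stream.read(1), b""):
            q.put(chunk)
        q.put(None)  # marks end of stream

    threading.Thread(target=pump, daemon=True).start()
    return q


def read_with_timeout(q, timeout):
    """Return the next chunk, None at end of stream, or b'' on timeout."""
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return b""


proc = subprocess.Popen(
    [sys.executable, "-c", "print('ready')"],
    stdout=subprocess.PIPE,
)
chunks = start_reader(proc.stdout)
received = b""
while True:
    chunk = read_with_timeout(chunks, timeout=5)
    if chunk is None:
        break
    received += chunk
proc.wait()
```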

@blink1073
Contributor

So are we back to threads? :)

@takluyver
Member Author

Not before I've delved into asyncio to see how it solves this problem ;-)

@blink1073
Contributor

Take it away maestro

@takluyver
Member Author

Asyncio does some very clever stuff using IOCP that I definitely don't want to try to replicate on Python 2. On the other hand, I am definitely keen to avoid spinning up extra threads if possible. I'm now wondering about doing an asyncio implementation with a fallback to a thread when asyncio is not available.
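
The asyncio half of that idea might look like the sketch below. It uses the present-day asyncio API (`asyncio.run`, `create_subprocess_exec`, `StreamReader.readuntil`), which is newer than this discussion, and it does not show the thread fallback; `read_until` is a hypothetical helper.

```python
import asyncio
import sys


async def read_until(stream, marker, timeout):
    """Read from an asyncio stream until marker appears or timeout expires."""
    return await asyncio.wait_for(stream.readuntil(marker), timeout)


async def main():
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "print('prompt> ', end='', flush=True)",
        stdout=asyncio.subprocess.PIPE,
    )
    # The event loop handles the non-blocking I/O; no extra thread is needed.
    data = await read_until(proc.stdout, b"> ", timeout=10)
    await proc.wait()
    return data


output = asyncio.run(main())
```

If the marker does not arrive in time, `asyncio.wait_for` raises a timeout error instead of blocking, which is the behaviour the pipe-based gist could not get without a thread.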

@blink1073
Contributor

Perhaps use trollius so you're not limited to 3.4+.

@arve0

arve0 commented Jun 10, 2015

Any news on windows support?

@afterhill

Hi, is this issue closed? Does that mean Windows support has been dropped?

@takluyver
Member Author

This issue is still open. 4.0, when we get it out, will have experimental Windows support. It won't simulate a terminal like we do on POSIX systems, but the core functionality of waiting for a particular pattern to be read should work.

@takluyver
Member Author

With a recently merged PR, it should now be possible to import Pexpect on Windows to use pexpect.fdpexpect.fdspawn and pexpect.popen_spawn.PopenSpawn. If you want to integrate it with other things, such as pyserial, it should be possible to subclass pexpect.spawnbase.SpawnBase (though that API is undocumented, and may still need to be changed).

We have not integrated any machinery to communicate with a subprocess in a hidden Windows console, which is what wexpect/winpexpect do. This would be more similar to the main function of Pexpect on Unix (where it uses a pty), but I don't think we could maintain the Windows code properly. However, if someone else wants to implement the communication machinery, they can reuse Pexpect's pattern-waiting code by inheriting from SpawnBase.

@tsuhaoc

tsuhaoc commented Sep 28, 2016

Does Windows support in Pexpect 4.2.1 depend on Cygwin?

@takluyver
Member Author

No, it works on Windows natively.

@niyatia

niyatia commented Nov 18, 2016

Hi, I am trying to use corenlp-python on my Windows machine. The corenlp library internally uses pexpect.spawn(..). However, it raises an AttributeError saying 'module' object has no attribute 'spawn'. Is there any workaround for this?

@takluyver
Member Author

spawn uses a pty, which is a Unix-only feature. You can try:

  • Using PopenSpawn to run the subprocess with piped output rather than a terminal. This is technically reasonably robust, but the program you're running may not do what you want without a terminal.
  • Using a tool like wexpect/winpexpect, which creates a hidden Windows console and tries to exchange data with it. This is conceptually closer to what spawn does on Unix, but it's an awkward hack, and I'm not interested in supporting it.
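
A minimal sketch of the first option, assuming pexpect 4.x is installed. The child command is a throwaway Python one-liner chosen for portability, `greet` is a hypothetical helper, and the import guard is there only so the snippet can still be loaded where pexpect is missing.

```python
import sys

try:
    import pexpect
    from pexpect.popen_spawn import PopenSpawn
except ImportError:  # pexpect is a third-party package
    pexpect = None


def greet():
    """Drive a child over pipes (no pty) and wait for its output."""
    child = PopenSpawn(
        [sys.executable, "-c", "print('hello from child')"],
        timeout=10,
        encoding="utf-8",   # work with str instead of bytes
    )
    child.expect(r"hello from (\w+)")
    who = child.match.group(1)
    child.expect(pexpect.EOF)
    return who


result = greet() if pexpect is not None else None
```

Whether this is enough for a library such as corenlp-python depends on how the wrapped program behaves without a terminal, as noted in the first bullet.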


9 participants