Add support for talking to our stdin/stdout/stderr as streams #174
There should be a convenient and standard way to read and write from the trio process's stdin/stdout/stderr streams. (Note that this is different from talking to the stdin/stdout/stderr of child processes, which is part of #4.) Probably this should use our standard stream abstraction.
Complications to consider:
Normally Python's I/O stack does a bunch of work here: text/binary conversion, newline conversion, buffering, convenience parsing things like
It might even make sense to do both; #20 might mean that we have a 3 line solution for the "wrap an
On Windows, you often have to do these separate console control calls for things like cursor movement and coloring text, which need to be synchronized with the output stream. (In the very very latest Win 10 update they finally added VT100 support to the console, but it will be a while before anyone can count on that.) I believe that the output is still binary (UTF-16) rather than using some kind of first-class text read/write API.
I know prompt_toolkit has an async API and they support a lot of fancy terminal stuff in pure Python - we should check what they need to make sure whatever we come up with matches.
Related to #4, in previous projects I've played with feeding ptys to subprocesses instead of pipes (not sure about the correctness of the below):
```python
import asyncio
from asyncio.base_subprocess import ReadSubprocessPipeProto
import os
import pty


async def subprocess_exec_pty(protocol_factory, *args, **kwargs):
    loop = asyncio.get_event_loop()

    # One pty pair each for the child's stdout and stderr.
    stdout_master, stdout_slave = pty.openpty()
    stderr_master, stderr_slave = pty.openpty()

    transport, protocol = await loop.subprocess_exec(
        protocol_factory, *args,
        stdout=stdout_slave, stderr=stderr_slave, **kwargs)

    # Hook the master ends into the transport as if they were pipes.
    # Note: transport._pipes is a private asyncio detail, keyed by fd number.
    _, pipe = await loop.connect_read_pipe(
        lambda: ReadSubprocessPipeProto(transport, 1),
        os.fdopen(stdout_master, 'rb', 0))
    transport._pipes[1] = pipe

    _, pipe = await loop.connect_read_pipe(
        lambda: ReadSubprocessPipeProto(transport, 2),
        os.fdopen(stderr_master, 'rb', 0))
    transport._pipes[2] = pipe

    return transport, protocol
```
Unless we're reimplementing prompt_toolkit, is this required to provide a valid
I think it would be convenient if the API mirrored the stdlib a little:
Is the implementation here specifically that we set
asyncio doesn't support this directly yet (nice):
Oh wow yeah this is way nastier than I had realized.
So the absolute simplest solution would be to suggest people use
It has the downside that it's probably pretty slow compared to doing real non-blocking I/O in the cases where that's possible. So there's a specific use case where this might be inadequate: the one where you're trying to push bulk data through the standard descriptors, probably talking between two programs. So one question is whether and how we can do better for this case. Can we detect when the fd supports non-blocking operation? (Apparently from the twisted discussion it sounds like epoll will refuse to work, so that's one indication if nothing else. Not sure if kqueue works the same way. I guess just setting and then checking the nonblocking flag might work.) If we can detect that, then we can potentially offer two modes: the "always works" mode, and the "always works as long as no one else minds us setting things to non-blocking" mode. People who need speed and don't mind taking a risk can use the latter.
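One rough way to probe this from Python, as a sketch rather than a settled design: try registering the descriptor with the platform's default selector and see whether the kernel refuses. The helper name `can_poll` is made up for illustration. On Linux the default selector is epoll, which rejects regular files with EPERM; other backends may happily accept descriptors they can't usefully poll, so a `True` result is only a hint.

```python
import selectors


def can_poll(fd):
    """Hypothetical helper: does the default selector accept this fd?"""
    sel = selectors.DefaultSelector()
    try:
        sel.register(fd, selectors.EVENT_READ)
    except (OSError, ValueError):
        # OSError: the kernel refused (e.g. EPERM from epoll on a regular file).
        # ValueError: the fd is not a valid descriptor number at all.
        return False
    else:
        sel.unregister(fd)
        return True
    finally:
        sel.close()
```

Something like `can_poll(sys.stdin.fileno())` could then pick between a fast path and the always-works thread path, with the caveat that the answer says nothing about who else shares the descriptor.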
I don't know how important this feature is in practice. It might not be worth the complexity.
Oh, here's another fun issue to keep in mind:
The globalness of the standard descriptors causes several problems, actually. If we set them non-blocking, then it's not just other processes that get messed up, it's also naive calls to
... And actually this is also trickier than it might seem, because the thread safety issue also applies between the main thread and worker threads, i.e. even if
Anyway, one thing this makes clear is that the decision to use the standard fds for programmatic purposes is really not something to take lightly – if you're going to do it then the whole program needs to agree on how.
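To make the hazard concrete, here is a minimal sketch of the "set it, then put it back" pattern, using `os.get_blocking` / `os.set_blocking` (Unix, Python 3.5+). The name `temporarily_nonblocking` is hypothetical. It only narrows the window: while the block is active, every other thread or process sharing the same open file description sees the non-blocking flag too.

```python
import contextlib
import os


@contextlib.contextmanager
def temporarily_nonblocking(fd):
    """Hypothetical helper: flip fd to non-blocking, restore the old mode on exit."""
    was_blocking = os.get_blocking(fd)
    os.set_blocking(fd, False)
    try:
        yield
    finally:
        # Restores the flag, but cannot undo any surprise a concurrent
        # reader or writer hit while the flag was changed.
        os.set_blocking(fd, was_blocking)
```

Inside the `with` block, a read on an empty pipe raises `BlockingIOError` instead of hanging, for us and for anyone else holding the same descriptor.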
Oh, I just remembered another fun thing about stdin: trying to read from it can cause your whole program to get suspended (SIGTTIN, if it's running as a background job).
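A small sketch of how a program might check in advance whether a terminal read is likely to stop it: compare the terminal's foreground process group with our own. `in_foreground` is a made-up name, and the check is inherently racy, since the user can background the job right after it returns.

```python
import os


def in_foreground(fd=0):
    """Hypothetical helper.

    Returns True/False if fd is our controlling terminal, or None if the
    question doesn't apply (fd is not a tty, or not our controlling tty).
    """
    if not os.isatty(fd):
        return None
    try:
        return os.tcgetpgrp(fd) == os.getpgrp()
    except OSError:
        # e.g. ENOTTY when fd is a terminal but not our controlling one
        return None
```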
Hmm, here's another trick, but it might not be widely applicable enough to be worthwhile: the
But... AFAICT this is supported only on Linux, not Windows or MacOS. On MacOS, the
```
# MacOS
In : import socket
In : socket.MSG_DONTWAIT
Out: 128
In : a, b = socket.socketpair()
In : while True:
...:     print("sending")
...:     res = a.send(b"x" * 2 ** 16, socket.MSG_DONTWAIT)
...:     print("sent", res)
...:
sending
[...freezes...]
```
And on Windows it doesn't appear to be either documented or defined.
And, even on Linux, it only works on sockets. If I try using nasty tricks to call
```
# Linux
In : s = socket.fromfd(1, socket.AF_INET, socket.SOCK_STREAM)
In : s.send(b"x")
OSError: [Errno 88] Socket operation on non-socket
```
and similarly on a pipe:
```
# Linux
In : p1, p2 = os.pipe()
In : s = socket.fromfd(p2, socket.AF_INET, socket.SOCK_STREAM)
In : s.send(b"x")
OSError: [Errno 88] Socket operation on non-socket
```
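For contrast, here is a sketch of the case where the per-call flag does what we want: a real socket on Linux. The function name `fill_without_nonblocking` is made up. It keeps sending with `socket.MSG_DONTWAIT` until the kernel buffer is full, without ever touching the descriptor's own blocking mode. This is Linux-only by assumption; going by the MacOS session above, the same loop may simply hang there.

```python
import socket


def fill_without_nonblocking(sock, chunk=b"x" * 65536):
    """Hypothetical helper: send until the kernel buffer fills up,
    relying only on the per-call MSG_DONTWAIT flag (Linux)."""
    total = 0
    while True:
        try:
            total += sock.send(chunk, socket.MSG_DONTWAIT)
        except BlockingIOError:
            # The buffer is full; the socket itself is still in blocking mode.
            return total
```

After the call, `sock.getblocking()` still reports `True`, which is the whole point: nobody else sharing the socket sees a mode change.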
This has me wondering though if there's any other way to get a similar effect. There was a Linux patch submitted in 2007 to make Linux native AIO work on pipes and sockets; I don't know if it was merged, but in principle it might be usable to accomplish a similar effect.
On pipes, if no-one else is reading from the pipe, then the
Maybe we should focus on making threaded I/O as fast as possible :-)
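For reference, the threaded approach is tiny. This sketch uses asyncio's `asyncio.to_thread` (Python 3.9+) purely as a stand-in for whatever worker-thread primitive the event loop provides; `read_in_thread` is a hypothetical name. Note that cancelling the awaiting task does not interrupt the thread, which stays parked in `os.read` until data or EOF arrives.

```python
import asyncio
import os


async def read_in_thread(fd, max_bytes=65536):
    """Hypothetical helper: do one blocking read in a worker thread."""
    # The descriptor stays in blocking mode, so nothing else sharing it
    # is affected; the cost is a thread hop per read.
    return await asyncio.to_thread(os.read, fd, max_bytes)
```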
Unrelated issue: there's also some question about how a hypothetical
Task-local storage would be useful if there were some way to give each task its own private stdin, stdout, etc., but.... I'm not sure what that would mean? :-) Those are kind of inherently process-global resources.
Update: Apparently I was wrong! On Windows, it is possible to read/write to the console without doing blocking
This stackoverflow question seems to have reasonable info (once you filter through all the partial answers). AFAICT, the basic idea is that you call
Now, all the APIs mentioned in the previous paragraph assume that your program is attached to a regular console (like a TTY on unix). And you can always get access to whatever console you're running under (if any) by opening
The first case (magic console objects) is described above.
Socket without OVERLAPPED support: well, we can use
Named pipe: can't assume OVERLAPPED is available; maybe
On-disk files: well, here just plain old threads are OK, because reading/writing to a file might be slow but it shouldn't block indefinitely.
So tentatively I'm thinking:
Also, note for reference: looking at the python-prompt-toolkit code, it appears that the way they do async interactive applications on Unix is to
Further Windows update: while I still can't find any references to
Unfortunately canceling a console read via
libuv has a clever trick! If you want to set stdin/stdout/stderr non-blocking, and it's a tty, then you can use
That blog post also mentions that kqueue on MacOS doesn't work on ttys, which would be super annoying, but apparently this got fixed in 10.7 (Lion). I don't think we need to care about supporting anything older than 10.7. Apparently even 10.9 is already out of security-bugfix-land. (ref)
@remleduff has made a remarkable discovery: on Linux, libuv's clever trick of re-opening the file can actually be done on anonymous pipes too, by opening
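A minimal sketch of the re-opening idea on Linux. The path used here, `/proc/self/fd/<fd>`, is my assumption about how to reach the pipe by name, and `reopen_nonblocking` is a made-up helper. The point is that the new descriptor gets its own open file description, so setting `O_NONBLOCK` on it leaves the inherited one alone.

```python
import os


def reopen_nonblocking(fd, flags=os.O_RDONLY):
    """Hypothetical helper (Linux only): open a second, non-blocking
    handle on whatever fd refers to, via the assumed /proc/self/fd path."""
    return os.open(f"/proc/self/fd/{fd}", flags | os.O_NONBLOCK)
```

For stdout or stderr the caller would pass `os.O_WRONLY` instead; the original descriptor keeps whatever blocking mode it had.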
So this means that technically on Linux I think we actually can handle every common case:
The first three cases cover the vast vast vast majority of stdin/stdout/stderr configurations that actually occur in practice. I'm not sure sockets are common enough to justify a whole extra set of code paths, but maybe.
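Picking a code path per descriptor type presupposes some way of telling the types apart. A sketch using `os.isatty` and `os.fstat`; the function name and the returned labels are made up for illustration, not a proposed API.

```python
import os
import stat


def classify_fd(fd):
    """Hypothetical helper: return a rough label for what fd refers to."""
    if os.isatty(fd):
        return "tty"
    mode = os.fstat(fd).st_mode
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISREG(mode):
        return "file"
    return "other"
```

A caller could run this once at startup on fds 0, 1 and 2 and choose the strategy for each independently, since the three need not be the same kind of object.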
I also spent some time trying to figure out if there was a way to make blocking I/O cancellable.
The first idea I considered is: start a thread that will sit blocked in
The second idea I considered is:
OH WAIT THOUGH. What if we combine these. Option 3:
Otherwise, it means the
This still has the problems that we have to claim a signal, and if we're running outside the main thread then Python doesn't provide an API for registering a signal handler (and I'm pretty sure that to get EINTR we need to have a C-level signal handler registered, even though we want it to just be a no-op). But we could potentially grab, like, SIGURG which hopefully no-one actually uses and is ignored by default, and use ctypes to call
This is kind of a terrible idea, but I do think it would work reliably and portably on all Unixes for all fd types.
I guess this is some kind of argument for... something: https://gist.github.com/njsmith/235d0355f0e3d647beb858765c5b63b3
(It exploits the fact that
Here's the discussion about this in mio: carllerche/mio#321
It looks like libuv has an amazing thing where their tty layer on windows actually implements a vt100 emulator in-process on top of the windows console APIs: https://github.com/libuv/libuv/blob/master/src/win/tty.c
I looked at SIGTTIN/SIGTTOU again. This is a useful article. It sounds like for SIGTTIN, you can detect when you've been blocked from reading (ignore SIGTTIN, and then
You've probably seen this already, but Windows 10 has been making large changes (improvements) to console handling. Is it better to have a wait-and-see attitude on this one, and just try to make it work really well starting with Windows 10?
The console changes are great, but unfortunately, as far as I know none of them change the basic api that apps use to talk to their stdin/stdout when it's a console.
That API did get some work in win 8 – in particular some possibly useful cancellation support – but there's still no real async API afaik.