
Allow userland streams to be passed to stdio option #81

Closed
jamestalmage opened this issue Mar 23, 2017 · 17 comments
@jamestalmage
Contributor

jamestalmage commented Mar 23, 2017

UPDATE: I think this is the right way to do it. Everything in between here and there is interesting background though.


I think you should be able to do this:

const PassThrough = require('stream').PassThrough;
const execa = require('execa');

const stdin = new PassThrough();
const stdout = new PassThrough();
const stderr = new PassThrough();

execa('foo', [], {stdio: [stdin, stdout, stderr]});

But you aren't actually allowed to pass any old stream to child_process. It has to be a very specific type of native stream (here is the internal Node code that checks that).

Instead, if you've got some userland stream you want to use, you have to go about it in a more convoluted way:

const PassThrough = require('stream').PassThrough;
const execa = require('execa');

const stdin = new PassThrough();
const stdout = new PassThrough();
const stderr = new PassThrough();

const cp = execa('foo', [], {stdio: ['pipe', 'pipe', 'pipe']});

stdin.pipe(cp.stdin);
cp.stdout.pipe(stdout);
cp.stderr.pipe(stderr);

I think it's a weird choice to allow streams, but only certain types of streams. I've written native-stream-type, which would help us figure out whether we can pass the stream directly to child_process.spawn or not. For userland streams, we just fall back to using pipe.

@sindresorhus
Owner

👍

@jamestalmage
Contributor Author

jamestalmage commented Mar 24, 2017

Looking into this a bit more, it's actually kind of complicated to tell exactly which direction a given pipe should go. For the first three streams, you can probably infer stdin to be a write stream and the other two to be read streams in the parent (and reversed in the child process).

Any streams past the first three, it's harder to guess:

// parent.js
const {fork} = require('child_process');

const child = fork('child.js', [], {
  stdio: [0, 1, 2, 'pipe', 'pipe', 'ipc'] // fork() requires an 'ipc' entry
});

child.stdio[3].write('some data');
child.stdio[4].on('data', data => console.log('Got data from child: ' + data));

// child.js
const fs = require('fs');

const readStream = fs.createReadStream(null, {fd: 3});
const writeStream = fs.createWriteStream(null, {fd: 4});

If they hand us a stream that is readable only, or writable only, it's easy to know what to do, but duplex streams create a problem:

const isStream = require('is-stream');

const readable = isStream.readable(stream);
const writable = isStream.writable(stream);

if (readable && !writable) {
  stream.pipe(child.stdio[index]);
} else if (writable && !readable) {
  child.stdio[index].pipe(stream);
} else if (readable && writable) {
  // Uh Oh! Duplex Stream. What do we do?
}

@jamestalmage
Contributor Author

So, the most common type of Duplex stream we will get is probably a Transform stream. With a Transform stream, it's impossible to know the intended direction (do they intend to transform data coming out of the child process, or before it goes in?).

I think it makes sense to assume any Duplex stream they give us should be used as a Readable stream, and piped to the process.

I think this is the more readable choice:

var input = someReadable()
  .pipe(someTransform()); // pipe returns the transform, which is duplex

var child = fork(path, args, {stdio: [0, 1, 2, input, 'pipe']});

var output = child.stdio[4].pipe(someTransform());

output.on('data', data => {
  // ...
});

If we were to make the opposite assumption, it feels backwards:

var outputTransform = someTransform();
var input = someReadable()
  .pipe(someTransform());

var child = fork(path, args, {stdio: [0, 1, 2, 'pipe', outputTransform]});

input.pipe(child.stdio[3]);

outputTransform.on('data', data => {
  // ...
});

I like the way the first one reads better. You are creating input streams before passing them to the child process constructor. After the child process is created, you pipe the streams that are created as a result somewhere.

@jamestalmage
Contributor Author

jamestalmage commented Mar 24, 2017

Or perhaps, a better choice is to be more restrictive.

If you pass a duplex stream in one of the first three fds, we make assumptions (read, write, write). Anywhere else, a bare duplex stream is an error.

We could additionally allow you to do something like this:

{
  stdio: [
    duplexStream(), // 0: stdin - we assume it should be used as a readable stream
    duplexStream(), // 1: stdout - we assume it should be used as a writable stream
    duplexStream(), // 2: stderr - we assume it should be used as a writable stream

    duplexStream(), // 3+: ERROR - we need more hints to figure out how to handle it
    readableStream(), // 3+: if it's not duplex, we can still handle it automatically
    writableStream(), // 3+: if it's not duplex, we can still handle it automatically

    // Provide object literals to hint how to handle duplexes we can't figure out automatically.
    {
      read: duplexStream() // will be used as a readable stream
    },
    {
      write: duplexStream() // will be used as a writable stream
    },
    {
      read: readableOrDuplex(), // communication is duplex on this channel, but you want the input and output piped to different locations
      write: writableOrDuplex()
    },
    {
      duplex: someDuplexStream() // will be used as duplex (piped both to and from) - useful for something like a TCP socket
    }
  ]
}

@jamestalmage
Contributor Author

jamestalmage commented Mar 24, 2017

It really seems like pipe should not return the transform stream, but a facade that is read-only.

@sindresorhus
Owner

sindresorhus commented Mar 25, 2017

This is becoming more complicated than just using .pipe() directly...

@jamestalmage
Contributor Author

jamestalmage commented Mar 28, 2017

Agreed. So what do you think? Just forget this idea?

@sindresorhus
Owner

Hmm, we support a Stream in the stdin, stdout, stderr options now. Doesn't that solve the main use-case?

@SamVerschueren
Contributor

I don't think we really support a Stream, because {stdout: fs.createWriteStream('file.txt')} doesn't work; that Stream is not a valid one. See my tweet about this.

@jamestalmage
Contributor Author

To answer the question in your tweet, the docs do mention some restrictions on the type of stream that can be used:

The stream's underlying file descriptor is duplicated in the child process to the fd that corresponds to the index in the stdio array. Note that the stream must have an underlying descriptor (file streams do not until the 'open' event has occurred).

So, to use a file stream, you must wait for the open event. The following works just fine:

// parent.js
var spawn = require('child_process').spawn;
var path = require('path');
var fs = require('fs');

var barPath = path.join(__dirname, 'child.js');
var outputPath = path.join(__dirname, 'output.txt');

var s = fs.createWriteStream(outputPath);

s.on('open', () => {
	spawn(process.execPath, [barPath], {
		stdio: [null, s, null]
	});
});
// child.js
console.log('hello world');

@SamVerschueren
Contributor

Damn, good catch @jamestalmage! Thanks for elaborating! It's odd behaviour though; you just want to pass in a stream without having to handle the open event yourself.

@jamestalmage
Contributor Author

Yes. Odd behavior indeed

@sberney

sberney commented Oct 12, 2018

This has me banging my head against the wall, trying to support something similar.

I have run into an external program that doesn't print out any output until execution is finished if it detects that it's not in a tty. FYI I've been playing around with node-pty to fool it, which works; but there are currently no signals that the command has finished. At least as far as I can tell.

Anyway, good luck; also awesome stuff, thanks!

@aguegu

aguegu commented Jan 19, 2023

Thanks, @jamestalmage , waiting for this open event is really a good catch.

But how to use PassThrough stream in this case?

Ideally PassThrough is writable and it can avoid all the filesystem IO. But it has no open or ready event, and setting it as the stdout option results in a The argument 'stdio' is invalid error.

@ehmicky
Collaborator

ehmicky commented Mar 14, 2023

We just added 3 methods related to piping:

await execa('echo', ['unicorns']).pipeStdout(stream);
await execa('echo', ['unicorns']).pipeStderr(stream);
await execa('echo', ['unicorns'], {all: true}).pipeAll(stream);

For stdin, the input option can be used with a stream.

Although that's a different solution than the one discussed here (allowing stream for the stdio option), it seems like it solves the same problem, so this issue might be closed?

@ehmicky
Collaborator

ehmicky commented Aug 17, 2023

@sindresorhus Should we close this issue?

@ehmicky
Collaborator

ehmicky commented Dec 17, 2023

See also #617
