Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

child_process: give forks stdout and stderr #2454

Closed
wants to merge 1 commit into from

Conversation

mmalecki
Copy link

@mmalecki mmalecki commented Jan 3, 2012

Previously using these streams was not possible because of making them
use given file descriptors. This solution is more generic - it uses
pipe function to pipe fork's output into parent's stdout and
stderr unless silent: true is set as an option.

This doesn't break any existing API and it makes fork more compatible
with spawn.

Please note that fork's stdout and stderr aren't meant to be used as
communication channels - use fork(...).send() instead.

Fixes #2442.

It'd be awesome to backport it to 0.6, it can be merge easily (I can provide another pull request for that). I find current behavior inconsistent with other APIs (also, documentation doesn't mention it).

/cc @bmeck

Previously using these streams was not possible because of making them
use given file descriptors. This solution is more generic - it uses
`pipe` function to pipe fork's output into parent's `stdout` and
`stderr` unless `silent: true` is set as an option.

This doesn't break any existing API and it makes `fork` more compatible
with `spawn`.

Please note that fork's `stdout` and `stderr` aren't meant to be used as
communication channels - use `fork(...).send()` instead.

Fixes nodejs#2442.
@mmalecki mmalecki mentioned this pull request Jan 3, 2012
@indexzero
Copy link

+1 to exposing child.stdout and child.stderr, but piping it to process.stdout and process.stderr isn't consistent with child_process.spawn.

@ry Can't we just set child.stdout and child.stderr as we would with .spawn if options.silent is not set.

@mmalecki
Copy link
Author

mmalecki commented Jan 3, 2012

We do expose them. We just pipe things to process.stdout and process.stderr by default. With this patch you can:

var fork = require('child_process').fork,
    child = fork('my-cool-script');

child.stdout.on('data', function (chunk) {
  console.log('got a data from my cool script', chunk);
});

indexzero added a commit to foreversd/forever that referenced this pull request Jan 3, 2012
@indexzero
Copy link

@mmalecki Oh. My mistake. Lets make this happen then. Consistency in node child_process APIs is paramount imho.

indexzero added a commit to foreversd/forever that referenced this pull request Jan 3, 2012
@stolsma
Copy link

stolsma commented Jan 3, 2012

+1 Yes, this change would be very useful! For some projects I need the message channel and don't want the forked child to output to stdout / stderr but to my own logging code in the parent.

@AndreasMadsen
Copy link
Member

@mmalecki this is exactly what I had in mind.

But we need to update the silent testcase after this has landed, since it will now be true in any case. I would like to do it but I just don't have the time.

@@ -181,6 +177,11 @@ exports.fork = function(modulePath, args, options) {

var child = spawn(process.execPath, args, options);

if (!options.silent) {
child.stdout.pipe(process.stdout, { end: false });
child.stderr.pipe(process.stderr, { end: false });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you measured the performance impact?

A direct write to stdout / stderr is one syscall + one (user to kernel space) mempcy. Piping the output means three syscalls + three memcpys and a task switch if the child and parent run on the same CPU.

It could make a considerable difference for applications that print out megabytes of data.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't aware of the performance implications, maybe two options is prudent on which pipes stdout and stderr and one which shares them. Sharing them logging a real pain because there is no way to mulitplex the merged log into several files. If you run a lot of worker processes this is important imho

@AndreasMadsen
Copy link
Member

I have made a quick bechmark: https://gist.github.com/1558988

@mmalecki
Copy link
Author

mmalecki commented Jan 4, 2012

Thanks Andreas, I was about to write something like that. Seems like it doesn't hurt performance that much (it's approximately 40 ms slower with pipe).
But you know guys, I've heard that this node thing is quite fast.

@AndreasMadsen
Copy link
Member

@mmalecki My opinion is to make pipe default but also allowing use of customFds (except stdin) in .fork it will give userland the final chose. In most cases I don't think people will write 10 MB data out using stdout they will properly create a fileStream then.

customFds will be: [-1, customFds[1] || -1, customFds[2] || -1]
if silent is false then data won't be piped.

If userland want this very fast they can do:
customFds = [null, 1, 2]

indexzero added a commit to foreversd/forever that referenced this pull request Jan 5, 2012
@ry
Copy link

ry commented Jan 6, 2012

we want child_process.fork to act as threads would - they should share stdout and stderr - so this PR is not acceptable.
Discussed with @mmalecki on IRC - he is going to make it an option to have stdio pipes for fork.

@ry ry closed this Jan 6, 2012
@indexzero
Copy link

Fundamentally I think this is the wrong decision. They are not threads, they are child processes and many people will want to multiplex their log files accordingly.

@ry
Copy link

ry commented Jan 9, 2012

@indexzero except when they aren't processes. There will be options for having pipes for stdio - but it's not going to be default - by default they share stdio.

@indexzero
Copy link

@ry @isaacs spent some time explaining this to me in some depth and how isolates is going to change things.

There are some real-world cases where you want to spawn a child process with IPC always, as opposed to a new isolate. A good example of this is haibu and haibu-carapace, also some forever-based process monitoring.

This is why I'm going to stand-by my complaint of the inconsistency of fork with spawn. It seems more logical to have three methods:

  • spawn: Spawns a child process
  • fork: Eventually handled by isolates
  • spawn-with-ipc: (terrible name, I know). Something that always spawns a child process with IPC. This could probably be better handled through an option to the original spawn method:
  var childWithIpc = require('child_process').spawn('some/node-like/thing/like/bin/coffee', 'some/file.js', {
    ipc: true
  });

Just my $0.02

@mmalecki
Copy link
Author

+1 to @indexzero's idea. IMO it'd save people a lot of confusion.

@isaacs
Copy link

isaacs commented Jan 10, 2012

Starting as a separate process will probably always be the default, unless there is a significant performance win on some platforms. child_process.spawn will always spawn a child process, not an isolate.

child_process.fork is for starting another javascript script file with node. That's the semantics, and by limiting it to that, we open the door to having a nice way to use isolates.

Coffee-script should port it, and provide a coffeeScript.fork method or something, or you can just use child_process.spawn, and manually set up the json channel yourself.

This is highly solveable in userland.

@indexzero
Copy link

And already solved: https://github.com/stolsma/node-fork

That's not really the point though. Shouldn't we be striving for the best core APIs possible which limit the necessity for such user-land solutions? This API is very fundamental to me, passing the buck (to coffeescript for example) doesn't really do it for me. You know me, I don't use coffeescript; but people use it with forever. This is a problem I've seen in real-world usage.

At the very least, all of the internals of setting up the JSON channel should exposed through the child_process module to allow user-land to make these decisions instead of having to copy/paste core code as we had to do in node-fork.

@mmalecki
Copy link
Author

So, we should all just agree with core API being half-baked and move on to the userland? I don't find this solution optimal. If something is in the core, it should be the best one.

I like what @indexzero proposed - we could provide a spawn option called ipc which would set up an IPC channel and fork method which would use spawn(..., { ipc: true }) or isolates. This approach would allow us to abstract things a bit (and keep things backwards-compatible).

I'll be very happy to pull request that. So can we please stop nerdfighting and actually move on to code?

@bnoordhuis bnoordhuis reopened this Feb 4, 2012
@isaacs
Copy link

isaacs commented Feb 6, 2012

Here's the plan:

The option will be stdio. It can take one of 4 values:

  1. 'null' black hole (Q: should we allow an actual null value here, rather than the string 'null'? Not sure.)
  2. 'stream' expose a stream object
  3. 'inherit' inherit from the parent, don't expose a JS stream object
  4. An object with keys in, out, and err, which can each be one of 'null', 'stream' or 'inherit'.

The default for fork will be inherit. The default for spawn will be stream.

Closing this pull request. @mmalecki has offered to send a new one.

@isaacs isaacs closed this Feb 6, 2012
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants