Cluster Awareness #3

Closed
dmuth opened this issue Apr 9, 2014 · 2 comments

dmuth commented Apr 9, 2014

Let's say you have multiple processes logging to the same file:

var cluster = require("cluster");
var log = require("bristol");

var num_cpus = 2;

if (cluster.isMaster) {
    for (var i = 0; i < num_cpus; i++) {
        cluster.fork();
    }

} else {
    log.addTarget("file", { file: "log.txt" });
    log.addTarget("console")
        .withFormatter("human");
    log.info(process.pid, "Lorem Ipsum");

}

Then run the script, and while the script is running, fire up lsof:

$ lsof ./log.txt 
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
node    49511 doug   13w   REG    1,2      788 66444792 ./log.txt
node    49512 doug   13w   REG    1,2      788 66444792 ./log.txt

Two processes, two different file descriptors, same file. Under UNIX, there is no guarantee that writes to a file are atomic, so part of a line from one process can land in the file before another process writes its own line. The output ends up interleaved, creating a "staircase" effect.

Opening the file in the master process and then spawning the child processes doesn't help. (I tried.)

The best solution I've found so far is to have a single process do all of the file writes. The child processes hand their log data to the master with process.send(), as follows:

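//
// In the master process: messages from a child arrive on the worker
// object returned by cluster.fork(), not on the master's own `process`
//
var worker = cluster.fork();
worker.on('message', function(m) {
   // Write to file
});

//
// In the child process, JSON data structure is sent to the master process
//
process.send({ foo: 'bar' });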

There are probably other ways to accomplish the same. This particular technique worked for me in my projects.

@TomFrost
Owner

Good workaround. Unfortunately, this is the case with pretty much every logging lib that keeps an open WriteStream (and frankly, a lib that does one-off writes instead would take such a performance hit, and put such a drain on the thread pool, that it wouldn't be worth using).

However, I'm keeping this incident open because I have a decent idea to get child processes' Bristol instances to autoconfigure according to the parent process, and automate the passing of messages. I'll comment again when that's in.

@TomFrost
Owner

I'm going to close this for now. While there's a good solution to this, it would appear to be not worth the effort at this point. There's a strong consensus that for most use cases, if you're running Node on a machine with multiple CPUs, you should either:

  • (if you're on a cloud provider) spend the same amount of money to turn that machine into multiple machines with single CPUs and run multiple instances of your app, or:
  • (if you're on discrete hardware) use a hypervisor or container scheduler to run multiple instances of your app on that machine as though they are individualized machines.

With the above, you multiply your durability for literally free, and gain the advantage of a less complex codebase as a result.

I'll reopen if this becomes a popular request!
