Why does node/io.js hang when printing to the console inside a loop? #1741

Open
pgkos opened this Issue May 20, 2015 · 18 comments

Projects

None yet

6 participants

@pgkos
pgkos commented May 20, 2015

Sorry if it is a dumb question/issue, but why does this code:

for (i = 0; i < 1000000; i++) {
  console.log('abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc' + i);
}

hang node for a few seconds (at around i = 1500) before resuming again? During the hang the amount of memory used by the node process increases very fast and the CPU usage is very high during that time.

On the other hand, if I write:

for (i = 0; i < 1000000; i++) {
  console.log(i);
}

then the node process does not hang at all.

The reason I am asking is that I have a piece of code which involves printing inside a loop, and it also hangs, but for much longer (like 30-60 seconds, before resuming again). The code prints the paths of all files in a directory (walking recursively): https://gist.github.com/pgkos/f0a650daf56aa49899e9.

The issue occurs on both io.js v2.0.2 and node.js v0.12.2, on Ubuntu 64-bit.

@pgkos pgkos changed the title from Why does node/io.js hang when printing to a console inside a loop? to Why does node/io.js hang when printing to the console inside a loop? May 20, 2015
@bnoordhuis
Member

I can think of a couple of reasons. The first snippet creates a million strings (i.e. does a million heap allocations) whereas the second one does not; you can verify that with the --trace_gc flag.

What's more, if the first argument to console.log() is a string, it's considered a format string that needs to be parsed. In the first snippet, that works out to about 60 MB of memory that is scanned. You can avoid that by using process.stdout.write() instead.

Apropos memory usage, console.log() and process.stdout.write() are mostly asynchronous under the hood. When you call it, a 'write to the console' request is put together that is then handed off to libuv.

It's not until the next tick of the event loop that the memory of that request is freed; that includes the actual string that is written. In both snippets, one million requests are in flight but in the first one, they're much bigger.

@mscdex mscdex added the console label May 20, 2015
@ChALkeR
Member
ChALkeR commented May 20, 2015

Actually, + i is not needed.there.

var x = process.hrtime();
for (var i = 0; i < 1000000; i++) {
    console.log('abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc1234567890');
}
console.log(process.hrtime(x));

Is almost equally slow.

@ChALkeR
Member
ChALkeR commented May 20, 2015

@bnoordhuis

What's more, if the first argument to console.log() is a string, it's considered a format string that needs to be parsed. In the first snippet, that works out to about 60 MB of memory that is scanned. You can avoid that by using process.stdout.write() instead.

See #1749. It does not fix this issue, but makes things slightly better.

@Fishrock123
Member

Simple answer: the code in question blocks the process for a bit while the loop is running~, and libuv doesn't print it until the next tick~.

@ChALkeR
Member
ChALkeR commented May 25, 2015

@Fishrock123 I would say that (almost) everything is printed in the same tick.

The cause of the lag is gc going crazy at some point and hanging everything for a second or so.

Part of this single gc loop:

[16372]      251 ms: Scavenge 9.3 (45.0) -> 6.8 (46.0) MB, 2.3 ms [allocation failure].
[16372]      263 ms: Scavenge 9.8 (46.0) -> 7.7 (47.0) MB, 3.2 ms [allocation failure].
[16372]      403 ms: Scavenge 10.8 (47.0) -> 8.6 (47.0) MB, 132.1 ms [allocation failure].
[16372]      428 ms: Scavenge 15.6 (47.0) -> 10.6 (48.0) MB, 5.6 ms [allocation failure].
[16372]      452 ms: Scavenge 16.5 (48.0) -> 12.3 (50.0) MB, 7.6 ms [allocation failure].
[16372]      738 ms: Scavenge 18.5 (50.0) -> 14.1 (52.0) MB, 254.8 ms (+ 13.7 ms in 94 steps since last GC) [allocation failure].
[16372]      795 ms: Mark-sweep 22.0 (52.0) -> 15.7 (54.0) MB, 15.8 ms (+ 32.0 ms in 214 steps since start of marking, biggest step 0.4 ms) [GC interrupt] [GC in old space requested].
[16372]      846 ms: Scavenge 29.2 (54.0) -> 19.7 (56.0) MB, 11.0 ms [allocation failure].
[16372]      892 ms: Scavenge 31.5 (56.0) -> 23.1 (60.0) MB, 12.9 ms [allocation failure].
[16372]      934 ms: Scavenge 35.4 (60.0) -> 26.7 (63.0) MB, 12.4 ms [allocation failure].
[16372]      976 ms: Scavenge 38.8 (63.0) -> 30.2 (67.0) MB, 13.1 ms [allocation failure].
[16372]     1019 ms: Scavenge 42.4 (67.0) -> 33.7 (71.0) MB, 12.9 ms [allocation failure].
[16372]     1060 ms: Scavenge 45.9 (71.0) -> 37.3 (74.0) MB, 12.3 ms [allocation failure].
[16372]     1103 ms: Scavenge 49.5 (74.0) -> 40.8 (78.0) MB, 12.5 ms [allocation failure].
[16372]     1145 ms: Scavenge 53.0 (78.0) -> 44.3 (81.0) MB, 13.1 ms [allocation failure].
[16372]     1187 ms: Scavenge 56.5 (81.0) -> 47.9 (85.0) MB, 12.5 ms [allocation failure].
[16372]     1229 ms: Scavenge 60.1 (85.0) -> 51.4 (89.0) MB, 12.7 ms [allocation failure].
[16372]     1272 ms: Scavenge 63.6 (89.0) -> 54.9 (92.0) MB, 12.7 ms [allocation failure].
[16372]     1315 ms: Scavenge 67.1 (92.0) -> 58.5 (96.0) MB, 12.7 ms [allocation failure].
[16372]     1388 ms: Scavenge 70.7 (96.0) -> 62.0 (99.0) MB, 15.9 ms (+ 24.6 ms in 186 steps since last GC) [allocation failure].
[16372]     1533 ms: Scavenge 74.2 (99.0) -> 65.6 (103.0) MB, 19.9 ms (+ 91.8 ms in 186 steps since last GC) [allocation failure].
[16372]     1611 ms: Scavenge 77.8 (103.0) -> 69.1 (107.0) MB, 16.2 ms (+ 28.4 ms in 186 steps since last GC) [allocation failure].
[16372]     1683 ms: Scavenge 81.3 (107.0) -> 72.6 (110.0) MB, 13.5 ms (+ 26.9 ms in 186 steps since last GC) [allocation failure].
[16372]     1755 ms: Scavenge 84.8 (110.0) -> 76.2 (114.0) MB, 13.7 ms (+ 25.6 ms in 186 steps since last GC) [allocation failure].
[16372]     1828 ms: Mark-sweep 80.9 (114.0) -> 77.5 (117.0) MB, 49.2 ms (+ 208.7 ms in 1003 steps since start of marking, biggest step 68.1 ms) [GC interrupt] [GC in old space requested].
[16372]     1874 ms: Scavenge 91.8 (117.0) -> 81.6 (119.0) MB, 10.7 ms [allocation failure].
[16372]     1917 ms: Scavenge 93.2 (119.0) -> 85.0 (123.0) MB, 15.0 ms [allocation failure].
@ChALkeR
Member
ChALkeR commented May 25, 2015

@Fishrock123 FYI, I can reproduce this in async:

function f(max) {
    for (var i = 0; i < max; i++) {
        console.log('abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc1234567890');
    }
}

function g() {
    f(100);
    process.nextTick(g);
}
g();

Run with --trace-gc.

@ChALkeR
Member
ChALkeR commented May 25, 2015

Btw, I saw misconceptions about sync code.
Please note:

  1. It does not block any type of gc runs from executing. Both scavenge and mark-sweep runs are automatically executed in the middle of a tick. No need to manually trigger gc().
  2. It does not block output by itself.
@ChALkeR
Member
ChALkeR commented May 25, 2015

Another example:

function g() {
    process.stdout.write('abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc1234567890\n');
    process.nextTick(g);
}
g();

It's even far worse in this case. When gc starts, it never finishes running (runs over and over again with no actual stdout output), memory usage quickly goes to 1.4 GiB and the process quits with «Allocation failed - process out of memory» error.

It looks as if garbage collector blocks output from happening, but the js code still runs, process.stdout.write gets called and overflows the output buffer.

@Fishrock123 Fishrock123 reopened this May 25, 2015
@ChALkeR
Member
ChALkeR commented May 25, 2015

If you don't like recursive process.nextTick, let's setTimeout for 1ms every 1e6 iterations.

var i = 0;
function g() {
    console.log('abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc1234567890');
    i++;
    if (i % 1000000 === 0) {
        setTimeout(g, 1);
    } else {
        process.nextTick(g);
    }
}
g();

Takes more time, but still crashes.

Another example (setTimeout for 0 ms every 1e5 iterations):

var i = 0;
function g() {
    console.log('abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc1234567890');
    i++;
    if (i % 100000 === 0) {
        setTimeout(g, 0);
    } else {
        process.nextTick(g);
    }
}
g();

Another one:

var s = 'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc1234567890';
function g() {
    for (var i = 0; i < 1e5; i++) {
        console.log(s);
    }
    setTimeout(g, 0);
}
g();

The numbers might require some tuning if you have a system that is faster than mine.

@ChALkeR
Member
ChALkeR commented May 25, 2015

Asked on IRC by @Fishrock123 :

can you try allocating the sting outside the loop and using a reference to it?

Doesn't change anything

@ChALkeR
Member
ChALkeR commented May 25, 2015

Btw, there is no issue when stdout is piped somewhere (for example, iojs test.js > /dev/null, iojs test.js > out.txt, iojs test.js | wc). Probably because in this case process.stdout is blocking.

@ChALkeR
Member
ChALkeR commented May 25, 2015

One approach to «fixing» this would be to actually block output if the buffer is too large until it is dumped.

Almost as it is done in the case when output is piped, but only when the buffer gets overfilled and only until it is freed to a reasonable size.

Btw, that could actually make output faster, because with the current implementation it could make garbage collector go crazy (have you noticed gc runs that took 254 ms above?) when the output buffer grows too big.

A side question: why is output blocking when it is piped? Could it be for the same reason (everything going boom when it is processed too slowly)? Or are there other reasons?

@Fishrock123
Member

Probably because in this case process.stdout is blocking.

Not unless you're on win32. See #1771, could you try that patch? (It might fix this?)

@ChALkeR ChALkeR added the memory label Aug 17, 2015
@ChALkeR ChALkeR removed the question label Sep 20, 2015
@GordonDiggs GordonDiggs added a commit to codeclimate/codeclimate-eslint that referenced this issue Jan 12, 2016
@GordonDiggs GordonDiggs Use process.stdout.write instead of console.log and increase batch size
`console.log` does some extra work to construct the output, and because
it's already a string, we can save some memory by using
`process.stdout.write`. See this issue for more information:
nodejs/node#1741 (comment)

This is an attempt to remedy the OOM issues when rendering all the
issues. Increasing the batch size will also help by reducing the number
of GCs.
0ee3117
@Fishrock123 Fishrock123 added process and removed console labels Jan 13, 2016
@pgkos
pgkos commented Mar 1, 2016

Btw, placing this piece of code before the loop with console.log:

process.stdout._handle.setBlocking(true);

completely fixes the problem.

@jasnell
Member
jasnell commented Mar 9, 2016

@ChALkeR @Fishrock123 ... does this one need to stay open?

@ChALkeR
Member
ChALkeR commented Mar 9, 2016

@jasnell To me, this still looks like an issue that has to be solved. Very low priority, though.
I'm keeping this one in mind.

@yairchu yairchu added a commit to lamdu/lamdu that referenced this issue Jun 6, 2016
@yairchu yairchu rts.js: use process.stdout.write rather than console.log
console.log's first argument is a format string according to nodejs/node#1741 (comment)
feed920
@yairchu yairchu added a commit to lamdu/lamdu that referenced this issue Jun 6, 2016
@yairchu yairchu rts.js: enable blocking for stdout
see nodejs/node#1741 (comment) for reasoning
4f8be7f
@Fishrock123
Member

A side question: why is output blocking when it is piped? Could it be for the same reason (everything going boom when it is processed too slowly)? Or are there other reasons?

It shouldn't be?

@ChALkeR Is this a bug in libuv or is it a v8 memory issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment