This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Memory leak in stream maybe? #2175

Closed
devdazed opened this issue Nov 23, 2011 · 16 comments

Comments

@devdazed

If you run this with --trace-gc and --expose-gc, you will see the memory just continue to grow:

for (var i = 0; i < 1000000; i++) {
  process.stdout.write('this should not be leaking\n');
  gc();
}
@devdazed
Author

Additionally, if you run this, you will eventually get an out-of-memory crash:

FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory

while(true){
  process.stdout.write('this should not be leaking\n');
}

@tshinnic

I know similar reports have been made before, and I think they all followed the same pattern. I can only find this one report; perhaps the other mentions were mailing-list postings: "memory consumption in node 0.5+" #1990

I believe the consensus was that memory would not be released/recovered unless you allowed for some idle time. The gist mentioned above would throw in a nextTick() call every so often.

Check the other issue, check that gist, and see if your situation isn't similar. This probably ought to be a FAQ: "memory leak" when stdout/console writes run without a pause or a nextTick() call.
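A minimal sketch of that workaround (my own code, not from the gist; the function name and parameters are hypothetical): write in bounded batches and yield to the event loop with process.nextTick() between them, giving Node a chance to release completed write requests.

```javascript
// Hypothetical sketch of the "throw in a nextTick() every so often" idea.
// Writes `total` copies of `chunk` to `stream`, `batchSize` per event-loop
// tick, then calls `done` with the number of chunks written.
function writeInBatches(stream, chunk, total, batchSize, done) {
  var written = 0;
  function batch() {
    // Write one batch synchronously...
    for (var i = 0; i < batchSize && written < total; i++, written++) {
      stream.write(chunk);
    }
    // ...then yield to the event loop before continuing.
    if (written < total) process.nextTick(batch);
    else done(written);
  }
  batch();
}

// Example: a million writes, 10k per tick (commented out to avoid spam):
// writeInBatches(process.stdout, 'this should not be leaking\n', 1000000, 10000, function () {});
```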

@defunctzombie

The problem is that you should not have to allow for idle time to prevent your program from crashing (certainly not in such a trivial case). If I write a Python program that just prints to the screen forever, it will not run out of memory. I believe this is indicative of poor underlying behavior.

On a side note, even with idle time, memory still grows and grows. And on high-load systems, counting on "idle" time is not really an option. If there are no leaks in the code, it should not be crashing. The VM has the capability to stop execution and free memory, so that is not the problem here.

@bnoordhuis
Member

I'm closing this because it's not really a bug.

@devdazed: You are writing to stdout as fast as you can, faster than stdout can drain it, so write requests get backlogged and memory usage grows. Example:

// Run with `node --expose-gc script.js`
var s = '0123456789';
for (var i = 0; i < 17; ++i) s = s + s; // double 17 times -> ~1.3 MB string
while (true) {
  process.stdout.write(s);
  gc(true); // collect and compact
}

node --expose-gc tmp/gc.js > /dev/null # direct to /dev/null - fast, constant memory

node --expose-gc tmp/gc.js | cat > /dev/null # to pipe - slow, memory keeps growing

@shtylman: V8 is not interruptible, the best you can do is terminate the script.

@defunctzombie

@bnoordhuis you are incorrect to think this is not a bug. The following trivial Python program does the same thing and uses constant memory:

while True:
    print "this should not be leaking"

I will also point out that I compiled the sample shell program from the V8 source and ran the same infinite loop; it DOES NOT grow in memory. This is very much a bug in Node.js, and it really makes me wonder what other things are not being done right. I am willing to help track this down but need some pointers as to where to look.

In addition to the memory growth, if I run the infinite loop inside the Node shell, I cannot exit with Ctrl+C. I can easily exit the sample V8 shell as well as the Python shell.

@bnoordhuis
Member

Well... maybe. If you want a hint on where to start debugging, src/node.js sets process.stdout to a different type of stream depending on what stdout is (tty, pipe, file).

Re Ctrl+C: Node sets the terminal to raw mode, the v8 shell (and presumably python) does not.
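A minimal sketch of what raw mode means for Ctrl+C (my own illustration, assuming the setRawMode API): in raw mode the terminal stops turning Ctrl+C into SIGINT and instead delivers the raw byte 0x03 (ETX), so the process must handle it itself.

```javascript
// Illustrative sketch: in raw mode, Ctrl+C arrives as the byte 0x03
// instead of raising SIGINT, so we re-implement the usual exit behavior.
// Guarded so it only runs when stdin is actually a TTY.
if (process.stdin.isTTY) {
  process.stdin.setRawMode(true);
  process.stdin.on('data', function (buf) {
    if (buf[0] === 0x03) {             // ETX: what Ctrl+C sends in raw mode
      process.stdin.setRawMode(false); // restore the terminal
      process.exit(0);
    }
  });
}
```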

@bnoordhuis bnoordhuis reopened this Nov 24, 2011
bnoordhuis added a commit to bnoordhuis/node that referenced this issue Nov 24, 2011
@bnoordhuis
Member

I grudgingly and reluctantly admit there may be a bug in how Node currently handles TTY and pipe I/O.

In a nutshell, a write request is queued for each call to process.stdout.write(). The actual write(2) syscall succeeds immediately but the write request object isn't released until the next tick of the event loop - which never comes in a busy loop.

@defunctzombie

We all have to face reality sometime ;) Do you think the patch you provided is the way to go? Or is there a more long term solution?

@devdazed
Author

The patch didn't affect the memory consumption of either of the examples above.

@devdazed
Author

Also, I'm not sure nextTick will actually clean up. I tried putting a process.nextTick() right above the gc() call and it had no effect on memory consumption.

@bnoordhuis
Member

@devdazed: bnoordhuis/node@27389bb only fixes the case where stdout is a pipe, not a TTY.

For process.nextTick() to work, there needs to be an actual next tick. Try this:

function spam()
{
  process.stdout.write('foobarbaz');
  process.nextTick(spam);
}

spam();

@shtylman: Don't know yet. Maybe I'll bypass the JS layer altogether.

@acoomans

acoomans commented Mar 1, 2012

There's definitely a problem here.

When I connect to the Twitter streaming API, Node.js starts filling up memory at around 0.2-0.5 MB/s (my server's memory fills up in 10 minutes!)

Calling the following hack helps me stay around 20 MB:

function mem_hack() {
  process.nextTick(function () {
    gc(true);
  });
}

var req = https.request(options, function (res) {
  // `res` is already the response; no extra 'response' listener needed
  res.on('data', function (data) {
    mem_hack();
  });
});

This is admittedly dirty, but it works until the issue gets fixed.

@bnoordhuis
Member

@acoomans: That's an altogether different issue. Can you trim it down to a standalone test case and open a new bug report?

@PaulMougel

Shouldn't this bug be closed? It no longer seems relevant.

OP's first version:

while (true)
    process.stdout.write('foobarbaz');

does indeed make the program leak memory, but that is expected: we write to the stream without taking the internal buffer state into account.

A naive second version:

function spam() {
    if (process.stdout.write('foobarbaz')) spam();
    else process.stdout.once('drain', spam);
}
spam();

will result in a stack overflow, which is expected if the output stream is fast enough that nothing ever buffers: write() keeps returning true, so spam() recurses without bound.

Using setImmediate:

function spam() {
    if (process.stdout.write('foobarbaz')) setImmediate(spam);
    else process.stdout.once('drain', spam);
}
spam();

there is no memory leak, but the drain event never fires on my machine. To see drain fire, we can manually throttle the stream:

var Throttle = require('throttle'); // npm `throttle` module
var output = new Throttle(10);      // at most 10 bytes per second
output.pipe(process.stdout);

function spam() {
    if (output.write('foobarbaz')) spam();
    else output.once('drain', function() {process.stdout.write('O\n'); spam()});
}

spam();

There is no memory leak, no stack overflow, and the drain event fires (the 'O' markers appear in the output).

I don't see any bad behavior in any of those examples.

@trevnorris

@PaulMougel Thanks for taking the time to create these examples. You're right, this should be closed.

@GHowlett

GHowlett commented Dec 3, 2015

@PaulMougel Is there also a way to prevent the memory leak when using stream.pipe()? The pipe() function is supposed to manage flow control automatically, but the following example still backlogs stdout.

var stream = require('stream');
var readable = new stream.Readable({
  read: function () {
    if (!this.counter) this.counter = 1;
    this.push(this.counter++ + '\n'); // one chunk per _read() call
  }
});
readable.pipe(process.stdout);
