New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OutOfMemory due to results in chain retained #1529

Closed
peterlundberg opened this Issue May 30, 2018 · 4 comments

Comments

Projects
None yet
3 participants
@peterlundberg

peterlundberg commented May 30, 2018

  1. What version of bluebird is the issue happening on?
    3.5.1

  2. What platform and version? (For example Node.js 0.12 or Google Chrome 32)
    Node v8.11.1 on osx Darwin Kernel Version 17.5.0 x86_64

  3. Did this issue happen with earlier version of bluebird?
    Not sure


Using promise chains can cause unexpected OutOfMemory situations as the partial results in the chain seam to be retained until the entire chain is settled.

For example in this promise-chain-retain.js:

// global.Promise = require('bluebird');
console.log('Using', Promise.version || 'vanilla Promises');
logMemoryUse('before all');

Promise.resolve()
.then(() => {
  logMemoryUse('build result');
  const rows = [];
  for (let i = 0; i < 1000000; i += 1) {
    rows.push([`Example ${i}`, i, i * 2]);
  }
  return Promise.resolve({ rows });
})
.then((result) => {
  logMemoryUse('use result');
  // this then gets the large result
  return { len: result.rows.length };
})
.then((summaryResult) => {
  logMemoryUse('use summary');
  // this then should not need the 'result' and be free to reuse heap
  console.log(summaryResult);
})
.catch((err) => {
  console.log('caught error: ', err);
});

setTimeout(() => logMemoryUse('after all'), 3000);

function logMemoryUse(prefix = '') {
  global.gc();
  global.gc();
  const toMiB = b => (b / 1048576).toFixed(2);
  const mem = process.memoryUsage();
  console.log([
    'memoryUsage:',
    prefix.padEnd(20),
    `rss ${toMiB(mem.rss)} Mb`.padStart(20),
    // `heapTotal ${toMiB(mem.heapTotal)} Mb`.padStart(20),
    `heapUsed ${toMiB(mem.heapUsed)} Mb`.padStart(20),
  ].join('  '));
}

Using node 8 vanilla promises this works as expected in a chain and in "use summary" step the big result is no longer retained and can be GC:ed:

❯ NODE_ENV="production" node --expose-gc promise-chain-retain.js
Using vanilla Promises
memoryUsage:  before all                    rss 21.21 Mb    heapUsed 3.95 Mb
memoryUsage:  build result                  rss 21.28 Mb    heapUsed 3.94 Mb
memoryUsage:  use result                   rss 192.95 Mb    heapUsed 151.17 Mb
memoryUsage:  use summary                  rss 134.85 Mb    heapUsed 4.20 Mb <== GC worked!
{ len: 1000000 }
memoryUsage:  after all                     rss 44.03 Mb    heapUsed 4.22 Mb

When using Bluebird as Promise things do not work as expected. The result of the first then is still retained in "use summary" step.

❯ NODE_ENV="production" node --expose-gc promise-chain-retain.js
Using 3.5.1
memoryUsage:  before all                    rss 25.03 Mb    heapUsed 4.77 Mb
memoryUsage:  build result                  rss 25.12 Mb    heapUsed 4.75 Mb
memoryUsage:  use result                   rss 194.18 Mb    heapUsed 152.00 Mb
memoryUsage:  use summary                  rss 195.88 Mb    heapUsed 152.00 Mb  <== WHY?
{ len: 1000000 }
memoryUsage:  after all                    rss 113.99 Mb    heapUsed 5.05 Mb

Thus long chains in a request do not GC as expected if resolving large values. This example is simplified but imagine building a large object, converting to JSON, compressing and then sending to an external service - if at any step resolving the large object will not allow it to be GC:ed until the entire chain is done. Essentially at least doubling the memory requirement (or OOM given contained resources and large objects).

@bergus

This comment has been minimized.

Contributor

bergus commented Aug 1, 2018

until the entire chain is done

My quick guess would be that it doesn't have to do with the chain ending, but rather with your callbacks not doing anything asynchronous - bluebird might schedule all of them on the same tick, and keep the promises until the current round of synchronous work is over. Can you test by introducing a .delay(…) in the chain?

@benjamingr

This comment has been minimized.

Collaborator

benjamingr commented Aug 1, 2018

Bluebird keeps a circular buffer around as a queue that grows the more callbacks you have - its size is constant and it stays in the background for fast scheduling. It never shrinks but the memory isn't "leaked" - that is - the next time you'll schedule a promise it will be reused.

@benjamingr

This comment has been minimized.

Collaborator

benjamingr commented Aug 1, 2018

@peterlundberg

This comment has been minimized.

peterlundberg commented Aug 1, 2018

Introducing a .delay(1) after "use result" in the script above will allow GC and "use summary" will not have the large heapUsed value. A useful workaround.

The queue probably helps (or helped in earlier runtime versions) performance, but if this is what keeps a reference it has the effect that one must be careful of what is in closure of the callbacks if using large part of the available heap for one chain. This is not uncommon in for instance serverless microservices processing data.

I am a complete noob to bluebird internals, but also wonder why there is a reference kept as shift() appears to dereference the entry in the queue (as used from async). Perhaps there is another problem (related to the scheduling?)

I am not sure what an appropriate course of action is, but this is quite a gotcha so perhaps at least worth adding somewhere in the doxs? (and at least this issue might help others for now)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment