Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vm Script memory leak in Node.js 14 / 16 #40014

Open
XadillaX opened this issue Sep 6, 2021 · 24 comments
Open

vm Script memory leak in Node.js 14 / 16 #40014

XadillaX opened this issue Sep 6, 2021 · 24 comments
Labels
confirmed-bug Issues with confirmed bugs. v8 engine Issues and PRs related to the V8 dependency. vm Issues and PRs related to the vm subsystem.

Comments

@XadillaX
Copy link
Member

XadillaX commented Sep 6, 2021

Version

14 / 16

Platform

x64 macOS / Linux

Subsystem

vm

What steps will reproduce the bug?

'use strict';

let times = 0;

function run() {
  let str = 'var a = "';
  for (let i = 0; i < 100 * 1024 / 18; i++) str += Math.random().toString();
  str += '";';

  const script = new (require('vm').Script)(str);
  times++;
  if (times % 1000 === 0) console.log(times);
}

(async () => {
  while (true) {
    run();

    await new Promise(resolve => {
      setTimeout(() => {
        resolve();
      }, 10);
    });

    gc();
  }
})();

Run with:

$ node --expose-gc --max-heap-size=100 test.js

It will occur heap OOM under Node.js 14 and 16, and Node.js 12 is OK.

<--- Last few GCs --->                                                         
                                                                                                                                                               
[1430950:0x55695b1dc4b0]     6372 ms: Mark-sweep 72.0 (91.6) -> 71.6 (91.8) MB, 4.3 / 0.0 ms  (average mu = 0.728, current mu = 0.778) testing GC in old space requested
[1430950:0x55695b1dc4b0]     6388 ms: Mark-sweep 72.2 (91.8) -> 71.8 (92.1) MB, 3.1 / 0.0 ms  (average mu = 0.760, current mu = 0.804) testing GC in old space requested
[1430950:0x55695b1dc4b0]     6403 ms: Mark-sweep 72.4 (92.1) -> 72.0 (92.3) MB, 2.9 / 0.0 ms  (average mu = 0.781, current mu = 0.806) testing GC in old space requested
                                                                                                                                                               
                                                                               
<--- JS stacktrace --->                                                                                                                                        
                                                                               
FATAL ERROR: MarkCompactCollector: young object promotion failed Allocation failed - JavaScript heap out of memory  

How often does it reproduce? Is there a required condition?

Every time.

What is the expected behavior?

Not OOM.

What do you see instead?

OOM.

Additional information

No response

@XadillaX XadillaX added confirmed-bug Issues with confirmed bugs. vm Issues and PRs related to the vm subsystem. labels Sep 6, 2021
@XadillaX
Copy link
Member Author

XadillaX commented Sep 6, 2021

And another amazing thing:

  1. run node --expose-gc --max-heap-size=100 --inspect=9229 test.js for a while, OOM.
  2. run node --expose-gc --max-heap-size=100 --inspect=9229 test.js and then run node inspect localhost:9229, everything will be OK and there's no OOM any more.

@XadillaX
Copy link
Member Author

XadillaX commented Sep 6, 2021

I added

if (times === 330) {
  require('v8').writeHeapSnapshot();
}

To the code, and found there're full of the str object like:

深度截图_选择区域_20210906190745

@targos targos added the v8 engine Issues and PRs related to the V8 dependency. label Sep 6, 2021
@XadillaX
Copy link
Member Author

XadillaX commented Sep 7, 2021

The leak seems in CompilationCache.

深度截图_选择区域_20210907140356

That means when I run command below in Node.js 14 / 16:

$ node --expose-gc --max-heap-size=100 --no-compilation-cache test2.js

The behavior is right - no longer OOM.

@targos
Copy link
Member

targos commented Sep 7, 2021

/cc @nodejs/v8

@XadillaX
Copy link
Member Author

XadillaX commented Sep 7, 2021

In Node.js 12, the heap size will up to about 100M either. But after it cannot allocate more space, it will trigger kLastResort GC because:

https://github.com/nodejs/node/blob/v12.22.6/deps/v8/src/heap/heap.cc#L4904-L4925

After kLastResort GC, the heap size will down to very small.

And Node.js 14 / 16 will never trigger kLastResort GC:

https://github.com/nodejs/node/blob/v14.16.0/deps/v8/src/heap/heap.cc#L5205-L5226

And heap size still increases until OOM.

@XadillaX
Copy link
Member Author

XadillaX commented Sep 7, 2021

https://github.com/nodejs/node/blob/master/deps/v8/src/heap/mark-compact.cc#L1458-L1469

EvacuateNewSpaceVisitor::Visit() in GC, this new code (After Node.js 14) requires allocating an old space block and returns an AllocateResult.IsRetry() object. heap_->FatalProcessOutOfMemory() here at last.

So we should think what should we do when we're under GC but we have no enough space to allocate old space block.

Old V8 (Node.js 12) just no this if logic, and allocate old space anyway. And it will do kLastResort GC later.

What's more, it seems that I can't disable always promote young mc by --no-always-promote-young-mc because of DEFINE_IMPLICATION(array_buffer_extension, always_promote_young_mc) in v8's src/flags/flag-defination.h.


Refs

@XadillaX
Copy link
Member Author

XadillaX commented Sep 7, 2021

I think we have 2 ways to fix this bug:

  1. Fix _always_promote_young's logic in GC - make allocate old space not throw error when heap size up to exceeded.
  2. Set array_buffer_extension in gyp back to 0, and set --always-promote-young-mc defaults to false.

btw. why UnboundScript's script cache and its code string won't GCed until kLastResort (aka. heap size up to exceeded)? Can this be fixed?

@XadillaX
Copy link
Member Author

XadillaX commented Sep 8, 2021

There're several problems (bugs) occur this one issue:

  1. All Script's CompilationCache in the heap and won't be GCed until Last Resort GC; (It will make heap size to the limit and --no-compilation-cache can resolve it)
  2. --always-promote-young-mc will OOM when old space cannot allocate more memory (e.g. when heap size limit exceeded); (I think it should do a Last Resort GC when we meet this situation like Node.js 12's V8)
  3. We cannot close --always-promote-young-mc because --array-buffer-extension always true at compile time.

/cc @targos
/ping @nodejs/v8

@XadillaX
Copy link
Member Author

XadillaX commented Sep 8, 2021

@sidverma32
Copy link

sidverma32 commented Oct 29, 2021

This answer helped me: StackOverflow

@lukeapage
Copy link

lukeapage commented Dec 15, 2021

@XadillaX given you did such detailed analysis do you have any idea what is the path forward now v8 closed the bug you raised as won’t fix ? With some suggestions for the node team ?

@XadillaX
Copy link
Member Author

XadillaX commented Dec 16, 2021

@XadillaX given you did such detailed analysis do you have any idea what is the path forward now v8 closed the bug you raised as won’t fix ? With some suggestions for the node team ?

Hmmm... I'm not sure since V8 closed that issue. I think we can't do anything without v8's change.

@mmiszy
Copy link

mmiszy commented Feb 21, 2022

@XadillaX Is this still a problem for you in Node 14.19.0?

@phawxby
Copy link

phawxby commented Mar 24, 2022

I decided to do some digging and built on the sample script provided by @XadillaX.

'use strict';
const { getHeapStatistics } = require('v8');

let times = 0;
// The largest heap size observed since last logging output.
let heapMax = 0;
// The absolute largest heap size seen.
let absoluteHeapMax = 0;

function run() {
  let str = 'var a = "';
  for (let i = 0; i < (100 * 1024) / 18; i++) str += Math.random().toString();
  str += '";';

  const script = new (require('vm').Script)(str);
  times++;

  const heap = getHeapStatistics();
  heapMax = Math.max(heapMax, heap.total_heap_size);
  absoluteHeapMax = Math.max(absoluteHeapMax, heap.total_heap_size);

  if (times % 50 === 0) {
    console.log(
      times,
      `Heap - Since last: ${bytesToSize(heapMax)} | Absolute: ${bytesToSize(
        absoluteHeapMax
      )}`
    );

    heapMax = 0;
  }
}

// https://stackoverflow.com/a/18650828/468214
function bytesToSize(bytes) {
  var sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB'];
  if (bytes == 0) return '0 Byte';
  var i = parseInt(Math.floor(Math.log(bytes) / Math.log(1024)));
  return Math.round(bytes / Math.pow(1024, i), 2) + ' ' + sizes[i];
}

(async () => {
  while (true) {
    run();

    await new Promise((resolve) => {
      setTimeout(() => {
        resolve();
      }, 10);
    });

    gc();
  }
})();
node --expose-gc --max-heap-size=100 scripts/heap.js
  • 12.16.1 - 5000+ iterations
  • 14.0.0 - 5000+ iterations
  • 14.3.0 - 5000+ iterations
  • 14.4.0 - 5000+ iterations
  • 14.5.0 - Crash @ 350 iterations - V8 upgrade to 8.3
  • 14.6.0 - Crash @ 350 iterations
  • 14.10.0 - Crash @ 350 iterations
  • 14.12.0 - Crash @ 350 iterations
  • 14.15.0 - Crash @ 350 iterations
  • 14.18.2 - Crash @ 350 iterations
  • 14.19.0 - Crash @ 350 iterations
  • 16.3.0 - Crash @ 350 iterations
  • 16.9.1 - Crash @ 350 iterations
  • 16.10.0 - Crash @ 350 iterations
  • 16.11.0 - Crash @ 350 iterations
  • 16.14.0 - Crash @ 750 iterations
  • 16.14.1 - Crash @ 750 iterations.
  • 16.14.2 - Crash @ 750 iterations
  • 17.0.0 - Crash @ 350 iterations
  • 17.8.0 - Crash @ 350 iterations
  • 18.1.0 - Crash @ 750 iterations

As has been noted, the versions that don't crash appear to correctly GC as they approach the heap limit.

Now using node v12.16.1 (npm v6.13.4)
50 Heap - Since last: 17 MB | Absolute: 17 MB
100 Heap - Since last: 30 MB | Absolute: 30 MB
150 Heap - Since last: 42 MB | Absolute: 42 MB
200 Heap - Since last: 55 MB | Absolute: 55 MB
250 Heap - Since last: 67 MB | Absolute: 67 MB
300 Heap - Since last: 80 MB | Absolute: 80 MB
350 Heap - Since last: 91 MB | Absolute: 91 MB
400 Heap - Since last: 99 MB | Absolute: 99 MB
450 Heap - Since last: 21 MB | Absolute: 99 MB
500 Heap - Since last: 34 MB | Absolute: 99 MB
550 Heap - Since last: 46 MB | Absolute: 99 MB
600 Heap - Since last: 59 MB | Absolute: 99 MB
650 Heap - Since last: 71 MB | Absolute: 99 MB

@mmiszy yes it's still an issue at 14.19.0.

Now using node v14.19.0 (npm v6.14.16)
50 Heap - Since last: 18 MB | Absolute: 18 MB
100 Heap - Since last: 31 MB | Absolute: 31 MB
150 Heap - Since last: 43 MB | Absolute: 43 MB
200 Heap - Since last: 56 MB | Absolute: 56 MB
250 Heap - Since last: 68 MB | Absolute: 68 MB
300 Heap - Since last: 81 MB | Absolute: 81 MB
350 Heap - Since last: 92 MB | Absolute: 92 MB

<--- Last few GCs --->

[91848:0x104a08000]     6074 ms: Mark-sweep 72.2 (91.5) -> 71.7 (91.8) MB, 4.3 / 0.0 ms  (average mu = 0.735, current mu = 0.757) testing GC in old space requested
[91848:0x104a08000]     6092 ms: Mark-sweep 72.4 (91.8) -> 71.9 (92.0) MB, 4.8 / 0.0 ms  (average mu = 0.732, current mu = 0.730) testing GC in old space requested
[91848:0x104a08000]     6111 ms: Mark-sweep 72.6 (92.0) -> 72.1 (92.3) MB, 6.1 / 0.0 ms  (average mu = 0.707, current mu = 0.682) testing GC in old space requested

V16.14.X is a bit weird.

Now using node v16.14.0 (npm v8.3.1)
50 Heap - Since last: 18 MB | Absolute: 18 MB
100 Heap - Since last: 31 MB | Absolute: 31 MB
150 Heap - Since last: 43 MB | Absolute: 43 MB
200 Heap - Since last: 56 MB | Absolute: 56 MB
250 Heap - Since last: 68 MB | Absolute: 68 MB
300 Heap - Since last: 81 MB | Absolute: 81 MB
350 Heap - Since last: 93 MB | Absolute: 93 MB
400 Heap - Since last: 100 MB | Absolute: 100 MB
450 Heap - Since last: 24 MB | Absolute: 100 MB
500 Heap - Since last: 37 MB | Absolute: 100 MB
550 Heap - Since last: 49 MB | Absolute: 100 MB
600 Heap - Since last: 62 MB | Absolute: 100 MB
650 Heap - Since last: 74 MB | Absolute: 100 MB
700 Heap - Since last: 87 MB | Absolute: 100 MB
750 Heap - Since last: 99 MB | Absolute: 100 MB

<--- Last few GCs --->

[6500:0x7fda3c100000]    12269 ms: Mark-sweep 78.5 (99.7) -> 77.9 (99.7) MB, 2.6 / 0.0 ms  (average mu = 0.835, current mu = 0.840) testing GC in old space requested
[6500:0x7fda3c100000]    12286 ms: Mark-sweep 78.7 (99.9) -> 78.1 (99.9) MB, 3.4 / 0.0 ms  (average mu = 0.820, current mu = 0.805) testing GC in old space requested
[6500:0x7fda3c100000]    12292 ms: Mark-sweep 78.6 (99.9) -> 78.2 (100.2) MB, 3.3 / 0.0 ms  (average mu = 0.718, current mu = 0.417) allocation failure GC in old space requested

I admit, I'm not experienced in this kind of debugging so there's a good chance I've screwed something up. But if I haven't it suggests the issue appeared in Node V14.5.0, hasn't gotten any worse/better since then apart from a bit of weirdness in In V16.14.X.

@gabrielcsapo
Copy link

gabrielcsapo commented Jun 13, 2022

18.1.0 is also having this issue, happens at 750

Screen Shot 2022-06-13 at 9 08 14 AM

@kedarv
Copy link

kedarv commented Jun 14, 2022

Seeing the same behavior in Node 14, 16, 18 as well, which is surfaced via runInNewContext (which I believe internally invokes. vm.Script)

@mbyrne00
Copy link

mbyrne00 commented Jun 28, 2022

The v8 issue has a few suggestions for node with some speculation. Even after my last comment about Jest they are still suggesting removing a guard in node and asking if it will work. Could some folk from the node team with stronger knowledge than me provide some clear feedback? The team currently believe this isn't a common issue or something that can easily be worked around. This isn't the case ... for the Jest issue we are stuck on node 16.10 with no idea of longer-term solution.

https://bugs.chromium.org/p/v8/issues/detail?id=12198#c20

@mbyrne00
Copy link

mbyrne00 commented Jun 28, 2022

Oh yes - sorry for the bad link, you are right. I'll also update my comment

@benjamingr
Copy link
Member

benjamingr commented Jul 28, 2022

@XadillaX @nodejs/vm @nodejs/v8 are there any updates on this issue or anything that changed in the last year?

@benjamingr
Copy link
Member

benjamingr commented Jul 28, 2022

Also more context facebook/jest#12205 which implies maybe @nodejs/loaders is also the right ping since adding importModuleDynamically causes the leak

@legendecas
Copy link
Member

legendecas commented Jul 28, 2022

The problem will be gone in the example of the OP if we simply remove the gc() call, as explained in https://bugs.chromium.org/p/v8/issues/detail?id=12198#c4. So I'd believe this issue can be closed.

@benjamingr
Copy link
Member

benjamingr commented Jul 28, 2022

@legendecas my understanding is that the "explicit gc calling" issue is in addition and not instead of Jest's issue which they are experiencing without any gc() calls.

Namely facebook/jest#12205 changes code with 0 gc calls to 0 gc calls and resolves the issue except if they also pass importModuleDynamically which causes the leak again.

@legendecas
Copy link
Member

legendecas commented Jul 28, 2022

@benjamingr I would take it as a different problem so I'd suggest opening a dedicated issue to track it.

@legendecas
Copy link
Member

legendecas commented Aug 15, 2022

@benjamingr I've opened #44211 to track the issues with vm.compileFunction/importModuleDynamically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. v8 engine Issues and PRs related to the V8 dependency. vm Issues and PRs related to the vm subsystem.
Projects
None yet
Development

No branches or pull requests