Skip to content

Bump node-addon-api to fix memory leak #2436#2562

Open
iurisilvio wants to merge 4 commits intoAutomattic:masterfrom
iurisilvio:fix-memory-leak
Open

Bump node-addon-api to fix memory leak #2436#2562
iurisilvio wants to merge 4 commits intoAutomattic:masterfrom
iurisilvio:fix-memory-leak

Conversation

@iurisilvio
Copy link
Copy Markdown

@iurisilvio iurisilvio commented Apr 6, 2026

Fix memory leak from deferred N-API weak reference callbacks

Fixes #2436

Problem

Since the NAN -> N-API migration (v3.0.0), ObjectWrap destructors (which free cairo surfaces) are deferred to the next SetImmediate instead of running during GC. In server environments where the event loop is always busy, these destructors never fire and memory grows indefinitely.

This was reported upstream in nodejs/node-addon-api#1140 and fixed in nodejs/node#42651 by introducing node_api_nogc_finalize, a finalizer that runs directly during GC for native-only cleanup.

Fix

  • Upgrade node-addon-api from 7.x to 8.x (which uses node_api_nogc_finalize for ObjectWrap::FinalizeCallback)
  • Add NAPI_EXPERIMENTAL to binding.gyp defines (required to enable NODE_API_EXPERIMENTAL_HAS_POST_FINALIZER in Node's headers)

No code changes, node-addon-api 8.x automatically routes ObjectWrap destructor callbacks through the nogc finalizer path when the flag is set.

I don't fully understand the impact of node-addon-api upgrade or enabling NAPI_EXPERIMENTAL.

Before / After

200 requests decoding 3000x2000 JPEG images into thumbnails:

// node --expose-gc stress.js
const { createCanvas, loadImage } = require("canvas");

let s = 1;
function rand() {
    s = (s + 0x6d2b79f5) | 0;
    let t = Math.imul(s ^ (s >>> 15), 1 | s);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
}

function genImage(w, h) {
    const c = createCanvas(w, h);
    const ctx = c.getContext("2d");
    for (let i = 0; i < 10; i++) {
        ctx.fillStyle = `rgb(${(rand() * 255) | 0},${(rand() * 255) | 0},${(rand() * 255) | 0})`;
        ctx.fillRect((rand() * w) | 0, (rand() * h) | 0, (rand() * w) / 2, (rand() * h) / 2);
    }
    return c.toBuffer("image/jpeg");
}

async function run() {
    const imgs = [genImage(2000, 1500), genImage(3000, 2000)];
    let peak = 0;
    for (let i = 0; i < 50; i++) {
        const promises = [];
        for (let j = 0; j < 4; j++) {
            promises.push(
                loadImage(imgs[j % 2]).then((img) => {
                    const c = createCanvas(200, Math.round((img.height / img.width) * 200));
                    c.getContext("2d").drawImage(img, 0, 0, c.width, c.height);
                    return c.toBuffer("image/jpeg");
                })
            );
        }
        await Promise.all(promises);
        const rss = process.memoryUsage().rss;
        if (rss > peak) peak = rss;
        if ((i + 1) % 10 === 0)
            console.log(`  batch ${i + 1}: rss=${Math.round(rss / 1024 / 1024)}MB`);
    }
    console.log(`Peak RSS: ${Math.round(peak / 1024 / 1024)}MB`);
}

run();
// Before (node-addon-api 8, no NAPI_EXPERIMENTAL):
  batch 10: rss=784MB
  batch 20: rss=1477MB
  batch 30: rss=2169MB
  batch 40: rss=2862MB
  batch 50: rss=3555MB
Peak RSS: 3555MB

// After (node-addon-api 8 + NAPI_EXPERIMENTAL):
  batch 10: rss=196MB
  batch 20: rss=195MB
  batch 30: rss=196MB
  batch 40: rss=196MB
  batch 50: rss=196MB
Peak RSS: 196MB

@chearon
Copy link
Copy Markdown
Collaborator

chearon commented Apr 7, 2026

Great find! That tracks perfectly with what we discussed in the issue, and it looks like we are good to upgrade since we don't support node 16.

The tests are failing, though.

@iurisilvio
Copy link
Copy Markdown
Author

Thank you for running the tests, I didn't run them local before! Now I ran it on Mac and Linux, reproduced the failures local, I hope my last commit fix all of them. 🙏🏻

@iurisilvio
Copy link
Copy Markdown
Author

iurisilvio commented Apr 8, 2026

Nice, all green now! 💚 Looking forward to have it reviewed/merged/released!

src/Canvas.cc Outdated
Napi::MemoryManagement::AdjustExternalMemory(env, -(int64_t)approxBytesPerPixel() * width * height);
}
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my reading of node-addon-api, having this member function forces the destructor to get called after gc, though before JS goes back to executing. It makes sense since this couldn't get called if the instance were already destroyed. Here's the relevant bit, cleaned up (source):

// If class overrides the basic finalizer, execute it.
if constexpr (details::HasBasicFinalizer<T>::value) {
  HandleScope scope(env);
  instance->Finalize(Napi::BasicEnv(env));
}

// If class overrides the (extended) finalizer, schedule it.
if constexpr (details::HasExtendedFinalizer<T>::value) {
  // Attach via node_api_post_finalizer. `PostFinalizeCallback` is responsible
  // for deleting the `T* instance`, after calling the user-provided finalizer.
  napi_status status =
      node_api_post_finalizer(env, PostFinalizeCallback, data, nullptr);
  NAPI_FATAL_IF_FAILED(status,
                       "ObjectWrap<T>::FinalizeCallback",
                       "node_api_post_finalizer failed");
}
// If the instance does _not_ override the (extended) finalizer, delete the
// `T* instance` immediately.
else {
  delete instance;
}

And the PostFinalizeCallback looks like this:

T* instance = static_cast<T*>(data);
instance->Finalize(Napi::Env(env));
delete instance;

But it would be better to destroy within gc, which is the whole point of BasicEnv. All day I've been trying to figure out why that causes node to crash (your initial test failures). I know it crashes in AdjustExternalMemory, but it seems to be failing because of something internal to node or node-addon-api.

AdjustExternalMemory takes a BasicEnv, and that env should still be valid, so I'm kind of at a loss for now. I'd like to take some time to understand it, which either leads to a bug fix in node or a corrected understanding on my part. I'm a little more hesitant to use the experimental API now. If we wanted to use it now, just having this method be empty and leaving the AdjustExternalMemory below would be better, since we want that to get called if we destroy from the C++ side.

I also think we should probably refactor the classes to only store BasicEnv to avoid accidentally trying to invoke JS at the wrong times. It's pretty easy to grab the full Napi::Env from info or other context.

@iurisilvio
Copy link
Copy Markdown
Author

I added a test to make sure memory leak doesn't come back later.

I don't know how to proceed from here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Memory leak in 3.0.0-rc2

2 participants