Skip to content

Add visitForGc to CompressionStream to fix zlib slow-path leak#6743

Open
guybedford wants to merge 2 commits intomainfrom
gbedford/zlib-compression-stream-gc-tracing
Open

Add visitForGc to CompressionStream to fix zlib slow-path leak#6743
guybedford wants to merge 2 commits intomainfrom
gbedford/zlib-compression-stream-gc-tracing

Conversation

@guybedford
Copy link
Copy Markdown
Contributor

@guybedford guybedford commented May 8, 2026

Repost of #6741 with the origin branch //cc @danlapid @jasnell for re-approvals.

This fixes a memory leak in the node:zlib slow path of the synchronous convenience methods ({ info: true }).

Each call constructs a JSG-bound CompressionStream wrapper that holds a jsg::Function writeCallback and errorHandler capturing the JS handle (see internal_zlib_base.ts). This forms a JS<->C++ reference cycle. Without visitForGc, V8 cannot trace through the C++->JS edge, so the cycle is uncollectable and every CompressionStream becomes immortal. 20k iterations of inflateSync(input, { info: true }) leaks ~128 MB.

The fast path (zlibUtil.zlibSync) is unaffected — it does the whole compression in C++ without exposing a CompressionStream to JS.

  • Adds visitForGc() to CompressionStream covering writeCallback, writeResult, and errorHandler so V8 can collect the cycle once it's unreachable from JS roots.
  • Eagerly clears these refs in close() so callers that explicitly destroy the stream don't have to wait on the cycle collector.

Before:

JSG_RESOURCE_TYPE(CompressionStream) {
  JSG_METHOD(close);
  JSG_METHOD_NAMED(write, template write<true>);
  // ...
}
// no visitForGc — JS<->C++ cycle is uncollectable

After:

JSG_RESOURCE_TYPE(CompressionStream) { /* ... */ }

void visitForGc(jsg::GcVisitor& visitor) {
  visitor.visit(writeCallback, writeResult, errorHandler);
}

Test coverage in zlib-leak-nodejs-test:

  • inflateSyncInfoCollects, deflateSyncInfoCollects, brotliSyncInfoCollects — exercise the eager-clear path through close().
  • createInflateAbandonedCollects — exercises the visitForGc path specifically by abandoning a createInflate() stream without end()/destroy()/close(). Without the visitor this case leaks 256 of 256 engines; with it, all are collected.

Red-green verified across all three test variants (@, @all-compat-flags, @all-autogates): removing visitForGc causes only createInflateAbandonedCollects to fail, confirming the new test specifically catches the visitor regression while the other three are covered by the eager-clear fallback.

The slow path of the sync zlib convenience methods (`{ info: true }`)
constructs a JSG-bound CompressionStream wrapper per call. The wrapper
holds a jsg::Function writeCallback that captures the JS handle (see
internal_zlib_base.ts), forming a JS<->C++ reference cycle. Without a
visitForGc, V8 cannot trace through the C++->JS edge, so the cycle is
uncollectable and every CompressionStream becomes immortal.

Reproducer: 20k iterations of inflateSync(input, { info: true }) leaks
~128 MB.

Adds visitForGc() to CompressionStream covering writeCallback,
writeResult, and errorHandler. Also clears these refs eagerly in
close() so callers that explicitly destroy don't have to wait on the
cycle collector.

The fast path (zlibUtil.zlibSync) is unaffected: it does the whole
compression in C++ without exposing a CompressionStream wrapper to JS.

Adds zlib-leak-nodejs-test asserting that engines returned via
{ info: true } are reclaimed after GC, using WeakRef and --expose-gc.
@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk Bot commented May 8, 2026

The close() method sets closed = true before clearing the refs, and looking at the writeStream method, writeCallback is only invoked before the close() at the bottom (line 523-528). The eager clear is safe since no code path uses those refs after closed is set.

The implementation is correct, follows established patterns, and the test is well-structured. The visitForGc correctly traces all three JS-heap refs (writeCallback, writeResult, errorHandler). The eager clear in close() is an optimization that's safe given the control flow. The test exercises both the sync info path and the abandoned-stream path.

LGTM

github run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants