Fix heap retention from failed IPC sends and trim oversized AHP logs #320267

Merged: roblourens merged 2 commits into main from roblou/ipc-ahp-heap-retention on Jun 7, 2026

@roblourens roblourens commented Jun 6, 2026

What & why

Investigating an Agents window heap snapshot (7.7GB heap, ~97% native memory) traced a multi-GB retention back to a failed Export Agent Host Debug Logs operation. Two IPC bugs and one logging issue conspired to pin the memory. Retainer analysis of the snapshot confirmed the root cause precisely: the failed request's rejected promise was retained by the live channel's handlers map, and through the error's captured stack it retained the serialization buffers + log strings.

Changes

Bug A — leaked request handler (the actual leak)

In ChannelClient.requestPromise, the response handler is registered in this.handlers before the request is serialized and sent. If serialization throws synchronously (e.g. an oversized argument makes VSBuffer.concat throw RangeError: Array buffer allocation failed), no request ever goes out and the handler — only removed when a response arrives — is leaked for the lifetime of the channel. That leaked entry transitively retains the rejected promise, the error, and the multi-GB serialization buffers.

Fix: wrap sendRequest so that on a synchronous failure we delete the handler and reject. This also settles the promise on the uninitialized (then) path, which previously hung forever on such a failure.
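
A minimal sketch of the shape of this fix, not the actual `ChannelClient` code. `SketchChannelClient`, its `send` callback, and the response shape are hypothetical stand-ins for the real serialize-and-send path; the only point is that the handler entry is deleted and the promise rejected when the send throws synchronously.

```typescript
// Hypothetical, simplified stand-in for ChannelClient (not the VS Code implementation).
type SketchResponse = { ok: boolean; data: unknown };
type SketchHandler = (response: SketchResponse) => void;
type SketchSend = (id: number, arg: unknown) => void;

class SketchChannelClient {
	// Pending response handlers, keyed by request id.
	readonly handlers = new Map<number, SketchHandler>();
	private lastRequestId = 0;
	// Assumed injection point standing in for "serialize, then write to the transport".
	private readonly send: SketchSend;

	constructor(send: SketchSend) {
		this.send = send;
	}

	request(arg: unknown): Promise<unknown> {
		const id = this.lastRequestId++;
		return new Promise<unknown>((resolve, reject) => {
			// The handler is registered before the request is sent.
			this.handlers.set(id, response => {
				this.handlers.delete(id);
				if (response.ok) {
					resolve(response.data);
				} else {
					reject(response.data);
				}
			});

			try {
				this.send(id, arg);
			} catch (err) {
				// Nothing went out, so no response will ever remove the entry:
				// delete it here and settle the promise.
				this.handlers.delete(id);
				reject(err);
			}
		});
	}
}
```

With a `send` that throws, `handlers.size` is back to 0 as soon as `request` returns; without the `catch`, the entry would stay in the map for as long as the client lives.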

Why one dangling entry pins so much (the retention chain)

This isn't obvious from reading the code, so here is the concrete chain the snapshot shows, from a GC root down to the payload:

  1. The live ChannelClient holds its this.handlers Map for the channel's whole lifetime.
  2. The leaked entry's value is the response-handler closure. It is only ever deleted when a response with that id arrives — and since the send threw, none ever will.
  3. That closure captures the promise's resolve/reject (c/e) callbacks, which internally reference the rejected Promise itself.
  4. The promise's internal [[PromiseResult]] slot keeps the Error alive. This is the counter-intuitive hop: a rejected promise retains its rejection value for as long as the promise is reachable — so the error is not eligible for GC even though nothing awaits it.
  5. The Error's lazily-captured .stack pins the stack frames that were live at throw time, including the send/serialize frame whose local writer (a BufferWriter) is still in scope.
  6. writer.buffers then holds the multi-GB serialized payload (and, in the AHP case, the ~157MB log string).

In the snapshot the RangeError had exactly one incoming edge (the rejected promise), and the only path from that whole cluster to a GC root was handler closure → Map → ChannelClient.handlers. So the dangling handlers entry is both necessary and sufficient to pin everything, which is why deleting it on a synchronous send failure releases the entire graph.

Bug B — buffer retention via error stacks (defense in depth)

BufferWriter now implements IDisposable, and the three serialize/send sites dispose it in a finally. So even if a thrown serialization error's captured stack pins the send frame (step 5 above), it can't drag the intermediate buffers along. This covers the server-side and other send sites that have no handler-map cleanup of their own.
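
As a rough illustration of the dispose-in-`finally` pattern, assuming a much simpler writer than the real `BufferWriter`: `SketchBufferWriter`, `chunkCount`, and `serializeAndSend` are hypothetical names, and the writer is passed in only so its state can be inspected afterwards (the real send sites would create it locally).

```typescript
interface IDisposable {
	dispose(): void;
}

// Hypothetical simplified writer: accumulates chunks, concatenates on demand.
class SketchBufferWriter implements IDisposable {
	private buffers: Uint8Array[] = [];

	write(chunk: Uint8Array): void {
		this.buffers.push(chunk);
	}

	get chunkCount(): number {
		return this.buffers.length;
	}

	// Concatenates all chunks; allocating the result is where an oversized payload could throw.
	get buffer(): Uint8Array {
		const total = this.buffers.reduce((sum, b) => sum + b.byteLength, 0);
		const result = new Uint8Array(total);
		let offset = 0;
		for (const b of this.buffers) {
			result.set(b, offset);
			offset += b.byteLength;
		}
		return result;
	}

	// Drops the references to the intermediate chunks.
	dispose(): void {
		this.buffers = [];
	}
}

function serializeAndSend(
	writer: SketchBufferWriter,
	serialize: (writer: SketchBufferWriter) => void,
	transport: (payload: Uint8Array) => void
): void {
	try {
		serialize(writer);
		transport(writer.buffer);
	} finally {
		// Runs on success and on throw, so a writer that stays reachable
		// (for example through an error) no longer holds the chunks.
		writer.dispose();
	}
}
```

The error still propagates to the caller; the `finally` only guarantees that the writer is empty by the time anyone can observe the failure.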

AHP log line trimming

A single AHP protocol log line could reach ~157MB (e.g. a resourceRead carrying a base64-encoded file). AhpJsonlLogger.log now stringifies once (fast path) and, only when a line exceeds 1MB, re-serializes with oversized string values elided — keeping the output valid JSONL and marking the entry truncated.
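
One way the two-pass approach could look, as a sketch rather than the actual `AhpJsonlLogger.log` code. The function name `toJsonlLine`, the `truncated` field, and the elision placeholder text are assumptions for illustration, as is measuring both thresholds in string length rather than bytes.

```typescript
// Thresholds from this PR (1MB per line, 16KB per string); measured here in
// UTF-16 code units, which is an assumption of this sketch.
const MAX_LINE_LENGTH = 1024 * 1024;
const MAX_STRING_LENGTH = 16 * 1024;

// Hypothetical helper: returns one JSONL line for a log entry.
function toJsonlLine(
	entry: Record<string, unknown>,
	maxLine: number = MAX_LINE_LENGTH,
	maxString: number = MAX_STRING_LENGTH
): string {
	// Fast path: stringify once and keep the result if it is small enough.
	const line = JSON.stringify(entry);
	if (line.length <= maxLine) {
		return line;
	}

	// Slow path: re-serialize with a replacer that swaps every oversized string
	// value (at any depth) for a short placeholder, and mark the entry.
	return JSON.stringify({ ...entry, truncated: true }, (_key, value) => {
		if (typeof value === 'string' && value.length > maxString) {
			return `[elided ${value.length} chars]`;
		}
		return value;
	});
}
```

Because elision happens on values before serialization, not by cutting the serialized text, the result is always parseable JSON on a single line.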

Tests

  • BufferWriter releases its buffers on dispose
  • request rejects (and cleans up) when serialization throws on the deferred path
  • elides oversized string payloads while keeping the line valid JSONL

Full unit suite passes (10812 passing).

Notes

  • Thresholds (1MB line / 16KB per-string) are tunable.
  • A milder variant of Bug A exists in requestEvent (handler registered up front, removed on last-listener-removed); left untouched as it's far lower risk (small event args). Happy to harden it too if desired.

(Written by Copilot)

Investigating an Agents window heap snapshot (7.7GB, mostly native memory)
revealed a multi-GB retention rooted at a failed "Export Agent Host Debug
Logs" operation. Two IPC bugs and one logging issue conspired:

Bug A (the leak): in `ChannelClient.requestPromise`, the response handler is
registered in `this.handlers` before the request is serialized/sent. If
serialization throws synchronously (e.g. an oversized argument makes
`VSBuffer.concat` throw `RangeError: Array buffer allocation failed`), the
handler entry is never removed; it's only deleted on a response that never
arrives. The leaked handler retains the rejected promise, the error, and (via
the error's captured stack) the serialization buffers, for the lifetime of the
channel. The heap snapshot confirmed this was the sole retainer of the error.
Fix: clean up the handler and reject if `sendRequest` throws. This also settles
the promise on the uninitialized (`then`) path, which previously hung forever.

Bug B (defense in depth): `BufferWriter` now implements `IDisposable` and the
`send`/serialize sites dispose it in a `finally`, so a thrown serialization
error's captured stack can't pin the intermediate buffers. This protects the
server-side and other send sites that have no handler-map cleanup.

AHP log trimming: a single AHP protocol log line could reach ~157MB (e.g. a
`resourceRead` carrying a base64 file). `AhpJsonlLogger.log` now stringifies
once (fast path) and, only when a line exceeds 1MB, re-serializes with
oversized string values elided so the line stays valid JSONL.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 6, 2026 23:17

Copilot AI left a comment

Pull request overview

This PR addresses a heap-retention scenario in the IPC layer (leaked request handlers when synchronous serialization fails) and reduces memory/disk pressure from Agent Host Protocol (AHP) transport logging by truncating oversized log lines while keeping JSONL valid.

Changes:

  • Ensure ChannelClient.requestPromise cleans up its handler map entry and rejects when sendRequest throws synchronously during serialization.
  • Make BufferWriter disposable and ensure all IPC send sites dispose it in finally blocks to avoid retaining intermediate buffers via error stacks.
  • Add AHP JSONL log-line truncation for oversized payloads, plus unit tests covering truncation and BufferWriter disposal semantics.

Summary per file:

  • src/vs/platform/agentHost/test/common/ahpJsonlLogger.test.ts: Adds a unit test ensuring oversized string payloads are elided while maintaining valid JSONL.
  • src/vs/platform/agentHost/common/ahpJsonlLogger.ts: Implements log-line truncation logic and records truncation metadata on oversized entries.
  • src/vs/base/parts/ipc/test/common/ipc.test.ts: Adds tests asserting BufferWriter releases buffers on dispose and that deferred request serialization failures reject properly.
  • src/vs/base/parts/ipc/common/ipc.ts: Fixes handler leak on synchronous send/serialization failure and disposes BufferWriter at send sites.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 2

It has no cross-module consumer (not even tests), so per the repo guideline
to only export APIs used across components, drop the export.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@roblourens roblourens marked this pull request as ready for review June 7, 2026 16:46
@roblourens roblourens enabled auto-merge (squash) June 7, 2026 16:46
@roblourens roblourens merged commit 6a33563 into main Jun 7, 2026
25 checks passed
@roblourens roblourens deleted the roblou/ipc-ahp-heap-retention branch June 7, 2026 17:14
@vs-code-engineering vs-code-engineering Bot added this to the 1.124.0 milestone Jun 7, 2026