memory-leak test measures vitest harness overhead more than Runtime retention #2

## Summary

`tests/integration/memory-leak.test.ts` (the `should not leak memory over multiple init/shutdown cycles` case) reports heap deltas in the low single-digit MB range that look like a Runtime leak but are actually vitest harness overhead. This was diagnosed and worked around in commit 355e342 (raised the threshold to 6000 KB, added `--expose-gc` to vitest workers, and added a final pre-measurement GC call so the baseline and final reads are symmetric). Filing this so the underlying limitation is documented and we can revisit the test design.

## Evidence

A standalone Node script running the **exact same** workload as the failing test (20 init/shutdown cycles, 10 plugins per cycle, action + screen registration, full introspection, with `global.gc()` between cycles) measures a heap delta of **~32 KB** on Node 24 / V8 13.x.
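For reference, the measurement loop of such a standalone script can be sketched as below. This is a minimal illustration, not the script that produced the number above: `HeapProbe`, `nodeProbe`, `measureHeapDeltaKb`, and the `runCycle` callback are hypothetical names, and the callback stands in for one init/shutdown cycle of the Runtime. The default probe assumes Node was started with `--expose-gc`.

```ts
// Abstraction over "force a GC" and "read the heap", so the loop itself
// can be exercised with a fake probe. All names here are illustrative.
interface HeapProbe {
  collect(): void;
  heapUsedBytes(): number;
}

// Default probe for a plain Node process started with --expose-gc.
const nodeProbe: HeapProbe = {
  collect() {
    const gc = (globalThis as { gc?: () => void }).gc;
    if (typeof gc !== "function") {
      throw new Error("global.gc is unavailable; run with node --expose-gc");
    }
    gc();
  },
  heapUsedBytes: () => process.memoryUsage().heapUsed,
};

// Runs `cycles` iterations of `runCycle`, collecting after each one, and
// returns heap growth in KB between a post-GC baseline and a post-GC final read.
async function measureHeapDeltaKb(
  runCycle: (cycle: number) => void | Promise<void>,
  cycles: number,
  probe: HeapProbe = nodeProbe,
): Promise<number> {
  probe.collect();
  const baselineBytes = probe.heapUsedBytes();
  for (let cycle = 0; cycle < cycles; cycle++) {
    await runCycle(cycle);
    // The collect after the last cycle doubles as the pre-measurement GC.
    probe.collect();
  }
  const finalBytes = probe.heapUsedBytes();
  return (finalBytes - baselineBytes) / 1024;
}
```

A real script would pass a callback that initializes the Runtime, registers plugins, actions and screens, runs introspection, and shuts down, then call `measureHeapDeltaKb(callback, 20)` under `node --expose-gc`.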

The same workload inside a vitest test (any pool, with `--expose-gc` enabled) measures **~3.9 MB** on Node 24. The difference of roughly 3.9 MB is per-test-file vitest harness state (module graph entries, source maps, snapshot bookkeeping) that is not released within the worker's lifetime.

This wasn't visible on Node 22 because the harness footprint under V8 12.x was smaller and the existing 3 MB threshold hid it. Node 24 / V8 13.x exposed it.

## Why it matters

The test claims to enforce "Requirement 12.1: Base runtime memory increase < 100KB" but it can't actually measure to that resolution: the ~3.9 MB harness floor is roughly 40 times the target. With the current 6000 KB threshold, a leak is caught only once it adds about 2 MB on top of that floor at this scale (200 plugin lifecycles), which is around 20 times the requirement.
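Taking the rounded figures quoted in this issue at face value, the resolution gap can be worked out directly. The snippet below is only arithmetic on those figures; treating 1 MB as 1000 KB is a simplifying assumption, and the variable names are illustrative.

```ts
// Figures quoted in this issue, rounded to KB (1 MB taken as 1000 KB).
const targetKb = 100; // Requirement 12.1
const harnessFloorKb = 3900; // ~3.9 MB measured inside vitest on Node 24
const thresholdKb = 6000; // current assertion threshold
const pluginLifecycles = 20 * 10; // 20 cycles x 10 plugins per cycle

// How far the harness floor sits above the requirement target.
const floorToTargetRatio = harnessFloorKb / targetKb;
// How much a leak must retain before the current assertion fires.
const headroomKb = thresholdKb - harnessFloorKb;
const headroomPerLifecycleKb = headroomKb / pluginLifecycles;

console.log({ floorToTargetRatio, headroomKb, headroomPerLifecycleKb });
// prints { floorToTargetRatio: 39, headroomKb: 2100, headroomPerLifecycleKb: 10.5 }
```

Under these assumptions the floor is about 39 times the target, and a leak has to retain roughly 2.1 MB in total (about 10 KB per plugin lifecycle) before the test notices.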

## Suggested fix

Rewrite the test to subtract a harness baseline. Roughly:

```ts
// Sketch only: measureWorkload, emptyWorkload, realWorkload and cycles are
// placeholders; measureWorkload is assumed to return a heap delta in KB.
// Run an empty cycle loop first to measure harness allocation
const harnessBaselineKb = await measureWorkload(emptyWorkload, cycles);
// Then the real workload
const fullDeltaKb = await measureWorkload(realWorkload, cycles);
const runtimeDeltaKb = fullDeltaKb - harnessBaselineKb;
expect(runtimeDeltaKb).toBeLessThan(100); // back to the requirement target (< 100 KB)
```

Subtracting the empty-workload measurement removes the per-iteration harness cost and leaves the Runtime contribution. This needs care around vitest's allocation patterns (the second measurement may not pay the same fixed cost as the first), but it is worth attempting.
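One possible way to handle the fixed-cost caveat, sketched under assumptions rather than verified against vitest: discard a warm-up pass of both workloads, then alternate baseline and real measurements over several rounds and take the median of the per-round differences. `estimateRuntimeDeltaKb`, `MeasureFn`, `Workload`, and the `rounds` parameter are hypothetical; `measure` is any function that returns a heap delta in KB.

```ts
type Workload = (cycle: number) => void | Promise<void>;
// Any function returning a heap delta in KB for `cycles` runs of a workload.
type MeasureFn = (workload: Workload, cycles: number) => Promise<number>;

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

async function estimateRuntimeDeltaKb(
  measure: MeasureFn,
  emptyWorkload: Workload,
  realWorkload: Workload,
  cycles: number,
  rounds = 5,
): Promise<number> {
  if (rounds < 1) throw new RangeError("rounds must be at least 1");

  // Warm-up: absorb one-time costs so they land in neither measurement.
  await measure(emptyWorkload, cycles);
  await measure(realWorkload, cycles);

  // Alternate the two workloads so slow drift affects both roughly equally.
  const differences: number[] = [];
  for (let round = 0; round < rounds; round++) {
    const harnessKb = await measure(emptyWorkload, cycles);
    const fullKb = await measure(realWorkload, cycles);
    differences.push(fullKb - harnessKb);
  }
  // The median tolerates an occasional outlier round.
  return median(differences);
}
```

The warm-up pass also hides any one-time lazy initialization in the Runtime itself, so this estimates per-cycle retention rather than total footprint. Whether that matches the intent of Requirement 12.1 is a design decision for the rewrite.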

## Acceptance

- The test enforces a threshold meaningful at the level of the original requirement (low hundreds of KB, not low MB).
- The test passes on Node 22 and Node 24 deterministically across many seeds.
- A real Runtime leak (e.g. an intentionally-introduced retained reference in shutdown) is caught by the test, demonstrated in a separate fixture or commit.
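The last criterion needs a fixture that leaks on purpose. A minimal sketch, assuming nothing about the real Runtime API: `LeakyRuntime`, `retainedState`, and `runLeakyCycles` are hypothetical stand-ins that mimic a shutdown path forgetting to drop a reference. Plain arrays are used because typed-array backing stores are reported outside `heapUsed` in Node.

```ts
// Hypothetical stand-in for a module-level registry that outlives instances.
const retainedState: number[][] = [];

class LeakyRuntime {
  private state: number[] | null = null;

  init(): void {
    // 8192 small integers: roughly 64 KB of on-heap storage on a 64-bit build.
    this.state = new Array<number>(8192).fill(0);
    retainedState.push(this.state);
  }

  shutdown(): void {
    this.state = null;
    // Intentional bug: the entry in retainedState is never removed,
    // so each cycle's state stays reachable after shutdown.
  }
}

// Runs init/shutdown cycles and returns how many state arrays are still
// reachable through the registry afterwards.
function runLeakyCycles(cycles: number): number {
  for (let i = 0; i < cycles; i++) {
    const runtime = new LeakyRuntime();
    runtime.init();
    runtime.shutdown();
  }
  return retainedState.length;
}
```

Twenty cycles of this fixture should retain on the order of 1 MB, which is well above the 100 KB target. Added to the ~3.9 MB harness floor, it would likely still pass the current 6000 KB threshold, so it makes a useful check that a rewritten test is more sensitive than the current one.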

## Related

- Diagnosis happened during CI bring-up for v0.1.0 → 355e342.
- See the comment block in the test for the in-line context.
