Summary
Perry's GC includes a "block-persistence" pass at crates/perry-runtime/src/gc.rs:1090 that marks ALL objects in an 8 MB arena block live whenever any single object in the block is root-reachable. This was added to protect against untracked arena refs sitting in caller-saved registers that the conservative stack scan can't capture (issues #43 / #44 pattern — dangling-pointer crashes when GC freed still-reachable arena objects).
The conservatism is catastrophic for tight allocation loops that co-locate fresh per-iteration data with pre-existing long-lived state. The JSON.parse case was traced in detail in #149; summary of the mechanism:
- Block 0 (and sometimes block 1) contains long-lived data: interned string keys, shape-cache
keys_arrays, the blob being parsed, any caller-level root arrays.
- The same block also contains early-iteration allocations (the first
new Foo() in the loop always lands adjacent to the setup data).
- Every subsequent iteration's allocations go into new blocks.
- GC marks block 0 live (because of the long-lived data). Block-persist marks EVERY object in block 0 live — including the dead early-iteration objects.
- Those dead objects have field values pointing into later blocks (fresh name strings, tag arrays, nested objects).
- The persist pass iterates to fixed point, following pointers across blocks — at iter 49 of
bench_json_roundtrip, 2 live blocks / 37 truly-reachable objects cascaded to 30 live blocks / 3 M "live" objects over 11 rounds.
gc() becomes a no-op. RSS grows linearly with iteration count.
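The cascade above can be sketched as a toy fixed-point model: any root makes its entire block live, every object in a live block (dead or not) gets its outgoing pointers followed, and the pass repeats until no new block is added. ArenaModel, persist_live_blocks, and the three-block heap below are illustrative stand-ins, not Perry's real data structures.

```rust
use std::collections::HashSet;

/// Toy heap: each object lives in a block and its fields may point at
/// objects in other blocks.
struct ArenaModel {
    block_of: Vec<usize>,       // object index -> block index
    points_to: Vec<Vec<usize>>, // object index -> target object indices
}

/// Block-persist to fixed point: a live block makes every co-located
/// object "live", and their pointers drag further blocks in.
fn persist_live_blocks(heap: &ArenaModel, roots: &[usize]) -> HashSet<usize> {
    let mut live: HashSet<usize> =
        roots.iter().map(|&o| heap.block_of[o]).collect();
    loop {
        let mut grew = false;
        for (obj, targets) in heap.points_to.iter().enumerate() {
            if live.contains(&heap.block_of[obj]) {
                // obj counts as live merely by co-location with a root
                for &t in targets {
                    grew |= live.insert(heap.block_of[t]);
                }
            }
        }
        if !grew {
            return live;
        }
    }
}

fn main() {
    // Block 0: a long-lived root (obj 0) plus a dead iter-0 object (obj 1)
    // whose field points into block 1; block 1's dead object points on to
    // block 2 -- the JSON.parse shape in miniature.
    let heap = ArenaModel {
        block_of: vec![0, 0, 1, 2],
        points_to: vec![vec![], vec![2], vec![3], vec![]],
    };
    let live = persist_live_blocks(&heap, &[0]);
    // One root in block 0 drags blocks 1 and 2 live via dead objects.
    assert_eq!(live.len(), 3);
}
```

A precise mark would keep only block 0 here; block-persist keeps all three, which is the 2-blocks-to-30-blocks cascade in the large.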
Measured impact (bench_json_roundtrip at v0.5.190)
|          | Perry  | Bun    | ratio |
|----------|--------|--------|-------|
| Speed    | 315 ms | 248 ms | 1.27× |
| Peak RSS | 318 MB | 83 MB  | 3.83× |
The speed gap is tolerable (recently closed from 2.4× to 1.27×, see #149 closeout). The RSS gap is the architectural cost of block-persist + bump-arena vs. Bun's per-object GC.
Not just JSON
Any tight new ClassName() / new Array() / parser loop that runs after meaningful setup hits the same pattern. The setup's long-lived roots anchor block 0 live; the first loop iteration allocates adjacent to them; block-persist pulls in the dead iter-0 objects on every GC and cascades from there.
We've seen the contours of this before:
- items.push({...}) builds that produce megabytes of structure but only length is read later
- Buffer.alloc with const buf = Buffer.alloc(1024) in a loop (before the #173 per-thread bump slab fast-path)
- new Map() build loops
Fix directions
Ranked by scope.
A. Segregated long-lived arena region (medium)
Allocate intrinsically long-lived data — PARSE_KEY_CACHE strings, shape-cache keys_arrays, transition-cache arrays, the string intern table, stringify scratch — into a DEDICATED arena block (or a small fixed-size region) that block-persist can conservatively retain. Everything else goes into the general arena where per-iter blocks genuinely go dead together.
Well-bounded scope (no correctness trade-off with #43 / #44), but needs careful routing: every long-lived allocation path has to opt in.
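A minimal sketch of the split, assuming a toy bump allocator (Arena, SplitHeap, and alloc_long_lived are all hypothetical names, not Perry's real API): long-lived sites opt into a quarantine region that block-persist may conservatively retain, while the general region's blocks can be reclaimed wholesale.

```rust
/// Toy bump allocator standing in for one arena region.
struct Arena {
    buf: Vec<u8>,
    used: usize,
}

impl Arena {
    fn with_capacity(cap: usize) -> Self {
        Arena { buf: vec![0; cap], used: 0 }
    }

    /// Bump-allocate n bytes; returns the start offset, or None when full.
    fn alloc(&mut self, n: usize) -> Option<usize> {
        if self.used + n > self.buf.len() {
            return None;
        }
        let off = self.used;
        self.used += n;
        Some(off)
    }
}

/// General region for per-iteration data plus a quarantine region that
/// long-lived sites (intern table, shape-cache arrays, scratch) use.
struct SplitHeap {
    general: Arena,
    long_lived: Arena,
}

impl SplitHeap {
    /// Default path: per-iteration allocations whose blocks die together.
    fn alloc(&mut self, n: usize) -> Option<usize> {
        self.general.alloc(n)
    }

    /// Opt-in path for intrinsically long-lived allocations.
    fn alloc_long_lived(&mut self, n: usize) -> Option<usize> {
        self.long_lived.alloc(n)
    }

    /// After GC, the general region can be reset without block-persist
    /// ever anchoring it to the quarantine data.
    fn reset_general(&mut self) {
        self.general.used = 0;
    }
}

fn main() {
    let mut heap = SplitHeap {
        general: Arena::with_capacity(64),
        long_lived: Arena::with_capacity(64),
    };
    let _key = heap.alloc_long_lived(8); // e.g. an interned key
    let _obj = heap.alloc(16);           // per-iteration object
    heap.reset_general();
    assert_eq!(heap.general.used, 0);    // per-iter data reclaimed
    assert_eq!(heap.long_lived.used, 8); // quarantine data survives
}
```

The point of the design is that co-location, the root cause of the cascade, is decided at the allocation site rather than by arrival order.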
B. Weaken block-persist with a stricter pre-condition (high)
Block-persist exists because the conservative stack scan can miss handles in caller-saved registers during mid-parse GC triggers. If we can guarantee (by convention + scaffolding in js_json_parse, js_buffer_alloc, etc.) that all intermediate arena refs from a given path are tracked in an explicit root set (like PARSE_ROOTS) before any internal allocation, we can skip block-persist for those paths.
Would need a comprehensive audit of allocation call sites. Higher risk — re-opens the failure mode #43 / #44 closed.
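One shape the scaffolding could take, as a sketch: an RAII guard that registers a handle in an explicit root set before any allocation can trigger GC, and releases it on scope exit. PARSE_ROOTS here is a stand-in modeled on the name in the issue; RootGuard and the usize handles are hypothetical.

```rust
use std::cell::RefCell;

thread_local! {
    // Stand-in for the explicit root set: handles the conservative stack
    // scan might otherwise miss in caller-saved registers.
    static PARSE_ROOTS: RefCell<Vec<usize>> = RefCell::new(Vec::new());
}

/// RAII guard: registers a handle on creation and unregisters it on drop,
/// so a mid-parse GC between allocations always sees the ref as a root.
struct RootGuard {
    handle: usize,
}

impl RootGuard {
    fn new(handle: usize) -> Self {
        PARSE_ROOTS.with(|r| r.borrow_mut().push(handle));
        RootGuard { handle }
    }
}

impl Drop for RootGuard {
    fn drop(&mut self) {
        PARSE_ROOTS.with(|r| {
            let mut roots = r.borrow_mut();
            if let Some(pos) = roots.iter().rposition(|&h| h == self.handle) {
                roots.remove(pos);
            }
        });
    }
}

fn roots_len() -> usize {
    PARSE_ROOTS.with(|r| r.borrow().len())
}

fn main() {
    {
        let _g = RootGuard::new(42); // intermediate ref now GC-visible
        assert_eq!(roots_len(), 1);
        // ... allocations that may trigger GC happen here ...
    }
    assert_eq!(roots_len(), 0); // scope exited, root released
}
```

The guard makes the convention hard to forget at call sites, but it cannot prove coverage: any path that allocates before taking a guard silently re-opens the #43 / #44 hole, hence the audit.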
C. Generational GC (largest)
Young generation = throwaway per-GC-cycle region; old generation = current arena model. Most parse output is trivially young-generation (dies before first GC). Matches what Bun/V8 do.
Weeks of work, changes allocator semantics throughout the runtime.
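A toy sketch of the shape, with illustrative names (GenHeap, minor_gc): per-iteration output lands in a young region that is discarded wholesale at each minor collection, and the few survivors are copied into the old region, which keeps the current arena semantics.

```rust
#[derive(Clone, Copy, Debug)]
struct Obj(u64);

/// Two-generation toy heap: young is a throwaway per-cycle region,
/// old keeps the existing arena model.
struct GenHeap {
    young: Vec<Obj>,
    old: Vec<Obj>,
}

impl GenHeap {
    fn new() -> Self {
        GenHeap { young: Vec::new(), old: Vec::new() }
    }

    /// All allocation goes to the young region first.
    fn alloc(&mut self, o: Obj) -> usize {
        self.young.push(o);
        self.young.len() - 1
    }

    /// Minor GC: promote the few survivors to the old generation, then
    /// drop the entire young region at once. Most parse output dies here
    /// without ever being traced individually.
    fn minor_gc(&mut self, survivors: &[usize]) {
        for &i in survivors {
            let promoted = self.young[i];
            self.old.push(promoted);
        }
        self.young.clear();
    }
}

fn main() {
    let mut heap = GenHeap::new();
    for i in 0..1_000u64 {
        heap.alloc(Obj(i));
    }
    heap.minor_gc(&[999]); // only one object outlives the loop
    assert!(heap.young.is_empty());
    assert_eq!(heap.old.len(), 1);
    assert_eq!(heap.old[0].0, 999);
}
```

The real cost is everything this sketch omits: write barriers for old-to-young pointers, survivor identification, and retuning every allocation path, which is why this option is weeks of work.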
D. Do nothing (documented trade-off)
Ship Perry's current numbers as-is. Speed is already a win over Node. RSS gap is the cost of the arena model's simplicity. Document it, close the door.
What I'd do next
(A) looks like the right first step — "long-lived data gets its own quarantine block" is a conceptually clean, well-bounded change. Implementing that for just PARSE_KEY_CACHE + shape-cache arrays would prove the concept on the JSON workload and inform whether to extend.
References
- crates/perry-runtime/src/gc.rs (mark_stack_roots)
- crates/perry-runtime/src/gc.rs:1090 (mark_block_persisting_arena_objects)
- crates/perry-runtime/src/arena.rs:592 (arena_reset_empty_blocks)