Finish interrupted copying collections to keep the GC heap consistent #13528

Merged
fitzgen merged 1 commit into bytecodealliance:main from fitzgen:wasmtime-issue-13516 on Jun 1, 2026

Conversation

@fitzgen
Member

@fitzgen fitzgen commented Jun 1, 2026

Garbage collection is incremental with respect to the Wasmtime embedder and Future::poll, yielding between collection increments to cooperate with the async runtime; it is logically atomic with respect to the Wasm mutator.

When the future driving an incremental collection is dropped before completion (e.g. an async call_async future is cancelled while a GC is in progress), the copying collector's heap was left half-flipped and half-forwarded. Subsequent store reuse observed the half-collected state and corrupted the GC heap.

We fix this situation by forcing any in-progress collection to finish synchronously in CopyingCollection's Drop. Other collectors do not actually do collection across multiple Future::polls in any meaningful way and so cancellation does not lead to GC heap corruption for them.
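The "finish in `Drop`" idea can be sketched in isolation. This is a minimal illustration, not Wasmtime's code: `Heap`, `Collection`, `step`, and `cancel_after_one_step` are hypothetical names, and "copying" is modelled as moving integers between two vectors. It only shows that dropping a partially driven collection still leaves the heap fully collected.

```rust
/// Hypothetical stand-in for a copying collector's heap (not a Wasmtime type).
struct Heap {
    /// Objects still waiting to be copied into the new semi-space.
    pending: Vec<u32>,
    /// Objects that have already been copied.
    copied: Vec<u32>,
}

/// Hypothetical in-progress collection that borrows the heap while it runs.
struct Collection<'a> {
    heap: &'a mut Heap,
}

impl Collection<'_> {
    /// Perform one increment of work; returns `true` once nothing is left.
    fn step(&mut self) -> bool {
        if let Some(obj) = self.heap.pending.pop() {
            self.heap.copied.push(obj);
        }
        self.heap.pending.is_empty()
    }
}

impl Drop for Collection<'_> {
    fn drop(&mut self) {
        // If the driver went away mid-collection, run the remaining
        // increments synchronously so the heap is never left half-copied.
        while !self.step() {}
    }
}

/// Run a single increment, then drop the collection to simulate cancellation.
/// Returns the number of (pending, copied) objects afterwards.
fn cancel_after_one_step() -> (usize, usize) {
    let mut heap = Heap {
        pending: vec![1, 2, 3, 4],
        copied: Vec::new(),
    };
    {
        let mut collection = Collection { heap: &mut heap };
        collection.step();
        // `collection` is dropped here with three objects still pending.
    }
    (heap.pending.len(), heap.copied.len())
}
```

Calling `cancel_after_one_step()` from a `main` function shows no pending objects after the drop. The trade-off in this sketch is that the drop blocks until the remaining work is done.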

Fixes #13516

@fitzgen fitzgen requested a review from a team as a code owner June 1, 2026 21:38
@fitzgen fitzgen requested review from pchickey and removed request for a team June 1, 2026 21:38
@alexcrichton
Member

Would it be possible to store the collection's intermediate state in the collector itself and automatically trigger a GC on future allocations? If a host is time-slicing a super long collection and ends up just having to block anyway to finish it, that seems like something we'll ideally want to fix.

@fitzgen
Member Author

fitzgen commented Jun 1, 2026

> Would it be possible to store the collection's intermediate state in the collector itself and automatically trigger a GC on future allocations?

The problem is that even running Wasm code that doesn't allocate, but accesses GC object fields, will observe the half-collected state and crash (basically equivalent to running with GC heap corruption).

We would need every API that could transitively access the GC heap to check for this state automatically and trigger collection if needed. That covers not just async entry points like call_async but also non-async ones like StructRef::field.
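To make that cost concrete, here is a rough sketch of what the "store the state and check on access" alternative would require. Everything in it is hypothetical (`GuardedHeap`, `remaining_increments`, `finish_pending_collection`, `read_field`) and is not Wasmtime's API; the interrupted collection is modelled as a simple counter.

```rust
/// Hypothetical heap that remembers an interrupted collection (not a Wasmtime type).
struct GuardedHeap {
    /// Stand-in for GC object fields.
    fields: Vec<u64>,
    /// Increments left in an interrupted collection; 0 means the heap is consistent.
    remaining_increments: u32,
}

impl GuardedHeap {
    /// Finish any interrupted collection before the heap is touched.
    fn finish_pending_collection(&mut self) {
        while self.remaining_increments > 0 {
            // A real collector would copy or forward objects here.
            self.remaining_increments -= 1;
        }
    }

    /// Even a plain, non-async field read needs the guard, and it needs
    /// `&mut self` because the guard may have to run the collector.
    fn read_field(&mut self, index: usize) -> Option<u64> {
        self.finish_pending_collection();
        self.fields.get(index).copied()
    }
}
```

In this sketch every accessor has to call the guard first, and a single missed call site would read the half-collected heap. That is the burden described above.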

I think the only realistic alternative to this PR would be to not make GC host-incremental at all. If we ever want host-incremental GC, we would implement an actual concurrent-with-the-mutator incremental collector and run each of its incremental GC slices atomically and without interruption, to avoid this same issue but "up a level".

@alexcrichton
Member

Good points yeah. I've filed #13530 to track this over time, but I agree there's really not a whole lot else we can do in the meantime.

@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 1, 2026
@fitzgen fitzgen added this pull request to the merge queue Jun 1, 2026
Merged via the queue into bytecodealliance:main with commit 32dbb07 Jun 1, 2026
51 checks passed
@fitzgen fitzgen deleted the wasmtime-issue-13516 branch June 1, 2026 23:37

Development

Successfully merging this pull request may close these issues.

Copying collector incremental GC does not handle cancellation of GC operation

3 participants