
fix(runtime-transpiler): don't read this after publishing TranspilerJob to main thread #29128

Merged
Jarred-Sumner merged 2 commits into main from dylan/transpiler-job-dispatch-uaf on Apr 10, 2026

@dylan-conway
Member

What does this PR do?

dispatchToMainThread() runs on a WorkPool thread and does:

this.vm.transpiler_store.queue.push(this);
this.vm.eventLoop().enqueueTaskConcurrent(...);

After queue.push(this) publishes the job, the main thread can be woken by a ConcurrentTask enqueued by a different worker and then run RuntimeTranspilerStore.runFromJSThread() → queue.popBatch() (which drains all pushed jobs, including this one) → runFromJSThread() → store.put(this). HiveArray.put() writes value.* = undefined across the slot, so by the time this worker executes the second line and reloads this.vm, it reads 0xAA poison and vm.eventLoop() dereferences a non-canonical pointer.
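The interleaving described above can be replayed deterministically on a single thread. The following is only a minimal sketch: VM, Job, recycle and replayBuggyOrder are hypothetical stand-ins, not Bun's real types, and the slot is cleared to null rather than assigned undefined so the effect can be observed without undefined behavior.

```zig
const std = @import("std");

// Hypothetical stand-ins for illustration; these are not Bun's real types.
const VM = struct { wakeups: u32 = 0 };
const Job = struct { vm: ?*VM };

/// Mimics the main thread recycling a drained job's slot.
/// The PR describes HiveArray.put() assigning `undefined` to the slot;
/// this sketch clears the field instead so the result is well defined.
fn recycle(job: *Job) void {
    job.vm = null;
}

/// Replays the problematic ordering on one thread.
fn replayBuggyOrder(job: *Job) ?*VM {
    // 1. worker: queue.push(job) publishes the job (elided in this sketch).
    // 2. main thread: drains the queue and returns the slot to the store.
    recycle(job);
    // 3. worker: reloads job.vm for the second statement, too late.
    return job.vm;
}

pub fn main() void {
    var vm = VM{};
    var job = Job{ .vm = &vm };
    const reloaded = replayBuggyOrder(&job);
    std.debug.print("reloaded vm is null: {}\n", .{reloaded == null});
}
```

In the real code the reload yields poisoned bytes instead of null, which is why the crash only appears when the worker is preempted between the two statements.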

In practice this only crashes on Windows x64 release builds: those are ReleaseSafe, so = undefined actually writes 0xAA (ReleaseFast is a no-op), and the generated code reloads this.vm after the xchg in queue.push rather than CSE'ing it. The window between the two instructions is a handful of cycles, so it only fires when the worker is preempted right after the push on an oversubscribed machine.
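As a small worked check of why a pointer filled with 0xAA bytes is non-canonical, the sketch below assumes 48-bit x86-64 virtual addressing, where bits 63..47 of a valid address must all be equal. The helper name isCanonical48 is hypothetical and exists only for this illustration.

```zig
const std = @import("std");

/// Returns true when bits 63..47 are all equal, which is the canonical-address
/// rule under the assumed 48-bit x86-64 virtual address layout.
fn isCanonical48(addr: u64) bool {
    const top = addr >> 47; // the 17 high bits
    return top == 0 or top == 0x1FFFF;
}

pub fn main() void {
    // A pointer-sized value whose bytes are all 0xAA.
    const poison: u64 = 0xAAAAAAAAAAAAAAAA;
    std.debug.print("0x{X} canonical: {}\n", .{ poison, isCanonical48(poison) });
}
```

The high bits of the poison value alternate between 1 and 0, so they can be neither all zeros nor all ones.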

Crash reports show up as Segmentation fault at address 0xFFFFFFFFFFFFFFFF with frames VirtualMachine.zig:eventLoop / arena_allocator.zig:deinit / atomic.zig:fetchSub (the innermost inline at each PC; the outermost frames are dispatchToMainThread / TranspilerJob.run / ThreadPool.Thread.run).

Fix: snapshot vm into a local before queue.push(this) and never touch this after publishing.
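The snapshot-before-publish ordering can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: VM, Job, recycle and dispatchFixed are hypothetical names, the queue push is elided, and the recycle call stands in for the main thread reclaiming the slot at the earliest possible moment.

```zig
const std = @import("std");

// Hypothetical stand-ins for illustration; these are not Bun's real types.
const VM = struct { wakeups: u32 = 0 };
const Job = struct { vm: ?*VM };

/// Stands in for the main thread recycling the slot right after publication.
fn recycle(job: *Job) void {
    job.vm = null;
}

/// Snapshot first, publish second, then use only the local.
fn dispatchFixed(job: *Job) void {
    // Snapshot while the worker still exclusively owns the job.
    const vm = job.vm.?;
    // Publishing would happen here (elided); from this point the main
    // thread may recycle the slot at any time, simulated immediately below.
    recycle(job);
    // `job` is not read again; the wakeup goes through the local copy.
    vm.wakeups += 1;
}

pub fn main() void {
    var vm = VM{};
    var job = Job{ .vm = &vm };
    dispatchFixed(&job);
    std.debug.print("wakeups: {d}\n", .{vm.wakeups});
}
```

The design point is that ownership of the job transfers at the moment of publication, so anything the worker still needs has to be copied out beforehand.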

Fixes #15805
Fixes #14681

@robobun
Collaborator

robobun commented Apr 10, 2026

Updated 1:39 PM PT - Apr 10th, 2026

@Jarred-Sumner, your commit d86130f has 2 failures in Build #44903 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 29128

That installs a local version of the PR into your bun-29128 executable, so you can run:

bun-29128 --bun

@coderabbitai
Contributor

coderabbitai bot commented Apr 10, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 42b1c2dc-2882-469d-8799-bf720f2029f8

📥 Commits

Reviewing files that changed from the base of the PR and between 0055c46 and d86130f.

📒 Files selected for processing (1)
  • src/bun.js/RuntimeTranspilerStore.zig



Walkthrough

Capture this.vm into a local vm variable inside TranspilerJob.dispatchToMainThread, derive a local transpiler_store = &vm.transpiler_store, use that for queue push and for creating the enqueued jsc.ConcurrentTask, and avoid referencing this after publishing the job (added inline comment).

Changes

Cohort / File(s) | Summary
RuntimeTranspilerStore change: src/bun.js/RuntimeTranspilerStore.zig | In TranspilerJob.dispatchToMainThread, snapshot this.vm to a local vm, use &vm.transpiler_store for queue.push() and for constructing the jsc.ConcurrentTask, and add a comment noting that this may be freed after the push (avoid further references).

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title accurately describes the core fix: avoiding a dereference of 'this' after publishing the TranspilerJob to the main thread, which is the primary change in the PR.
Description check | ✅ Passed | The description fully documents the problem, root cause, and fix with technical detail matching the template's required sections on what the PR does and verification.
Linked Issues check | ✅ Passed | The code changes directly address the race condition documented in issues #15805 and #14681 by capturing vm locally before queue.push(this), preventing use-after-free of the job pointer.
Out of Scope Changes check | ✅ Passed | All changes are scoped to fixing the use-after-publish race in TranspilerJob.dispatchToMainThread with no extraneous modifications.


@github-actions
Contributor

Found 2 issues this PR may fix:

  1. Issue #28177 - ASan reports use-after-poison in RuntimeTranspilerStore.TranspilerJob.schedule with poisoned HiveArray slot, same race condition this PR fixes
  2. Bun 1.2.15 crashes in GCR job execution #20341 - Crash in RuntimeTranspilerStore.TranspilerJob.runFromWorkerThread during GCR job execution, consistent with use-after-free in transpiler dispatch path

If this is helpful, copy the block below into the PR description to auto-close these issues on merge.

Fixes #28177
Fixes #20341

🤖 Generated with Claude Code

Contributor

@claude claude bot left a comment


LGTM — the local snapshot of vm before queue.push(this) correctly closes the use-after-free window.

Extended reasoning...

Overview

Single-function change in dispatchToMainThread() (3 lines): introduces a local vm variable snapshotted from this.vm before publishing the job to the queue, then uses that local for both the queue push and the subsequent enqueueTaskConcurrent call.

Security risks

None. This is a pure correctness fix with no auth, crypto, or permission surface.

Level of scrutiny

Low. The race condition is well-described in the PR body, the fix is the canonical solution (snapshot before publish), and no other call sites or data flows are affected. The diff is 3 lines with no ambiguity.

Other factors

No bugs found by the automated system. No prior reviews on this PR. The PR description provides a thorough analysis of the crash (ReleaseSafe 0xAA poison, reloaded this.vm after xchg, preemption window) and links two crash issue reports. The fix exactly addresses the root cause with minimal blast radius.

@Jarred-Sumner Jarred-Sumner merged commit d81b0ca into main Apr 10, 2026
4 of 6 checks passed
@Jarred-Sumner Jarred-Sumner deleted the dylan/transpiler-job-dispatch-uaf branch April 10, 2026 19:51
Contributor

@claude claude bot left a comment


LGTM — minimal, correct fix for a well-analyzed use-after-publish race.

Extended reasoning...

Overview

The PR modifies a single 4-line function in src/bun.js/RuntimeTranspilerStore.zig. The change snapshots vm and transpiler_store into locals before calling queue.push(this), and never touches this afterwards.

Security risks

None. This is a memory-safety fix with no security surface exposure.

Level of scrutiny

Low. The change is mechanically simple: snapshot pointers before publishing the object, avoid dereferencing the published object. The root cause analysis in the PR description is thorough and accurate: queue.push(this) makes the job visible to the main thread, which can recycle the slot (writing 0xAA in ReleaseSafe), and the original code then reloaded this.vm from poisoned memory. The fix is minimal and correctly addresses the race.

Other factors

No bugs found by automated analysis. No existing reviewer comments outstanding. The fix is consistent with standard use-after-publish idioms and the added comment clearly documents the invariant.

robobun added a commit that referenced this pull request Apr 11, 2026
Build 45156 failed on ubuntu-25-dot-04-aarch64-test-bun — a shard
that merged PRs #29153, #29128, #29111, #29097 all also failed on
yet still merged. Fresh sha to unblock the farm wake loop.
