
ensure worker stub subrequest channel is kept alive until internal startRequest call#6547

Merged
git-bruh merged 1 commit into main from pkhanna/fix-worker-loader-crash
Apr 10, 2026

Conversation

@git-bruh
Member

When using newPromisedWorkerInterface, by the time the .then() callback executes, the subrequest channel itself may have gone out of scope (GC having kicked in), causing a crash when startRequestImpl tries to access the worker service.

```js
// This pattern triggers the bug (the temporary Fetcher can get GC'd)
const result = await dynamicWorker.getEntrypoint().foo(...);

// This doesn't (the Fetcher is kept in scope, keeping the subrequest channel alive)
const e = dynamicWorker.getEntrypoint();
const result = await e.foo(...);
```

Closes #6441

@git-bruh git-bruh requested review from a team as code owners April 10, 2026 05:41
@ask-bonk
Contributor

ask-bonk bot commented Apr 10, 2026

The fix is straightforward and correct. Let me summarize my analysis:

The change:

  • Old code: [this, metadata = kj::mv(metadata)] — raw this capture of SubrequestChannelImpl in a .then() callback
  • New code: [self = kj::addRef(*this), metadata = kj::mv(metadata)] — ref-counted capture via kj::addRef()

Why it's correct:

  1. SubrequestChannelImpl extends IoChannelFactory::SubrequestChannel which extends kj::Refcounted, so kj::addRef(*this) is valid and returns kj::Own<SubrequestChannelImpl>.
  2. The .then() callback outlives the caller's scope because it's wrapped in newPromisedWorkerInterface() — the caller gets back a WorkerInterface proxy immediately, and if the JS-side Fetcher gets GC'd, the SubrequestChannelImpl (and its raw this pointer) would be destroyed before the callback fires.
  3. The self = kj::addRef(*this) capture keeps the SubrequestChannelImpl alive until the callback completes.
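The lifetime hazard described above can be demonstrated outside of workerd. The following is a minimal sketch using `std::enable_shared_from_this`/`std::shared_ptr` as a stand-in for `kj::Refcounted`/`kj::addRef` (the `Channel` type and `runDeferred` helper are hypothetical, purely for illustration): a deferred callback that captures a strong ref to `self` keeps the object alive until the callback runs, even after the original owner drops its reference — exactly what a raw `this` capture fails to guarantee.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

struct Channel : std::enable_shared_from_this<Channel> {
  std::string name = "subrequest-channel";

  // Like the fixed code: capture a strong ref-counted self, not raw `this`.
  // With [this] instead, the lambda would dangle once the owner releases
  // its reference.
  std::function<std::string()> deferredStart() {
    return [self = shared_from_this()] { return self->name; };
  }
};

std::string runDeferred() {
  std::function<std::string()> cb;
  {
    auto channel = std::make_shared<Channel>();
    cb = channel->deferredStart();
  }  // Owner's reference dropped here (analogous to the Fetcher being GC'd).
  // The callback's captured ref still keeps the channel alive.
  return cb();
}
```

The same reasoning applies to `kj::Own<SubrequestChannelImpl>` returned by `kj::addRef(*this)`: the lambda's capture holds a reference count, so destruction is deferred until the callback itself is destroyed.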

The ActorClassImpl::whenReady() case at line 4311 uses [this] but is safe because the promise is awaited directly by the caller who owns the object — the object cannot be GC'd during the await.

This is a clean, minimal, correct fix. No test is included, but the issue is inherently GC-timing-dependent and difficult to reliably test. The fix addresses the root cause.

LGTM


@codspeed-hq

codspeed-hq bot commented Apr 10, 2026

Merging this PR will not alter performance

✅ 70 untouched benchmarks
⏩ 129 skipped benchmarks [1]


Comparing pkhanna/fix-worker-loader-crash (81651ae) with main (6fc69a6)

Open in CodSpeed

Footnotes

  [1]: 129 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@git-bruh git-bruh force-pushed the pkhanna/fix-worker-loader-crash branch from 11c9e1e to 81651ae on April 10, 2026 07:15
@git-bruh git-bruh enabled auto-merge (squash) April 10, 2026 07:16
@git-bruh git-bruh merged commit 7fa78f7 into main Apr 10, 2026
33 of 34 checks passed
@git-bruh git-bruh deleted the pkhanna/fix-worker-loader-crash branch April 10, 2026 08:09
airhorns added a commit to airhorns/workerd that referenced this pull request Apr 10, 2026
Follow-up to cloudflare#6547, which fixed the deferred startup path but missed
two additional crash vectors for the same root cause (cloudflare#6441).

cloudflare#6547 fixed the `[this, ...]` capture in `SubrequestChannelImpl::
startRequest()` for the case where `isolate->service == kj::none`
(async startup not yet complete). However, the crash reported in cloudflare#6441
also reproduces on the synchronous startup path, and with the same
pattern on `ActorClassImpl::whenReady()`.

The core problem: when JS code chains temporary objects like

    loader.get(name, getCode).getEntrypoint().evaluate(args)

V8 can GC the Fetcher mid-request. This destroys the
SubrequestChannelImpl, which releases its Rc<WorkerStubImpl>, which
triggers WorkerStubImpl::unlink() → WorkerService::unlink(), clearing
the LinkedIoChannels. The child worker's IoContext still holds raw
pointers (via NullDisposer) to the WorkerService as its
IoChannelFactory and LimitEnforcer, so the next I/O operation (e.g.
an RPC callback to the parent) dereferences freed memory → SIGSEGV
or SIGBUS.

This remains 100% reproducible on current main using the reproduction
from cloudflare#6441 (@cloudflare/codemode DynamicWorkerExecutor).

Two additional fixes, both in WorkerLoaderNamespace:

- SubrequestChannelImpl::startRequestImpl(): Attach
  kj::addRef(*this) to the returned WorkerInterface, keeping the
  SubrequestChannelImpl (and thus WorkerStubImpl and WorkerService)
  alive for the full request duration. This is the fix for the
  synchronous startup path that cloudflare#6547 did not address.

- ActorClassImpl::whenReady(): Replace raw `[this]` capture with
  `[self = kj::addRef(*this)]` — same pattern as the
  SubrequestChannelImpl fix from cloudflare#6547, applied to the actor class
  deferred startup path.
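The first of these fixes uses a slightly different shape than a lambda capture: a reference is attached to the object handed back to the caller. As a minimal sketch (again with `std::shared_ptr` standing in for `kj::addRef`, and the `Service`/`RequestHandle` names being hypothetical), the returned handle carries a strong reference so the service chain cannot be torn down mid-request, even after the caller's last visible handle is dropped:

```cpp
#include <cassert>
#include <memory>

// Global flag standing in for "is the WorkerService still linked?".
static bool serviceAlive = false;

struct Service {
  Service() { serviceAlive = true; }
  ~Service() { serviceAlive = false; }  // stands in for WorkerService::unlink()
};

// The handle returned to the caller (the WorkerInterface in the real code)
// carries a strong reference so the service outlives the request.
struct RequestHandle {
  std::shared_ptr<Service> keepAlive;
};

RequestHandle startRequest(const std::shared_ptr<Service>& service) {
  return RequestHandle{service};  // attach a ref for the request's lifetime
}

bool demoAttachKeepsServiceAlive() {
  auto service = std::make_shared<Service>();
  RequestHandle req = startRequest(service);
  service.reset();      // caller's last handle dropped (the Fetcher being GC'd)
  return serviceAlive;  // still true: the attached ref keeps Service alive
}
```

In the real code the attachment is `result.attach(kj::addRef(*this))`-style rather than a struct member, but the lifetime guarantee is the same: the request owns a reference for its full duration.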

## Reproduction

Requires `@cloudflare/codemode` and `wrangler`:

```json
// package.json
{ "dependencies": { "@cloudflare/codemode": "^0.3.2", "wrangler": "^4.77.0" } }
```

```jsonc
// wrangler.jsonc
{
  "name": "repro",
  "main": "src/index.ts",
  "compatibility_date": "2025-06-01",
  "compatibility_flags": ["nodejs_compat"],
  "worker_loaders": [{ "binding": "LOADER" }]
}
```

```ts
// src/index.ts
import { DynamicWorkerExecutor, resolveProvider } from '@cloudflare/codemode';
interface Env {
  LOADER: ConstructorParameters<typeof DynamicWorkerExecutor>[0]['loader'];
}
export default {
  async fetch(request: Request, env: Env) {
    const executor = new DynamicWorkerExecutor({ loader: env.LOADER, timeout: 30_000 });
    const tools = {
      get_items: async () =>
        Array.from({ length: 112 }, (_, i) => ({
          id: `item_${i}`, name: `Item ${i}`, memo: 'x'.repeat(220),
        })),
    };
    for (let i = 0; i < 6; i++) {
      const result = await executor.execute(
        `async () => { return await codemode.get_items(); }`,
        [resolveProvider({ name: 'codemode', tools })]
      );
      if (result.error) return Response.json({ round: i, error: result.error }, { status: 500 });
    }
    return Response.json({ ok: true });
  },
};
```

Then: `wrangler dev` and `curl http://localhost:8787` → segfault every time.

To test a local workerd build against this reproduction:

    MINIFLARE_WORKERD_PATH=bazel-bin/src/workerd/server/workerd wrangler dev

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


Development

Successfully merging this pull request may close these issues.

wrangler dev segfaults in local worker_loaders during repeated DynamicWorkerExecutor executions

3 participants