
V8 procedures: async/await support for blocking host calls #4697

@philtrem

Description


Summary

V8 procedure syscalls like fetch() block the V8 thread for their full duration, forcing the runtime to spawn additional instances for concurrent requests — each with its own OS thread and V8 isolate. This proposal adds async/await support for procedures so they can yield at host calls, multiplexing multiple in-flight procedures on a single V8 worker. Synchronous procedures continue to work unchanged; module authors opt in to async at their own pace.

Motivation

Blocking procedure syscalls (fetch()) in the V8 runtime park the V8 thread for their full duration — up to 30s per request. When another operation arrives while an instance is blocked, ModuleInstanceManager creates a new V8 instance — spawning an OS thread, allocating a V8 isolate, and recompiling the module. The pool never shrinks, so instances accumulate at peak load.

Consider an LLM chat app where each message triggers a procedure calling an LLM API (30-60s response time). With 10 concurrent users, that's 10 blocked instances, each holding an OS thread and a V8 heap, plus additional instances for interleaved work. The instance count grows with every concurrent long-running request and is never reclaimed.

PR #4663 addresses this for reducers, views, and lifecycle callbacks by moving them to a single-worker FIFO lane (JsInstanceLane). Procedures are deliberately left on the old pool because they block on rt.block_on(). V8 has native async/await and there's exactly one guest language to support, so async procedures are a natural fit. A file-by-file analysis of what the implementation would involve accompanies this issue (see comments below), along with a verification checklist, to help expedite this work.

How the WASM runtime already solves this

The WASM runtime doesn't have this problem. Both runtimes call the same instance_env.http_request(), but the paths diverge at the syscall layer:

// WASM path (wasmtime/wasm_instance_env.rs) — yields via async host function
let result = async { env.instance_env.http_request(request, body)?.await }.await;

// V8 path (v8/syscall/common.rs) — blocks the thread
let (response, body) = rt.block_on(env.instance_env.http_request(request, body)?)?;

The WASM runtime uses SingleCoreExecutor backed by a tokio::task::LocalSet — a single-threaded async executor where multiple tasks are multiplexed cooperatively. When a procedure yields at an async host function, the executor polls other tasks. Wasmtime bridges synchronous WASM guest code to async host functions via stack switching.

V8 doesn't have stack switching, but it doesn't need it — native async/await serves the same purpose. #4663 brings the reducer path closer to this model (single worker, FIFO queue) but doesn't add the async yielding that would let procedures share the worker.

What it looks like to module authors

// Synchronous procedure — works as today, blocks V8 thread
(ctx) => {
  const resp = ctx.http.fetch(url);
  return resp.text();
}

// Async procedure — V8 thread is free during await
async (ctx) => {
  const resp = await ctx.http.fetch(url);
  return resp.text();
}

Both forms coexist. Synchronous procedures use the existing rt.block_on() path unchanged. Async procedures yield at await points. Module authors adopt async at their own pace — no migration required. Reducers remain synchronous and never yield, same as the WASM runtime.

The runtime detects async functions at registration time via fn.constructor.name === 'AsyncFunction' and automatically selects the async execution path, providing AsyncProcedureCtx (where fetch() returns Promise<Response> instead of SyncResponse). No explicit flag is needed:

export const myProc = spacetimedb.procedure(
  { url: t.string() },
  t.string(),
  async (ctx, { url }) => {
    const resp = await ctx.http.fetch(url);
    return resp.text();
  }
);
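A minimal sketch of that detection check, assuming a helper at registration time (the `detectKind` name and `ProcedureKind` type are illustrative, not part of the actual runtime; only the constructor-name check comes from the proposal):

```typescript
type ProcedureKind = "sync" | "async";

function detectKind(fn: Function): ProcedureKind {
  // Async functions are instances of the AsyncFunction constructor.
  // Note: a plain function that merely returns a Promise is still classified
  // as "sync" by this check; authors opt in by writing `async` explicitly.
  return fn.constructor.name === "AsyncFunction" ? "async" : "sync";
}
```

This is why no explicit flag is needed: the `async` keyword on the handler itself is the opt-in signal.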

withTx vulnerability (ships independently)

The current withTx<T>(body: (ctx) => T): T signature silently accepts async callbacks — TypeScript infers T = Promise<X>, the call type-checks, and the transaction commits before the async body has finished running. This is a latent data corruption path that exists today.

Fix: a conditional type T extends Promise<any> ? never : T rejects async callbacks at compile time, plus a runtime thenable check as defense-in-depth. This can ship independently of async procedures.
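One way the guard could look (a sketch, not the final signature: `TxCtx` and the commit placement are illustrative, and intersecting `T` with the conditional type is one formulation that makes inference reject async callbacks at the call site):

```typescript
interface TxCtx { /* transaction handle */ }

// If the callback returns a Promise, this collapses to `never`, so the
// callback's actual return type is no longer assignable and the call
// fails to type-check.
type NotAsync<T> = T extends Promise<any> ? never : T;

function withTx<T>(body: (ctx: TxCtx) => T & NotAsync<T>): T {
  const ctx: TxCtx = {};
  const result: T = body(ctx);
  // Runtime defense-in-depth: reject thenables that slip past the types.
  if (result !== null && typeof result === "object" &&
      typeof (result as { then?: unknown }).then === "function") {
    throw new TypeError("withTx body must be synchronous; got a thenable");
  }
  // ...the real implementation commits the transaction here...
  return result;
}
```

The runtime check matters because callers can always cast around the type system; the thenable test catches those at the commit boundary.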

Key implementation areas

The changes needed have been traced through the codebase. The full file-by-file analysis and verification checklist are in the comments below. Here's a summary of the key areas:

Event loop. The synchronous for request in request_rx.iter() loop in spawn_instance_worker() becomes an async event loop with tokio::task::LocalSet, FuturesUnordered for in-flight async futures, and v8::MicrotasksPolicy::kExplicit to control when Promise resolutions propagate. The scheduling priority (biased toward the request channel, with a non-blocking try_next so in-flight futures are never starved) ensures reducers aren't delayed by async procedure work.
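The real loop is Rust (tokio::task::LocalSet plus FuturesUnordered); as a sketch of the same scheduling shape, with all names illustrative, the request-biased loop looks roughly like:

```typescript
type Request = { run: () => Promise<void> };

// Worker loop sketch: drain the request queue first (biased, non-blocking),
// then park on the in-flight set until some procedure makes progress.
async function workerLoop(queue: Request[]): Promise<void> {
  const inFlight = new Set<Promise<void>>();
  while (queue.length > 0 || inFlight.size > 0) {
    // Bias: start every queued request before waiting on in-flight work,
    // so short synchronous calls (reducers) aren't delayed by procedures.
    while (queue.length > 0) {
      const req = queue.shift()!;
      const p: Promise<void> = req.run().then(
        () => { inFlight.delete(p); },
        () => { inFlight.delete(p); },
      );
      inFlight.add(p);
    }
    if (inFlight.size > 0) {
      await Promise.race(inFlight); // yield until one in-flight call settles
    }
  }
}
```

In the Rust version the queue is a live channel rather than an array, and the microtask checkpoint after each completion is explicit; the bias and the non-blocking drain are the part this sketch is meant to show.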

Per-call state. InstanceEnv and JsInstanceEnv store per-call mutable state (start_time, func_name, tx, iters, call_times, timing_spans) in single slots. With multiple procedures in-flight, these move into a HashMap<CallId, CallContext>. Syscall handlers access the active call's state via env.current_call().
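The actual change is in Rust; a TypeScript sketch of the shape (CallContext fields, method names, and the class itself are illustrative, not the JsInstanceEnv API):

```typescript
type CallId = number;

interface CallContext {
  funcName: string;
  startTime: number;
  isAsync: boolean;
}

// Per-call slots move into a keyed map; the event loop marks which call is
// active before re-entering JS, so syscall handlers find the right state.
class InstanceEnvSketch {
  private calls = new Map<CallId, CallContext>();
  private activeCall: CallId | null = null;

  beginCall(id: CallId, ctx: CallContext): void {
    this.calls.set(id, ctx);
  }

  setActive(id: CallId): void {
    this.activeCall = id;
  }

  currentCall(): CallContext {
    if (this.activeCall === null || !this.calls.has(this.activeCall)) {
      throw new Error("no active call");
    }
    return this.calls.get(this.activeCall)!;
  }

  endCall(id: CallId): void {
    this.calls.delete(id);
    if (this.activeCall === id) this.activeCall = null;
  }
}
```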

Promise-returning syscalls. procedure_http_request branches on env.current_call().is_async: the sync path uses rt.block_on() unchanged, the async path creates a v8::PromiseResolver, stashes it with the Rust future, and returns the Promise to JS. When the future completes, the event loop resolves the Promise and runs a microtask checkpoint. This pattern also unlocks future streaming syscalls.
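The resolver-stash pattern, sketched in TypeScript (in the real code the resolver is a v8::PromiseResolver held on the Rust side; `PendingSyscalls` and the id scheme are illustrative):

```typescript
// The syscall handler returns a Promise immediately and records its resolver;
// the event loop resolves it when the host-side future completes.
class PendingSyscalls<T> {
  private next = 0;
  private pending = new Map<number, (value: T) => void>();

  // Called on the async syscall path: hand the Promise back to JS.
  begin(): { id: number; promise: Promise<T> } {
    const id = this.next++;
    const promise = new Promise<T>((resolve) => this.pending.set(id, resolve));
    return { id, promise };
  }

  // Called by the event loop when the host future finishes; in V8 this is
  // followed by a microtask checkpoint so `await` continuations run.
  complete(id: number, value: T): void {
    const resolve = this.pending.get(id);
    this.pending.delete(id);
    resolve?.(value);
  }
}
```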

__call_procedure__ ABI. Returns Promise<Uint8Array> for async procedures instead of Uint8Array. The Rust side branches on the isAsync metadata from registration; .now_or_never() is removed for async calls.

Host-side dispatch. #4663 already has per-request oneshot reply channels and a cloneable JsInstance. The remaining change is routing procedures through the lane worker instead of procedure_instances, with a semaphore-based max-in-flight limit (e.g. 100).

Trap handling. #4663's replace_active_if_current() mechanism handles traps. The addition is failing all in-flight async procedures' reply channels before replacing the worker.

Safety. A starvation watchdog (terminate_execution() after 30s without yielding) prevents runaway JS. A wall-clock timeout (fixed 5-minute default) catches never-resolving Promises. Both use hard cancellation — graceful cancellation (AbortSignal, cleanup deadlines) can follow later.
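The wall-clock timeout is essentially a race against a timer; a sketch (the helper name is illustrative; the 5-minute default and hard-cancellation semantics follow the proposal):

```typescript
// Rejects if `work` doesn't settle within `ms`. Hard cancellation: the
// caller drops the result, the underlying work isn't notified (graceful
// AbortSignal support is a listed follow-up).
function withWallClockTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout>;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`procedure exceeded ${ms}ms wall-clock limit`)),
      ms,
    );
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

The starvation watchdog is different in kind: terminate_execution() interrupts JS that never yields, whereas this timeout catches JS that yields but whose Promise never settles.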

Suggested incremental path

Steps 1-2 are independent of #4663 and could be contributed immediately. Steps 3-6 build on #4663's single-worker lane architecture.

  1. Pool shrinking (independent): idle timeout on ModuleInstanceManager so instances are reclaimed after peak load.
  2. withTx hardening (independent): conditional type + runtime thenable check. Fixes a latent bug that exists today.
  3. Event loop + per-call state: async event loop infrastructure + CallContext map. No behavior change yet — all operations still synchronous.
  4. Route procedures through lane worker: remove procedure_instances, add FuturesUnordered + max-in-flight semaphore.
  5. Promise-returning syscalls + async procedures: the actual feature — AsyncProcedureCtx, Promise-based fetch(), __call_procedure__ ABI branching.
  6. Starvation watchdog + wall-clock timeout: safety mechanisms.
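Step 1's idle reclamation could look roughly like this (shape only; the instance interface and reap function are illustrative, not ModuleInstanceManager's actual API):

```typescript
interface PooledInstance {
  lastUsedMs: number;
  dispose(): void; // frees the OS thread and V8 isolate
}

// Periodic sweep: drop any instance idle longer than `idleLimitMs`.
// Returns how many instances were reclaimed.
function reapIdle(
  pool: Map<string, PooledInstance>,
  idleLimitMs: number,
  nowMs: number,
): number {
  let reaped = 0;
  for (const [key, inst] of pool) {
    if (nowMs - inst.lastUsedMs > idleLimitMs) {
      inst.dispose();
      pool.delete(key);
      reaped++;
    }
  }
  return reaped;
}
```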

Relationship to open PRs

Scope considerations

The following are intentionally left out of the core proposal to keep scope manageable. They're natural follow-ups once the foundation is in place:

  • Graceful cancellation (AbortSignal, cleanup deadlines) — let procedures catch cancellation and run cleanup before being killed.
  • Client disconnect cancellation — cancel only the disconnecting client's procedures.
  • Module shutdown drain mode — grace period for in-flight procedures during update/hotswap.
  • Scheduled procedure overlap — repeat-scheduled async procedures overlapping instead of running serially. Breaking semantic change, needs its own discussion.
  • Overflow workers — spawn additional workers when in-flight count exceeds threshold.
  • Per-procedure timeout configuration — custom timeouts instead of the fixed 5-minute default.
