Host function callbacks can deadlock when calling back into the sandbox #192

@simongdavies

Problem

When a host function callback (registered via registerHostFunction or setHostPrintFn) tries to call back into the same sandbox (e.g. callHandler, snapshot, restore, unload), it deadlocks.

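This happens because call_handler holds the LoadedJSSandbox mutex for the entire duration of guest execution. Host functions are dispatched via a TSFN (N-API thread-safe function) to the Node.js main thread while that lock is held. If the callback then calls any method that needs the same lock, it waits forever: the lock is released only when call_handler returns, and call_handler cannot return until the callback completes.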
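
The cycle can be modelled in a few lines of plain JavaScript. This is only a toy sketch, not the hyperlight-js implementation: AsyncMutex, ToySandbox and demoDeadlock are hypothetical names, the mutex stands in for the tokio::sync::Mutex in the NAPI layer, and "guest execution" is reduced to invoking the host function registered under the handler's name.

```javascript
// Hypothetical minimal async mutex: acquire() resolves with a release function.
class AsyncMutex {
  constructor() {
    this.tail = Promise.resolve();
  }
  acquire() {
    let release;
    const next = new Promise((resolve) => { release = resolve; });
    // Resolves once every earlier holder has released.
    const ready = this.tail.then(() => release);
    this.tail = next;
    return ready;
  }
}

// Toy stand-in for the sandbox: one lock guards everything.
class ToySandbox {
  constructor() {
    this.lock = new AsyncMutex();
    this.hostFunctions = new Map();
  }
  registerHostFunction(name, fn) {
    this.hostFunctions.set(name, fn);
  }
  // Holds the lock for the whole "guest execution", host callbacks included.
  async callHandler(name) {
    const release = await this.lock.acquire();
    try {
      // Assumption: the "guest" just calls the host function of the same name.
      const hostFn = this.hostFunctions.get(name);
      return hostFn ? await hostFn() : `handled ${name}`;
    } finally {
      release();
    }
  }
}

// Races the re-entrant call against a timer so the hang becomes observable.
async function demoDeadlock(timeoutMs = 50) {
  const sandbox = new ToySandbox();
  sandbox.registerHostFunction('callback', async () => {
    // Waits on the lock that the outer callHandler is still holding.
    await sandbox.callHandler('other_handler');
    return 'result';
  });
  const timeout = new Promise((resolve) => {
    setTimeout(() => resolve('deadlock'), timeoutMs);
  });
  return Promise.race([sandbox.callHandler('callback'), timeout]);
}
```

In this model demoDeadlock() always settles through the timer branch, because the outer call never reaches its finally block.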

Why this doesn't happen in core hyperlight

In hyperlight-dev/hyperlight, the host function registry (Arc<Mutex<FunctionRegistry>>) uses a separate lock from the sandbox. Host functions are dispatched synchronously while the VM is paused — they don't need the sandbox lock at all. See src/hyperlight_host/src/sandbox/outb.rs.

In hyperlight-js, the QuickJS runtime invokes host function closures inside handle_event, which requires &mut self on the sandbox. The NAPI layer wraps this in a single tokio::sync::Mutex, so host function dispatch and sandbox lifecycle share the same lock.

Current workaround

PR #55 adds an executing_flag (AtomicBool) that detects reentrancy at runtime. If a callback tries to acquire the lock while guest code is executing, it returns ERR_REENTRANT instead of deadlocking. This prevents hangs but doesn't allow the operation to succeed.
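
A rough JavaScript sketch of the flag idea follows. It is not the code from PR #55: GuardedSandbox, ReentrancyError and demoGuard are hypothetical names, a plain boolean stands in for the AtomicBool, and the mutex is left out entirely, so this toy would also reject ordinary concurrent calls that a real lock would simply queue.

```javascript
// Hypothetical error type carrying the ERR_REENTRANT code.
class ReentrancyError extends Error {
  constructor() {
    super('sandbox is already executing guest code');
    this.code = 'ERR_REENTRANT';
  }
}

class GuardedSandbox {
  constructor() {
    this.executing = false; // stands in for the AtomicBool executing_flag
    this.hostFunctions = new Map();
  }
  registerHostFunction(name, fn) {
    this.hostFunctions.set(name, fn);
  }
  async callHandler(name) {
    // Fail fast instead of waiting on a lock that can never be released.
    if (this.executing) {
      throw new ReentrancyError();
    }
    this.executing = true;
    try {
      // Assumption: the "guest" just calls the host function of the same name.
      const hostFn = this.hostFunctions.get(name);
      return hostFn ? await hostFn() : `handled ${name}`;
    } finally {
      this.executing = false;
    }
  }
}

// The callback reports the error code it sees when it re-enters the sandbox.
async function demoGuard() {
  const sandbox = new GuardedSandbox();
  sandbox.registerHostFunction('callback', async () => {
    try {
      await sandbox.callHandler('other_handler');
      return 'result';
    } catch (err) {
      return err.code;
    }
  });
  return sandbox.callHandler('callback');
}
```

Here the nested call fails immediately and the outer call still completes, which matches the behaviour described above: no hang, but the nested operation does not succeed.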

Suggested fix

Separate host function dispatch from the sandbox lock, similar to how core hyperlight does it. Options:

  1. Move host function state out of the &mut self borrow so callbacks don't need the sandbox lock
  2. Temporarily release the sandbox lock before dispatching to host functions, reacquire after
  3. Provide a shared FFI/binding helper crate that handles this pattern correctly for any language binding
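
As an illustration of option 2 only, the toy model below drops the lock around the host callback and reacquires it afterwards. All names (AsyncMutex, ReleasingSandbox, demoRelease) are hypothetical, and the sketch addresses the lock cycle alone: whether the real guest can tolerate a nested callHandler or a restore while it is paused mid-call is a separate question that this code does not answer.

```javascript
// Hypothetical minimal async mutex: acquire() resolves with a release function.
class AsyncMutex {
  constructor() {
    this.tail = Promise.resolve();
  }
  acquire() {
    let release;
    const next = new Promise((resolve) => { release = resolve; });
    const ready = this.tail.then(() => release);
    this.tail = next;
    return ready;
  }
}

class ReleasingSandbox {
  constructor() {
    this.lock = new AsyncMutex();
    this.hostFunctions = new Map();
  }
  registerHostFunction(name, fn) {
    this.hostFunctions.set(name, fn);
  }
  async callHandler(name) {
    let release = await this.lock.acquire();
    try {
      // Assumption: the "guest" just calls the host function of the same name.
      const hostFn = this.hostFunctions.get(name);
      if (!hostFn) {
        return `handled ${name}`;
      }
      // Give up the lock while the host callback runs...
      release();
      release = null;
      const value = await hostFn();
      // ...and take it again before resuming "guest" work.
      release = await this.lock.acquire();
      return value;
    } finally {
      // Release only if the lock is currently held by this call.
      if (release) {
        release();
      }
    }
  }
}

// The callback can now call back into the same sandbox without hanging.
async function demoRelease() {
  const sandbox = new ReleasingSandbox();
  sandbox.registerHostFunction('callback', async () => {
    const inner = await sandbox.callHandler('other_handler');
    return `callback saw: ${inner}`;
  });
  return sandbox.callHandler('callback');
}
```

In this model the nested call acquires the free lock, finishes, and releases it before the outer call reacquires, so demoRelease() completes instead of hanging.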

Reproduction

const loaded = await sandbox.getLoadedSandbox();

proto.registerHostModule('mymod', (mod) => {
  mod.registerHostFunction('callback', async () => {
    // This deadlocks (or returns ERR_REENTRANT with the PR #55 workaround)
    await loaded.callHandler('other_handler', {});
    return 'result';
  });
});

Labels: bug, lifecycle/needs-review
