Promise resolved from net.Socket 'data' listener doesn't wake the await site #75

@proggeramlug

Description

Summary

The handshake completes end-to-end: the net.Socket 'connect' and 'data' events fire, our dispatcher processes every frame, and the Promise resolver (startupGate.resolve(conn)) is invoked on the correct object. Even so, the await connect(...) site in async function main() never resumes, and the process hangs indefinitely.

Reproduces on 0.5.87, 0.5.88, 0.5.89, 0.5.91 — predates the asm-sideeffect fix in #74.

Repro — full driver path

Repo: https://github.com/PerryTS/postgres, file examples/cold-start-minimal.ts:

import { connect } from '../src';

async function main(): Promise<void> {
    const conn = await connect({
        host: '127.0.0.1', port: 55432,
        user: 'perch_test', database: 'perch_test',
    });
    await conn.query('SELECT 1');
}
main().then(() => process.exit(0)).catch(() => process.exit(1));

Against a real Postgres on 127.0.0.1:55432. Bun and Node complete this in <50ms. Perry hangs forever.

What the traces show

With console.log traces inserted into the driver's event dispatch path, on Perry:

[trace] onSocketConnect id=1
[trace] sendStartup writing
[trace] sendStartup written
[trace] onSocketData id=1 len=432
[trace] onSocketData frames.length=17
[trace] handleFrame type=82 paylen=4       # AuthenticationOk
[trace] handleFrame type=83 paylen=19      # ParameterStatus × 15
[trace] handleFrame type=83 paylen=21
...
[trace] handleFrame type=83 paylen=21
[trace] handleFrame type=75 paylen=8       # BackendKeyData
[trace] handleFrame type=90 paylen=1       # ReadyForQuery — auth+setup done
(hangs here)
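For readers decoding the trace: the type values are the ASCII codes of the PostgreSQL backend message tags ('R' = 82, 'S' = 83, 'K' = 75, 'Z' = 90), and paylen is the length field minus its own 4 bytes. The following is a minimal sketch of that framing, not the driver's actual code; splitFrames, Frame and FRAME_NAMES are hypothetical names used only for illustration.

```typescript
// Backend message tags seen in the trace (PostgreSQL wire protocol).
const FRAME_NAMES: Record<number, string> = {
    82: 'Authentication',  // 'R'
    83: 'ParameterStatus', // 'S'
    75: 'BackendKeyData',  // 'K'
    90: 'ReadyForQuery',   // 'Z'
};

interface Frame {
    type: number;
    payload: Uint8Array;
}

// Hypothetical helper: split a received chunk into complete frames.
// Each frame is 1 tag byte, a big-endian Int32 length (which counts
// itself but not the tag), then length - 4 bytes of payload.
function splitFrames(buf: Uint8Array): Frame[] {
    const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
    const frames: Frame[] = [];
    let off = 0;
    while (off + 5 <= buf.byteLength) {
        const type = view.getUint8(off);
        const len = view.getInt32(off + 1); // DataView reads big-endian by default
        if (len < 4 || off + 1 + len > buf.byteLength) break; // malformed or partial frame
        frames.push({ type, payload: buf.subarray(off + 5, off + 1 + len) });
        off += 1 + len;
    }
    return frames;
}

// AuthenticationOk (4-byte payload) followed by ReadyForQuery with status 'I' (1-byte payload).
const sample = new Uint8Array([82, 0, 0, 0, 8, 0, 0, 0, 0, 90, 0, 0, 0, 5, 73]);
const decoded = splitFrames(sample).map(
    (f) => `${FRAME_NAMES[f.type]} paylen=${f.payload.length}`,
);
console.log(decoded);
```

The sample bytes reproduce the first and last lines of the handleFrame trace above: a type=82 frame with paylen=4 and a type=90 frame with paylen=1.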

Inside handleReadyForQuery, we pull the resolver local and call it:

const rs = st.startupGate.resolve;     // never been observed to be null
st.startupGate = null;
rs(st.connection);

No exception is thrown. The trace after rs(...) in handleReadyForQuery executes. But the await connect(...) site in main never resumes — no subsequent traces, no conn.query(...) call, no progress. Process sits idle indefinitely.
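To make the pattern concrete, here is a minimal sketch of a startup gate of the kind described above, under the assumption that it is a plain deferred Promise whose resolve function is captured and called later from an event handler. Connection, Gate, State and makeGate are hypothetical stand-ins, not the driver's real types, and a timer stands in for the 'data' event.

```typescript
// Hypothetical stand-ins for the driver's types.
interface Connection {
    id: number;
}

interface Gate<T> {
    promise: Promise<T>;
    resolve: (value: T) => void;
}

interface State {
    connection: Connection;
    startupGate: Gate<Connection> | null;
}

// Build a Promise and expose its resolver so a later event can settle it.
function makeGate<T>(): Gate<T> {
    let resolve!: (value: T) => void;
    const promise = new Promise<T>((res) => {
        resolve = res; // the executor runs synchronously, so this is set before return
    });
    return { promise, resolve };
}

// Mirrors the snippet above: take the resolver, clear the gate, then resolve.
function handleReadyForQuery(st: State): void {
    const gate = st.startupGate;
    if (gate === null) return;
    st.startupGate = null;
    gate.resolve(st.connection);
}

async function gateDemo(): Promise<number> {
    const gate = makeGate<Connection>();
    const st: State = { connection: { id: 1 }, startupGate: gate };
    // A timer stands in for the socket 'data' event here.
    setTimeout(() => handleReadyForQuery(st), 10);
    const conn = await gate.promise; // the await site that should resume
    return conn.id;
}

gateDemo().then((id) => console.log('resumed with connection', id));
```

Because the resolver here fires from a timer, this sketch exercises the path that is reported to work; the failing case differs only in where handleReadyForQuery is called from.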

Narrower repro (doesn't need Postgres, but doesn't trigger the hang)

async function main(): Promise<void> {
    const result = await new Promise<string>((resolve) => {
        setTimeout(() => resolve('from-timer'), 100);
    });
    console.log(result);
}
main();

This works on 0.5.91 — prints from-timer. So a Promise resolved from setTimeout does wake the await. The breaking case is resolving from inside a net.Socket 'data' event handler dispatched through the async stdlib task.
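A possible intermediate repro that keeps the socket path but still avoids Postgres is sketched below. It has not been run on Perry; it assumes Perry's net module supports createServer and listening on port 0, and resolveFromDataHandler is a hypothetical name. A throwaway local server sends a few bytes, and the Promise is resolved from inside the client's 'data' handler.

```typescript
import * as net from 'node:net';

// Hypothetical repro helper: resolve a Promise from a net.Socket 'data' listener.
function resolveFromDataHandler(): Promise<string> {
    return new Promise<string>((resolve, reject) => {
        // Throwaway server: send one small payload, then close the connection.
        const server = net.createServer((peer) => {
            peer.end('from-data');
        });
        server.on('error', reject);
        // Port 0 asks the OS for any free port.
        server.listen(0, '127.0.0.1', () => {
            const { port } = server.address() as net.AddressInfo;
            const socket = net.connect(port, '127.0.0.1');
            socket.on('error', reject);
            socket.on('data', (chunk) => {
                // Tear down so the process can exit, then resolve from the handler.
                socket.destroy();
                server.close();
                resolve(String(chunk));
            });
        });
    });
}

async function main(): Promise<void> {
    const result = await resolveFromDataHandler();
    console.log(result);
}
main();
```

If this prints from-data on Bun and Node but hangs on Perry, the hang would be isolated to resolving from a 'data' listener, independent of the driver.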

I suspect the issue is specific to how the async bridge (crates/perry-stdlib/src/common/async_bridge.rs) queues the callback back to the main thread — maybe the JSValue is being created on a tokio worker thread, crossing the thread-local-arena boundary noted in the comment:

// IMPORTANT: perry-runtime uses thread-local arenas for memory allocation.
// This means JSValue objects created on tokio worker threads will be allocated
// from a different arena than the main thread, causing memory corruption.

The resolve value (st.connection) is a JSValue object that was created on the main thread, but maybe the resolve() call itself is happening on the tokio thread via the socket event dispatch path.
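One way to probe this from the TypeScript side, as a diagnostic rather than a fix: route the resolve through a zero-delay timer, since the timer path is the one known to wake the await. resolveViaTimer is a hypothetical helper; in handleReadyForQuery it would replace the direct rs(st.connection) call with resolveViaTimer(rs, st.connection). This assumes setTimeout scheduled from inside the 'data' handler is itself delivered correctly, which is not established.

```typescript
// Hypothetical diagnostic helper: defer the resolve call to the timer queue
// instead of calling it directly from the socket event handler.
function resolveViaTimer<T>(resolve: (value: T) => void, value: T): void {
    setTimeout(() => resolve(value), 0);
}

async function timerHopDemo(): Promise<string> {
    return await new Promise<string>((resolve) => {
        resolveViaTimer(resolve, 'via-timer');
    });
}

timerHopDemo().then((v) => console.log(v));
```

If the await resumes with this change in place, that would suggest the problem lies in the context the 'data' callback runs in rather than in Promise resolution itself; if it still hangs, the timer scheduled from that context is likely affected too.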

Repro environment

  • Perry 0.5.91 (also 0.5.87–0.5.89)
  • macOS 26.4
  • Postgres 16 on localhost:55432, perch_test user, trust auth

Impact

Blocks meaningful per-query benchmarking of the @perry/postgres driver on Perry. Bun and Node bench numbers are current; Perry numbers can't be collected until this is resolved.

Happy to add more trace output or test candidate fixes.
