Hi, I ran into a worker crash loop using pg through pg-boss and pg.Pool on Node 20. There appears to be a connect-time window in which a raw socket error can fire before pg's normal error handling is attached.
What I observed
- periodic worker crashes with uncaught ECONNRESET
- the crash signature consistently looked like a raw Socket with:
  - bytesRead: 0
  - bytesWritten: 20
  - no socket error listeners attached
- this looked like a brand new PostgreSQL connection attempt that reset before the usual pg client error path was ready
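For context, a missing listener turns into a process crash because of standard Node.js behavior: an 'error' event emitted on an EventEmitter that has no 'error' listener is thrown instead. A minimal sketch of that behavior, using a plain EventEmitter as a stand-in for the raw socket (emitReset is a hypothetical helper name used only for illustration, not part of pg or pg-boss):

```javascript
const { EventEmitter } = require('node:events');

// Emit a synthetic ECONNRESET and report whether a listener absorbed it
// or whether emit() threw because no 'error' listener was attached.
function emitReset(emitter) {
  const err = Object.assign(new Error('read ECONNRESET'), { code: 'ECONNRESET' });
  try {
    emitter.emit('error', err);
    return 'handled';
  } catch (thrown) {
    return 'thrown';
  }
}

const unguarded = new EventEmitter();   // no 'error' listener
const guarded = new EventEmitter();
guarded.on('error', () => {});          // any listener prevents the throw

console.log(emitReset(unguarded)); // thrown
console.log(emitReset(guarded));   // handled
```

On a real socket the emit happens inside Node's I/O callbacks, so there is no surrounding try/catch and the throw surfaces as an uncaught exception.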
Why I think this belongs in pg rather than pg-boss
- the failure appears to happen during raw socket connect, before a client is fully established
- in my app, adding listener coverage at the pg-boss or pool-client level was not sufficient
- pg@8.11.5 lib/stream.js creates the socket with new net.Socket()
- pg@8.11.5 lib/connection.js calls stream.connect(...) before attaching stream.on("error", reportStreamError)
What I tried
- adding listener coverage after pool or client connect was not sufficient
- I also tried a local pg patch in this area, but it did not eliminate the crashes in my environment
- the mitigation that consistently stopped the crash loop was a worker-level guard that suppresses only raw-socket ECONNRESET when listenerCount("error") === 0
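The condition described in the last item could be sketched roughly as follows. The function name isUnhandledSocketReset is hypothetical, and the sketch assumes the caller already holds a reference to the emitter that raised the error; how to obtain that reference is not covered here.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical predicate: true only for an ECONNRESET raised by an emitter
// that has no 'error' listener attached at the time of the check.
function isUnhandledSocketReset(err, emitter) {
  return (
    err instanceof Error &&
    err.code === 'ECONNRESET' &&
    emitter instanceof EventEmitter &&
    emitter.listenerCount('error') === 0
  );
}

// Stand-in objects for illustration only.
const reset = Object.assign(new Error('read ECONNRESET'), { code: 'ECONNRESET' });
const bare = new EventEmitter();
const covered = new EventEmitter();
covered.on('error', () => {});

console.log(isUnhandledSocketReset(reset, bare));               // true
console.log(isUnhandledSocketReset(reset, covered));            // false
console.log(isUnhandledSocketReset(new Error('other'), bare));  // false
```

Keeping the check this narrow means any other error, and any ECONNRESET on a socket that already has a handler, still follows the normal path.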
Relevant observed signature
- raw Socket
- ECONNRESET
- bytesRead: 0
- bytesWritten: 20
Environment
- Node 20
- pg 8.11.5
- usage via pg-boss / pg.Pool
Possible fix
One possible direction would be to protect the raw socket before connect() can emit an error, for example by either:
- attaching the socket error listener before calling stream.connect(...) in pg/lib/connection.js
- or ensuring sockets created in pg/lib/stream.js have temporary early error handling until the normal pg connection error path is attached
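As a rough illustration of the second option, a socket could carry a temporary 'error' listener from the moment it is created, which is removed once the regular handler takes over. This is only a sketch under assumptions: createGuardedSocket and its onEarlyError callback are hypothetical names, and this is not how pg/lib/stream.js is currently written.

```javascript
const net = require('node:net');

// Hypothetical helper: create a socket that already has an 'error' listener
// before connect() is ever called, plus a function to drop that listener.
function createGuardedSocket(onEarlyError) {
  const socket = new net.Socket();
  socket.on('error', onEarlyError);
  const release = () => socket.removeListener('error', onEarlyError);
  return { socket, release };
}

// Usage sketch with a synthetic error instead of a real connection.
const seen = [];
const { socket, release } = createGuardedSocket((err) => seen.push(err.code));

// An early reset is captured by the temporary listener instead of being thrown.
socket.emit('error', Object.assign(new Error('read ECONNRESET'), { code: 'ECONNRESET' }));
console.log(seen); // [ 'ECONNRESET' ]

// Attach the regular handler first (a stand-in here), then drop the guard,
// so the socket is never left without an 'error' listener.
socket.on('error', () => {});
release();
```

The ordering at the end is the important part: releasing the guard before the regular handler is attached would reopen the same gap.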
Questions
- is this a known connect-time race in pg?
- is there a recommended way to ensure the raw socket is protected before connect() can emit an error?
- would maintainers be open to a change here, or is there an existing fix or workaround I missed?
I can provide a proposed patch and more detailed logs if helpful.