LISTEN/NOTIFY sometimes drops without recovery #541

Open
@psteinroe

Summary

We have been using Graphile Worker in production for a while now, more specifically this PR (#474). It works remarkably well. However, every few months the LISTEN/NOTIFY connection seems to drop and does not recover.

Our logs show ECONNREFUSED at about the time this starts.

ERROR: Failed during pool sweep (migrationNumber=19): Error: connect ECONNREFUSED
...
ERROR: Failed to update heartbeat for pool pool-7699a0aba218dd3402: Error: connect ECONNREFUSED

After that, it is pretty obvious from the log frequency that the worker only polls every few minutes and is no longer using LISTEN/NOTIFY.

(upload not working, retrying in a few minutes)

The "fix" is simply to restart the worker.

Steps to reproduce

Not really sure, just happens irregularly. Sorry!

Expected results

A worker should recover.
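
For illustration, "recover" here could mean something like the reconnect loop sketched below: when the listening connection drops, keep retrying with a capped exponential backoff until LISTEN is re-established. This is only a rough sketch under assumptions, not Graphile Worker's actual code. `Listener`, `makeListener`, `shouldStop`, `backoffDelay` and `listenWithRecovery` are hypothetical names made up for this example, and the default delays are arbitrary.

```typescript
// Hypothetical shape of "something that holds a LISTEN connection".
// This is NOT Graphile Worker's API; it only exists for illustration.
interface Listener {
  // Resolves once the connection is open and LISTEN has been issued.
  start(): Promise<void>;
  // Resolves when the connection is lost (error or end).
  closed(): Promise<void>;
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelay(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Keeps a listener alive until shouldStop() returns true.
async function listenWithRecovery(
  makeListener: () => Listener,
  shouldStop: () => boolean,
  baseMs = 1000, // assumed default, not taken from the library
  maxMs = 30000, // assumed default, not taken from the library
): Promise<void> {
  let attempt = 0;
  while (!shouldStop()) {
    try {
      const listener = makeListener();
      await listener.start();
      attempt = 0; // connected again, so reset the backoff
      await listener.closed(); // wait here until the connection drops
    } catch {
      // Connecting failed, e.g. ECONNREFUSED while the database is down.
    }
    if (shouldStop()) break;
    await sleep(backoffDelay(attempt, baseMs, maxMs));
    attempt += 1;
  }
}
```

The point of the sketch is only that a failed connection attempt leads to another attempt later, instead of the worker falling back to polling for good.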

Actual results

LISTEN/NOTIFY never recovers.

Additional context

Postgres v15
Node 20
