Description
Occasionally, PostgreSQL connections become permanently stuck in state=active, wait_event_type=Client, wait_event=ClientRead after client deployments. These connections never recover on their own: statement_timeout does not apply to ClientRead waits, so the only remedy is manual termination via pg_terminate_backend.
Observed behaviour
After deploying an application that connects through pgdog, we often observe stuck connections in pg_stat_activity like:
state: active
wait_event_type: Client
wait_event: ClientRead
query: SELECT "customers".* FROM "customers" WHERE ...
query_start: 2026-02-27 15:39:26.121194+00
state_change: 2026-02-27 15:39:26.121197+00
Key observation: query_start and state_change are 3 microseconds apart. This means PostgreSQL received a Parse message (which set the query text and query_start) and then immediately entered ClientRead: it never received the corresponding Bind/Execute/Sync messages.
These connections remain stuck indefinitely (we've observed >1 hour). They hold locks and cause cascading blocking across the database.
Environment
- pgdog used as a connection pooler in front of PostgreSQL
- pooler_mode = "transaction"
- query_parser = "off"
- The issue often appears after application deployments (when TCP connections to pgdog are abruptly terminated)
- Does not reproduce when the same application connects via pgbouncer
Suspected root cause
Server::send() in src/backend/server.rs iterates over messages in a ClientRequest, calling send_one() for each. Each send_one() call writes to a BufStream with a default capacity of 4096 bytes (net_buffer config).
When PreparedStatements::handle() returns HandleResult::Prepend (re-preparing a statement on a different server connection), an additional Parse message is prepended to the outgoing sequence. This increases the total byte count of the message sequence.
If the total serialised size exceeds the BufStream write buffer capacity (4096 bytes), BufStream will auto-flush its internal buffer to the underlying TCP socket during a write_all call. This means part of the extended protocol sequence (e.g. the Parse message) can be sent to PostgreSQL before the remaining messages (Bind/Execute/Sync) have been written to the buffer.
If a subsequent send_one() call then fails (e.g. due to a network error), the ? operator exits the loop before flush() is called. PostgreSQL has already received the Parse but will never receive Bind/Execute/Sync, leaving it waiting in ClientRead forever.
Even without an explicit error, the partial flush creates a window where the protocol sequence is split across multiple TCP writes, which could interact with connection failures during deployment.
Suggested fix
Pre-serialise all messages (including any prepended prepared statement re-preparations) into a single byte buffer before writing anything to the stream. This ensures the complete extended protocol sequence is handed to the stream as a unit: either all messages reach PostgreSQL or none do.
Secondary concern: inline response loop doesn't monitor client socket
client_server_exchange() in src/frontend/client/query_engine/query.rs reads server responses in a tight loop without monitoring the client socket:
while self.backend.has_more_messages() && !self.backend.in_copy_mode() && !self.streaming {
    let message = self.read_server_message().await?;
    self.process_server_message(context, message).await?;
}
If the client disconnects during this loop, pgdog won't detect it until the loop completes. For long-running queries, this means the server connection stays occupied even though the client is gone. As a defence-in-depth measure, adding a select! with a client socket check (using the existing Stream::check() method) would let pgdog detect client disconnections promptly and cancel the server-side query.
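A pseudocode sketch of that pattern (the client_stream field, the behaviour of check(), and the cancellation step are assumptions for illustration, not pgdog's actual API):

```rust
// Pseudocode sketch, not a drop-in patch: assumes check() resolves with an
// error when the client socket closes, and that a cancel request can be
// issued for the in-flight server query.
while self.backend.has_more_messages() && !self.backend.in_copy_mode() && !self.streaming {
    tokio::select! {
        // Bias towards server progress so a healthy exchange behaves
        // exactly as it does today.
        biased;

        message = self.read_server_message() => {
            self.process_server_message(context, message?).await?;
        }

        // Fires if the client socket closes or errors mid-exchange.
        res = self.client_stream.check() => {
            // Issue a cancel for the running query, then propagate the
            // disconnect so the server connection is released.
            res?;
        }
    }
}
```

One caveat worth checking in review: each select! branch must be cancellation-safe, since the losing future is dropped on every loop iteration.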