What I observed is that sometimes clients get out of sync and start reading the result of the previous query. This doesn't happen deterministically, but it roughly follows this sequence:
1. A query will time out.
2. A cancel request is sent due to the timeout.
3. The query eventually completes.
4. A new query is sent and the response is a plain `ReadyForQuery` (the result is not there).
5. The next query receives the result of the previous query.
I wrote this Java console app that can reproduce the issue. However, I couldn't get it to fail with a local pgbouncer and postgres; it only reproduces when pgbouncer and postgres are on separate boxes with around 0.3ms latency.
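For reference, here is a minimal sketch of the kind of loop the console app runs (the connection URL, credentials, and timings here are placeholder assumptions, not the actual app):

```java
import java.sql.*;

public class CancelRaceRepro {
    public static void main(String[] args) throws Exception {
        // Assumed pgbouncer endpoint and credentials; adjust as needed.
        String url = "jdbc:postgresql://localhost:6432/postgres";
        try (Connection conn = DriverManager.getConnection(url, "postgres", "postgres")) {
            for (int i = 0; i < 100_000; i++) {
                try (Statement st = conn.createStatement()) {
                    // pgjdbc sends a CancelRequest (on a separate connection) when this fires.
                    st.setQueryTimeout(1);
                    // Sleep just past the timeout so the cancel races with query completion.
                    st.execute("SELECT pg_sleep(1.05)");
                } catch (SQLException e) {
                    // Expected: query canceled due to the timeout.
                }
                // The next query should return i; if the connection is out of sync,
                // it returns the result of a previous query instead.
                try (Statement st = conn.createStatement();
                     ResultSet rs = st.executeQuery("SELECT " + i)) {
                    rs.next();
                    int got = rs.getInt(1);
                    if (got != i) {
                        System.out.println("Out of sync at iteration " + i + ": got " + got);
                        return;
                    }
                }
            }
        }
    }
}
```

The `setQueryTimeout` call is what makes pgjdbc fire the cancel; the point is to have it arrive at pgbouncer around the same time the query completes.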
This is what the client, using pgjdbc, is reading off the buffer (left). You can see the `ReadyForQuery` being received in the middle of a query being executed, as a result of the `CancelRequest` (sent on a separate connection). On the right is what pgbouncer observes.
This only happens when going through pgbouncer. It didn't reproduce when connecting to postgres directly.
`CancelRequest`s are sent by the client through a separate connection. Pgbouncer handles them by trying to find the original client connection that the `CancelRequest` refers to, and it may then decide to send a `ReadyForQuery` message on that connection. It only does this if it considers that client connection not linked.
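For context on why pgbouncer has to do this matching at all: a `CancelRequest` is not an ordinary message on the existing connection; it is a small fixed-format message sent on a brand-new connection, identified only by the backend PID and secret key from the `BackendKeyData` startup message. Roughly, this is what goes on the wire (pgjdbc does this internally; host and port here are placeholders):

```java
import java.io.DataOutputStream;
import java.net.Socket;

public class SendCancel {
    // Sends a protocol-level CancelRequest on a fresh connection.
    // backendPid and secretKey come from the BackendKeyData message
    // received when the connection being cancelled was opened.
    public static void sendCancelRequest(String host, int port,
                                         int backendPid, int secretKey) throws Exception {
        try (Socket sock = new Socket(host, port);
             DataOutputStream out = new DataOutputStream(sock.getOutputStream())) {
            out.writeInt(16);         // message length, including these 4 bytes
            out.writeInt(80877102);   // CancelRequest code (1234 in the high 16 bits, 5678 in the low)
            out.writeInt(backendPid); // backend process ID
            out.writeInt(secretKey);  // secret key
            out.flush();
            // The server processes the cancel and closes this connection; nothing is read back.
        }
    }
}
```

Because the cancel travels on its own connection, pgbouncer can only correlate it with a client by the cancel key it handed out, and it then has to decide what, if anything, to send on that client's connection. That decision is where the linked/not-linked check comes in.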
I find the behaviour pgbouncer exhibits for `CancelRequest` not in line with the postgres protocol documentation, and it may be breaking the correctness of the protocol.
I commented out this line in pgbouncer and the problem went away.
So it is a race condition that makes pgbouncer think the client connection is not linked, as shown in this diagram.
It became clearer that this has been a problem in the past when I saw this fix. But now I realise that it only fixed the problem for the subset of cases where the client was lucky enough to have its message already received by pgbouncer. All the other cases, where the request has been sent by the client but not yet received by pgbouncer, are still broken. And I think this problem cannot be fixed.
The question is, why was this behaviour introduced and what is it trying to optimise?