Handle race condition between cancel requests and server reuse #717

Merged
JelteF merged 1 commit into pgbouncer:master on Nov 17, 2022

Conversation

@JelteF JelteF (Member) commented May 12, 2022

Cancel requests happen out of band with regular queries. Because
PgBouncer reuses server connections across clients, it's possible to get
race conditions here if you're not careful.

Without this patch there were two possibilities for a race condition
where a cancel request for client A could cancel a query that was sent
by client B. These race conditions happened when:

  1. A cancel request from client A was forwarded to the server only after
    the server was reused by another client.
  2. A cancel request for client A was already forwarded to the server,
    but not yet handled by the server, and the server was then reused for
    client B.

The first race condition was especially likely if connection
establishment takes a while, for example because of TLS round trips,
since a new connection needs to be established for every forwarded
cancellation request. While this connection is being established, the
query running on the server might finish on its own, and the server
could then be reused by another client that was already waiting.

To resolve these race conditions, the following two fixes are implemented
(a minimal sketch of fix 1 follows the list):

  1. We now only forward the cancel request if the server connection over
    which the query was sent hasn't become idle or disconnected in the
    meantime. This is done by doubly linking cancel requests and the
    original server connection that was used for the query.
  2. Once a cancel request is forwarded to the server, we do not allow reuse of
    the server until the cancel request completes. This required tracking
    the server's response to cancel requests too, instead of closing the
    connection right after forwarding the cancel.
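
A minimal sketch of the linking behind fix 1, using illustrative names rather than PgBouncer's actual structs (`struct Connection` and its fields are assumptions made for this sketch):

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative, not PgBouncer's real layout: a pending cancel request
 * and the server connection running the query to be canceled point at
 * each other. */
struct Connection {
	struct Connection *canceled_server; /* set on the cancel request */
	struct Connection *cancel_request;  /* set on the linked server */
	bool active;                        /* server still runs the query */
};

/* When the server goes idle or disconnects, break the link so the
 * cancel request gets dropped instead of forwarded. */
static void unlink_cancel(struct Connection *server)
{
	if (server->cancel_request) {
		server->cancel_request->canceled_server = NULL;
		server->cancel_request = NULL;
	}
}

/* Fix 1: only forward the cancel while the link is still intact. */
static bool may_forward_cancel(const struct Connection *cancel)
{
	return cancel->canceled_server != NULL
		&& cancel->canceled_server->active;
}
```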

Fixes #544

In passing, this also adds more comprehensive comments to the different
StatLists in the PgPool struct, because understanding how the state
machine worked was the part of this PR that took me the longest.

@JelteF JelteF force-pushed the fix-cancel-race-condition branch 3 times, most recently from 1fa16c4 to 3891b64 on May 12, 2022
@JelteF JelteF force-pushed the fix-cancel-race-condition branch 4 times, most recently from 293ff31 to bec7e42 on June 3, 2022
@JelteF JelteF force-pushed the fix-cancel-race-condition branch from bec7e42 to d6bc244 on June 15, 2022
@x4m x4m commented Jul 18, 2022

I'm not really familiar with PgBouncer's code, but in Odyssey we generate a cancellation key for each client, so a client cannot cancel someone else's server connection. Does this resolve the issue?

Anyway, the behavior of cancels is somewhat problematic. Sometimes the server's CPU is starving due to a large number of fork()s. Clients start to cancel their queries, but cancels produce even more fork()s! And ISTM this PR addresses that issue, doesn't it?

@JelteF JelteF (Member, Author) commented Jul 18, 2022

@x4m PgBouncer also generates unique cancellation tokens per client, so that alone won't protect you against the same issue in Odyssey.

To clarify the issue: these unique cancellation tokens, when received by the connection pooler, are handled by sending a cancellation token to the actual Postgres server. This token should be the one that matches the Postgres connection on which the client's query is currently running. The problem is that while this cancellation token is being sent to Postgres, the query on that connection might finish, and the Postgres connection might be reused for a query from another client. If that's possible in Odyssey, then you almost certainly have the same bug.

One easy thing to try is to run the test I wrote for this PR against Odyssey and see what happens.

> Anyway, the behavior of cancels is somewhat problematic. Sometimes the server's CPU is starving due to a large number of fork()s. Clients start to cancel their queries, but cancels produce even more fork()s! And ISTM this PR addresses that issue, doesn't it?

I'm not sure why you think this PR addresses that issue. It's definitely not the purpose of the PR, but it might reduce cancellation load as a side effect.

@JelteF JelteF force-pushed the fix-cancel-race-condition branch from d6bc244 to 7d4ba39 on July 25, 2022
Review comments on include/bouncer.h and src/objects.c (resolved)
@sumedhpathak sumedhpathak commented

The other comment we discussed was to add this behind a feature flag, so we could toggle it off in case there is an issue.

@JelteF JelteF force-pushed the fix-cancel-race-condition branch 6 times, most recently from 1b6bf02 to a24d8bd on August 15, 2022
@sumedhpathak sumedhpathak left a comment

LGTM!

@JelteF JelteF force-pushed the fix-cancel-race-condition branch 2 times, most recently from b6a23af to 414028d on August 31, 2022
@JelteF JelteF merged commit ea4fb5f into pgbouncer:master on Nov 17, 2022
sxd pushed a commit to PascalBourdier/cloudnative-pg that referenced this pull request Dec 20, 2022
according pgbouncer/pgbouncer#717

Signed-off-by: Pascal Bourdier <pascal.bourdier@gmail.com>
JelteF added a commit to JelteF/pgbouncer that referenced this pull request Mar 13, 2023
JelteF added a commit that referenced this pull request Mar 24, 2023
Fixes a bug introduced by #717

Related to #801 (and hopefully fixes it)
JelteF added a commit to JelteF/pgbouncer that referenced this pull request Apr 21, 2023
Cancellations and regular queries are quite different, so it makes sense
to have different (default) timeouts for cancellation requests than for
queries. While I already think our default query_wait_timeout of 120
seconds is excessively long, it makes no sense at all to have such a
long timeout by default for cancellation requests that are not sent. So
this introduces a new `cancel_wait_timeout`, which is set to 10 seconds
by default. If the cancellation request cannot be forwarded within that
time, it's probably better to drop it.

One reason why it's important to close unhandled cancellation requests
relatively quickly is that libpq only allows the client to send a cancel
in a blocking way. So there's no way to bail out after a certain number
of seconds on the client side if PgBouncer keeps the connection to the
client open. Both Marco and I noticed this when testing my PgBouncer
peering PR (pgbouncer#666), when an entry in the `[peers]` list pointed
to an unreachable host. That meant that PgBouncer would keep the cancel
connection open, and even though the query had finished, `psql` would
not display the result because it was stuck waiting for a response to
the cancel request.

To avoid race conditions, cancellation requests that have already been
forwarded but not yet answered are not timed out. See pgbouncer#717 for
details on such race conditions.
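
For reference, the new setting would live in the `[pgbouncer]` section of pgbouncer.ini; the value shown is the default named in this commit message, and the snippet is a sketch rather than canonical documentation:

```ini
[pgbouncer]
; Drop a cancellation request if it cannot be forwarded to the
; server within this many seconds.
cancel_wait_timeout = 10
```
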
JelteF added a commit to JelteF/pgbouncer that referenced this pull request Apr 24, 2023
JelteF added a commit that referenced this pull request Apr 25, 2023
JelteF added a commit to JelteF/pgbouncer that referenced this pull request May 25, 2023
In most cases the Postgres server will not send anything back in
response to a cancel message; it simply closes the connection. It won't
even report an error if the cancellation key is incorrect or the PID is
unknown. However, it is possible that it sends some other type of error
message, such as an OOM error. Libpq actually ignores those errors and
won't show them to the user, but they might still be sent by the server.
Because of the changes in pgbouncer#717 this could cause a FATAL error.
This fixes that by handling `SV_ACTIVE_CANCEL` the same as `SV_ACTIVE`
in `server_proto`.
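
A minimal sketch of the fix, assuming simplified names; only the `SV_ACTIVE`/`SV_ACTIVE_CANCEL` case labels and the `server_proto` behavior come from the commit message:

```c
#include <stdbool.h>

enum SocketState { SV_IDLE, SV_ACTIVE, SV_ACTIVE_CANCEL, SV_BEING_CANCELED };

/* After the fix, a server that is busy sending a cancel request is
 * treated like any other active server, so an unexpected error packet
 * no longer triggers a FATAL "server in bad state". */
static bool server_packet_expected(enum SocketState state)
{
	switch (state) {
	case SV_ACTIVE:
	case SV_ACTIVE_CANCEL:	/* the fix: same handling as SV_ACTIVE */
		return true;
	default:
		return false;
	}
}
```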

Fixes pgbouncer#847

Related to pgbouncer#717, pgbouncer#801, and pgbouncer#815
JelteF added a commit that referenced this pull request May 30, 2023
JelteF added a commit to JelteF/pgbouncer that referenced this pull request May 30, 2023
JelteF added a commit that referenced this pull request May 31, 2023
JelteF added a commit to JelteF/pgbouncer that referenced this pull request Aug 17, 2023
The cancel request protocol is very simple (a sketch of the packet
format follows the list):
1. The client sends a cancel key
2. The server cancels the query running on the matching backend
3. The server closes the socket to indicate the cancel request was
   handled
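
For context, the cancel key travels in a PostgreSQL CancelRequest packet, which is simply four big-endian 32-bit integers; the helper below is a hypothetical sketch of building one:

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Build a CancelRequest packet: length, magic request code, then the
 * PID/secret cancel key handed out at connection start (in PgBouncer's
 * case, the per-client key it generated itself). */
static size_t build_cancel_request(uint8_t buf[16],
                                   uint32_t backend_pid,
                                   uint32_t secret_key)
{
	uint32_t fields[4] = {
		htonl(16),          /* total packet length, including itself */
		htonl(80877102),    /* CancelRequest code: 1234 << 16 | 5678 */
		htonl(backend_pid),
		htonl(secret_key),
	};
	memcpy(buf, fields, sizeof(fields));
	return sizeof(fields);
}
```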

At no point after sending the cancel key is the client required to send
any data, and if it does, it won't get a response anyway because the
server would close the connection. So most clients don't; specifically,
no libpq-based clients do.

The Go client, however, sends a Terminate ('X') packet before closing the
cancel socket from its own side (usually due to a 10-second timeout).
This would cause a crash in PgBouncer, because PgBouncer would still
listen on the client socket, but on receiving any data we would report a
fatal error. This PR fixes that fatal error by no longer listening on the
client socket once we've received the cancel key (i.e. when we change
the client state to CL_WAITING_CANCEL).
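
A sketch of the fix under simplified, assumed names (`CL_WAITING_CANCEL` is from the PR; `struct Client` and `stop_listening` are made up for illustration):

```c
/* Illustrative types; PgBouncer's real client objects differ. */
struct Client {
	int state;
	int fd;
};

enum { CL_WAITING_CANCEL = 1 };

/* Hypothetical stub: a real event loop would drop fd from its read set. */
static void stop_listening(int fd)
{
	(void)fd;
}

/* Once the full cancel key has been read, nothing the client sends
 * matters anymore, so stop watching the socket instead of treating
 * later packets (like Go's Terminate) as a fatal protocol error. */
static void received_cancel_key(struct Client *client)
{
	client->state = CL_WAITING_CANCEL;
	stop_listening(client->fd);
}
```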

I was able to reproduce both of the errors reported by @raymasson and
confirmed that after this change the errors would not occur again.

Reproducing this with libpq-based clients is not possible, except by
manually attaching to the client using `gdb` and manually running `send`
on the cancel request's socket. So no test is added, since our test
suite uses psycopg, which uses libpq under the hood.

Fixes pgbouncer#904

Related to pgbouncer#717
JelteF added a commit that referenced this pull request Aug 18, 2023
JelteF added a commit that referenced this pull request Aug 18, 2023
Fixes #822
Related to #904 and #717

In #904, @tnt-dev reported still hitting the issue from #822 on 1.19.1 (where
it was expected to be fixed).

```
2023-08-12T16:34:07+00:00 pgbouncer[4920]: @src/server.c:591 in function server_proto(): server_proto: server in bad state: 11
2023-08-12T16:34:07+00:00 systemd[1]: pgbouncer.service: Main process exited, code=exited, status=1/FAILURE
2023-08-12T16:34:07+00:00 systemd[1]: pgbouncer.service: Failed with result 'exit-code'.
2023-08-12T16:34:07+00:00 systemd[1]: pgbouncer.service: Consumed 1h 51min 3.308s CPU time.
2023-08-12T16:34:07+00:00 pgbouncer[4920]: @src/server.c:591 in function server_proto(): server_proto: server in bad state: 11
2023-08-12T16:34:07+00:00 pgbouncer[4920]: started sending cancel request
```

Looking again, I now understand that the switch in question was missing
the SV_BEING_CANCELED case. I have now added it to the same case list
that SV_IDLE was part of, since SV_BEING_CANCELED is effectively the
same state as SV_IDLE (except that we don't allow reuse of the server yet).

I did the same in a second switch statement; for that one, only warnings
would have been shown instead of a fatal error being caused, but it still
seems good to avoid unnecessary warnings.
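
A sketch of the change with simplified names; the case labels match the description above, everything else is illustrative:

```c
#include <stdbool.h>

enum SocketState { SV_IDLE, SV_ACTIVE, SV_BEING_CANCELED };

/* After the fix, a server whose query finished while its cancel
 * request is still in flight is handled like an idle server, instead
 * of hitting the "server in bad state" fatal error. */
static bool idle_like(enum SocketState state)
{
	switch (state) {
	case SV_IDLE:
	case SV_BEING_CANCELED:	/* added: idle, but not reusable yet */
		return true;
	default:
		return false;
	}
}
```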

One example where this fatal error might occur is when PgBouncer is
still waiting for a response to the cancel request it sent to the
server, but the query that's being canceled completes before that
happens. This puts the server in SV_BEING_CANCELED. Then, for whatever
reason, the server sends a NOTICE or ERROR message (still before the
cancel request receives a response from Postgres).

I was able to reproduce this (and confirm that the patch resolves it) by
adding some fake latency between PgBouncer and Postgres, and stopping
Postgres while the server was in the SV_BEING_CANCELED state.