Handle race condition between cancel requests and server reuse #717
Conversation
Force-pushed 1fa16c4 to 3891b64
Force-pushed 293ff31 to bec7e42
Force-pushed bec7e42 to d6bc244
I'm not really familiar with PgBouncer's code, but in Odyssey we generate a cancellation key for each client, so a client cannot cancel someone else's server connection. Does this resolve the issue? Anyway, the behavior of cancels is somewhat problematic. Sometimes the server's CPU is starved by a large number of fork()s. Clients start to cancel their queries. But cancels produce more fork()s! And ISTM this PR addresses the issue, doesn't it?
@x4m PgBouncer also generates unique cancellation tokens per client. So that alone won't protect you against the same issue in Odyssey. To clarify the issue: these unique cancellation tokens, when received by the connection pooler, are handled by sending a cancellation token to the actual Postgres server. This cancellation token should be the one that matches the Postgres connection on which the client's query is currently running. The problem is that while this cancellation token is being sent to Postgres, the query on that connection might finish, and the Postgres connection might be reused for a query from another client. If that's possible in Odyssey, then you almost certainly have the same bug. One easy thing to try is to run the test I wrote for this PR against Odyssey and see what happens.
I'm not sure why you think this PR addresses the fork() load issue. It's definitely not the purpose of the PR, but it might reduce cancellation load as a side effect.
Force-pushed d6bc244 to 7d4ba39
The other thing we discussed was adding this behind a feature flag, so we could toggle it off in case there is an issue.
Force-pushed 1b6bf02 to a24d8bd
LGTM!
Force-pushed a24d8bd to 5e66bb3
Force-pushed b6a23af to 414028d
According to pgbouncer/pgbouncer#717. Signed-off-by: Pascal Bourdier <pascal.bourdier@gmail.com>
Fixes a bug introduced by pgbouncer#717. Fixes pgbouncer#801.
Cancellations and regular queries are quite different, so it makes sense to have different (default) timeouts for cancellation requests than for queries. While I already think our default query_wait_timeout of 120 seconds is excessively long, it makes no sense at all to have such a long default timeout for cancellation requests that have not been sent yet. So this introduces a new `cancel_wait_timeout`, set to 10 seconds by default. If a cancellation request cannot be forwarded within that time, it's probably better to drop it.

One reason why it's important to close unhandled cancellation requests relatively quickly is that libpq only allows the client to send a cancel in a blocking way. So there's no way to bail out after a certain number of seconds on the client side if PgBouncer keeps the connection to the client open. Both Marco and I noticed this when testing my PgBouncer peering PR (#666), when an entry in the `[peers]` list pointed to an unreachable host. PgBouncer would keep the cancel connection open, and even though the query had finished, `psql` would not display the result because it was stuck waiting for a response to the cancel request.

To avoid race conditions, cancellation requests that have already been forwarded but not yet answered are not timed out. See #717 for details on such race conditions.
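For illustration, here is a minimal libpq sketch (not code from this PR) of the blocking behavior described above; the connection string and query are placeholder assumptions:

```c
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
	char errbuf[256];
	/* Placeholder connection string pointing at a PgBouncer instance. */
	PGconn *conn = PQconnectdb("host=localhost port=6432 dbname=postgres");
	if (PQstatus(conn) != CONNECTION_OK) {
		fprintf(stderr, "%s", PQerrorMessage(conn));
		return 1;
	}

	/* Start a long-running query without waiting for its result. */
	PQsendQuery(conn, "SELECT pg_sleep(60)");

	PGcancel *cancel = PQgetCancel(conn);
	/* PQcancel() opens a separate connection, sends the cancel key, and
	 * then blocks until the other side closes that connection. If the
	 * pooler holds the cancel connection open, the client hangs right
	 * here with no way to time out; cancel_wait_timeout bounds that. */
	if (!PQcancel(cancel, errbuf, sizeof(errbuf)))
		fprintf(stderr, "cancel failed: %s\n", errbuf);
	PQfreeCancel(cancel);
	PQfinish(conn);
	return 0;
}
```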
In most cases the postgres server will not send anything back in response to a cancel message; it simply closes the connection. It won't even report an error if the cancellation key is incorrect or the PID is unknown. However, it is possible that it sends another type of error message, such as an OOM error. Libpq actually ignores those errors and won't show them to the user, but they might still be sent by the server. Because of the changes in #717 this could cause a FATAL error. This fixes that by handling `SV_ACTIVE_CANCEL` the same as `SV_ACTIVE` in `server_proto`. Fixes #847. Related to #717, #801, and #815. (cherry picked from commit c9aa1c6)
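A minimal self-contained sketch of the shape of this fix, with the dispatch reduced to just the states named above (the real `server_proto` in src/server.c does much more):

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified server states; only the ones relevant to this fix. */
enum server_state { SV_ACTIVE, SV_ACTIVE_CANCEL, SV_IDLE };

/* An ErrorResponse ('E') arriving while the connection is delivering a
 * cancel request (SV_ACTIVE_CANCEL) is now handled like one arriving on
 * an active connection, instead of falling through to the fatal
 * "server in bad state" branch. */
static void server_proto(enum server_state state, char pkt_type)
{
	switch (state) {
	case SV_ACTIVE:
	case SV_ACTIVE_CANCEL:	/* previously missing: caused the FATAL */
		printf("state %d: handled packet '%c'\n", state, pkt_type);
		break;
	default:
		fprintf(stderr, "server_proto: server in bad state: %d\n", state);
		exit(1);
	}
}

int main(void)
{
	server_proto(SV_ACTIVE, 'E');
	server_proto(SV_ACTIVE_CANCEL, 'E');	/* no longer fatal */
	return 0;
}
```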
The cancel request protocol is very simple:
1. The client sends a cancel key
2. The server cancels the query running on the matching backend
3. The server closes the socket to indicate the cancel request was handled

At no point after sending the cancel key is the client required to send any data, and if it does, it won't get a response anyway because the server will close the connection. So most clients don't; specifically, all libpq-based clients don't. The Go client, however, sends a Terminate ('X') packet before closing the cancel socket from its own side (usually due to a 10 second timeout). This would cause a crash in PgBouncer, because PgBouncer would still listen on the client socket, but on receiving any data we would report a fatal error.

This PR fixes that fatal error by no longer listening on the client socket once we've received the cancel key (i.e. when we change the client state to CL_WAITING_CANCEL). I was able to reproduce both of the errors reported by @raymasson and confirmed that after this change the errors do not occur again. Reproducing this with libpq-based clients is not possible, except by manually attaching to the client using `gdb` and manually running `send` on its cancel request socket. So no test is added, since our test suite uses psycopg, which uses libpq under the hood. Fixes #904. Related to #717.
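To make step 1 concrete, here is a sketch of a raw CancelRequest as defined by the PostgreSQL wire protocol (16 bytes: length, the fixed code 80877102, backend PID, secret key); the host, port, and key values are placeholders:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send a raw CancelRequest packet to a server or pooler and close the
 * socket, like libpq does. The server cancels the backend matching
 * (pid, key) and closes its side; it never sends a reply. */
int send_cancel_request(const char *ip, int port, uint32_t pid, uint32_t key)
{
	uint32_t buf[4];
	struct sockaddr_in addr;
	int sock = socket(AF_INET, SOCK_STREAM, 0);

	if (sock < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);
	inet_pton(AF_INET, ip, &addr.sin_addr);
	if (connect(sock, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
		close(sock);
		return -1;
	}

	buf[0] = htonl(16);		/* total message length */
	buf[1] = htonl(80877102);	/* CancelRequest code (1234<<16 | 5678) */
	buf[2] = htonl(pid);		/* backend PID from BackendKeyData */
	buf[3] = htonl(key);		/* secret key from BackendKeyData */
	if (send(sock, buf, sizeof(buf), 0) != (ssize_t) sizeof(buf)) {
		close(sock);
		return -1;
	}

	/* Correct behavior is to just close here; sending anything more,
	 * like the Go client's Terminate ('X') packet, is what tripped
	 * the crash fixed by this PR. */
	close(sock);
	return 0;
}
```

A caller would pass the PID and secret key from the BackendKeyData message it received at connection startup.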
Fixes #822. Related to #904 and #717.

In #904 @tnt-dev reported still hitting the issue from #822 on 1.19.1 (on which it was expected to be fixed):

```
2023-08-12T16:34:07+00:00 pgbouncer[4920]: @src/server.c:591 in function server_proto(): server_proto: server in bad state: 11
2023-08-12T16:34:07+00:00 systemd[1]: pgbouncer.service: Main process exited, code=exited, status=1/FAILURE
2023-08-12T16:34:07+00:00 systemd[1]: pgbouncer.service: Failed with result 'exit-code'.
2023-08-12T16:34:07+00:00 systemd[1]: pgbouncer.service: Consumed 1h 51min 3.308s CPU time.
2023-08-12T16:34:07+00:00 pgbouncer[4920]: @src/server.c:591 in function server_proto(): server_proto: server in bad state: 11
2023-08-12T16:34:07+00:00 pgbouncer[4920]: started sending cancel request
```

Looking again, I now understand that the switch in question was missing the SV_BEING_CANCELED case. I have now added it to the same case list that SV_IDLE was part of, since SV_BEING_CANCELED is effectively the same state as SV_IDLE (except we don't allow reuse of the server yet). I did the same in a second switch; for that one, only warnings would have been shown instead of a fatal error, but it still seems good to avoid unnecessary warnings.

One example where this fatal error might occur is when PgBouncer is still waiting for a response to the cancel request it sent to the server, but the query being canceled completes before that happens. This puts the server in SV_BEING_CANCELED. Then, for whatever reason, the server sends a NOTICE or ERROR message (still before the cancel request receives a response from Postgres). I was able to reproduce this (and confirm that the patch resolves it) by adding some fake latency between PgBouncer and Postgres, and stopping Postgres while the server was in the SV_BEING_CANCELED state.
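A minimal sketch of the case-list change described here, again with the dispatch reduced to just the relevant states (the handler name is hypothetical; only the state names come from the commit message):

```c
#include <stdio.h>
#include <stdlib.h>

enum server_state { SV_IDLE, SV_BEING_CANCELED, SV_ACTIVE };

/* A NOTICE/ERROR arriving while the server is waiting for its forwarded
 * cancel request to complete (SV_BEING_CANCELED) is handled like one
 * arriving on an idle server, since no client query is attached in
 * either state. */
static void handle_async_message(enum server_state state, char pkt_type)
{
	switch (state) {
	case SV_IDLE:
	case SV_BEING_CANCELED:	/* the missing case added by this fix */
		printf("state %d: logged async message '%c'\n", state, pkt_type);
		break;
	default:
		fprintf(stderr, "server_proto: server in bad state: %d\n", state);
		exit(1);
	}
}

int main(void)
{
	/* Query finished while its cancel was still in flight, then
	 * Postgres sent a NOTICE: must not be fatal anymore. */
	handle_async_message(SV_BEING_CANCELED, 'N');
	return 0;
}
```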
Cancel requests happen out of band with regular queries. Because PgBouncer reuses server connections across clients, it's possible to get race conditions here if you're not careful.

Without this patch there were two possibilities for a race condition where a cancel request for client A could cancel a query that was sent by client B. These race conditions happened when:
1. A cancel request on client A was forwarded to the server only after the server was reused by another client.
2. A cancel request for client A was already forwarded to the server, but not yet handled by the server, and then the server was reused for client B.

The first race condition was especially likely if connection establishment takes a while, for example because of TLS roundtrips, since a new connection needs to be established for every forwarded cancellation request. While establishing this connection the query running on the server might finish by itself, and then the server could be reused by another already waiting client.

To resolve these race conditions the following two fixes are implemented (a simplified sketch follows below):
1. We now only forward the cancel request if the server connection over which the query was sent hasn't become idle or disconnected in the meantime. This is done by doubly-linking cancel requests and the original server connection that was used for the query.
2. Once a cancel request is forwarded to the server, we do not allow reuse of the server until the cancel request completes. This required tracking the server's response to cancel requests too, instead of closing the connection right after forwarding the cancel.

Fixes #544
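A hedged sketch of the linking idea behind these two fixes; the struct and function names here are hypothetical simplifications, not PgBouncer's actual PgSocket code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Each in-flight cancel request is linked to the server connection whose
 * query it targets, and the server keeps a list of cancel requests
 * targeting it, so either side can find and detach the other when its
 * state changes. */
struct cancel_request;

struct server_conn {
	bool query_running;                 /* still running client A's query? */
	struct cancel_request *cancel_list; /* cancels targeting this server */
};

struct cancel_request {
	struct server_conn *target;         /* server that got the query, or NULL */
	struct cancel_request *next;        /* next cancel for the same server */
};

/* Fix 1: only forward a cancel if the original server connection hasn't
 * become idle or disconnected in the meantime (target is unlinked and set
 * to NULL when that happens), so it can never hit a reused connection. */
bool should_forward_cancel(const struct cancel_request *req)
{
	return req->target != NULL && req->target->query_running;
}

/* Fix 2: once a cancel is forwarded, the server must not be handed to
 * another client until the cancel completes and unlinks itself. */
bool server_can_be_reused(const struct server_conn *server)
{
	return !server->query_running && server->cancel_list == NULL;
}
```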
In passing this also adds more comprehensive comments to the different `StatList`s in the `PgPool` struct, because understanding how the state machine worked was the part of this PR that took me the longest.