
Only use positive numbers in the PID part of the cancel key #945


Merged
@JelteF merged 1 commit into pgbouncer:master from positive-pid-in-cancel-key on Oct 5, 2023

Conversation

@JelteF (Member) commented Sep 6, 2023

We were using the full 64 bits of the BackendKeyData message as random
bytes. This turned out to be arguably incorrect, because the first 32
bits are used by PostgreSQL as a Process ID (and this is part of the
protocol documentation too). Actual PIDs are always positive, but we
also put a random bit in the sign bit, so we were setting it to negative
numbers half of the time. In most cases this does not matter, but it
turned out that pg_basebackup relied on the PID part actually being
positive.

While pg_basebackup is now fixed to support negative Process IDs, it
still seems good to adhere to this implicit requirement on positive
numbers in case other clients also depend on it.
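
As a minimal sketch of what the fix amounts to (this is not PgBouncer's actual code; `fill_cancel_key` and the use of `rand()` are placeholder choices for illustration), forcing the PID part to be positive just means clearing one bit:

```c
#include <stdint.h>
#include <stdlib.h>

/* Sketch: fill an 8-byte cancel key with random bytes, then clear the
 * sign bit of the 32-bit PID part so it always reads as positive. */
static void fill_cancel_key(uint8_t key[8])
{
	for (int i = 0; i < 8; i++)
		key[i] = (uint8_t) (rand() & 0xff);

	/* The PID is the first 4 bytes in network (big-endian) order,
	 * so its sign bit is the top bit of key[0]. Force it to 0. */
	key[0] &= 0x7f;
}
```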

Since this change moves the bytes of the cancel key in which we encode
the peer_id, it breaks cancellations in peered clusters running
different PgBouncer versions. This seems like a minor enough problem to
accept. In practice it should only happen during a rolling upgrade,
which we currently don't support well anyway (see #902 for improvements
on that). And even if we did, breaking cancellations for a few minutes
in this transitional stage doesn't seem like a huge deal.

@petere (Member) commented Oct 4, 2023

While pg_basebackup is now fixed to support negative Process IDs, it still seems good to adhere to this implicit requirement on positive numbers in case other clients also depend on it.

That makes sense.

Since this change moves the bytes of the cancel key in which we encode the peer_id, it breaks cancellations in peered clusters running different PgBouncer versions. This seems like a minor enough problem to accept. In practice it should only happen during a rolling upgrade, which we currently don't support well anyway (see #902 for improvements on that). And even if we did, breaking cancellations for a few minutes in this transitional stage doesn't seem like a huge deal.

I don't think I follow this. While temporary breakage during rolling upgrades might be a minor problem, what about different pgbouncer versions on different hosts that are peered together? We don't want to force all of those to be upgraded together.

Maybe a compatibility break is necessary, but then we should just say so.

What would happen if we left the peer ID as is (in the first bytes)?

@JelteF (Member, Author) commented Oct 4, 2023

Maybe a compatibility break is necessary, but then we should just say so.

Fair enough. I'll update the docs in the PR and note that cross-version peering is not supported across this version change. (And also include it in the changelog.)

What would happen if we left the peer ID as is (in the first bytes)?

Then we'd cut the usable peer_id space in half, because the sign bit always needs to be 0. The maximum value of peer_id (16383) is also documented, so changing that would be a breaking change too. Since we're breaking compatibility either way, I'd rather do it the way this PR does, so we can keep the same number of peer_ids.
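
To make the arithmetic concrete: 16383 is 2^14 − 1, so a peer_id occupies 14 bits. A hypothetical sketch (the field layout and function name are invented for illustration, not taken from PgBouncer) of why keeping the peer ID in the first bytes would halve the ID space:

```c
#include <stdint.h>

#define PEER_ID_BITS 14
#define MAX_PEER_ID ((1u << PEER_ID_BITS) - 1)	/* documented maximum: 16383 */

static uint32_t pid_part_keeping_old_layout(uint16_t peer_id, uint32_t random_bits)
{
	peer_id &= MAX_PEER_ID;

	/* peer_id in the top 14 bits, randomness in the low 18 bits */
	uint32_t pid = ((uint32_t) peer_id << (32 - PEER_ID_BITS))
		| (random_bits & ((1u << (32 - PEER_ID_BITS)) - 1));

	/* Forcing the sign bit (bit 31) to 0 clobbers bit 13 of peer_id,
	 * leaving only 13 usable bits. */
	return pid & 0x7fffffffu;
}
```

For example, peer_id 8192 (only bit 13 set) would decode back as 0 after the mask, so only IDs 0–8191 would remain usable.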

@JelteF force-pushed the positive-pid-in-cancel-key branch from 8330df4 to 9182a18 on October 5, 2023 07:48
We were using the full 64 bits of the BackendKeyData message as random
bytes. This turned out to be arguably incorrect, because the first 32
bits are used by PostgreSQL as a Process ID (and this is part of the
[protocol documentation too][1]). Actual PIDs are always positive, but
we also put a random bit in the sign bit, so we were setting it to
negative numbers half of the time. In most cases this does not matter,
but it turned out that [`pg_basebackup` relies on the PID part actually
being positive][2].

While it seems semi-likely that `pg_basebackup` will be fixed to support
negative Process IDs, it still seems good to adhere to this implicit
requirement on positive numbers in case other clients also depend on it.

Since this change moves the bytes of the cancel key in which we encode
the `peer_id`, it breaks cancellations in peered clusters running
different PgBouncer versions. This seems like a minor enough problem to
accept. In practice it should only happen during a rolling upgrade,
which we currently don't support well anyway (see pgbouncer#902 for
improvements on that). And even if we did, breaking cancellations for a
few minutes in this transitional stage doesn't seem like a huge deal.

[1]: https://www.postgresql.org/docs/current/protocol-message-formats.html
[2]: https://www.postgresql.org/message-id/flat/CAGECzQQOGvYfp8ziF4fWQ_o8s2K7ppaoWBQnTmdakn3s-4Z%3D5g%40mail.gmail.com
@JelteF force-pushed the positive-pid-in-cancel-key branch from 9182a18 to 7e68480 on October 5, 2023 07:50
@JelteF (Member, Author) commented Oct 5, 2023

I updated the docs to include the cross-version breakage.

@JelteF merged commit 5065deb into pgbouncer:master on Oct 5, 2023
@JelteF deleted the positive-pid-in-cancel-key branch on October 5, 2023 13:32
JelteF added a commit that referenced this pull request May 6, 2024
In session pooling mode PgBouncer is pretty much a transparent proxy,
i.e. the client normally does not even need to know that PgBouncer is in
the middle. This allows things like load balancing and failovers without
the client needing to know about them at all. But as soon as replication
connections were needed, this was no longer possible, because PgBouncer
would reject them instead of proxying them to the right server.

This PR fixes that by also proxying replication connections. They are
handled quite differently from normal connections though. A client and
server replication connection form a strong pair: as soon as one is
closed, the other is closed too. So there's no caching of server
replication connections, like there is for regular connections. Reusing
replication connections comes with a ton of gotchas; Postgres will throw
errors in many cases when trying to do so. So simply not doing it seems
like a good tradeoff for ease of implementation, especially because
replication connections are pretty much always very long-lived, so
reusing them would gain pretty much no performance benefit. A sketch of
the pairing follows below.

Fixes #382

Depends on #945
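
A rough sketch of that 1:1 pairing (all names here are hypothetical simplifications for illustration, not PgBouncer's actual internals):

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for PgBouncer's socket objects. */
struct connection {
	int fd;
};

static void connection_close(struct connection *conn)
{
	printf("closing fd %d\n", conn->fd);
	free(conn);
}

/* A replication client is bound 1:1 to a dedicated server connection.
 * Closing the pair tears down both sides at once; the server side is
 * never returned to a pool the way regular server connections are. */
struct repl_pair {
	struct connection *client;
	struct connection *server;
};

static void repl_pair_close(struct repl_pair *pair)
{
	connection_close(pair->client);
	connection_close(pair->server);
	free(pair);
}
```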