Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL errors when the connection used is no longer valid (pgx driver version) #3432

Open
5 of 6 tasks
chacmool opened this issue Feb 7, 2023 · 1 comment
Open
5 of 6 tasks
Labels
bug Something is not working.

Comments

@chacmool
Copy link

chacmool commented Feb 7, 2023

Preflight checklist

Describe the bug

We found some connectivity issues on our database connections. When one node of our cluster was not available anymore (because a node upgrade or any other reason), we found that some SQLs failed with this kind of errors:

  • write: broken pipe
  • ERROR: server is shutting down (SQLSTATE 57P01)

This was happening because the connection was pointing to a "dying" cluster node, so the connection was not valid anymore.

We temporarily fixed the issue by setting the ConnMaxLifetime to 2 minutes. This way we have a connection life short enough to avoid having connections to unavailable nodes. This patch solved the issue for a while but it led us to another one: high latency on our requests because of too frequent reconnections.

We then thought of another solution. Let the connections live forever and apply some kind of retryer: when a connection fails and it is no longer valid, remove it from the connection pool and retry the query with a new connection.

Looking at golang’s documentation (http://go-database-sql.org/errors.html) we saw that golang itself should be dealing with this retries:

You don’t need to implement any logic to retry failed statements when this happens. As part of the connection pooling in database/sql, handling failed connections is built-in. If you execute a query or other statement and the underlying connection has a failure, Go will reopen a new connection (or just get another from the connection pool) and retry, up to 10 times.

https://github.com/golang/go/blob/56a14ad4bc19d5ee9d4257f370a570377e81e544/src/database/sql/sql.go#L1531-L1546

We also saw at golang’s repo that even the number of retries was reduced from 10 to 2, which makes sense... why should you retry many times on an already dead connection? golang/go@c468f94

So we did some research at pgx repo and we found that there was a related known issue which is fixed on the v5 versión of the driver: jackc/pgx#672 (comment)

We upgraded our workloads to use the new v5 pgx drivers without too much effort and we test that the retryer works well.

That's why we propose to upgrade the pgx driver on this repo

_ "github.com/jackc/pgx/v4/stdlib"

Reproducing the bug

  1. Start the database
  2. Start any workload with a database connection
  3. Perform any query and check that works correctly
  4. Shutdown and restart the database
  5. If you try any query now, it should fail

Performing the same steps with a v5 pgx driver it does not fail.

Relevant log output

No response

Relevant configuration

No response

Version

1.11.8 (but applies to latest also)

On which operating system are you observing this issue?

None

In which environment are you deploying?

None

Additional Context

No response

@chacmool chacmool added the bug Something is not working. label Feb 7, 2023
@glerchundi
Copy link
Contributor

@aeneasr just to let you now that we managed to fix the issue in all of our development components and verified that everything is working fine after the upgrade to pgx/v5. We're now using infinite connection lifetimes as Go handles reconnections and protect us from cluster changes (upgrades, maintenance, ...).

aeneasr added a commit to gobuffalo/pop that referenced this issue Mar 13, 2023
Version 5 of pgx finally resolves some long-awaited issues when PostgreSQL shuts down.

The upgrade is a drop-in replacement with the only change being the pgconn import path.

See ory/hydra#3432

Signed-off-by: aeneasr <3372410+aeneasr@users.noreply.github.com>
aeneasr added a commit to gobuffalo/pop that referenced this issue Mar 13, 2023
Version 5 of pgx finally resolves some long-awaited issues when PostgreSQL shuts down.

The upgrade is a drop-in replacement with the only change being the pgconn import path.

See ory/hydra#3432

Signed-off-by: aeneasr <3372410+aeneasr@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is not working.
Projects
None yet
Development

No branches or pull requests

2 participants