While thinking about potential sources of the infamous PoolTimedOut error, I realized that there's an interesting failure mode in acquire().
Once it decides to open a new connection, that's all it tries to do: https://github.com/launchbadge/sqlx/blob/e1ac3881734293cb33674a8b0b1d983132b9c2b1/sqlx-core/src/pool/inner.rs#L283-L284
If a nonfatal connection error happens, it just continues in the backoff loop in connect() and never touches the idle queue again: https://github.com/launchbadge/sqlx/blob/e1ac3881734293cb33674a8b0b1d983132b9c2b1/sqlx-core/src/pool/inner.rs#L348
It will continue to do this until the timeout if the transient error does not resolve itself.
Right now, only the Postgres driver overrides DatabaseError::is_transient_in_connect_phase(), but one of the error codes it considers transient is the "too many connections" error: https://github.com/launchbadge/sqlx/blob/e1ac3881734293cb33674a8b0b1d983132b9c2b1/sqlx-postgres/src/error.rs#L192-L195
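To make the classification concrete, here is a minimal sketch of the kind of check involved. The helper name `treat_as_transient_on_connect` is hypothetical and is not sqlx's API. The sketch lists only SQLSTATE `53300` (`too_many_connections` in the PostgreSQL error code table); the real list is at the lines linked above.

```rust
/// Hypothetical helper (not part of sqlx): decides whether a SQLSTATE seen
/// while connecting should be retried instead of returned to the caller.
fn treat_as_transient_on_connect(sqlstate: &str) -> bool {
    // 53300 = too_many_connections. The driver's actual list lives at the
    // linked lines and is not reproduced here.
    matches!(sqlstate, "53300")
}

fn main() {
    // 28P01 = invalid_password, used here as an example of a fatal error.
    for code in ["53300", "28P01"] {
        println!("{code}: transient = {}", treat_as_transient_on_connect(code));
    }
}
```

Anything classified as transient here is retried inside the connect backoff loop rather than surfaced to the caller.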
This means that if the max_connections of the pool exceeds what is currently available on the server, tasks can get stuck in a loop trying to open new connections despite there being idle connections available, leading to surprising PoolTimedOut errors.
This may be the cause of some of the reported PoolTimedOut issues, although it's only likely to occur with the Postgres driver.
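The failure mode can be reproduced in a toy model. This is a simplified simulation, not sqlx's code: `ToyPool`, `acquire`, the tick-based deadline, and the `recheck_idle` flag are all assumptions made for illustration. The `recheck_idle = true` branch is only a contrast case, not a description of how sqlx behaves or should be fixed.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
enum AcquireError {
    PoolTimedOut,
}

/// Toy stand-in for a pool: a queue of idle connection ids plus a flag
/// saying whether the server would accept a new connection.
struct ToyPool {
    idle: VecDeque<u32>,
    server_has_capacity: bool,
}

impl ToyPool {
    /// Stand-in for opening a connection. `None` models a transient error
    /// such as "too many connections".
    fn try_connect(&self) -> Option<u32> {
        if self.server_has_capacity { Some(100) } else { None }
    }
}

/// Simulated acquire. The deadline is counted in retry ticks instead of wall
/// time, and `release_at` is the tick at which another task returns
/// connection 1 to the idle queue.
fn acquire(
    pool: &mut ToyPool,
    deadline_ticks: u32,
    release_at: u32,
    recheck_idle: bool,
) -> Result<u32, AcquireError> {
    // First look at the idle queue; if it is empty, commit to connecting.
    if let Some(conn) = pool.idle.pop_front() {
        return Ok(conn);
    }
    for tick in 0..deadline_ticks {
        if tick == release_at {
            pool.idle.push_back(1); // another task releases its connection
        }
        if recheck_idle {
            if let Some(conn) = pool.idle.pop_front() {
                return Ok(conn);
            }
        }
        if let Some(conn) = pool.try_connect() {
            return Ok(conn);
        }
        // Transient error: back off and retry (the sleep is elided here).
    }
    Err(AcquireError::PoolTimedOut)
}

fn main() {
    // Connect-only loop: the released connection sits idle until the deadline.
    let mut pool = ToyPool { idle: VecDeque::new(), server_has_capacity: false };
    let stuck = acquire(&mut pool, 10, 3, false);
    println!("connect-only loop: {:?}, idle left unused: {}", stuck, pool.idle.len());

    // Contrast case: a loop that also rechecks the idle queue picks it up.
    let mut pool = ToyPool { idle: VecDeque::new(), server_has_capacity: false };
    let ok = acquire(&mut pool, 10, 3, true);
    println!("loop that rechecks idle queue: {:?}", ok);
}
```

In the first run the task times out even though a usable connection has been sitting in the idle queue since tick 3, which is the surprising PoolTimedOut described above.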