Description
- asyncpg version: 0.30.0
- PostgreSQL version: 16
- Do you use a PostgreSQL SaaS? If so, which? Can you reproduce the issue with a local PostgreSQL install?: CloudSQL with the Cloud SQL connector, but it is possible to reproduce it locally with a `direct_tls` setup.
- Python version: 3.12
- Platform: linux / macos
- Do you use pgbouncer?: no
- Did you install asyncpg with pip?: poetry
- If you built asyncpg locally, which version of Cython did you use?:
- Can the issue be reproduced under both asyncio and uvloop?: the reproducer uses asyncio, but I hit the problem "in prod" with uvicorn / uvloop
Hi, I reported the issue first to sqlalchemy (sqlalchemy/sqlalchemy#12099) but managed to reproduce it using only asyncpg.
From what I could observe, if a `direct_tls` connection is waiting for postgres to reply and the task in which this connection lives is cancelled, calling `connection.close()` in an exception handler fails with `unexpected connection_lost() call`, leaving the connection open. (A call to `connection.terminate()` after that doesn't close the connection either.) The problem only occurs with `direct_tls=True`.
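Condensed, the failing sequence looks roughly like this (a sketch only, using the same DSN as the full reproducer further down; I use `asyncio.wait_for` here as a stand-in for the task cancellation, so treat it as an illustration of the pattern rather than the script I actually ran):

```python
import asyncio
import asyncpg

async def cancel_then_close():
    # direct_tls connection, same DSN as the full reproducer below
    conn = await asyncpg.connect(
        'postgresql://postgres:postgres@localhost:5443/postgres', direct_tls=True)
    try:
        # the query gets cancelled while we are waiting for postgres to reply
        await asyncio.wait_for(conn.execute("select pg_sleep(10)"), timeout=1)
    except (asyncio.TimeoutError, asyncio.CancelledError):
        # with direct_tls=True this close raises "unexpected connection_lost() call"
        # and the server-side connection stays open
        await conn.close(timeout=2)

asyncio.run(cancel_then_close())
```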
The issue was first observed with sqlalchemy using a connection pool of asyncpg connections: SQLAlchemy thought the connections were closed and would then open new ones, which led to using up all the available slots on the pg side.
This script can reproduce the issue; however, it needs a postgres setup where `direct_tls` can be used. There is a docker-compose file at https://github.com/brittlesoft/repro-starlette-sa-conn-leak that can be used to get a working setup quickly.
```python
import asyncio
import asyncpg
import logging

logging.basicConfig(level=logging.DEBUG)


async def do(i):
    try:
        # connecting with direct_tls leads to `unexpected connection_lost() call` when calling conn.close()
        conn = await asyncpg.connect('postgresql://postgres:postgres@localhost:5443/postgres', direct_tls=True)
        # connecting with the default params works fine
        #conn = await asyncpg.connect('postgresql://postgres:postgres@localhost:5432/postgres')

        # lock and simulate work (or block if the lock is already taken) -- using select pg_sleep(10) would also work
        await conn.execute("select pg_advisory_lock(1234)")
        await asyncio.sleep(10)
    except BaseException as e:
        print(i, "got exc:", e, type(e))
        try:
            await conn.close(timeout=2)
        except BaseException as e:
            print(i, "close got exc: ", e)

        # NOTE: aborting the transport here seems to release the connection
        #conn._transport.abort()

        try:
            print(i, "calling terminate")
            conn.terminate()
        except BaseException as e:
            print(i, "terminate got exc: ", e)


async def main():
    ts = []
    for i in range(10):
        ts.append(asyncio.create_task(do(i)))

    async def timeouter():
        await asyncio.sleep(1)
        for t in ts:
            t.cancel()

    asyncio.create_task(timeouter())
    try:
        await asyncio.gather(*ts)
    except asyncio.CancelledError:
        print("cancelled")

    # sleep so we can observe the state of the connections
    await asyncio.sleep(30)


if __name__ == '__main__':
    asyncio.run(main())
```
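For clarity, the fallback hinted at by the commented-out `conn._transport.abort()` line above looks like this. `_transport` is private asyncpg state, so this is only a diagnostic sketch of what ends up releasing the server-side connection, not a proposed fix:

```python
async def close_or_abort(conn, timeout=2):
    # Attempt a graceful close first; when it fails with
    # "unexpected connection_lost() call", fall back to aborting the
    # underlying transport, which (in my testing) does release the
    # server-side connection. conn._transport is private API.
    try:
        await conn.close(timeout=timeout)
    except Exception as e:
        print("close failed:", e)
        transport = getattr(conn, "_transport", None)
        if transport is not None:
            transport.abort()
```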
Output of the reproducer:
```
DEBUG:asyncio:Using selector: KqueueSelector
0 got exc: <class 'asyncio.exceptions.CancelledError'>
1 got exc: <class 'asyncio.exceptions.CancelledError'>
2 got exc: <class 'asyncio.exceptions.CancelledError'>
3 got exc: <class 'asyncio.exceptions.CancelledError'>
4 got exc: <class 'asyncio.exceptions.CancelledError'>
5 got exc: <class 'asyncio.exceptions.CancelledError'>
6 got exc: <class 'asyncio.exceptions.CancelledError'>
7 got exc: <class 'asyncio.exceptions.CancelledError'>
8 got exc: <class 'asyncio.exceptions.CancelledError'>
9 got exc: <class 'asyncio.exceptions.CancelledError'>
1 close got exc: unexpected connection_lost() call
1 calling terminate
7 close got exc: unexpected connection_lost() call
7 calling terminate
0 close got exc: unexpected connection_lost() call
0 calling terminate
4 close got exc: unexpected connection_lost() call
4 calling terminate
5 close got exc: unexpected connection_lost() call
5 calling terminate
3 close got exc: unexpected connection_lost() call
3 calling terminate
9 close got exc: unexpected connection_lost() call
9 calling terminate
6 close got exc: unexpected connection_lost() call
6 calling terminate
8 close got exc: unexpected connection_lost() call
8 calling terminate
```
And in postgres we see this:
```
2024-11-15 13:29:02.164078+00 | 2024-11-15 13:29:02.164078+00 | 2024-11-15 13:29:02.164078+00 | Lock | advisory | active | | 1584 | | select pg_advisory_lock(1234) | client backend
2024-11-15 13:29:02.188851+00 | 2024-11-15 13:29:02.188851+00 | 2024-11-15 13:29:02.188852+00 | Lock | advisory | active | | 1584 | | select pg_advisory_lock(1234) | client backend
| 2024-11-15 13:29:02.163992+00 | 2024-11-15 13:29:03.04934+00 | Client | ClientRead | idle | | | | select pg_advisory_lock(1234) | client backend
2024-11-15 13:29:02.163992+00 | 2024-11-15 13:29:02.163992+00 | 2024-11-15 13:29:02.163994+00 | Lock | advisory | active | | 1584 | | select pg_advisory_lock(1234) | client backend
2024-11-15 13:29:02.189558+00 | 2024-11-15 13:29:02.189558+00 | 2024-11-15 13:29:02.189558+00 | Lock | advisory | active | | 1584 | | select pg_advisory_lock(1234) | client backend
2024-11-15 13:29:02.188884+00 | 2024-11-15 13:29:02.188884+00 | 2024-11-15 13:29:02.188885+00 | Lock | advisory | active | | 1584 | | select pg_advisory_lock(1234) | client backend
2024-11-15 13:29:02.188826+00 | 2024-11-15 13:29:02.188826+00 | 2024-11-15 13:29:02.188829+00 | Lock | advisory | active | | 1584 | | select pg_advisory_lock(1234) | client backend
2024-11-15 13:29:02.189478+00 | 2024-11-15 13:29:02.189478+00 | 2024-11-15 13:29:02.189479+00 | Lock | advisory | active | | 1584 | | select pg_advisory_lock(1234) | client backend
2024-11-15 13:29:02.19116+00 | 2024-11-15 13:29:02.19116+00 | 2024-11-15 13:29:02.191161+00 | Lock | advisory | active | | 1584 | | select pg_advisory_lock(1234) | client backend
```
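The rows above are what `pg_stat_activity` shows for the leaked backends. A check along these lines (the exact column list I selected isn't shown above, so this is approximate, and it assumes the plain listener on port 5432 from the commented-out connect line in the reproducer) is enough to watch them pile up:

```python
import asyncio
import asyncpg

# Approximate reconstruction of the pg_stat_activity check that produced the
# rows above; the column list is a guess based on the visible values.
LEAKED_BACKENDS = """
    select xact_start, query_start, state_change,
           wait_event_type, wait_event, state, query, backend_type
    from pg_stat_activity
    where query like 'select pg_advisory_lock%'
"""

async def show_leaked_backends():
    # plain (non direct_tls) connection, only used to inspect server state
    conn = await asyncpg.connect('postgresql://postgres:postgres@localhost:5432/postgres')
    try:
        for row in await conn.fetch(LEAKED_BACKENDS):
            print(dict(row))
    finally:
        await conn.close()

if __name__ == '__main__':
    asyncio.run(show_leaked_backends())
```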