Related: #150
Benchmark code on: https://github.com/Dev-iL/pgbench
**TL;DR:** `execute_many` issues one round-trip per row instead of pipelining, making bulk inserts ~100× slower than asyncpg.
## Summary

`Connection.execute_many(query, params)` performs dramatically worse than equivalent bulk-insert APIs in asyncpg and psycopg3. Benchmarking a 1 000-row `INSERT` batch shows psqlpy taking 2–5 seconds per call, versus asyncpg completing the same batch in ~35 ms. The behaviour is consistent with sending one prepared-statement execution per row, without pipelining or batching at the wire level.
## Environment
- psqlpy: 0.11.12
- Python: 3.14.3
- PostgreSQL: 14.20 (local Docker, sub-millisecond RTT)
## Reproduction

```python
import asyncio, time

import psqlpy

pool = psqlpy.ConnectionPool(
    host="127.0.0.1", port=5432, username="postgres", password="...",
    db_name="postgres", max_db_pool_size=2,
)

ROWS = [[10, 11, 10, "TESTTESTTEST", 10.333, 12341234, "123412341234"]] * 1000
QUERY = "INSERT INTO _test (a, b, c, d, e, f, g) VALUES ($1, $2, $3, $4, $5, $6, $7)"

async def main():
    async with pool.acquire() as conn:
        await conn.execute_batch(
            "CREATE TABLE IF NOT EXISTS _test"
            "(a int, b int, c int, d text, e float, f int, g text);"
        )
        t0 = time.perf_counter()
        await conn.execute_many(QUERY, ROWS)
        print(f"execute_many 1000 rows: {time.perf_counter() - t0:.3f}s")

asyncio.run(main())
pool.close()
```
## Benchmark results (10 concurrent connections, 10 s, geometric mean)

| Driver | Batch QPS | Batch mean latency |
| --- | --- | --- |
| asyncpg (`executemany`) | 279 | 35 ms |
| psycopg3-async (`executemany`) | 9 | 1 087 ms |
| psqlpy (`execute_many`) | 3 | ~2 000 ms |
At 3 QPS for a 1 000-row batch, the effective single-row throughput is ~3 000 rows/sec. asyncpg achieves ~279 000 rows/sec for the same workload — roughly a 93× difference.
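The throughput figures above follow directly from the table; the arithmetic can be made explicit (illustrative only, using the rounded QPS numbers from the benchmark):

```python
# Effective single-row throughput implied by batch QPS × batch size.
BATCH_SIZE = 1000

psqlpy_rows_per_sec = 3 * BATCH_SIZE      # 3 batch QPS  -> 3 000 rows/sec
asyncpg_rows_per_sec = 279 * BATCH_SIZE   # 279 batch QPS -> 279 000 rows/sec

speedup = asyncpg_rows_per_sec / psqlpy_rows_per_sec
print(speedup)  # -> 93.0, the ~93× difference quoted above
```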
## Expected behaviour

`execute_many` should pipeline prepared-statement executions using PostgreSQL's extended query protocol, or use a `COPY`-based path for bulk inserts. Both approaches avoid per-row round-trips:

- Extended query pipelining: send all `Bind`/`Execute` messages before waiting for `ReadyForQuery`, as asyncpg does internally.
- `COPY` path: serialize rows and stream via `COPY FROM STDIN`, which is the fastest bulk-load mechanism PostgreSQL provides.
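To illustrate the `COPY` path, rows can be serialized into PostgreSQL's text `COPY` format and streamed in a single operation. This is a minimal sketch of the serialization step only; a real driver must also escape tabs, newlines, and backslashes, and encode SQL `NULL` as `\N`:

```python
def to_copy_text(rows):
    """Serialize rows into PostgreSQL text COPY format: one line per row,
    tab-separated columns, newline-terminated.

    Sketch only: assumes values contain no tabs/newlines/backslashes and no NULLs.
    """
    return "".join("\t".join(str(v) for v in row) + "\n" for row in rows)

rows = [[10, 11, "TEST"], [20, 21, "TEST2"]]
payload = to_copy_text(rows)
# `payload` would then be streamed via `COPY _test FROM STDIN` as one bulk load,
# independent of row count, instead of N per-row statements.
print(payload)
```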
## Actual behaviour

Each row appears to require its own network round-trip (`Parse` → `Bind` → `Execute` → `ReadyForQuery` cycle), so total latency scales linearly with row count and is dominated by RTT × N rather than by throughput.
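The RTT × N scaling can be sanity-checked with a toy latency model. The per-row cost below (2 ms effective round-trip, 0.05 ms server work) is an illustrative assumption chosen to match the observed ~2 s batch time, not a measurement:

```python
def per_row_roundtrips_ms(rows: int, rtt_ms: float, server_ms: float = 0.05) -> float:
    """Total latency when every row pays a full round-trip (observed behaviour)."""
    return rows * (rtt_ms + server_ms)

def pipelined_ms(rows: int, rtt_ms: float, server_ms: float = 0.05) -> float:
    """Total latency when all Bind/Execute messages share one round-trip."""
    return rtt_ms + rows * server_ms

# With an assumed 2 ms effective cost per round-trip, 1 000 rows land in the
# observed ~2 s range, while a pipelined batch stays in the tens of milliseconds.
print(per_row_roundtrips_ms(1000, 2.0))  # -> 2050.0 ms
print(pipelined_ms(1000, 2.0))           # -> 52.0 ms
```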
## Notes

This issue is compounded by the absence of a `COPY FROM STDIN` API (see #166), which would otherwise provide a fast alternative for bulk ingestion.