-
Notifications
You must be signed in to change notification settings - Fork 66
Description
What happens?
filter_projection_pushdown doesn't actually push down the filter to postgres. It still ends up scanning the table taking 10s+ when filter should hit the index and return in milliseconds.
To Reproduce
Create a table with roughly 10M rows in postgres.
ATTACH 'postgres://u:pass@....' AS pg_reader (TYPE POSTGRES, READ_ONLY)
explain analyze select count(*) from pg_reader.main.order_lines where updated_at = '2025-06-09 14:35:00.742342+00';
┌────────────────────────────────────────────────┐
│┌──────────────────────────────────────────────┐│
││ Total Time: 9.27s ││
│└──────────────────────────────────────────────┘│
└────────────────────────────────────────────────┘
┌───────────────────────────┐
│ QUERY │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ EXPLAIN_ANALYZE │
│ ──────────────────── │
│ 0 Rows │
│ (0.00s) │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ UNGROUPED_AGGREGATE │
│ ──────────────────── │
│ Aggregates: │
│ count_star() │
│ │
│ 1 Rows │
│ (0.00s) │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ TABLE_SCAN │
│ ──────────────────── │
│ Table: order_lines │
│ │
│ Projections: │
│ updated_at │
│ │
│ Filters: │
│ updated_at='2025-06-09 14 │
│ :35:00.742342+00': │
│ :TIMESTAMP WITH TIME ZONE │
│ │
│ 2 Rows │
│ (53.29s) │
└───────────────────────────┘
Even though query plan shows a filter, the filter doesn't seem to get applied at scan time. Postgres still ends up scanning all rows of table. updated_at
column has an index. It ends up taking ~10s.
For our larger tables, with 10M+ rows, duckdb extension places huge load on the database.
Same query via postgres_query
runs in milliseconds
explain analyze select count(*) from postgres_query('pg_reader', 'select updated_at from main.order_lines where updated_at = ''2025-06-09 14:35:00.742342+00''');
┌────────────────────────────────────────────────┐
│┌──────────────────────────────────────────────┐│
││ Total Time: 0.525s ││
│└──────────────────────────────────────────────┘│
└────────────────────────────────────────────────┘
┌───────────────────────────┐
│ QUERY │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ EXPLAIN_ANALYZE │
│ ──────────────────── │
│ 0 Rows │
│ (0.00s) │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ UNGROUPED_AGGREGATE │
│ ──────────────────── │
│ Aggregates: │
│ count_star() │
│ │
│ 1 Rows │
│ (0.00s) │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ TABLE_SCAN │
│ ──────────────────── │
│ Function: │
│ POSTGRES_QUERY │
│ │
│ Projections: │
│ updated_at │
│ │
│ 2 Rows │
│ (0.16s) │
└───────────────────────────┘
use_ctid_scan should be disabled if any filters are specified. So the query is equivalent to postgres_query, but much faster.
pg_use_ctid_scan
SET pg_use_ctid_scan = true; -- default
D .timer on
D select count(*) from pg_reader.main.order_lines where updated_at = '2025-06-09 14:35:00.742342+00'::timestamptz;
┌──────────────┐
│ count_star() │
│ int64 │
├──────────────┤
│ 2 │
└──────────────┘
Run Time (s): real 10.491 user 0.108536 sys 0.083701
SET pg_use_ctid_scan = false; -- disable ctid_scan
D select count(*) from pg_reader.main.order_lines where updated_at = '2025-06-09 14:35:00.742342+00'::timestamptz;
┌──────────────┐
│ count_star() │
│ int64 │
├──────────────┤
│ 2 │
└──────────────┘
Run Time (s): real 0.246 user 0.001300 sys 0.001340
50x faster when SET pg_use_ctid_scan = false
.
OS:
MacOS 14.6.1
PostgreSQL Version:
15
DuckDB Version:
1.3.2
DuckDB Client:
cli
Full Name:
Noj V
Affiliation:
Recurrency
Have you tried this on the latest main
branch?
- I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- I agree