Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Querying CAGG scan the whole chunk #4861

Open
d4b27a opened this issue Oct 20, 2022 · 6 comments
Open

[Bug]: Querying CAGG scan the whole chunk #4861

d4b27a opened this issue Oct 20, 2022 · 6 comments

Comments

@d4b27a
Copy link

d4b27a commented Oct 20, 2022

What type of bug is this?

Performance issue

What subsystems and features are affected?

Query executor, Query planner

What happened?

I am making a simple query, there is an index and the timescaledb nonetheless seems to scan the whole hypertable.

TimescaleDB version affected

2.8.0

PostgreSQL version used

13.8

What operating system did you use?

Debian

What installation method did you use?

Docker

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

Limit  (cost=512.15..512.18 rows=10 width=52) (actual time=31.596..31.603 rows=10 loops=1)
   ->  Sort  (cost=512.15..513.23 rows=430 width=52) (actual time=31.596..31.601 rows=10 loops=1)
         Sort Key: _materialized_hypertable_73.xtime DESC
         Sort Method: top-N heapsort  Memory: 26kB
         ->  Append  (cost=2.04..502.86 rows=430 width=52) (actual time=0.752..23.105 rows=105121 loops=1)
               ->  Custom Scan (ChunkAppend) on _materialized_hypertable_73  (cost=2.04..392.81 rows=376 width=56) (actual time=0.752..19.295 rows=105121 loops=1)
                     Chunks excluded during startup: 0
                     ->  Bitmap Heap Scan on _hyper_73_912_chunk  (cost=2.04..53.99 rows=51 width=56) (actual time=0.751..2.109 rows=14264 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=315
                           ->  Bitmap Index Scan on _hyper_73_912_chunk__materialized_hypertable_73_location_xtime_  (cost=0.00..2.03 rows=51 width=0) (actual time=0.724..0.724 rows=14264 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                     ->  Bitmap Heap Scan on _hyper_73_913_chunk  (cost=2.26..74.80 rows=72 width=56) (actual time=1.064..2.918 rows=20160 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=445
                           ->  Bitmap Index Scan on _hyper_73_913_chunk__materialized_hypertable_73_location_xtime_  (cost=0.00..2.24 rows=72 width=0) (actual time=1.031..1.032 rows=20160 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                     ->  Bitmap Heap Scan on _hyper_73_914_chunk  (cost=2.26..74.80 rows=72 width=56) (actual time=1.198..3.047 rows=20160 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=445
                           ->  Bitmap Index Scan on _hyper_73_914_chunk__materialized_hypertable_73_location_xtime_  (cost=0.00..2.24 rows=72 width=0) (actual time=1.170..1.170 rows=20160 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                     ->  Bitmap Heap Scan on _hyper_73_915_chunk  (cost=2.26..74.80 rows=72 width=56) (actual time=1.122..2.897 rows=20160 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=445
                           ->  Bitmap Index Scan on _hyper_73_915_chunk__materialized_hypertable_73_location_xtime_  (cost=0.00..2.24 rows=72 width=0) (actual time=1.096..1.096 rows=20160 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                     ->  Bitmap Heap Scan on _hyper_73_916_chunk  (cost=2.26..74.80 rows=72 width=56) (actual time=1.037..2.828 rows=20160 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=445
                           ->  Bitmap Index Scan on _hyper_73_916_chunk__materialized_hypertable_73_location_xtime_  (cost=0.00..2.24 rows=72 width=0) (actual time=1.012..1.012 rows=20160 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                     ->  Bitmap Heap Scan on _hyper_73_917_chunk  (cost=1.77..39.63 rows=37 width=56) (actual time=0.535..1.485 rows=10217 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=226
                           ->  Bitmap Index Scan on _hyper_73_917_chunk__materialized_hypertable_73_location_xtime_  (cost=0.00..1.76 rows=37 width=0) (actual time=0.518..0.518 rows=10217 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
               ->  HashAggregate  (cost=102.92..103.59 rows=54 width=26) (actual time=0.020..0.021 rows=0 loops=1)
                     Group Key: time_bucket('00:05:00'::interval, temperature."time"), temperature.location
                     Batches: 1  Memory Usage: 24kB
                     ->  Custom Scan (ChunkAppend) on temperature  (cost=0.28..102.38 rows=54 width=18) (actual time=0.017..0.017 rows=0 loops=1)
                           Chunks excluded during startup: 53
                           ->  Index Scan using _hyper_72_911_chunk_temperature_location_time_idx on _hyper_72_911_chunk  (cost=0.28..1.89 rows=1 width=18) (actual time=0.016..0.016 rows=0 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND ("time" >= COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(73)), '-infinity'::timestamp without time zone)))
 Planning Time: 11.289 ms
 Execution Time: 32.144 ms
(46 rows)

How can we reproduce the bug?

CREATE TABLE temperature (
        time TIMESTAMP NOT NULL,
        location TEXT NOT NULL,
        temperature DOUBLE PRECISION NOT NULL
);

SELECT create_hypertable('temperature', 'time');

CREATE INDEX temperature_location_time_idx ON temperature(location, time);

INSERT INTO temperature SELECT time, 'X' as location, random() * 100 AS temperature FROM generate_series(now() - interval '1 year', now(), interval '1 minute') as time;
INSERT INTO temperature SELECT time, 'Y' as location, random() * 100 AS temperature FROM generate_series(now() - interval '1 year', now(), interval '1 minute') as time;
INSERT INTO temperature SELECT time, 'Z' as location, random() * 100 AS temperature FROM generate_series(now() - interval '1 year', now(), interval '1 minute') as time;


CREATE MATERIALIZED VIEW temperature_5min
        WITH (timescaledb.continuous)
        AS SELECT time_bucket('5 min', time) as xtime, location, min(temperature), max(temperature)
        FROM temperature
        GROUP BY xtime, location;

EXPLAIN ANALYZE SELECT * FROM temperature_5min WHERE location = 'Y' ORDER BY xtime DESC LIMIT 10;
@d4b27a d4b27a added the bug label Oct 20, 2022
@fabriziomello
Copy link
Contributor

fabriziomello commented Oct 21, 2022

@d4b27a there's nothing wrong with your example. Actually this is an expected behavior because in your query against the temperature_5min Continuous Aggregate you're not filtering by the time bucket column (xtime). See how the plan changes if you use it in your query:

fabrizio=# EXPLAIN ANALYZE SELECT * FROM temperature_5min WHERE location = 'Y' AND xtime > '2022-10-20' ORDER BY xtime DESC LIMIT 10;
                                                                                              QUERY PLAN                                                                                              
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..20.86 rows=10 width=26) (actual time=0.088..0.122 rows=10 loops=1)
   ->  Custom Scan (ChunkAppend) on _materialized_hypertable_2  (cost=0.29..827.37 rows=402 width=26) (actual time=0.086..0.116 rows=10 loops=1)
         Order: _materialized_hypertable_2.xtime DESC
         ->  Index Scan using _hyper_2_110_chunk__materialized_hypertable_2_location_xtime_id on _hyper_2_110_chunk  (cost=0.29..827.37 rows=402 width=26) (actual time=0.085..0.111 rows=10 loops=1)
               Index Cond: ((location = 'Y'::text) AND (xtime > '2022-10-20 00:00:00'::timestamp without time zone))
 Planning Time: 1.133 ms
 Execution Time: 0.187 ms
(7 rows)

@d4b27a
Copy link
Author

d4b27a commented Oct 21, 2022

What confuses me is that querying raw hypertable produces much better results (queries only one chunk, produces only 10 rows).

Also, there is "ORDER BY xtime DESC LIMIT 10", which, I hoped, can be used to figure out that it's not necessary go through all 'Y' locations.

@fabriziomello
Copy link
Contributor

What confuses me is that querying raw hypertable produces much better results (queries only one chunk, produces only 10 rows).

It does not scan only one chunk, see:

fabrizio=# EXPLAIN (ANALYZE) SELECT * FROM temperature WHERE location = 'Y' ORDER BY time DESC LIMIT 10;~
                                                                                          QUERY PLAN                                                                                           
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.28..1.01 rows=10 width=18) (actual time=0.039..0.048 rows=10 loops=1)
   ->  Custom Scan (ChunkAppend) on temperature  (cost=0.28..170.81 rows=2348 width=18) (actual time=0.038..0.045 rows=10 loops=1)
         Order: temperature."time" DESC
         ->  Index Scan Backward using _hyper_1_53_chunk_temperature_location_time_idx on _hyper_1_53_chunk  (cost=0.28..170.81 rows=2348 width=18) (actual time=0.037..0.041 rows=10 loops=1)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_52_chunk_temperature_location_time_idx on _hyper_1_52_chunk  (cost=0.29..712.86 rows=10090 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_51_chunk_temperature_location_time_idx on _hyper_1_51_chunk  (cost=0.29..712.50 rows=10069 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_50_chunk_temperature_location_time_idx on _hyper_1_50_chunk  (cost=0.29..712.60 rows=10075 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_49_chunk_temperature_location_time_idx on _hyper_1_49_chunk  (cost=0.29..712.81 rows=10087 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_48_chunk_temperature_location_time_idx on _hyper_1_48_chunk  (cost=0.29..712.57 rows=10073 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_47_chunk_temperature_location_time_idx on _hyper_1_47_chunk  (cost=0.29..712.86 rows=10090 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_46_chunk_temperature_location_time_idx on _hyper_1_46_chunk  (cost=0.29..712.67 rows=10079 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_45_chunk_temperature_location_time_idx on _hyper_1_45_chunk  (cost=0.29..712.72 rows=10082 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_44_chunk_temperature_location_time_idx on _hyper_1_44_chunk  (cost=0.29..712.90 rows=10092 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_43_chunk_temperature_location_time_idx on _hyper_1_43_chunk  (cost=0.29..712.76 rows=10084 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_42_chunk_temperature_location_time_idx on _hyper_1_42_chunk  (cost=0.29..712.71 rows=10081 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_41_chunk_temperature_location_time_idx on _hyper_1_41_chunk  (cost=0.29..712.72 rows=10082 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_40_chunk_temperature_location_time_idx on _hyper_1_40_chunk  (cost=0.29..712.85 rows=10089 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_39_chunk_temperature_location_time_idx on _hyper_1_39_chunk  (cost=0.29..712.62 rows=10076 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_38_chunk_temperature_location_time_idx on _hyper_1_38_chunk  (cost=0.29..712.81 rows=10087 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_37_chunk_temperature_location_time_idx on _hyper_1_37_chunk  (cost=0.29..712.55 rows=10072 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_36_chunk_temperature_location_time_idx on _hyper_1_36_chunk  (cost=0.29..712.71 rows=10081 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_35_chunk_temperature_location_time_idx on _hyper_1_35_chunk  (cost=0.29..712.72 rows=10082 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_34_chunk_temperature_location_time_idx on _hyper_1_34_chunk  (cost=0.29..712.62 rows=10076 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_33_chunk_temperature_location_time_idx on _hyper_1_33_chunk  (cost=0.29..712.58 rows=10074 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_32_chunk_temperature_location_time_idx on _hyper_1_32_chunk  (cost=0.29..713.00 rows=10098 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_31_chunk_temperature_location_time_idx on _hyper_1_31_chunk  (cost=0.29..712.81 rows=10087 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_30_chunk_temperature_location_time_idx on _hyper_1_30_chunk  (cost=0.29..712.71 rows=10081 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_29_chunk_temperature_location_time_idx on _hyper_1_29_chunk  (cost=0.29..712.86 rows=10090 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_28_chunk_temperature_location_time_idx on _hyper_1_28_chunk  (cost=0.29..712.48 rows=10068 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_27_chunk_temperature_location_time_idx on _hyper_1_27_chunk  (cost=0.29..712.60 rows=10075 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_26_chunk_temperature_location_time_idx on _hyper_1_26_chunk  (cost=0.29..712.90 rows=10092 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_25_chunk_temperature_location_time_idx on _hyper_1_25_chunk  (cost=0.29..712.76 rows=10084 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_24_chunk_temperature_location_time_idx on _hyper_1_24_chunk  (cost=0.29..712.71 rows=10081 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_23_chunk_temperature_location_time_idx on _hyper_1_23_chunk  (cost=0.29..712.60 rows=10075 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_22_chunk_temperature_location_time_idx on _hyper_1_22_chunk  (cost=0.29..712.53 rows=10071 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_21_chunk_temperature_location_time_idx on _hyper_1_21_chunk  (cost=0.29..712.78 rows=10085 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_20_chunk_temperature_location_time_idx on _hyper_1_20_chunk  (cost=0.29..712.71 rows=10081 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_19_chunk_temperature_location_time_idx on _hyper_1_19_chunk  (cost=0.29..712.62 rows=10076 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_18_chunk_temperature_location_time_idx on _hyper_1_18_chunk  (cost=0.29..712.60 rows=10075 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_17_chunk_temperature_location_time_idx on _hyper_1_17_chunk  (cost=0.29..712.76 rows=10084 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_16_chunk_temperature_location_time_idx on _hyper_1_16_chunk  (cost=0.29..712.55 rows=10072 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_15_chunk_temperature_location_time_idx on _hyper_1_15_chunk  (cost=0.29..712.71 rows=10081 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_14_chunk_temperature_location_time_idx on _hyper_1_14_chunk  (cost=0.29..712.79 rows=10086 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_13_chunk_temperature_location_time_idx on _hyper_1_13_chunk  (cost=0.29..712.93 rows=10094 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_12_chunk_temperature_location_time_idx on _hyper_1_12_chunk  (cost=0.29..712.57 rows=10073 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_11_chunk_temperature_location_time_idx on _hyper_1_11_chunk  (cost=0.29..712.85 rows=10089 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_10_chunk_temperature_location_time_idx on _hyper_1_10_chunk  (cost=0.29..712.55 rows=10072 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_9_chunk_temperature_location_time_idx on _hyper_1_9_chunk  (cost=0.29..712.71 rows=10081 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_8_chunk_temperature_location_time_idx on _hyper_1_8_chunk  (cost=0.29..712.62 rows=10076 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_7_chunk_temperature_location_time_idx on _hyper_1_7_chunk  (cost=0.29..712.78 rows=10085 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_6_chunk_temperature_location_time_idx on _hyper_1_6_chunk  (cost=0.29..712.53 rows=10071 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_5_chunk_temperature_location_time_idx on _hyper_1_5_chunk  (cost=0.29..712.67 rows=10079 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_4_chunk_temperature_location_time_idx on _hyper_1_4_chunk  (cost=0.29..713.12 rows=10105 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_3_chunk_temperature_location_time_idx on _hyper_1_3_chunk  (cost=0.29..712.50 rows=10069 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_2_chunk_temperature_location_time_idx on _hyper_1_2_chunk  (cost=0.29..712.65 rows=10078 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_1_chunk_temperature_location_time_idx on _hyper_1_1_chunk  (cost=0.29..647.69 rows=9173 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
 Planning Time: 5.331 ms
 Execution Time: 0.432 ms
(111 rows)

And the plan querying the hypertable will be different than querying the continuous aggregate because it have aggregation and using the realtime cagg you have two queries, one against the materialization hypertable and another (using union) against the original hypertable. Both are controlled by the watermark.

Those queries without using the time column WILL scan all chunks because the planner don't know how to exclude unnecessary chunks without any related predicate.

@d4b27a
Copy link
Author

d4b27a commented Oct 21, 2022

None of those scans except the first were executed.

@akuzm
Copy link
Member

akuzm commented Jan 25, 2023

For some reason we are not able to use the Ordered Append optimization on the cagg table, even though the chunks have the necessary indexes, and even when the sort is disabled:

test=# \d _timescaledb_internal._hyper_20_920_chunk
            Table "_timescaledb_internal._hyper_20_920_chunk"
  Column  │            Type             │ Collation │ Nullable │ Default 
──────────┼─────────────────────────────┼───────────┼──────────┼─────────
 xtime    │ timestamp without time zone │           │ not null │ 
 location │ text                        │           │          │ 
 min      │ double precision            │           │          │ 
 max      │ double precision            │           │          │ 
Indexes:
    "_hyper_20_920_chunk__materialized_hypertable_20_location_xtime_" btree (location, xtime DESC)
    "_hyper_20_920_chunk__materialized_hypertable_20_xtime_idx" btree (xtime DESC)
Check constraints:
    "constraint_584" CHECK (xtime >= '2021-12-09 00:00:00'::timestamp without time zone AND xtime < '2022-02-17 00:00:00'::timestamp without time zone)
Inherits: _timescaledb_internal._materialized_hypertable_20


test=# set enable_sort to off;
test=# EXPLAIN ANALYZE SELECT * FROM temperature_5min where location = 'Y' ORDER BY xtime DESC LIMIT 10;
                                                                                                               QUERY PLAN                                                                                                               
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Limit  (cost=10000009301.97..10000009302.00 rows=10 width=26) (actual time=21.214..21.218 rows=10 loops=1)
   ->  Sort  (cost=10000009301.97..10000009564.91 rows=105174 width=26) (actual time=21.213..21.217 rows=10 loops=1)
         Sort Key: _materialized_hypertable_20.xtime DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Append  (cost=0.29..7029.20 rows=105174 width=26) (actual time=0.037..17.628 rows=105121 loops=1)
               ->  Custom Scan (ChunkAppend) on _materialized_hypertable_20  (cost=0.29..5349.90 rows=105121 width=26) (actual time=0.036..13.935 rows=105121 loops=1)
                     Chunks excluded during startup: 0
                     ->  Index Scan using _hyper_20_920_chunk__materialized_hypertable_20_location_xtime_ on _hyper_20_920_chunk  (cost=0.29..329.06 rows=6440 width=26) (actual time=0.036..1.216 rows=6440 loops=1)
                           Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(20)), '-infinity'::timestamp without time zone)))
                     ->  Index Scan using _hyper_20_921_chunk__materialized_hypertable_20_location_xtime_ on _hyper_20_921_chunk  (cost=0.42..1025.84 rows=20160 width=26) (actual time=0.018..2.084 rows=20160 loops=1)
                           Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_internal.to_timestamp_without_timezone(_timescaledb_internal.cagg_watermark(20)), '-infinity'::timestamp without time zone)))
....

The same optimization is correctly applied to the source hypertable, which you can see in the above example by Fabrizio:

fabrizio=# EXPLAIN (ANALYZE) SELECT * FROM temperature WHERE location = 'Y' ORDER BY time DESC LIMIT 10;~
                                                                                          QUERY PLAN                                                                                           
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.28..1.01 rows=10 width=18) (actual time=0.039..0.048 rows=10 loops=1)
   ->  Custom Scan (ChunkAppend) on temperature  (cost=0.28..170.81 rows=2348 width=18) (actual time=0.038..0.045 rows=10 loops=1)
         Order: temperature."time" DESC
         ->  Index Scan Backward using _hyper_1_53_chunk_temperature_location_time_idx on _hyper_1_53_chunk  (cost=0.28..170.81 rows=2348 width=18) (actual time=0.037..0.041 rows=10 loops=1)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan Backward using _hyper_1_52_chunk_temperature_location_time_idx on _hyper_1_52_chunk  (cost=0.29..712.86 rows=10090 width=18) (never executed)
               Index Cond: (location = 'Y'::text)
               ...

This looks like some kind of bug to me. @jnidzwetzki I see you removed the "bug" label, did you find something else not described here?

@fabriziomello
Copy link
Contributor

@d4b27a in the upcoming 2.14.0 version we introduced the ability to constify the cagg_watermark function call making the planner much faster and by doing that we now give a change to Postgres use parallel workers.

Now using your example the query is faster than before:

fabrizio=# EXPLAIN ANALYZE SELECT * FROM temperature_5min WHERE location = 'Y' ORDER BY xtime DESC LIMIT 10;
                                                                                               QUERY PLAN                                                                                               
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=7495.08..7496.25 rows=10 width=26) (actual time=36.479..38.948 rows=10 loops=1)
   ->  Gather Merge  (cost=7495.08..17722.09 rows=87654 width=26) (actual time=36.477..38.945 rows=10 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=6495.06..6604.62 rows=43827 width=26) (actual time=32.968..32.972 rows=10 loops=3)
               Sort Key: (time_bucket('00:05:00'::interval, _hyper_1_53_chunk."time")) DESC
               Sort Method: top-N heapsort  Memory: 26kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 26kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 26kB
               ->  Parallel Append  (cost=0.00..5547.97 rows=43827 width=26) (actual time=0.017..25.621 rows=35040 loops=3)
                     ->  GroupAggregate  (cost=0.29..6.09 rows=1 width=26) (actual time=0.024..0.024 rows=0 loops=1)
                           Group Key: (time_bucket('00:05:00'::interval, _hyper_1_53_chunk."time"))
                           ->  Result  (cost=0.29..6.07 rows=1 width=18) (actual time=0.023..0.023 rows=0 loops=1)
                                 ->  Index Scan using _hyper_1_53_chunk_temperature_location_time_idx on _hyper_1_53_chunk  (cost=0.29..6.06 rows=1 width=18) (actual time=0.021..0.021 rows=0 loops=1)
                                       Index Cond: ((location = 'Y'::text) AND ("time" >= '2024-02-05 15:10:00'::timestamp without time zone))
                     ->  Parallel Append  (cost=0.00..5322.74 rows=43826 width=26) (actual time=0.008..23.750 rows=35040 loops=3)
                           ->  Parallel Seq Scan on _hyper_2_55_chunk  (cost=0.00..978.65 rows=11918 width=26) (actual time=0.006..12.756 rows=20160 loops=1)
                                 Filter: ((xtime < '2024-02-05 15:10:00'::timestamp without time zone) AND (location = 'Y'::text))
                                 Rows Removed by Filter: 40320
                           ->  Parallel Seq Scan on _hyper_2_56_chunk  (cost=0.00..978.65 rows=11975 width=26) (actual time=0.008..12.822 rows=20160 loops=1)
                                 Filter: ((xtime < '2024-02-05 15:10:00'::timestamp without time zone) AND (location = 'Y'::text))
                                 Rows Removed by Filter: 40320
                           ->  Parallel Seq Scan on _hyper_2_57_chunk  (cost=0.00..978.65 rows=11790 width=26) (actual time=0.006..3.975 rows=6720 loops=3)
                                 Filter: ((xtime < '2024-02-05 15:10:00'::timestamp without time zone) AND (location = 'Y'::text))
                                 Rows Removed by Filter: 13440
                           ->  Parallel Seq Scan on _hyper_2_58_chunk  (cost=0.00..978.65 rows=11892 width=26) (actual time=0.006..6.146 rows=10080 loops=2)
                                 Filter: ((xtime < '2024-02-05 15:10:00'::timestamp without time zone) AND (location = 'Y'::text))
                                 Rows Removed by Filter: 20160
                           ->  Parallel Seq Scan on _hyper_2_54_chunk  (cost=0.00..927.99 rows=11143 width=26) (actual time=0.004..12.747 rows=19115 loops=1)
                                 Filter: ((xtime < '2024-02-05 15:10:00'::timestamp without time zone) AND (location = 'Y'::text))
                                 Rows Removed by Filter: 38230
                           ->  Parallel Seq Scan on _hyper_2_59_chunk  (cost=0.00..261.04 rows=3156 width=26) (actual time=0.010..3.591 rows=5366 loops=1)
                                 Filter: ((xtime < '2024-02-05 15:10:00'::timestamp without time zone) AND (location = 'Y'::text))
                                 Rows Removed by Filter: 10732
 Planning Time: 5.534 ms
 Execution Time: 39.118 ms
(36 rows)

Disabling the constify operation in the planner we can check the difference:

fabrizio=# SET timescaledb.enable_cagg_watermark_constify TO off;
SET
fabrizio=# EXPLAIN ANALYZE SELECT * FROM temperature_5min WHERE location = 'Y' ORDER BY xtime DESC LIMIT 10;
                                                                                                                  QUERY PLAN                                                                                                                   
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=10752.48..10752.50 rows=10 width=26) (actual time=71.569..71.577 rows=10 loops=1)
   ->  Sort  (cost=10752.48..11015.58 rows=105240 width=26) (actual time=71.568..71.574 rows=10 loops=1)
         Sort Key: _materialized_hypertable_2.xtime DESC
         Sort Method: top-N heapsort  Memory: 26kB
         ->  Append  (cost=610.59..8478.28 rows=105240 width=26) (actual time=3.111..51.405 rows=105121 loops=1)
               ->  Custom Scan (ChunkAppend) on _materialized_hypertable_2  (cost=610.59..7629.05 rows=105187 width=26) (actual time=3.110..46.291 rows=105121 loops=1)
                     Chunks excluded during startup: 0
                     ->  Bitmap Heap Scan on _hyper_2_54_chunk  (cost=610.59..1411.45 rows=18943 width=26) (actual time=3.110..6.594 rows=19115 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=422
                           ->  Bitmap Index Scan on _hyper_2_54_chunk__materialized_hypertable_2_location_xtime_idx  (cost=0.00..605.85 rows=18943 width=0) (actual time=3.038..3.038 rows=19115 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                     ->  Bitmap Heap Scan on _hyper_2_55_chunk  (cost=603.97..1454.19 rows=20261 width=26) (actual time=3.338..7.094 rows=20160 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=445
                           ->  Bitmap Index Scan on _hyper_2_55_chunk__materialized_hypertable_2_location_xtime_idx  (cost=0.00..598.90 rows=20261 width=0) (actual time=3.262..3.263 rows=20160 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                     ->  Bitmap Heap Scan on _hyper_2_56_chunk  (cost=617.09..1469.25 rows=20358 width=26) (actual time=3.517..7.540 rows=20160 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=445
                           ->  Bitmap Index Scan on _hyper_2_56_chunk__materialized_hypertable_2_location_xtime_idx  (cost=0.00..612.00 rows=20358 width=0) (actual time=3.432..3.432 rows=20160 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                     ->  Bitmap Heap Scan on _hyper_2_57_chunk  (cost=589.74..1435.60 rows=20043 width=26) (actual time=3.254..7.946 rows=20160 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=445
                           ->  Bitmap Index Scan on _hyper_2_57_chunk__materialized_hypertable_2_location_xtime_idx  (cost=0.00..584.72 rows=20043 width=0) (actual time=3.182..3.182 rows=20160 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                     ->  Bitmap Heap Scan on _hyper_2_58_chunk  (cost=619.63..1468.95 rows=20216 width=26) (actual time=4.225..8.655 rows=20160 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=445
                           ->  Bitmap Index Scan on _hyper_2_58_chunk__materialized_hypertable_2_location_xtime_idx  (cost=0.00..614.58 rows=20216 width=0) (actual time=4.147..4.148 rows=20160 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                     ->  Bitmap Heap Scan on _hyper_2_59_chunk  (cost=163.29..389.61 rows=5366 width=26) (actual time=1.416..2.987 rows=5366 loops=1)
                           Recheck Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
                           Heap Blocks: exact=119
                           ->  Bitmap Index Scan on _hyper_2_59_chunk__materialized_hypertable_2_location_xtime_idx  (cost=0.00..161.95 rows=5366 width=0) (actual time=1.383..1.383 rows=5366 loops=1)
                                 Index Cond: ((location = 'Y'::text) AND (xtime < COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
               ->  HashAggregate  (cost=322.37..323.03 rows=53 width=26) (actual time=0.031..0.032 rows=0 loops=1)
                     Group Key: time_bucket('00:05:00'::interval, temperature."time")
                     Batches: 1  Memory Usage: 24kB
                     ->  Result  (cost=0.29..321.97 rows=53 width=18) (actual time=0.029..0.029 rows=0 loops=1)
                           ->  Custom Scan (ChunkAppend) on temperature  (cost=0.29..321.31 rows=53 width=18) (actual time=0.028..0.029 rows=0 loops=1)
                                 Chunks excluded during startup: 52
                                 ->  Index Scan using _hyper_1_53_chunk_temperature_location_time_idx on _hyper_1_53_chunk  (cost=0.29..6.06 rows=1 width=18) (actual time=0.028..0.028 rows=0 loops=1)
                                       Index Cond: ((location = 'Y'::text) AND ("time" >= COALESCE(_timescaledb_functions.to_timestamp_without_timezone(_timescaledb_functions.cagg_watermark(2)), '-infinity'::timestamp without time zone)))
 Planning Time: 7.188 ms
 Execution Time: 72.725 ms
(47 rows)

And the Ordered Append optimization mentioned by @akuzm works with non-realtime CAggs. Unfortunately it don't work for realtime:

fabrizio=# ALTER MATERIALIZED VIEW temperature_5min SET (timescaledb.materialized_only=true);
ALTER MATERIALIZED VIEW
fabrizio=# EXPLAIN ANALYZE SELECT * FROM temperature_5min WHERE location = 'Y' ORDER BY xtime DESC LIMIT 10;
                                                                                              QUERY PLAN                                                                                              
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..1.50 rows=10 width=26) (actual time=0.024..0.038 rows=10 loops=1)
   ->  Custom Scan (ChunkAppend) on _materialized_hypertable_2  (cost=0.29..653.28 rows=5366 width=26) (actual time=0.023..0.036 rows=10 loops=1)
         Order: _materialized_hypertable_2.xtime DESC
         ->  Index Scan using _hyper_2_59_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_59_chunk  (cost=0.29..653.28 rows=5366 width=26) (actual time=0.022..0.034 rows=10 loops=1)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan using _hyper_2_58_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_58_chunk  (cost=0.41..2436.44 rows=20216 width=26) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan using _hyper_2_56_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_56_chunk  (cost=0.41..2438.02 rows=20358 width=26) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan using _hyper_2_57_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_57_chunk  (cost=0.29..2401.39 rows=20043 width=26) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan using _hyper_2_55_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_55_chunk  (cost=0.29..2432.04 rows=20261 width=26) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan using _hyper_2_54_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_54_chunk  (cost=0.41..2328.38 rows=18943 width=26) (never executed)
               Index Cond: (location = 'Y'::text)
 Planning Time: 0.593 ms
 Execution Time: 0.082 ms
(17 rows)

fabrizio=# SET timescaledb.enable_cagg_watermark_constify TO off;
SET
fabrizio=# EXPLAIN ANALYZE SELECT * FROM temperature_5min WHERE location = 'Y' ORDER BY xtime DESC LIMIT 10;
                                                                                              QUERY PLAN                                                                                              
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..1.50 rows=10 width=26) (actual time=0.026..0.041 rows=10 loops=1)
   ->  Custom Scan (ChunkAppend) on _materialized_hypertable_2  (cost=0.29..653.28 rows=5366 width=26) (actual time=0.025..0.039 rows=10 loops=1)
         Order: _materialized_hypertable_2.xtime DESC
         ->  Index Scan using _hyper_2_59_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_59_chunk  (cost=0.29..653.28 rows=5366 width=26) (actual time=0.023..0.036 rows=10 loops=1)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan using _hyper_2_58_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_58_chunk  (cost=0.41..2436.44 rows=20216 width=26) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan using _hyper_2_56_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_56_chunk  (cost=0.41..2438.02 rows=20358 width=26) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan using _hyper_2_57_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_57_chunk  (cost=0.29..2401.39 rows=20043 width=26) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan using _hyper_2_55_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_55_chunk  (cost=0.29..2432.04 rows=20261 width=26) (never executed)
               Index Cond: (location = 'Y'::text)
         ->  Index Scan using _hyper_2_54_chunk__materialized_hypertable_2_location_xtime_idx on _hyper_2_54_chunk  (cost=0.41..2328.38 rows=18943 width=26) (never executed)
               Index Cond: (location = 'Y'::text)
 Planning Time: 0.558 ms
 Execution Time: 0.083 ms
(17 rows)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants