Fix poor performance in maintenance tasks
The maintenance tasks were not performing well due to issue timescale/timescaledb#3404. Namely, the qualifier on `series_id` was not being pushed down correctly to the compressed chunks of the metric table. For example, the old plan for the `delete_expired_series` query was:

```
Update on container_memory_failures_total  (cost=0.00..39538450.88 rows=1 width=210)
  ->  Nested Loop Semi Join  (cost=0.00..39538450.88 rows=1 width=210)
        Join Filter: (container_memory_failures_total.id = potentially_drop_series.series_id)
        ->  Nested Loop  (cost=0.00..50249.71 rows=1148685 width=168)
              ->  Seq Scan on ids_epoch  (cost=0.00..1.01 rows=1 width=14)
              ->  Seq Scan on container_memory_failures_total  (cost=0.00..38761.85 rows=1148685 width=154)
                    Filter: (delete_epoch IS NULL)
        ->  Materialize  (cost=0.00..39470970.90 rows=1 width=50)
              ->  Nested Loop Anti Join  (cost=0.00..39470970.89 rows=1 width=50)
                    Join Filter: (data_exists.series_id = potentially_drop_series.series_id)
                    ->  Subquery Scan on potentially_drop_series  (cost=0.00..0.07 rows=1 width=40)
                          ->  Limit  (cost=0.00..0.06 rows=1 width=12)
                                ->  HashSetOp Except  (cost=0.00..0.06 rows=1 width=12)
                                      ->  Append  (cost=0.00..0.05 rows=2 width=12)
                                            ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..0.02 rows=1 width=12)
                                                  ->  HashAggregate  (cost=0.00..0.01 rows=1 width=0)
                                                        Group Key: series_id
                                                        ->  Result  (cost=0.00..0.00 rows=0 width=0)
                                                              One-Time Filter: false
                                            ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=12)
                                                  ->  HashAggregate  (cost=0.00..0.01 rows=1 width=0)
                                                        Group Key: series_id
                                                        ->  Result  (cost=0.00..0.00 rows=0 width=0)
                                                              One-Time Filter: false
                    ->  Append  (cost=0.00..11565989.70 rows=2232398490 width=18)
                          ->  Seq Scan on container_memory_failures_total data_exists_1  (cost=0.00..0.00 rows=1 width=18)
                                Filter: ("time" >= '2021-02-11 13:11:07.779863+00'::timestamp with time zone)
                          ->  Custom Scan (DecompressChunk) on _hyper_318_62_chunk data_exists_2  (cost=0.04..2660.36 rows=67068000 width=18)
                                Filter: ("time" >= '2021-02-11 13:11:07.779863+00'::timestamp with time zone)
                                ->  Seq Scan on compress_hyper_333_9206_chunk  (cost=0.00..2660.36 rows=67068 width=136)
                                      Filter: (_ts_meta_max_1 >= '2021-02-11 13:11:07.779863+00'::timestamp with time zone)
                          ->  Custom Scan (DecompressChunk) on _hyper_318_4497_chunk data_exists_3  (cost=0.04..2698.91 rows=67432000 width=18)
                                Filter: ("time" >= '2021-02-11 13:11:07.779863+00'::timestamp with time zone)
                                ->  Seq Scan on compress_hyper_333_14717_chunk  (cost=0.00..2698.91 rows=67432 width=136)
                                      Filter: (_ts_meta_max_1 >= '2021-02-11 13:11:07.779863+00'::timestamp with time zone)
....
```

Notice the lack of quals on the compressed chunks.
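To see the push-down behaviour in isolation, here is a minimal, hypothetical reproduction against a compressed hypertable segmented by `series_id`; the `metric` table and column names are illustrative only and not the actual Promscale schema:

```sql
-- Illustrative setup: a hypertable compressed with series_id as a segmentby column.
CREATE TABLE metric("time" timestamptz NOT NULL, series_id bigint, value double precision);
SELECT create_hypertable('metric', 'time');
ALTER TABLE metric SET (timescaledb.compress, timescaledb.compress_segmentby = 'series_id');

-- Compress the existing chunks so the planner has compressed chunks to scan.
SELECT compress_chunk(c) FROM show_chunks('metric') c;

-- With the planner issue, the series_id qual is only applied after decompression;
-- once it is pushed down, it shows up as an Index Cond / Filter on the
-- compress_hyper_* chunk scans in this plan.
EXPLAIN
SELECT 1
FROM metric
WHERE series_id = 42
  AND "time" >= now() - interval '30 days'
LIMIT 1;
```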
The new plan is:

```
Update on container_memory_failures_total  (cost=3.03..6.28 rows=1 width=228)
  ->  Nested Loop  (cost=3.03..6.28 rows=1 width=228)
        ->  Nested Loop  (cost=3.03..5.26 rows=1 width=214)
              ->  HashAggregate  (cost=2.60..2.61 rows=1 width=68)
                    Group Key: potentially_drop_series.series_id
                    ->  Nested Loop Left Join  (cost=2.51..2.60 rows=1 width=68)
                          Filter: (ex.indicator IS NULL)
                          ->  Subquery Scan on potentially_drop_series  (cost=0.00..0.07 rows=1 width=40)
                                ->  Limit  (cost=0.00..0.06 rows=1 width=12)
                                      ->  HashSetOp Except  (cost=0.00..0.06 rows=1 width=12)
                                            ->  Append  (cost=0.00..0.05 rows=2 width=12)
                                                  ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..0.02 rows=1 width=12)
                                                        ->  HashAggregate  (cost=0.00..0.01 rows=1 width=0)
                                                              Group Key: series_id
                                                              ->  Result  (cost=0.00..0.00 rows=0 width=0)
                                                                    One-Time Filter: false
                                                  ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=12)
                                                        ->  HashAggregate  (cost=0.00..0.01 rows=1 width=0)
                                                              Group Key: series_id
                                                              ->  Result  (cost=0.00..0.00 rows=0 width=0)
                                                                    One-Time Filter: false
                          ->  Subquery Scan on ex  (cost=2.51..2.52 rows=1 width=32)
                                ->  Limit  (cost=2.51..2.51 rows=1 width=12)
                                      ->  Custom Scan (ChunkAppend) on container_memory_failures_total data_exists  (cost=2.51..2.51 rows=1000 width=12)
                                            Order: data_exists."time"
                                            ->  Custom Scan (DecompressChunk) on _hyper_318_62_chunk data_exists_1  (cost=2.51..2.51 rows=1000 width=8)
                                                  Filter: ("time" >= '2021-02-11 13:11:07.779863+00'::timestamp with time zone)
                                                  ->  Index Scan using compress_hyper_333_9206_chunk__compressed_hypertable_333_series on compress_hyper_333_9206_chunk  (cost=0.29..2.51 rows=1 width=56)
                                                        Index Cond: (series_id = potentially_drop_series.series_id)
                                                        Filter: (_ts_meta_max_1 >= '2021-02-11 13:11:07.779863+00'::timestamp with time zone)
                                            ->  Custom Scan (DecompressChunk) on _hyper_318_4497_chunk data_exists_2  (cost=2.51..2.51 rows=1000 width=8)
                                                  Filter: ("time" >= '2021-02-11 13:11:07.779863+00'::timestamp with time zone)
                                                  ->  Index Scan using compress_hyper_333_14717_chunk__compressed_hypertable_333_serie on compress_hyper_333_14717_chunk  (cost=0.29..2.51 rows=1 width=56)
                                                        Index Cond: (series_id = potentially_drop_series.series_id)
                                                        Filter: (_ts_meta_max_1 >= '2021-02-11 13:11:07.779863+00'::timestamp with time zone)
....
```

The `series_id` qualifier now appears as an `Index Cond` on the compressed chunk scans.

We also include a small optimization to skip the label delete entirely when the array of labels to delete is empty (a sketch of the guard is below). I think this will speed up the maintenance tasks quite a bit and thus fixes quite a few issues.

Fixes #663, #506, #596
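The empty-array guard could look roughly like this; the `labels` table and the `label_ids` variable are placeholders for whatever the maintenance task actually uses, so treat this as a sketch rather than the exact code in this commit:

```sql
-- Minimal sketch of the empty-array guard, assuming the maintenance task has
-- already collected the label ids to drop into a bigint[].
DO $$
DECLARE
    label_ids bigint[] := '{}';  -- ids gathered earlier by the task (empty here)
BEGIN
    -- Skip the DELETE entirely when there is nothing to remove, instead of
    -- issuing a no-op DELETE against the catalog.
    IF cardinality(label_ids) > 0 THEN
        DELETE FROM labels WHERE id = ANY (label_ids);
    END IF;
END
$$;
```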