Skip to content

Latest commit

 

History

History
245 lines (193 loc) · 12.1 KB

20220408_03.md

File metadata and controls

245 lines (193 loc) · 12.1 KB

PostgreSQL 15 preview - 窗口函数排序过滤N条支持推理filter, 大幅提升性能, 避免whole filter, 避免性能骤变(limit N, GIST等场景)

作者

digoal

日期

2022-04-08

标签

PostgreSQL , 骤变 , 窗口 , 推理 , filter , whole filter , 有限filter


背景

建议先阅读如下文章, 找找感觉.

《PostgreSQL 优化器逻辑推理能力 源码解析》

《PostgreSQL 13 preview - gin倒排索引性能优化 - 防止gin full scan(逻辑推理)》

《PostgreSQL 函数稳定性与constraint_excluded分区表逻辑推理过滤的CASE》

《GIS附近查找性能优化 - PostGIS long lat geometry distance search tuning using gist knn function》

《PostgreSQL GiST Order by 距离 + 距离范围判定 + limit 骤变优化与背景原因》

PostgreSQL 支持窗口函数filter逻辑推理, 降低filter时需要扫描的条数, 避免whole filter. 可以大幅度提升性能.

那么它能不能解决这种问题呢?我认为未来应该可以, 因为只是支持更多的窗口操作符就行了. 目前仅支持btree的单调顺序或者单调降序场景, 支持顺序或降序的单调比较操作符(>,>=,=,<,<=).

复现骤变的例子  
  
postgres=# create table tbl (id int , ps point);  
CREATE TABLE  
postgres=# insert into tbl select generate_series(1,1000000), point(10+random()*1000, 10+random()*1000);  
INSERT 0 1000000  
postgres=# create index idx_tbl_1 on tbl using gist (ps);  
CREATE INDEX  
  
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where ps <-> point(10,10) <1 order by ps <-> point(10,10) limit 10;  
                                                               QUERY PLAN                                                                  
-----------------------------------------------------------------------------------------------------------------------------------------  
 Limit  (cost=0.29..1.40 rows=10 width=28) (actual time=0.076..1044.221 rows=2 loops=1)  
   Output: id, ps, ((ps <-> '(10,10)'::point))  
   Buffers: shared hit=1005920  
   ->  Index Scan using idx_tbl_1 on public.tbl  (cost=0.29..37009.92 rows=333333 width=28) (actual time=0.074..1044.218 rows=2 loops=1)  
         Output: id, ps, (ps <-> '(10,10)'::point)  
         Order By: (tbl.ps <-> '(10,10)'::point)  
         Filter: ((tbl.ps <-> '(10,10)'::point) < '1'::double precision)  
         Rows Removed by Filter: 999998  
         Buffers: shared hit=1005920  
 Query Identifier: 3039857854596701697  
 Planning Time: 0.090 ms  
 Execution Time: 1044.260 ms  
(12 rows)  
  
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where ps <-> point(10,10) <1;  
                                                   QUERY PLAN                                                     
----------------------------------------------------------------------------------------------------------------  
 Seq Scan on public.tbl  (cost=0.00..21370.00 rows=333333 width=20) (actual time=74.957..89.471 rows=2 loops=1)  
   Output: id, ps  
   Filter: ((tbl.ps <-> '(10,10)'::point) < '1'::double precision)  
   Rows Removed by Filter: 999998  
   Buffers: shared hit=6370  
 Query Identifier: 7140378054622426006  
 Planning Time: 0.099 ms  
 Execution Time: 89.493 ms  
(8 rows)  
  
postgres=# explain (analyze,verbose,timing,costs,buffers)   
select * from   
(select row_number() over (order by ps <-> point(10,10)) as rn, * from tbl where ps <-> point(10,10) <1 order by ps <-> point(10,10) ) t   
where rn<10;   
                                                                  QUERY PLAN                                                                     
-----------------------------------------------------------------------------------------------------------------------------------------------  
 Subquery Scan on t  (cost=0.29..47009.91 rows=111111 width=28) (actual time=0.802..1046.788 rows=2 loops=1)  
   Output: t.rn, t.id, t.ps  
   Filter: (t.rn < 10)  
   Buffers: shared hit=1005920  
   ->  WindowAgg  (cost=0.29..42843.24 rows=333333 width=36) (actual time=0.800..1046.783 rows=2 loops=1)  
         Output: row_number() OVER (?), tbl.id, tbl.ps, ((tbl.ps <-> '(10,10)'::point))  
         Buffers: shared hit=1005920  
         ->  Index Scan using idx_tbl_1 on public.tbl  (cost=0.29..37009.92 rows=333333 width=28) (actual time=0.149..1046.130 rows=2 loops=1)  
               Output: (tbl.ps <-> '(10,10)'::point), tbl.id, tbl.ps  
               Order By: (tbl.ps <-> '(10,10)'::point)  
               Filter: ((tbl.ps <-> '(10,10)'::point) < '1'::double precision)  
               Rows Removed by Filter: 999998  
               Buffers: shared hit=1005920  
 Query Identifier: 4323994608239779036  
 Planning Time: 0.219 ms  
 Execution Time: 1046.989 ms  
(16 rows)  

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9d9c02ccd1aea8e9131d8f4edb21bf1687e40782

Teach planner and executor about monotonic window funcs  
author	David Rowley <drowley@postgresql.org>	  
Thu, 7 Apr 2022 22:34:36 +0000 (10:34 +1200)  
committer	David Rowley <drowley@postgresql.org>	  
Thu, 7 Apr 2022 22:34:36 +0000 (10:34 +1200)  
commit	9d9c02ccd1aea8e9131d8f4edb21bf1687e40782  
tree	fb33e9286e8c46eb50424fb0e271a4579daa8f5d	tree  
parent	2f4d0d67994b32320487784afab7ab997d331bb5	commit | diff  
Teach planner and executor about monotonic window funcs  
  
Window functions such as row_number() always return a value higher than  
the previously returned value for tuples in any given window partition.  
  
Traditionally queries such as;  
  
SELECT * FROM (  
   SELECT *, row_number() over (order by c) rn  
   FROM t  
) t WHERE rn <= 10;  
  
were executed fairly inefficiently.  Neither the query planner nor the  
executor knew that once rn made it to 11 that nothing further would match  
the outer query's WHERE clause.  It would blindly continue until all  
tuples were exhausted from the subquery.  
  
Here we implement means to make the above execute more efficiently.  
  
This is done by way of adding a pg_proc.prosupport function to various of  
the built-in window functions and adding supporting code to allow the  
support function to inform the planner if the window function is  
monotonically increasing, monotonically decreasing, both or neither.  The  
planner is then able to make use of that information and possibly allow  
the executor to short-circuit execution by way of adding a "run condition"  
to the WindowAgg to allow it to determine if some of its execution work  
can be skipped.  
  
This "run condition" is not like a normal filter.  These run conditions  
are only built using quals comparing values to monotonic window functions.  
For monotonic increasing functions, quals making use of the btree  
operators for <, <= and = can be used (assuming the window function column  
is on the left). You can see here that once such a condition becomes false  
that a monotonic increasing function could never make it subsequently true  
again.  For monotonically decreasing functions the >, >= and = btree  
operators for the given type can be used for run conditions.  
  
The best-case situation for this is when there is a single WindowAgg node  
without a PARTITION BY clause.  Here when the run condition becomes false  
the WindowAgg node can simply return NULL.  No more tuples will ever match  
the run condition.  It's a little more complex when there is a PARTITION  
BY clause.  In this case, we cannot return NULL as we must still process  
other partitions.  To speed this case up we pull tuples from the outer  
plan to check if they're from the same partition and simply discard them  
if they are.  When we find a tuple belonging to another partition we start  
processing as normal again until the run condition becomes false or we run  
out of tuples to process.  
  
When there are multiple WindowAgg nodes to evaluate then this complicates  
the situation.  For intermediate WindowAggs we must ensure we always  
return all tuples to the calling node.  Any filtering done could lead to  
incorrect results in WindowAgg nodes above.  For all intermediate nodes,  
we can still save some work when the run condition becomes false.  We've  
no need to evaluate the WindowFuncs anymore.  Other WindowAgg nodes cannot  
reference the value of these and these tuples will not appear in the final  
result anyway.  The savings here are small in comparison to what can be  
saved in the top-level WingowAgg, but still worthwhile.  
  
Intermediate WindowAgg nodes never filter out tuples, but here we change  
WindowAgg so that the top-level WindowAgg filters out tuples that don't  
match the intermediate WindowAgg node's run condition.  Such filters  
appear in the "Filter" clause in EXPLAIN for the top-level WindowAgg node.  
  
Here we add prosupport functions to allow the above to work for;  
row_number(), rank(), dense_rank(), count(*) and count(expr).  It appears  
technically possible to do the same for min() and max(), however, it seems  
unlikely to be useful enough, so that's not done here.  
  
Bump catversion  
  
Author: David Rowley  
Reviewed-by: Andy Fan, Zhihong Yu  
Discussion: https://postgr.es/m/CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com  
+-- Ensure dr = 1 is converted to dr <= 1 to get all rows leading up to dr = 1  
+EXPLAIN (COSTS OFF)  
+SELECT * FROM  
+  (SELECT empno,  
+          salary,  
+          dense_rank() OVER (ORDER BY salary DESC) dr  
+   FROM empsalary) emp  
+WHERE dr = 1;  
+                     QUERY PLAN                        
+-----------------------------------------------------  
+ Subquery Scan on emp  
+   Filter: (emp.dr = 1)  
+   ->  WindowAgg  
+         Run Condition: (dense_rank() OVER (?) <= 1)  
+         ->  Sort  
+               Sort Key: empsalary.salary DESC  
+               ->  Seq Scan on empsalary  
+(7 rows)  
  
  
+-- likewise with count(empno) instead of row_number()  
+EXPLAIN (COSTS OFF)  
+SELECT * FROM  
+  (SELECT empno,  
+          depname,  
+          salary,  
+          count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c  
+   FROM empsalary) emp  
+WHERE c <= 3;  
+                         QUERY PLAN                           
+------------------------------------------------------------  
+ WindowAgg  
+   Run Condition: (count(empsalary.empno) OVER (?) <= 3)  
+   ->  Sort  
+         Sort Key: empsalary.depname, empsalary.salary DESC  
+         ->  Seq Scan on empsalary  
+(5 rows)  

digoal's wechat