Skip to content

Latest commit

 

History

History
143 lines (110 loc) · 6.22 KB

20231211_01.md

File metadata and controls

143 lines (110 loc) · 6.22 KB

PostgreSQL 17 preview - btree index backward scan (order by desc 场景)优化

作者

digoal

日期

2023-12-11

标签

PostgreSQL , PolarDB , DuckDB , btree , order by desc , backward scan


背景

本优化order by desc 场景可以少扫描1个leaf page.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c9c0589fda0edc46b8f5e7362b04636c0c4f0723

Optimize nbtree backward scan boundary cases.  
  
author	Peter Geoghegan <pg@bowt.ie>	  
Fri, 8 Dec 2023 19:05:17 +0000 (11:05 -0800)  
committer	Peter Geoghegan <pg@bowt.ie>	  
Fri, 8 Dec 2023 19:05:17 +0000 (11:05 -0800)  
commit	c9c0589fda0edc46b8f5e7362b04636c0c4f0723  
tree	c09896c93142dc771714209bf94eb8c0aa595090	tree  
parent	b437571714707bc6466abde1a0af5e69aaade09c	commit | diff  
Optimize nbtree backward scan boundary cases.  
  
Teach _bt_binsrch (and related helper routines like _bt_search and  
_bt_compare) about the initial positioning requirements of backward  
scans.  Routines like _bt_binsrch already know all about "nextkey"  
searches, so it seems natural to teach them about "goback"/backward  
searches, too.  These concepts are closely related, and are much easier  
to understand when discussed together.  
  
Now that certain implementation details are hidden from _bt_first, it's  
straightforward to add a new optimization: backward scans using the <  
strategy now avoid extra leaf page accesses in certain "boundary cases".  
Consider the following example, which uses the tenk1 table (and its  
tenk1_hundred index) from the standard regression tests:  
  
SELECT * FROM tenk1 WHERE hundred < 12 ORDER BY hundred DESC LIMIT 1;  
  
Before this commit, nbtree would scan two leaf pages, even though it was  
only really necessary to scan one leaf page.  We'll now descend straight  
to the leaf page containing a (12, -inf) high key instead.  The scan  
will locate matching non-pivot tuples with "hundred" values starting  
from the value 11.  The scan won't waste a page access on the right  
sibling leaf page, which cannot possibly contain any matching tuples.  
  
You can think of the optimization added by this commit as disabling an  
optimization (the _bt_compare "!pivotsearch" behavior that was added to  
Postgres 12 in commit dd299df8) for a small subset of cases where it was  
always counterproductive.  
  
Equivalently, you can think of the new optimization as extending the  
"pivotsearch" behavior that page deletion by VACUUM has long required  
(since the aforementioned Postgres 12 commit went in) to other, similar  
cases.  Obviously, this isn't strictly necessary for these new cases  
(unlike VACUUM, _bt_first is prepared to move the scan to the left once  
on the leaf level), but the underlying principle is the same.  
  
Author: Peter Geoghegan <pg@bowt.ie>  
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>  
Discussion: https://postgr.es/m/CAH2-Wz=XPzM8HzaLPq278Vms420mVSHfgs9wi5tjFKHcapZCEw@mail.gmail.com  
+--  
+-- Add coverage for optimization of backwards scan index descents  
+--  
+-- Here we expect _bt_search to descend straight to a leaf page containing a  
+-- non-pivot tuple with the value '47', which comes last (after 11 similar  
+-- non-pivot tuples).  Query execution should only need to visit a single  
+-- leaf page here.  
+--  
+-- Test case relies on tenk1_hundred index having a leaf page whose high key  
+-- is '(48, -inf)'.  We use a low cardinality index to make our test case less  
+-- sensitive to implementation details that may change in the future.  
+set enable_seqscan to false;  
+set enable_indexscan to true;  
+set enable_bitmapscan to false;  
+explain (costs off)  
+select hundred, twenty from tenk1 where hundred < 48 order by hundred desc limit 1;  
+                       QUERY PLAN                         
+--------------------------------------------------------  
+ Limit  
+   ->  Index Scan Backward using tenk1_hundred on tenk1  
+         Index Cond: (hundred < 48)  
+(3 rows)  
+  
+select hundred, twenty from tenk1 where hundred < 48 order by hundred desc limit 1;  
+ hundred | twenty   
+---------+--------  
+      47 |      7  
+(1 row)  
+  
+-- This variant of the query need only return a single tuple located to the immediate  
+-- right of the '(48, -inf)' high key.  It also only needs to scan one single  
+-- leaf page (the right sibling of the page scanned by the last test case):  
+explain (costs off)  
+select hundred, twenty from tenk1 where hundred <= 48 order by hundred desc limit 1;  
+                       QUERY PLAN                         
+--------------------------------------------------------  
+ Limit  
+   ->  Index Scan Backward using tenk1_hundred on tenk1  
+         Index Cond: (hundred <= 48)  
+(3 rows)  
+  
+select hundred, twenty from tenk1 where hundred <= 48 order by hundred desc limit 1;  
+ hundred | twenty   
+---------+--------  
+      48 |      8  
+(1 row)  
+  

digoal's wechat