digoal
2023-12-11
PostgreSQL , PolarDB , DuckDB , btree , order by desc , backward scan
本优化order by desc 场景可以少扫描1个leaf page.
Optimize nbtree backward scan boundary cases.
author Peter Geoghegan <pg@bowt.ie>
Fri, 8 Dec 2023 19:05:17 +0000 (11:05 -0800)
committer Peter Geoghegan <pg@bowt.ie>
Fri, 8 Dec 2023 19:05:17 +0000 (11:05 -0800)
commit c9c0589fda0edc46b8f5e7362b04636c0c4f0723
tree c09896c93142dc771714209bf94eb8c0aa595090 tree
parent b437571714707bc6466abde1a0af5e69aaade09c commit | diff
Optimize nbtree backward scan boundary cases.
Teach _bt_binsrch (and related helper routines like _bt_search and
_bt_compare) about the initial positioning requirements of backward
scans. Routines like _bt_binsrch already know all about "nextkey"
searches, so it seems natural to teach them about "goback"/backward
searches, too. These concepts are closely related, and are much easier
to understand when discussed together.
Now that certain implementation details are hidden from _bt_first, it's
straightforward to add a new optimization: backward scans using the <
strategy now avoid extra leaf page accesses in certain "boundary cases".
Consider the following example, which uses the tenk1 table (and its
tenk1_hundred index) from the standard regression tests:
SELECT * FROM tenk1 WHERE hundred < 12 ORDER BY hundred DESC LIMIT 1;
Before this commit, nbtree would scan two leaf pages, even though it was
only really necessary to scan one leaf page. We'll now descend straight
to the leaf page containing a (12, -inf) high key instead. The scan
will locate matching non-pivot tuples with "hundred" values starting
from the value 11. The scan won't waste a page access on the right
sibling leaf page, which cannot possibly contain any matching tuples.
You can think of the optimization added by this commit as disabling an
optimization (the _bt_compare "!pivotsearch" behavior that was added to
Postgres 12 in commit dd299df8) for a small subset of cases where it was
always counterproductive.
Equivalently, you can think of the new optimization as extending the
"pivotsearch" behavior that page deletion by VACUUM has long required
(since the aforementioned Postgres 12 commit went in) to other, similar
cases. Obviously, this isn't strictly necessary for these new cases
(unlike VACUUM, _bt_first is prepared to move the scan to the left once
on the leaf level), but the underlying principle is the same.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=XPzM8HzaLPq278Vms420mVSHfgs9wi5tjFKHcapZCEw@mail.gmail.com
+--
+-- Add coverage for optimization of backwards scan index descents
+--
+-- Here we expect _bt_search to descend straight to a leaf page containing a
+-- non-pivot tuple with the value '47', which comes last (after 11 similar
+-- non-pivot tuples). Query execution should only need to visit a single
+-- leaf page here.
+--
+-- Test case relies on tenk1_hundred index having a leaf page whose high key
+-- is '(48, -inf)'. We use a low cardinality index to make our test case less
+-- sensitive to implementation details that may change in the future.
+set enable_seqscan to false;
+set enable_indexscan to true;
+set enable_bitmapscan to false;
+explain (costs off)
+select hundred, twenty from tenk1 where hundred < 48 order by hundred desc limit 1;
+ QUERY PLAN
+--------------------------------------------------------
+ Limit
+ -> Index Scan Backward using tenk1_hundred on tenk1
+ Index Cond: (hundred < 48)
+(3 rows)
+
+select hundred, twenty from tenk1 where hundred < 48 order by hundred desc limit 1;
+ hundred | twenty
+---------+--------
+ 47 | 7
+(1 row)
+
+-- This variant of the query need only return a single tuple located to the immediate
+-- right of the '(48, -inf)' high key. It also only needs to scan one single
+-- leaf page (the right sibling of the page scanned by the last test case):
+explain (costs off)
+select hundred, twenty from tenk1 where hundred <= 48 order by hundred desc limit 1;
+ QUERY PLAN
+--------------------------------------------------------
+ Limit
+ -> Index Scan Backward using tenk1_hundred on tenk1
+ Index Cond: (hundred <= 48)
+(3 rows)
+
+select hundred, twenty from tenk1 where hundred <= 48 order by hundred desc limit 1;
+ hundred | twenty
+---------+--------
+ 48 | 8
+(1 row)
+