Optimize nbtree backward scan boundary cases.

Teach _bt_binsrch (and related helper routines like _bt_search and
_bt_compare) about the initial positioning requirements of backward
scans.  Routines like _bt_binsrch already know all about "nextkey"
searches, so it seems natural to teach them about "goback"/backward
searches, too.  These concepts are closely related, and are much easier
to understand when discussed together.
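
A toy model may help show how the two flags combine. The sketch below is not the nbtree code: it searches a plain sorted int array, and toy_binsrch and its signature are hypothetical names made up for illustration. It assumes "nextkey" selects between the first item >= key and the first item > key, and that "backward" then steps back one position to land on the last item before that point.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (hypothetical, not nbtree code): position a scan within a
 * sorted array.
 *
 *   nextkey = false, backward = false: first item >= key
 *   nextkey = true,  backward = false: first item >  key
 *   nextkey = false, backward = true:  last item  <  key
 *   nextkey = true,  backward = true:  last item  <= key
 *
 * Returns an index in [-1, n]; -1 and n mean "off the end" of the array.
 */
ptrdiff_t
toy_binsrch(const int *items, size_t n, int key, bool nextkey, bool backward)
{
    size_t low = 0;
    size_t high = n;
    int cmpval = nextkey ? 0 : 1;   /* comparison value that moves "low" right */

    while (low < high)
    {
        size_t mid = low + (high - low) / 2;
        /* three-way comparison of key against items[mid] */
        int cmp = (key > items[mid]) - (key < items[mid]);

        if (cmp >= cmpval)
            low = mid + 1;
        else
            high = mid;
    }

    /* low is now the first item >= key (or > key when nextkey is set) */
    if (backward)
        return (ptrdiff_t) low - 1;     /* back up one position */
    return (ptrdiff_t) low;
}
```

With items {10, 11, 11, 12, 12, 13} and key 12, the four flag combinations in the order listed in the comment give indexes 3, 5, 2 and 4.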

Now that certain implementation details are hidden from _bt_first, it's
straightforward to add a new optimization: backward scans using the <
strategy now avoid extra leaf page accesses in certain "boundary cases".
Consider the following example, which uses the tenk1 table (and its
tenk1_hundred index) from the standard regression tests:

SELECT * FROM tenk1 WHERE hundred < 12 ORDER BY hundred DESC LIMIT 1;

Before this commit, nbtree would scan two leaf pages, even though it was
only really necessary to scan one leaf page.  We'll now descend straight
to the leaf page containing a (12, -inf) high key instead.  The scan
will locate matching non-pivot tuples with "hundred" values starting
from the value 11.  The scan won't waste a page access on the right
sibling leaf page, which cannot possibly contain any matching tuples.
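
The cost difference can be sketched with a simplified leaf-level model. Everything here is an assumption made for illustration: ToyLeafPage and toy_backward_lt_scan are hypothetical names, pages hold plain ints, and a linear right-to-left walk stands in for the real per-page binary search. The only point is to count leaf pages read, depending on which page the descent lands on.

```c
#include <stddef.h>

#define TOY_PAGE_ITEMS 4

/* Hypothetical stand-in for a leaf page: a few sorted "hundred" values */
typedef struct ToyLeafPage
{
    int     items[TOY_PAGE_ITEMS];
    size_t  nitems;
} ToyLeafPage;

/*
 * Toy backward scan for the last item < key, beginning on leaf page
 * "start" and stepping to left siblings as needed.  Returns the number
 * of leaf pages read; *found is set only when a matching item exists.
 */
int
toy_backward_lt_scan(const ToyLeafPage *pages, size_t start, int key,
                     int *found)
{
    int     pages_read = 0;
    size_t  pageno = start;

    for (;;)
    {
        const ToyLeafPage *page = &pages[pageno];
        size_t  i = page->nitems;

        pages_read++;

        /* skip items >= key, walking from the right end of the page */
        while (i > 0 && page->items[i - 1] >= key)
            i--;

        if (i > 0)
        {
            *found = page->items[i - 1];
            return pages_read;
        }

        if (pageno == 0)
            return pages_read;      /* no left sibling, nothing matches */
        pageno--;                   /* move to the left sibling */
    }
}
```

Take two pages, {10, 11, 11, 11} followed by {12, 12, 12, 13}, and key 12. Starting on the right page models the old behavior and reads two pages. Starting on the left page, whose high key would be (12, -inf), reads one. Both find 11.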

You can think of the optimization added by this commit as disabling an
optimization (the _bt_compare "!pivotsearch" behavior that was added to
Postgres 12 in commit dd299df) for a small subset of cases where it was
always counterproductive.
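
One way to picture that tiebreak is a rough sketch under stated assumptions, not the actual _bt_compare logic. ToyPivot, toy_compare and toy_pick_child are hypothetical names, the key has a single int attribute, and a flag stands in for a truncated heap TID. When the search key equals a pivot on every untruncated attribute, the forward-style rule treats the key as greater and so steers the descent to the right child, while the backward rule reports equality and keeps the descent on the left.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pivot: one key attribute plus a "TID truncated" marker */
typedef struct ToyPivot
{
    int     hundred;        /* leading key attribute */
    bool    tid_truncated;  /* true models a (hundred, -inf) pivot */
} ToyPivot;

/*
 * Three-way comparison of a search key against a pivot.  A result > 0
 * means the key sorts after the pivot.
 */
int
toy_compare(int key, ToyPivot pivot, bool backward)
{
    if (key != pivot.hundred)
        return (key > pivot.hundred) - (key < pivot.hundred);

    /* key is equal on all untruncated attributes */
    if (pivot.tid_truncated && !backward)
        return 1;           /* tiebreak: treat key as > (hundred, -inf) */
    return 0;
}

/*
 * Pick the child to descend to: seps[i] separates child i from child
 * i + 1, so count the separators that the key sorts strictly after.
 */
size_t
toy_pick_child(const ToyPivot *seps, size_t nseps, int key, bool backward)
{
    size_t  child = 0;

    while (child < nseps && toy_compare(key, seps[child], backward) > 0)
        child++;
    return child;
}
```

With a single separator (12, -inf) and key 12, the !backward rule selects child 1 (the page holding the 12s) and the backward rule selects child 0 (the page whose high key is the separator). Keys 11 and 13 pick the same child under either rule, so only the exact boundary case changes.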

Equivalently, you can think of the new optimization as extending the
"pivotsearch" behavior that page deletion by VACUUM has long required
(since the aforementioned Postgres 12 commit went in) to other, similar
cases.  Obviously, this isn't strictly necessary for these new cases
(unlike VACUUM, _bt_first is prepared to move the scan to the left once
on the leaf level), but the underlying principle is the same.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=XPzM8HzaLPq278Vms420mVSHfgs9wi5tjFKHcapZCEw@mail.gmail.com
petergeoghegan committed Dec 8, 2023
1 parent b437571 commit c9c0589
Showing 7 changed files with 210 additions and 149 deletions.
16 changes: 8 additions & 8 deletions contrib/amcheck/verify_nbtree.c
@@ -3165,7 +3165,7 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 ItemId itemid;
 int32 cmp;

-Assert(key->pivotsearch);
+Assert(!key->nextkey && key->backward);

 /* Verify line pointer before checking tuple */
 itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
@@ -3227,7 +3227,7 @@ invariant_leq_offset(BtreeCheckState *state, BTScanInsert key,
 {
 int32 cmp;

-Assert(key->pivotsearch);
+Assert(!key->nextkey && key->backward);

 cmp = _bt_compare(state->rel, key, state->target, upperbound);

@@ -3250,7 +3250,7 @@ invariant_g_offset(BtreeCheckState *state, BTScanInsert key,
 {
 int32 cmp;

-Assert(key->pivotsearch);
+Assert(!key->nextkey && key->backward);

 cmp = _bt_compare(state->rel, key, state->target, lowerbound);

@@ -3288,7 +3288,7 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 ItemId itemid;
 int32 cmp;

-Assert(key->pivotsearch);
+Assert(!key->nextkey && key->backward);

 /* Verify line pointer before checking tuple */
 itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
@@ -3514,17 +3514,17 @@ palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum)
  * For example, invariant_g_offset() might miss a cross-page invariant failure
  * on an internal level if the scankey built from the first item on the
  * target's right sibling page happened to be equal to (not greater than) the
- * last item on target page. The !pivotsearch tiebreaker in _bt_compare()
- * might otherwise cause amcheck to assume (rather than actually verify) that
- * the scankey is greater.
+ * last item on target page. The !backward tiebreaker in _bt_compare() might
+ * otherwise cause amcheck to assume (rather than actually verify) that the
+ * scankey is greater.
  */
 static inline BTScanInsert
 bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 {
 BTScanInsert skey;

 skey = _bt_mkscankey(rel, itup);
-skey->pivotsearch = true;
+skey->backward = true;

 return skey;
 }
16 changes: 13 additions & 3 deletions src/backend/access/nbtree/nbtpage.c
@@ -1958,10 +1958,20 @@ _bt_pagedel(Relation rel, Buffer leafbuf, BTVacState *vstate)
 return;
 }

-/* we need an insertion scan key for the search, so build one */
+/*
+ * We need an insertion scan key, so build one.
+ *
+ * _bt_search searches for the leaf page that contains any
+ * matching non-pivot tuples, but we need it to "search" for
+ * the high key pivot from the page that we're set to delete.
+ * Compensate for the mismatch by having _bt_search locate the
+ * last position < equal-to-untruncated-prefix non-pivots.
+ */
 itup_key = _bt_mkscankey(rel, targetkey);
-/* find the leftmost leaf page with matching pivot/high key */
-itup_key->pivotsearch = true;
+
+/* Set up a BTLessStrategyNumber-like insertion scan key */
+itup_key->nextkey = false;
+itup_key->backward = true;
 stack = _bt_search(rel, NULL, itup_key, &sleafbuf, BT_READ);
 /* won't need a second lock or pin on leafbuf */
 _bt_relbuf(rel, sleafbuf);
