nbtree VACUUM: cope with topparent inconsistencies.
Avoid "right sibling %u of block %u is not next child" errors when
vacuuming a corrupt nbtree index.  Just LOG the issue and press on.
That way VACUUM will have a decent chance of finishing off all required
processing for the index (and for the table as a whole).

This is similar to recent work from commit 5abff19, as well as work
from commit 5b861ba (later backpatched as commit 43e409c), which
taught nbtree VACUUM to keep going when its "re-find" check fails.  The
hardening added by this commit takes place directly after the "re-find"
check, right before the critical section for the first stage of page
deletion.
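
The pattern described above (validate before the critical section, and on failure log and back out instead of raising an error) can be illustrated with a small standalone sketch. This is not PostgreSQL code: `ParentPage`, `BlockNum`, and `mark_halfdead_sketch` are hypothetical names invented for illustration, the "page" is a plain array of child block numbers, and `fprintf` stands in for a LOG-level report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these names exist in PostgreSQL. */
typedef unsigned int BlockNum;

typedef struct ParentPage
{
	BlockNum	downlinks[8];	/* child block numbers, in key order */
	int			ndownlinks;
} ParentPage;

/*
 * Sketch of the first stage of deleting the child at poffset.  The entry
 * after poffset is expected to point to rightsib.  If it does not, the
 * "index" is treated as corrupt: report the problem, leave the page
 * untouched, and return false so the caller can carry on with other work.
 */
static bool
mark_halfdead_sketch(ParentPage *parent, int poffset, BlockNum rightsib)
{
	assert(parent != NULL);
	assert(poffset >= 0 && poffset + 1 < parent->ndownlinks);

	/* All validation happens here, before any modification. */
	if (parent->downlinks[poffset + 1] != rightsib)
	{
		fprintf(stderr, "LOG: right sibling %u is not next child %u\n",
				rightsib, parent->downlinks[poffset + 1]);
		return false;			/* back out; nothing was changed */
	}

	/* Stand-in for the critical section: no failure paths past here. */
	parent->downlinks[poffset] = rightsib;
	for (int i = poffset + 1; i < parent->ndownlinks - 1; i++)
		parent->downlinks[i] = parent->downlinks[i + 1];
	parent->ndownlinks--;
	return true;
}
```

A caller in this sketch treats a `false` return as "skip this page and keep going", which mirrors the intent of letting VACUUM finish the rest of its work.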

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wz=dayg0vjs4+er84TS9ami=csdzjpuiCGbEw=idhwqhzQ@mail.gmail.com
Backpatch: 11- (all supported versions).
petergeoghegan committed Jun 22, 2023
1 parent e966b3d commit b631182
Showing 1 changed file with 16 additions and 7 deletions.
23 changes: 16 additions & 7 deletions src/backend/access/nbtree/nbtpage.c
@@ -1772,12 +1772,6 @@ _bt_mark_page_halfdead(Relation rel, Buffer leafbuf, BTStack stack)
 								 &topparent, &topparentrightsib))
 		return false;

-	/*
-	 * Check that the parent-page index items we're about to delete/overwrite
-	 * in subtree parent page contain what we expect. This can fail if the
-	 * index has become corrupt for some reason. We want to throw any error
-	 * before entering the critical section --- otherwise it'd be a PANIC.
-	 */
 	page = BufferGetPage(subtreeparent);
 	opaque = (BTPageOpaque) PageGetSpecialPointer(page);

@@ -1795,15 +1789,29 @@ _bt_mark_page_halfdead(Relation rel, Buffer leafbuf, BTStack stack)
 	nextoffset = OffsetNumberNext(poffset);
 	itemid = PageGetItemId(page, nextoffset);
 	itup = (IndexTuple) PageGetItem(page, itemid);
+
+	/*
+	 * Check that the parent-page index items we're about to delete/overwrite
+	 * in subtree parent page contain what we expect. This can fail if the
+	 * index has become corrupt for some reason. When that happens we back
+	 * out of deletion of the leafbuf subtree. (This is just like the case
+	 * where _bt_lock_subtree_parent() cannot "re-find" leafbuf's downlink.)
+	 */
 	if (BTreeTupleGetDownLink(itup) != topparentrightsib)
-		ereport(ERROR,
+	{
+		ereport(LOG,
 				(errcode(ERRCODE_INDEX_CORRUPTED),
 				 errmsg_internal("right sibling %u of block %u is not next child %u of block %u in index \"%s\"",
 								 topparentrightsib, topparent,
 								 BTreeTupleGetDownLink(itup),
 								 BufferGetBlockNumber(subtreeparent),
 								 RelationGetRelationName(rel))));
+
+		_bt_relbuf(rel, subtreeparent);
+		Assert(false);
+		return false;
+	}

 	/*
 	 * Any insert which would have gone on the leaf block will now go to its
 	 * right sibling. In other words, the key space moves right.
@@ -2428,6 +2436,7 @@ _bt_lock_subtree_parent(Relation rel, BlockNumber child, BTStack stack,
 				(errcode(ERRCODE_INDEX_CORRUPTED),
 				 errmsg_internal("failed to re-find parent key in index \"%s\" for deletion target page %u",
 								 RelationGetRelationName(rel), child)));
+		Assert(false);
 		return false;
 	}
