Fix bugs which could not create unique index "gp_fastsequence_objid_objmod_index" #146
Conversation
Hi @lss602726449, thank you for your efforts! Please revise both your pull request and your commit message's title and body so they are thorough and easy to understand. You can follow our commit conventions as a guide.
What is the root cause?
Maybe the root cause is getting the wrong snapshot during the reindex, so that it sees two versions of the same tuple while building the unique index, and the error occurs. Since we cannot reproduce it on a local machine, we are trying to reproduce it in the CI pipeline.
gp_fastsequence is populated and kept updated during inserts. There is no need to write/update gp_fastsequence during index builds when the block directory needs to be created. This code seems to have been added way back in history when gp_fastsequence was introduced, to cover the upgrades required to add/update the entry in gp_fastsequence. No such need exists any more, hence this code is deleted. Also deleted is the code that initializes gp_fastsequence with 0 as last_sequence during InsertInitialSegnoEntry(): having no tuple in gp_fastsequence already provides that behavior, so there is no need to add an entry with 0 and then update it with the next value during insert. This helps to completely delete the InsertFastSequenceEntry() interface.
When we insert into gp_fastsequence, we insert a frozen row (xmin=FrozenTransactionId) and then update the index. That creates a possibility of inconsistency in the presence of a server crash: if the WAL for the frozen insert is flushed to persistent storage but the index is not, after crash recovery we would be able to see the frozen row in a seqscan, but not in an indexscan.

We now fix this problem by doing a regular heap_insert (normal MVCC) and updating the index first; only after that do we mark the row frozen. That removes the inconsistency: if we crash before we mark the row frozen, the row follows regular MVCC and we won't be able to see it; if we crash after we mark it frozen (and the freeze made it to disk), we replay both the table row and the index, so we see the row in both seqscan and indexscan.

Note that we now use the upstream-aligned way to mark the tuple frozen, which is infomask |= HEAP_XMIN_FROZEN (commit 37484ad), instead of setting xmin = FrozenTransactionId. The freeze effect is the same, but it leaves behind a trail as to which transaction did the freezing. In addition, we use the same WAL record type as vacuum, XLOG_HEAP2_FREEZE_PAGE, so we also align with upstream in how the frozen tuple is replayed. The alternative was to use the WAL type for heap_inplace_update, but then we would need to change it to update the header portion as well as the data portion, which would be an unpleasant difference in WAL semantics.

The only other change from the previous frozen_heap_insert approach is that we no longer set XLogRecord.xl_xid to FrozenTransactionId. This should be OK because: (1) since we are no longer setting xmin=FrozenTransactionId, the WAL record remains consistent; (2) the Greenplum change that set this field appears to be outdated itself.
Upstream doesn't even have the capability to do that: it never passes an arbitrary XID to that field; it is always GetCurrentTransactionIdIfAny() (see XLogRecordAssemble()), even if you insert a tuple with xmin=FrozenTransactionId (e.g. for a PG sequence, see fill_seq_with_data()). GPDB doesn't have a reason to differ. Finally, we need to make a few adjustments in the regress test output where xmin is queried for gp_fastsequence.

Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
Relations such as temporary tables or unlogged tables should not have WAL records created.
Due to the different definitions of struct RelFileNode in PG12 and PG14, the XLogRecord length differs. See the function XLogRecordAssemble for details.
closes: #132
Change logs
As indicated by the test in the issue, the same key exists twice in gp_fastsequence. The root cause is a crash that occurred while a tuple was being inserted into gp_fastsequence: the tuple was inserted frozen, and the tuple insertion and index update were not atomic. We can reproduce the problem by killing the server after the tuple is inserted but before the index entry is written. To solve it, we insert a plain (MVCC) tuple and its index entry first, so that a server crash at that point simply triggers transaction rollback, and only after the index entry is inserted do we update the tuple to a frozen one.
Why are the changes needed?
Fix the inconsistency of index and tuple in gp_fastsequence.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Cherry-picked from GPDB.
Contributor's Checklist
Here are some reminders and checklists before/when submitting your pull request, please check them:
make installcheck
make -C src/test installcheck-cbdb-parallel