Fix bugs which could not create unique index "gp_fastsequence_objid_objmod_index" #146
Conversation
Hi @lss602726449, thank you for your efforts! Please revise both your pull request and your commit message's title and body so they are thorough and easy to understand. You can follow our commit conventions as a guide.
What is the root cause?
Maybe the root cause is getting the wrong snapshot during the reindex, so that it sees two versions of the same tuple while building the unique index, and the error occurs. Since we cannot reproduce it on a local machine, we are trying to reproduce it in the CI pipeline.
gp_fastsequence is populated and kept updated during inserts. There is no need to write/update gp_fastsequence during index builds when the block directory needs to be created. This code seems to have been added way back in history when gp_fastsequence was introduced, to cover the upgrades required to add/update the entry in gp_fastsequence. No such need exists any more, hence this code is deleted. Also deleted is the code that initializes gp_fastsequence with 0 as last_sequence during InsertInitialSegnoEntry(): having no tuple in gp_fastsequence already provides that behavior, so there is no need to add an entry with 0 and then update it with the next value during insert. This helps to completely delete the InsertFastSequenceEntry() interface.
When we insert into gp_fastsequence, we insert a frozen row (xmin=FrozenTransactionId) and then update the index. That creates a possibility of inconsistency in the presence of a server crash: if the WAL for the frozen insert is flushed to persistent storage but the index is not, after crash recovery we would be able to see the frozen row in a seqscan, but not in an indexscan.

We now fix this problem by doing a regular heap_insert (normal MVCC) and updating the index first; only after that do we mark the row frozen. That removes the inconsistency: if we crash before we mark the row frozen, the row follows regular MVCC and we won't be able to see it; if we crash after we mark it frozen (and the freeze made it to disk), we replay both the table row and the index, so we see the row in both seqscan and indexscan.

Note that we now use the upstream-aligned way to mark the tuple frozen, which is infomask |= HEAP_XMIN_FROZEN (commit 37484ad), instead of setting xmin = FrozenTransactionId. The freeze effect is the same, but it leaves behind a trail as to which transaction did the freezing. In addition, we use the same WAL record type as vacuum, XLOG_HEAP2_FREEZE_PAGE, so we also align with upstream in how the frozen tuple is replayed. The alternative was to use the WAL type for heap_inplace_update, but then we would need to change it to update the header portion as well as the data portion, which would be an unpleasant difference in WAL semantics.

The only other change from the previous frozen_heap_insert approach is that we no longer set XLogRecord.xl_xid to FrozenTransactionId. This should be OK because: (1) since we are no longer setting xmin=FrozenTransactionId, the WAL record remains consistent; (2) the Greenplum change that set this field appears to be outdated itself.
Upstream doesn't even have the capability to do that: it never passes an arbitrary XID to that field; it is always GetCurrentTransactionIdIfAny() (see XLogRecordAssemble()), even if you insert a tuple with xmin=FrozenTransactionId (e.g. for a PG sequence, see fill_seq_with_data()). GPDB doesn't have a reason to differ. Finally, we need to make a few adjustments in the regress test output where xmin is queried for gp_fastsequence.

Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
Relations such as temporary tables or unlogged tables should not have WAL records created.
Due to the different definitions of struct RelFileNode in PG12 and PG14, the XLogRecord length differs. See the function XLogRecordAssemble for details.
closes: #132
Change logs
As indicated by the test in the issue, the same key exists twice in gp_fastsequence. The root cause is a crash that occurred while a tuple was being inserted into gp_fastsequence: the tuple was inserted frozen, and the tuple insertion and index update were not atomic. We can reproduce the problem by killing the server after the tuple is inserted but before the index entry is written. To solve it, we insert a plain (MVCC) tuple and its index entry first, so that a server crash at that point simply triggers transaction rollback, and only after the index entry is inserted do we update the tuple to a frozen one.
Why are the changes needed?
Fix the inconsistency of index and tuple in gp_fastsequence.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Cherry-picked from GPDB.
Contributor's Checklist
Here are some reminders and checklists before/when submitting your pull request, please check them:
make installcheck
make -C src/test installcheck-cbdb-parallel