duplicate gp_fastsequence value (ctid) appears in ao table #13699

Closed
ppggff opened this issue Jun 20, 2022 · 2 comments · Fixed by #13762

@ppggff
Contributor

ppggff commented Jun 20, 2022

Bug Report

Greenplum version or build

master 59eb923

OS version and uname -a

Linux 3.10.0-1160.49.1.el7.x86_64 #1 SMP Tue Nov 30 15:51:32 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

autoconf options used ( config.status --config )

Installation information ( pg_config )

Expected behavior

no duplicate gp_fastsequence value (ctid) appears in ao table

Actual behavior

duplicate gp_fastsequence value (ctid) appears in ao table

Step to reproduce the behavior

begin;
create table a(a int, b int) with (appendonly = true) distributed by (a);
--create index ia on a(b);
insert into a values(1,1);
insert into a values(1,1);
insert into a values(1,1);
savepoint s1;
truncate a;
insert into a values(1,1);
rollback to s1;
insert into a values(1,2);

postgres=# select ctid,* from a;
  ctid   | a | b
---------+---+---
 (0,2)   | 1 | 1
 (0,102) | 1 | 1
 (0,202) | 1 | 1
 (0,102) | 1 | 2
(4 rows)

panic with index

begin;
create table a(a int, b int) with (appendonly = true) distributed by (a);
create index ia on a(b);
insert into a values(1,1);
insert into a values(1,1);
insert into a values(1,1);
savepoint s1;
truncate a;
insert into a values(1,1);
rollback to s1;
insert into a values(1,2);

ERROR:  Unexpected internal error (assert.c:44)  (seg1 127.0.1.1:7003 pid=22234) (assert.c:44)
DETAIL:  FailedAssertion("!(entry->firstRowNum < firstRowNum)", File: "appendonlyblockdirectory.c", Line: 747)
@ppggff ppggff closed this as completed Jun 20, 2022
@ppggff ppggff changed the title duplicate gp_fastsequence value (ctid) appears in ao table duplicate gp_fastsequence value (ctid) appears in ao table in one transaction Jun 20, 2022
@ppggff ppggff reopened this Jun 20, 2022
@ppggff ppggff changed the title duplicate gp_fastsequence value (ctid) appears in ao table in one transaction duplicate gp_fastsequence value (ctid) appears in ao table Jun 21, 2022
@SmartKeyerror
Contributor

RCA:

When TRUNCATE TABLE runs in a subtransaction, Greenplum cannot delete the data directly, because the subtransaction could then not be rolled back. Instead it creates a new, empty storage file for the relation and assigns it as the relfilenode value. The old storage file is scheduled for deletion at commit.

For an AO table, however, the relfilenode change should not carry over to gp_fastsequence, because an AO table relies on last_sequence to generate the firstRowNumber of each new file block.

firstRowNumber must keep incrementing in all cases, but that guarantee breaks when the table is truncated in a subtransaction.
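The failure mode can be sketched with a toy model. This is only an illustration, not Greenplum code: `AoTableModel`, `BATCH`, and `reproduce` are hypothetical names, the batch size of 100 is inferred from the `last_sequence` values shown in this thread, and the truncate is modelled as simply restarting the counter at 0.

```python
from dataclasses import dataclass, field

# Assumption: each insert reserves 100 row numbers, as the
# last_sequence values 100/200/300 in this thread suggest.
BATCH = 100


@dataclass
class AoTableModel:
    """Toy model: a counter plus the first row number of each visible block."""
    last_sequence: int = 0
    rows: list = field(default_factory=list)

    def insert(self) -> int:
        # The new block starts right after the last reserved row number.
        first_row_num = self.last_sequence + 1
        self.last_sequence += BATCH
        self.rows.append(first_row_num)
        return first_row_num

    def truncate(self) -> list:
        # New empty storage; the counter restarts as well.
        saved_rows = list(self.rows)
        self.rows.clear()
        self.last_sequence = 0
        return saved_rows

    def rollback_to(self, saved_rows: list) -> None:
        # The data comes back, but last_sequence is NOT restored.
        # That asymmetry is the bug being modelled.
        self.rows[:] = saved_rows


def reproduce() -> list:
    t = AoTableModel()
    for _ in range(3):
        t.insert()
    saved = t.truncate()      # truncate inside the savepoint
    t.insert()
    t.rollback_to(saved)      # rollback to s1
    t.insert()                # reuses an already visible row number
    return t.rows


if __name__ == "__main__":
    print(reproduce())
```

In this model the last insert starts at row number 101, which is already visible after the rollback. That lines up with the duplicated `(0,102)` in the reproduction, if the ctid offset is taken to be the row number plus one (again inferred from the output, not confirmed here).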

@ashwinstar
Contributor

ashwinstar commented Jul 5, 2022

Thanks for reporting. The problem is worse on 6X_STABLE, where a truncate even outside the transaction that created the table produces duplicate CTIDs and hence incorrect indexes.

drop table if exists ao;
DROP TABLE
create table ao(a int, b int) with (appendonly=true) distributed by (a);
CREATE TABLE
--create index idx on ao(b);
insert into ao values(1,1);
INSERT 0 1
select gp_segment_id, xmin, xmax, * from gp_dist_random('gp_fastsequence') where gp_segment_id=1 and objid in (select segrelid from pg_appendonly where relid = 'ao'::regclass);
 gp_segment_id | xmin | xmax | objid | objmod | last_sequence 
---------------+------+------+-------+--------+---------------
             1 |  732 |    0 | 24599 |      1 |           100
             1 |  732 |    0 | 24599 |      0 |             0
(2 rows)

insert into ao values(1,1);
INSERT 0 1
insert into ao values(1,1);
INSERT 0 1
select gp_segment_id, xmin, xmax, * from gp_dist_random('gp_fastsequence') where gp_segment_id=1 and objid in (select segrelid from pg_appendonly where relid = 'ao'::regclass);
 gp_segment_id | xmin | xmax | objid | objmod | last_sequence 
---------------+------+------+-------+--------+---------------
             1 |  732 |    0 | 24599 |      1 |           300
             1 |  732 |    0 | 24599 |      0 |             0
(2 rows)

select ctid, * from ao;
      ctid      | a | b 
----------------+---+---
 (33554432,2)   | 1 | 1
 (33554432,102) | 1 | 1
 (33554432,202) | 1 | 1
(3 rows)

begin;
BEGIN
truncate table ao;
TRUNCATE TABLE
insert into ao values(1,2);
INSERT 0 1
select gp_segment_id, xmin, xmax, * from gp_dist_random('gp_fastsequence') where gp_segment_id=1 and objid in (select segrelid from pg_appendonly where relid = 'ao'::regclass);
 gp_segment_id | xmin | xmax | objid | objmod | last_sequence 
---------------+------+------+-------+--------+---------------
             1 |  732 |    0 | 24599 |      1 |           100
             1 |  732 |    0 | 24599 |      0 |             0
(2 rows)

abort;
ROLLBACK
select gp_segment_id, xmin, xmax from gp_dist_random('gp_fastsequence') where gp_segment_id=1 and objid in (select segrelid from pg_appendonly where relid = 'ao'::regclass);
 gp_segment_id | xmin | xmax 
---------------+------+------
             1 |  732 |    0
             1 |  732 |    0
(2 rows)

insert into ao values(1,3);
INSERT 0 1
select gp_segment_id, xmin, xmax, * from gp_dist_random('gp_fastsequence') where gp_segment_id=1 and objid in (select segrelid from pg_appendonly where relid = 'ao'::regclass);
 gp_segment_id | xmin | xmax | objid | objmod | last_sequence 
---------------+------+------+-------+--------+---------------
             1 |  732 |    0 | 24599 |      1 |           200
             1 |  732 |    0 | 24599 |      0 |             0
(2 rows)

select ctid, gp_segment_id,* from ao order by ctid;
      ctid      | gp_segment_id | a | b 
----------------+---------------+---+---
 (33554432,2)   |             1 | 1 | 1
 (33554432,102) |             1 | 1 | 3  <<====================
 (33554432,102) |             1 | 1 | 1
 (33554432,202) |             1 | 1 | 1
(4 rows)

Hence the fix in #13762 becomes much more important.

The reason it is more serious for 6X_STABLE is that on the master branch, inserts after a truncate use only segfile 0, due to the code in choose_segno_internal() and, more importantly, ShouldUseReservedSegno(). That is not the case for 6X_STABLE, and it does not need to be. ShouldUseReservedSegno() makes its decision based on the xmin of the table's pg_class row; truncate modifies that row, which results in segfile 0 being picked.
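The segfile difference is also visible in the CTIDs themselves: the 6X_STABLE output above shows block number 33554432, which equals 1 << 25, while the master reproduction shows block number 0. A minimal sketch for reading that off, assuming the segment file number occupies the bits above bit 25 of the CTID block number. This layout is inferred from the outputs in this thread rather than stated in it, and `SEGNO_SHIFT` and `segfile_of` are hypothetical names:

```python
# Assumption (inferred from the outputs in this thread): the segment file
# number is stored in the high bits of the CTID block number, above bit 25.
SEGNO_SHIFT = 25


def segfile_of(block_number: int) -> int:
    """Return the segment file number implied by a CTID block number."""
    return block_number >> SEGNO_SHIFT


if __name__ == "__main__":
    # Block numbers taken from the two reproductions in this issue.
    for block in (0, 33554432):
        print(block, "-> segfile", segfile_of(block))
```

Under that assumption the master rows sit in segfile 0 and the 6X_STABLE rows in segfile 1, which matches the explanation above.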

@ashwinstar ashwinstar added the Priority 1 Critical issue that need immediate action label Jul 6, 2022
SmartKeyerror added a commit that referenced this issue Aug 8, 2022
This is to fix #13699.

As we know, an AO/AOCO table relies on last_sequence to generate the first row number
of each new var block; this value is stored in the heap table gp_fastsequence.

Whenever we generate a new variable-length storage block, we need to obtain a contiguous row
number from the gp_fastsequence table to use as the first row number of the var block, and also as the
unique identifier of the tuple when creating an index.

So row numbers must never repeat, and a deleted tuple's row number must never be reused. Therefore,
Greenplum must ensure that row numbers are unique and always increase.

But a subtransaction breaks this:

begin;
create table t (a int, b int) with (appendonly = true);
insert into t values (1, 1);
savepoint sub;
truncate table t;
insert into t values (2, 2);
rollback to sub;
abort;

After truncate table t; finishes, the table's auxiliary tables, such as pg_aoseg.pg_aoseg_<OID> and
the visimap, have new relation file nodes. This leads to a new last_sequence being generated in
gp_fastsequence on the next INSERT, and that change cannot be rolled back.

So to fix this problem, gp_fastsequence must be able to roll back within a transaction, like the other
auxiliary tables.
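The idea of a sequence that rolls back together with its table can be sketched as follows. This is not the patch in #13762, only a toy illustration of the intended behavior: `TransactionalSequence` and its methods are hypothetical names, and the batch size of 100 is inferred from the outputs in this issue.

```python
class TransactionalSequence:
    """Toy counter whose value is restored when a savepoint is rolled back."""

    def __init__(self) -> None:
        self.last_sequence = 0
        self._savepoints = {}  # savepoint name -> last_sequence at that point

    def reserve(self, count: int = 100) -> int:
        # Hand out the next contiguous range; return its first row number.
        first_row_num = self.last_sequence + 1
        self.last_sequence += count
        return first_row_num

    def savepoint(self, name: str) -> None:
        self._savepoints[name] = self.last_sequence

    def reset(self) -> None:
        # Models truncate: the new, empty storage starts counting from scratch.
        self.last_sequence = 0

    def rollback_to(self, name: str) -> None:
        # Undo the reset along with the rest of the subtransaction.
        self.last_sequence = self._savepoints[name]


def demo() -> list:
    seq = TransactionalSequence()
    firsts = [seq.reserve() for _ in range(3)]  # three inserts
    seq.savepoint("sub")
    seq.reset()                                 # truncate in the subtransaction
    seq.reserve()                               # insert that will be rolled back
    seq.rollback_to("sub")
    firsts.append(seq.reserve())                # insert after the rollback
    return firsts


if __name__ == "__main__":
    print(demo())
```

Because the counter returns to its pre-truncate value on rollback, the insert that follows gets a fresh row number instead of reusing one that is visible again.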