Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

segfault with create index statement (btree paritial jsonb column) #9022

Closed
mark-a-s opened this issue Nov 8, 2019 · 9 comments
Closed

segfault with create index statement (btree paritial jsonb column) #9022

mark-a-s opened this issue Nov 8, 2019 · 9 comments

Comments

@mark-a-s
Copy link

mark-a-s commented Nov 8, 2019

Greenplum version or build

6.1.0

OS version and uname -a

Linux gpdb-sn.openstacklocal 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 17:46:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

autoconf options used ( config.status --config )

N/A

Installation information ( pg_config )

N/A

Expected behavior

create index statement should NOT cause a panic.

 create index on session using btree(journey_id) WHERE (session_attributes->>'ivr.duration')::int > 1000 ;  
CREATE INDEX

Actual behavior

segfault

create index on session using btree(journey_Id) WHERE (session_attributes->>'ivr.duration')::int > 1000 ; 
ERROR:  Error on receive from seg1 127.0.0.1:6001 pid=22795: server closed the connection unexpectedly
DETAIL:  
	This probably means the server terminated abnormally
	before or while processing the request.

segment log.

2019-11-08 16:40:27.530192 EST,,,p28894,th0,,,2019-11-08 16:40:27 EST,0,con44,cmd1,seg5,,,,,"PANIC","XX000","Unexpected internal error: Segment process received signal SIGSEGV",,,,,,,0,,,,"1    0x7f9866de75f0 libpthread.so.0 <symbol not found> + 0x66de75f0
2    0x6cfcee postgres <symbol not found> (execQual.c:2162)
3    0x6cd3b0 postgres <symbol not found> (execQual.c:5114)
4    0x6cfc91 postgres <symbol not found> (execQual.c:2135)
5    0x6d537e postgres ExecQual (execQual.c:6237)
6    0x5a22a9 postgres <symbol not found> (index.c:3097)
7    0x5a3f11 postgres IndexBuildScan (index.c:2496)
8    0x521fa3 postgres btbuild (nbtree.c:130)
9    0x9fc9db postgres OidFunctionCall3Coll (fmgr.c:1669)
10   0x5a29c9 postgres index_build (index.c:2291)
11   0x5a37fa postgres index_create (index.c:1248)
12   0x665da5 postgres DefineIndex (indexcmds.c:823)
13   0x8926f1 postgres <symbol not found> (utility.c:1512)
14   0x891565 postgres standard_ProcessUtility (utility.c:1043)
15   0x88e161 postgres <symbol not found> (palloc.h:158)
16   0x88ec5e postgres <symbol not found> (pquery.c:1510)
17   0x890337 postgres PortalRun (pquery.c:1020)
18   0x88888e postgres <symbol not found> (postgres.c:1377)
19   0x88d4c9 postgres PostgresMain (postgres.c:5361)
"
2019-11-08 16:40:27.530321 EST,,,p27376,th1786894464,,,,0,,,seg5,,,,,"LOG","00000","server process (PID 28894) was terminated by signal 11: Segmentation fault","Failed process was running: create index on session using btree(journey_Id) WHERE (session_attributes->>'ivr.duration')::int > 1000 ;",,,,,,0,,"postmaster.c",3962,
2019-11-08 16:40:27.530341 EST,,,p27376,th1786894464,,,,0,,,seg5,,,,,"LOG","00000","terminating any other

Step to reproduce the behavior

create table with a jsonb column. create a btree index on that table for a column and filter where the jsonb is being type cast.

create index on session using btree(journey_Id) WHERE (session_attributes->>'ivr.duration')::int > 1000 ; ```

@gaos1
Copy link

gaos1 commented Nov 11, 2019

i cannot reproduce on empty table with 6.1. it might be related to some of your table data. can you help to identify the data to trigger the issue and share them?

@mark-a-s
Copy link
Author

I guess this could be data related, having some trouble reproducing it with consistency or a small test case.

I'm still trying narrow it down some, unsure if this will help.

create table dataset_13_13.test_core (myid uuid, jdoc jsonb) WITH (
appendonly=true,
orientation=column,
compresstype=zstd,
compresslevel=2
)
DISTRIBUTED RANDOMLY;

CREATE TABLE
insert into dataset_13_13.test_core (SELECT * FROM dataset_13_13.ms_visitor_attribute order by visitor_id );
INSERT 0 2698527
create index on dataset_13_13.test_core using btree(myid) WHERE jdoc->>'seg.elctr_billg_enrlmt_cd'='paper statement' ;
CREATE INDEX
drop index test_core_myid_idx ;
DROP INDEX
analyze dataset_13_13.test_core ;
ANALYZE
create index on dataset_13_13.test_core using btree(myid) WHERE jdoc->>'seg.elctr_billg_enrlmt_cd'='paper statement' ;

ERROR:  Error on receive from seg5 127.0.0.1:6005 pid=16797: server closed the connection unexpectedly
DETAIL:  
	This probably means the server terminated abnormally
	before or while processing the request.

I don't really understand why an analyze would affect if this gets a panic.

@mark-a-s
Copy link
Author

mark-a-s commented Nov 11, 2019

yeah I don't fully understand why this is intermittent.

but this just some json data I found on the internet that I can reproduce with.

mmk_fox_ddb=> create table dataset_13_13.test_core (myid int, jdoc jsonb) WITH ( appendonly=true, orientation=column, compresstype=zstd, compresslevel=2 ) DISTRIBUTED RANDOMLY; CREATE TABLE

INSERT INTO test_core VALUES (1, '{"title": "Sleeping Beauties", "genres": ["Fiction", "Thriller", "Horror"], "published": false}'); INSERT INTO test_core VALUES (2, '{"title": "Influence", "genres": ["Marketing & Sales", "Self-Help ", "Psychology"], "published": true}'); INSERT INTO test_core VALUES (3, '{"title": "The Dictator''s Handbook", "genres": ["Law", "Politics"], "authors": ["Bruce Bueno de Mesquita", "Alastair Smith"], "published": true}'); INSERT INTO test_core VALUES (4, '{"title": "Deep Work", "genres": ["Productivity", "Reference"], "published": true}'); INSERT INTO test_core VALUES (5, '{"title": "Siddhartha", "genres": ["Fiction", "Spirituality"], "published": true}');
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
analyze test_core
ANALYZE
create index on dataset_13_13.test_core using btree(myid) WHERE jdoc->>'title'='Sleeping Beauties' ;
CREATE INDEX
drop index test_core_myid_idx ;
DROP INDEX
analyze test_core ;
ANALYZE
create index on dataset_13_13.test_core using btree(myid) WHERE jdoc->>'title'='Sleeping Beauties' ;

ERROR: Error on receive from seg6 127.0.0.1:6006 pid=24743: server closed the connection unexpectedly DETAIL: This probably means the server terminated abnormally before or while processing the request.

SELECT gp_segment_id, * FROM dataset_13_13.test_core ;

gp_segment_id | myid | jdoc
---------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------
4 | 4 | {"title": "Deep Work", "genres": ["Productivity", "Reference"], "published": true}
3 | 3 | {"title": "The Dictator's Handbook", "genres": ["Law", "Politics"], "authors": ["Bruce Bueno de Mesquita", "Alastair Smith"], "published": true}
7 | 5 | {"title": "Siddhartha", "genres": ["Fiction", "Spirituality"], "published": true}
6 | 1 | {"title": "Sleeping Beauties", "genres": ["Fiction", "Thriller", "Horror"], "published": false}
6 | 2 | {"title": "Influence", "genres": ["Marketing & Sales", "Self-Help ", "Psychology"], "published": true}
(5 rows)

@remotefish
Copy link
Contributor

Are you still able to reproduce the issue? If yes is it possible for you to upload the coredump? You could use the packcore utility to collect the core and related libraries.

@mark-a-s
Copy link
Author

i'll go see about getting a core.

@mark-a-s
Copy link
Author

I've uploaded the cores to the commercial support side now that I have access again.

@hlinnaka
Copy link
Contributor

I did some debugging on the core file, and was able to track this down to how we build the list columns that need to be scanned in IndexBuildAppendOnlyColScan(). It doesn't take into account columns that are used in the WHERE clause, so if there is a column that's only used in the WHERE clause, it's not fetched. In this particular case, that lead to a crash because the corresponding field in the "slot" to hold each row was left uninitialized. The first time an index is created on an AOCO table, it always scans all columns to build the so-called "AO block directory", but on subsequent indexes on the same table, the bug arises.

Here's a simpler repro:

create table aocol (id int4, t text) with (appendonly=true, orientation=column);
insert into aocol values (1, 'foobar');

-- Create an index on the table. This is the first index on the table, so it
-- creates the ao block directory, and scans all columns.
create index on aocol (id);

-- Create another index. This index needs to scan both columns, because both
-- columns are used by the index; one as the index column, and the other in
-- the expression.
create index myidx on aocol (id) WHERE t like 'foo%';

-- Check that the row is found using the partial index.
set optimizer=off;
set enable_seqscan=off;
select * from aocol where t like 'foo%';

It doesn't lead to a crash, at least on my laptop on a debugging build with compiler optimizations disabled, but the SELECT returns incorrect result. It should return the one row in the table, but it returns 0 rows.

Thanks for the report, I'll open a PR to fix this soon.

@mark-a-s
Copy link
Author

mark-a-s commented Nov 22, 2019 via email

hlinnaka added a commit that referenced this issue Nov 24, 2019
Fixes github issue #9022

Reviewed-by: Asim R P <apraveen@pivotal.io>
hlinnaka added a commit that referenced this issue Nov 24, 2019
Fixes github issue #9022

Reviewed-by: Asim R P <apraveen@pivotal.io>
hlinnaka added a commit that referenced this issue Nov 24, 2019
Fixes github issue #9022

Reviewed-by: Asim R P <apraveen@pivotal.io>
@hlinnaka
Copy link
Contributor

Fixed master, 6X_STABLE and 5X_STABLE. Thanks for the report!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants