Hello again, this issue is somewhat related to #41 but happens under different conditions and at a different code location.
This time the crash happens when creating an index on a table with a large number of rows and with the lists parameter set above roughly 6500.
Reproduction steps:
CREATE TABLE embed (id integer NOT NULL, vec vector(384) NOT NULL);
Insert 1M rows into the table
SET maintenance_work_mem='16GB';
CREATE INDEX ON embed USING ivfflat (vec vector_cosine_ops) WITH (lists = 8000);
Server logs
2022-10-26 09:17:07.110 UTC [54774] STATEMENT: create index on embed using ivfflat (vec vector_cosine_ops) with (lists = 8000);
2022-10-26 09:17:07.110 UTC [54774] DEBUG: building index "embed_vec_idx" on table "embed" serially
2022-10-26 09:17:19.927 UTC [54584] DEBUG: snapshot of 1+0 running transaction ids (lsn 0/6A280188 oldest xid 773 latest complete 772 next xid 774)
2022-10-26 09:17:47.742 UTC [53734] DEBUG: reaping dead processes
2022-10-26 09:17:47.743 UTC [53734] DEBUG: server process (PID 54774) was terminated by signal 11: Segmentation fault
2022-10-26 09:17:47.743 UTC [53734] DETAIL: Failed process was running: create index on embed using ivfflat (vec vector_cosine_ops) with (lists = 8000);
2022-10-26 09:17:47.743 UTC [53734] LOG: server process (PID 54774) was terminated by signal 11: Segmentation fault
2022-10-26 09:17:47.743 UTC [53734] DETAIL: Failed process was running: create index on embed using ivfflat (vec vector_cosine_ops) with (lists = 8000);
2022-10-26 09:17:47.743 UTC [53734] LOG: terminating any other active server processes
2022-10-26 09:17:47.743 UTC [53734] DEBUG: sending SIGQUIT to process 54588
2022-10-26 09:17:47.743 UTC [53734] DEBUG: sending SIGQUIT to process 54584
2022-10-26 09:17:47.743 UTC [53734] DEBUG: sending SIGQUIT to process 54583
2022-10-26 09:17:47.743 UTC [53734] DEBUG: sending SIGQUIT to process 54585
2022-10-26 09:17:47.743 UTC [53734] DEBUG: sending SIGQUIT to process 54586
2022-10-26 09:17:47.743 UTC [53734] DEBUG: sending SIGQUIT to process 54587
2022-10-26 09:17:47.743 UTC [54587] DEBUG: writing stats file "pg_stat/global.stat"
2022-10-26 09:17:47.743 UTC [53734] DEBUG: forked new backend, pid=54840 socket=9
2022-10-26 09:17:47.744 UTC [54840] LOG: connection received: host=[local]
2022-10-26 09:17:47.744 UTC [54840] FATAL: the database system is in recovery mode
2022-10-26 09:17:47.744 UTC [54840] DEBUG: shmem_exit(1): 0 before_shmem_exit callbacks to make
2022-10-26 09:17:47.744 UTC [54840] DEBUG: shmem_exit(1): 0 on_shmem_exit callbacks to make
2022-10-26 09:17:47.744 UTC [54840] DEBUG: proc_exit(1): 1 callbacks to make
2022-10-26 09:17:47.744 UTC [54840] DEBUG: exit(1)
2022-10-26 09:17:47.744 UTC [54840] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2022-10-26 09:17:47.744 UTC [54840] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2022-10-26 09:17:47.744 UTC [54840] DEBUG: proc_exit(-1): 0 callbacks to make
2022-10-26 09:17:47.746 UTC [53734] DEBUG: reaping dead processes
2022-10-26 09:17:47.746 UTC [53734] DEBUG: server process (PID 54840) exited with exit code 1
2022-10-26 09:17:47.746 UTC [53734] DEBUG: reaping dead processes
2022-10-26 09:17:47.746 UTC [53734] DEBUG: reaping dead processes
2022-10-26 09:17:47.746 UTC [53734] DEBUG: reaping dead processes
2022-10-26 09:17:47.746 UTC [53734] DEBUG: reaping dead processes
2022-10-26 09:17:48.242 UTC [54587] DEBUG: writing stats file "pg_stat/db_16385.stat"
2022-10-26 09:17:48.243 UTC [54587] DEBUG: removing temporary stats file "pg_stat_tmp/db_16385.stat"
2022-10-26 09:17:48.243 UTC [54587] DEBUG: writing stats file "pg_stat/db_0.stat"
2022-10-26 09:17:48.243 UTC [54587] DEBUG: removing temporary stats file "pg_stat_tmp/db_0.stat"
2022-10-26 09:17:48.243 UTC [54587] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make
2022-10-26 09:17:48.243 UTC [54587] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make
2022-10-26 09:17:48.243 UTC [54587] DEBUG: proc_exit(-1): 0 callbacks to make
2022-10-26 09:17:48.245 UTC [53734] DEBUG: reaping dead processes
2022-10-26 09:17:48.245 UTC [53734] LOG: all server processes terminated; reinitializing
GDB stack trace
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f6309e36df9 in InitCenters (index=index@entry=0x7f6309ea5d38, samples=samples@entry=0x7f62dde6b050, centers=centers@entry=0x564469e56d20, lowerBound=lowerBound@entry=0x7f5fe2f62050)
at src/ivfkmeans.c:63
63 lowerBound[j * numCenters + i] = distance;
(gdb) bt
#0 0x00007f6309e36df9 in InitCenters (index=index@entry=0x7f6309ea5d38, samples=samples@entry=0x7f62dde6b050, centers=centers@entry=0x564469e56d20, lowerBound=lowerBound@entry=0x7f5fe2f62050)
at src/ivfkmeans.c:63
#1 0x00007f6309e37164 in ElkanKmeans (index=0x7f6309ea5d38, samples=0x7f62dde6b050, centers=0x564469e56d20) at src/ivfkmeans.c:254
#2 0x00007f6309e37ada in IvfflatKmeans (index=0x7f6309ea5d38, samples=<optimized out>, centers=0x564469e56d20) at src/ivfkmeans.c:513
#3 0x00007f6309e354c0 in ComputeCenters (buildstate=buildstate@entry=0x7fff8198cd30) at src/ivfbuild.c:401
#4 0x00007f6309e35f4c in BuildIndex (heap=<optimized out>, index=0x7f6309ea5d38, indexInfo=<optimized out>, buildstate=buildstate@entry=0x7fff8198cd30, forkNum=forkNum@entry=MAIN_FORKNUM) at src/ivfbuild.c:580
#5 0x00007f6309e35fc4 in ivfflatbuild (heap=<optimized out>, index=<optimized out>, indexInfo=<optimized out>) at src/ivfbuild.c:599
#6 0x0000564467c8022c in index_build (heapRelation=heapRelation@entry=0x7f6309ea5a30, indexRelation=indexRelation@entry=0x7f6309ea5d38, indexInfo=indexInfo@entry=0x564469d00f38,
isreindex=isreindex@entry=false, parallel=parallel@entry=true) at index.c:3012
#7 0x0000564467c81e06 in index_create (heapRelation=heapRelation@entry=0x7f6309ea5a30, indexRelationName=indexRelationName@entry=0x564469de0920 "embed_vec_idx", indexRelationId=40964, indexRelationId@entry=0,
parentIndexRelid=parentIndexRelid@entry=0, parentConstraintId=parentConstraintId@entry=0, relFileNode=0, indexInfo=0x564469d00f38, indexColNames=0x564469de08a8, accessMethodObjectId=16435, tableSpaceId=0,
collationObjectId=0x564469dd3360, classObjectId=0x564469dd3380, coloptions=0x564469dd33a0, reloptions=94851833923912, flags=0, constr_flags=0, allow_system_table_mods=false, is_internal=false,
constraintId=0x7fff8198d0f4) at index.c:1232
#8 0x0000564467d39c55 in DefineIndex (relationId=relationId@entry=16462, stmt=stmt@entry=0x564469b70780, indexRelationId=indexRelationId@entry=0, parentIndexId=parentIndexId@entry=0,
parentConstraintId=parentConstraintId@entry=0, is_alter_table=is_alter_table@entry=false, check_rights=true, check_not_in_use=true, skip_build=false, quiet=false) at indexcmds.c:1164
#9 0x0000564467f6d24b in ProcessUtilitySlow (pstate=pstate@entry=0x564469d00e20, pstmt=pstmt@entry=0x564469b71820,
queryString=queryString@entry=0x564469b6fa10 "create index on embed using ivfflat (vec vector_cosine_ops) with (lists = 8000);", context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0,
queryEnv=queryEnv@entry=0x0, dest=0x564469c87e28, qc=0x7fff8198d780) at utility.c:1534
#10 0x0000564467f6c6d2 in standard_ProcessUtility (pstmt=0x564469b71820, queryString=0x564469b6fa10 "create index on embed using ivfflat (vec vector_cosine_ops) with (lists = 8000);",
readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x564469c87e28, qc=0x7fff8198d780) at utility.c:1066
#11 0x0000564467f6c7bc in ProcessUtility (pstmt=pstmt@entry=0x564469b71820, queryString=<optimized out>, readOnlyTree=<optimized out>, context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=<optimized out>,
queryEnv=<optimized out>, dest=0x564469c87e28, qc=0x7fff8198d780) at utility.c:527
#12 0x0000564467f69af6 in PortalRunUtility (portal=portal@entry=0x564469bb35d0, pstmt=pstmt@entry=0x564469b71820, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false,
dest=dest@entry=0x564469c87e28, qc=qc@entry=0x7fff8198d780) at pquery.c:1155
#13 0x0000564467f69dd6 in PortalRunMulti (portal=portal@entry=0x564469bb35d0, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x564469c87e28,
altdest=altdest@entry=0x564469c87e28, qc=qc@entry=0x7fff8198d780) at pquery.c:1312
#14 0x0000564467f6a1a3 in PortalRun (portal=portal@entry=0x564469bb35d0, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x564469c87e28,
altdest=altdest@entry=0x564469c87e28, qc=0x7fff8198d780) at pquery.c:788
#15 0x0000564467f66084 in exec_simple_query (query_string=query_string@entry=0x564469b6fa10 "create index on embed using ivfflat (vec vector_cosine_ops) with (lists = 8000);") at postgres.c:1213
#16 0x0000564467f682a4 in PostgresMain (argc=argc@entry=1, argv=argv@entry=0x7fff8198d980, dbname=<optimized out>, username=<optimized out>) at postgres.c:4496
#17 0x0000564467ec5e46 in BackendRun (port=port@entry=0x564469b98a70) at postmaster.c:4530
#18 0x0000564467ec8318 in BackendStartup (port=port@entry=0x564469b98a70) at postmaster.c:4252
#19 0x0000564467ec8565 in ServerLoop () at postmaster.c:1745
#20 0x0000564467ec9b9e in PostmasterMain (argc=argc@entry=5, argv=argv@entry=0x564469b691b0) at postmaster.c:1417
#21 0x0000564467e0aa52 in main (argc=5, argv=0x564469b691b0) at main.c:209
Hey @ArthurMelin, thanks for the great reporting and debugging! Just pushed a fix.
A few notes to self:
lowerBound indexing overflows a 32-bit int at ~6500 lists since 6500 * (6500 * 50) is around 2^31 (centers * (centers * samples per center))
halfcdist indexing does not overflow since 32768 * 32768 is less than 2^31 (IVFFLAT_MAX_LISTS * IVFFLAT_MAX_LISTS), but could if IVFFLAT_MAX_LISTS is increased
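The arithmetic in these notes can be checked with a small standalone C sketch. It is not the fix that was pushed and not pgvector code: the function names and the SAMPLES_PER_CENTER macro are hypothetical, and the value 50 is taken from the note above. It only shows when a flattened samples x centers index stops fitting in a 32-bit int, and how widening to 64 bits before the multiply avoids the wrap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption taken from the note above: 50 sampled rows per list. */
#define SAMPLES_PER_CENTER 50

/* Number of elements in a samples x centers matrix, computed in 64 bits. */
int64_t
matrix_elements(int num_samples, int num_centers)
{
	assert(num_samples >= 0 && num_centers >= 0);
	return (int64_t) num_samples * num_centers;
}

/* True if the largest index (elements - 1) no longer fits in a 32-bit int. */
bool
int_index_overflows(int num_samples, int num_centers)
{
	return matrix_elements(num_samples, num_centers) - 1 > INT32_MAX;
}

/* Row-major index for sample j and center i, widened before the multiply. */
int64_t
flat_index(int j, int i, int num_centers)
{
	return (int64_t) j * num_centers + i;
}
```

Under these assumptions, lists = 8000 gives 8000 * SAMPLES_PER_CENTER = 400,000 samples and 3.2 billion lowerBound entries, well past 2^31, while a 32768 x 32768 halfcdist matrix has exactly 2^30 entries and still fits. The exact crossover for lowerBound is between 6553 and 6554 lists, consistent with the "~6500" observed in the report.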
Versions:
PostgreSQL 14.5
pgvector v0.3.0 (379a760)