Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grn_io_lock failed / Deadlock #48

Open
peter-schmitz opened this issue Jul 7, 2017 · 32 comments
Open

grn_io_lock failed / Deadlock #48

peter-schmitz opened this issue Jul 7, 2017 · 32 comments

Comments

@peter-schmitz
Copy link

peter-schmitz commented Jul 7, 2017

Hi,

PGroonga probably caused a deadlock. I believe its strongly connected to (P)Groonga, because the Deadlock happened again (or still?) after restarting postgres a few hours later ( sudo service postgresql restart) and running the same transaction.
After dropping the PGroonga indices, this transaction went through without problems.

We had the same problem on another machine a few month ago too, but with PGroonga 1.1.9 (as

Postgresql 9.5.4 on Ubuntu 14.04 LTS / Trusty
PGroonga 1.2.1 & libgroonga0 7.0.3

Please ask if you need any more information.

Indizes:

create index zga on sucheintrag using pgroonga(volltextgeraeta) WITH (tokenizer='TokenTrigram', normalizer='');
create index zva on sucheintrag using pgroonga(volltexta) WITH (tokenizer='TokenTrigram', normalizer='');

postgresql.log (excerpt)

2017-07-06 04:25:38 CEST [25119-1] postgres@kris LOG:  duration: 14564.392 ms  statement: vacuum analyze verbose public.artikel
2017-07-06 04:25:49 CEST [27458-1] postgres@kris LOG:  duration: 10906.223 ms  statement: vacuum analyze verbose public.stueckliste
2017-07-06 04:26:05 CEST [27672-1] postgres@kris LOG:  duration: 15643.526 ms  statement: vacuum analyze verbose public.produkt
*** Error in `postgres: postgres kris [local] VACUUM': munmap_chunk(): invalid pointer: 0x000055c29292d250 ***
...
2017-07-06 04:48:44 CEST [22593-94] LOG:  server process (PID 28253) was terminated by signal 6: Aborted
2017-07-06 04:48:44 CEST [22593-95] DETAIL:  Failed process was running: vacuum analyze verbose public.sucheintrag
2017-07-06 04:48:44 CEST [22593-96] LOG:  terminating any other active server processes
2017-07-06 04:48:44 CEST [17103-1] postgres@kris WARNING:  terminating connection because of crash of another server process
2017-07-06 04:48:44 CEST [17103-2] postgres@kris DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
... (repeated)
2017-07-06 04:48:44 CEST [15112-2] WARNING:  terminating connection because of crash of another server process
2017-07-06 04:48:44 CEST [15112-3] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-07-06 04:48:44 CEST [15112-4] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-07-06 04:48:44 CEST [24489-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:44 CEST [24500-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:44 CEST [24510-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:44 CEST [24520-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:44 CEST [24530-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:44 CEST [22593-97] LOG:  all server processes terminated; reinitializing
2017-07-06 04:48:45 CEST [24570-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:45 CEST [24569-1] LOG:  database system was interrupted; last known up at 2017-07-06 04:41:47 CEST
2017-07-06 04:48:46 CEST [24573-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:46 CEST [24575-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:46 CEST [24577-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:47 CEST [24595-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:49 CEST [24596-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:50 CEST [24597-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:52 CEST [24569-2] LOG:  database system was not properly shut down; automatic recovery in progress
2017-07-06 04:48:52 CEST [24569-3] LOG:  redo starts at 169/672A6CB8
2017-07-06 04:48:52 CEST [24569-4] LOG:  invalid record length at 169/672DB9D8
2017-07-06 04:48:52 CEST [24569-5] LOG:  redo done at 169/672DB9B0
2017-07-06 04:48:52 CEST [24569-6] LOG:  last completed transaction was at log time 2017-07-06 04:47:17.730385+02
2017-07-06 04:48:52 CEST [24569-7] LOG:  MultiXact member wraparound protections are now enabled
2017-07-06 04:48:53 CEST [22593-98] LOG:  database system is ready to accept connections
2017-07-06 04:48:53 CEST [24603-1] LOG:  autovacuum launcher started
... (the following appears every 21 Minutes in the log)
2017-07-06 05:09:54 CEST [30644-1] postgres@kris ERROR:  pgroonga: failed to set column value: grn_io_lock failed
2017-07-06 05:09:54 CEST [30644-2] postgres@kris STATEMENT:  UPDATE "sucheintrag" SET "crea"=$1,"lmod"=$2,"ldel"=$3,"bot"=$4,"mandant"=$5,"volltextn"=$6,"volltexta"=$7,"volltextgeraetn"=$8,"volltextgeraeta"=$9,"suchklasse"=$10,"dirtyvollt
ext"=$11,"produkt"=$12 WHERE "id"=$13

pgroonga.log (default log level, excerpt):

2017-07-06 04:13:29.282776|n|31354: grn_init: <7.0.3>
2017-07-06 04:13:29.282933|n|31354: pgroonga: initialize: <1.2.1>
2017-07-06 04:16:26.250258|n|31354: grn_fin (0)
2017-07-06 04:20:27.747063|n|30326: grn_init: <7.0.3>
2017-07-06 04:20:27.747214|n|30326: pgroonga: initialize: <1.2.1>
2017-07-06 04:23:27.452161|n|30326: grn_fin (0)
2017-07-06 04:26:05.430571|n|28253: grn_init: <7.0.3>
2017-07-06 04:26:05.430721|n|28253: pgroonga: initialize: <1.2.1>
2017-07-06 04:27:32.852909|n|13483: grn_init: <7.0.3>
2017-07-06 04:27:32.853081|n|13483: pgroonga: initialize: <1.2.1>
2017-07-06 04:28:26.695134|n|13483: grn_fin (0)
2017-07-06 04:34:46.642268|n|19672: grn_init: <7.0.3>
2017-07-06 04:34:46.642419|n|19672: pgroonga: initialize: <1.2.1>
2017-07-06 04:34:49.945606|n|19672: io(base/721070/pgrn.000011D) collisions(1000/10): lock failed 1000 times
2017-07-06 04:36:28.040982|n|21138: grn_init: <7.0.3>
2017-07-06 04:36:28.041150|n|21138: pgroonga: initialize: <1.2.1>
2017-07-06 04:37:41.835086|n|21170: grn_init: <7.0.3>
2017-07-06 04:37:41.835249|n|21170: pgroonga: initialize: <1.2.1>
2017-07-06 04:40:49.826158|n|21138: grn_fin (0)
2017-07-06 04:40:54.061475|n|21170: grn_fin (0)
2017-07-06 04:46:53.549713|n|11897: grn_init: <7.0.3>
2017-07-06 04:46:53.549872|n|11897: pgroonga: initialize: <1.2.1>
2017-07-06 04:47:16.739324|n|17103: grn_init: <7.0.3>
2017-07-06 04:47:16.739483|n|17103: pgroonga: initialize: <1.2.1>
2017-07-06 04:48:38.585298|n|11897: grn_fin (0)
2017-07-06 04:53:57.987123|n|30644: grn_init: <7.0.3>
2017-07-06 04:53:57.987280|n|30644: pgroonga: initialize: <1.2.1>
2017-07-06 04:53:59.548606|n|30644: io(base/721070/pgrn.000011D) collisions(1000/6): lock failed 1000 times
2017-07-06 04:57:26.163223|n|827: grn_init: <7.0.3>
2017-07-06 04:57:26.163384|n|827: pgroonga: initialize: <1.2.1>
2017-07-06 04:58:46.705532|n|827: grn_fin (0)
2017-07-06 05:03:37.361773|n|19505: grn_init: <7.0.3>
2017-07-06 05:03:37.361932|n|19505: pgroonga: initialize: <1.2.1>
2017-07-06 05:03:50.407183|n|19505: grn_fin (0)
2017-07-06 05:04:29.389968|n|5995: grn_init: <7.0.3>
2017-07-06 05:04:29.390124|n|5995: pgroonga: initialize: <1.2.1>
2017-07-06 05:08:27.088960|n|5995: grn_fin (0)
2017-07-06 05:09:54.252436|w|30644: [DB Locked] time out(900000): io(base/721070/pgrn.000011D) collisions(900000/6)
2017-07-06 05:09:54.252470|e|30644: grn_io_lock failed
2017-07-06 05:09:54.284455|e|30644: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_io_lock+0x28e) [0x7fb2b15a970e]
2017-07-06 05:09:54.284467|e|30644: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ii_column_update+0x15c) [0x7fb2b158c3ec]
2017-07-06 05:09:54.284471|e|30644: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_default_set_value_hook+0x81) [0x7fb2b1466ca1]
2017-07-06 05:09:54.284474|e|30644: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_set_value+0x1593) [0x7fb2b1479583]
2017-07-06 05:09:54.284477|e|30644: /usr/lib/postgresql/9.5/lib/pgroonga.so(+0xa0d9) [0x7fb2b1c520d9]
2017-07-06 05:09:54.284479|e|30644: /usr/lib/postgresql/9.5/lib/pgroonga.so(pgroonga_insert+0x5d) [0x7fb2b1c57cbd]
2017-07-06 05:09:54.284482|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(FunctionCall6Coll+0xa6) [0x55c2911f6336]
2017-07-06 05:09:54.284485|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(index_insert+0xa5) [0x55c290edd7e5]
2017-07-06 05:09:54.284488|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(ExecInsertIndexTuples+0x2cc) [0x55c290ff254c]
2017-07-06 05:09:54.284491|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(+0x20bc28) [0x55c29100dc28]
2017-07-06 05:09:54.284493|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(ExecModifyTable+0x3ff) [0x55c29100e34f]
2017-07-06 05:09:54.284496|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(ExecProcNode+0x188) [0x55c290ff6308]
2017-07-06 05:09:54.284499|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(standard_ExecutorRun+0x132) [0x55c290ff31f2]
2017-07-06 05:09:54.284502|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(+0x2fb509) [0x55c2910fd509]
2017-07-06 05:09:54.284504|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(+0x2fb751) [0x55c2910fd751]
2017-07-06 05:09:54.284507|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(PortalRun+0x1c5) [0x55c2910fe365]
2017-07-06 05:09:54.681939|n|30644: grn_fin (0)
2017-07-06 05:14:58.105577|n|6287: grn_init: <7.0.3>
2017-07-06 05:14:58.105733|n|6287: pgroonga: initialize: <1.2.1>
2017-07-06 05:14:59.705018|n|6287: io(base/721070/pgrn.000011D) collisions(1000/6): lock failed 1000 times
...
2017-07-06 04:53:57.987123|n|30644: grn_init: <7.0.3>
2017-07-06 04:53:57.987280|n|30644: pgroonga: initialize: <1.2.1>
2017-07-06 04:53:59.548606|n|30644: io(base/721070/pgrn.000011D) collisions(1000/6): lock failed 1000 times
...
2017-07-06 05:09:54.252436|w|30644: [DB Locked] time out(900000): io(base/721070/pgrn.000011D) collisions(900000/6)
2017-07-06 05:09:54.252470|e|30644: grn_io_lock failed
2017-07-06 05:09:54.284455|e|30644: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_io_lock+0x28e) [0x7fb2b15a970e]
2017-07-06 05:09:54.284467|e|30644: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ii_column_update+0x15c) [0x7fb2b158c3ec]
2017-07-06 05:09:54.284471|e|30644: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_default_set_value_hook+0x81) [0x7fb2b1466ca1]
2017-07-06 05:09:54.284474|e|30644: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_set_value+0x1593) [0x7fb2b1479583]
2017-07-06 05:09:54.284477|e|30644: /usr/lib/postgresql/9.5/lib/pgroonga.so(+0xa0d9) [0x7fb2b1c520d9]
2017-07-06 05:09:54.284479|e|30644: /usr/lib/postgresql/9.5/lib/pgroonga.so(pgroonga_insert+0x5d) [0x7fb2b1c57cbd]
2017-07-06 05:09:54.284482|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(FunctionCall6Coll+0xa6) [0x55c2911f6336]
2017-07-06 05:09:54.284485|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(index_insert+0xa5) [0x55c290edd7e5]
2017-07-06 05:09:54.284488|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(ExecInsertIndexTuples+0x2cc) [0x55c290ff254c]
2017-07-06 05:09:54.284491|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(+0x20bc28) [0x55c29100dc28]
2017-07-06 05:09:54.284493|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(ExecModifyTable+0x3ff) [0x55c29100e34f]
2017-07-06 05:09:54.284496|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(ExecProcNode+0x188) [0x55c290ff6308]
2017-07-06 05:09:54.284499|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(standard_ExecutorRun+0x132) [0x55c290ff31f2]
2017-07-06 05:09:54.284502|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(+0x2fb509) [0x55c2910fd509]
2017-07-06 05:09:54.284504|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(+0x2fb751) [0x55c2910fd751]
2017-07-06 05:09:54.284507|e|30644: postgres: postgres kris 127.0.0.1(47088) UPDATE(PortalRun+0x1c5) [0x55c2910fe365]
2017-07-06 05:09:54.681939|n|30644: grn_fin (0)
@peter-schmitz peter-schmitz changed the title grn_io_lock failed / Deadlock while vacuum analyze grn_io_lock failed / Deadlock Jul 7, 2017
@kou
Copy link
Member

kou commented Jul 7, 2017

It's caused by crash at:

*** Error in `postgres: postgres kris [local] VACUUM': munmap_chunk(): invalid pointer: 0x000055c29292d250 ***
...
2017-07-06 04:48:44 CEST [22593-94] LOG:  server process (PID 28253) was terminated by signal 6: Aborted

Could you show full log for the above part?

@peter-schmitz
Copy link
Author

Could you show full log for the above part?

Is there something you're looking for explicitly?

IMHO, there is nothing really interesting at the "..." part above, only a query that is executed a lot and logged, because it takes slightly longer than 1 second most of the time.
The query itself does not use groonga or any table that groonga uses.

Some table names have been changed to protect their semantics:

2017-07-06 04:24:02 CEST [15994-1] postgres@kris LOG:  duration: 5340.907 ms  statement: vacuum analyze verbose public.wpzz
2017-07-06 04:24:09 CEST [17202-1] postgres@kris LOG:  duration: 6066.541 ms  statement: vacuum analyze verbose public.vssa
2017-07-06 04:24:22 CEST [17250-1] postgres@kris LOG:  duration: 13563.553 ms  statement: vacuum analyze verbose public.acse
2017-07-06 04:24:29 CEST [17617-1] postgres@kris LOG:  duration: 6711.251 ms  statement: vacuum analyze verbose public.bmsg
2017-07-06 04:24:38 CEST [20366-1] postgres@kris LOG:  duration: 8615.420 ms  statement: vacuum analyze verbose public.bmsp
2017-07-06 04:24:52 CEST [20571-1] postgres@kris LOG:  duration: 14598.891 ms  statement: vacuum analyze verbose public.kse
2017-07-06 04:24:54 CEST [22857-1] postgres@kris LOG:  duration: 1551.149 ms  statement: vacuum analyze verbose public.exsg
2017-07-06 04:25:00 CEST [23151-1] postgres@kris LOG:  duration: 5858.465 ms  statement: vacuum analyze verbose public.wsg
2017-07-06 04:25:17 CEST [23619-1] postgres@kris LOG:  duration: 17021.012 ms  statement: vacuum analyze verbose public.exse
2017-07-06 04:25:22 CEST [24583-1] postgres@kris LOG:  duration: 4027.982 ms  statement: vacuum analyze verbose public.pfsp
2017-07-06 04:25:38 CEST [25119-1] postgres@kris LOG:  duration: 14564.392 ms  statement: vacuum analyze verbose public.artikel
2017-07-06 04:25:49 CEST [27458-1] postgres@kris LOG:  duration: 10906.223 ms  statement: vacuum analyze verbose public.stueckliste
2017-07-06 04:26:05 CEST [27672-1] postgres@kris LOG:  duration: 15643.526 ms  statement: vacuum analyze verbose public.produkt
*** Error in `postgres: postgres kris [local] VACUUM': munmap_chunk(): invalid pointer: 0x000055c29292d250 ***
2017-07-06 04:42:17 CEST [5865-1] postgres@kris LOG:  duration: 1107.783 ms  execute S_146/C_991: /* sq.fs=100 */select _a.artikel from unrelatedtable _a (...)
2017-07-06 04:42:17 CEST [5865-2] postgres@kris DETAIL:  parameters: $1 = '2017-07-06 04:42:16.839+02', $2 = '2017-07-06 04:42:16.839+02', $3 = 'someid', $4 = 'someid'
2017-07-06 04:47:17 CEST [11897-1] postgres@kris LOG:  duration: 1003.105 ms  execute S_18/C_766: /* sq.fs=100 */select _a.artikel from unrelatedtable _a (...)
2017-07-06 04:47:17 CEST [11897-2] postgres@kris DETAIL:  parameters: $1 = '2017-07-06 04:47:16.316+02', $2 = '2017-07-06 04:47:16.316+02', $3 = 'someid', $4 = 'someid'
2017-07-06 04:48:44 CEST [22593-94] LOG:  server process (PID 28253) was terminated by signal 6: Aborted
2017-07-06 04:48:44 CEST [22593-95] DETAIL:  Failed process was running: vacuum analyze verbose public.sucheintrag
2017-07-06 04:48:44 CEST [22593-96] LOG:  terminating any other active server processes
2017-07-06 04:48:44 CEST [17103-1] postgres@kris WARNING:  terminating connection because of crash of another server process
2017-07-06 04:48:44 CEST [17103-2] postgres@kris DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-07-06 04:48:44 CEST [17103-3] postgres@kris HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-07-06 04:48:44 CEST [17105-1] postgres@kris WARNING:  terminating connection because of crash of another server process
2017-07-06 04:48:44 CEST [17105-2] postgres@kris DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-07-06 04:48:44 CEST [17105-3] postgres@kris HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-07-06 04:48:44 CEST [6530-1] postgres@kris WARNING:  terminating connection because of crash of another server process
2017-07-06 04:48:44 CEST [6530-2] postgres@kris DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-07-06 04:48:44 CEST [6530-3] postgres@kris HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-07-06 04:48:44 CEST [17109-1] postgres@kris WARNING:  terminating connection because of crash of another server process
2017-07-06 04:48:44 CEST [17109-2] postgres@kris DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-07-06 04:48:44 CEST [17109-3] postgres@kris HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-07-06 04:48:44 CEST [19672-1] postgres@kris WARNING:  terminating connection because of crash of another server process
2017-07-06 04:48:44 CEST [19672-2] postgres@kris DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-07-06 04:48:44 CEST [19672-3] postgres@kris HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-07-06 04:48:44 CEST [15112-2] WARNING:  terminating connection because of crash of another server process
2017-07-06 04:48:44 CEST [15112-3] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-07-06 04:48:44 CEST [15112-4] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-07-06 04:48:44 CEST [24489-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:44 CEST [24500-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:44 CEST [24510-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:44 CEST [24520-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:44 CEST [24530-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:44 CEST [22593-97] LOG:  all server processes terminated; reinitializing
2017-07-06 04:48:45 CEST [24570-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:45 CEST [24569-1] LOG:  database system was interrupted; last known up at 2017-07-06 04:41:47 CEST
2017-07-06 04:48:46 CEST [24573-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:46 CEST [24575-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:46 CEST [24577-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:47 CEST [24595-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:49 CEST [24596-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:50 CEST [24597-1] postgres@kris FATAL:  the database system is in recovery mode
2017-07-06 04:48:52 CEST [24569-2] LOG:  database system was not properly shut down; automatic recovery in progress
2017-07-06 04:48:52 CEST [24569-3] LOG:  redo starts at 169/672A6CB8
2017-07-06 04:48:52 CEST [24569-4] LOG:  invalid record length at 169/672DB9D8
2017-07-06 04:48:52 CEST [24569-5] LOG:  redo done at 169/672DB9B0
2017-07-06 04:48:52 CEST [24569-6] LOG:  last completed transaction was at log time 2017-07-06 04:47:17.730385+02
2017-07-06 04:48:52 CEST [24569-7] LOG:  MultiXact member wraparound protections are now enabled
2017-07-06 04:48:53 CEST [22593-98] LOG:  database system is ready to accept connections
2017-07-06 04:48:53 CEST [24603-1] LOG:  autovacuum launcher started
(logrotate)
2017-07-06 04:48:55 CEST [24607-1] postgres@kris LOG:  duration: 1166.946 ms  execute S_13/C_14: /* sq.fs=100 */select _a.artikel from unrelatedtable _a (...)

(logrotate) : This is where the logfile ends and the new one begins. The whole first file is 21MiB and consists mostly of unrelated the query.

Then, during the restart of postgresql later:

2017-07-06 07:35:40 CEST [8534-1] postgres@kris LOG:  unexpected EOF on client connection with an open transaction
2017-07-06 07:36:12 CEST [22593-99] LOG:  received fast shutdown request
2017-07-06 07:36:12 CEST [22593-100] LOG:  aborting any active transactions
2017-07-06 07:36:12 CEST [24603-2] LOG:  autovacuum launcher shutting down
2017-07-06 07:36:54 CEST [8535-1] postgres@kris ERROR:  pgroonga: failed to set column value: grn_io_lock failed
2017-07-06 07:36:54 CEST [8535-2] postgres@kris STATEMENT:  UPDATE "sucheintrag" SET "crea"=$1,"lmod"=$2,"ldel"=$3,"bot"=$4,"mandant"=$5,"volltextn"=$6,"volltexta"=$7,"volltextgeraetn"=$8,"volltextgeraeta"=$9,"suchklasse"=$10,"dirtyvolltext"=$11,"produkt"=$12 WHERE "id"=$13
2017-07-06 07:36:54 CEST [8535-3] postgres@kris FATAL:  terminating connection due to administrator command
2017-07-06 07:36:54 CEST [8535-4] postgres@kris LOG:  could not send data to client: Broken pipe
2017-07-06 07:36:54 CEST [24600-1] LOG:  shutting down
2017-07-06 07:36:54 CEST [24600-2] LOG:  database system is shut down
2017-07-06 07:36:56 CEST [11963-1] LOG:  database system was shut down at 2017-07-06 07:36:54 CEST
2017-07-06 07:36:56 CEST [11963-2] LOG:  MultiXact member wraparound protections are now enabled
2017-07-06 07:36:56 CEST [11967-1] LOG:  autovacuum launcher started
2017-07-06 07:36:56 CEST [11937-1] LOG:  database system is ready to accept connections
2017-07-06 07:36:56 CEST [11969-1] [unknown]@[unknown] LOG:  incomplete startup packet

@kou
Copy link
Member

kou commented Jul 8, 2017

Is there something you're looking for explicitly?

Unfortunately, no...

I expect there is the backtrace on crash in VACUUM ANALYZE VERBOSE public.sucheintrag. But it doesn't exist.
Ah, I got it. Could you show pgroonga.log? It may contain the backtrace.

@peter-schmitz
Copy link
Author

peter-schmitz commented Jul 8, 2017 via email

@kou
Copy link
Member

kou commented Jul 8, 2017

Ah, sorry. I forgot that you already provided pgroonga.log.

I expected that the process logs backtrace on crash but it didn't:

2017-07-06 04:26:05.430571|n|28253: grn_init: <7.0.3>
2017-07-06 04:26:05.430721|n|28253: pgroonga: initialize: <1.2.1>

Umm...

Can you reproduce it by running VACUUM ANALYZE VERBOSE public.sucheintrag manually?

@peter-schmitz
Copy link
Author

peter-schmitz commented Jul 10, 2017

Can you reproduce it by running VACUUM ANALYZE VERBOSE public.sucheintrag manually?

The machine it happened on isn't available for testing.
I'm trying it with another machine that is similar in setup, data and data update services. So far, I couldn't recreate the error.

I did find the output of verbose in another castle log file. Just the one timestamp at the end available, though.

VACUUM
INFO:  vacuuming "public.sucheintrag"
INFO:  scanned index "sucheintrag_pkey" to remove 327972 row versions
DETAIL:  CPU 0.06s/0.31u sec elapsed 0.49 sec.
INFO:  scanned index "ix_7934" to remove 327972 row versions
DETAIL:  CPU 0.09s/0.33u sec elapsed 0.72 sec.
INFO:  scanned index "ix_7937" to remove 327972 row versions
DETAIL:  CPU 0.04s/0.35u sec elapsed 0.53 sec.
INFO:  scanned index "ix_10479_brin" to remove 327972 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "ix_10480_brin" to remove 327972 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "ix_10481_brin" to remove 327972 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "ix_11200_gin" to remove 327972 row versions
DETAIL:  CPU 0.00s/0.12u sec elapsed 0.17 sec.
INFO:  scanned index "ix_12477" to remove 327972 row versions
DETAIL:  CPU 0.08s/0.33u sec elapsed 0.73 sec.
INFO:  scanned index "ix_12478" to remove 327972 row versions
DETAIL:  CPU 0.07s/0.34u sec elapsed 0.72 sec.
INFO:  scanned index "ix_8826_trgm" to remove 327972 row versions
DETAIL:  CPU 0.90s/17.44u sec elapsed 20.77 sec.
INFO:  scanned index "ix_8824_trgm" to remove 327972 row versions
DETAIL:  CPU 0.79s/17.78u sec elapsed 22.17 sec.
INFO:  scanned index "ix_8823_trgm" to remove 327972 row versions
DETAIL:  CPU 0.17s/3.42u sec elapsed 4.03 sec.
INFO:  scanned index "ix_8825_trgm" to remove 327972 row versions
DETAIL:  CPU 0.15s/4.43u sec elapsed 5.12 sec.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost
psql: FATAL:  the database system is in recovery mode
psql: FATAL:  the database system is in recovery mode
psql: FATAL:  the database system is in recovery mode
psql: FATAL:  the database system is in recovery mode
psql: FATAL:  the database system is in recovery mode
170706-0448 Hard disk status after reindexing/vacuum:

P.S. updated the first post, forgot to add tokenizer='TokenTrigram' to the index definition.

@peter-schmitz
Copy link
Author

If the problem occurs again, given an available timeframe of about 5 minutes, anything I can try before 'fixing' the problem to generate more data the next time it happens? 'Fixing' means restarting postgresql and dropping the indexes, before re-running the transaction...

Last time it felt like the IO lock survived the first manual service postgresql restart, because inserting the same data afterwards with the same indexes (restarted the backend too) just ran into a lock again. Dropping the indexes via CLI then didn't complete for 2 minutes (normally takes just a few seconds). Then I restarted postgres again, drop the 2 indexes and insert the data afterwards, which worked.
Recreating the indexes at the very end then worked again too.

Did not drop the extension.

@kou
Copy link
Member

kou commented Jul 11, 2017

I want "backtrace" on crash. It will show where is there problem. Normally, it's logged in pgroonga.log on crash. We can get "backtrace" from core file. If you didn't disable core on your system, you may have core on the crash. Could you confirm it?

If the problem occurs again, given an available timeframe of about 5 minutes, anything I can try before 'fixing' the problem to generate more data the next time it happens?

Please ensure keeping core file for the crash.

Here is the right way to recover the IO lock:

  1. Stop PostgreSQL.
  2. Remove pgrn* files in data directory.
  3. Start PostgreSQL.
  4. Run REINDEX INDEX index_name; for all PGroonga indexes.

In many cases, you can recover by only running REINDEX. But some cases (e.g.: lock stays behind pgrn (no following number) file), you need to 1., 2. and 3.

@kou
Copy link
Member

kou commented Jul 11, 2017

INFO:  scanned index "ix_8825_trgm" to remove 327972 row versions
DETAIL:  CPU 0.15s/4.43u sec elapsed 5.12 sec.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

It's interesting. The log may show PGroonga is crashed while bulk delete phase in VACUUM not vacuum cleanup phase in VACUUM.

Were you running SELECT, INSERT or UPDATE against the table while VACUUM? If you were, it may be a hint to reproduce this problem.

@peter-schmitz
Copy link
Author

peter-schmitz commented Jul 12, 2017

If you didn't disable core on your system, you may have core on the crash. Could you confirm it?

I have a ".crash" file of postgres from that time, its saved on my machine for now, but I haven't had the time to look into it yet.

Were you running SELECT, INSERT or UPDATE against the table while VACUUM? If you were, it may be a hint to reproduce this problem.

Yes, an UPDATE was triggered at 04:27:31.667 (succeeded) and the next at 04:34:46.436 (failed).
IIRC, then our backend first selects a row before updating it. This SELECT does not use pgroonga.
There was no SELECT query that uses pgroonga at the time.

If you were, it may be a hint to reproduce this problem.

A single "lock failed 1000 times" was all I managed yesterday, did not lock up the whole table.
Did not (yet) change vm.overcommit locally.

2017-07-10 11:15:20.492026|n|14021: grn_init: <7.0.4>
2017-07-10 11:15:20.492080|n|14021: vm.overcommit_memory kernel parameter should be 1: <0>: See INFO level log to resolve this
2017-07-10 11:15:20.492083|i|14021: Some processings with vm.overcommit_memory != 1 may break DB under low memory condition.
2017-07-10 11:15:20.492086|i|14021: To set vm.overcommit_memory to 1
2017-07-10 11:15:20.492087|i|14021: add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and restart your system or
2017-07-10 11:15:20.492095|i|14021: run 'sudo /sbin/sysctl vm.overcommit_memory=1' command.
2017-07-10 11:15:20.492166|n|14021: pgroonga: initialize: <1.2.3>
2017-07-10 11:15:32.890300|d|14021: flushing a[0]=5114716 seg=78(0x7f931dada000) free=0
2017-07-10 11:15:32.904601|d|14021: flushed  a[0]=5114716 seg=78(0x7f9310dcd000) free=0->248768 nterms=835 v=0
2017-07-10 11:15:32.904679|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=116 (3/3)
2017-07-10 11:15:33.174537|d|14021: flushing a[0]=11143708 seg=170(0x7f931890d000) free=4
2017-07-10 11:15:33.181123|d|14021: flushed  a[0]=11143708 seg=170(0x7f9310bcd000) free=4->250816 nterms=707 v=0
2017-07-10 11:15:33.181171|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=116 (3/3)
2017-07-10 11:15:33.195569|d|14021: flushing a[0]=524668 seg=8(0x7f93189cd000) free=8
2017-07-10 11:15:33.204033|d|14021: flushed  a[0]=524668 seg=8(0x7f930fd8d000) free=8->248608 nterms=845 v=0
2017-07-10 11:15:33.204090|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=119 (3/3)
2017-07-10 11:15:44.493855|d|14021: flushing a[0]=1769568 seg=27(0x7f931934d000) free=4
2017-07-10 11:15:44.506410|d|14021: flushed  a[0]=1769568 seg=27(0x7f930dc0d000) free=4->247632 nterms=906 v=0
2017-07-10 11:15:44.506461|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=44 (3/3)
2017-07-10 11:15:44.540986|d|14021: nterms=8220 chunk=53306 total=142110
2017-07-10 11:15:44.542458|d|14021: d0=4337 d1=3883
... (variations of 14021 expired repeated)
2017-07-10 11:16:09.182960|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=70 (3/3)
2017-07-10 11:16:20.772026|d|10362: flushing a[0]=4981840 seg=76(0x7f931de1a000) free=8
2017-07-10 11:16:20.784763|d|10362: flushed  a[0]=4981840 seg=76(0x7f930348c000) free=8->251696 nterms=652 v=0
2017-07-10 11:16:20.784833|i|10362: <0x7f932c5542e0:0> expired i=0x55f4d3f41f70 max=143 (3/3)
2017-07-10 11:16:32.322395|d|10362: flushing a[0]=404 seg=0(0x7f930568c000) free=4
2017-07-10 11:16:32.339863|d|10362: flushed  a[0]=404 seg=0(0x7f930344c000) free=4->252272 nterms=616 v=0
2017-07-10 11:16:32.339959|i|10362: <0x7f932c5542e0:0> expired i=0x55f4d3f41f70 max=143 (3/3)
2017-07-10 11:16:44.891099|d|14021: flushing a[0]=983200 seg=15(0x7f931928d000) free=4
2017-07-10 11:16:44.905489|d|14021: flushed  a[0]=983200 seg=15(0x7f930ab8d000) free=4->260912 nterms=76 v=0
2017-07-10 11:16:44.905551|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=70 (3/3)
2017-07-10 11:16:44.967905|d|14021: flushing a[0]=2031972 seg=31(0x7f931898d000) free=8
2017-07-10 11:16:44.973574|d|14021: flushed  a[0]=2031972 seg=31(0x7f930aa0d000) free=8->250608 nterms=720 v=0
2017-07-10 11:16:44.973625|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=119 (3/3)
2017-07-10 11:16:56.231265|d|14021: flushing a[0]=17694864 seg=270(0x7f931918d000) free=8
2017-07-10 11:16:56.242483|d|14021: flushed  a[0]=17694864 seg=270(0x7f930a88d000) free=8->257456 nterms=292 v=0
2017-07-10 11:16:56.242551|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=70 (2/2)
2017-07-10 11:16:56.271057|d|14021: flushing a[0]=4785196 seg=73(0x7f931dd9a000) free=4
2017-07-10 11:16:56.279477|d|14021: flushed  a[0]=4785196 seg=73(0x7f930a80d000) free=4->249936 nterms=762 v=0
2017-07-10 11:16:56.279535|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=119 (4/4)
2017-07-10 11:17:07.772547|d|14021: flushing a[0]=4328436 seg=66(0x7f931894d000) free=0
2017-07-10 11:17:07.778413|d|14021: flushed  a[0]=4328436 seg=66(0x7f930a28d000) free=0->247552 nterms=911 v=0
2017-07-10 11:17:07.778459|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=130 (3/3)
2017-07-10 11:17:20.444779|d|14021: flushing a[0]=12255384 seg=187(0x7f931958d000) free=8
2017-07-10 11:17:20.447091|d|14021: flushed  a[0]=12255384 seg=187(0x7f93092cd000) free=8->247712 nterms=901 v=1
2017-07-10 11:17:20.447164|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=70 (3/3)
2017-07-10 11:17:31.379462|d|14021: flushing a[0]=198460 seg=3(0x7f931348d000) free=4
2017-07-10 11:17:31.379472|d|14021: nterms=1610 chunk=149966 total=142284
2017-07-10 11:17:31.379980|d|14021: d0=649 d1=961
2017-07-10 11:17:31.382188|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=81 (7/8)
2017-07-10 11:17:31.940672|d|14021: nterms=9327 chunk=55822 total=142285
2017-07-10 11:17:31.942635|d|14021: d0=4860 d1=4467
2017-07-10 11:17:31.382188|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=81 (7/8)
2017-07-10 11:17:31.940672|d|14021: nterms=9327 chunk=55822 total=142285
2017-07-10 11:17:31.942635|d|14021: d0=4860 d1=4467
2017-07-10 11:17:31.943715|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=81 (2/3)
2017-07-10 11:17:32.211188|d|14021: flushing a[0]=7997520 seg=122(0x7f930b7cd000) free=0
2017-07-10 11:17:32.223428|d|14021: flushed  a[0]=7997520 seg=122(0x7f930868d000) free=0->250000 nterms=758 v=0
2017-07-10 11:17:32.223533|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=130 (3/3)
2017-07-10 11:17:43.351351|d|14021: flushing a[0]=9438196 seg=144(0x7f9318ecd000) free=4
2017-07-10 11:17:43.354881|d|14021: flushed  a[0]=9438196 seg=144(0x7f9307c4d000) free=4->258080 nterms=253 v=0
2017-07-10 11:17:43.354921|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=81 (3/4)
2017-07-10 11:17:43.358450|d|14021: flushing a[0]=17957688 seg=274(0x7f9318f8d000) free=4
2017-07-10 11:17:43.364341|d|14021: flushed  a[0]=17957688 seg=274(0x7f9307c0d000) free=4->253552 nterms=536 v=0
2017-07-10 11:17:43.364374|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=81 (3/4)
2017-07-10 11:17:43.457578|d|14021: flushing a[0]=3343572 seg=51(0x7f93175cd000) free=4
2017-07-10 11:17:43.461817|d|14021: flushed  a[0]=3343572 seg=51(0x7f9307b4d000) free=4->250928 nterms=700 v=0
2017-07-10 11:17:43.461850|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=81 (3/4)
2017-07-10 11:17:43.793526|d|14021: flushing a[0]=1114212 seg=17(0x7f931950d000) free=8
2017-07-10 11:17:43.809584|d|14021: split (125) encsize=398337
2017-07-10 11:17:43.811161|d|14021: flushed  a[0]=1114212 seg=17(0x7f9307a4d000) free=8->259712 nterms=151 v=0
2017-07-10 11:17:43.811236|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=82 (5/6)
2017-07-10 11:17:56.454475|d|14021: flushing a[0]=4917312 seg=75(0x7f9318a0d000) free=8
2017-07-10 11:17:56.460848|d|14021: flushed  a[0]=4917312 seg=75(0x7f930758d000) free=8->246672 nterms=966 v=0
2017-07-10 11:17:56.460910|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=131 (3/3)
2017-07-10 11:18:08.219132|d|14021: flushing a[0]=5309600 seg=81(0x7f9318a4d000) free=0
2017-07-10 11:18:08.226008|d|14021: flushed  a[0]=5309600 seg=81(0x7f930660d000) free=0->247424 nterms=919 v=0
2017-07-10 11:18:08.226077|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=131 (3/3)
2017-07-10 11:18:19.705870|d|14021: flushing a[0]=3933808 seg=60(0x7f931dc9a000) free=0
2017-07-10 11:18:19.717218|d|14021: flushed  a[0]=3933808 seg=60(0x7f93064cd000) free=0->249472 nterms=791 v=0
2017-07-10 11:18:19.717281|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=131 (3/3)
2017-07-10 11:18:19.925037|d|14021: flushing a[0]=8063836 seg=123(0x7f9315f8d000) free=0
2017-07-10 11:18:19.929121|d|14021: flushed  a[0]=8063836 seg=123(0x7f93063cd000) free=0->248656 nterms=842 v=0
2017-07-10 11:18:19.929166|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=82 (3/4)
2017-07-10 11:18:31.846612|n|10362: grn_fin (0)
2017-07-10 11:18:32.440728|d|14021: flushing a[0]=11601400 seg=177(0x7f9318f0d000) free=4
2017-07-10 11:18:32.444054|d|14021: flushed  a[0]=11601400 seg=177(0x7f931934d000) free=4->255856 nterms=392 v=0
2017-07-10 11:18:32.444081|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=82 (3/4)
2017-07-10 11:18:32.652345|d|14021: flushing a[0]=17694872 seg=270(0x7f930a88d000) free=4
2017-07-10 11:18:32.662713|d|14021: flushed  a[0]=17694872 seg=270(0x7f9316bcd000) free=4->257456 nterms=292 v=0
2017-07-10 11:18:32.662776|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3e0dbd0 max=82 (3/4)
2017-07-10 11:18:32.741130|d|14021: flushing a[0]=404 seg=0(0x7f930af0d000) free=4
2017-07-10 11:18:32.760459|d|14021: flushed  a[0]=404 seg=0(0x7f930630d000) free=4->252272 nterms=616 v=0
2017-07-10 11:18:32.760576|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=131 (3/3)
2017-07-10 11:18:43.083620|d|5721: flushing! b=0x7f9312328000 free=4, seg(122)
2017-07-10 11:18:43.102531|d|5721: flushed!  b=0x7f93128e8000 free=250000, seg(122)
2017-07-10 11:18:43.950928|d|5721: flushing! b=0x7f9322b90000 free=4, seg(84)
2017-07-10 11:18:43.959999|d|5721: flushed!  b=0x7f9323510000 free=251440, seg(84)
... just 5721 flushing/flushed
2017-07-10 11:18:48.188777|d|5721: flushed!  b=0x7f9310ba8000 free=250816, seg(170)
2017-07-10 11:18:48.331364|d|5721: flushing! b=0x7f93200d0000 free=0, seg(74)
2017-07-10 11:18:48.342908|d|5721: flushed!  b=0x7f932c290000 free=252144, seg(74)
2017-07-10 11:18:48.430631|d|5721: flushing! b=0x7f9327c50000 free=4, seg(136)
2017-07-10 11:18:48.437973|d|5721: flushed!  b=0x7f9312f68000 free=245984, seg(136)
2017-07-10 11:18:48.828140|d|5721: flushing! b=0x7f93128e8000 free=4, seg(122)
2017-07-10 11:18:48.851817|d|5721: flushed!  b=0x7f930ea68000 free=250000, seg(122)
2017-07-10 11:18:49.256476|d|5721: flushing! b=0x7f92a849e000 free=0, seg(76)
2017-07-10 11:18:49.274083|d|5721: flushed!  b=0x7f9324390000 free=251696, seg(76)
2017-07-10 11:18:52.142917|n|14021: io(base/100039298/pgrn.00001DE) collisions(7070/744813): lock failed 1000 times
2017-07-10 11:18:52.596657|d|5721: flushing! b=0x7f9322f90000 free=4, seg(73)
2017-07-10 11:18:52.610423|d|5721: flushed!  b=0x7f93125a8000 free=249936, seg(73)
2017-07-10 11:18:53.630990|d|5721: flushing! b=0x7f9323610000 free=0, seg(31)
...
2017-07-10 11:19:01.819602|d|5721: flushing! b=0x7f9323510000 free=4, seg(84)
2017-07-10 11:19:01.837465|d|5721: flushed!  b=0x7f9322cd0000 free=251440, seg(84)
2017-07-10 11:19:02.019884|d|14021: flushing a[0]=5114716 seg=78(0x7f931890d000) free=0
2017-07-10 11:19:02.041970|d|14021: flushed  a[0]=5114716 seg=78(0x7f9310bcd000) free=0->248768 nterms=835 v=0
2017-07-10 11:19:02.042381|i|14021: <0x7f932c5542e0:0> expired i=0x55f4d3cd6690 max=138 (66/66)
2017-07-10 11:19:02.232881|d|5721: flushing! b=0x7f9323a10000 free=0, seg(122)
2017-07-10 11:19:02.253210|d|5721: flushed!  b=0x7f93200d0000 free=250000, seg(122)
...

@kou
Copy link
Member

kou commented Jul 13, 2017

I have a ".crash" file of postgres from that time, its saved on my machine for now, but I haven't had the time to look into it yet.

I don't know about the ".crash" file but if it's a core file, we can get backtrace from it by the following command line:

% echo "thread apply all backtrace full" > gdb-command
% gdb --command=gdb-command postgres XXX.crash
(...backtrace...)

Yes, an UPDATE was triggered at 04:27:31.667 (succeeded) and the next at 04:34:46.436 (failed).
IIRC, then our backend first selects a row before updating it. This SELECT does not use pgroonga.
There was no SELECT query that uses pgroonga at the time.

Thanks. We may need to run UPDATE while VACUUM to reproduce this problem.

A single "lock failed 1000 times" was all I managed yesterday, did not lock up the whole table.

It's not a problem. In this log, you run UPDATE (or INSERT) from two processes. Index update can be done from only one process. So PGroonga tries to acquire a lock. If the process that acquires a lock takes time, another process may take time to acquire a lock. "lock failed 1000 times" is logged for the case. It's not a problem because another process acquired a lock later.

2017-07-10 11:18:52.142917|n|14021: io(base/100039298/pgrn.00001DE) collisions(7070/744813): lock failed 1000 times
...
2017-07-10 11:19:02.019884|d|14021: flushing a[0]=5114716 seg=78(0x7f931890d000) free=0 <- 14021 process acquired a lock

@peter-schmitz
Copy link
Author

Sorry for the wait, but it took some time to happen again and hadn't the time to look much into it in the past weeks.

This time it locked up during a different time of day - contrary to the occurences posted earlier - i.e. it might have a different outside trigger.

I did not see any running VACUUM-Process (except the Autovacuum launcher) in htop before starting dumping the backtraces.

I hope the following data contains useful information anyway.

Overview:

  • internal log of our server the moment it happend initially, might show the initial trigger.
  • excerpt from postgresql.log
  • excerpt from pgroonga.log, both from the time of the initial error and of the gdb backtraces
  • gdb backtraces for autovacuum launcher thread and 2 locked Postgresql worker threads being locked pgroonga.

JDBC driver error:

WARN  23:38:59.174 [ris->212.185.169.242] ProxyConnection checkException(152)- HikariPool-1 - Connection org.postgresql.jdbc4.Jdbc4Connection@4811db49 marked as broken because of SQLSTATE(08006), ErrorCode(0)
org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:201)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:582)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:432)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:378)
        at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61)
        at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeUpdate(HikariProxyPreparedStatement.java)
        at org.exolab.castor.jdo.engine.SQLEngine.store(SQLEngine.java:920)
        at org.exolab.castor.persist.ClassMolder.store(ClassMolder.java:1742)
        at org.exolab.castor.persist.LockEngine.store(LockEngine.java:806)
        at org.exolab.castor.persist.TransactionContext.prepare(TransactionContext.java:1591)
        at org.exolab.castor.jdo.engine.DatabaseImpl.commit(DatabaseImpl.java:514)
        at [internal class].commit([internal class].java:353)
...
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.EOFException
        at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:284)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1710)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:173)
        ... 21 more

postgresql.log

*** Error in `postgres: postgres kris 127.0.0.1(33416) UPDATE': corrupted size vs. prev_size: 0x000055978046d530 ***
2017-08-24 23:38:59 CEST [15949-10] LOG:  server process (PID 12649) was terminated by signal 6: Aborted
2017-08-24 23:38:59 CEST [15949-11] DETAIL:  Failed process was running: UPDATE "sucheintrag" SET "crea"=$1,"lmod"=$2,"ldel"=$3,"bot"=$4,"mandant"=$5,"volltextn"=$6,"volltexta"=$7,"volltextgeraetn"=$8,"volltextgeraeta"=$9,"suchklasse"=$10,"dirtyvolltext"=$11,"produkt"=$12 WHERE "id"=$13
2017-08-24 23:38:59 CEST [15949-12] LOG:  terminating any other active server processes
2017-08-24 23:38:59 CEST [22772-1] postgres@kris WARNING:  terminating connection because of crash of another server process
2017-08-24 23:38:59 CEST [22772-2] postgres@kris DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-08-24 23:38:59 CEST [22772-3] postgres@kris HINT:  In a moment you should be able to reconnect to the database and repeat your command.
... (repeated for several processes)
2017-08-24 23:38:59 CEST [22678-3] postgres@kris HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-08-24 23:38:59 CEST [5891-2] WARNING:  terminating connection because of crash of another server process
2017-08-24 23:38:59 CEST [5891-3] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-08-24 23:38:59 CEST [5891-4] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-08-24 23:38:59 CEST [22671-1] postgres@kris WARNING:  terminating connection because of crash of another server process
2017-08-24 23:38:59 CEST [22671-2] postgres@kris DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-08-24 23:38:59 CEST [22671-3] postgres@kris HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-08-24 23:38:59 CEST [22589-1] postgres@kris WARNING:  terminating connection because of crash of another server process
2017-08-24 23:38:59 CEST [22589-2] postgres@kris DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-08-24 23:38:59 CEST [22589-3] postgres@kris HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-08-24 23:38:59 CEST [15949-13] LOG:  all server processes terminated; reinitializing
2017-08-24 23:39:00 CEST [25611-1] postgres@kris FATAL:  the database system is in recovery mode
2017-08-24 23:39:00 CEST [25610-1] LOG:  database system was interrupted; last known up at 2017-08-24 23:38:05 CEST
2017-08-24 23:39:01 CEST [25612-1] postgres@kris FATAL:  the database system is in recovery mode
2017-08-24 23:39:01 CEST [25617-1] postgres@kris FATAL:  the database system is in recovery mode
2017-08-24 23:39:01 CEST [25639-1] postgres@kris FATAL:  the database system is in recovery mode
2017-08-24 23:39:02 CEST [25640-1] postgres@kris FATAL:  the database system is in recovery mode
2017-08-24 23:39:04 CEST [25648-1] postgres@kris FATAL:  the database system is in recovery mode
2017-08-24 23:39:05 CEST [25650-1] postgres@kris FATAL:  the database system is in recovery mode
2017-08-24 23:39:07 CEST [25610-2] LOG:  database system was not properly shut down; automatic recovery in progress
2017-08-24 23:39:07 CEST [25610-3] LOG:  redo starts at 1B3/ACCAB738
2017-08-24 23:39:07 CEST [25610-4] LOG:  unexpected pageaddr 1B3/3F478000 in log segment 00000001000001B3000000AD, offset 4685824
2017-08-24 23:39:07 CEST [25610-5] LOG:  redo done at 1B3/AD477160
2017-08-24 23:39:07 CEST [25610-6] LOG:  last completed transaction was at log time 2017-08-24 23:38:51.986953+02
2017-08-24 23:39:08 CEST [25610-7] LOG:  MultiXact member wraparound protections are now enabled
2017-08-24 23:39:08 CEST [15949-14] LOG:  database system is ready to accept connections
2017-08-24 23:39:08 CEST [25656-1] LOG:  autovacuum launcher started
2017-08-24 23:39:10 CEST [25661-1] postgres@kris LOG:  duration: 1506.222 ms  execute S_12/C_13: /* sq.fs=100 */select _a.artikel...
... (only normal messages about slow queries)
2017-08-25 00:00:14 CEST [2897-1] postgres@kris ERROR:  pgroonga: failed to set column value: grn_io_lock failed
2017-08-25 00:00:14 CEST [2897-2] postgres@kris STATEMENT:  UPDATE "sucheintrag" SET "crea"=$1,"lmod"=$2,"ldel"=$3,"bot"=$4,"mandant"=$5,"volltextn"=$6,"volltexta"=$7,"volltextgeraetn"=$8,"volltextgeraeta"=$9,"suchklasse"=$10,"dirtyvolltext"=$11,"produkt"=$12 WHERE "id"=$13

postgresql

2017-08-25 07:48:22 CEST [15949-15] LOG:  received fast shutdown request
2017-08-25 07:48:22 CEST [15949-16] LOG:  aborting any active transactions
2017-08-25 07:48:22 CEST [2145-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2131-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2285-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2126-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2114-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2062-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2083-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31811-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2109-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31530-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31595-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [21762-25] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2101-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2087-5] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31818-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31824-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31551-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31540-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31804-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31823-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31545-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [25656-2] LOG:  autovacuum launcher shutting down
2017-08-25 07:48:22 CEST [2063-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2047-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31817-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31790-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31535-3] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31606-5] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2024-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [31808-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [2006-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:22 CEST [30472-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:25 CEST [11427-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:26 CEST [11428-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:26 CEST [11430-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:27 CEST [11431-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:27 CEST [11435-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:29 CEST [11453-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:31 CEST [11456-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:33 CEST [11462-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:38 CEST [11470-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:44 CEST [11480-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:45 CEST [13329-1] postgres@kris FATAL:  terminating connection due to administrator command
2017-08-25 07:48:45 CEST [13329-2] postgres@kris STATEMENT:  vacuum analyze verbose public.sucheintrag
2017-08-25 07:48:45 CEST [11493-1] postgres@kris FATAL:  the database system is shutting down
2017-08-25 07:48:45 CEST [11503-1] postgres@kris FATAL:  the database system is shutting down

pgroonga.log - During initial error

2017-08-24 23:35:20.053171|n|28423: grn_fin (0)
2017-08-24 23:35:20.242945|n|28635: grn_fin (0)
2017-08-24 23:38:26.477907|n|12649: grn_init: <7.0.5>
2017-08-24 23:38:26.478043|n|12649: pgroonga: initialize: <2.0.1>
2017-08-24 23:38:26.484488|n|4411: grn_init: <7.0.5>
2017-08-24 23:38:26.484645|n|4411: pgroonga: initialize: <2.0.1>
2017-08-24 23:38:26.489652|n|2293: grn_init: <7.0.5>
2017-08-24 23:38:26.489870|n|2293: pgroonga: initialize: <2.0.1>
2017-08-24 23:38:26.490326|n|2451: grn_init: <7.0.5>
2017-08-24 23:38:26.490475|n|2451: pgroonga: initialize: <2.0.1>
2017-08-24 23:38:28.997825|n|12651: grn_init: <7.0.5>
2017-08-24 23:38:28.998060|n|12651: pgroonga: initialize: <2.0.1>
2017-08-24 23:38:29.003015|n|2292: grn_init: <7.0.5>
2017-08-24 23:38:29.003184|n|2292: pgroonga: initialize: <2.0.1>
2017-08-24 23:38:29.506948|n|22632: grn_init: <7.0.5>
2017-08-24 23:38:29.507110|n|22632: pgroonga: initialize: <2.0.1>
2017-08-24 23:38:33.002230|n|22723: grn_init: <7.0.5>
2017-08-24 23:38:33.002403|n|22723: pgroonga: initialize: <2.0.1>
2017-08-24 23:38:33.009183|n|22626: grn_init: <7.0.5>
2017-08-24 23:38:33.009349|n|22626: pgroonga: initialize: <2.0.1>
2017-08-24 23:40:54.556715|n|25659: grn_init: <7.0.5>
2017-08-24 23:40:54.556871|n|25659: pgroonga: initialize: <2.0.1>
2017-08-24 23:41:48.249878|n|2871: grn_init: <7.0.5>
2017-08-24 23:41:48.250063|n|2871: pgroonga: initialize: <2.0.1>
2017-08-24 23:41:48.255925|n|25658: grn_init: <7.0.5>
2017-08-24 23:41:48.256076|n|25658: pgroonga: initialize: <2.0.1>
2017-08-24 23:41:48.256851|n|25661: grn_init: <7.0.5>
2017-08-24 23:41:48.256990|n|25661: pgroonga: initialize: <2.0.1>
2017-08-24 23:44:01.382583|n|25658: grn_fin (0)
2017-08-24 23:44:03.131435|n|25659: grn_fin (0)
2017-08-24 23:44:03.487357|n|25661: grn_fin (0)
2017-08-24 23:44:17.703922|n|2897: grn_init: <7.0.5>
2017-08-24 23:44:17.704112|n|2897: pgroonga: initialize: <2.0.1>
2017-08-24 23:44:20.220152|n|2897: io(base/721070/pgrn.000010A) collisions(1000/37): lock failed 1000 times
2017-08-24 23:46:36.282677|n|17054: grn_init: <7.0.5>
2017-08-24 23:46:36.282816|n|17054: pgroonga: initialize: <2.0.1>
2017-08-24 23:46:36.288484|n|21787: grn_init: <7.0.5>
2017-08-24 23:46:36.288480|n|21798: grn_init: <7.0.5>
2017-08-24 23:46:36.288498|n|21768: grn_init: <7.0.5>
2017-08-24 23:46:36.288642|n|21768: pgroonga: initialize: <2.0.1>
2017-08-24 23:46:36.288647|n|21798: pgroonga: initialize: <2.0.1>
2017-08-24 23:46:36.288648|n|21787: pgroonga: initialize: <2.0.1>
2017-08-24 23:49:01.249106|n|17054: grn_fin (0)
2017-08-24 23:49:09.237665|n|21729: grn_init: <7.0.5>
2017-08-24 23:49:09.237818|n|21729: pgroonga: initialize: <2.0.1>
2017-08-24 23:49:28.443607|n|21787: grn_fin (0)
2017-08-24 23:49:29.546442|n|21768: grn_fin (0)
2017-08-24 23:49:31.556003|n|21729: grn_fin (0)
2017-08-24 23:49:32.675754|n|21798: grn_fin (0)
2017-08-24 23:51:10.741657|n|5548: grn_init: <7.0.5>
2017-08-24 23:51:10.741827|n|5548: pgroonga: initialize: <2.0.1>
2017-08-24 23:51:10.748245|n|30193: grn_init: <7.0.5>
2017-08-24 23:51:10.748380|n|30193: pgroonga: initialize: <2.0.1>
2017-08-24 23:51:10.748414|n|2800: grn_init: <7.0.5>
2017-08-24 23:51:10.748436|n|5552: grn_init: <7.0.5>
2017-08-24 23:51:10.748541|n|2800: pgroonga: initialize: <2.0.1>
2017-08-24 23:51:10.748568|n|5552: pgroonga: initialize: <2.0.1>
2017-08-24 23:53:49.040784|n|30193: grn_fin (0)
2017-08-24 23:54:12.892392|n|5355: grn_init: <7.0.5>
2017-08-24 23:54:12.892573|n|5355: pgroonga: initialize: <2.0.1>
2017-08-24 23:54:41.175020|n|2800: grn_fin (0)
2017-08-24 23:54:47.038331|n|5552: grn_fin (0)
2017-08-24 23:54:51.590837|n|5548: grn_fin (0)
2017-08-24 23:54:53.010282|n|5355: grn_fin (0)
2017-08-24 23:55:26.793119|n|20099: grn_init: <7.0.5>
2017-08-24 23:55:26.793293|n|20099: pgroonga: initialize: <2.0.1>
2017-08-24 23:58:08.194355|n|22726: grn_init: <7.0.5>
2017-08-24 23:58:08.194514|n|22726: pgroonga: initialize: <2.0.1>
2017-08-24 23:58:08.201547|n|20027: grn_init: <7.0.5>
2017-08-24 23:58:08.201701|n|20027: pgroonga: initialize: <2.0.1>
2017-08-24 23:58:08.231366|n|23209: grn_init: <7.0.5>
2017-08-24 23:58:08.231526|n|23209: pgroonga: initialize: <2.0.1>
2017-08-24 23:58:53.725916|n|23258: grn_init: <7.0.5>
2017-08-24 23:58:53.726083|n|23258: pgroonga: initialize: <2.0.1>
2017-08-24 23:58:53.732767|n|23256: grn_init: <7.0.5>
2017-08-24 23:58:53.732928|n|23256: pgroonga: initialize: <2.0.1>
2017-08-24 23:58:53.732900|n|23264: grn_init: <7.0.5>
2017-08-24 23:58:53.733032|n|23264: pgroonga: initialize: <2.0.1>
2017-08-24 23:59:09.355691|n|20027: grn_fin (0)
2017-08-24 23:59:15.503486|n|20099: grn_fin (0)
2017-08-24 23:59:36.342743|n|22726: grn_fin (0)
2017-08-25 00:00:14.892106|w|2897: [DB Locked] time out(900000): io(base/721070/pgrn.000010A) collisions(900000/37)
2017-08-25 00:00:14.892162|e|2897: grn_io_lock failed
2017-08-25 00:00:14.949436|e|2897: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_io_lock+0x28e) [0x7fc50046a57e]
2017-08-25 00:00:14.949448|e|2897: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ii_column_update+0x17a) [0x7fc50044c64a]
2017-08-25 00:00:14.949452|e|2897: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_default_set_value_hook+0x81) [0x7fc500324001]
2017-08-25 00:00:14.949455|e|2897: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_set_value+0x1593) [0x7fc5003368e3]
2017-08-25 00:00:14.949458|e|2897: /usr/lib/postgresql/9.5/lib/pgroonga.so(+0xa509) [0x7fc500b19509]
2017-08-25 00:00:14.949461|e|2897: /usr/lib/postgresql/9.5/lib/pgroonga.so(pgroonga_insert+0x5d) [0x7fc500b1cd3d]
2017-08-25 00:00:14.949463|e|2897: postgres: postgres kris 127.0.0.1(35116) UPDATE(FunctionCall6Coll+0xa6) [0x55977b275336]
2017-08-25 00:00:14.949466|e|2897: postgres: postgres kris 127.0.0.1(35116) UPDATE(index_insert+0xa5) [0x55977af5c7e5]
2017-08-25 00:00:14.949469|e|2897: postgres: postgres kris 127.0.0.1(35116) UPDATE(ExecInsertIndexTuples+0x2cc) [0x55977b07154c]
2017-08-25 00:00:14.949472|e|2897: postgres: postgres kris 127.0.0.1(35116) UPDATE(+0x20bc28) [0x55977b08cc28]
2017-08-25 00:00:14.949474|e|2897: postgres: postgres kris 127.0.0.1(35116) UPDATE(ExecModifyTable+0x3ff) [0x55977b08d34f]
2017-08-25 00:00:14.949477|e|2897: postgres: postgres kris 127.0.0.1(35116) UPDATE(ExecProcNode+0x188) [0x55977b075308]
2017-08-25 00:00:14.949480|e|2897: postgres: postgres kris 127.0.0.1(35116) UPDATE(standard_ExecutorRun+0x132) [0x55977b0721f2]
2017-08-25 00:00:14.949482|e|2897: postgres: postgres kris 127.0.0.1(35116) UPDATE(+0x2fb509) [0x55977b17c509]
2017-08-25 00:00:14.949493|e|2897: postgres: postgres kris 127.0.0.1(35116) UPDATE(+0x2fb751) [0x55977b17c751]
2017-08-25 00:00:14.949497|e|2897: postgres: postgres kris 127.0.0.1(35116) UPDATE(PortalRun+0x1c5) [0x55977b17d365]
2017-08-25 00:00:15.044271|n|2897: grn_fin (0)
2017-08-25 00:00:15.046691|n|2871: grn_fin (0)

pgroonga.log - During gdb dumps

2017-08-25 07:23:26.016228|n|7549: grn_init: <7.0.5>
2017-08-25 07:23:26.016444|n|7549: pgroonga: initialize: <2.0.1>
2017-08-25 07:24:04.073718|n|7656: grn_init: <7.0.5>
2017-08-25 07:24:04.073890|n|7656: pgroonga: initialize: <2.0.1>
2017-08-25 07:24:04.080385|n|7649: grn_init: <7.0.5>
2017-08-25 07:24:04.080554|n|7649: pgroonga: initialize: <2.0.1>
2017-08-25 07:25:36.582794|n|7267: grn_fin (0)
2017-08-25 07:26:13.763834|n|7656: io(base/721070/pgrn.000010A) collisions(1000/6): lock failed 1000 times
2017-08-25 07:26:15.721532|n|7549: io(base/721070/pgrn.000010B) collisions(1000/15): lock failed 1000 times
2017-08-25 07:26:51.521835|n|7552: grn_init: <7.0.5>
2017-08-25 07:26:51.522013|n|7552: pgroonga: initialize: <2.0.1>
2017-08-25 07:28:15.358505|n|7645: grn_init: <7.0.5>
2017-08-25 07:28:15.358705|n|7645: pgroonga: initialize: <2.0.1>
2017-08-25 07:28:17.911807|n|6861: grn_fin (0)
2017-08-25 07:28:19.181136|n|7649: grn_fin (0)
2017-08-25 07:28:20.664770|n|7552: grn_fin (0)
2017-08-25 07:28:21.673817|n|6471: grn_fin (0)
2017-08-25 07:28:24.691844|n|7515: grn_fin (0)
2017-08-25 07:28:25.165507|n|7645: grn_fin (0)
2017-08-25 07:28:41.216860|n|15328: grn_init: <7.0.5>
2017-08-25 07:28:41.217017|n|15328: pgroonga: initialize: <2.0.1>
2017-08-25 07:28:41.223549|n|15151: grn_init: <7.0.5>
2017-08-25 07:28:41.223599|n|15243: grn_init: <7.0.5>
2017-08-25 07:28:41.223706|n|15151: pgroonga: initialize: <2.0.1>
2017-08-25 07:28:41.223740|n|15243: pgroonga: initialize: <2.0.1>
2017-08-25 07:28:41.251087|n|8419: grn_init: <7.0.5>
2017-08-25 07:28:41.251269|n|8419: pgroonga: initialize: <2.0.1>
2017-08-25 07:28:53.451276|n|8422: grn_init: <7.0.5>
2017-08-25 07:28:53.451493|n|8422: pgroonga: initialize: <2.0.1>
2017-08-25 07:30:37.144532|n|17312: grn_init: <7.0.5>
2017-08-25 07:30:37.144619|n|8455: grn_init: <7.0.5>
2017-08-25 07:30:37.144711|n|17312: pgroonga: initialize: <2.0.1>
2017-08-25 07:30:37.144751|n|8455: pgroonga: initialize: <2.0.1>
2017-08-25 07:31:08.592904|n|15328: grn_fin (0)
2017-08-25 07:31:14.856259|n|15243: grn_fin (0)
2017-08-25 07:31:15.439751|n|15151: grn_fin (0)
2017-08-25 07:31:31.894644|w|19550: [DB Locked] time out(900000): io(base/721070/pgrn.000010A) collisions(900000/1169)
2017-08-25 07:31:31.894679|e|19550: grn_io_lock failed
2017-08-25 07:31:31.895577|e|19550: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_io_lock+0x28e) [0x7fc4ffcea57e]
2017-08-25 07:31:31.895583|e|19550: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ii_column_update+0x17a) [0x7fc4ffccc64a]
2017-08-25 07:31:31.895587|e|19550: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_default_set_value_hook+0x81) [0x7fc4ffba4001]
2017-08-25 07:31:31.895590|e|19550: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_set_value+0x1593) [0x7fc4ffbb68e3]
2017-08-25 07:31:31.895593|e|19550: /usr/lib/postgresql/9.5/lib/pgroonga.so(+0xa509) [0x7fc500399509]
2017-08-25 07:31:31.895596|e|19550: /usr/lib/postgresql/9.5/lib/pgroonga.so(pgroonga_insert+0x5d) [0x7fc50039cd3d]
2017-08-25 07:31:31.895599|e|19550: postgres: postgres kris 127.0.0.1(37300) UPDATE(FunctionCall6Coll+0xa6) [0x55977b275336]
2017-08-25 07:31:31.895602|e|19550: postgres: postgres kris 127.0.0.1(37300) UPDATE(index_insert+0xa5) [0x55977af5c7e5]
2017-08-25 07:31:31.895605|e|19550: postgres: postgres kris 127.0.0.1(37300) UPDATE(ExecInsertIndexTuples+0x2cc) [0x55977b07154c]
2017-08-25 07:31:31.895608|e|19550: postgres: postgres kris 127.0.0.1(37300) UPDATE(+0x20bc28) [0x55977b08cc28]
2017-08-25 07:31:31.895610|e|19550: postgres: postgres kris 127.0.0.1(37300) UPDATE(ExecModifyTable+0x3ff) [0x55977b08d34f]
2017-08-25 07:31:31.895613|e|19550: postgres: postgres kris 127.0.0.1(37300) UPDATE(ExecProcNode+0x188) [0x55977b075308]
2017-08-25 07:31:31.895616|e|19550: postgres: postgres kris 127.0.0.1(37300) UPDATE(standard_ExecutorRun+0x132) [0x55977b0721f2]
2017-08-25 07:31:31.895619|e|19550: postgres: postgres kris 127.0.0.1(37300) UPDATE(+0x2fb509) [0x55977b17c509]
2017-08-25 07:31:31.895622|e|19550: postgres: postgres kris 127.0.0.1(37300) UPDATE(+0x2fb751) [0x55977b17c751]
2017-08-25 07:31:31.895625|e|19550: postgres: postgres kris 127.0.0.1(37300) UPDATE(PortalRun+0x1c5) [0x55977b17d365]
2017-08-25 07:31:32.073361|n|3367: grn_fin (0)
2017-08-25 07:31:32.076623|n|19550: grn_fin (0)
2017-08-25 07:32:49.322944|w|13329: [DB Locked] time out(900000): io(base/721070/pgrn.000010A) collisions(11700000/12685)
2017-08-25 07:32:50.389490|n|13329: io(base/721070/pgrn.000010A) collisions(11701000/12686): lock failed 1000 times
2017-08-25 07:33:27.248419|n|8455: grn_fin (0)
2017-08-25 07:33:28.636726|n|8419: grn_fin (0)
2017-08-25 07:33:28.850517|n|8422: grn_fin (0)
2017-08-25 07:34:25.472861|n|17302: grn_init: <7.0.5>
2017-08-25 07:34:25.473035|n|17302: pgroonga: initialize: <2.0.1>
2017-08-25 07:34:25.478549|n|21762: grn_init: <7.0.5>
2017-08-25 07:34:25.478714|n|21762: pgroonga: initialize: <2.0.1>
2017-08-25 07:34:25.478719|n|17319: grn_init: <7.0.5>
2017-08-25 07:34:25.478854|n|17319: pgroonga: initialize: <2.0.1>
2017-08-25 07:34:44.252804|n|17302: grn_fin (0)
2017-08-25 07:34:44.282190|n|17319: grn_fin (0)
2017-08-25 07:34:44.310906|n|17312: grn_fin (0)
2017-08-25 07:35:50.869079|n|21866: grn_init: <7.0.5>
2017-08-25 07:35:50.869350|n|21866: pgroonga: initialize: <2.0.1>
2017-08-25 07:35:50.872307|n|21780: grn_init: <7.0.5>
2017-08-25 07:35:50.872485|n|21780: pgroonga: initialize: <2.0.1>
2017-08-25 07:35:50.873362|n|2291: grn_init: <7.0.5>
2017-08-25 07:35:50.873511|n|2291: pgroonga: initialize: <2.0.1>
2017-08-25 07:36:18.874340|n|28302: grn_init: <7.0.5>
2017-08-25 07:36:18.874570|n|28302: pgroonga: initialize: <2.0.1>
2017-08-25 07:36:18.887921|n|21819: grn_init: <7.0.5>
2017-08-25 07:36:18.888161|n|21819: pgroonga: initialize: <2.0.1>
2017-08-25 07:36:32.392562|n|21939: grn_init: <7.0.5>
2017-08-25 07:36:32.392752|n|21939: pgroonga: initialize: <2.0.1>
2017-08-25 07:36:32.393001|n|9697: grn_init: <7.0.5>
2017-08-25 07:36:32.393218|n|9697: pgroonga: initialize: <2.0.1>
2017-08-25 07:36:32.394438|n|5007: grn_init: <7.0.5>
2017-08-25 07:36:32.394581|n|5007: pgroonga: initialize: <2.0.1>
2017-08-25 07:36:44.509150|n|21780: io(base/721070/pgrn.000010B) collisions(1000/781068): lock failed 1000 times
2017-08-25 07:36:50.389387|n|28308: grn_init: <7.0.5>
2017-08-25 07:36:50.389572|n|28308: pgroonga: initialize: <2.0.1>
2017-08-25 07:38:08.270495|n|13329: io(base/721070/pgrn.000010A) collisions(12000000/12686)
2017-08-25 07:38:37.161725|n|21866: grn_fin (0)
2017-08-25 07:38:39.689540|n|4930: grn_init: <7.0.5>
2017-08-25 07:38:39.689683|n|4930: pgroonga: initialize: <2.0.1>
2017-08-25 07:38:39.700419|n|2344: grn_init: <7.0.5>
2017-08-25 07:38:39.700620|n|2344: pgroonga: initialize: <2.0.1>
2017-08-25 07:38:40.536234|n|21819: grn_fin (0)
2017-08-25 07:39:04.685827|n|2344: grn_fin (0)
2017-08-25 07:39:08.761590|n|4946: grn_init: <7.0.5>
2017-08-25 07:39:08.761825|n|4946: pgroonga: initialize: <2.0.1>
2017-08-25 07:39:08.775486|n|4956: grn_init: <7.0.5>
2017-08-25 07:39:08.775690|n|4956: pgroonga: initialize: <2.0.1>
2017-08-25 07:39:10.622359|n|2291: grn_fin (0)
2017-08-25 07:39:32.678150|n|4956: grn_fin (0)
2017-08-25 07:39:33.379420|n|4930: grn_fin (0)
2017-08-25 07:39:33.540383|n|5007: grn_fin (0)
2017-08-25 07:39:34.890948|n|4946: grn_fin (0)
2017-08-25 07:39:39.718756|n|7250: grn_init: <7.0.5>
2017-08-25 07:39:39.718925|n|7250: pgroonga: initialize: <2.0.1>
2017-08-25 07:39:39.725603|n|7234: grn_init: <7.0.5>
2017-08-25 07:39:39.725616|n|7251: grn_init: <7.0.5>
2017-08-25 07:39:39.725762|n|7234: pgroonga: initialize: <2.0.1>
2017-08-25 07:39:39.725769|n|7251: pgroonga: initialize: <2.0.1>
2017-08-25 07:39:52.381227|n|7251: grn_fin (0)
2017-08-25 07:39:53.919200|n|7250: grn_fin (0)
2017-08-25 07:39:59.361032|n|7234: grn_fin (0)
2017-08-25 07:40:16.489559|n|21924: grn_init: <7.0.5>
2017-08-25 07:40:16.489693|n|21924: pgroonga: initialize: <2.0.1>
2017-08-25 07:40:17.653706|n|21924: io(base/721070/pgrn.000010B) collisions(1000/18581): lock failed 1000 times
2017-08-25 07:40:26.152226|n|21940: grn_init: <7.0.5>
2017-08-25 07:40:26.152403|n|21940: pgroonga: initialize: <2.0.1>
2017-08-25 07:40:29.756298|n|21940: grn_fin (0)
2017-08-25 07:40:30.763833|n|21939: grn_fin (0)
2017-08-25 07:40:31.572498|n|28306: grn_init: <7.0.5>
2017-08-25 07:40:31.572472|n|29516: grn_init: <7.0.5>
2017-08-25 07:40:31.572688|n|29516: pgroonga: initialize: <2.0.1>
2017-08-25 07:40:31.572692|n|28306: pgroonga: initialize: <2.0.1>
2017-08-25 07:40:46.231021|n|28302: grn_fin (0)
2017-08-25 07:40:47.487645|n|28306: grn_fin (0)
2017-08-25 07:40:49.752250|n|29516: grn_fin (0)
2017-08-25 07:40:50.778160|n|28308: grn_fin (0)
2017-08-25 07:41:10.598515|n|9129: grn_init: <7.0.5>
2017-08-25 07:41:10.598691|n|9129: pgroonga: initialize: <2.0.1>
2017-08-25 07:41:10.605391|n|9116: grn_init: <7.0.5>
2017-08-25 07:41:10.605548|n|9116: pgroonga: initialize: <2.0.1>
2017-08-25 07:41:14.949802|n|9697: grn_fin (0)
2017-08-25 07:41:16.481706|n|9116: grn_fin (0)
2017-08-25 07:41:17.327716|n|9129: grn_fin (0)
2017-08-25 07:41:45.207412|n|14741: grn_init: <7.0.5>
2017-08-25 07:41:45.207582|n|14741: pgroonga: initialize: <2.0.1>
2017-08-25 07:41:45.214561|n|14884: grn_init: <7.0.5>
2017-08-25 07:41:45.214566|n|14764: grn_init: <7.0.5>
2017-08-25 07:41:45.214727|n|14764: pgroonga: initialize: <2.0.1>
2017-08-25 07:41:45.214728|n|14884: pgroonga: initialize: <2.0.1>
2017-08-25 07:42:08.414566|w|7656: [DB Locked] time out(900000): io(base/721070/pgrn.000010A) collisions(900000/6)
2017-08-25 07:42:08.414589|e|7656: grn_io_lock failed
2017-08-25 07:42:08.415266|e|7656: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_io_lock+0x28e) [0x7fc50044957e]
2017-08-25 07:42:08.415272|e|7656: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ii_column_update+0x17a) [0x7fc50042b64a]
2017-08-25 07:42:08.415275|e|7656: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_default_set_value_hook+0x81) [0x7fc500303001]
2017-08-25 07:42:08.415278|e|7656: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_set_value+0x1593) [0x7fc5003158e3]
2017-08-25 07:42:08.415281|e|7656: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(+0xd0229) [0x7fc500316229]
2017-08-25 07:42:08.415284|e|7656: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(+0xd0b08) [0x7fc500316b08]
2017-08-25 07:42:08.415287|e|7656: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(_grn_table_delete_by_id+0x62) [0x7fc500317052]
2017-08-25 07:42:08.415289|e|7656: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_table_delete_by_id+0xe0) [0x7fc5003171f0]
2017-08-25 07:42:08.415292|e|7656: /usr/lib/postgresql/9.5/lib/pgroonga.so(pgroonga_gettuple+0x6fa) [0x7fc500aff23a]
2017-08-25 07:42:08.415295|e|7656: postgres: postgres kris 127.0.0.1(40358) SELECT(FunctionCall2Coll+0x72) [0x55977b274f82]
2017-08-25 07:42:08.415297|e|7656: postgres: postgres kris 127.0.0.1(40358) SELECT(index_getnext_tid+0x56) [0x55977af5cba6]
2017-08-25 07:42:08.415300|e|7656: postgres: postgres kris 127.0.0.1(40358) SELECT(index_getnext+0x24) [0x55977af5cda4]
2017-08-25 07:42:08.415303|e|7656: postgres: postgres kris 127.0.0.1(40358) SELECT(+0x2067aa) [0x55977b0877aa]
2017-08-25 07:42:08.415305|e|7656: postgres: postgres kris 127.0.0.1(40358) SELECT(ExecScan+0x1a1) [0x55977b07c5e1]
2017-08-25 07:42:08.415308|e|7656: postgres: postgres kris 127.0.0.1(40358) SELECT(ExecProcNode+0x128) [0x55977b0752a8]
2017-08-25 07:42:08.415311|e|7656: postgres: postgres kris 127.0.0.1(40358) SELECT(ExecNestLoop+0x93) [0x55977b08ec83]
2017-08-25 07:42:09.474667|n|21780: io(base/721070/pgrn.000010A) collisions(307035/781069): lock failed 1000 times
2017-08-25 07:42:09.475735|n|21924: io(base/721070/pgrn.000010A) collisions(106350/18582): lock failed 1000 times
2017-08-25 07:42:09.524830|n|7656: io(base/721070/pgrn.000010A) collisions(901000/8): lock failed 1000 times
2017-08-25 07:43:06.058983|n|17177: grn_init: <7.0.5>
2017-08-25 07:43:06.059174|n|17177: pgroonga: initialize: <2.0.1>
2017-08-25 07:43:06.064272|n|17262: grn_init: <7.0.5>
2017-08-25 07:43:06.064442|n|17262: pgroonga: initialize: <2.0.1>
2017-08-25 07:43:06.064685|n|14758: grn_init: <7.0.5>
2017-08-25 07:43:06.064824|n|14758: pgroonga: initialize: <2.0.1>
2017-08-25 07:43:06.064816|n|23665: grn_init: <7.0.5>
2017-08-25 07:43:06.064943|n|23665: pgroonga: initialize: <2.0.1>
2017-08-25 07:43:54.541376|n|7656: io(base/721070/pgrn.000010A) collisions(1000000/8)
2017-08-25 07:44:29.875712|n|14741: io(base/721070/pgrn.000010B) collisions(1000/4): lock failed 1000 times
2017-08-25 07:44:49.940768|n|17257: grn_init: <7.0.5>
2017-08-25 07:44:49.941022|n|17257: pgroonga: initialize: <2.0.1>
2017-08-25 07:46:06.357791|n|14758: grn_fin (0)
2017-08-25 07:46:09.674770|n|14764: grn_fin (0)
2017-08-25 07:46:10.247582|n|14884: grn_fin (0)
2017-08-25 07:46:18.741281|n|17177: grn_fin (0)
2017-08-25 07:46:18.766318|n|17257: grn_fin (0)
2017-08-25 07:46:20.144977|n|17262: grn_fin (0)
2017-08-25 07:46:39.406337|n|23665: grn_fin (0)
2017-08-25 07:47:00.049720|n|31540: grn_init: <7.0.5>
2017-08-25 07:47:00.049904|n|31540: pgroonga: initialize: <2.0.1>
2017-08-25 07:47:00.056152|n|31818: grn_init: <7.0.5>
2017-08-25 07:47:00.056310|n|31818: pgroonga: initialize: <2.0.1>
2017-08-25 07:47:00.056678|n|31530: grn_init: <7.0.5>
2017-08-25 07:47:00.056818|n|31530: pgroonga: initialize: <2.0.1>
2017-08-25 07:47:00.059559|n|31535: grn_init: <7.0.5>
2017-08-25 07:47:00.059713|n|31535: pgroonga: initialize: <2.0.1>
2017-08-25 07:48:22.498304|n|31530: grn_fin (0)
2017-08-25 07:48:22.501201|n|31818: grn_fin (0)
2017-08-25 07:48:22.513573|n|21762: grn_fin (0)
2017-08-25 07:48:22.525498|n|31540: grn_fin (0)
2017-08-25 07:48:22.526307|n|31535: grn_fin (0)
2017-08-25 07:48:45.136101|w|13329: [DB Locked] time out(900000): io(base/721070/pgrn.000010A) collisions(12600000/12686)
2017-08-25 07:48:45.152779|n|13329: (13 same messages are truncated)
2017-08-25 07:48:45.153989|n|13329: grn_fin (0)

gdb - Autovacuum process 25656

(gdb) bt full
#0  0x00007fc9cc465c40 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x000055977b10e887 in poll (__timeout=30032, __nfds=2, __fds=0x7ffd0388cac0) at /usr/include/x86_64-linux-gnu/bits/poll2.h:41
No locals.
#2  WaitLatchOrSocket (latch=0x7fc9bbc09234, wakeEvents=wakeEvents@entry=25, sock=sock@entry=-1, timeout=30032) at pg_latch.c:333
        result = <optimized out>
        start_time = {tv_sec = 1503640010, tv_usec = 20306}
        cur_time = {tv_sec = 1, tv_usec = 0}
        cur_timeout = 30032
        pfds = {{fd = 10, events = 1, revents = 0}, {fd = 7, events = 1, revents = 0}, {fd = 59296496, events = 32765, revents = 0}}
        nfds = 2
        __func__ = "WaitLatchOrSocket"
#3  0x000055977b10eb62 in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=25, timeout=<optimized out>) at pg_latch.c:197
No locals.
#4  0x000055977b110399 in AutoVacLauncherMain (argv=0x0, argc=0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/autovacuum.c:599
        nap = {tv_sec = 30, tv_usec = 32149}
        current_time = 0
        can_launch = <optimized out>
        rc = <optimized out>
        local_sigjmp_buf = {{__jmpbuf = {94109098919092, -5146058679329745299, 94109098918016, 140504721901280, 0, 0, -5146058679359105427, -1422724242177124755}, __mask_was_saved = 1, __saved_mask = {__val = {18446744066192964103, 140724662750176, 0, 140724662750256, 140504692137807, 140504695287432, 
                140724662750192, 140724603453456, 140724662750272, 140724662750208, 140504691574850, 0, 140724662750304, 2048, 140504708480234, 0}}}}
#5  0x000055977b110719 in StartAutoVacLauncher () at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/autovacuum.c:391
        AutoVacPID = <optimized out>
        __func__ = "StartAutoVacLauncher"
#6  0x000055977b11eb62 in reaper (postgres_signal_arg=<optimized out>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:2781
        save_errno = 4
        pid = <optimized out>
        exitstatus = 0
        __func__ = "reaper"
#7  <signal handler called>
No locals.
#8  0x00007fc9cc46a8b3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
No locals.
#9  0x000055977af1dc1c in ServerLoop () at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:1652
        timeout = {tv_sec = 57, tv_usec = 737392}
        rmask = {fds_bits = {72, 0 <repeats 15 times>}}
        selres = <optimized out>
        now = <optimized out>
        readmask = {fds_bits = {72, 0 <repeats 15 times>}}
        last_lockfile_recheck_time = 1503610730
        last_touch_time = 1503610250
        __func__ = "ServerLoop"
#10 0x000055977b12042b in PostmasterMain (argc=5, argv=<optimized out>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:1296
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = 1 '\001'
        i = <optimized out>
        output_config_variable = <optimized out>
        __func__ = "PostmasterMain"
#11 0x000055977af1f372 in main (argc=5, argv=0x55977d5791f0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/main/main.c:228
No locals.
(gdb) quit
A debugging session is active.

	Inferior 1 [process 25656] will be detached.

gdb - Process 7656

(gdb) bt full
#0  0x00007fc9cc439d30 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x00007fc5002e84b9 in grn_nanosleep () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#2  0x00007fc5004493ec in grn_io_lock () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#3  0x00007fc50042b64a in grn_ii_column_update () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#4  0x00007fc500303001 in grn_obj_default_set_value_hook () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#5  0x00007fc5003158e3 in grn_obj_set_value () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#6  0x00007fc500316229 in ?? () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#7  0x00007fc500316b08 in ?? () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#8  0x00007fc500317052 in _grn_table_delete_by_id () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#9  0x00007fc5003171f0 in grn_table_delete_by_id () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#10 0x00007fc500aff23a in pgroonga_gettuple () from /usr/lib/postgresql/9.5/lib/pgroonga.so
No symbol table info available.
#11 0x000055977b274f82 in FunctionCall2Coll (flinfo=flinfo@entry=0x55977e55e6a0, collation=collation@entry=0, arg1=arg1@entry=94109160552400, arg2=arg2@entry=1) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/utils/fmgr/fmgr.c:1323
        fcinfo = {flinfo = 0x55977e55e6a0, context = 0x0, resultinfo = 0x0, fncollation = 0, isnull = 0 '\000', nargs = 2, arg = {94109160552400, 1, 94109160560536, 94109160559768, 0, 94109160560312, 140724662749200, 94109091378727, 10703112672, 94109160560536, 94109134320416, 1, 140724662750160, 94109094597175, 
            94109134320416, 0, 0, 1407374883553280, 94109160560536, 94109160559768, 1, 94109160560312, 0, 4294967295, 94109134024288, 1, 140504404547136, 94109134024288, 140724662749360, 94109093480836, 94109098793312, 1, 140724662749488, 94109093488297, 94109098980344, 140504362983184, 94109159451392, 1, 
            3107009407418542, 72057594037927936, 94107028422657, 347432, 18446603349046802049, 21715, 5428, 94107028428132, 140724662749568, 678, 3302889146704, 84, 10, 554050781378, 1, 140724662749567, 3096972068850127, 94109159451256, 94107028422781, 94109094614787, 140724662749792, 94109091334236, 320, 
            18729789737140224, 189231280, 140504692787360, 140484112348896, 4360869, 140724662749696, 94109093392080, 0, 140484112348896, 0, 140484710851200, 140484112348896, 1, 140724662749904, 94109093400936, 94109160543008, 6921, 94107028422768, 0, 94109141602368, 343524369235968, 0, 140724662749792, 
            94109160543008, 72057617644890688, 3096972068128383, 979877, 94107028502639, 94109160835108, 6, 94109160549088, 140724662749888, 17448787741933925, 140724662749920, 140484104377624, 140724662749888, 94109091631987, 140724662749920, 94109091632055}, 
          argnull = "\000\000PO\311\177\000\000\245\212B\000\000\000\000\000\060ˈ\003\375\177\000\000t\273\026{\227U\000\000\030U#\001\305\177\000\000\340\337\370~\227U\000\000\060ˈ\003\375\177\000\000з\025\177\227U\000\000 \270\025\177\227U\000\000$\270\025\177\227U\000\000з\025\177\227U\000\000 \270\025\177\227U\000\000\200ˈ\003"}
        result = <optimized out>
        __func__ = "FunctionCall2Coll"
#12 0x000055977af5cba6 in index_getnext_tid (scan=scan@entry=0x55977f15b7d0, direction=direction@entry=ForwardScanDirection) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/access/index/indexam.c:462
        procedure = 0x55977e55e6a0
        found = <optimized out>
        __func__ = "index_getnext_tid"
#13 0x000055977af5cda4 in index_getnext (scan=scan@entry=0x55977f15b7d0, direction=direction@entry=ForwardScanDirection) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/access/index/indexam.c:602
        heapTuple = <optimized out>
        tid = <optimized out>
#14 0x000055977b0877aa in IndexNext (node=node@entry=0x55977f159110) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeIndexscan.c:104
        estate = <optimized out>
        econtext = 0x55977f159320
        direction = ForwardScanDirection
        scandesc = 0x55977f15b7d0
        tuple = <optimized out>
        slot = 0x55977f15b3c0
#15 0x000055977b07c5e1 in ExecScanFetch (recheckMtd=0x55977b087720 <IndexRecheck>, accessMtd=0x55977b087760 <IndexNext>, node=0x55977f159110) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execScan.c:95
        estate = <optimized out>
#16 ExecScan (node=node@entry=0x55977f159110, accessMtd=0x55977b087760 <IndexNext>, recheckMtd=0x55977b087720 <IndexRecheck>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execScan.c:180
        econtext = 0x55977f159320
        qual = 0x0
        projInfo = 0x55977f15b550
        isDone = ExprSingleResult
        resultSlot = <optimized out>
#17 0x000055977b087eac in ExecIndexScan (node=node@entry=0x55977f159110) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeIndexscan.c:498
No locals.
#18 0x000055977b0752a8 in ExecProcNode (node=node@entry=0x55977f159110) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execProcnode.c:420
        result = <optimized out>
        __func__ = "ExecProcNode"
#19 0x000055977b08ec83 in ExecNestLoop (node=node@entry=0x55977f158ea0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeNestloop.c:123
        nl = 0x55977f05a508
        innerPlan = 0x55977f15a6b0
        outerPlan = 0x55977f159110
        outerTupleSlot = <optimized out>
        innerTupleSlot = <optimized out>
        joinqual = 0x0
        otherqual = 0x0
        econtext = 0x55977f158fb0
        lc = <optimized out>
#20 0x000055977b075378 in ExecProcNode (node=node@entry=0x55977f158ea0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execProcnode.c:469
        result = <optimized out>
        __func__ = "ExecProcNode"
#21 0x000055977b08a110 in ExecLimit (node=node@entry=0x55977f158b70) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeLimit.c:139
        direction = ForwardScanDirection
        slot = <optimized out>
        outerPlan = 0x55977f158ea0
        __func__ = "ExecLimit"
#22 0x000055977b0751e8 in ExecProcNode (node=node@entry=0x55977f158b70) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execProcnode.c:520
        result = <optimized out>
        __func__ = "ExecProcNode"
#23 0x000055977b0721f2 in ExecutePlan (dest=0x55977d5f6f40, direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT, planstate=0x55977f158b70, estate=0x55977f158a30) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execMain.c:1549
        slot = <optimized out>
        current_tuple_count = 3
#24 standard_ExecutorRun (queryDesc=0x55977f02aff0, direction=<optimized out>, count=0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execMain.c:337
        estate = 0x55977f158a30
        operation = CMD_SELECT
        dest = 0x55977d5f6f40
        sendTuples = <optimized out>
#25 0x000055977b17becf in PortalRunSelect (portal=portal@entry=0x55977d628850, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x55977d5f6f40) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/pquery.c:942
        queryDesc = 0x55977f02aff0
        direction = <optimized out>
        nprocessed = <optimized out>
        __func__ = "PortalRunSelect"
#26 0x000055977b17d49a in PortalRun (portal=portal@entry=0x55977d628850, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x55977d5f6f40, altdest=altdest@entry=0x55977d5f6f40, completionTag=completionTag@entry=0x7ffd0388d230 "")
    at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/pquery.c:786
        save_exception_stack = 0x7ffd0388d140
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {3, -5146058679231179155, 94109131829056, 9223372036854775807, 94109132032080, 140724662751488, -5146058679254247827, -1422724223374322067}, __mask_was_saved = 0, __saved_mask = {__val = {140724662751008, 140724662751072, 140504415136352, 94109159304432, 2, 291, 
                556953972052161, 88, 69, 0, 94109132032080, 140724662751056, 94109094709185, 3, 69, 140724662751088}}}}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0x55977d5e5bf0
        saveTopTransactionContext = 0x55977d5e5ae0
        saveActivePortal = 0x0
        saveResourceOwner = 0x55977d5e5bf0
        savePortalContext = 0x0
        saveMemoryContext = 0x55977d579530
        __func__ = "PortalRun"
#27 0x000055977b17a583 in exec_execute_message (max_rows=9223372036854775807, portal_name=0x55977d5f6b30 "") at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/postgres.c:1988
        completed = <optimized out>
        execute_is_fetch = 0 '\000'
        completionTag = "\000EGIN\000\000\060\000\000\000{\227U\000\000\260҈\003\375\177\000\000\321\313'{\227U\000\000\000\000\000\000\000\000\000\000@\256W}\227U\000\000\060\233Y}\227U\000\000\000\000\000\000\000\004\000"
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        dest = DestRemoteExecute
        receiver = 0x55977d5f6f40
        portal = 0x55977d628850
        sourceText = 0x55977f02acf0 "/* sq.fs=0 */select _a.id\n  from geraet _a\n  join sucheintrag _b on _a.id = _b.produkt\n where not _a.ldel and _a.imwebveroeffentlichen and (not _a.istonlinegesperrt or _a.istonlinegesperrt is null) an"...
        prepStmtName = 0x55977b38af0d "<unnamed>"
        portalParams = 0x55977f02af00
        msec_str = "`҈\003\375\177\000\000`҈\003\375\177\000\000@҈\003\375\177\000\000\346\344\t{\227U\000"
#28 PostgresMain (argc=<optimized out>, argv=argv@entry=0x55977d57ae40, dbname=0x55977d57ac00 "kris", username=<optimized out>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/postgres.c:4108
        portal_name = 0x55977d5f6b30 ""
        max_rows = <optimized out>
        firstchar = 2103406400
        input_message = {data = 0x55977d5f6b30 "", len = 5, maxlen = 1024, cursor = 5}
        local_sigjmp_buf = {{__jmpbuf = {140724662751552, -5146058679675775379, 94109131320288, 1, 0, 94109131481216, -5146058679233276307, -1422724221276645779}, __mask_was_saved = 1, __saved_mask = {__val = {0, 94109131481216, 140504692432200, 18446603349046799841, 206158430256, 140724662751888, 140724662751680, 
                24, 656, 2, 2, 1024, 140724662751840, 140724662752004, 0, 140724662751776}}}}
        send_ready_for_query = 0 '\000'
        __func__ = "PostgresMain"
#29 0x000055977af1e678 in BackendRun (port=0x55977d5a2080) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:4243
        ac = 1
        secs = 556953806
        usecs = 18812
        i = 1
        av = 0x55977d57ae40
        maxac = <optimized out>
#30 BackendStartup (port=0x55977d5a2080) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:3917
        bn = <optimized out>
        pid = <optimized out>
#31 ServerLoop () at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:1688
        rmask = {fds_bits = {8, 0 <repeats 15 times>}}
        selres = <optimized out>
        now = <optimized out>
        readmask = {fds_bits = {72, 0 <repeats 15 times>}}
        last_lockfile_recheck_time = 1503638599
        last_touch_time = 1503638119
        __func__ = "ServerLoop"
#32 0x000055977b12042b in PostmasterMain (argc=5, argv=<optimized out>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:1296
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = 1 '\001'
        i = <optimized out>
        output_config_variable = <optimized out>
        __func__ = "PostmasterMain"
#33 0x000055977af1f372 in main (argc=5, argv=0x55977d5791f0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/main/main.c:228
No locals.
(gdb) quit
A debugging session is active.

        Inferior 1 [process 7656] will be detached.

gdb - Process 7549

Process: 7549

(gdb) bt full
#0  0x00007fc9cc439d30 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x00007fc50071b4b9 in grn_nanosleep () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#2  0x00007fc50087c3ec in grn_io_lock () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#3  0x00007fc50074a1dc in grn_table_delete_by_id () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0
No symbol table info available.
#4  0x00007fc500f3223a in pgroonga_gettuple () from /usr/lib/postgresql/9.5/lib/pgroonga.so
No symbol table info available.
#5  0x000055977b274f82 in FunctionCall2Coll (flinfo=flinfo@entry=0x55977e06af80, collation=collation@entry=0, arg1=arg1@entry=94109156559488, arg2=arg2@entry=1) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/utils/fmgr/fmgr.c:1323
        fcinfo = {flinfo = 0x55977e06af80, context = 0x0, resultinfo = 0x0, fncollation = 0, isnull = 0 '\000', nargs = 2, arg = {94109156559488, 1, 94109156567624, 94109156566856, 0, 94109156567400, 140724662749056, 94109091378727, 12045119024, 94109156567624, 94109133744256, 1, 140724662750016, 94109094597175, 
            94109133744256, 0, 0, 1407374883553280, 94109156567624, 94109156566856, 1, 94109156567400, 0, 4294967295, 140504720172848, 1, 140504404246824, 140504720172848, 140724662749216, 94109093480836, 94109098793312, 1, 140724662749344, 94109093488297, 94109098980344, 140504362097488, 94109155878144, 1, 
            3107009407418542, 72057594037927936, 94107028422657, 347432, 18446603349046802193, 21715, 5428, 94107028428132, 140724662749424, 678, 3302889146560, 84, 10, 554050781378, 1, 140724662749423, 3096972068850127, 94109155878024, 94107028422781, 94109094614787, 140724662749648, 94109091334236, 320, 
            18729789737140224, 189231280, 140504692787360, 140484112348896, 4360869, 140724662749552, 94109093392080, 0, 140484112348896, 0, 140484710851200, 140484112348896, 1, 140724662749760, 94109093400936, 94109156549064, 6921, 94107028422768, 0, 140504720340112, 343524369235968, 0, 140724662749648, 
            94109156549064, 72057617640897872, 3096972068128383, 979877, 94107028502639, 94109155887188, 6, 94109156556272, 140724662749744, 17448787741933925, 140724662749776, 140484109693880, 140724662749744, 94109091631987, 140724662749776, 94109091632055}, 
          argnull = "\000\000PO\311\177\000\000\245\212B\000\000\000\000\000\240ʈ\003\375\177\000\000t\273\026{\227U\000\000\270st\001\305\177\000\000\220@\224~\227U\000\000\240ʈ\003\375\177\000\000\200\312\330~\227U\000\000\320\312\330~\227U\000\000\324\312\330~\227U\000\000\200\312\330~\227U\000\000\320\312\330~\22
7U\000\000\360ʈ\003"}
        result = <optimized out>
        __func__ = "FunctionCall2Coll"
#6  0x000055977af5cba6 in index_getnext_tid (scan=scan@entry=0x55977ed8ca80, direction=direction@entry=ForwardScanDirection) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/access/index/indexam.c:462
        procedure = 0x55977e06af80
        found = <optimized out>
        __func__ = "index_getnext_tid"
#7  0x000055977af5cda4 in index_getnext (scan=scan@entry=0x55977ed8ca80, direction=direction@entry=ForwardScanDirection) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/access/index/indexam.c:602
        heapTuple = <optimized out>
        tid = <optimized out>
#8  0x000055977b0877aa in IndexNext (node=node@entry=0x55977ed89fb8) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeIndexscan.c:104
        estate = <optimized out>
        econtext = 0x55977ed8a1c8
        direction = ForwardScanDirection
        scandesc = 0x55977ed8ca80
        tuple = <optimized out>
        slot = 0x55977ed8c670
#9  0x000055977b07c5e1 in ExecScanFetch (recheckMtd=0x55977b087720 <IndexRecheck>, accessMtd=0x55977b087760 <IndexNext>, node=0x55977ed89fb8) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execScan.c:95
        estate = <optimized out>
#10 ExecScan (node=node@entry=0x55977ed89fb8, accessMtd=0x55977b087760 <IndexNext>, recheckMtd=0x55977b087720 <IndexRecheck>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execScan.c:180
        econtext = 0x55977ed8a1c8
        qual = 0x0
        projInfo = 0x55977ed8c800
        isDone = ExprSingleResult
        resultSlot = <optimized out>
#11 0x000055977b087eac in ExecIndexScan (node=node@entry=0x55977ed89fb8) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeIndexscan.c:498
No locals.
#12 0x000055977b0752a8 in ExecProcNode (node=node@entry=0x55977ed89fb8) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execProcnode.c:420
        result = <optimized out>
        __func__ = "ExecProcNode"
#13 0x000055977b08ec83 in ExecNestLoop (node=node@entry=0x55977ed89df8) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeNestloop.c:123
        nl = 0x55977ed077b8
        innerPlan = 0x55977ed8b068
        outerPlan = 0x55977ed89fb8
        outerTupleSlot = <optimized out>
        innerTupleSlot = <optimized out>
        joinqual = 0x0
        otherqual = 0x0
        econtext = 0x55977ed89f08
        lc = <optimized out>
#14 0x000055977b075378 in ExecProcNode (node=0x55977ed89df8) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execProcnode.c:469
        result = <optimized out>
        __func__ = "ExecProcNode"
#15 0x000055977b080a11 in fetch_input_tuple (aggstate=aggstate@entry=0x55977ed89580) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeAgg.c:484
        slot = <optimized out>
#16 0x000055977b081d7f in agg_retrieve_direct (aggstate=0x55977ed89580) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeAgg.c:1749
        econtext = 0x55977ed89938
        pergroup = 0x55977ed91948
        firstSlot = 0x55977ed899e8
        numGroupingSets = 1
        node = 0x55977ed07560
        tmpcontext = 0x55977ed897a8
        peragg = 0x55977ed8ebe8
        outerslot = <optimized out>
        nextSetSize = <optimized out>
        result = <optimized out>
        hasGroupingSets = 0 '\000'
        currentSet = <optimized out>
        numReset = <optimized out>
        i = <optimized out>
#17 ExecAgg (node=node@entry=0x55977ed89580) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeAgg.c:1487
        result = <optimized out>
#18 0x000055977b0753d8 in ExecProcNode (node=node@entry=0x55977ed89580) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execProcnode.c:496
        result = <optimized out>
        __func__ = "ExecProcNode"
#19 0x000055977b0721f2 in ExecutePlan (dest=0x55977d5f6b30, direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT, planstate=0x55977ed89580, estate=0x55977ed89440) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execMain.c:1549
        slot = <optimized out>
        current_tuple_count = 0
#20 standard_ExecutorRun (queryDesc=0x55977d68b160, direction=<optimized out>, count=0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execMain.c:337
        estate = 0x55977ed89440
        operation = CMD_SELECT
        dest = 0x55977d5f6b30
        sendTuples = <optimized out>
#21 0x000055977b17becf in PortalRunSelect (portal=portal@entry=0x55977d648450, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x55977d5f6b30) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/pquery.c:942
        queryDesc = 0x55977d68b160
        direction = <optimized out>
        nprocessed = <optimized out>
        __func__ = "PortalRunSelect"
#22 0x000055977b17d49a in PortalRun (portal=portal@entry=0x55977d648450, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x55977d5f6b30, altdest=altdest@entry=0x55977d5f6b30, completionTag=completionTag@entry=0x7ffd0388d230 "")
    at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/pquery.c:786
        save_exception_stack = 0x7ffd0388d140
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {3, -5146058679231179155, 94109131828016, 9223372036854775807, 94109132162128, 140724662751488, -5146058679254247827, -1422724223374322067}, __mask_was_saved = 0, __saved_mask = {__val = {140724662751008, 140724662751072, 140504415133568, 94109132435040, 2, 275, 
                556953973924577, 88, 69, 0, 94109132162128, 140724662751056, 94109094709185, 3, 69, 140724662751088}}}}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0x55977d5e5bf0
        saveTopTransactionContext = 0x55977d5e5ae0
        saveActivePortal = 0x0
        saveResourceOwner = 0x55977d5e5bf0
        savePortalContext = 0x0
        saveMemoryContext = 0x55977d579530
        __func__ = "PortalRun"
#23 0x000055977b17a583 in exec_execute_message (max_rows=9223372036854775807, portal_name=0x55977d5f6720 "") at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/postgres.c:1988
        completed = <optimized out>
        execute_is_fetch = 0 '\000'
        completionTag = "\000EGIN\000\000\061\000\000j{\227U\000\000\260҈\003\375\177\000\000\321\313'{\227U\000\000\000\000\000\000\000\000\000\000@\256W}\227U\000\000\060\233Y}\227U\000\000\000\000\000\000\000\004\000"
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        dest = DestRemoteExecute
        receiver = 0x55977d5f6b30
        portal = 0x55977d648450
        sourceText = 0x55977d68ae60 "/* sq.fs=0 */select count(*)\n  from geraet _a\n  join sucheintrag _b on _a.id = _b.produkt\n where not _a.ldel and _a.imwebveroeffentlichen and (not _a.istonlinegesperrt or _a.istonlinegesperrt is null)"...
        prepStmtName = 0x55977b38af0d "<unnamed>"
        portalParams = 0x55977d68b070
        msec_str = "1184.674\000҈\003\375\177\000\000@҈\003\375\177\000\000\346\344\t{\227U\000"
#24 PostgresMain (argc=<optimized out>, argv=argv@entry=0x55977d57ae40, dbname=0x55977d57ac00 "kris", username=<optimized out>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/postgres.c:4108
        portal_name = 0x55977d5f6720 ""
        max_rows = <optimized out>
        firstchar = 2103405360
        input_message = {data = 0x55977d5f6720 "", len = 5, maxlen = 1024, cursor = 5}
        local_sigjmp_buf = {{__jmpbuf = {140724662751552, -5146058679675775379, 94109131320288, 1, 0, 94109131481216, -5146058679233276307, -1422724221276645779}, __mask_was_saved = 1, __saved_mask = {__val = {0, 94109131481216, 140504692432200, 18446603349046799841, 206158430256, 140724662751888, 140724662751680, 
                24, 656, 2, 2, 1024, 140724662751840, 140724662752004, 0, 140724662751776}}}}
        send_ready_for_query = 0 '\000'
        __func__ = "PostgresMain"
#25 0x000055977af1e678 in BackendRun (port=0x55977d5a2080) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:4243
        ac = 1
        secs = 556953805
        usecs = 878460
        i = 1
        av = 0x55977d57ae40
        maxac = <optimized out>
#26 BackendStartup (port=0x55977d5a2080) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:3917
        bn = <optimized out>
        pid = <optimized out>
#27 ServerLoop () at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:1688
        rmask = {fds_bits = {8, 0 <repeats 15 times>}}
        selres = <optimized out>
        now = <optimized out>
        readmask = {fds_bits = {72, 0 <repeats 15 times>}}
        last_lockfile_recheck_time = 1503638599
        last_touch_time = 1503638119
        __func__ = "ServerLoop"
#28 0x000055977b12042b in PostmasterMain (argc=5, argv=<optimized out>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:1296
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = 1 '\001'
        i = <optimized out>
        output_config_variable = <optimized out>
        __func__ = "PostmasterMain"
#29 0x000055977af1f372 in main (argc=5, argv=0x55977d5791f0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/main/main.c:228
No locals.
(gdb) quit
A debugging session is active.

        Inferior 1 [process 7549] will be detached.

@kou
Copy link
Member

kou commented Aug 25, 2017

Thanks.

The important log is here:

*** Error in `postgres: postgres kris 127.0.0.1(33416) UPDATE': corrupted size vs. prev_size: 0x000055978046d530 ***
2017-08-24 23:38:59 CEST [15949-10] LOG:  server process (PID 12649) was terminated by signal 6: Aborted

All IO lock failures are caused by this crash. You can recover from the problem by #48 (comment) .

I want "backtrace" for the crash. But it's not logged... The process (PID: 12649) didn't log backtrace to pgroonga.log...:

2017-08-24 23:38:26.477907|n|12649: grn_init: <7.0.5>
2017-08-24 23:38:26.478043|n|12649: pgroonga: initialize: <2.0.1>

@peter-schmitz
Copy link
Author

Happy new year.

It happened again, same 'effect' and behaviour, but includes a different error message this time too.

Timeline:
04:54 - vacuum sucheintrag started
04.56 - vacuum crashed
05:08 - the first update on sucheintrag after the previous crash

org.postgresql.util.PSQLException: ERROR: pgroonga: failed to set column value: grn_io_lock failed".
(several retries to update the same data, same error.)

07:52 - record ID is nul exception appeared

org.postgresql.util.PSQLException: ERROR: pgroonga: failed to set column value: [ii][colu
mn][update] record ID is nil

I should be able to send you the whole pgroonga.log per mail if you want it.

04:54 - vacuum sucheintrag started + 05:08 - the first update on sucheintrag after the previous crash
... 2018-01-12 04:57:33.952777|d|11623: data_size=314560, 29569 2018-01-12 04:57:34.031975|d|11623: flushed! b=0x7fe8683f8000 free=249136, seg(29) 2018-01-12 04:57:34.283548|d|11623: flushing! b=0x7fe86b9f8000 free=0, seg(76) 2018-01-12 04:57:34.368016|d|11623: flushed! b=0x7fe86e478000 free=247584, seg(76) 2018-01-12 04:57:34.511927|d|11623: flushing! b=0x7fe86dc38000 free=4, seg(52) 2018-01-12 04:57:34.585918|d|11623: flushed! b=0x7fe86ba78000 free=245728, seg(52) 2018-01-12 04:57:34.821323|d|11623: flushing! b=0x7fe86d578000 free=8, seg(10) 2018-01-12 04:57:34.844847|d|11623: data_size=335584, 24749 2018-01-12 04:57:34.952668|d|11623: flushed! b=0x7fe86dd78000 free=258416, seg(10) 2018-01-12 04:57:34.971329|d|11623: flushing! b=0x7fe86cbf8000 free=8, seg(74) 2018-01-12 04:57:35.070949|d|11623: flushed! b=0x7fe86e3f8000 free=248528, seg(74) 2018-01-12 04:57:35.234121|d|11623: flushing! b=0x7fe86dfb8000 free=0, seg(95) 2018-01-12 04:57:35.270139|d|11623: flushed! b=0x7fe868878000 free=252000, seg(95) 2018-01-12 04:57:35.299991|d|11623: flushing! b=0x7fe86e638000 free=0, seg(101) 2018-01-12 04:57:35.357503|d|11623: flushed! b=0x7fe86e278000 free=254688, seg(101) 2018-01-12 04:57:35.418765|d|11623: flushing! b=0x7fe8688b8000 free=0, seg(45) 2018-01-12 04:57:35.491947|d|11623: flushed! b=0x7fe86dd38000 free=247584, seg(45) 2018-01-12 04:57:35.599846|d|11623: flushing! b=0x7fe86e2f8000 free=0, seg(66) 2018-01-12 04:57:35.639419|d|11623: flushed! b=0x7fe86bab8000 free=254928, seg(66) 2018-01-12 04:57:35.662379|d|11623: flushing! b=0x7fe86e378000 free=4, seg(46) 2018-01-12 04:57:35.663191|C|11623: -- CRASHED!!! -- 2018-01-12 04:57:35.712405|C|11623: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(+0xa21ce) [0x7fe88a13a1ce] 2018-01-12 04:57:35.712432|C|11623: /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7feccf74dcb0] 2018-01-12 04:57:35.712435|C|11623: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(+0x1c7e0a) [0x7fe88a25fe0a] 2018-01-12 04:57:35.712438|C|11623: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(+0x1ce6e0) [0x7fe88a2666e0] 2018-01-12 04:57:35.712441|C|11623: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(+0x1d0fed) [0x7fe88a268fed] 2018-01-12 04:57:35.712444|C|11623: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ii_column_update+0x58b) [0x7fe88a27c5fb] 2018-01-12 04:57:35.712447|C|11623: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_default_set_value_hook+0x81) [0x7fe88a14fa91] 2018-01-12 04:57:35.712450|C|11623: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_set_value+0x1593) [0x7fe88a162763] 2018-01-12 04:57:35.712453|C|11623: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(+0xcb0a9) [0x7fe88a1630a9] 2018-01-12 04:57:35.712455|C|11623: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(+0xcb988) [0x7fe88a163988] 2018-01-12 04:57:35.712458|C|11623: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_table_cursor_delete+0x7a) [0x7fe88a16726a] 2018-01-12 04:57:35.712461|C|11623: /usr/lib/postgresql/9.5/lib/pgroonga.so(+0xc3b1) [0x7fe88a9543b1] 2018-01-12 04:57:35.712463|C|11623: postgres: postgres kris [local] VACUUM(FunctionCall4Coll+0x88) [0x55b9cf2a1148] 2018-01-12 04:57:35.712471|C|11623: postgres: postgres kris [local] VACUUM(index_bulk_delete+0x50) [0x55b9cef88ee0] 2018-01-12 04:57:35.712475|C|11623: postgres: postgres kris [local] VACUUM(+0x1ea140) [0x55b9cf097140] 2018-01-12 04:57:35.712477|C|11623: postgres: postgres kris [local] VACUUM(lazy_vacuum_rel+0x1075) [0x55b9cf0989d5] 2018-01-12 04:57:35.712480|C|11623: postgres: postgres kris [local] VACUUM(+0x1e8cb6) [0x55b9cf095cb6] 2018-01-12 04:57:35.712483|C|11623: postgres: postgres kris [local] VACUUM(vacuum+0x27e) [0x55b9cf096b1e] 2018-01-12 04:57:35.712486|C|11623: postgres: postgres kris [local] VACUUM(ExecVacuum+0x5b) [0x55b9cf096eab] 2018-01-12 04:57:35.712489|C|11623: postgres: postgres kris [local] VACUUM(standard_ProcessUtility+0x417) [0x55b9cf1aad47] 2018-01-12 04:57:35.712491|C|11623: postgres: postgres kris [local] VACUUM(+0x2fab51) [0x55b9cf1a7b51] 2018-01-12 04:57:35.712494|C|11623: postgres: postgres kris [local] VACUUM(+0x2fb796) [0x55b9cf1a8796] 2018-01-12 04:57:35.712497|C|11623: postgres: postgres kris [local] VACUUM(PortalRun+0x1c5) [0x55b9cf1a9365] 2018-01-12 04:57:35.712499|C|11623: postgres: postgres kris [local] VACUUM(PostgresMain+0x208d) [0x55b9cf1a6e5d] 2018-01-12 04:57:35.712502|C|11623: postgres: postgres kris [local] VACUUM(+0x9d678) [0x55b9cef4a678] 2018-01-12 04:57:35.712505|C|11623: postgres: postgres kris [local] VACUUM(PostmasterMain+0x104b) [0x55b9cf14c42b] 2018-01-12 04:57:35.712507|C|11623: postgres: postgres kris [local] VACUUM(main+0x842) [0x55b9cef4b372] 2018-01-12 04:57:35.712511|C|11623: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7feccf738f45] 2018-01-12 04:57:35.712513|C|11623: postgres: postgres kris [local] VACUUM(+0x9e3dd) [0x55b9cef4b3dd] 2018-01-12 04:57:35.712517|C|11623: ---------------- 2018-01-12 04:59:36.448982|n|24043: grn_fin (0) 2018-01-12 05:08:38.686864|n|24878: grn_init: <7.1.0> 2018-01-12 05:08:38.687028|n|24878: pgroonga: initialize: <2.0.2> 2018-01-12 05:08:42.499541|n|24878: io(base/721070/pgrn.0000109) collisions(1000/24): lock failed 1000 times 2018-01-12 05:09:55.100385|n|13227: grn_init: <7.1.0> 2018-01-12 05:09:55.100420|n|13704: grn_init: <7.1.0> 2018-01-12 05:09:55.100553|n|13704: pgroonga: initialize: <2.0.2> 2018-01-12 05:09:55.100554|n|13227: pgroonga: initialize: <2.0.2> 2018-01-12 05:09:55.102230|n|16260: grn_init: <7.1.0> 2018-01-12 05:09:55.102372|n|16260: pgroonga: initialize: <2.0.2> 2018-01-12 05:09:55.123125|i|16260: [object][search][index][key][exact] <Lexicon11251263_0.index> 2018-01-12 05:09:55.123150|i|16260: grn_ii_sel > (pch615b90e) 2018-01-12 05:09:55.123206|i|16260: n=8 (pch615b90e) 2018-01-12 05:09:55.123316|d|16260: [ii][cursor][min] skip: 0x55b9cfd43b00: min(0->3512748): chunk(1->2): chunk-used(true->true) 2018-01-12 05:09:55.123344|d|16260: [ii][cursor][min] skip: 0x55b9cfd022a0: min(0->3512748): chunk(1->2): chunk-used(true->true) 2018-01-12 05:09:55.123389|d|16260: [ii][cursor][min] skip: 0x55b9cfd30c10: min(0->3512748): chunk(1->16): chunk-used(true->true) 2018-01-12 05:09:55.123505|d|16260: [ii][cursor][min] skip: 0x55b9cfd27230: min(0->3840154): chunk(1->46): chunk-used(true->true) 2018-01-12 05:09:55.124373|i|16260: exact: 7 2018-01-12 05:09:55.124380|i|16260: hits=7 2018-01-12 05:09:55.138365|n|16261: grn_init: <7.1.0> 2018-01-12 05:09:55.138531|n|16261: pgroonga: initialize: <2.0.2>

2018-01-12 05:24:38.168786|w|24878: [DB Locked] time out(900000): io(base/721070/pgrn.0000109) collisions(900000/24) 2018-01-12 05:24:38.168830|e|24878: grn_io_lock failed 2018-01-12 05:24:38.169725|e|24878: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_io_lock+0x28e) [0x7fe888e204ae] 2018-01-12 05:24:38.169732|e|24878: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ii_column_update+0x17a) [0x7fe888e021ea] 2018-01-12 05:24:38.169735|e|24878: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_default_set_value_hook+0x81) [0x7fe888cd5a91] 2018-01-12 05:24:38.169738|e|24878: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_set_value+0x1593) [0x7fe888ce8763] 2018-01-12 05:24:38.169741|e|24878: /usr/lib/postgresql/9.5/lib/pgroonga.so(+0xa649) [0x7fe8894d8649] 2018-01-12 05:24:38.169743|e|24878: /usr/lib/postgresql/9.5/lib/pgroonga.so(pgroonga_insert+0x5d) [0x7fe8894dbfcd] 2018-01-12 05:24:38.169746|e|24878: postgres: postgres kris 127.0.0.1(46532) UPDATE(FunctionCall6Coll+0xa6) [0x55b9cf2a1336] 2018-01-12 05:24:38.169749|e|24878: postgres: postgres kris 127.0.0.1(46532) UPDATE(index_insert+0xa5) [0x55b9cef887e5] 2018-01-12 05:24:38.169752|e|24878: postgres: postgres kris 127.0.0.1(46532) UPDATE(ExecInsertIndexTuples+0x2cc) [0x55b9cf09d54c] 2018-01-12 05:24:38.169754|e|24878: postgres: postgres kris 127.0.0.1(46532) UPDATE(+0x20bc28) [0x55b9cf0b8c28] 2018-01-12 05:24:38.169757|e|24878: postgres: postgres kris 127.0.0.1(46532) UPDATE(ExecModifyTable+0x3ff) [0x55b9cf0b934f] 2018-01-12 05:24:38.169760|e|24878: postgres: postgres kris 127.0.0.1(46532) UPDATE(ExecProcNode+0x188) [0x55b9cf0a1308] 2018-01-12 05:24:38.169763|e|24878: postgres: postgres kris 127.0.0.1(46532) UPDATE(standard_ExecutorRun+0x132) [0x55b9cf09e1f2] 2018-01-12 05:24:38.169765|e|24878: postgres: postgres kris 127.0.0.1(46532) UPDATE(+0x2fb509) [0x55b9cf1a8509] 2018-01-12 05:24:38.169768|e|24878: postgres: postgres kris 127.0.0.1(46532) UPDATE(+0x2fb751) [0x55b9cf1a8751] 2018-01-12 05:24:38.169771|e|24878: postgres: postgres kris 127.0.0.1(46532) UPDATE(PortalRun+0x1c5) [0x55b9cf1a9365] 2018-01-12 05:24:38.297806|n|24878: grn_fin (0)

I am currently trying to extract the backtrace from the crash file created by apport. Will add information if successfull.

07:52 - record ID is nul exception appeared the first time**
2018-01-12 07:52:24.019909|w|32593: [DB Locked] time out(900000): io(base/721070/pgrn.0000105) collisions(900000/33) 2018-01-12 07:52:24.019940|e|32593: grn_io_lock failed 2018-01-12 07:52:24.031849|e|32593: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_io_lock+0x28e) [0x7fe888f034ae] 2018-01-12 07:52:24.031873|e|32593: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_table_add+0x4c1) [0x7fe888db9fa1] 2018-01-12 07:52:24.031877|e|32593: /usr/lib/postgresql/9.5/lib/pgroonga.so(+0xa4da) [0x7fe8895bb4da] 2018-01-12 07:52:24.031879|e|32593: /usr/lib/postgresql/9.5/lib/pgroonga.so(pgroonga_insert+0x5d) [0x7fe8895befcd] 2018-01-12 07:52:24.031882|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(FunctionCall6Coll+0xa6) [0x55b9cf2a1336] 2018-01-12 07:52:24.031886|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(index_insert+0xa5) [0x55b9cef887e5] 2018-01-12 07:52:24.031888|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(ExecInsertIndexTuples+0x2cc) [0x55b9cf09d54c] 2018-01-12 07:52:24.031891|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(+0x20bc28) [0x55b9cf0b8c28] 2018-01-12 07:52:24.031894|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(ExecModifyTable+0x3ff) [0x55b9cf0b934f] 2018-01-12 07:52:24.031903|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(ExecProcNode+0x188) [0x55b9cf0a1308] 2018-01-12 07:52:24.031907|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(standard_ExecutorRun+0x132) [0x55b9cf09e1f2] 2018-01-12 07:52:24.031910|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(+0x2fb509) [0x55b9cf1a8509] 2018-01-12 07:52:24.031912|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(+0x2fb751) [0x55b9cf1a8751] 2018-01-12 07:52:24.031916|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(PortalRun+0x1c5) [0x55b9cf1a9365] 2018-01-12 07:52:24.031918|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(PostgresMain+0x17b3) [0x55b9cf1a6583] 2018-01-12 07:52:24.031921|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(+0x9d678) [0x55b9cef4a678] 2018-01-12 07:52:24.031951|e|32593: [ii][column][update] record ID is nil 2018-01-12 07:52:24.032791|e|32593: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ii_column_update+0x5be) [0x7fe888ee562e] 2018-01-12 07:52:24.032796|e|32593: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_default_set_value_hook+0x81) [0x7fe888db8a91] 2018-01-12 07:52:24.032799|e|32593: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_set_value+0x1593) [0x7fe888dcb763] 2018-01-12 07:52:24.032801|e|32593: /usr/lib/postgresql/9.5/lib/pgroonga.so(+0xa649) [0x7fe8895bb649] 2018-01-12 07:52:24.032805|e|32593: /usr/lib/postgresql/9.5/lib/pgroonga.so(pgroonga_insert+0x5d) [0x7fe8895befcd] 2018-01-12 07:52:24.032808|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(FunctionCall6Coll+0xa6) [0x55b9cf2a1336] 2018-01-12 07:52:24.032810|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(index_insert+0xa5) [0x55b9cef887e5] 2018-01-12 07:52:24.032814|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(ExecInsertIndexTuples+0x2cc) [0x55b9cf09d54c] 2018-01-12 07:52:24.032818|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(+0x20bc28) [0x55b9cf0b8c28] 2018-01-12 07:52:24.032820|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(ExecModifyTable+0x3ff) [0x55b9cf0b934f] 2018-01-12 07:52:24.032823|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(ExecProcNode+0x188) [0x55b9cf0a1308] 2018-01-12 07:52:24.032826|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(standard_ExecutorRun+0x132) [0x55b9cf09e1f2] 2018-01-12 07:52:24.032828|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(+0x2fb509) [0x55b9cf1a8509] 2018-01-12 07:52:24.032831|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(+0x2fb751) [0x55b9cf1a8751] 2018-01-12 07:52:24.032834|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(PortalRun+0x1c5) [0x55b9cf1a9365] 2018-01-12 07:52:24.032836|e|32593: postgres: postgres kris 127.0.0.1(48044) UPDATE(PostgresMain+0x17b3) [0x55b9cf1a6583] 2018-01-12 07:52:24.765289|n|31407: grn_fin (0) 2018-01-12 07:52:24.768970|n|32593: grn_fin (0)

And backtrace of a blocked update process that later prints out that message:

gdb -p 31187 //timestamp: 2018-01-12 08:58:42 +0100

Program received signal SIGINT, Interrupt. 0x00007feccf7dbd30 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81 81 in ../sysdeps/unix/syscall-template.S #0 0x00007feccf7dbd30 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81 #1 0x00007fe889c2d789 in grn_nanosleep () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0 #2 0x00007fe889d8d31c in grn_io_lock () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0 #3 0x00007fe889c43fa1 in grn_table_add () from /usr/lib/x86_64-linux-gnu/libgroonga.so.0 #4 0x00007fe88a4454da in ?? () from /usr/lib/postgresql/9.5/lib/pgroonga.so #5 0x00007fe88a448fcd in pgroonga_insert () from /usr/lib/postgresql/9.5/lib/pgroonga.so #6 0x000055b9cf2a1336 in FunctionCall6Coll (flinfo=flinfo@entry=0x55b9cfdaca00, collation=collation@entry=0, arg1=arg1@entry=140636734764944, arg2=arg2@entry=140731194618512, arg3=arg3@entry=140731194618768, arg4=arg4@entry=9425658788630 0, arg5=140636734382304, arg6=0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/utils/fmgr/fmgr.c:1436 #7 0x000055b9cef887e5 in index_insert (indexRelation=indexRelation@entry=0x7fe88a9e3790, values=values@entry=0x7ffe88dd3290, isnull=isnull@entry=0x7ffe88dd3390 "", heap_t_ctid=heap_t_ctid@entry=0x55b9d2704edc, heapRelation=heapRelation@e ntry=0x7fe88a9860e0, checkUnique=checkUnique@entry=UNIQUE_CHECK_NO) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/access/index/indexam.c:226 #8 0x000055b9cf09d54c in ExecInsertIndexTuples (slot=slot@entry=0x55b9d2704c18, tupleid=tupleid@entry=0x55b9d2704edc, estate=estate@entry=0x55b9d2ee5e20, noDupErr=noDupErr@entry=0 '\000', specConflict=specConflict@entry=0x0, arbiterIndex es=arbiterIndexes@entry=0x0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execIndexing.c:388 #9 0x000055b9cf0b8c28 in ExecUpdate (tupleid=tupleid@entry=0x7ffe88dd3570, oldtuple=oldtuple@entry=0x0, slot=slot@entry=0x55b9d2704c18, planSlot=planSlot@entry=0x55b9d2ee7d60, epqstate=epqstate@entry=0x55b9d2ee6110, estate=estate@entry=0 x55b9d2ee5e20, canSetTag=1 '\001') at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeModifyTable.c:1000 #10 0x000055b9cf0b934f in ExecModifyTable (node=node@entry=0x55b9d2ee6070) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeModifyTable.c:1467 #11 0x000055b9cf0a1308 in ExecProcNode (node=node@entry=0x55b9d2ee6070) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execProcnode.c:389 #12 0x000055b9cf09e1f2 in ExecutePlan (dest=0x55b9cf6c0140 <donothingDR>, direction=<optimized out>, numberTuples=0, sendTuples=0 '\000', operation=CMD_UPDATE, planstate=0x55b9d2ee6070, estate=0x55b9d2ee5e20) at /home/apt/build/postgresql -9.5-9.5.4/build/../src/backend/executor/execMain.c:1549 #13 standard_ExecutorRun (queryDesc=0x55b9d203f308, direction=<optimized out>, count=0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execMain.c:337 #14 0x000055b9cf1a8509 in ProcessQuery (plan=<optimized out>, sourceText=0x55b9d203f050 "UPDATE \"sucheintrag\" SET \"crea\"=$1,\"lmod\"=$2,\"ldel\"=$3,\"bot\"=$4,\"mandant\"=$5,\"volltextn\"=$6,\"volltexta\"=$7,\"volltextgeraetn\"=$8,\"v olltextgeraeta\"=$9,\"suchklasse\"=$10,\"dirtyvolltext\"=$11,\"produkt\"=$12 "..., params=0x55b9d203f160, dest=0x55b9cf6c0140 <donothingDR>, completionTag=0x7ffe88dd3b70 "") at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tco p/pquery.c:185 #15 0x000055b9cf1a8751 in PortalRunMulti (portal=portal@entry=0x55b9cf92e970, isTopLevel=isTopLevel@entry=1 '\001', dest=0x55b9cf6c0140 <donothingDR>, dest@entry=0x55b9cf8dd050, altdest=0x55b9cf6c0140 <donothingDR>, altdest@entry=0x55b9cf 8dd050, completionTag=completionTag@entry=0x7ffe88dd3b70 "") at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/pquery.c:1275 #16 0x000055b9cf1a9365 in PortalRun (portal=portal@entry=0x55b9cf92e970, count=count@entry=1, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x55b9cf8dd050, altdest=altdest@entry=0x55b9cf8dd050, completionTag=completionTag@entry=0x 7ffe88dd3b70 "") at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/pquery.c:812 #17 0x000055b9cf1a6583 in exec_execute_message (max_rows=1, portal_name=0x55b9cf8dcc40 "") at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/postgres.c:1988 #18 PostgresMain (argc=<optimized out>, argv=argv@entry=0x55b9cf860650, dbname=0x55b9cf860410 "kris", username=<optimized out>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/postgres.c:4108 #19 0x000055b9cef4a678 in BackendRun (port=0x55b9cf8883d0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:4243 #20 BackendStartup (port=0x55b9cf8883d0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:3917 #21 ServerLoop () at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:1688 #22 0x000055b9cf14c42b in PostmasterMain (argc=5, argv=<optimized out>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:1296 #23 0x000055b9cef4b372 in main (argc=5, argv=0x55b9cf85f1e0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/main/main.c:228

pgroonga log of the backtraced pid
(result from a previous select) 2018-01-12 08:50:51.547631|i|31187: partial: 0 2018-01-12 08:50:51.547666|i|31187: hits=0 ... 2018-01-12 08:50:54.557449|n|31187: io(base/721070/pgrn.0000105) collisions(1000/21): lock failed 1000 times ... 2018-01-12 09:08:25.143670|w|31187: [DB Locked] time out(900000): io(base/721070/pgrn.0000105) collisions(900000/21) 2018-01-12 09:08:25.143713|e|31187: grn_io_lock failed 2018-01-12 09:08:25.144766|e|31187: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_io_lock+0x28e) [0x7fe889d8d4ae] 2018-01-12 09:08:25.144782|e|31187: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_table_add+0x4c1) [0x7fe889c43fa1] 2018-01-12 09:08:25.144786|e|31187: /usr/lib/postgresql/9.5/lib/pgroonga.so(+0xa4da) [0x7fe88a4454da] 2018-01-12 09:08:25.144790|e|31187: /usr/lib/postgresql/9.5/lib/pgroonga.so(pgroonga_insert+0x5d) [0x7fe88a448fcd] 2018-01-12 09:08:25.144794|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(FunctionCall6Coll+0xa6) [0x55b9cf2a1336] 2018-01-12 09:08:25.144798|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(index_insert+0xa5) [0x55b9cef887e5] 2018-01-12 09:08:25.144802|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(ExecInsertIndexTuples+0x2cc) [0x55b9cf09d54c] 2018-01-12 09:08:25.144807|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(+0x20bc28) [0x55b9cf0b8c28] 2018-01-12 09:08:25.144811|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(ExecModifyTable+0x3ff) [0x55b9cf0b934f] 2018-01-12 09:08:25.144815|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(ExecProcNode+0x188) [0x55b9cf0a1308] 2018-01-12 09:08:25.144818|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(standard_ExecutorRun+0x132) [0x55b9cf09e1f2] 2018-01-12 09:08:25.144822|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(+0x2fb509) [0x55b9cf1a8509] 2018-01-12 09:08:25.144826|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(+0x2fb751) [0x55b9cf1a8751] 2018-01-12 09:08:25.144830|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(PortalRun+0x1c5) [0x55b9cf1a9365] 2018-01-12 09:08:25.144833|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(PostgresMain+0x17b3) [0x55b9cf1a6583] 2018-01-12 09:08:25.144837|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(+0x9d678) [0x55b9cef4a678] 2018-01-12 09:08:25.144883|e|31187: [ii][column][update] record ID is nil 2018-01-12 09:08:25.145876|e|31187: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ii_column_update+0x5be) [0x7fe889d6f62e] 2018-01-12 09:08:25.145900|e|31187: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_default_set_value_hook+0x81) [0x7fe889c42a91] 2018-01-12 09:08:25.145905|e|31187: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_obj_set_value+0x1593) [0x7fe889c55763] 2018-01-12 09:08:25.145909|e|31187: /usr/lib/postgresql/9.5/lib/pgroonga.so(+0xa649) [0x7fe88a445649] 2018-01-12 09:08:25.145914|e|31187: /usr/lib/postgresql/9.5/lib/pgroonga.so(pgroonga_insert+0x5d) [0x7fe88a448fcd] 2018-01-12 09:08:25.145918|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(FunctionCall6Coll+0xa6) [0x55b9cf2a1336] 2018-01-12 09:08:25.145942|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(index_insert+0xa5) [0x55b9cef887e5] 2018-01-12 09:08:25.145947|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(ExecInsertIndexTuples+0x2cc) [0x55b9cf09d54c] 2018-01-12 09:08:25.145952|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(+0x20bc28) [0x55b9cf0b8c28] 2018-01-12 09:08:25.145956|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(ExecModifyTable+0x3ff) [0x55b9cf0b934f] 2018-01-12 09:08:25.145960|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(ExecProcNode+0x188) [0x55b9cf0a1308] 2018-01-12 09:08:25.145964|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(standard_ExecutorRun+0x132) [0x55b9cf09e1f2] 2018-01-12 09:08:25.145968|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(+0x2fb509) [0x55b9cf1a8509] 2018-01-12 09:08:25.145973|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(+0x2fb751) [0x55b9cf1a8751] 2018-01-12 09:08:25.145977|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(PortalRun+0x1c5) [0x55b9cf1a9365] 2018-01-12 09:08:25.145981|e|31187: postgres: postgres kris 127.0.0.1(52378) UPDATE(PostgresMain+0x17b3) [0x55b9cf1a6583] 2018-01-12 09:08:27.453489|n|31187: grn_fin (0)

@kou
Copy link
Member

kou commented Jan 12, 2018

Thanks.
Can you install libgroonga0-dbgsym and postgresql-9.5-pgroonga-dbgsym? They will add more information to backtrace.

@kou
Copy link
Member

kou commented Jan 12, 2018

Please send log to kou@cozmixng.org.

@anthonydafc
Copy link

Hello, i have exactly the same problem with 2 different database'. I tried to follow the steps above.
"Here is the right way to recover the IO lock:

Stop PostgreSQL.
Remove pgrn* files in data directory.
Start PostgreSQL.
Run REINDEX INDEX index_name; for all PGroonga indexes."

For my first database, it work ok. Hasn't blocked yet. For the second, the reindex won't work, I had to delete the index, copy the table, create index on new table and copy lines from old table to new one.

2 weeks later, i get the grn_io_lock failed message in the logs. Have you found a solution since. I'm using. Is there a better solution ? thanks for your help

@kou
Copy link
Member

kou commented Mar 6, 2018

Can you attach your pgroonga.log?

@anthonydafc
Copy link

Here's the log
pgroonga.log

@kou
Copy link
Member

kou commented Mar 7, 2018

Thanks. Can you also show your CREATE INDEX ... USING pgroonga ... and definition of PGroonga indexed table?

@kou
Copy link
Member

kou commented Mar 7, 2018

And what is your platform?

@anthonydafc
Copy link

My create index is : CREATE INDEX pgroonga_scrore_idxregexp
    ON public.trecherchepages USING pgroonga
    (fid_recherchepages, ftexte pgroonga_text_regexp_ops_v2)

The definition of the indexed table has been sent on your mail.

My platform is Postres 10.3.1 and PGronnga is 2.0.2

@kou
Copy link
Member

kou commented Mar 8, 2018

Thanks.
It seems that your data are too large for pgroonga_text_regexp_ops_v2.

Can you try removing pgroonga_text_regexp_ops_v2 or table partitioning https://www.postgresql.org/docs/10/static/ddl-partitioning.html ?

BTW, it seems that PGroonga is crashed sometimes but there are no backtrace in pgroonga.log... Can you try with make DEBUG=1? (You aren't use PGroonga package. You're using self build PGroonga. Right?)

@anthonydafc
Copy link

So....we've partitioned the table now. It seems better. but could you explain what "too many postings" means i pgroonga.log. Here's a sample of the log
pgroonga.log

@kou
Copy link
Member

kou commented Mar 19, 2018

Do you know "posting list" in inverted index?
See also: https://en.wikipedia.org/wiki/Inverted_index

Groonga needs to keep posting list (list of position information (document ID and location pair) of the token) per token. The space of each posting list has limit. We can't store postings over the limit.

@peter-schmitz
Copy link
Author

peter-schmitz commented Mar 22, 2018

I might have the backtrace you requested, not from the exact situation described above (see issue #61), but possibly similar:

I was running a "reindex index zga;" (was executed on a different process 608) while the server wanted to insert/update some data. Inserting data / Writes currently causes the table to be locked (normally only for a short time).

During that time, several search queries accessed the index in a read only fashion too and I think overall CPU usage was relatively high as well.

gdb -p 1088

(gdb) bt
#0  0x00007f1c4999ed30 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1803763569 in grn_nanosleep (nanoseconds=nanoseconds@entry=1000000) at ctx.c:135
#2  0x00007f18038c444c in grn_io_lock (ctx=ctx@entry=0x7f180419d2e0 <PGrnContext>, io=0x55bd1524d280, timeout=900000) at io.c:1407
#3  0x00007f18038a59aa in grn_ii_column_update (ctx=ctx@entry=0x7f180419d2e0 <PGrnContext>, ii=0x55bd14e03880, rid=3616946, section=1, oldvalue=oldvalue@entry=0x7ffd4338cba0, newvalue=newvalue@entry=0x7f180419d440 <PGrnBuffers>, 
    posting=0x0) at ii.c:6423
#4  0x00007f1803778891 in grn_obj_default_set_value_hook (ctx=ctx@entry=0x7f180419d2e0 <PGrnContext>, nargs=nargs@entry=1, args=args@entry=0x7ffd4338cb98, user_data=user_data@entry=0x7ffd4338cc00) at db.c:1433
#5  0x00007f180378b563 in call_hook (flags=1, value=0x7f180419d440 <PGrnBuffers>, id=3616946, obj=0x55bd14f1c530, ctx=0x7f180419d2e0 <PGrnContext>) at db.c:7201
#6  grn_obj_set_value_column_var_size_scalar (flags=1, value=0x7f180419d440 <PGrnBuffers>, id=3616946, obj=0x55bd14f1c530, ctx=0x7f180419d2e0 <PGrnContext>) at db.c:7317
#7  grn_obj_set_value (ctx=0x7f180419d2e0 <PGrnContext>, obj=obj@entry=0x55bd14f1c530, id=id@entry=3616946, value=value@entry=0x7f180419d440 <PGrnBuffers>, flags=flags@entry=1) at db.c:7719
#8  0x00007f1803f7dc85 in PGrnInsert (index=index@entry=0x7f18044e39c0, sourcesTable=sourcesTable@entry=0x55bd14b93610, sourcesCtidColumn=<optimized out>, values=values@entry=0x7ffd4338d290, isnull=isnull@entry=0x7ffd4338d390 "", 
    ht_ctid=ht_ctid@entry=0x55bd150e2bd4) at src/pgroonga.c:2863
#9  0x00007f1803f800b5 in pgroonga_insert_raw (heap=<optimized out>, checkUnique=<optimized out>, ctid=0x55bd150e2bd4, isnull=0x7ffd4338d390 "", values=0x7ffd4338d290, index=0x7f18044e39c0) at src/pgroonga.c:2911
#10 pgroonga_insert (fcinfo=<optimized out>) at src/pgroonga.c:2938
#11 0x000055bd11d58336 in FunctionCall6Coll (flinfo=flinfo@entry=0x55bd13699220, collation=collation@entry=0, arg1=arg1@entry=139741128178112, arg2=arg2@entry=140725731250832, arg3=arg3@entry=140725731251088, 
    arg4=arg4@entry=94270590430164, arg5=139741128140672, arg6=0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/utils/fmgr/fmgr.c:1436
#12 0x000055bd11a3f7e5 in index_insert (indexRelation=indexRelation@entry=0x7f18044e39c0, values=values@entry=0x7ffd4338d290, isnull=isnull@entry=0x7ffd4338d390 "", heap_t_ctid=heap_t_ctid@entry=0x55bd150e2bd4, 
    heapRelation=heapRelation@entry=0x7f18044da780, checkUnique=checkUnique@entry=UNIQUE_CHECK_NO) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/access/index/indexam.c:226
#13 0x000055bd11b5454c in ExecInsertIndexTuples (slot=slot@entry=0x55bd157d6bf8, tupleid=tupleid@entry=0x55bd150e2bd4, estate=estate@entry=0x55bd15072be0, noDupErr=noDupErr@entry=0 '\000', specConflict=specConflict@entry=0x0, 
    arbiterIndexes=arbiterIndexes@entry=0x0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execIndexing.c:388
#14 0x000055bd11b6fc28 in ExecUpdate (tupleid=tupleid@entry=0x7ffd4338d570, oldtuple=oldtuple@entry=0x0, slot=slot@entry=0x55bd157d6bf8, planSlot=planSlot@entry=0x55bd15074b20, epqstate=epqstate@entry=0x55bd15072ed0, 
    estate=estate@entry=0x55bd15072be0, canSetTag=1 '\001') at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeModifyTable.c:1000
#15 0x000055bd11b7034f in ExecModifyTable (node=node@entry=0x55bd15072e30) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/nodeModifyTable.c:1467
#16 0x000055bd11b58308 in ExecProcNode (node=node@entry=0x55bd15072e30) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execProcnode.c:389
#17 0x000055bd11b551f2 in ExecutePlan (dest=0x55bd12177140 <donothingDR>, direction=<optimized out>, numberTuples=0, sendTuples=0 '\000', operation=CMD_UPDATE, planstate=0x55bd15072e30, estate=0x55bd15072be0)
    at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execMain.c:1549
#18 standard_ExecutorRun (queryDesc=0x55bd1509b550, direction=<optimized out>, count=0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/executor/execMain.c:337
#19 0x000055bd11c5f509 in ProcessQuery (plan=<optimized out>, 
    sourceText=0x55bd14efb900 "UPDATE \"sucheintrag\" SET \"crea\"=$1,\"lmod\"=$2,\"ldel\"=$3,\"bot\"=$4,\"mandant\"=$5,\"volltextn\"=$6,\"volltexta\"=$7,\"volltextgeraetn\"=$8,\"volltextgeraeta\"=$9,\"suchklasse\"=$10,\"dirtyvolltext\"=$11,\"produkt\"=$12 "..., params=0x55bd14efba10, dest=0x55bd12177140 <donothingDR>, completionTag=0x7ffd4338db70 "") at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/pquery.c:185
#20 0x000055bd11c5f751 in PortalRunMulti (portal=portal@entry=0x55bd12e91b00, isTopLevel=isTopLevel@entry=1 '\001', dest=0x55bd12177140 <donothingDR>, dest@entry=0x55bd12e401e0, altdest=0x55bd12177140 <donothingDR>, 
    altdest@entry=0x55bd12e401e0, completionTag=completionTag@entry=0x7ffd4338db70 "") at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/pquery.c:1275
#21 0x000055bd11c60365 in PortalRun (portal=portal@entry=0x55bd12e91b00, count=count@entry=1, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x55bd12e401e0, altdest=altdest@entry=0x55bd12e401e0, 
    completionTag=completionTag@entry=0x7ffd4338db70 "") at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/pquery.c:812
#22 0x000055bd11c5d583 in exec_execute_message (max_rows=1, portal_name=0x55bd12e3fdd0 "") at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/postgres.c:1988
#23 PostgresMain (argc=<optimized out>, argv=argv@entry=0x55bd12dc2760, dbname=0x55bd12dc2520 "kris", username=<optimized out>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/tcop/postgres.c:4108
#24 0x000055bd11a01678 in BackendRun (port=0x55bd12deb320) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:4243
#25 BackendStartup (port=0x55bd12deb320) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:3917
#26 ServerLoop () at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:1688
#27 0x000055bd11c0342b in PostmasterMain (argc=5, argv=<optimized out>) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/postmaster/postmaster.c:1296
#28 0x000055bd11a02372 in main (argc=5, argv=0x55bd12dc11e0) at /home/apt/build/postgresql-9.5-9.5.4/build/../src/backend/main/main.c:228

Postgresql CLI (used Process 608)

kris=# reindex index zga;
ERROR:  deadlock detected
DETAIL:  Process 608 waits for AccessExclusiveLock on relation 14435435 of database 721070; blocked by process 1088.
Process 1088 waits for RowExclusiveLock on relation 723404 of database 721070; blocked by process 608.
HINT:  See server log for query details.

psql log:

2018-03-22 15:51:36 CET [608-3] postgres@kris ERROR:  deadlock detected
2018-03-22 15:51:36 CET [608-4] postgres@kris DETAIL:  Process 608 waits for AccessExclusiveLock on relation 14435435 of database 721070; blocked by process 1088.
        Process 1088 waits for RowExclusiveLock on relation 723404 of database 721070; blocked by process 608.
        Process 608: reindex index zga;
        Process 1088: UPDATE "sucheintrag" SET "crea"=$1,"lmod"=$2,"ldel"=$3,"bot"=$4,"mandant"=$5,"volltextn"=$6,"volltexta"=$7,"volltextgeraetn"=$8,"volltextgeraeta"=$9,"suchklasse"=$10,"dirtyvolltext"=$11,"produkt"=$12 WHERE "id"=$13
2018-03-22 15:51:36 CET [608-5] postgres@kris HINT:  See server log for query details.
2018-03-22 15:51:36 CET [608-6] postgres@kris STATEMENT:  reindex index zga;
2018-03-22 15:51:36 CET [5684-11] postgres@kris LOG:  duration: 741986.877 ms  parse <unnamed>: INSERT INTO "sucheintrag" ("id","crea","lmod","ldel","bot","mandant","volltextn","volltexta","volltextgeraetn","volltextgeraeta","suchklasse",
"dirtyvolltext","produkt") VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13)
2018-03-22 15:51:36 CET [12988-5] postgres@kris LOG:  duration: 398747.814 ms  parse <unnamed>: INSERT INTO "sucheintrag" ("id","crea","lmod","ldel","bot","mandant","volltextn","volltexta","volltextgeraetn","volltextgeraeta","suchklasse",
"dirtyvolltext","produkt") VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13)
2018-03-22 15:51:36 CET [1088-5] postgres@kris LOG:  duration: 73146.933 ms  parse <unnamed>: UPDATE "sucheintrag" SET "crea"=$1,"lmod"=$2,"ldel"=$3,"bot"=$4,"mandant"=$5,"volltextn"=$6,"volltexta"=$7,"volltextgeraetn"=$8,"volltextgeraeta"=$9,"suchklasse"=$10,"dirtyvolltext"=$11,"produkt"=$12 WHERE "id"=$13
2018-03-22 15:51:36 CET [6358-1] postgres@kris LOG:  duration: 21496.939 ms  parse <unnamed>: INSERT INTO "sucheintrag" ("id","crea","lmod","ldel","bot","mandant","volltextn","volltexta","volltextgeraetn","volltextgeraeta","suchklasse","dirtyvolltext","produkt") VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13)

/edit 2018-03-23: wording

@kou
Copy link
Member

kou commented Mar 24, 2018

Does PGROONGA have built in inverted index ?

Yes.

do we have to create an other index ?

No.

You talk about posting limit size ?

Yes.

what is the size ?

It's implementation details.

can we modify the size limit ?

No.

Another question : Can we create several indexes with PGRonnga or is only one allowed

Yes. We can create multiple PGroonga indexes.

@kou
Copy link
Member

kou commented Mar 25, 2018

I might have the backtrace you requested,

Thanks. But it's not the backtrace I want.

In that case, the current index (the index used by PID 1088) is broken (lock is remained). Maybe, postgres process that uses PGroonga was crashed. I want the backtrace on the crash.

@peter-schmitz
Copy link
Author

peter-schmitz commented Mar 26, 2018

Seems like I overlooked a "too many open files" error that occurect shortly after the "reindex index zva".
The reindex itself seems to have finished ok(?), but alot of concurrently running queries threw an error:

cat /proc//limits uses a 1024 softlimit and 4096 hardlimit,
cat /proc/sys/fs/file-nr shows a max limit of 800000
/etc/security/ulimit.conf has increased values for limits, but they don't seem to be used by postgres.
postgresql.conf:
max_files_per_process was set to the default value and should've been increased to 2000 after 15:31. Since this did not seem to help, it was reduced to 1000 again afterwards.

The rest of the reindex and the first "CRASH" in the current pgroonga log.

2018-03-22 15:30:21.231122|d|13811: n_terms = 1245, chunk_offset = 1046260, chunk_size = 1048576, total = 1401700KB
2018-03-22 15:30:21.281263|d|13811: n_terms = 1821, chunk_offset = 1048475, chunk_size = 1048576, total = 1404961KB
2018-03-22 15:30:21.427564|d|13811: n_terms = 1334, chunk_offset = 1048470, chunk_size = 1048576, total = 1414514KB
2018-03-22 15:30:21.498730|d|13811: n_terms = 2460, chunk_offset = 773677, chunk_size = 1048576, total = 1419528KB
2018-03-22 15:30:21.743180|i|13811: [ii][builder][fin] removed path: <base/721070/pgrn.000010BUddQcA>
2018-03-22 15:30:21.743270|n|13811: spec:267:update:Lexicon14580978_0.index:72(column:index):268(BuildingSources14580978)
2018-03-22 15:30:21.746948|e|14763: system call error: Too many open files: failed to open path: <base/721070/pgrn.000010D>
2018-03-22 15:30:21.752793|i|14767: [object][search][index][key][exact] <Lexicon14580978_0.index>
2018-03-22 15:30:21.752825|i|14767: grn_ii_sel > (802878)
2018-03-22 15:30:21.752910|i|14767: n=4 (802878)
2018-03-22 15:30:21.755758|e|14769: system call error: Too many open files: failed to open path: <base/721070/pgrn.000010D>
2018-03-22 15:30:21.762560|e|14776: system call error: Too many open files: failed to open path: <base/721070/pgrn.000010D>
2018-03-22 15:30:21.764751|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_io_open+0xd56) [0x7f1803798f56]
2018-03-22 15:30:21.764774|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_pat_open+0x26) [0x7f1803859e96]
2018-03-22 15:30:21.764779|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ctx_at+0xb2d) [0x7f180365518d]
2018-03-22 15:30:21.764783|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_ctx_at+0x21e) [0x7f180365487e]
2018-03-22 15:30:21.764788|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(+0xb867c) [0x7f180365767c]
2018-03-22 15:30:21.764792|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_column_find_index_data+0xa5) [0x7f180367b435]
2018-03-22 15:30:21.764796|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(+0x321e22) [0x7f18038c0e22]
2018-03-22 15:30:21.764800|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(mrb_vm_exec+0x26a8) [0x7f1803938e68]
2018-03-22 15:30:21.764805|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(mrb_vm_run+0xb1) [0x7f180393b551]
2018-03-22 15:30:21.764809|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(mrb_run+0x1e) [0x7f180393535e]
2018-03-22 15:30:21.764813|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(mrb_funcall_with_block+0x3af) [0x7f180393572f]
2018-03-22 15:30:21.764817|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(mrb_funcall_with_block+0x788) [0x7f1803935b08]
2018-03-22 15:30:21.764821|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(mrb_funcall_argv+0x22) [0x7f1803935b92]
2018-03-22 15:30:21.764825|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(mrb_funcall+0xe0) [0x7f1803935c80]
2018-03-22 15:30:21.764830|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_mrb_expr_estimate_size+0x68) [0x7f18038c02a8]
2018-03-22 15:30:21.764834|e|14763: /usr/lib/x86_64-linux-gnu/libgroonga.so.0(grn_expr_estimate_size+0xb2) [0x7f180372f4c2]
2018-03-22 15:30:21.764842|e|14763: grn_ctx_at: failed to open object: <269>(<Lexicon14580978_0>):<49>(table:pat_key)
2018-03-22 15:30:21.764852|e|14763: grn_ctx_at: failed to open object: <267>(<Lexicon14580978_0.index>):<72>(column:index)
2018-03-22 15:30:21.764869|C|14763: -- CRASHED!!! --


I'll send a link to the full pgroonga log per Mail, since there might be more information in there.

iirc, The last time I saw a "too many open files" error with postgres (and pgroonga) was around a year ago during my initial tests, where I did a lot of "drop index ..." and "create index" tests with slightly modified index definitions.
For example switched between Trigram and Bigram, tested with and without normalizers, etc.

@kou
Copy link
Member

kou commented Mar 30, 2018

Crashed with "too many open files" is the cause.
Can you try increasing the max number of file descriptors for PostgreSQL process? I don't tried but max_files_per_procces https://www.postgresql.org/docs/10/static/runtime-config-resource.html#GUC-MAX-FILES-PER-PROCESS may be related.

@peter-schmitz
Copy link
Author

peter-schmitz commented Apr 6, 2018

I was finally able to get the backtrace of some dumps working (apport really doesn't like to handle larger files in Ubuntu 14.04 )

Update history for version information from apt/history.log:

Since Package
2017-10-11 postgresql-9.5-pgroonga:amd64 (2.0.1-2~xenial1, 2.0.2-2~xenial1)
2018-01-02 libgroonga0:amd64 (7.0.9-2~xenial1, 7.1.0-2~xenial1)
2018-02-01 libgroonga0:amd64 (7.1.0-2~xenial1, 7.1.1-2~xenial1)
2018-03-08 postgresql-9.5-pgroonga:amd64 (2.0.2-2~xenial1, 2.0.3-2~xenial1)

The following warning is printed by gdb, if this is a problem for the accuracy of the backtraces, could you please point me to an archive of previous releases please?

Reading symbols from /usr/lib/postgresql/9.5/bin/postgres...Reading symbols from /usr/lib/debug/.build-id/f8/e198606e7867ebe36d35d405d5cc71ef732158.debug...done.
done.
[New LWP 11623]

warning: .dynamic section for "/usr/lib/postgresql/9.5/lib/pgroonga.so" is not at the expected address (wrong library or version mismatch?)

warning: .dynamic section for "/usr/lib/x86_64-linux-gnu/libgroonga.so.0" is not at the expected address (wrong library or version mismatch?)

Backtrace from the crash on march 26:

Core was generated by `postgres: postgres kris [local] VACUUM                                        '.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007feccf74dc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56	../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007feccf74dc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007feccf751028 in __GI_abort () at abort.c:89
#2  0x00007fe88a13a272 in grn_ctx_qe_exec_uri (ctx=0x55b9cfe588a8, path=<optimized out>, path_len=<optimized out>) at ctx.c:1140
#3  0x0000000000000000 in ?? ()
(gdb) thread apply all backtrace full
>
>Thread 1 (Thread 0x7fecd143d780 (LWP 11623)):
#0  0x00007feccf74dc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
        resultvar = 0
        pid = 11623
        selftid = 11623
#1  0x00007feccf751028 in __GI_abort () at abort.c:89
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x55b9ceead000, sa_sigaction = 0x55b9ceead000}, sa_mask = {__val = {0, 94256528805888, 0, 0, 0, 4294967296, 124554052608, 29, 2832, 116, 94256545236928, 140655100023240, 140655070435782, 140636734112128, 94256545237160, 140636734103616}}, sa_flags = -807040856, 
          sa_restorer = 0x55b9cfe587c0}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00007fe88a13a272 in grn_ctx_qe_exec_uri (ctx=0x55b9cfe588a8, path=<optimized out>, path_len=<optimized out>) at ctx.c:1140
        l = <optimized out>
        command_name = <optimized out>
        command_name_size = <optimized out>
        buf = {header = {type = 0 '\000', impl_flags = 0 '\000', flags = 0, domain = 0}, u = {b = {head = 0x2 <error: Cannot access memory at address 0x2>, curr = 0x0, tail = 0xc0000 <error: Cannot access memory at address 0xc0000>}, v = {body = 0x2, sections = 0x0, n_sections = 786432}}}
        expr = 0x55b9cfe588a8
        val = 0x7fe88ab702e0 <currentData+16>
        request_id = {header = {type = 6 '\006', impl_flags = 0 '\000', flags = 0, domain = 0}, u = {b = {head = 0x55b9d0706c84 "5485\021\005", curr = 0x0, tail = 0x55b9cfdfc654 "P3\257"}, v = {body = 0x55b9d0706c84, sections = 0x0, n_sections = -807418284}}}
        request_timeout = 6.9169190417774516e-323
        p = 0x55b9d1827e3c "\001"
        e = 0x7fe88a942040 <grn_gctx+32> ""
        v = <optimized out>
        key_end = <optimized out>
        filename_end = <optimized out>
        __FUNCTION__ = "grn_ctx_qe_exec_uri"
#3  0x0000000000000000 in ?? ()
No symbol table info available.
(gdb) 

Seems like the VACUUM crash originated in groonga actually.

ProcMaps from march
7fe889c33000-7fe889c7f000 r-xp 00000000 09:02 102087675                  /usr/lib/x86_64-linux-gnu/libzstd.so.0.5.1
7fe889c7f000-7fe889e7e000 ---p 0004c000 09:02 102087675                  /usr/lib/x86_64-linux-gnu/libzstd.so.0.5.1
7fe889e7e000-7fe889e7f000 r--p 0004b000 09:02 102087675                  /usr/lib/x86_64-linux-gnu/libzstd.so.0.5.1
7fe889e7f000-7fe889e80000 rw-p 0004c000 09:02 102087675                  /usr/lib/x86_64-linux-gnu/libzstd.so.0.5.1
7fe889e80000-7fe889e97000 r-xp 00000000 09:02 102087450                  /usr/lib/x86_64-linux-gnu/liblz4.so.1.7.1
7fe889e97000-7fe88a096000 ---p 00017000 09:02 102087450                  /usr/lib/x86_64-linux-gnu/liblz4.so.1.7.1
7fe88a096000-7fe88a097000 r--p 00016000 09:02 102087450                  /usr/lib/x86_64-linux-gnu/liblz4.so.1.7.1
7fe88a097000-7fe88a098000 rw-p 00017000 09:02 102087450                  /usr/lib/x86_64-linux-gnu/liblz4.so.1.7.1
7fe88a098000-7fe88a5bd000 r-xp 00000000 09:02 102087143                  /usr/lib/x86_64-linux-gnu/libgroonga.so.0.0.0
7fe88a5bd000-7fe88a7bd000 ---p 00525000 09:02 102087143                  /usr/lib/x86_64-linux-gnu/libgroonga.so.0.0.0
7fe88a7bd000-7fe88a942000 r--p 00525000 09:02 102087143                  /usr/lib/x86_64-linux-gnu/libgroonga.so.0.0.0
7fe88a942000-7fe88a943000 rw-p 006aa000 09:02 102087143                  /usr/lib/x86_64-linux-gnu/libgroonga.so.0.0.0
7fe88a943000-7fe88a948000 rw-p 00000000 00:00 0 
7fe88a948000-7fe88a96e000 r-xp 00000000 09:02 2207172                    /usr/lib/postgresql/9.5/lib/pgroonga.so
7fe88a96e000-7fe88ab6e000 ---p 00026000 09:02 2207172                    /usr/lib/postgresql/9.5/lib/pgroonga.so
7fe88ab6e000-7fe88ab6f000 r--p 00026000 09:02 2207172                    /usr/lib/postgresql/9.5/lib/pgroonga.so
7fe88ab6f000-7fe88ab70000 rw-p 00027000 09:02 2207172                    /usr/lib/postgresql/9.5/lib/pgroonga.so

Backtrace from a crash in january (based on current debug symbols)

Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/9.5/bin/postgres...Reading symbols from /usr/lib/debug/.build-id/f8/e198606e7867ebe36d35d405d5cc71ef732158.debug...done.
done.
[New LWP 11623]

warning: .dynamic section for "/usr/lib/postgresql/9.5/lib/pgroonga.so" is not at the expected address (wrong library or version mismatch?)

warning: .dynamic section for "/usr/lib/x86_64-linux-gnu/libgroonga.so.0" is not at the expected address (wrong library or version mismatch?)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: postgres kris [local] VACUUM                                        '.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007feccf74dc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56	../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt full
#0  0x00007feccf74dc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
        resultvar = 0
        pid = 11623
        selftid = 11623
#1  0x00007feccf751028 in __GI_abort () at abort.c:89
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x55b9ceead000, sa_sigaction = 0x55b9ceead000}, sa_mask = {__val = {0, 94256528805888, 0, 0, 0, 4294967296, 124554052608, 29, 2832, 116, 94256545236928, 140655100023240, 140655070435782, 140636734112128, 94256545237160, 140636734103616}}, sa_flags = -807040856, 
          sa_restorer = 0x55b9cfe587c0}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00007fe88a13a272 in grn_ctx_qe_exec_uri (ctx=0x55b9cfe588a8, path=<optimized out>, path_len=<optimized out>) at ctx.c:1140
        l = <optimized out>
        command_name = <optimized out>
        command_name_size = <optimized out>
        buf = {header = {type = 0 '\000', impl_flags = 0 '\000', flags = 0, domain = 0}, u = {b = {head = 0x2 <error: Cannot access memory at address 0x2>, curr = 0x0, tail = 0xc0000 <error: Cannot access memory at address 0xc0000>}, v = {body = 0x2, sections = 0x0, n_sections = 786432}}}
        expr = 0x55b9cfe588a8
        val = 0x7fe88ab702e0 <currentData+16>
        request_id = {header = {type = 6 '\006', impl_flags = 0 '\000', flags = 0, domain = 0}, u = {b = {head = 0x55b9d0706c84 "5485\021\005", curr = 0x0, tail = 0x55b9cfdfc654 "P3\257"}, v = {body = 0x55b9d0706c84, sections = 0x0, n_sections = -807418284}}}
        request_timeout = 6.9169190417774516e-323
        p = 0x55b9d1827e3c "\001"
        e = 0x7fe88a942040 <grn_gctx+32> ""
        v = <optimized out>
        key_end = <optimized out>
        filename_end = <optimized out>
        __FUNCTION__ = "grn_ctx_qe_exec_uri"
#3  0x0000000000000000 in ?? ()
No symbol table info available.

@anthonydafc
Copy link

Hello,

We’ve modified a few things in postgres

using table partitionning, and tnew triggers, but i keep getting errors in bot postgres logs and pgroonga logs

If you could help
pgroonga.log
postgresql-2018-04-10_181108.log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants