When doing concurrent 2 mclient with one querying sys.queue, mserver segfault with the given (abbreviated) stacktrace
11 0x00007f4257d5a312 in __GI___assert_fail (assertion=0x7f42598838de "(v[i] & 0xC0) == 0x80", file=0x7f4259883710 "gdk_atoms.c", line=1283, function=0x7f4259883af6 <PRETTY_FUNCTION.10385> "strPut") at assert.c:101
12 0x00007f425959c941 in strPut (h=0x7f42341f3050, dst=0x7f4250dad290, v=0x7f41d6596b20 '\333' <repeats 160 times>, "\300") at gdk_atoms.c:1283
13 0x00007f425963fef5 in BUNappend (b=0x7f42342cc010, t=0x7f41d6596b20, force=0 '\000') at gdk_bat.c:1318
14 0x00007f4259d93785 in SYSMONqueue (cntxt=0x7f4252f7f330, mb=0x7f4234239880, stk=0x7f42342dee20, pci=0x7f42342fb5c0) at sysmon.c:103
15 0x00007f4259c1418a in runMALsequence (cntxt=0x7f4252f7f330, mb=0x7f4234239880, startpc=1, stoppc=0, stk=0x7f42342dee20, env=0x0, pcicaller=0x0) at mal_interpreter.c:631
16 0x00007f4259c132db in callMAL (cntxt=0x7f4252f7f330, mb=0x7f4234239880, env=0x7f4250dad998, argv=0x7f4250dada40, debug=0 '\000') at mal_interpreter.c:447
From what I could debug, here is what happened:
QRYqueue[0] and QRYqueue[1] are filled
QRYqueue[0] is finished and cleaned in runtimeProfileFinish, QRYqueue[1] is moved in QRYqueue[0].
We still have a QRYqueue information in QRYqueue[1], that should not pose problems since qtop is here
When a query is done on sys.queue, the loop in sysmon.c:89 is done from 0 to QRYqueue[i].tag, so we hit QRYqueue[1].query which has been freed but not setted to NULL.
Exporting qtop and doing the loop until qtop seems to solve the problem.
Looks like an off-by one (h->base[h->size] is set to 0)
No. memset sets the memory starting at h->base+h->free for (h->size-h->free) bytes. That is to say, from h->base+h->free up to (but not including(!)) h->base+h->free+(h->size-h->free) == h->base+h->size.
Running two clients causes an assertion in BUNappend
in SYSMONqueue, which is protected by a mal_delaylock (weird)
0 0x00007fd50b4389c8 in raise () from /lib64/libc.so.6
1 0x00007fd50b43a65a in abort () from /lib64/libc.so.6
2 0x00007fd50b431187 in __assert_fail_base () from /lib64/libc.so.6
3 0x00007fd50b431232 in __assert_fail () from /lib64/libc.so.6
4 0x00007fd50e0f44f1 in strPut (h=0x7fd454155c60, dst=0x7fd461f2f288, v=0x7fd45414f1f0 '\333' <repeats 40 times>, "q") at /export/scratch1/mk/default//package/gdk/gdk_atoms.c:1323
5 0x00007fd50e18ef54 in BUNappend (b=0x7fd454155b20, t=0x7fd45414f1f0, force=0 '\000') at /export/scratch1/mk/default//package/gdk/gdk_bat.c:955
6 0x00007fd50e83af7d in SYSMONqueue (cntxt=0x7fd5014bc330, mb=0x7fd454107280, stk=0x7fd45415a320, pci=0x7fd4541371b0) at /export/scratch1/mk/default//package/monetdb5/modules/mal/sysmon.c:77
Run the server under valgrind. One of the messages is:
==21299== Invalid read of size 1
==21299== at 0x5566DF1: strCmp (gdk_atoms.c:1084)
==21299== by 0x5601AF9: setcolprops (gdk_bat.c:818)
==21299== by 0x56025CC: BUNappend (gdk_bat.c:952)
==21299== by 0x4F94D5D: SYSMONqueue (sysmon.c:76)
==21299== by 0x4E85B1E: runMALsequence (mal_interpreter.c:629)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
==21299== by 0x4EACD2F: runScenarioBody (mal_scenario.c:574)
==21299== by 0x4EACE3F: runScenario (mal_scenario.c:594)
==21299== Address 0x1353c920 is 16 bytes inside a block of size 56 free'd
==21299== at 0x4C2CD5A: free (vg_replace_malloc.c:530)
==21299== by 0x55614DA: GDKfree (gdk_utils.c:1804)
==21299== by 0x4E7AF32: runtimeProfileFinish (mal_runtime.c:113)
==21299== by 0x4E86654: runMALsequence (mal_interpreter.c:760)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
==21299== by 0x4EACD2F: runScenarioBody (mal_scenario.c:574)
==21299== by 0x4EACE3F: runScenario (mal_scenario.c:594)
==21299== by 0x4EAE928: MSserveClient (mal_session.c:449)
==21299== Block was alloc'd at
==21299== at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==21299== by 0x55610E0: GDKmalloc_prefixsize (gdk_utils.c:1679)
==21299== by 0x55611F7: GDKmallocmax (gdk_utils.c:1722)
==21299== by 0x55612FF: GDKmalloc (gdk_utils.c:1746)
==21299== by 0x55617B8: GDKstrdup (gdk_utils.c:1881)
==21299== by 0x4E7AA32: runtimeProfileInit (mal_runtime.c:87)
==21299== by 0x4E851E5: runMALsequence (mal_interpreter.c:500)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
In other words, the query being added by SYSMONqueue has aleady been freed.
Date: 2016-03-02 09:14:04 +0100
From: anthonin.bonnefoy
To: MonetDB5 devs <>
Version: 11.21.19 (Jul2015-SP4)
CC: @mlkersten
Last updated: 2016-10-13 10:04:07 +0200
Comment 21849
Date: 2016-03-02 09:14:04 +0100
From: anthonin.bonnefoy
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0 Iceweasel/45.0
Build Identifier:
When doing concurrent 2 mclient with one querying sys.queue, mserver segfault with the given (abbreviated) stacktrace
11 0x00007f4257d5a312 in __GI___assert_fail (assertion=0x7f42598838de "(v[i] & 0xC0) == 0x80", file=0x7f4259883710 "gdk_atoms.c", line=1283, function=0x7f4259883af6 <PRETTY_FUNCTION.10385> "strPut") at assert.c:101
12 0x00007f425959c941 in strPut (h=0x7f42341f3050, dst=0x7f4250dad290, v=0x7f41d6596b20 '\333' <repeats 160 times>, "\300") at gdk_atoms.c:1283
13 0x00007f425963fef5 in BUNappend (b=0x7f42342cc010, t=0x7f41d6596b20, force=0 '\000') at gdk_bat.c:1318
14 0x00007f4259d93785 in SYSMONqueue (cntxt=0x7f4252f7f330, mb=0x7f4234239880, stk=0x7f42342dee20, pci=0x7f42342fb5c0) at sysmon.c:103
15 0x00007f4259c1418a in runMALsequence (cntxt=0x7f4252f7f330, mb=0x7f4234239880, startpc=1, stoppc=0, stk=0x7f42342dee20, env=0x0, pcicaller=0x0) at mal_interpreter.c:631
16 0x00007f4259c132db in callMAL (cntxt=0x7f4252f7f330, mb=0x7f4234239880, env=0x7f4250dad998, argv=0x7f4250dada40, debug=0 '\000') at mal_interpreter.c:447
From what I could debug, here is what happened:
We still have a QRYqueue information in QRYqueue[1], that should not pose problems since qtop is here
Exporting qtop and doing the loop until qtop seems to solve the problem.
Reproducible: Always
Comment 21850
Date: 2016-03-02 09:40:06 +0100
From: anthonin.bonnefoy
By the way, in gdk_atoms.c:1247, the line
memset(h->base + h->free, 0, h->size - h->free);
Looks like an off-by one (h->base[h->size] is set to 0)
Comment 21909
Date: 2016-03-15 10:30:58 +0100
From: @sjoerdmullender
(In reply to anthonin.bonnefoy from comment 1)
No. memset sets the memory starting at h->base+h->free for (h->size-h->free) bytes. That is to say, from h->base+h->free up to (but not including(!)) h->base+h->free+(h->size-h->free) == h->base+h->size.
Comment 22072
Date: 2016-04-21 10:19:41 +0200
From: anthonin.bonnefoy
Yeah, my bad.
Still, the bug on sys.queue is valid. Freed memory is read and garbage can appear in the query column.
Just by doing 2 simultaneous
while mclient db -s "select * from sys.queue;" ; do :; done
This will trigger an "utf8strlenmax: Assertion `n == 0' failed."
Comment 22254
Date: 2016-07-25 21:04:50 +0200
From: @mlkersten
Running the one liner does not cause problems.
Running two clients causes an assertion in BUNappend
in SYSMONqueue, which is protected by a mal_delaylock (weird)
0 0x00007fd50b4389c8 in raise () from /lib64/libc.so.6
1 0x00007fd50b43a65a in abort () from /lib64/libc.so.6
2 0x00007fd50b431187 in __assert_fail_base () from /lib64/libc.so.6
3 0x00007fd50b431232 in __assert_fail () from /lib64/libc.so.6
4 0x00007fd50e0f44f1 in strPut (h=0x7fd454155c60, dst=0x7fd461f2f288, v=0x7fd45414f1f0 '\333' <repeats 40 times>, "q") at /export/scratch1/mk/default//package/gdk/gdk_atoms.c:1323
5 0x00007fd50e18ef54 in BUNappend (b=0x7fd454155b20, t=0x7fd45414f1f0, force=0 '\000') at /export/scratch1/mk/default//package/gdk/gdk_bat.c:955
6 0x00007fd50e83af7d in SYSMONqueue (cntxt=0x7fd5014bc330, mb=0x7fd454107280, stk=0x7fd45415a320, pci=0x7fd4541371b0) at /export/scratch1/mk/default//package/monetdb5/modules/mal/sysmon.c:77
Comment 22257
Date: 2016-07-25 21:37:33 +0200
From: @sjoerdmullender
Run the server under valgrind. One of the messages is:
==21299== Invalid read of size 1
==21299== at 0x5566DF1: strCmp (gdk_atoms.c:1084)
==21299== by 0x5601AF9: setcolprops (gdk_bat.c:818)
==21299== by 0x56025CC: BUNappend (gdk_bat.c:952)
==21299== by 0x4F94D5D: SYSMONqueue (sysmon.c:76)
==21299== by 0x4E85B1E: runMALsequence (mal_interpreter.c:629)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
==21299== by 0x4EACD2F: runScenarioBody (mal_scenario.c:574)
==21299== by 0x4EACE3F: runScenario (mal_scenario.c:594)
==21299== Address 0x1353c920 is 16 bytes inside a block of size 56 free'd
==21299== at 0x4C2CD5A: free (vg_replace_malloc.c:530)
==21299== by 0x55614DA: GDKfree (gdk_utils.c:1804)
==21299== by 0x4E7AF32: runtimeProfileFinish (mal_runtime.c:113)
==21299== by 0x4E86654: runMALsequence (mal_interpreter.c:760)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
==21299== by 0x4EACD2F: runScenarioBody (mal_scenario.c:574)
==21299== by 0x4EACE3F: runScenario (mal_scenario.c:594)
==21299== by 0x4EAE928: MSserveClient (mal_session.c:449)
==21299== Block was alloc'd at
==21299== at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==21299== by 0x55610E0: GDKmalloc_prefixsize (gdk_utils.c:1679)
==21299== by 0x55611F7: GDKmallocmax (gdk_utils.c:1722)
==21299== by 0x55612FF: GDKmalloc (gdk_utils.c:1746)
==21299== by 0x55617B8: GDKstrdup (gdk_utils.c:1881)
==21299== by 0x4E7AA32: runtimeProfileInit (mal_runtime.c:87)
==21299== by 0x4E851E5: runMALsequence (mal_interpreter.c:500)
==21299== by 0x4E84CE6: callMAL (mal_interpreter.c:434)
==21299== by 0x18C816C1: SQLexecutePrepared (sql_execute.c:525)
==21299== by 0x18C81ADB: SQLengineIntern (sql_execute.c:576)
==21299== by 0x18C7F1E4: SQLengine (sql_scenario.c:1178)
==21299== by 0x4EACB4D: runPhase (mal_scenario.c:531)
In other words, the query being added by SYSMONqueue has aleady been freed.
Comment 22259
Date: 2016-07-25 21:58:24 +0200
From: MonetDB Mercurial Repository <>
Changeset 9f67d824a657 made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.
For complete details, see http//devmonetdborg/hg/MonetDB?cmd=changeset;node=9f67d824a657
Changeset description:
Comment 22260
Date: 2016-07-25 22:00:12 +0200
From: @sjoerdmullender
Should be fixed now.
Comment 24492
Date: 2016-10-13 10:04:07 +0200
From: @sjoerdmullender
Jun2016-SP2 has been released.
The text was updated successfully, but these errors were encountered: