Date: 2015-03-03 19:11:43 +0100
From: Richard Hughes <<richard.monetdb>>
To: Merovingian devs <>
Version: 11.19.9 (Oct2014-SP2)
Last updated: 2015-05-07 12:37:39 +0200
Comment 20682
Date: 2015-03-03 19:11:43 +0100
From: Richard Hughes <<richard.monetdb>>
Steps:
Start up quite a few databases (we now have 23 mserver5 instances)
service monetdb5-sql stop
Expected:
Everything exits eventually.
Actual:
We're left with 3 defunct mserver5 processes and monetdbd spinning one core.
This is intermittent, but not uncommon (50% of the time?).
#0  0x00007fb090a04add in read () at ../sysdeps/unix/syscall-template.S:81
#1  0x00000000004073a1 in read (__nbytes=, __buf=, __fd=)
    at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
#2  logFD (fd=57,
    type=0x7fb08bee7e30 "database 'test' (31079) has exited with exit status 0", type@entry=0x41953a "MSG",
    dbname=0x1f9f <error: Cannot access memory at address 0x1f9f>, pid=-1,
    stream=0x223f580) at merovingian.c:152
#3  0x000000000040766a in logListener (x=) at merovingian.c:232
#4  0x00007fb0909fe0a4 in start_thread (arg=0x7fb08beea700)
    at pthread_create.c:309
#5  0x00007fb090732cbd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
I guess what's happening is that logListener is charging round its do...while loop as fast as possible and never checking _mero_keep_logging, because the fd is always readable: the child has exited, so the fd is in the closed state.
I haven't confirmed this hypothesis, because that machine is serving actual users and I don't like to shut it down too often. I might try careful application of some big sleep() calls after the select() to see if I can widen the race window on a dev box.
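For illustration only, here is a minimal sketch of the behaviour the hypothesis relies on: once the writing side of a pipe is closed, select() keeps reporting the read end as readable and read() returns 0, so a listener has to treat 0 as end-of-file and stop polling that fd. This is not the code from merovingian.c and not the eventual fix; fd_is_readable, drain_log_fd and make_exited_child_pipe are hypothetical names, and the pipe merely stands in for the log fd of an exited mserver5.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if select() reports fd readable within 100 ms,
 * 0 on timeout, -1 on error. */
static int fd_is_readable(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 100000 };

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/* Hypothetical listener loop (not the MonetDB implementation):
 * drains fd until EOF or until *keep_logging is cleared.
 * Returns the number of bytes read, or -1 on error. */
static long drain_log_fd(int fd, const volatile int *keep_logging)
{
    char buf[256];
    long total = 0;

    while (*keep_logging) {
        int ready = fd_is_readable(fd);
        if (ready < 0)
            return -1;
        if (ready == 0)
            continue;               /* timeout: go round and re-check the flag */

        ssize_t n = read(fd, buf, sizeof(buf));
        if (n < 0)
            return -1;
        if (n == 0)
            break;                  /* EOF: the writer is gone, stop polling this fd */
        total += n;
    }
    return total;
}

/* Stand-in for a child that logged one message and exited:
 * writes msg into a pipe, closes the write end, returns the read end
 * (or -1 on failure). */
static int make_exited_child_pipe(const char *msg, size_t len)
{
    int p[2];

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], msg, len) != (ssize_t)len) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    close(p[1]);
    return p[0];
}
```

Without the `n == 0` branch, the loop above would call select() and read() back to back for as long as the fd stays open, which would match a thread pinned in read() with one core busy.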
Comment 20792
Date: 2015-04-14 11:46:16 +0200
From: MonetDB Mercurial Repository <>
Changeset 275d5e3b8cf4, made by Sjoerd Mullender <sjoerd@acm.org> in the MonetDB repo, refers to this bug.
For complete details, see http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=275d5e3b8cf4
Changeset description:
Comment 20793
Date: 2015-04-14 12:34:47 +0200
From: @sjoerdmullender
I was able to reproduce the problem and saw that your assessment was correct. It should now be fixed.