Segfault in context of Query with Proc::Async / multithreading #117

Closed
Digicrat opened this issue Feb 19, 2018 · 3 comments

@Digicrat

I understand (after some research) that DBIish is not thread-safe out of the box (this should be documented more prominently and/or fixed, but that's another story). I've therefore modified my application so that all DBIish calls are wrapped in a lock (i.e. "$!lock.protect({ ... })"), but I'm still experiencing the same inexplicable issue described below.

I've been getting issues where a Proc::Async task quits unexpectedly (its done callback is called), but only when I uncomment a line of code in its stdio callback that writes to a MySQL database (it calls execute on a prepared statement).

The fact that this $sth.execute() call is the trigger is why I'm logging the issue here. I've also ruled out other issues (and created a temporary workaround) by logging the exact same data to a file instead of calling execute. There's nothing in the app I'm calling with Proc::Async that would exit at that point. My guess is there is an unhandled exception (despite adding an explicit try/catch block that's not triggering) causing the thread to terminate abnormally, possibly combined with a P6/Proc::Async bug hiding the underlying error.

After several tries, one attempt resulted in a segfault from moar instead of just terminating the thread. A backtrace of the generated core dump shows:
(gdb) bt
#0 0xf74169ab in MVM_string_is_cclass () from /usr/lib/libmoar.so
#1 0xf735cfd1 in MVM_interp_run () from /usr/lib/libmoar.so
#2 0xf736f384 in start_thread () from /usr/lib/libmoar.so
#3 0xf7093f72 in start_thread (arg=0xf32fab40) at pthread_create.c:312
#4 0xf71dd43e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:129

This is with the 2018.01 release, running in a 32-bit Ubuntu Docker image, where it crashes the very first time this query is executed from within the Proc::Async callback. Running directly on the parent 64-bit Ubuntu machine (same version), it behaves slightly better: it takes almost 50 executions before I get an error or segfault. The latter was installed from a binary release via rakudobrew and gives a nonsensical backtrace from its core dump (below).

(gdb) bt
#0 0x00007f808de3040b in ?? ()
#1 0x0000000007471cd0 in ?? ()
#2 0x0000000000000000 in ?? ()

Unfortunately, I can't share my actual script in this case and haven't yet had the time to create a smaller demo case. The design is such that it's possible but unlikely that both threads try to write to the DB at the same time, but even if they did, all DB calls are now protected by a Perl 6 lock to guarantee mutual exclusion.
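
Since the actual script is not available, the following is only a minimal sketch of the design described above: a Proc::Async stdout callback whose writes all go through a single lock. Everything named here is hypothetical. `FakeStatement` is a stand-in for a DBIish statement handle so that the sketch runs without a database, and `Logger` is an invented wrapper class; neither is taken from the original application.

```raku
# Hypothetical stand-in for a DBIish statement handle, so the sketch runs
# without a database. In a real application this would be the object
# returned by $dbh.prepare(...).
class FakeStatement {
    has @.rows;
    method execute(*@params) { @!rows.push: @params.join(',') }
}

# Hypothetical wrapper: every call that touches the handle takes the same lock.
class Logger {
    has Lock $!lock = Lock.new;
    has $.sth is required;

    method record(Str $line) {
        $!lock.protect({ $!sth.execute($line) });
    }
}

my $logger = Logger.new(sth => FakeStatement.new);

# Child process whose stdout is consumed by a callback (tapped before start).
my $proc = Proc::Async.new($*EXECUTABLE.absolute, '-e', 'say "line $_" for 1..3');
$proc.stdout.lines.tap(-> $line { $logger.record($line) });

await $proc.start;

say $logger.sth.rows.elems;
```

The lock only helps if every code path that uses the handle goes through `record`; a single unprotected call elsewhere would defeat the mutual exclusion.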

@bbkr

bbkr commented Feb 19, 2018

It's hard to tell what's happening without code, but execute should not cause segfaults. Connecting, preparing, and disposing of a connection or statement all access the same data structures, which causes issues, so those calls must be wrapped in a lock.

What I find even weirder is that you're talking about a stdio callback, and those usually live inside a react block, which tames concurrency (only one whenever block is executed at a time).
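
For readers unfamiliar with that pattern, here is a minimal sketch of a Proc::Async process consumed from a react block. The child command and the `@seen` array are made up for illustration and are not from the reporter's script. Within one react block, the bodies of the whenever blocks never run at the same time, so `@seen` needs no extra lock here; code running outside the block, on other threads, gets no such guarantee.

```raku
# Child process that writes to both stdout and stderr.
my $proc = Proc::Async.new($*EXECUTABLE.absolute, '-e',
    'say "out $_" for 1..3; note "err $_" for 1..3');

my @seen;   # shared state, touched only from inside the react block

react {
    whenever $proc.stdout.lines -> $line {
        @seen.push: "stdout: $line";
    }
    whenever $proc.stderr.lines -> $line {
        @seen.push: "stderr: $line";
    }
    whenever $proc.start {
        done;   # leave the react block once the child has exited
    }
}

say @seen.elems;
```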

Please try to provide a demo case. I'm not the DBIish author, but I've done tons of workarounds to get it running, and maybe you are missing something obvious.

@Digicrat
Author

Nothing about this one is making sense. I tried explicitly creating a second DB connection just for this callback, and am still getting the exception. On the other hand, I tried creating a simple test case following the same logical flow and can't reproduce it (I'll try again later). The issue also seems to be less predictable in that on very rare occasions it will run without errors, while on most it will crash very quickly.

I wouldn't rule out that I'm missing something obvious, though even if I am, it shouldn't be causing a segfault from Perl6/moar/rakudo. I'm starting to think the problem isn't directly in DBIish but is some low-level bug or edge case, perhaps in the NativeCall interface used by DBIish. (I actually switched from NativeCall to Proc::Async in my own script because the former had a slow but clear memory leak. For that one I was able to create a standalone example, and I submitted a bug report to the rakudo project.)

@rbt rbt mentioned this issue Mar 11, 2020
@rbt
Collaborator

rbt commented Apr 9, 2020

I'm pretty sure this is working for MySQL/MariaDB on the most recent Raku and this issue may be closed.

@rbt rbt closed this as completed May 22, 2020