Describe the bug
I would like to run some complex algorithms on DNA sequence data in MonetDB using the embedded Python.
It is a dynamic programming algorithm implemented in Python; let's call it dp().
The inputs are two strings and some integer parameters in a table form (multiple rows), and the output is a table (2 strings and an integer in each row).
I inserted 50 strings (~100 characters each) into table A.
My main function would consist of two steps:
Step 1: sample 10 strings from A, run dp() on each pair, and store some subresults in another table (usually a couple hundred records).
Step 2: use a modified version of dp() on all the records from A using the subresults from Step 1.
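Step 1 above can be sketched in plain Python as follows. This is only an illustration under assumptions: the name `step1`, its `sample_size` and `seed` parameters, and the reading of "each pair" as every unordered pair of the sample are all hypothetical, and the real scripts run inside MonetDB rather than on a Python list.

```python
import itertools
import random


def step1(strings, dp, sample_size=10, seed=None):
    """Sample strings and run dp() on every unordered pair of the sample.

    `dp` is any callable taking two strings; a fixed `seed` makes the
    sample reproducible (hypothetical parameter, useful when debugging).
    """
    rng = random.Random(seed)
    sample = rng.sample(strings, sample_size)
    # 10 sampled strings give 45 unordered pairs.
    return [dp(a, b) for a, b in itertools.combinations(sample, 2)]
```

Fixing the seed corresponds to the debugging step of removing the random sampling, so that every run sees the same rows.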
If I run only Step 1 multiple times it works fine. If I run only Step 2 multiple times it works fine.
If I run Step 1 and Step 2 sequentially after starting the server it works fine.
But if I run Step 1 and Step 2 in a loop (for measuring time etc.) then I get errors:
Python error: the input parameters are not defined (although the exact same code ran without error before).
#2022-02-15 17:26:50: client1: createExceptionInternal: !ERROR: MALException:pyapi3.eval:PY000!Python exception
#2022-02-15 17:26:50: client1: createExceptionInternal: !ERROR: > 6. x, y = s1, s2
#2022-02-15 17:26:50: client1: createExceptionInternal: !ERROR: name 's1' is not defined
And the function looks like dp(s1, s2, ...)
Segmentation fault: mserver5 stops; gdb shows the cause (see the GitHub repo for the full backtrace):
Thread 30 "mserver5" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe3483700 (LWP 28220)]
0x00007ffff1d6d763 in PyAPIeval (cntxt=0x5555555dfc70, mb=0x7fffc4068180, stk=0x7fffc40f2170, pci=0x7fffc40b1060, grouped=false, mapped=false) at /home/<...>/MonetDB-11.43.9/sql/backends/monet5/UDF/pyapi3/pyapi3.c:226
226 varres = sqlfun ? sqlfun->varres : 0;
Software versions
MonetDB version, built from source: 11.43.9
Running on Ubuntu 20.04.
Python version: 3.8.10.
Additional context
I tried to debug my code, but it is suspicious that the exact same code runs fine the first time but not the second time.
I even removed the random sampling and fixed the rows it chooses.
Could it be some kind of memory issue? The server had 4GB RAM available (but I also tried it with 12GB on Windows).
Should it be possible to run relatively complex Python algorithms inside MonetDB?
Thanks in advance!
…d, because it is scoped to the current session only.
Use the function itself instead.
However, improvements are still needed for loader functions, because they use information from the resolved function. It should be done differently.
To Reproduce
In this repository you can find the scripts to reproduce the problem (dp() is the Needleman-Wunsch algorithm): https://github.com/liptakpanna/monetdb_reproduce
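For readers without the repository at hand, here is a minimal sketch of a Needleman-Wunsch dp() with the shape described above (two strings plus integer parameters in, two strings and an integer out). The parameter names `match`, `mismatch` and `gap`, their default values, and the tie-breaking order in the traceback are assumptions for illustration; the implementation in the repository may differ.

```python
def dp(s1, s2, match=1, mismatch=-1, gap=-1):
    """Globally align s1 and s2; return (aligned_s1, aligned_s2, score)."""
    n, m = len(s1), len(s2)

    def sub(i, j):
        # Score for aligning s1[i-1] with s2[j-1].
        return match if s1[i - 1] == s2[j - 1] else mismatch

    # score[i][j] is the best score for aligning s1[:i] with s2[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align both characters
                score[i - 1][j] + gap,            # gap in s2
                score[i][j - 1] + gap,            # gap in s1
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            a1.append(s1[i - 1])
            a2.append(s2[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            a1.append(s1[i - 1])
            a2.append("-")
            i -= 1
        else:
            a1.append("-")
            a2.append(s2[j - 1])
            j -= 1
    return "".join(reversed(a1)), "".join(reversed(a2)), score[n][m]
```

Note that the function refers to its inputs only through the parameters `s1` and `s2`, which are the same names the "name 's1' is not defined" error complains about.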