segfault when submitting bulk jobs #5

Closed
unode opened this issue Dec 6, 2017 · 1 comment

unode commented Dec 6, 2017

After setting:

export DRMAA_LIBRARY_PATH=~/test_drmaa/slurm-drmaa-1.2.0-dev.83fc288/slurm_drmaa/.libs/libdrmaa.so

and using libdrmaa from Python with the following script:

#!/usr/bin/env python
from __future__ import print_function
import os
import drmaa

LOGS = "logs/"
if not os.path.isdir(LOGS):
    os.mkdir(LOGS)

s = drmaa.Session()
s.initialize()
print("Supported contact strings:", s.contact)
print("Supported DRM systems:", s.drmsInfo)
print("Supported DRMAA implementations:", s.drmaaImplementation)
print("Version", s.version)

jt = s.createJobTemplate()
jt.remoteCommand = "/usr/bin/echo"
jt.args = ["Hello", "world"]
jt.jobName = "testdrmaa"
jt.jobEnvironment = os.environ.copy()
jt.workingDirectory = os.getcwd()

jt.outputPath = ":" + os.path.join(LOGS, "job-%A_%a.out")
jt.errorPath = ":" + os.path.join(LOGS, "job-%A_%a.err")
jt.nativeSpecification = "--cpus-per-task=2 --nodes=1 --mem-per-cpu=50 --partition=htc --tmp=100"

print("Submitting", jt.remoteCommand, "with", jt.args, "and logs to", jt.outputPath)
ids = s.runBulkJobs(jt, beginIndex=1, endIndex=2, step=1)
print("Job submitted with ids", ids)

s.deleteJobTemplate(jt)

The above script segfaults when it calls runBulkJobs.

gdb backtrace for the above script:

Program received signal SIGSEGV, Segmentation fault.
strlcpy (dest=dest@entry=0x7a9640 "9829091", src=0x0, size=size@entry=1024) at compat.c:50
50              while( *src  &&  --size > 0 )
(gdb) bt
#0  strlcpy (dest=dest@entry=0x7a9640 "9829091", src=0x0, size=size@entry=1024) at compat.c:50
#1  0x00007fffed772fac in drmaa_get_next_job_id (values=0x7ac5c0, value=0x7a9640 "9829091", value_len=1024) at drmaa_base.c:297
#2  0x00007fffeffed550 in ffi_call_unix64 () at /home/ilan/minonda/conda-bld/python_1494526091235/work/Python-3.6.1/Modules/_ctypes/libffi/src/x86/unix64.S:76
#3  0x00007fffeffeccf5 in ffi_call (cif=<optimized out>, fn=0x7fffed772e90 <drmaa_get_next_job_id>, rvalue=<optimized out>, avalue=0x7fffffffc6c0) at /home/ilan/minonda/conda-bld/python_1494526091235/work/Python-3.6.1/Modules/_ctypes/libffi/src/x86/ffi64.c:525
#4  0x00007fffeffe483c in _call_function_pointer (argcount=3, resmem=0x7fffffffc6f0, restype=<optimized out>, atypes=<optimized out>, avalues=0x7fffffffc6c0, pProc=0x7fffed772e90 <drmaa_get_next_job_id>, flags=4353) at /home/ilan/minonda/conda-bld/python_1494526091235/work/Python-3.6.1/Modules/_ctypes/callproc.c:809
#5  _ctypes_callproc (pProc=0x7fffed772e90 <drmaa_get_next_job_id>, argtuple=0x7fffffffc7e0, flags=4353, argtypes=<optimized out>, restype=0x7ffff0212f28, checker=0x0) at /home/ilan/minonda/conda-bld/python_1494526091235/work/Python-3.6.1/Modules/_ctypes/callproc.c:1147
#6  0x00007fffeffdcda3 in PyCFuncPtr_call (self=<optimized out>, inargs=<optimized out>, kwds=0x0) at /home/ilan/minonda/conda-bld/python_1494526091235/work/Python-3.6.1/Modules/_ctypes/_ctypes.c:3870
#7  0x00007ffff793fade in _PyObject_FastCallDict (func=0x7fffeea655c0, args=<optimized out>, nargs=<optimized out>, kwargs=0x0) at Objects/abstract.c:2316
#8  0x00007ffff7a1c2bb in call_function (pp_stack=0x7fffffffcb18, oparg=<optimized out>, kwnames=0x0) at Python/ceval.c:4822
#9  0x00007ffff7a1f15d in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3284
#10 0x00007ffff7969e33 in gen_send_ex (gen=0x7fffefd90200, arg=<optimized out>, exc=<optimized out>, closing=<optimized out>) at Objects/genobject.c:189
#11 0x00007ffff7978f3e in listextend (self=0x7fffeea79d48, b=<optimized out>) at Objects/listobject.c:857
#12 0x00007ffff7979398 in list_init (self=0x7fffeea79d48, args=<optimized out>, kw=<optimized out>) at Objects/listobject.c:2316
#13 0x00007ffff79add4c in type_call (type=<optimized out>, args=0x7ffff7e8d470, kwds=0x0) at Objects/typeobject.c:915
#14 0x00007ffff793fade in _PyObject_FastCallDict (func=0x7ffff7d5bb40 <PyList_Type>, args=<optimized out>, nargs=<optimized out>, kwargs=0x0) at Objects/abstract.c:2316
#15 0x00007ffff7a1c2bb in call_function (pp_stack=0x7fffffffce58, oparg=<optimized out>, kwnames=0x0) at Python/ceval.c:4822
#16 0x00007ffff7a1f15d in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3284
#17 0x00007ffff7a1aa60 in _PyEval_EvalCodeWithName (_co=0x7ffff01fc420, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=1, kwnames=0x7ffff7e9dba0, kwargs=0x7ffff7f8fba8, kwcount=3, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x7ffff7ea3c30, qualname=0x7fffefd8d2b8) at Python/ceval.c:4128
#18 0x00007ffff7a1c48a in fast_function (kwnames=<optimized out>, nargs=1, stack=<optimized out>, func=0x7fffeea8c2f0) at Python/ceval.c:4939
#19 call_function (pp_stack=0x7fffffffd0f8, oparg=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:4819
#20 0x00007ffff7a1e8dd in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3300
#21 0x00007ffff7a1aa60 in _PyEval_EvalCodeWithName (_co=0x7ffff7f1b930, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=0, kwnames=0x0, kwargs=0x8, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:4128
#22 0x00007ffff7a1aee3 in PyEval_EvalCodeEx (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kws=<optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:4149
#23 0x00007ffff7a1af2b in PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at Python/ceval.c:695
#24 0x00007ffff7a4d6c0 in run_mod (arena=0x7ffff7f79180, flags=0x7fffffffd450, locals=0x7ffff7f5cf30, globals=0x7ffff7f5cf30, filename=0x7ffff7ea3830, mod=0x683f58) at Python/pythonrun.c:980
#25 PyRun_FileExFlags (fp=0x64cc30, filename_str=<optimized out>, start=<optimized out>, globals=0x7ffff7f5cf30, locals=0x7ffff7f5cf30, closeit=<optimized out>, flags=0x7fffffffd450) at Python/pythonrun.c:933
#26 0x00007ffff7a4ec83 in PyRun_SimpleFileExFlags (fp=0x64cc30, filename=<optimized out>, closeit=1, flags=0x7fffffffd450) at Python/pythonrun.c:396
#27 0x00007ffff7a6a0b5 in run_file (p_cf=0x7fffffffd450, filename=0x603310 L<error reading variable>, fp=0x64cc30) at Modules/main.c:338
#28 Py_Main (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:810
#29 0x0000000000400c1d in main (argc=2, argv=<optimized out>) at ./Programs/python.c:69

The same script runs fine with a libdrmaa built from https://github.com/ljyanesm/slurm-drmaa
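
Reading frame #0 of the backtrace: strlcpy is called with src=0x0, and line 50 of compat.c evaluates *src, so the crash is a NULL dereference. The destination buffer already holds "9829091", which suggests that one job ID had been copied out before the failing call. The sketch below is a pure-Python model of that lookup, not slurm-drmaa's code. JobIdIterator, get_next_job_id, collect, and the status constants are hypothetical names, and treating a NULL entry as end-of-list is an assumption made only for illustration.

```python
# Illustrative model only: none of these names come from slurm-drmaa or drmaa-python.
SUCCESS = "success"
NO_MORE_ELEMENTS = "no more elements"


class JobIdIterator:
    """Stand-in for the opaque job-ID iterator passed as `values`."""

    def __init__(self, ids):
        self.ids = list(ids)  # an entry may be None, modelling a NULL char*
        self.pos = 0


def get_next_job_id(it, value_len=1024):
    """Return (status, job_id), guarding the cases a bare copy would not."""
    if it.pos >= len(it.ids):
        # Iterator exhausted: report it instead of reading past the end.
        return NO_MORE_ELEMENTS, None
    src = it.ids[it.pos]
    if src is None:
        # C equivalent: src == NULL. `while (*src && --size > 0)` would
        # dereference it, which is the fault shown in frame #0.
        return NO_MORE_ELEMENTS, None
    it.pos += 1
    # Size-bounded copy: at most value_len - 1 characters, leaving room for NUL.
    return SUCCESS, src[: value_len - 1]


def collect(it):
    """Drain the iterator the way a binding would, stopping on any non-success."""
    job_ids = []
    while True:
        status, job_id = get_next_job_id(it)
        if status != SUCCESS:
            return job_ids
        job_ids.append(job_id)


if __name__ == "__main__":
    # Assumed shape of the failing case: one valid ID followed by a NULL entry.
    print(collect(JobIdIterator(["9829091", None])))
```

Whether the real iterator held a NULL entry or had simply run past its end cannot be told from the backtrace alone; the model covers both cases with the same guard.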

unode commented May 2, 2018

Confirming that the fix works. Thanks!
