Valgrinderr #461

Merged
merged 2 commits into from
Mar 26, 2020

nrnhines
Member

Fix valgrind invalid read error during finalize that could cause segmentation violations and
a memory leak when iterating over variables of a mechanism.

Block was alloc'd during nrnpy_hoc (nrnpy_hoc.cpp:2690)
The hocmodule refcnt was mistakenly decremented.
@nrnhines nrnhines self-assigned this Mar 26, 2020
@pramodk
Member

pramodk commented Mar 26, 2020

@nrnhines: could you provide a bit more detail about this fix? Yesterday, @jorblancoa and I were debugging an issue with a model that works fine with Python2 but segfaults with Python3. The backtrace looked like this:

(gdb) bt
#0  0x00002aaaac785c7c in call_picklef (fname=0xb5cdb0 "\200\003c__main__\noptim\nq", size=21, narg=4, retsize=0x7fffffff3e30) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrnpython/nrnpy_p2h.cpp:733
#1  0x00002aaaab2a3d58 in BBSImpl::execute_helper (this=0xa0f490, size=0x7fffffff3e30, id=20, exec=true) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrniv/../parallel/ocbbs.cpp:1267
#2  0x00002aaaab2a4f28 in BBSImpl::execute (this=0xa0f490, id=20) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrniv/../parallel/bbs.cpp:304
#3  0x00002aaaab2a5762 in BBSImpl::worker (this=0xa0f490) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrniv/../parallel/bbs.cpp:440
#4  0x00002aaaab2a567e in BBS::worker (this=0x915470) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrniv/../parallel/bbs.cpp:422
#5  0x00002aaaab2a0e36 in worker (v=0x915470) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrniv/../parallel/ocbbs.cpp:273
#6  0x00002aaaaaf3a765 in hoc_call_ob_proc (ob=0xa50e80, sym=0x73b7c0, narg=0) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrnoc/../oc/hoc_oop.c:676
#7  0x00002aaaaaf3b8e7 in hoc_object_component () at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrnoc/../oc/hoc_oop.c:1091
#8  0x00002aaaac7742e3 in component (po=0x2aaac53a2b28) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrnpython/nrnpy_hoc.cpp:446
#9  0x00002aaaac77493a in fcall (vself=0x2aaac53a2b28, vargs=0x2aaaaac84048) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrnpython/nrnpy_hoc.cpp:638
#10 0x00002aaaab26a293 in OcJumpImpl::fpycall (this=0x9de9c0, f=0x2aaaac7748b0 <fcall(void*, void*)>, a=0x2aaac53a2b28, b=0x2aaaaac84048) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrniv/../ivoc/ocjump.cpp:218
#11 0x00002aaaab26a076 in OcJump::fpycall (this=0xb825b0, f=0x2aaaac7748b0 <fcall(void*, void*)>, a=0x2aaac53a2b28, b=0x2aaaaac84048) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrniv/../ivoc/ocjump.cpp:151
#12 0x00002aaaac774d01 in hocobj_call (self=0x2aaac53a2b28, args=0x2aaaaac84048, kwrds=0x0) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrnpython/nrnpy_hoc.cpp:735
#13 0x00002aaaad6859e9 in _PyObject_FastCallDict (func=0x2aaac53a2b28, args=<optimized out>, nargs=<optimized out>, kwargs=kwargs@entry=0x0) at Objects/abstract.c:2331
#14 0x00002aaaad685e61 in _PyObject_FastCallKeywords (func=func@entry=0x2aaac53a2b28, stack=stack@entry=0xba74e0, nargs=nargs@entry=0, kwnames=kwnames@entry=0x0) at Objects/abstract.c:2496
#15 0x00002aaaad779db9 in call_function (pp_stack=pp_stack@entry=0x7fffffff4470, oparg=<optimized out>, kwnames=kwnames@entry=0x0) at Python/ceval.c:4848
#16 0x00002aaaad77cb45 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3322
#17 0x00002aaaad77a259 in _PyFunction_FastCall (globals=<optimized out>, nargs=7, args=<optimized out>, co=<optimized out>) at Python/ceval.c:4906
#18 fast_function (kwnames=0x0, nargs=7, stack=0x2aaaba2bbee8, func=0x2aaaba26b158) at Python/ceval.c:4941
#19 call_function (pp_stack=pp_stack@entry=0x7fffffff45e0, oparg=<optimized out>, kwnames=kwnames@entry=0x0) at Python/ceval.c:4845
#20 0x00002aaaad77cb45 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3322
#21 0x00002aaaad779c5a in _PyEval_EvalCodeWithName (_co=_co@entry=0x2aaaba30cc00, globals=globals@entry=0x2aaaba266c18, locals=locals@entry=0x2aaaba266c18, args=args@entry=0x0, argcount=argcount@entry=0, kwnames=kwnames@entry=0x0, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:4153
#22 0x00002aaaad77a2be in PyEval_EvalCodeEx (_co=_co@entry=0x2aaaba30cc00, globals=globals@entry=0x2aaaba266c18, locals=locals@entry=0x2aaaba266c18, args=args@entry=0x0, argcount=argcount@entry=0, kws=kws@entry=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:4174
#23 0x00002aaaad77a2eb in PyEval_EvalCode (co=co@entry=0x2aaaba30cc00, globals=globals@entry=0x2aaaba266c18, locals=locals@entry=0x2aaaba266c18) at Python/ceval.c:730
#24 0x00002aaaad7afa6a in run_mod (arena=0x2aaaba1f4198, flags=0x0, locals=0x2aaaba266c18, globals=0x2aaaba266c18, filename=0x2aaaba356430, mod=0x7efba0) at Python/pythonrun.c:1025
#25 PyRun_FileExFlags (fp=fp@entry=0x7c8770, filename_str=filename_str@entry=0x7fffffff5dc4 "fitting.3.py", start=start@entry=257, globals=globals@entry=0x2aaaba266c18, locals=locals@entry=0x2aaaba266c18, closeit=closeit@entry=0, flags=0x0) at Python/pythonrun.c:978
#26 0x00002aaaad7afbbf in PyRun_SimpleFileExFlags (fp=0x7c8770, filename=<optimized out>, closeit=0, flags=0x0) at Python/pythonrun.c:420
#27 0x00002aaaac772ce4 in nrnpy_pyrun (fname=0x7fffffff5dc4 "fitting.3.py") at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrnpython/nrnpython.cpp:108
#28 0x00002aaaac773183 in nrnpython_start (b=2) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrnpython/nrnpython.cpp:242
#29 0x00000000004036a0 in __static_initialization_and_destruction_0 (__initialize_p=0, __priority=4205504) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/ivoc/ivocmain.cpp:830
#30 0x0000000000402e88 in nrn_optarg_on (opt=0xb00402bc0 <error: Cannot access memory at address 0xb00402bc0>, pargc=0x7fffffff4ac8, argv=0x60a940) at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/ivoc/ivocmain.cpp:296
#31 0x00002aaaaeaebf8a in __libc_start_main () from /lib64/libc.so.6
#32 0x0000000000402bea in nrn_nvkludge_dummy () at /scratch/snx3000/bp000174/soft/.stage/spack-stage-neuron-7.8.0b-srdjvddtoyp3bjp3n3z3quqfmar6rjx6/spack-src/src/nrniv/nvkludge.cpp:38
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

A simple example with the bulletin board was working fine (relevant discussion here).
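
The `fname` argument in frame #0 of the backtrace looks like the start of a pickle. A minimal sketch for decoding it with the standard `pickletools` module is shown here. It assumes that the two bytes gdb did not print (gdb stops at the first NUL) are `\x00.`, which is what makes the length match `size=21`; that trailing part is an inference, not something shown in the backtrace.

```python
import pickletools

# Bytes printed by gdb for `fname` (octal \200\003 == hex \x80\x03).
# ASSUMPTION: the trailing b"\x00." is not visible in the backtrace; it is
# added here so that the payload length equals size=21.
payload = b"\x80\x03c__main__\noptim\nq" + b"\x00."

# genops walks the opcodes without importing or executing anything.
opcodes = [(op.name, arg) for op, arg, _pos in pickletools.genops(payload)]

for name, arg in opcodes:
    print(name, arg)
```

Under that assumption the payload is a protocol 3 pickle (the default in Python 3.0 through 3.7) that refers to a global named `optim` in `__main__`, so the crash happens while the worker handles a pickled Python callable.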

@nrnhines
Member Author

nrnhines commented Mar 26, 2020

I noticed yesterday that, although the pyiter3 pull request passed Travis CI, I was seeing
segmentation violations on my mac with 'make test', and the same problem with python3.8 and
python3.6 on linux (though python3.7 and python3.5 worked). Running the tests that were giving
segfaults under valgrind, I saw that during finalize there were many invalid read errors due to
reading memory that was allocated during PyModule_Create. On removing the extra Py_DECREF
(which had been present for many years), all the invalid read errors went away. I think it is worth
trying your test again with this change.
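
As a rough illustration of the reference-count bookkeeping involved (not the actual code in nrnpy_hoc.cpp), the sketch below uses `ctypes.pythonapi.Py_IncRef` and `Py_DecRef`, the function forms of `Py_INCREF` and `Py_DECREF`, on a throwaway module object. The name `demo_module` is hypothetical. The calls are deliberately kept balanced: an unmatched decrement, which is the kind of error described above, would leave the object with one reference fewer than it has owners and could corrupt memory.

```python
import ctypes
import sys
import types

# Function forms of Py_INCREF / Py_DECREF exported by CPython.
incref = ctypes.pythonapi.Py_IncRef
decref = ctypes.pythonapi.Py_DecRef
for fn in (incref, decref):
    fn.argtypes = [ctypes.py_object]
    fn.restype = None

# A throwaway module object standing in for one made by PyModule_Create.
# "demo_module" is a hypothetical name used only for this sketch.
mod = types.ModuleType("demo_module")

base = sys.getrefcount(mod)

incref(mod)                          # a second owner takes a reference
after_incref = sys.getrefcount(mod)

decref(mod)                          # that owner releases it again
after_decref = sys.getrefcount(mod)

# Only the differences are meaningful; absolute counts vary by interpreter.
print(after_incref - base, after_decref - base)
```

Every decrement should pair with a reference the caller actually owns. When one does not, the object can be freed while something still points at it, and whether that shows up as a crash depends on whether the freed memory gets reused.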

@pramodk
Member

pramodk commented Mar 26, 2020

If this is related only to pyiter3, then it's not relevant to the bug that I mentioned. We are using the following commit:

 → git show 92a208b
commit 92a208b6d7787c708f3c4fdd8121af9df22b6c6f
Author: adamjhn <adam.newton@yale.edu>
Date:   Tue Oct 29 12:49:07 2019 -0400

    bugfix: exception when deleting a species/parameter before initialization. (#295)

In case you know of any reason why a model that works with Python2 would segfault when ported to Python3 with minor changes, let me know. (I understand this is a very high-level description without any details, but there may be a common pitfall with NEURON usage.)

@nrnhines
Member Author

I don't know that it is relevant, but it may be the fix you need. That bug dates back a long way.

@nrnhines nrnhines merged commit fd87eaf into master Mar 26, 2020
@nrnhines nrnhines deleted the valgrinderr branch March 26, 2020 23:32
@pramodk
Member

pramodk commented Mar 26, 2020

I don't know that it is relevant, but it may be the fix you need. That bug dates back a long way.

Sorry for the confusion - I meant we are using the master branch from Tue Oct 29 12:49:07 2019.

@nrnhines
Member Author

This is the kind of bug that may or may not be revealed, depending on whether the particular
python reuses the memory of a module that has one less reference than it needs.
The bug is specific to python3. Before this change, things failed with python3.6.10 and python3.8.1
and succeeded with python3.5.9 and python3.7.6 on my linux system, and failed on my mac. Now it
succeeds for all of those and the valgrind errors are gone.

olupton pushed a commit that referenced this pull request Dec 7, 2022
* Made sure nightly tag is set correctly 
* moved manual trigger variables into UI
* Fixed various issues for testing on different python versions
* Generate artifacts for wheels

This fixes #448
2 participants