Numpy complaining of reference count error in test suite. #8
Comments
Adding the call to
No comments? Fine then, resolved.
Hardcode84 added a commit to Hardcode84/numba that referenced this issue Nov 17, 2020
lower unknown loops to scf.while
esc added a commit to esc/numba that referenced this issue May 10, 2024

This fixes a test, where a non-thread safe container is written to during testing

To reproduce on at least `linux-64` and `osx-arm64` (and probably others):

```
NUMBA_THREADING_LAYER=workqueue SUBPROC_TEST=1 ./runtests.py -m 32 numba.tests.test_parfors.TestPrangeSpecific.test_tuple_hoisting
```

On `linux-64`, this can be debugged with `gdb`:

```
(gdb) bt
closure____locals______listcomp____v15____v2build__list__0=...) at <string>:4438
(gdb) f 9
closure____locals______listcomp____v15____v2build__list__0=...) at <string>:4438
4438 <string>: No such file or directory.
(gdb) i args
sched = {meminfo = 0x0, parent = 0x0, nitems = 0, itemsize = 0, data = 0x0, shape = {0}, strides = {0}}
closure____locals______listcomp____v15____v2build__list__0 = {meminfo = 0x0, parent = 0x0}
(gdb) f 8
498 mi->data = NRT_Reallocate(mi->data, size);
(gdb) i args
mi = 0x21d1660
size = 2510933856
```

On `osx-arm64` we can use `lldb`:

```
(lldb) run runtests.py -m 32 numba.tests.test_parfors.TestPrangeSpecific.test_tuple_hoisting
Process 13575 launched: '/Users/esc/miniconda3-arm64/envs/numba_3.9/bin/python3' (arm64)
Parallel: 0. Serial: 1
python3(13575,0x17025b000) malloc: Non-aligned pointer 0x600000256880 being freed (2)
python3(13575,0x17025b000) malloc: *** set a breakpoint in malloc_error_break to debug
Process 13575 stopped
* thread numba#18, stop reason = signal SIGABRT
    frame #0: 0x000000019c35c704 libsystem_kernel.dylib`__pthread_kill + 8
libsystem_kernel.dylib`:
-> 0x19c35c704 <+8>: b.lo 0x19c35c724 ; <+40>
   0x19c35c708 <+12>: pacibsp
   0x19c35c70c <+16>: stp x29, x30, [sp, #-0x10]!
   0x19c35c710 <+20>: mov x29, sp
Target 0: (python3) stopped.
(lldb) bt
* thread numba#18, stop reason = signal SIGABRT
  * frame #0: 0x000000019c35c704 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x000000019c393c28 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x000000019c2a1ae8 libsystem_c.dylib`abort + 180
    frame #3: 0x000000019c1c2e28 libsystem_malloc.dylib`malloc_vreport + 908
    frame #4: 0x000000019c1d95d4 libsystem_malloc.dylib`malloc_zone_error + 104
    frame #5: 0x000000019c1ca620 libsystem_malloc.dylib`_szone_free + 628
    frame #6: 0x000000019c1b87f4 libsystem_malloc.dylib`nanov2_realloc + 356
    frame #7: 0x000000019c1b85a4 libsystem_malloc.dylib`malloc_zone_realloc + 112
    frame numba#8: 0x000000019c1b7110 libsystem_malloc.dylib`realloc + 388
    frame numba#9: 0x000000013a9ff0f8 _nrt_python.cpython-39-darwin.so`NRT_MemInfo_varsize_realloc + 60
    frame numba#10: 0x000000013d4f41e0
    frame numba#11: 0x000000019c393fa8 libsystem_pthread.dylib`_pthread_start + 148
```
esc added a commit to esc/numba that referenced this issue May 10, 2024

This fixes a test, where a non-thread safe container is written to during testing

To reproduce on at least `linux-64` and `osx-arm64` (and probably others):

```
NUMBA_THREADING_LAYER=workqueue SUBPROC_TEST=1 ./runtests.py -m 32 numba.tests.test_parfors.TestPrangeSpecific.test_tuple_hoisting
```

On `linux-64`, this can be debugged with `gdb`:

```
(gdb) bt
0 0x00007ffff7c8018b in raise () from /lib/x86_64-linux-gnu/libc.so.6
1 0x00007ffff7c5f859 in abort () from /lib/x86_64-linux-gnu/libc.so.6
2 0x00007ffff7cca3ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
3 0x00007ffff7cd247c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
4 0x00007ffff7cd412c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
5 0x00007ffff7cd6105 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
6 0x00007ffff7cd82d6 in realloc () from /lib/x86_64-linux-gnu/libc.so.6
7 0x00007fffe706de21 in NRT_Reallocate (ptr=0x224aef0, size=2552489952) at numba/core/runtime/nrt.cpp:539
8 0x00007fffe706dcf6 in NRT_MemInfo_varsize_realloc (mi=0x1fe9580, size=2552489952) at numba/core/runtime/nrt.cpp:498
9 0x00007fffdcabde0d in _3cdynamic_3e::__numba_parfor_gufunc_0x7fffdc60eb20[abi:v19][abi:c8tJTC_2fWQAliW1xhDEoY6EEMEUOEMISPGsAQMVj4QniQ4IXKQEMXwoMGLoQDDVsQR1NHAZtvoQrhyQ_2fKR8sTqKIYOQAmjYgkW7ADge6ERATM1UUQpZoA](Array<unsigned long long, 1, C, mutable, aligned>, list_28Tuple_28DictType_5bint64_2cfloat64_5d_3civ_3dNone_3e_2c_20array_28float64_2c_201d_2c_20C_29_29_29_3civ_3dNone_3e) (sched=..., closure____locals______listcomp____v15____v2build__list__0=...) at <string>:4438
10 0x00007fffdcab624e in __gufunc__._ZN13_3cdynamic_3e36__numba_parfor_gufunc_0x7fffdc60eb20B3v19B120c8tJTC_2fWQAliW1xhDEoY6EEMEUOEMISPGsAQMVj4QniQ4IXKQEMXwoMGLoQDDVsQR1NHAZtvoQrhyQ_2fKR8sTqKIYOQAmjYgkW7ADge6ERATM1UUQpZoAE5ArrayIyLi1E1C7mutable7alignedE119list_28Tuple_28DictType_5bint64_2cfloat64_5d_3civ_3dNone_3e_2c_20array_28float64_2c_201d_2c_20C_29_29_29_3civ_3dNone_3e ()
11 0x00007fffdc926a3b in thread_worker (arg=0x1bd49c0) at numba/np/ufunc/workqueue.c:567
12 0x00007ffff7f8f609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
13 0x00007ffff7d5c293 in clone () from /lib/x86_64-linux-gnu/libc.so.6
```

On `osx-arm64` we can use `lldb`:

```
(lldb) run runtests.py -m 32 numba.tests.test_parfors.TestPrangeSpecific.test_tuple_hoisting
Process 13575 launched: '/Users/esc/miniconda3-arm64/envs/numba_3.9/bin/python3' (arm64)
Parallel: 0. Serial: 1
python3(13575,0x17025b000) malloc: Non-aligned pointer 0x600000256880 being freed (2)
python3(13575,0x17025b000) malloc: *** set a breakpoint in malloc_error_break to debug
Process 13575 stopped
* thread numba#18, stop reason = signal SIGABRT
    frame #0: 0x000000019c35c704 libsystem_kernel.dylib`__pthread_kill + 8
libsystem_kernel.dylib`:
-> 0x19c35c704 <+8>: b.lo 0x19c35c724 ; <+40>
   0x19c35c708 <+12>: pacibsp
   0x19c35c70c <+16>: stp x29, x30, [sp, #-0x10]!
   0x19c35c710 <+20>: mov x29, sp
Target 0: (python3) stopped.
(lldb) bt
* thread numba#18, stop reason = signal SIGABRT
  * frame #0: 0x000000019c35c704 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x000000019c393c28 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x000000019c2a1ae8 libsystem_c.dylib`abort + 180
    frame #3: 0x000000019c1c2e28 libsystem_malloc.dylib`malloc_vreport + 908
    frame #4: 0x000000019c1d95d4 libsystem_malloc.dylib`malloc_zone_error + 104
    frame #5: 0x000000019c1ca620 libsystem_malloc.dylib`_szone_free + 628
    frame #6: 0x000000019c1b87f4 libsystem_malloc.dylib`nanov2_realloc + 356
    frame #7: 0x000000019c1b85a4 libsystem_malloc.dylib`malloc_zone_realloc + 112
    frame numba#8: 0x000000019c1b7110 libsystem_malloc.dylib`realloc + 388
    frame numba#9: 0x000000013a9ff0f8 _nrt_python.cpython-39-darwin.so`NRT_MemInfo_varsize_realloc + 60
    frame numba#10: 0x000000013d4f41e0
    frame numba#11: 0x000000019c393fa8 libsystem_pthread.dylib`_pthread_start + 148
```
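Both backtraces in the commit message end in `NRT_MemInfo_varsize_realloc` called from a worker thread, which is consistent with the commit's description of a growable, non-thread-safe container being written to in parallel. The change made in the referenced commit is not shown here. As a minimal sketch of one general way to avoid that class of bug, the example below uses plain Python threads (not Numba's `prange`) and a preallocated output where each task writes only to its own slot, so nothing is resized concurrently. The name `fill_slots` and its parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fill_slots(n_items, n_workers=4):
    """Compute i*i for each index, with every task writing to its own slot."""
    # Preallocate the output once; it is never resized while workers run.
    out = [None] * n_items

    def work(i):
        # Each index is written by exactly one task, so no two workers
        # ever touch the same slot and no reallocation is triggered.
        out[i] = i * i

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Consume the iterator so any exception raised in a worker surfaces here.
        list(pool.map(work, range(n_items)))
    return out


result = fill_slots(8)
```

The design choice is to fix the container's size before the parallel region starts, rather than appending to a shared container from inside it.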
When I run the test suite I'm (almost) consistently seeing the following output:
I'm pleased to see 32 of 33 tests pass, but the reported reference count error indicates we aren't doing something correctly.
On the surface, this looks very much like a problem reported on Numpy-discussion here: http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059866.html
That discussion identifies the Numpy C API where this error message is output.
The reference count error started happening in `test_all.py` runs when I got the `filter2d()` test passing (see `.../tests/test_filter2d.py`; note that running this test alone doesn't seem to cause this problem). If this was a problem with reference counts on the result of a call to `PyArray_Zeros()`, I would have thought it would have shown up before now (by way of `.../tests/test_extern_call.py`). I can try adding a call to `Py_IncRef()` in the generated code and see if that fixes it, but I'm not sure that just won't introduce a memory leak. I understand we don't care too much about memory leaks in the compiler at the moment, but leaks in generated code would be much more serious.
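The trade-off described above, where an extra `Py_IncRef()` might silence the error but leave a reference that is never released, can be illustrated from Python. This is a minimal sketch using only the standard library: it calls the real C-API functions `Py_IncRef` and `Py_DecRef` through `ctypes.pythonapi` and compares `sys.getrefcount()` before and after. It does not reproduce the generated code from this issue; the helper names `refcount_delta`, `extra_incref`, and `balanced` are hypothetical.

```python
import ctypes
import sys

# Declare the C-API signatures so ctypes passes the object pointer unchanged.
ctypes.pythonapi.Py_IncRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_IncRef.restype = None
ctypes.pythonapi.Py_DecRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_DecRef.restype = None


def refcount_delta(func, obj):
    """Return how much calling func(obj) changed obj's reference count."""
    before = sys.getrefcount(obj)
    func(obj)
    return sys.getrefcount(obj) - before


def extra_incref(obj):
    # An increment with no matching decrement: the object can never be freed.
    ctypes.pythonapi.Py_IncRef(obj)


def balanced(obj):
    # Increment and decrement paired: the net reference count is unchanged.
    ctypes.pythonapi.Py_IncRef(obj)
    ctypes.pythonapi.Py_DecRef(obj)


probe = object()
leak = refcount_delta(extra_incref, probe)  # net change of one leaked reference
ok = refcount_delta(balanced, probe)        # no net change

# Undo the deliberately leaked reference so the sketch itself does not leak.
ctypes.pythonapi.Py_DecRef(probe)
```

A check of this kind, run around a call into generated code, is one way to tell whether an added increment balances an existing imbalance or introduces a new leak.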