Numpy complaining of reference count error in test suite. #8

Closed
jriehl opened this issue Jun 19, 2012 · 2 comments

jriehl commented Jun 19, 2012

When I run the test suite I'm (almost) consistently seeing the following output:

(numba)jriehl@dingo:~/git/numba/tests$ python -O test_all.py 
...
----------------------------------------------------------------------
Ran 33 tests in 0.369s

OK (skipped=1)
*** Reference count error detected: 
an attempt was made to deallocate 12 (d) ***
(numba)jriehl@dingo:~/git/numba/tests$  

I'm pleased to see 32 of 33 tests pass, but the reported reference count error indicates we aren't doing something correctly.

On the surface, this looks very much like a problem reported on Numpy-discussion here: http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059866.html

That discussion identifies the place in the NumPy C API where this error message is emitted.

The reference count error started appearing in test_all.py runs once I got the filter2d() test passing (see .../tests/test_filter2d.py; note that running this test alone doesn't seem to cause the problem). If this were a problem with reference counts on the result of a call to PyArray_Zeros(), I would have expected it to show up before now (by way of .../tests/test_extern_call.py). I can try adding a call to Py_IncRef() in the generated code and see if that fixes it, but I'm not sure that won't just introduce a memory leak. I understand we don't care too much about memory leaks in the compiler at the moment, but leaks in generated code would be much more serious.

jriehl commented Jun 19, 2012

Adding the call to Py_IncRef() in the code generated to simulate numpy.zeros_like() seemed to suppress the error report, but I can't guarantee I haven't introduced a memory leak in generated code. Comments welcome.
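
To make the leak concern concrete, here is a minimal sketch in plain CPython (not Numba's generated code) of what an unmatched Py_IncRef() does to an object's lifetime. The `Payload` class and the `extra_incref_leaks` helper are hypothetical names used only for illustration; the sketch assumes a standard CPython build where `ctypes.pythonapi` exposes `Py_IncRef` and `Py_DecRef`.

```python
import ctypes
import gc
import sys
import weakref

# Bind the C API functions with explicit signatures (CPython only).
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


class Payload:
    """Hypothetical stand-in for an object handed back by generated code."""


def extra_incref_leaks():
    obj = Payload()
    alive = weakref.ref(obj)  # lets us observe whether the object was freed

    before = sys.getrefcount(obj)
    _incref(obj)  # unmatched increment: no variable owns this reference
    after = sys.getrefcount(obj)

    del obj
    gc.collect()
    leaked = alive() is not None  # the object outlives its last name

    # Releasing the extra reference lets the object be freed again.
    survivor = alive()
    _decref(survivor)
    del survivor
    gc.collect()
    freed = alive() is None

    return after - before, leaked, freed
```

If the extra increment in the generated code is not paired with a decrement somewhere, the array would behave like `Payload` does between the two halves of this function: the error report goes away, but the object is never released.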

jriehl commented Aug 13, 2012

No comments? Fine then, resolved.

jriehl closed this as completed Aug 13, 2012
Hardcode84 added a commit to Hardcode84/numba that referenced this issue Nov 17, 2020
esc added a commit to esc/numba that referenced this issue May 10, 2024
This fixes a test in which a non-thread-safe container is written to
during testing.

To reproduce on at least `linux-64` and `osx-arm64` (and probably
others):

```
NUMBA_THREADING_LAYER=workqueue SUBPROC_TEST=1 ./runtests.py -m 32 numba.tests.test_parfors.TestPrangeSpecific.test_tuple_hoisting
```

On `linux-64`, this can be debugged with `gdb`:

```
(gdb) bt
    closure____locals______listcomp____v15____v2build__list__0=...) at <string>:4438
(gdb) f 9
    closure____locals______listcomp____v15____v2build__list__0=...) at <string>:4438
4438    <string>: No such file or directory.
(gdb) i args
sched = {meminfo = 0x0, parent = 0x0, nitems = 0, itemsize = 0, data = 0x0, shape = {0}, strides = {0}}
closure____locals______listcomp____v15____v2build__list__0 = {meminfo = 0x0, parent = 0x0}
(gdb) f 8
498         mi->data = NRT_Reallocate(mi->data, size);
(gdb) i args
mi = 0x21d1660
size = 2510933856
```

On `osx-arm64` we can use `lldb`:

```
(lldb) run runtests.py -m 32 numba.tests.test_parfors.TestPrangeSpecific.test_tuple_hoisting
Process 13575 launched: '/Users/esc/miniconda3-arm64/envs/numba_3.9/bin/python3' (arm64)
Parallel: 0. Serial: 1
python3(13575,0x17025b000) malloc: Non-aligned pointer 0x600000256880 being freed (2)
python3(13575,0x17025b000) malloc: *** set a breakpoint in malloc_error_break to debug
Process 13575 stopped
* thread #18, stop reason = signal SIGABRT
    frame #0: 0x000000019c35c704 libsystem_kernel.dylib`__pthread_kill + 8
libsystem_kernel.dylib`:
->  0x19c35c704 <+8>:  b.lo   0x19c35c724               ; <+40>
    0x19c35c708 <+12>: pacibsp
    0x19c35c70c <+16>: stp    x29, x30, [sp, #-0x10]!
    0x19c35c710 <+20>: mov    x29, sp
Target 0: (python3) stopped.
(lldb) bt
* thread #18, stop reason = signal SIGABRT
  * frame #0: 0x000000019c35c704 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x000000019c393c28 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x000000019c2a1ae8 libsystem_c.dylib`abort + 180
    frame #3: 0x000000019c1c2e28 libsystem_malloc.dylib`malloc_vreport + 908
    frame #4: 0x000000019c1d95d4 libsystem_malloc.dylib`malloc_zone_error + 104
    frame #5: 0x000000019c1ca620 libsystem_malloc.dylib`_szone_free + 628
    frame #6: 0x000000019c1b87f4 libsystem_malloc.dylib`nanov2_realloc + 356
    frame #7: 0x000000019c1b85a4 libsystem_malloc.dylib`malloc_zone_realloc + 112
    frame #8: 0x000000019c1b7110 libsystem_malloc.dylib`realloc + 388
    frame #9: 0x000000013a9ff0f8 _nrt_python.cpython-39-darwin.so`NRT_MemInfo_varsize_realloc + 60
    frame #10: 0x000000013d4f41e0
    frame #11: 0x000000019c393fa8 libsystem_pthread.dylib`_pthread_start + 148
```
esc added a commit to esc/numba that referenced this issue May 10, 2024
This fixes a test in which a non-thread-safe container is written to
during testing.

To reproduce on at least `linux-64` and `osx-arm64` (and probably
others):

```
NUMBA_THREADING_LAYER=workqueue SUBPROC_TEST=1 ./runtests.py -m 32 numba.tests.test_parfors.TestPrangeSpecific.test_tuple_hoisting
```

On `linux-64`, this can be debugged with `gdb`:

```
(gdb) bt
0  0x00007ffff7c8018b in raise () from /lib/x86_64-linux-gnu/libc.so.6
1  0x00007ffff7c5f859 in abort () from /lib/x86_64-linux-gnu/libc.so.6
2  0x00007ffff7cca3ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
3  0x00007ffff7cd247c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
4  0x00007ffff7cd412c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
5  0x00007ffff7cd6105 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
6  0x00007ffff7cd82d6 in realloc () from /lib/x86_64-linux-gnu/libc.so.6
7  0x00007fffe706de21 in NRT_Reallocate (ptr=0x224aef0, size=2552489952) at numba/core/runtime/nrt.cpp:539
8  0x00007fffe706dcf6 in NRT_MemInfo_varsize_realloc (mi=0x1fe9580, size=2552489952) at numba/core/runtime/nrt.cpp:498
9  0x00007fffdcabde0d in _3cdynamic_3e::__numba_parfor_gufunc_0x7fffdc60eb20[abi:v19][abi:c8tJTC_2fWQAliW1xhDEoY6EEMEUOEMISPGsAQMVj4QniQ4IXKQEMXwoMGLoQDDVsQR1NHAZtvoQrhyQ_2fKR8sTqKIYOQAmjYgkW7ADge6ERATM1UUQpZoA](Array<unsigned long long, 1, C, mutable, aligned>, list_28Tuple_28DictType_5bint64_2cfloat64_5d_3civ_3dNone_3e_2c_20array_28float64_2c_201d_2c_20C_29_29_29_3civ_3dNone_3e) (sched=...,
   closure____locals______listcomp____v15____v2build__list__0=...) at <string>:4438
10 0x00007fffdcab624e in __gufunc__._ZN13_3cdynamic_3e36__numba_parfor_gufunc_0x7fffdc60eb20B3v19B120c8tJTC_2fWQAliW1xhDEoY6EEMEUOEMISPGsAQMVj4QniQ4IXKQEMXwoMGLoQDDVsQR1NHAZtvoQrhyQ_2fKR8sTqKIYOQAmjYgkW7ADge6ERATM1UUQpZoAE5ArrayIyLi1E1C7mutable7alignedE119list_28Tuple_28DictType_5bint64_2cfloat64_5d_3civ_3dNone_3e_2c_20array_28float64_2c_201d_2c_20C_29_29_29_3civ_3dNone_3e ()
11 0x00007fffdc926a3b in thread_worker (arg=0x1bd49c0) at numba/np/ufunc/workqueue.c:567
12 0x00007ffff7f8f609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
13 0x00007ffff7d5c293 in clone () from /lib/x86_64-linux-gnu/libc.so.6
```

On `osx-arm64` we can use `lldb`:

```
(lldb) run runtests.py -m 32 numba.tests.test_parfors.TestPrangeSpecific.test_tuple_hoisting
Process 13575 launched: '/Users/esc/miniconda3-arm64/envs/numba_3.9/bin/python3' (arm64)
Parallel: 0. Serial: 1
python3(13575,0x17025b000) malloc: Non-aligned pointer 0x600000256880 being freed (2)
python3(13575,0x17025b000) malloc: *** set a breakpoint in malloc_error_break to debug
Process 13575 stopped
* thread #18, stop reason = signal SIGABRT
    frame #0: 0x000000019c35c704 libsystem_kernel.dylib`__pthread_kill + 8
libsystem_kernel.dylib`:
->  0x19c35c704 <+8>:  b.lo   0x19c35c724               ; <+40>
    0x19c35c708 <+12>: pacibsp
    0x19c35c70c <+16>: stp    x29, x30, [sp, #-0x10]!
    0x19c35c710 <+20>: mov    x29, sp
Target 0: (python3) stopped.
(lldb) bt
* thread #18, stop reason = signal SIGABRT
  * frame #0: 0x000000019c35c704 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x000000019c393c28 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x000000019c2a1ae8 libsystem_c.dylib`abort + 180
    frame #3: 0x000000019c1c2e28 libsystem_malloc.dylib`malloc_vreport + 908
    frame #4: 0x000000019c1d95d4 libsystem_malloc.dylib`malloc_zone_error + 104
    frame #5: 0x000000019c1ca620 libsystem_malloc.dylib`_szone_free + 628
    frame #6: 0x000000019c1b87f4 libsystem_malloc.dylib`nanov2_realloc + 356
    frame #7: 0x000000019c1b85a4 libsystem_malloc.dylib`malloc_zone_realloc + 112
    frame #8: 0x000000019c1b7110 libsystem_malloc.dylib`realloc + 388
    frame #9: 0x000000013a9ff0f8 _nrt_python.cpython-39-darwin.so`NRT_MemInfo_varsize_realloc + 60
    frame #10: 0x000000013d4f41e0
    frame #11: 0x000000019c393fa8 libsystem_pthread.dylib`_pthread_start + 148
```
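
Both backtraces end in `NRT_MemInfo_varsize_realloc` called from a worker thread, which is consistent with a growable container being resized while several threads write to it. As an illustration only (plain Python threads, not Numba's parfors and not necessarily the change made in this commit), one common way to avoid that pattern is to size the output up front and have each task write only its own slot. The function name `squares_per_slot` and its parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def squares_per_slot(n, workers=4):
    # Allocate the full output before any worker starts, so the
    # container is never resized while tasks are running.
    out = [None] * n

    def work(i):
        # Each task touches exactly one index; no two tasks share a slot.
        out[i] = i * i

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so any exception in a task is raised here.
        list(pool.map(work, range(n)))
    return out
```

The point of the design is that the shared object's size is fixed for the lifetime of the parallel region, so no reallocation can race with a write.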