Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Returning borrowed reference Python objects causes reference count error. #21

Closed
alexbw opened this issue Aug 15, 2012 · 9 comments
Closed

Comments

@alexbw
Copy link

alexbw commented Aug 15, 2012

Making a small modification to the fbcorr code like this, to call the fbcorr() method from within a for loop causes a crash:

import numpy as np

from numba.decorators import jit
nd4type = [[[['d']]]]

@jit(ret_type=nd4type, arg_types=(nd4type, nd4type, nd4type))
def fbcorr(imgs, filters, output):
    n_imgs, n_rows, n_cols, n_channels = imgs.shape
    n_filters, height, width, n_ch2 = filters.shape

    for ii in range(n_imgs):
        for rr in range(n_rows - height + 1):
            for cc in range(n_cols - width + 1):
                for hh in xrange(height):
                    for ww in xrange(width):
                        for jj in range(n_channels):
                            for ff in range(n_filters):
                                imgval = imgs[ii, rr + hh, cc + ww, jj]
                                filterval = filters[ff, hh, ww, jj]
                                output[ii, ff, rr, cc] += imgval * filterval

    return output

def main ():
    imgs = np.random.randn(10, 64, 64, 3)
    filt = np.random.randn(6, 5, 5, 3)
    output = np.zeros((10, 60, 60, 6))

    import time
    t0 = time.time()
    for i in range(5):
        fbcorr(imgs, filt, output)
    print time.time() - t0

Produces the following crash:

In [3]: main()
python(59743,0x7fff7bf5d960) malloc: *** error for object 0x10dcd5000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
[1]    59743 abort      ipython
@jriehl
Copy link
Contributor

jriehl commented Aug 17, 2012

Good catch. James submitted (and I pushed) that example knowing that it was causing segmentation faults, but we didn't have time to debug it at the SciPy sprint.

(Conjecture, but feels spot on:) This code (along with other code posted on the discussion list) returns the output parameter. Since the generated code doesn't increment the reference count, the output array gets free()'d when the return value is ignored and popped off the interpreter's value stack. Subsequent uses of that array are writing to unmapped memory, causing segmentation faults, or double free()'s.

See if removing the "return output" line fixes it, and I'll try the same.

As far as the prognosis for fixing this goes, I can most likely write a quick fix that detects escaping array objects, and increments the reference count before returning to Python (taking care to remove the reference count increment added to fix issue 2). This complicates LLVM-to-LLVM code calls, however, since nesting calls will return an incorrect reference count (for example, a JIT'ed function returning the array returned by another JIT'ed function would have 1 more reference count than needed, causing a memory leak).

@jriehl
Copy link
Contributor

jriehl commented Aug 17, 2012

Oops. Just for hygiene, the referenced fix was for issue #8. #8

@jriehl
Copy link
Contributor

jriehl commented Aug 17, 2012

See the commit for my modified version of the fbcorr example. This works for me, but note that the return type has changed, in addition to removing the return statement (when Numba currently tries to cast None to a 4-d array of doubles, it hits a LLVM assertion and dies).

@alexbw
Copy link
Author

alexbw commented Aug 19, 2012

I will test this for my code tomorrow.

@alexbw
Copy link
Author

alexbw commented Aug 20, 2012

Works for me now! I don't know how you want to inform users of this quirk. Outside of Numba, these pointer problems are either hyper-explicit (Cython) or completely transparent (Python/Numpy).

I wouldn't mind Numba throwing an error if I tried to return an array. I also can't allocate arrays of arbitrary sizes (np.zeros() et al don't work right now, I think), so it seems reasonable to beep at the user if a return statement will cause problems.

@alexbw
Copy link
Author

alexbw commented Aug 20, 2012

This is what I was writing it all for: http://nbviewer.ipython.org/3407544

It now works quite well! So astoundingly fast for how readable the code is.

@alexbw
Copy link
Author

alexbw commented Aug 20, 2012

And one last syntactic question:
Is it preferred to use numba.double[:,:] over [['d']]? Either way, why is one preferred?

@jriehl
Copy link
Contributor

jriehl commented Aug 23, 2012

The [['d']] type notation was an early ad hoc design, and will be deprecated. If you don't like typing "numba.double" all the time, the shorthand "numba.d" is also available.

@markflorisson
Copy link
Contributor

This has been fixed for the ast backend for a while now. SInce the bytecode backend is deprecated, I'm closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants