invalid result on master when aggregating flat array with size=0 #846

Closed
shoyer opened this issue Oct 29, 2014 · 9 comments · Fixed by #852

Comments

@shoyer

shoyer commented Oct 29, 2014

import numpy as np
from numba import guvectorize

@guvectorize(['void(float64[:, :], float64[:])'],
              '(x, y)->()')
def my_sum(a, out):
    asum = 0.0
    for ai in a.flat:
        asum += ai
    out[0] = asum

x = np.arange(0.0).reshape(2, 0)
print(repr(x))
print(repr(my_sum(x)))

Output:

array([], shape=(2, 0), dtype=float64)
1.7272338168659652e-77

I'm running the master version of numba (actually 0.15.1-28-g2f5d025) and numpy 1.9.0 on Python 2.7/OSX (via conda).

This is indeed pathological data, but this part of the bottleneck test suite failed when I tested it for numbagg.

@shoyer changed the title from "error on master when aggregating flat array with size=0" to "invalid result on master when aggregating flat array with size=0" on Oct 29, 2014
@pitrou
Contributor

pitrou commented Oct 29, 2014

Is it specific to the use of guvectorize? Or does it also apply to normal jitted functions?

@shoyer
Author

shoyer commented Oct 29, 2014

@pitrou Good question. Here is my jit version, for which I cannot reproduce the issue:

import numpy as np
from numba import jit

@jit
def my_sum(a):
    asum = 0.0
    for ai in a.flat:
        asum += ai
    return asum

x = np.arange(0.0).reshape(2, 0)
print(repr(x))
print(repr(my_sum(x)))

@sklam
Member

sklam commented Oct 30, 2014

This may be a numpy bug. Consider:

import numpy as np
import numpy.core.umath_tests as ut

x = np.arange(0.0).reshape(2, 0)
y = np.arange(0.0).reshape(0, 2)
print(x, x.shape, x.data.nbytes)
print(y, y.shape, y.data.nbytes)
print(ut.matrix_multiply(x, y))

which prints:

[] (2, 0) 0
[] (0, 2) 0
[[  1.72723371e-077   1.72723371e-077]
 [  2.12435947e-314   2.78134365e-309]]
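For reference, the product of a (2, 0) matrix and a (0, 2) matrix should be a 2x2 matrix of zeros, because each entry is a sum over an empty set of terms. A minimal pure-Python sketch of that expectation follows; the `matmul` helper and its explicit shape arguments are assumptions made for illustration and are not part of numpy or numba.

```python
def matmul(a, b, n, k, m):
    """Multiply an n-by-k matrix `a` by a k-by-m matrix `b`.

    Both matrices are lists of rows. The shapes are passed explicitly
    because a list of rows cannot record the column count of a matrix
    that has zero rows.
    """
    return [
        [sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
        for i in range(n)
    ]


a = [[], []]  # shape (2, 0): two rows, no columns
b = []        # shape (0, 2): no rows, two columns
print(matmul(a, b, 2, 0, 2))  # [[0, 0], [0, 0]]
```

Any nonzero entry in the output above therefore has to come from memory that was never initialized or never belonged to the inputs.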

@shoyer
Author

shoyer commented Oct 31, 2014

Interesting. I should note that this does work on numba 0.15.1 (I accidentally omitted that in my original report), so it is likely related to the work on refactoring .flat.

Also might be related to this numpy issue: numpy/numpy#5195

@pitrou
Contributor

pitrou commented Oct 31, 2014

The fact that it works with @jit is a bit fishy. It would be nice to know what the a array looks like inside the @guvectorize version.

@shoyer
Author

shoyer commented Nov 4, 2014

@pitrou I tried inserting some print statements, but of course that causes the function to be executed in python mode, which removes the bug.

@pitrou
Contributor

pitrou commented Nov 4, 2014

If you use the print() function and only print integers and floats, the function should be able to compile in nopython mode (add nopython=True to the jit() call to ensure this).

@shoyer
Author

shoyer commented Nov 4, 2014

OK, perhaps this output may be helpful:

import numpy as np
from numba import guvectorize

@guvectorize(['void(float64[:, :], float64[:])'],
              '(x, y)->()', nopython=True)
def my_sum(a, out):
    asum = 0.0
    print(asum)
    adding = 99999999 # some sentinel value
    for n, ai in enumerate(a.flat):
        print(adding)
        print(ai)
        asum += ai
    print(asum)
    out[0] = asum

for shape in [(0, 0), (0, 2), (2, 0)]:
    print('\nshape {}:'.format(shape))
    x = np.arange(0.0).reshape(shape)
    print(repr(my_sum(x)))

Output:

shape (0, 0):
0.0
0.0
0.0

shape (0, 2):
0.0
99999999
0.0
99999999
3.10503636984e+231
3.10503636984e+231
3.1050363698376418e+231

shape (2, 0):
0.0
99999999
1.32768739285e-315
1.32768739285e-315
1.3276873928472899e-315

It looks like .flat is sometimes reading from unallocated memory.

@pitrou
Contributor

pitrou commented Nov 4, 2014

Thank you. Yes, it's an issue with .flat. It only manifests on the code path for non-contiguous arrays (you can also trigger it with explicit typing in the jit() call, e.g. @jit('void(float64[:, :])', nopython=True)). The problem is that if one dimension is zero-sized (but not all), the iteration logic becomes incorrect. IMO the best fix will be to add a special path if there is a zero-sized dimension, since the iterator is empty then.
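The special path described above can be sketched in pure Python. This is only an illustration of the idea, not numba's implementation: `flat_indices` and `flat_sum` are hypothetical helpers, and strides are counted in elements rather than bytes.

```python
def flat_indices(shape):
    """Yield index tuples in C order, as a strided flat iterator might."""
    # Special path: any zero-sized dimension means the array has no
    # elements, so the iterator is empty and nothing may be read.
    if any(dim == 0 for dim in shape):
        return
    index = [0] * len(shape)
    while True:
        yield tuple(index)
        # Odometer-style increment, starting from the last axis.
        for axis in reversed(range(len(shape))):
            index[axis] += 1
            if index[axis] < shape[axis]:
                break
            index[axis] = 0
        else:
            return  # every axis wrapped around: iteration is complete


def flat_sum(buffer, shape, strides):
    """Sum all elements of a strided view over a flat buffer."""
    total = 0.0
    for index in flat_indices(shape):
        offset = sum(i * s for i, s in zip(index, strides))
        total += buffer[offset]
    return total


print(flat_sum([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], (2, 3), (3, 1)))  # 21.0
print(flat_sum([], (2, 0), (0, 1)))  # 0.0
```

Without the guard at the top of `flat_indices`, the loop would yield an index such as (0, 0) before checking any bound, and `flat_sum` would try to read an element that does not exist. In this sketch that raises an IndexError; in compiled code the same mistake could read arbitrary memory instead.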

pitrou added a commit to pitrou/numba that referenced this issue Nov 4, 2014
@pitrou pitrou mentioned this issue Nov 4, 2014
seibert added a commit that referenced this issue Nov 5, 2014