Incorrect behavior with numpy array equality against a scalar unicode string #7859
Comments
Bug confirmed. The problem is in the second
Debug summary:
Minimal reproducer:
import numpy as np
import numba as nb

@nb.njit
def foo(a):
    return a == 'A'

a = np.array(['A', 'B'], dtype=np.dtype('<U1'))
x = foo(a)          # compiled result
y = foo.py_func(a)  # pure-Python (NumPy) result
np.testing.assert_equal(x, y)
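For reference, the result the reproducer expects can be checked with NumPy alone. The sketch below shows the elementwise semantics that `foo.py_func(a)` produces and compares them against an explicit loop. The helper name `eq_scalar` is hypothetical and not part of Numba or NumPy; whether such a loop compiles under `@nb.njit` as a workaround is an assumption that is not verified here.

```python
import numpy as np

def eq_scalar(a, s):
    """Compare each element of a 1-D string array with the scalar string s."""
    out = np.empty(a.shape[0], dtype=np.bool_)
    for i in range(a.shape[0]):
        out[i] = a[i] == s
    return out

a = np.array(['A', 'B'], dtype=np.dtype('<U1'))

# NumPy broadcasts the scalar string across the array.
expected = a == 'A'

# The explicit loop gives the same elementwise answer.
looped = eq_scalar(a, 'A')
np.testing.assert_equal(looped, expected)
```

The point of the comparison is only that the correct answer has the same shape as the input array, one boolean per element, rather than a single boolean.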
Hi, @sklam @stuartarchibald. I am trying to fix this issue. Right now, I found two string-related types, I think we need to add some conditions to handle I guess complete support for ufuncs with strings would need many code additions and changes. I will first propose one PR that focuses only on fixing this issue, and then try to cover the remaining parts of this general problem. Is there any useful info for me?
@dlee992, yes, I agree adding numba/numba/cpython/numbers.py Lines 1060 to 1089 in e5bbae7
See how it triggers the compiler within itself. The unicode equal ufunc can likely reuse the unicode equality like this:
from numba.core.imputils import impl_ret_untracked

def unicode_eq_impl(context, builder, sig, args):
    # Compile the plain Python equality with the same signature and reuse it.
    def eq(a, b):
        return a == b
    res = context.compile_internal(builder, eq, sig, args)
    return impl_ret_untracked(context, builder, sig.return_type, res)
@sklam, interesting thing:
Looks like numpy is not using the ufunc for unicode/string array equality.
In [3]: a=np.asarray("ABC")
In [4]: a
Out[4]: array('ABC', dtype='<U3')
In [5]: np.equal(a, a)
---------------------------------------------------------------------------
UFuncTypeError                            Traceback (most recent call last)
<ipython-input-5-e945fc627bdf> in <module>
----> 1 np.equal(a, a)
UFuncTypeError: ufunc 'equal' did not contain a loop with signature matching types (dtype('<U3'), dtype('<U3')) -> dtype('bool')
It's not obvious from a quick search where the numpy code for
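A small sketch of the distinction described above: the `==` operator on string arrays works even where `np.equal` has no matching loop. Whether `np.equal` raises depends on the installed NumPy version (the traceback above comes from the version used in this thread, and newer releases may accept string dtypes), so the call is wrapped in `try`/`except`. `np.char.equal` is shown as the string-specific comparison function.

```python
import numpy as np

a = np.asarray("ABC")           # 0-d array, dtype '<U3'
b = np.array(['A', 'B'])        # 1-d array, dtype '<U1'

# The ufunc may have no loop for string dtypes, depending on NumPy version.
try:
    ufunc_result = np.equal(a, a)
except TypeError as exc:        # UFuncTypeError is a subclass of TypeError
    ufunc_result = None
    print("np.equal failed:", type(exc).__name__)

# The == operator is handled separately and works on string arrays.
op_result = a == a
op_scalar = b == 'A'

# np.char.equal performs elementwise string comparison.
char_scalar = np.char.equal(b, 'A')

print(op_result, op_scalar, char_scalar)
```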
It's probably this: Note the remarks about string arrays and ufuncs not being defined, there's a special case: ?
Good find @stuartarchibald. So Numba should follow that and special case the handling for
#7884 might help.
I discussed unifying string ufuncs into the universal way with Seberg in PR numpy/numpy#21041. It is still under discussion and needs more NumPy PRs for
Hello,
I get different behavior when executing the following code with or without numba:
Without numba: (prints two boolean vectors, OK)
[ True False]
[ True False]
With numba: (the second print returns just a single boolean, NOK)
[ True False]
False
Thanks for your help!
Anthony