refactor np.where to use overload #8258

guilhermeleobas · 2022-07-19T17:43:36Z

Refactor np.where to use @overload
Broadcast arrays of different shapes. This removes a restriction in the old code where only arrays with the same shape were supported.
Add NumPy tests
Add type check for supported types + tests
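
For reference, the broadcasting behaviour this patch aims to match can be seen in plain NumPy. The snippet below is only an illustrative sketch: the array values and shapes are made up for the example and nothing here is taken from the patch's test suite.

```python
import numpy as np

# Three inputs with different, but broadcast-compatible, shapes.
cond = np.array([[True], [False], [True]])  # shape (3, 1)
x = np.arange(4)                            # shape (4,)
y = np.full((3, 4), -1)                     # shape (3, 4)

# NumPy broadcasts all three operands to a common shape, here (3, 4).
out = np.where(cond, x, y)
print(out.shape)
print(out)
```

Rows where cond is True take their values from x, and the remaining row comes from y.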

guilhermeleobas · 2022-07-21T03:06:34Z

Thanks for the review, @apmasell

stuartarchibald · 2022-10-18T09:46:45Z

@guilhermeleobas thanks for the patch. In the interests of making this easier to review, perhaps split the API refactoring and the addition of new functionality into separate patches?


guilhermeleobas · 2022-10-19T02:58:51Z

@stuartarchibald, this is a new impl. of np.where that works with broadcast arrays. You can ignore the diff as none of the old code was actually used.

stuartarchibald · 2022-11-08T14:07:35Z

@guilhermeleobas please could you resolve the conflicts against main? Many thanks.


kc611

The code looks functionally good and well-written. There are mostly issues with handling of None and a few nitpicks but otherwise it LGTM.

kc611 · 2022-11-17T12:36:57Z

numba/np/arraymath.py

+            if x.layout == y.layout == condition.layout:
+                layout = x.layout
+            else:
+                layout = 'C'


Should this be the 'A' layout?

Having 'C' here would mean the implementation tends to go through _where_fast_inner_impl even though the arrays might have different layouts (which may be neither C nor F)?

(The resulting array should be C ordered, but that is handled automatically since the default layout is C ordered.)

I just kept the original behavior, which assigns the layout to C:

numba/numba/core/typing/npydecl.py

Lines 710 to 715 in 3bee7be

    if (cond.ndim == x.ndim == y.ndim):
        if x.layout == y.layout == cond.layout:
            retty = types.Array(retdty, x.ndim, x.layout)
        else:
            retty = types.Array(retdty, x.ndim, 'C')
        return signature(retty, *args)
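
The layout rule shared by the old typing code and the new overload can be written as a small stand-alone helper. This is a sketch only: `result_layout` is a hypothetical name, and it works on plain layout strings ('C', 'F', 'A') rather than on Numba array types.

```python
def result_layout(cond_layout, x_layout, y_layout):
    """Return the shared layout if all three inputs agree, else fall back to 'C'."""
    if x_layout == y_layout == cond_layout:
        return x_layout
    return 'C'


print(result_layout('F', 'F', 'F'))  # all inputs agree, layout is kept
print(result_layout('C', 'F', 'A'))  # mixed layouts fall back to 'C'
```

The review question above is whether the fallback branch should return 'A' instead of 'C'; the sketch keeps the original behaviour.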

kc611 · 2022-11-17T12:56:54Z

numba/np/arraymath.py

+            raise NumbaTypeError(msg.format(name))
+
+    if is_nonelike(x) and is_nonelike(y):
+        return _where_cond_none_none


The support for None objects seems to deviate from what's expected:

    import numpy as np
    import numba

    def foo_np(cond, x, y):
        return np.where(cond, x, y)

    @numba.njit
    def foo_nb(cond, x, y):
        return np.where(cond, x, y)

    cond = np.array([None, 1])
    x = np.array([0, 1])
    y = np.array([3, 4])
    print(foo_np(cond, x, y))
    print(foo_nb(cond, x, y))  # Error

    cond = None
    x = np.array([0, 1])
    y = np.array([3, 4])
    print(foo_np(cond, x, y))
    print(foo_nb(cond, x, y))  # Error

    cond = np.array([0, 1])
    x = np.array([0, 1])
    y = None
    print(foo_np(cond, x, y))
    print(foo_nb(cond, x, y))  # Error

    cond = np.array([0, 1])
    x = None
    y = None
    print(foo_np(cond, x, y))
    print(foo_nb(cond, x, y))  # Wrong Results

If this is not supported, then it should be caught as a proper error. Otherwise, the behaviour of None in the condition seems to be the same as 0, and within arrays it seems to be treated as a normal element.

Supporting None inputs will be tricky. I'll try to address this in the following days.
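
To make the NumPy side of the comparison concrete, here is a NumPy-only sketch of the cases listed above. It shows the interpreter behaviour that a Numba implementation would need to reproduce; the variable names `r1` to `r3` are just for illustration.

```python
import numpy as np

x = np.array([0, 1])
y = np.array([3, 4])

# None inside an object-dtype condition array is treated as falsy, like 0.
r1 = np.where(np.array([None, 1]), x, y)

# A bare None condition is a falsy 0-d value, so every element comes from y.
r2 = np.where(None, x, y)

# None as one of the branches is an ordinary element; the result is an object array.
r3 = np.where(np.array([0, 1]), x, None)

print(r1, r2, r3, r3.dtype)
```

The object-dtype result in the last case is what makes these inputs awkward to type in nopython mode.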

kc611 · 2022-11-17T14:31:18Z

numba/tests/test_array_methods.py

+    def test_np_where_numpy_ndim(self):
+        # https://github.com/numpy/numpy/blob/fe2bb380fd9a084b622ff3f00cb6f245e8c1a10e/numpy/core/tests/test_multiarray.py#L8737-L8749
+        pyfunc = np_where_3
+        cfunc = jit(nopython=True)(pyfunc)


Might be better to use the njit and func_name.py_func API over here?

I'm following the convention used in the test file.

kc611 · 2022-11-17T14:31:59Z

numba/tests/test_array_methods.py

+        tmpmask = c != 0
+        c[c == 0] = 41247212
+        c[tmpmask] = 0
+        np.testing.assert_equal(cfunc(c, b, a), r)


We could add some tests involving None?

stuartarchibald

Thanks for the patch, this largely looks good, and it's great to see the NumPy tests passing too. I've left a few comments inline; once they are resolved this should be good to merge.

numba/tests/test_array_methods.py

stuartarchibald · 2022-11-07T11:25:49Z

numba/np/arraymath.py

+    if is_nonelike(x) and is_nonelike(y):
+        return _where_cond_none_none


I think the use of kwargs with default None is going to potentially cause issues. Consider:

    In [5]: np.where([3], None, None)
    Out[5]: array([None], dtype=object)

vs.

    In [13]: np.array([3]).nonzero()
    Out[13]: (array([0]),)

I've changed the code to not use None as default value.
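
One common way to tell "argument omitted" apart from "argument explicitly None" is a private sentinel default. The sketch below illustrates that idea in plain Python on top of NumPy; `where_sketch` and `_MISSING` are hypothetical names, and this is not a description of how the patch itself was changed.

```python
import numpy as np

_MISSING = object()  # hypothetical sentinel, not a Numba or NumPy name


def where_sketch(cond, x=_MISSING, y=_MISSING):
    """Dispatch on whether x and y were passed at all, not on their values."""
    if x is _MISSING and y is _MISSING:
        # One-argument form: same as nonzero() on the condition.
        return np.asarray(cond).nonzero()
    if x is _MISSING or y is _MISSING:
        raise ValueError("either both or neither of x and y should be given")
    # Three-argument form: an explicit None is passed through as a value.
    return np.where(cond, x, y)


print(where_sketch([3]))              # tuple of index arrays
print(where_sketch([3], None, None))  # object array holding None
```

With a None default, the two calls above would be indistinguishable, which is the problem the review comment points at.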

numba/np/arraymath.py

numba/tests/test_array_methods.py

stuartarchibald · 2022-11-30T13:40:46Z

numba/np/arraymath.py

+    #
+    # >>> np.where([0, 1], None, None)
+    # array([None, None])
+    if x is None and y is None:


I think this should be:

Suggested change

-    if x is None and y is None:
+    if is_nonelike(x) and is_nonelike(y):

as this is in the typing domain. However, I think there are further issues, e.g.:

    from numba import njit
    import numpy as np

    @njit
    def foo(a, x, y):
        return np.where(a, x, y)

    args = (1, None, None)
    expected = foo.py_func(*args)
    got = foo(*args)
    print(expected, type(expected))
    print(got, type(got))

produces:

    None <class 'numpy.ndarray'>
    (array([0]),) <class 'tuple'>

Setting args = (np.ones(4), None, None) also does something similarly strange.

stuartarchibald · 2022-11-30T13:44:34Z

numba/np/arraymath.py

    for idx, c in np.ndenumerate(cond):
-        res[idx] = x if c else y[idx]
+        res[idx] = x[idx] if c else y[idx]


This fails to unify for res in the case of 'unusual' inputs like:
cond, x, y = np.arange(-2, 2, 1), np.zeros((4, 4)), np.ones((4, 4), dtype='<U5')

Is there anything we can do in this case?

Whilst it may be possible to work out the output type from the inputs and explicitly ban unsupported combinations, I think it's OK to leave it as is: the error message is reasonably informative, and working out what's "unsupported" is probably complicated. Do you feel differently?

numba/np/arraymath.py

stuartarchibald · 2022-11-30T14:04:32Z

numba/np/arraymath.py

+        cond_ = np.broadcast_to(cond1, shape)
+        x_ = np.broadcast_to(x1, shape)
+        y_ = np.broadcast_to(y1, shape)


I guess there are cases where it's faster to compute then broadcast, as opposed to broadcast then compute, e.g. where cond has a smaller dimension than x and y. Perhaps leave this optimisation for now and concentrate on correctness, with a view to getting this merged!
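
The "broadcast then compute" approach under discussion can be sketched in plain NumPy as follows. This is an illustration assembled from the diff fragments quoted in this review, not the merged implementation: `where_broadcast` is a hypothetical name, the dtype selection via `np.result_type` is an assumption, and `np.broadcast_shapes` needs NumPy 1.20 or later (the patch exposes an internal equivalent for older versions).

```python
import numpy as np


def where_broadcast(cond, x, y):
    """Reference sketch: broadcast all inputs first, then select element-wise."""
    cond1, x1, y1 = np.asarray(cond), np.asarray(x), np.asarray(y)

    # Common shape of the three operands (raises ValueError if incompatible).
    shape = np.broadcast_shapes(cond1.shape, x1.shape, y1.shape)
    cond_ = np.broadcast_to(cond1, shape)
    x_ = np.broadcast_to(x1, shape)
    y_ = np.broadcast_to(y1, shape)

    # Assumption: the result dtype is the promotion of the two branch dtypes.
    res = np.empty(shape, dtype=np.result_type(x1.dtype, y1.dtype))
    for idx, c in np.ndenumerate(cond_):
        res[idx] = x_[idx] if c else y_[idx]
    return res


cond = np.array([[True], [False], [True]])  # shape (3, 1)
x = np.arange(4)                            # shape (4,)
y = np.array([[10.5]])                      # shape (1, 1)
print(where_broadcast(cond, x, y))
```

Broadcasting first keeps the selection loop simple because every operand can be indexed with the same idx, at the cost of iterating over the full output shape even when cond is much smaller.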

guilhermeleobas · 2022-12-05T23:10:34Z

@kc611 @stuartarchibald, would you folks be ok with not supporting None inputs in this patch? And instead raise an error when x or y is None.

The current approach fails with None values, as the expression np.asarray(None) is not supported. Once support for it gets included, np.where would work without any major changes.

stuartarchibald · 2022-12-06T12:58:02Z

@guilhermeleobas I think that would be fine, the existing implementation in Numba doesn't support it either so it's not a regression. Thanks!

stuartarchibald

Thanks for the updates @guilhermeleobas, I think they address everything in the review. I'm inclined to leave the issue with unifying 'unusual' array types for now unless you feel strongly otherwise. Thanks again!

guilhermeleobas marked this pull request as ready for review July 19, 2022 19:07

guilhermeleobas requested review from sklam and stuartarchibald as code owners July 19, 2022 19:07

guilhermeleobas requested a review from apmasell July 20, 2022 01:56

guilhermeleobas added the 3 - Ready for Review label Jul 20, 2022

apmasell previously approved these changes Jul 20, 2022

stuartarchibald assigned stuartarchibald and apmasell Jul 26, 2022

stuartarchibald added this to the Numba 0.57 RC milestone Jul 26, 2022

guilhermeleobas mentioned this pull request Aug 23, 2022

meta-issue: migrate @glue_* functions to @overload #8254

Closed

kc611 mentioned this pull request Sep 2, 2022

Refactor numba.np.arraymath methods from lower_builtins to overloads #8415

Merged

8 tasks

stuartarchibald added the 4 - Waiting on author and Effort - medium labels and removed the 3 - Ready for Review label Oct 18, 2022

refactor np.where to use overload

53c587e

guilhermeleobas dismissed apmasell’s stale review via 924893e October 19, 2022 02:40

guilhermeleobas force-pushed the guilhermeleobas/overload_np_where branch from fec7c97 to 924893e Compare October 19, 2022 02:40

expose an internal version of np.broadcast_shapes that works even if …

66eaa41

…numpy version < 1.20

guilhermeleobas force-pushed the guilhermeleobas/overload_np_where branch from 924893e to 66eaa41 Compare October 19, 2022 02:45

remove dead code "array_where"

b52e3ff

Merge remote-tracking branch 'upstream/main' into guilhermeleobas/ove…

99e74c1

…rload_np_where

stuartarchibald added the highpriority label Nov 8, 2022

import __broadcast_shapes on arraymath.py

553219e

guilhermeleobas added the 4 - Waiting on reviewer label and removed the 4 - Waiting on author label Nov 11, 2022

kc611 requested changes Nov 17, 2022

Address some of the reviewer comments

b767753

stuartarchibald reviewed Nov 30, 2022

stuartarchibald added the 4 - Waiting on author label and removed the 4 - Waiting on reviewer label Nov 30, 2022

guilhermeleobas added 2 commits December 2, 2022 11:03

address a few more comments

ea7a408

raise error when None inputs are used in np.where

d7e70b7

stuartarchibald approved these changes Dec 6, 2022

stuartarchibald added the 5 - Ready to merge label and removed the 4 - Waiting on author label Dec 6, 2022

sklam merged commit 0441bb1 into numba:main Dec 6, 2022

gmarkall mentioned this pull request May 4, 2023

Use of isinstance() error in numpy.where #8936

Closed

2 tasks
