BUG: (closes #4352) any and all applied to object arrays should return bool #11857

Cheukting · 2018-09-01T11:44:43Z

BUG: (closes #4352) numpy any and all applied to object arrays should return bool

Fix the problem of np.any and np.all not always returning bool.

(from EuroSciPy 2018 sprints)

seberg

Thanks, unfortunately this misses arr.any(). There are also C-level equivalent functions IIRC, so those should be fixed up as well :/. Although it seems that numpy.core._methods includes _any, so it is probably easy to give it a similar treatment (in Python code).

Overall, while it seems easy, I am not sure this is actually easy to get right.

numpy/core/fromnumeric.py

numpy/core/tests/test_umath.py

Cheukting · 2018-09-03T03:50:16Z

@seberg thanks for the review. Right now any() and all() do not call _any in _methods but call the ufuncs directly. Not sure if that will change in a future version. I will have a look at arr.any() as well, thanks for the reminder.

seberg · 2018-09-03T06:58:10Z

To be a bit more clear, what about:

```
np.any(np.array([1.5, 2.4], dtype=object))
np.any(np.array([object(), 2.4], dtype=object))
```

and horrifying:

```
np.any(np.array([np.array([3]), 2.4], dtype=object))
```

which gives back an array that is actually an item (because that array can be converted by `bool()`, grrrr...).

seberg · 2018-09-03T07:19:29Z

Ah, random realization, `dtype=np.bool_` probably works, although it will probably cast all inputs to bool from the start (and not just force output dtype selection I think). On the other hand, that casting probably already occurs for all but object dtype, since the ufunc is typed `dd->?`, and reductions need all the same, so they force `??->?` probably, so giving dtype will not change anything for them currently.

Cheukting · 2018-09-06T13:46:45Z

So what would be expected for np.any(np.array([object(), 2.4], dtype=object)) and np.any(np.array([np.array([3]), 2.4], dtype=object))? In these cases, should object() be converted to None and np.array([3]) to 3? These two cases are a bit extreme, so I want to be sure.

Cheukting · 2018-09-06T14:19:34Z

unfortunately dtype is not supported in any and all ufunc so dtype=np.bool_ will not work

eric-wieser · 2018-09-06T15:37:49Z

> unfortunately dtype is not supported in any and all ufunc so dtype=np.bool_ will not work

[citation needed]. Just change

result = _wrapreduction(a, np.logical_or, 'any', axis, None, out,
                            keepdims=keepdims)

to

result = _wrapreduction(a, np.logical_or, 'any', axis, np.bool_, out,
                            keepdims=keepdims)

eric-wieser · 2018-09-06T15:40:35Z

@seberg

> dtype=np.bool_ probably works, although it
> will probably cast all inputs to bool from the start (and not just
> force output dtype selection I think)

As far as I know, it's impossible in python for bool(a or b) to produce a different value to bool(a) or bool(b). The only difference is that bool will be evaluated non-lazily.
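The claim above can be spot-checked in plain Python. This is only a sketch over a handful of sample values (the `samples` list is an arbitrary choice, not taken from the thread), so it illustrates the equivalence rather than proving it:

```python
import itertools

# Hypothetical sample values covering truthy and falsy objects of several types.
samples = [0, 1, 1.5, "", "x", None, [], [0], object()]


def or_then_bool(a, b):
    # Python's `or` returns one of its operands; bool() is applied afterwards.
    return bool(a or b)


def bool_then_or(a, b):
    # Convert each operand first, then combine the two booleans.
    return bool(a) or bool(b)


# Collect every pair for which the two orderings disagree.
mismatches = [
    (a, b)
    for a, b in itertools.product(samples, repeat=2)
    if or_then_bool(a, b) != bool_then_or(a, b)
]
print(mismatches)  # []
```

The two forms differ only in when `bool()` runs, which is why casting the inputs up front should not change the truth value of the result.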

I think the way to go here would be:

- Pass dtype=bool under the hood to any and all
- Add a dtype argument to any / all in case the user really wants to override it back to np.object_
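A rough sketch of what those two steps could look like from the user's side. The function name `any_sketch` is hypothetical and this is not the actual np.any implementation; it assumes that `np.logical_or.reduce` accepts `dtype=bool` for object input, as suggested earlier in the thread:

```python
import numpy as np


def any_sketch(a, axis=None, dtype=bool, out=None, keepdims=False):
    """Hypothetical stand-in for np.any with a boolean default dtype."""
    # dtype=bool asks the reduction to work in boolean, so the result is a
    # boolean even for object arrays; dtype=object keeps the `or`-like result.
    return np.logical_or.reduce(
        np.asarray(a), axis=axis, dtype=dtype, out=out, keepdims=keepdims
    )


arr = np.array([1.5, 2.4], dtype=object)
as_bool = any_sketch(arr)                  # boolean result
as_object = any_sketch(arr, dtype=object)  # one of the elements, like `1.5 or 2.4`
```

Keeping `dtype` as an explicit keyword is what gives users the opt-out back to the object-returning behavior.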

Cheukting · 2018-09-06T15:44:21Z

#11857 (comment)

@eric-wieser I have tried that, and it causes an error in _wrapreduction, because the reduction it calls in this if statement:

    if type(obj) is not mu.ndarray:
        try:
            reduction = getattr(obj, method)
        except AttributeError:
            pass
        else:
            # This branch is needed for reductions like any which don't
            # support a dtype.
            if dtype is not None:
                return reduction(axis=axis, dtype=dtype, out=out, **passkwargs)
            else:
                return reduction(axis=axis, out=out, **passkwargs)

should be called without the dtype

eric-wieser · 2018-09-06T15:46:47Z

I don't understand what you mean. What error do you get if you make the change I proposed above?

Cheukting · 2018-09-06T16:05:42Z

@eric-wieser the error would be as follows:

    def _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs):
        passkwargs = {}
        for k, v in kwargs.items():
            if v is not np._NoValue:
                passkwargs[k] = v

        if type(obj) is not mu.ndarray:
            try:
                reduction = getattr(obj, method)
            except AttributeError:
                pass
            else:
                # This branch is needed for reductions like any which don't
                # support a dtype.
                if dtype is not None:
>                   return reduction(axis=axis, dtype=dtype, out=out, **passkwargs)
E                   TypeError: all() got an unexpected keyword argument 'dtype'

eric-wieser · 2018-09-06T16:32:39Z

What is type(obj) when that happens?

Cheukting · 2018-09-06T16:34:45Z

obj = matrix([[ True,  True,  True],
        [ True,  True,  True],
        [ True,  True,  True]])

eric-wieser · 2018-09-06T16:36:45Z

Then matrix.all needs changing too.
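The failure mode can be reproduced without NumPy. In this sketch, `NoDtypeReduction` and `wrapreduction_model` are hypothetical stand-ins (for a subclass such as matrix and for the quoted branch of _wrapreduction, respectively), simplified to the parts that matter:

```python
class NoDtypeReduction:
    """Hypothetical container whose all() has no dtype keyword."""

    def __init__(self, values):
        self.values = list(values)

    def all(self, axis=None, out=None):
        return all(self.values)


def wrapreduction_model(obj, method, axis, dtype, out):
    # Simplified model of the quoted branch: forward dtype only when it is set.
    reduction = getattr(obj, method)
    if dtype is not None:
        return reduction(axis=axis, dtype=dtype, out=out)
    return reduction(axis=axis, out=out)


obj = NoDtypeReduction([True, True, True])

# Works: dtype is None, so the method is called without it.
without_dtype = wrapreduction_model(obj, "all", None, None, None)

# Fails: a non-None dtype is forwarded to a method that does not accept it.
try:
    wrapreduction_model(obj, "all", None, bool, None)
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None
```

Any object that defines its own `any`/`all` without a `dtype` parameter hits the same TypeError once the wrapper starts passing a non-None default.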

Cheukting · 2018-09-06T17:10:52Z

I think it also happens with MaskedArray.all ... actually, applying np.bool_ causes around 70 test failures. Not sure changing all the methods is the best way to apply this fix... please advise.

eric-wieser · 2018-09-06T17:14:52Z

Perhaps this PR should come in two parts then - adding support for a dtype argument to all the functions, then a follow-up that attempts to change the default

seberg · 2018-09-06T18:47:49Z

Doing the first step would be very nice in any case, maybe (though I did not think about it too much). We also have to worry a bit about changing someone's code. I am not really worried about it, but I don't see a nice way to warn here.
That shouldn't stop us, but it might delay actually changing it for a bit, I think.

EDIT: Sorry that numpy has "traps" like this. Things sometimes look easy at first sight and then get sticky very fast once you have to be careful about breaking things and get into the details.

Cheukting · 2018-09-06T21:17:55Z

Ok, I will give part 1 a try then. I will leave it here after I finish part 1 then. Thanks for being patient with me and giving advice.

Cheukting · 2018-09-19T20:36:38Z

The problem was pinpointed, in core/code_generators/generate_umath.py:

'logical_and':
    Ufunc(2, 1, One,
          docstrings.get('numpy.core.umath.logical_and'),
          'PyUFunc_SimpleBinaryComparisonTypeResolver',
          TD(nodatetime_or_obj, out='?', simd=[('avx2', ints)]),
          TD(O, f='npy_ObjectLogicalAnd'),
          ),
'logical_not':
    Ufunc(1, 1, None,
          docstrings.get('numpy.core.umath.logical_not'),
          None,
          TD(nodatetime_or_obj, out='?', simd=[('avx2', ints)]),
          TD(O, f='npy_ObjectLogicalNot'),
          ),
'logical_or':
    Ufunc(2, 1, Zero,
          docstrings.get('numpy.core.umath.logical_or'),
          'PyUFunc_SimpleBinaryComparisonTypeResolver',
          TD(nodatetime_or_obj, out='?', simd=[('avx2', ints)]),
          TD(O, f='npy_ObjectLogicalOr'),
          ),

For dtype = 'O' it will fall into the last TD (e.g. TD(O, f='npy_ObjectLogicalOr')), which will require npy_ObjectLogicalOr (and only adding out='?' does NOT work)

which is implemented in core/src/umath/funcs.inc.src:

/* Emulates Python's 'a or b' behavior */
static PyObject *
npy_ObjectLogicalOr(PyObject *i1, PyObject *i2)
{
    if (i1 == NULL) {
        Py_XINCREF(i2);
        return i2;
    }
    else if (i2 == NULL) {
        Py_INCREF(i1);
        return i1;
    }
    else {
        int retcode = PyObject_IsTrue(i1);
        if (retcode == -1) {
            return NULL;
        }
        else if (retcode) {
            Py_INCREF(i1);
            return i1;
        }
        else {
            Py_INCREF(i2);
            return i2;
        }
    }
}

which mimics Python's `and`/`or` semantics; I am not sure whether changing it to always return bool is a good thing... please advise
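To make the concern concrete, here is a pure-Python model of the C function above. It is a sketch only: the NULL handling is left out, and `object_logical_or` is a hypothetical name, not a NumPy API:

```python
from functools import reduce


def object_logical_or(i1, i2):
    # Same rule as the C function for non-NULL inputs:
    # keep i1 if it is truthy, otherwise fall through to i2.
    return i1 if i1 else i2


values = [0, "", 1.5, 2.4]

# Reducing with the `or`-like rule yields an element, not a bool.
reduced = reduce(object_logical_or, values)  # 1.5

# The builtin any() always yields a bool.
as_bool = any(values)  # True
```

A reduction built on this rule therefore hands back whichever element decided the outcome, which is the behavior the PR title objects to.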

hameerabbasi · 2019-03-29T08:35:25Z

I think #11857 (comment) is a simple fix that will work. If needed, I can make a PR.

gimseng · 2020-10-08T13:37:39Z

Any progress on this? What's the tldr of the discussion so far? Thanks !

rgommers · 2021-05-09T12:13:19Z

@Cheukting I resolved the merge conflicts and rebased to get a sense of what's going on with this. Locally all tests pass for me.

Argh, I was almost done writing a really long comment, and now I lost it due to a UI glitch. Let me summarize only:

- Current tests look good and pass.
- Your try-except is correct but needs to be made a bit more specific, or changed to hasattr(a, 'astype').
- The reason that @eric-wieser's suggestion of using np.bool_ cannot replace the try-except is that it would not be backwards-compatible (_wrapreduction will call obj.any/obj.all for a third-party obj, which is not guaranteed to accept a dtype keyword, so fixing up np.matrix is not enough).
- The bigger issue is that ndarray.any/all still have the old behaviour.
- You want to write some new tests that verify the equivalence of functions and methods, ideally with @pytest.mark.parametrize looping over all dtypes. I'd expect only object dtype to fail.
- Then we should get back to the question you ended with in your last comment.
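The PR's actual try-except is not shown in this thread, so the following is only a guess at the shape of the hasattr(a, 'astype') variant mentioned in the list above. `coerce_to_bool` and `FakeArray` are hypothetical names used for illustration:

```python
def coerce_to_bool(result):
    # Array-like results (anything exposing astype) are converted elementwise;
    # plain Python objects go through bool().
    if hasattr(result, "astype"):
        return result.astype(bool)
    return bool(result)


class FakeArray:
    """Hypothetical minimal array-like, standing in for an ndarray result."""

    def __init__(self, items):
        self.items = list(items)

    def astype(self, dtype):
        return FakeArray(dtype(x) for x in self.items)


scalar_result = coerce_to_bool(1.5)                    # True
array_result = coerce_to_bool(FakeArray([0, 2.4]))     # FakeArray of bools
```

Checking for the attribute avoids catching unrelated exceptions, which is presumably why it was suggested over a broad try-except.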

rgommers · 2021-05-09T12:32:12Z

> which is mimicking the python and, not sure if changing it to always return bool is a good thing

I think any and all should always return a bool. I don't see why they should follow `and`, which has different semantics:

>>> 4 and np
<module 'numpy' from '/home/rgommers/code/numpy/numpy/__init__.py'>
>>> all([3, 4, np, [1]])
True
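The same contrast holds when one element is falsy. A small sketch in plain Python, with arbitrary sample values:

```python
from functools import reduce

truthy_values = [3, 4, "spam", [1]]
mixed_values = [3, 0, "spam"]

# Chaining `and` yields the last operand if all are truthy,
# otherwise the first falsy operand.
chained_truthy = reduce(lambda a, b: a and b, truthy_values)  # [1]
chained_mixed = reduce(lambda a, b: a and b, mixed_values)    # 0

# The builtin all() only ever yields True or False.
builtin_truthy = all(truthy_values)  # True
builtin_mixed = all(mixed_values)    # False
```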

Cheukting · 2021-05-10T19:54:53Z

Hi @rgommers, sorry, I was having some issues with my computer yesterday, so I have not made much progress. I will try again this coming weekend and let you know how it goes. Thanks for helping me out.

rgommers · 2021-05-11T09:20:35Z

That's too bad, the sprint was fun! Anyway, I'll keep an eye on this, would be nice to get it merged:)

charris · 2022-06-09T15:44:18Z

@Cheukting This needs a rebase.

Cheukting · 2022-06-09T16:02:31Z

Hi @charris

Thanks for reminding me about this. I am afraid this is too old and would not be relevant anymore. I am happy to close it.

Looking forward to working on something else in the future :-)

seberg · 2022-06-09T16:04:13Z

Still very relevant, but we should maybe decide whether we want to change this for any/all, or directly in np.logical_or.reduce().
And it might be a "larger" change (in the sense that it is hard to add a warning, but it might break someone who relies on the current behavior). The change itself is fairly straightforward...

Cheukting · 2022-06-09T16:30:47Z

Ok, I will reopen and rebase it to keep this discussion. However, I would need more information and guidance to move forward. I would really love to get this done and move on 😅

seberg · 2022-06-09T16:36:07Z

We need to make a decision; let's see if we can manage in the next triage meeting. My suspicion is that:

- The change here (as proposed), modifying only any and all, may just fly.
- The larger change would likely be nice, but maybe should be part of a 2.0 release (which might be the next release, but it's hard to plan).

seberg · 2022-12-14T17:34:44Z

Mailing list ping to see if anyone voices concerns about trying this: https://mail.python.org/archives/list/numpy-discussion@python.org/thread/6JE4FGYCRTCURGPSDZV4X7HKY77PUZMU/

I need to take a closer look, because right now I am confused about the None special path, and not sure we should bother about returning a Python bool for scalars (unless we currently do, which may be!).

melissawm · 2023-07-15T15:19:26Z

@seberg would you mind taking a look again or guiding @Cheukting through on what needs to be done here? If the best idea is to start from scratch or for someone else to take over, that is ok too. Cheers!

seberg · 2023-07-17T13:17:41Z

OK, let's close it... We could aim for logical_or directly nowadays, but I am not sure that is even better.

In either case, this doesn't look terrible, nor should it be complicated. The dtype=np.bool_ solution was always the right approach, and NumPy arrays support it already; the main problem is that _wrapreduction has no way to make it the default when no dtype is passed.

We can add dtype, but maybe it doesn't even matter: We should just set it implicitly in the array method (change default) and inline _wrapreduction to pass it explicitly.

(I am not convinced that any/all need dtype=; if you really want that, using np.logical_or.reduce() seems OK, although we are missing a reference to it.)

Cheukting changed the title ~~BUG: (#4352) numpy any and all applied to object arrays should return…~~ BUG: (closes #4352) numpy any and all applied to object arrays should return… Sep 1, 2018

Cheukting changed the title ~~BUG: (closes #4352) numpy any and all applied to object arrays should return…~~ BUG: (closes #4352) any and all applied to object arrays should return bool Sep 1, 2018

seberg reviewed Sep 2, 2018

mattip mentioned this pull request Feb 13, 2019

BUG: Ensured any and all return boolean values, gh-4352 #5267

Closed

eric-wieser mentioned this pull request Feb 24, 2019

BUG: np.any() and np.all() should return boolean for object arrays #13022

Closed

seberg mentioned this pull request Aug 5, 2019

numpy any and all applied to object arrays should return booleans. #4352

Closed

eric-wieser added the 00 - Bug label Aug 25, 2019

gimseng mentioned this pull request Oct 8, 2020

BUG: any() and all() behavior on string series is different from python pandas-dev/pandas#36880

Closed

Base automatically changed from master to main March 4, 2021 02:04

rgommers added the component: numpy._core label May 9, 2021

BUG: any and all applied to object arrays should return booleans

3b3bf9f

Closes numpygh-4352

rgommers force-pushed the euroscipy18 branch from 65e4efd to 3b3bf9f Compare May 9, 2021 11:42

Cheukting closed this Jun 9, 2022

Cheukting reopened this Jun 9, 2022

seberg added the triage review Issue/PR to be discussed at the next triage meeting label Jun 9, 2022

Cheukting added 2 commits June 9, 2022 18:44

BUG: any and all applied to object arrays should return booleans

967a665

Closes numpygh-4352

Merge branch 'euroscipy18' of github.com:Cheukting/numpy into eurosci…

3535e8c

…py18

Cheukting changed the base branch from main to maintenance/1.0.3.x June 9, 2022 16:51

Cheukting changed the base branch from maintenance/1.0.3.x to main June 9, 2022 16:51

seberg closed this Jul 17, 2023
