BUG (Possible): masked array divide by zero array seems to screen out nan and inf #18744

Open
jpkrooney opened this issue Apr 8, 2021 · 1 comment
Labels
component: numpy.ma masked arrays

Comments

jpkrooney commented Apr 8, 2021

Under certain circumstances, dividing a masked array by a regular array containing zeros seems to unexpectedly screen out nan and inf results.

Reproducing code example:

import numpy as np
from numpy import ma

# Make masked and regular array
x = np.array([ 0.,  1., 0.,  1.])
xm = ma.masked_equal(x, -1)
y = np.array([ 0.,  0., 0.,  0.])

If we divide x by y we get:

x/y
Out[239]: array([nan, inf, nan, inf])

If we divide xm by y we get:

xm/y
Out[240]: 
masked_array(data=[--, --, --, --],
             mask=[ True,  True,  True,  True],
       fill_value=-1.0,
            dtype=float64)

...it has masked the nan and inf values even though they are not -1
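For context, this looks consistent with how the `numpy.ma` documentation describes operations that have a validity domain, such as division: the result is masked wherever an input is masked or falls outside that domain, independent of whatever rule created the original mask. A minimal sketch, assuming a divisor `y_mixed` (a hypothetical name, not part of the report above) that contains only some zeros:

```python
import numpy as np
from numpy import ma

x = np.array([0., 1., 0., 1.])
xm = ma.masked_equal(x, -1)      # nothing equals -1, so nothing is masked yet

# Hypothetical divisor with zeros only at positions 0 and 2
y_mixed = np.array([0., 2., 0., 4.])

q_mixed = xm / y_mixed

# Only the positions with a zero divisor end up masked
print(ma.getmaskarray(q_mixed).tolist())   # [True, False, True, False]

# Valid positions hold the quotient; masked positions keep the numerator's value
print(ma.getdata(q_mixed).tolist())        # [0.0, 0.5, 0.0, 0.25]
```

So the `-1` passed to `masked_equal` plays no part here; the mask comes from the zero divisor.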

If we divide xm by y and get just the data we get:

(xm/y).data
Out[242]: array([0., 1., 0., 1.])

🤯 Now we have data where I would expect nan and inf. I'm not very experienced in Python, but this looks like an unexpected result, so I thought I should report it (I spent a lot of time tracing unexpected results from a function, which turned out to be caused by this).
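If the nan and inf values are what is wanted, two possible workarounds are sketched below, using the `x`, `xm` and `y` from the report. The names `raw` and `marked` are illustrative only, and this is one way to do it rather than a recommended fix:

```python
import numpy as np
from numpy import ma

x = np.array([0., 1., 0., 1.])
xm = ma.masked_equal(x, -1)
y = np.array([0., 0., 0., 0.])

# Option 1: divide the underlying ndarray to get the plain floating-point result
with np.errstate(divide="ignore", invalid="ignore"):
    raw = ma.getdata(xm) / y
print(raw)        # [nan inf nan inf]

# Option 2: keep the masked division, but turn masked slots into explicit nan
marked = (xm / y).filled(np.nan)
print(marked)     # [nan nan nan nan]
```

Option 1 ignores any existing mask on `xm`; Option 2 cannot distinguish nan from inf, since every masked slot gets the same fill value.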

Edit: Where I found this in the wild is even more insidious because there was no specific .data step - it happened silently as follows:

Suppose our calculation was done in a function:

def somefunc3(a,b):
    c = a / b
    return c

somefunc3(xm, y)
Out[76]: 
masked_array(data=[--, --, --, --],
             mask=[ True,  True,  True,  True],
       fill_value=-1.0,
            dtype=float64)

It output the masked array without nan and inf.

Now suppose we were storing the result of somefunc3 in a larger array:

d = np.zeros((4, 2))

d[:,0] = somefunc3(xm, y)
d
Out[77]: 
array([[0., 0.],
       [1., 0.],
       [0., 0.],
       [1., 0.]])

Now it has silently converted the masked array back to a regular array and inserted 1 or 0 where the result should be nan or inf. Note that when I ran this on my machine I got a divide-by-zero warning only once; on every other run I did not (I have no idea why).
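The silent part is the assignment into a plain `ndarray`, which copies the underlying data and drops the mask. A hedged sketch of two ways to keep the masked slots visible; `d_plain`, `d_marked` and `d_ma` are hypothetical names used only for this illustration:

```python
import numpy as np
from numpy import ma

x = np.array([0., 1., 0., 1.])
xm = ma.masked_equal(x, -1)
y = np.array([0., 0., 0., 0.])

# As in the report: assigning into an ndarray keeps the data and loses the mask
d_plain = np.zeros((4, 2))
d_plain[:, 0] = xm / y

# Alternative 1: fill masked slots with nan before assigning
d_marked = np.zeros((4, 2))
d_marked[:, 0] = (xm / y).filled(np.nan)

# Alternative 2: make the destination a masked array so the mask is carried over
d_ma = ma.zeros((4, 2))
d_ma[:, 0] = xm / y

print(d_plain[:, 0])                # [0. 1. 0. 1.]
print(d_marked[:, 0])               # [nan nan nan nan]
print(ma.getmaskarray(d_ma)[:, 0])  # every entry in column 0 is masked
```

Which alternative fits depends on whether downstream code understands masked arrays or only plain arrays with nan.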

NumPy/Python version information:

1.18.1 3.7.6 (default, Jan 8 2020, 13:42:34)
[Clang 4.0.1 (tags/RELEASE_401/final)]
Edit: same behaviour on my other machine with versions:
1.19.2 3.8.5
[Clang 10.0.0]

jpkrooney changed the title from "Possible masked array divide by zero bug" to "BUG (Possible): masked array divide by zero array seems to screen out nan and inf" on Apr 9, 2021
squarebat commented Apr 11, 2021

While reproducing this myself, I found something more interesting: the division doesn't mask outputs that are actually -1, but it does (incorrectly) mask inf and nan.

>>> x = np.array([ 0.,  1., 0.,  1.])
>>> xm = np.ma.masked_equal(x, -1)
>>> x/0
array([nan, inf, nan, inf])
>>> xm/0
masked_array(data=[--, --, --, --],
             mask=[ True,  True,  True,  True],
       fill_value=-1.0,
            dtype=float64)
>>> xm/-1
masked_array(data=[0.0, -1.0, 0.0, -1.0],
             mask=[False, False, False, False],
       fill_value=-1.0)

#The -1.0 wasn't masked

I tried division on other masked arrays, but no values are masked after division. I am guessing division with masked arrays is supposed to work this way and doesn't actually mask values on its own. I further assumed that nans and infs get masked by default since they're invalid numbers in most cases, though that is not the case when I do this:

>>> xm = np.ma.masked_equal(x/0, -1)
>>> xm
masked_array(data=[nan, inf, nan, inf],
             mask=False,
       fill_value=-1.0)

This makes me think it's not a bug/feature with the masking itself, but rather the way masked_arrays interact with the divide operator. Even if it's not a bug, it could create a lot of misunderstandings.
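One way to read the observations above: `masked_equal` tests the values once, when the masked array is built, and does not attach a rule that later results are re-checked against. The sketch below illustrates that reading, with `neg`, `remasked`, `div0` and `invalid` as hypothetical names; `ma.masked_invalid` is the existing `numpy.ma` helper for masking nan and inf explicitly.

```python
import numpy as np
from numpy import ma

x = np.array([0., 1., 0., 1.])
xm = ma.masked_equal(x, -1)

# Dividing by -1 produces -1.0 values, but the equality test is not re-run
neg = xm / -1
print(ma.getmaskarray(neg).tolist())       # [False, False, False, False]

# Re-applying masked_equal to the result masks the new -1.0 entries
remasked = ma.masked_equal(neg, -1)
print(ma.getmaskarray(remasked).tolist())  # [False, True, False, True]

# masked_equal leaves nan/inf alone; masked_invalid masks them explicitly
with np.errstate(divide="ignore", invalid="ignore"):
    div0 = x / 0
invalid = ma.masked_invalid(div0)
print(ma.getmaskarray(invalid).tolist())   # [True, True, True, True]
```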

Edit: Looked at some other issues regarding masked arrays. Seemingly masked arrays give unexpected results when used with functions not part of the ma module.
