BUG: `/` and `/=` behaves different in masked arrays with `nan` #20506

scratchmex · 2021-12-02T23:03:01Z

Describe the issue:

I think the operators / and /= should behave the same, since one is just shorthand for the other. With masked arrays that contain NaN they do not, as the code example shows: / automatically masks the NaN, while /= does not.
We should find out why __truediv__ and __itruediv__ are implemented differently for the MaskedArray class.

Reproduce the code example:

>>> import numpy as np
>>> a = np.ma.array([1, np.nan])
>>> b = [1, np.nan]

>>> print(a / b)
[1.0 --]

>>> a /= b
>>> print(a)
[ 1. nan]

Error message:

No response

NumPy/Python version information:

this happens in main right now.
1.22.0.dev0+1977.g114d91919 3.8.10 (default, Sep 28 2021, 16:10:42) [GCC 9.3.0]

math2001 · 2021-12-09T07:42:37Z

The code that handles __itruediv__ and __truediv__ is very different unfortunately. Two different strategies are used to protect against division by zero, which have different side-effects on NaN, causing this bug.

`__itruediv__`. Mask bad values pre division

numpy/numpy/ma/core.py

Lines 4325 to 4341 in d565c4b

    
    def __itruediv__(self, other):
        """
        True divide self by other in-place.
        """
        other_data = getdata(other)
        dom_mask = _DomainSafeDivide().__call__(self._data, other_data)
        other_mask = getmask(other)
        new_mask = mask_or(other_mask, dom_mask)
        # The following 3 lines control the domain filling
        if dom_mask.any():
            (_, fval) = ufunc_fills[np.true_divide]
            other_data = np.where(dom_mask, fval, other_data)
        self._mask |= new_mask
        self._data.__itruediv__(np.where(self._mask, self.dtype.type(1),
                                         other_data))
        return self

The key step is `self._data.__itruediv__(np.where(self._mask, self.dtype.type(1), other_data))`. By that point, `self._mask` has been ORed with the other operand's mask and with the domain mask. `_DomainSafeDivide` considers dividing by NaN safe, so the division produces NaN and the result is not masked.
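To illustrate why the pre-division check lets NaN through, here is a minimal sketch in plain NumPy. The function name `domain_unsafe` is hypothetical; it only mirrors the comparison quoted from `_DomainSafeDivide.__call__` further down, using `np.absolute` in place of `umath.absolute`. It is not the library's code path.

```python
import numpy as np

def domain_unsafe(a, b, tolerance=np.finfo(float).tiny):
    """Hypothetical stand-in for the domain check: True means 'mask this'."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    with np.errstate(invalid='ignore'):
        return np.absolute(a) * tolerance >= np.absolute(b)

numerator = np.array([1.0, 1.0, np.nan])
denominator = np.array([0.0, np.nan, 2.0])

# Division by zero is flagged, but any comparison involving NaN is False,
# so NaN operands are reported as "safe".
flags = domain_unsafe(numerator, denominator)
print(flags)  # [ True False False]
```

Only the zero denominator is flagged here; both NaN cases pass the check and would reach the in-place division unmasked.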

`__truediv__`. Filter out `!umath.isfinite` values post division

numpy/numpy/ma/core.py

Lines 1153 to 1164 in d565c4b

    
    (da, db) = (getdata(a), getdata(b))
    # Get the result
    with np.errstate(divide='ignore', invalid='ignore'):
        result = self.f(da, db, *args, **kwargs)
    # Get the mask as a combination of the source masks and invalid
    m = ~umath.isfinite(result)
    m |= getmask(a)
    m |= getmask(b)
    # Apply the domain
    domain = ufunc_domain.get(self.f, None)
    if domain is not None:
        m |= domain(da, db)

This is where the behavior diverges: `m = ~umath.isfinite(result)`. `umath.isfinite(np.nan)` is `False`, which is negated to `True`, so NaN is masked.
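The post-division filter can be sketched with public NumPy functions only. This is an illustration of the `~isfinite(result)` step, assuming `np.true_divide` stands in for `self.f`; the source masks and the domain check from the snippet above are left out.

```python
import numpy as np

da = np.array([1.0, np.nan, 1.0])
db = np.array([1.0, np.nan, 0.0])

# Divide first, silencing the warnings, as the quoted snippet does.
with np.errstate(divide='ignore', invalid='ignore'):
    result = np.true_divide(da, db)

# Anything non-finite in the result (NaN or inf) ends up in the mask.
invalid = ~np.isfinite(result)
print(invalid)  # [False  True  True]
```

Because the filter looks at the result rather than the operands, a NaN in either input is masked without any special handling.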

Solving

I couldn't find it stated explicitly in the docs, but based on the rationale for masked arrays, we should get a masked value, not NaN.

So the problem comes from _DomainSafeDivide:

numpy/numpy/ma/core.py

Lines 842 to 851 in d565c4b

    
    def __call__(self, a, b):
        # Delay the selection of the tolerance to here in order to reduce numpy
        # import times. The calculation of these parameters is a substantial
        # component of numpy's import time.
        if self.tolerance is None:
            self.tolerance = np.finfo(float).tiny
        # don't call ma ufuncs from __array_wrap__ which would fail for scalars
        a, b = np.asarray(a), np.asarray(b)
        with np.errstate(invalid='ignore'):
            return umath.absolute(a) * self.tolerance >= umath.absolute(b)

NaN >= NaN is False (safe), but we want True (unsafe). So we should extend the return value like so: return <check_small_values> | isnan(a) | isnan(b)
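A rough sketch of that proposal, written as a standalone function rather than a patch to `_DomainSafeDivide`. The name `domain_unsafe_nan_aware` and its `tolerance` parameter are hypothetical, and the actual change in the linked pull request may differ.

```python
import numpy as np

def domain_unsafe_nan_aware(a, b, tolerance=np.finfo(float).tiny):
    """Hypothetical NaN-aware domain check: True means 'mask this'."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    with np.errstate(invalid='ignore'):
        # Existing check: the denominator is too small relative to the numerator.
        too_small = np.absolute(a) * tolerance >= np.absolute(b)
    # Proposed extension: also treat a NaN in either operand as unsafe.
    return too_small | np.isnan(a) | np.isnan(b)

numerator = np.array([1.0, 1.0, np.nan, 1.0])
denominator = np.array([1.0, np.nan, 2.0, 0.0])
unsafe = domain_unsafe_nan_aware(numerator, denominator)
print(unsafe)  # [False  True  True  True]
```

With this extension the pre-division check flags the same elements that the post-division `~isfinite(result)` filter would mask for NaN inputs, which is the consistency the issue asks for.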

scratchmex added the 00 - Bug label Dec 2, 2021

scratchmex changed the title ~~BUG: / and /= behaves different with nan~~ BUG: / and /= behaves different in masked arrays with nan Dec 2, 2021

WarrenWeckesser added the component: numpy.ma masked arrays label Dec 4, 2021

math2001 linked a pull request Dec 9, 2021 that will close this issue

BUG: numpy.ma mask nan on __itruediv__ #20551

Open

seberg mentioned this issue Oct 5, 2022

BUG: Masked division considers large float64 values as inf #22347

Open
