BUG: numpy.ma: Fix ma.in1d #20011

Open: wants to merge 4 commits into base: main

tirthasheshpatel
Contributor

@tirthasheshpatel tirthasheshpatel commented Oct 1, 2021

Fixes #19877

Fixed ma.in1d and added test cases from gh-19877.

I just rewrote the function to use numpy.in1d instead of reinventing the wheel for masked arrays. To me, this seems more maintainable. I don't know if there is any reason to handle masked arrays differently. If so, sorry; I'll close this PR.

@rgommers rgommers added the component: numpy.ma masked arrays label Oct 12, 2021
@InessaPawson InessaPawson added the triage review Issue/PR to be discussed at the next triage meeting label Feb 25, 2022
@mattip mattip self-requested a review June 1, 2022 16:52
@mattip mattip removed the triage review Issue/PR to be discussed at the next triage meeting label Jun 1, 2022
Member

mattip commented Jun 1, 2022

Hi @tirthasheshpatel we discussed this at the last triage meeting and would like to review/merge. Could you rebase off main in order to pick up any CI changes that might have happened since this was first submitted?

Contributor Author

tirthasheshpatel commented Jun 2, 2022

> we discussed this at the last triage meeting and would like to review/merge.

Great, thanks!

> Could you rebase off main in order to pick up any CI changes that might have happened since this was first submitted?

Done.

Member

mattip commented Jun 2, 2022

Seems to make sense to me. @WarrenWeckesser want to take a look?

    out = np.in1d(ar1, ar2, assume_unique=assume_unique,
                  invert=invert)
    if m is not nomask:
        res[~m] = out
Contributor

Shouldn't the output array be masked by the intersection of the mask of ar1 and ar2?

Member

For this function, the mask of the output array is determined by the mask of the first argument ar1 only. This is not a pairwise operation; the mask of the second argument actually has no effect on the output.
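To illustrate the point about the output mask, here is a minimal sketch. It is not NumPy's implementation: `in1d_mask_sketch` is a hypothetical helper, it uses `np.isin` on flattened data as a stand-in for `np.in1d`, and dropping the masked entries of `ar2` from the lookup set is an assumption made only for this example.

```python
import numpy as np
import numpy.ma as ma


def in1d_mask_sketch(ar1, ar2):
    """Hypothetical helper for illustration; not the code in this PR."""
    ar1 = ma.asarray(ar1).ravel()
    ar2 = ma.asarray(ar2).ravel()
    # Assumption: masked entries of ar2 are dropped from the lookup set.
    found = np.isin(ar1.data, ar2.compressed())
    # The output mask is taken from ar1 alone; ar2's mask never enters it.
    return ma.array(found, mask=ma.getmaskarray(ar1))


a = ma.array([1, 2, 3, 4], mask=[False, True, False, False])
b_plain = ma.array([1, 3, 5])
b_masked = ma.array([1, 3, 5], mask=[False, False, True])

r1 = in1d_mask_sketch(a, b_plain)
r2 = in1d_mask_sketch(a, b_masked)
print(r1.mask, r2.mask)  # both masks follow a's mask
```

In this sketch, masking an element of the second argument changes which values can match, but the positions that end up masked in the result are exactly the masked positions of the first argument.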

    ar1, ar2 = ma.asarray(ar1), ma.asarray(ar2)
    m = getmask(ar1)
    res = ma.empty_like(ar1, dtype='bool')
    if ar1.mask is not nomask:
Member

You already have ar1.mask in m, so here and in the next line, use m instead of ar1.mask.

Member

... and if you prefer the readability of ar1.mask, you could rename m to ar1_mask.

Member

WarrenWeckesser commented Jun 15, 2022

It looks like this changed the behavior of ma.in1d with multidimensional array arguments. In the main development branch, this works:

In [6]: ma.in1d([[0, 1, 2, 3], [4, 5, 6, 7]], [0, 2, 3])
Out[6]: 
masked_array(data=[ True, False,  True,  True, False, False, False, False],
             mask=False,
       fill_value=True)

But in this branch, we get an exception:

In [2]: ma.in1d([[0, 1, 2, 3], [4, 5, 6, 7]], [0, 2, 3])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-a44d06ca9c57> in <cell line: 1>()
----> 1 ma.in1d([[0, 1, 2, 3], [4, 5, 6, 7]], [0, 2, 3])

~/py3.10.1/lib/python3.10/site-packages/numpy/ma/extras.py in in1d(ar1, ar2, assume_unique, invert)
   1208         res[~m] = out
   1209     else:
-> 1210         res[:] = out
   1211     return res
   1212 

~/py3.10.1/lib/python3.10/site-packages/numpy/ma/core.py in __setitem__(self, indx, value)
   3375         if _mask is nomask:
   3376             # Set the data, then the mask
-> 3377             _data[indx] = dval
   3378             if mval is not nomask:
   3379                 _mask = self._mask = make_mask_none(self.shape, _dtype)

ValueError: could not broadcast input array from shape (8,) into shape (2,4)
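The traceback suggests a shape mismatch: the membership test returns a flat result of shape (8,), while the result buffer keeps the (2, 4) shape of the first argument. The sketch below reproduces that mismatch outside the PR and shows one possible remedy, flattening the first argument before allocating the result. This is an assumption about how the regression could be addressed, not the fix adopted in this PR, and `np.isin` on raveled data stands in for `np.in1d`.

```python
import numpy as np
import numpy.ma as ma

ar1 = ma.asarray([[0, 1, 2, 3], [4, 5, 6, 7]])
ar2 = ma.asarray([0, 2, 3])

# Flat membership result, the same shape np.in1d would return: (8,)
out = np.isin(ar1.data.ravel(), ar2.data)

# Mirrors the failing assignment: a (2, 4) buffer cannot take an (8,) result.
res_2d = ma.empty_like(ar1, dtype=bool)
try:
    res_2d[:] = out
    failed = False
except ValueError:
    failed = True

# One possible remedy (assumption): flatten ar1 before allocating the result.
flat = ar1.ravel()
res = ma.empty_like(flat, dtype=bool)
res[:] = out
print(failed, res.shape)
```

With the flattened buffer, the result has the same eight values as the output shown for the main development branch above.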

Successfully merging this pull request may close these issues.

Wrong results of ma.isin() and ma.in1d() for masked arrays
6 participants