Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarification of setxor1d documentation #14670

Open
synapticarbors opened this issue Oct 9, 2019 · 4 comments
Open

Clarification of setxor1d documentation #14670

synapticarbors opened this issue Oct 9, 2019 · 4 comments

Comments

@synapticarbors
Copy link

I've been working on adding an implementation of np.setxor1d to numba (numba/numba#4677), and noticed that the documentation might need some clarifications, and I also had some questions about the expected behavior:

  • The assume_unique argument has the following description:

assume_unique : bool
If True, the input arrays are both assumed to be unique, which can speed up the calculation.
Default is False.

More specifically the code to work properly requires that the inputs are unique, 1D and sorted.

  • The input arrays ar1 and ar2, according to the docs, should just be "array-like", but a user gets unexpected errors if the arrays have ndim > 1, when assume_unique is True. The function will fail either at the point where the first call to np.concatenate if the two arrays have incompatible dimensions, or at the second call to np.concatenate when there is an attempt to concat with the 1D [True] value.

I'm not sure what the proper way of handling this is. Perhaps just a clarification of the docs, or a call to ravel() if assume_unique is True to get around the concat issue if the N-D arrays do indeed only contain unique values and they are sorted according to the (default?) order.

I'm happy to put in a PR once I get some feedback from the Numpy devs about the approach that they feel like is most appropriate.

cc/ @stuartarchibald

@synapticarbors
Copy link
Author

I would appreciate feedback on this issue if anyone has the time to take a quick look. Just re-upping this in case it got lost in the shuffle two weeks ago.

@mattip
Copy link
Member

mattip commented Dec 8, 2019

I think the intention in setxor1d is that the input be 1d, but only in certain cases does it actually matter. So the choices are:

  • clearly state in the documentation that ravel will be applied if ndims > 1 (and make the code enforce that)
  • always error if ndims > 1
    The second choice seems like it would break more code, so I would prefer the first one. Other opinions?

@stuartarchibald
Copy link
Contributor

@mattip thanks for your input. I think I'd go with option 1. This is also something Numba can do. @synapticarbors do you have a preference?

@synapticarbors
Copy link
Author

I think option #1 is probably better to avoid breaking existing code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants