Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for np.unique #2902

Merged
merged 6 commits into from
May 3, 2018
Merged

Support for np.unique #2902

merged 6 commits into from
May 3, 2018

Conversation

Rik-de-Kort
Copy link
Contributor

@Rik-de-Kort Rik-de-Kort commented Apr 17, 2018

closes #2884

Fairly minimal, just the single array argument is supported,, not any of the return options or the axis argument.

@codecov-io
Copy link

codecov-io commented Apr 17, 2018

Codecov Report

Merging #2902 into master will decrease coverage by <.01%.
The diff coverage is 77.77%.

@@            Coverage Diff             @@
##           master    #2902      +/-   ##
==========================================
- Coverage   85.84%   85.84%   -0.01%     
==========================================
  Files         326      326              
  Lines       68309    68329      +20     
  Branches     7720     7722       +2     
==========================================
+ Hits        58640    58654      +14     
- Misses       8428     8434       +6     
  Partials     1241     1241

@rhkleijn
Copy link

Great to see support for more numpy functions in numba!
The implementation currently does not seem to support empty arrays, i.e. np.array([]).
I guess replacing [b[0]] with [_ for _ in b[:1]] would take care of that.
[*b[:1]] would have been a bit less verbose but numba does not seem to support that construct.

@Rik-de-Kort Rik-de-Kort changed the title Support for np.unique [WIP] Support for np.unique Apr 17, 2018
@Rik-de-Kort
Copy link
Contributor Author

Thanks for the feedback. I'm pretty sure we should be able to determine the size of the array and branch the implementation based on that, but I don't know how to check for an empty array.

@Rik-de-Kort Rik-de-Kort changed the title [WIP] Support for np.unique Support for np.unique Apr 17, 2018
@Rik-de-Kort
Copy link
Contributor Author

Turns out it's not possible to differentiate on emptiness. Current implementation might replace [x for x in b[:1]] by simply list(b[:1]) (else we add an array to a list), but a cast has to happen somewhere.

Copy link
Member

@sklam sklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution. I have some minor suggestions.

@overload(np.unique)
def np_unique(a):
def np_unique_impl(a):
b = np.sort(a.flatten())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance nitpick: use .ravel() instead of .flatten() so you can save one copying.

def np_unique(a):
def np_unique_impl(a):
b = np.sort(a.flatten())
return np.array([x for x in b[:1]] + [x for i, x in enumerate(b[1:]) if b[i] != x])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pep8 nitpick: line more than 79col

@sklam
Copy link
Member

sklam commented Apr 30, 2018

Turns out it's not possible to differentiate on emptiness.

You can check the size of the array by arr.size.

@sklam sklam merged commit 4f6b632 into numba:master May 3, 2018
@stuartarchibald stuartarchibald added this to the Numba 0.39 RC milestone May 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request numpy.unique
5 participants