Performance of asarray() when input is already an array #13580

Closed
anntzer opened this issue May 17, 2019 · 4 comments
Labels
component: numpy._core, Priority: high

Comments

@anntzer
Contributor

anntzer commented May 17, 2019

I have noticed that np.asarray(x) and np.asarray(x, float) have fairly large overheads when the input x is already a float array. One could check for that case up front and return x immediately when possible. The additional check would add negligible cost when a conversion is actually needed, because the conversion itself (e.g. when x is a list) is much slower than anything else.

Reproducing code example:

In [1]: import numpy as np; x = np.random.rand(10, 10)

In [2]: %timeit np.asarray(x)                                                                    
345 ns ± 0.53 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [3]: %timeit np.asarray(x, float)                                                             
613 ns ± 0.274 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

# Check for fast path.
In [4]: %timeit x if type(x) is np.ndarray and x.dtype == float else np.asarray(x, float)        
214 ns ± 0.894 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [5]: x = np.random.rand(10, 10).tolist()                                                      

In [6]: %timeit np.asarray(x)                                                                    
6.77 µs ± 6.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [7]: %timeit np.asarray(x, float)                                                             
6.78 µs ± 14.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# The additional check is a bit slower but not so much, relatively speaking.
In [8]: %timeit x if type(x) is np.ndarray and x.dtype == float else np.asarray(x, float)        
6.93 µs ± 10.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
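
The inline check timed above can be wrapped in a small helper. This is only a sketch of the proposed fast path, not NumPy's implementation; the name `as_float_array` is hypothetical and chosen for illustration.

```python
import numpy as np


def as_float_array(x):
    """Hypothetical helper: skip np.asarray when x is already a float64 ndarray."""
    # Exact type check, so ndarray subclasses still go through np.asarray.
    if type(x) is np.ndarray and x.dtype == float:
        return x
    # Slow path: lists, other dtypes, and subclasses are converted as usual.
    return np.asarray(x, dtype=float)


arr = np.random.rand(10, 10)
same = as_float_array(arr)                # fast path: the very same object
converted = as_float_array(arr.tolist())  # slow path: a new float64 array
```

On the fast path no new object is created, so callers that mutate the result also mutate the input, exactly as with np.asarray.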

Numpy/Python version information:

1.16.3 3.7.3 (default, Mar 26 2019, 21:43:19) 
[GCC 8.2.1 20181127]
@hmaarrfk
Contributor

I think the discussion #11804 is relevant.

@rgommers added the component: numpy._core and Priority: high labels Jul 13, 2019
@rgommers
Member

@seberg and I just discussed this: the expected speedup from moving asarray to C is almost 2x, and asarray can be a performance bottleneck on real-world code bases because it's called so often. It may also give a nice speedup on test suites.

@seberg self-assigned this Jul 16, 2019
@seberg removed their assignment Dec 3, 2019
seberg added a commit to seberg/numpy that referenced this issue Dec 6, 2019
This moves all argument parsing into specialized code (unless an
error in the arguments themselves is found).
In particular, the move to C, while also avoiding any kwarg parsing
in the case of e.g. `np.asarray(arr)`, means that such simple calls
speed up by more than a factor of 3.
That optimization is still very small, but assuming that asarray is called
a lot (and occasionally in tight loops) it may be worth it.

Closes gh-13580
@seberg
Member

seberg commented Nov 5, 2021

This has been fixed by my new argument-parsing code and by moving these array "aliases" to C.

Using asarray, etc. is now preferable to using array (if you pass fewer keywords). In general, though, keyword arguments now carry only a very small overhead.
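
To check the remaining per-call overhead on a given NumPy version, a rough measurement along these lines may help. It is a sketch using the standard-library `timeit` module; the helper `ns_per_call` and the loop counts are assumptions made for illustration, and each figure includes the cost of the lambda wrapper, so only the differences between rows are meaningful.

```python
import timeit

import numpy as np

x = np.random.rand(10, 10)

# The three call styles discussed in this issue; all return x without copying.
calls = {
    "np.asarray(x)": lambda: np.asarray(x),
    "np.asarray(x, float)": lambda: np.asarray(x, float),
    "np.asarray(x, dtype=float)": lambda: np.asarray(x, dtype=float),
}


def ns_per_call(func, number=100_000, repeat=5):
    """Hypothetical helper: best-of-`repeat` time per call, in nanoseconds."""
    best = min(timeit.repeat(func, number=number, repeat=repeat))
    return best / number * 1e9


timings = {label: ns_per_call(func) for label, func in calls.items()}
for label, ns in timings.items():
    print(f"{label:<28} {ns:8.1f} ns")
```

The absolute numbers depend on the machine and the NumPy build, so no expected output is shown here.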

@seberg seberg closed this as completed Nov 5, 2021
@rgommers
Member

rgommers commented Nov 5, 2021

That's great news, thanks @seberg!
