
nanmax_nb(a) does the same as np.nanmax(a, axis=0) but slower #37

Closed
Ziink opened this issue Aug 13, 2020 · 2 comments

Comments

Ziink commented Aug 13, 2020

import numpy as np
import vectorbt as vbt

ndarr = np.random.randint(0, 10000, size=(1000, 1000))
%timeit nd1 = vbt.nb.nanmax_nb(ndarr)
%timeit nd2 = np.nanmax(ndarr, axis=0)
np.array_equal(nd1, nd2)

Output:
809 µs ± 3.07 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
651 µs ± 1.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
True

Probably the same for nanmean_nb, etc.

polakowo (Owner) commented

Generally, vectorized NumPy operations beat their Numba counterparts by a small margin. The reason some functions such as nanmax_nb were written in Numba is that they can be used inside the user's own njitted code: the current version of Numba doesn't support the axis argument in functions such as np.nanmax, so nanmax_nb comes in handy there. This particular function could be optimized further (it just runs np.nanmax on each column), but it isn't used anywhere in vectorbt (pd.DataFrame.vbt.max uses NumPy and, optionally, bottleneck), so you can safely ignore its existence. Finally, vectorbt's true power lies in its window functions; most of the others are there just for convenience.
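
To make that concrete, here is a minimal sketch (not from the original thread) of calling nanmax_nb inside njitted code, where np.nanmax(a, axis=0) would fail to compile because Numba doesn't support the axis argument. The helper name col_max_spread_nb is made up for illustration; the vbt.nb.nanmax_nb path is taken from the snippet above.

import numpy as np
from numba import njit
import vectorbt as vbt

# reference the jitted function via a global so the call compiles inside @njit
nanmax_nb = vbt.nb.nanmax_nb

@njit
def col_max_spread_nb(a):
    # np.nanmax(a, axis=0) is not supported inside @njit (no axis argument);
    # nanmax_nb returns the per-column maxima instead
    col_max = nanmax_nb(a)
    return np.nanmax(col_max) - np.nanmin(col_max)

a = np.random.uniform(0., 1., size=(1000, 1000))
print(col_max_spread_nb(a))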

Ziink commented Aug 14, 2020

Makes sense.

Thanks.

Ziink closed this as completed Aug 14, 2020