Investigate using Numba within Scikit-Image #9

Open
mrocklin opened this issue May 21, 2018 · 8 comments

Comments

@mrocklin
Member

In https://github.com/scikit-image/scikit-image/wiki/UC-Berkeley-(BIDS)-sprint,-May-28-Jun-2-2018 scikit-image devs write the following:

figure out whether we can reliably distribute Numba, and if so, do we want to get that dependency.

I suspect that there are a few things within Numba that we might want to investigate, depending on time:

General performance

We could look at using Numba to rewrite existing Cython code to see if there is any difference in CPU or memory use, and also how the development experience feels.

We might want to read through this document on performance to guide work here: http://numba.pydata.org/numba-doc/dev/user/performance-tips.html

Parallelism

Numba has both implicit parallelism (automatic parallelization of array operations when parallel=True is set) and explicit parallel loops via prange.

https://numba.pydata.org/numba-doc/dev/user/parallel.html

Stencil operators

Many of the scikit-image functions are well defined on a local neighborhood. This might be a good fit for the experimental stencil decorator:

https://numba.pydata.org/numba-doc/dev/user/stencil.html#numba-stencil

GPU algorithms

I'm not sure what the comfort level is of those attending, but we might also consider looking at writing CUDA code with Numba.

https://numba.pydata.org/numba-doc/dev/cuda/index.html

Disadvantages

We should look at its behavior on ARM and Power architectures and see if the current bugs affect scikit-image use.

We should also get a feel for debugging and troubleshooting to see how it compares to the Cython experience.

cc @stefanv @emmanuelle @jni @kne42 @jakirkham

@mrocklin
Member Author

I was playing around with @numba.stencil and parallel operations and ran into some usability problems. I raised them in numba/numba#2982.

Things seem fine now. That issue contains decently usable examples and a small discussion on when parallelism may or may not be helpful.

@jni
Contributor

jni commented May 22, 2018

Incidentally, a crazy idea I had is that we could have both Cython and Numba by using Cython's new Pure Python annotations. We could then (a) compile if available, but (b) ship pure Python anyway, and (c) use Numba if available. There's some hairy logic to figure out, but it might be doable and if it works I think it would be pretty awesome. =)
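The "use Numba if available" part of this idea could be sketched roughly as follows. This covers only the optional-Numba fallback, not the Cython pure Python annotations, and the names `HAVE_NUMBA` and `count_above` are hypothetical.

```python
import numpy as np

try:
    from numba import njit  # use Numba when it is installed
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False

    def njit(*args, **kwargs):
        """Fallback no-op decorator, usable as @njit or @njit(...)."""
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]

        def decorator(func):
            return func
        return decorator


@njit
def count_above(image, threshold):
    """Count pixels strictly greater than threshold."""
    count = 0
    for value in image.ravel():
        if value > threshold:
            count += 1
    return count


n_bright = count_above(np.array([[0.1, 0.9], [0.5, 0.7]]), 0.5)
```

The same source then runs as plain Python when Numba is missing, at the cost of keeping the function body within the subset of Python that Numba can compile.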

@jni
Contributor

jni commented May 22, 2018

Also incidentally, my experience with optimizing and debugging Numba code is that it is still a long way from Cython. The only very nice thing is that you can set NUMBA_DISABLE_JIT=1 when testing and get coverage reports for your Numba functions, which is pretty cool. (I'm not sure how coverage works with Cython, actually.)

@stuartarchibald

@jni We'd be most interested in hearing about any specific issues you've experienced in optimizing and debugging Numba, and about which features of the Cython stack you feel are missing in Numba.

The most recent Numba (0.38, https://github.com/numba/numba/blob/5ba86a9ad3c2c425b48bc16e4adb908a8830bc9c/CHANGE_LOG#L1-L143) has had a load of user facing improvements added based on community feedback. We have a tracking ticket here: numba/numba#2888, please could you file any issue(s)/feedback against that?

numba/numba#2793 is a work in progress that brings magics providing more Numba diagnostic capabilities to tools like Jupyter/IPython, and via an API elsewhere. It is hoped that this will make the next release.

Thanks.

@jakirkham
Contributor

It would be pretty useful to have closer interop between Dask and Numba, so one could add a custom function to operate on blocks and have it JIT'd for high performance. I remember running into some pickling issues when I last tried this, which may or may not be a blocker depending on how this is implemented (i.e. compilation before vs. after transmission).

@mrocklin
Member Author

mrocklin commented May 23, 2018 via email

@jni
Contributor

jni commented May 25, 2018

@stuartarchibald I've updated numba/numba#2888 with some specific concerns. Thank you!

@stuartarchibald

@jni great, thanks for doing that, much appreciated.
