Is numba worth the trouble if numpy can do most of the work? #23

danielballan · 2014-01-23T02:55:11Z

This S.O. question, in particular the original poster's comment on the answer, sums up my feelings as I try to fit numba into my code. It seems that a lot of common numpy idioms are not supported.

Explicit loops, which seem to be what numba wants, make code less readable and obviously slower for users who might not have numba available. Disappointing.
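One common way to soften the "numba might not be available" concern is an optional import with a no-op fallback. The following is a minimal sketch, not trackpy's actual code: the function `center_of_mass_1d` and the fallback `njit` stand-in are hypothetical names made up for illustration, and a plain tuple is used only to keep the example dependency-free (in practice the input would be a numpy array).

```python
try:
    # njit compiles in nopython mode, so unsupported code raises an error
    # instead of quietly running at Python-object speed.
    from numba import njit
except ImportError:
    def njit(func):
        """No-op stand-in (hypothetical) used when numba is not installed."""
        return func


@njit
def center_of_mass_1d(weights):
    """Intensity-weighted mean index of a 1D profile, as an explicit loop."""
    total = 0.0
    moment = 0.0
    for i in range(len(weights)):
        total += weights[i]
        moment += i * weights[i]
    return moment / total


# A profile that is symmetric about index 2.
profile = (0.0, 1.0, 2.0, 1.0, 0.0)
center = center_of_mass_1d(profile)
print(center)
```

Without numba the same loop still runs, only at pure-Python speed, which is the trade-off described above.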

On the other hand, the line profiler %lprun shows me that the refine function in feature.py takes 95% of the runtime during feature location, so making that faster is worth the trouble, even if it does take a lot of trouble.

Thoughts?


danielballan · 2014-01-26T19:26:45Z

OK, I have numba working on feature.py, similar to how Nathan used it on identification.py: it speeds up the refine loop. For extra credit, I managed to keep general dimensions, not just 2D.

I have more work to do before I merge this into master, but this is the speedup so far:

numba-dan branch:
noise-filled dummy image: 1 loops, best of 3: 1.15 s per loop
image of particles: 1 loops, best of 3: 1.61 s per loop

on master:
noise-filled dummy image: 1 loops, best of 3: 1.32 s per loop
image of particles: 1 loops, best of 3: 1.9 s per loop
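The figures above have the shape of IPython's %timeit output. Outside IPython, a comparable "best of 3" measurement can be sketched with the standard-library timeit module. The function `locate_dummy` below is a hypothetical stand-in for the real feature-location call, which is not shown in this thread.

```python
import timeit


def locate_dummy():
    """Hypothetical stand-in for the call being benchmarked."""
    return sum(i * i for i in range(10_000))


# One call per measurement, repeated three times; report the fastest run.
runs = timeit.repeat(locate_dummy, number=1, repeat=3)
best = min(runs)
print(f"1 loop, best of 3: {best * 1e3:.3g} ms per loop")
```

Taking the minimum rather than the mean reduces the influence of unrelated system activity on the reported time.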

danielballan · 2014-01-26T19:43:01Z

Figures above had memoization code removed. The times are better with memoization reinstated, so masks are only generated once.

numba-dan with memoization turned on
1 loops, best of 3: 966 ms per loop
1 loops, best of 3: 1.38 s per loop
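The idea of generating each mask only once can be sketched as follows. This is an illustration under assumptions, not the memoization code in the branch: it swaps in `functools.lru_cache` from the standard library, and `disk_mask` is a hypothetical helper that returns nested tuples so the example has no dependencies (a real mask would be a numpy array).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def disk_mask(radius):
    """Boolean disk mask of shape (2r+1, 2r+1) as nested tuples (hypothetical)."""
    span = range(-radius, radius + 1)
    return tuple(
        tuple(x * x + y * y <= radius * radius for x in span)
        for y in span
    )


first = disk_mask(3)    # computed on the first call
second = disk_mask(3)   # returned from the cache, not recomputed
info = disk_mask.cache_info()
print(info.hits, info.misses)
```

Because the cache key is the argument itself, this only pays off when the same radius is requested repeatedly, as happens when every feature in an image uses the same mask.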


nkeim · 2014-01-27T22:46:38Z

Yes, numba (or cython, etc.) is worth the trouble! :) At least when you're tracking a flowing material made of close-packed particles, which really taxes the subnet code. Switching the subnet code to numba gave me roughly an order of magnitude in speed for my data.

I have battled numba before, so if you are running into problems, I may be able to help. Also, now that you have the numba code running and passing tests, there may be ways to speed it up. As in Cython, it's possible for the code to silently drop back into the Python object model, which means a huge penalty.

I have re-forked the master branch of this repo and will probably add the numba subnet code soon. I don't mean to be impatient --- I just have to add a feature to the linker for my research, and I thought I'd try to quit my old numba branch, which means that I need comparable performance from the new one soon.

I'm so glad that refine can work with numba! Thanks!

danielballan · 2014-01-28T13:38:50Z

Great! Once I have something that I'm sure is working, I'll ask for your help. Maybe later today if things go well.

Don't worry about seeming impatient -- I was worried about seeming that way myself. I'd like to have something stable to work on soon. I think what remains is:

Develop PIMS to the point where the API is stable so that the tutorials for trackpy will be stable too. As you can see, Tom has been hard at work nearly rewriting the whole thing, making it much better.
Merge in my numba refine branch.
Merge in your numba subnet branch, and set up the existing unit tests to run on it.
Test things on real data for a few days.
Update README/docs/tutorials at least to the point that there is no wrong information.

Anything else on our collective wish list before we declare victory on v0.2 and start using it full time?

danielballan · 2014-01-29T22:46:41Z

We have our numba subnet. #44 paves the way for a drop-in replacement or 2D special case for refine, whether it is numba, cython, or C. Since Nathan has indicated that his needs are met in runtrackpy, I will close this issue.

nkeim pushed a commit to nkeim/trackpy that referenced this issue Jan 27, 2014

Merge pull request soft-matter#23 from danielballan/fix-topn

8d08b18

Feature tests, fixes, and performance boosts.

danielballan closed this as completed Jan 29, 2014
