Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use pykdtree when available #1150

Merged
merged 3 commits into from
Oct 27, 2018
Merged

Use pykdtree when available #1150

merged 3 commits into from
Oct 27, 2018

Conversation

QuLogic
Copy link
Member

@QuLogic QuLogic commented Oct 20, 2018

Rationale

The pykdtree project by @storpipfugl is super fast compared to scipy, so this tries to use it if it's available.

Implications

There is no effect on tests, but a great effect on the examples:

                   example   function     master   pykdtree     ratio
0          aurora_forecast    imshow0   1.614568   0.697115  0.431765
1          aurora_forecast    imshow1   1.620853   0.704454  0.434619
2          aurora_forecast       draw   0.916317   1.119499  1.221738
3        eccentric_ellipse     imshow   2.400456   1.269291  0.528771
4        eccentric_ellipse       draw   0.341589   0.677258  1.982666
5   effects_of_the_ellipse  add_image   0.000008   0.000008  1.031250
6   effects_of_the_ellipse       draw   1.835607   1.980497  1.078933
7             eyja_volcano  add_image   0.000003   0.000003  0.928571
8             eyja_volcano       draw   3.526339  10.100620  2.864336
9            geostationary     imshow  96.402943   1.546087  0.016038
10           geostationary       draw   0.643935   0.896134  1.391653
11                     wms    add_wms   0.514387   0.521981  1.014763
12                     wms       draw   2.045766   2.763029  1.350609

In almost all cases, the imshow is approximately half the time while draw is slightly slower. For some reason, the draw in eyja_volcano is much slower which I need to investigate a bit, but the real win here is geostationary which is two orders of magnitude faster.

This speedup should hopefully improve doc builds a bit (at least where it was added.)

@QuLogic
Copy link
Member Author

QuLogic commented Oct 20, 2018

Oh, actually, I think that eyja_volcano one is a fluke. I re-ran everything 5 times and took averages:

                                  master_mean  master_std  pykdtree_mean  pykdtree_std    ratio  ratio_err
example                function                                                                           
aurora_forecast        draw          1.678460    0.115361       1.657326      0.326995 0.987408   0.206300
                       imshow0       1.620274    0.034479       0.701028      0.009098 0.432660   0.010784
                       imshow1       1.596412    0.003287       0.717052      0.014395 0.449165   0.009064
eccentric_ellipse      draw          1.033083    0.433136       0.838907      0.155962 0.812043   0.372432
                       imshow        2.452178    0.067519       1.273574      0.009867 0.519364   0.014856
effects_of_the_ellipse add_image     0.000008    0.000001       0.000008      0.000001 0.988636   0.160593
                       draw          2.272155    0.206711       2.140229      0.113149 0.941938   0.099112
eyja_volcano           add_image     0.000003    0.000000       0.000003      0.000000 0.955224   0.071966
                       draw          8.419627    9.380086       4.448825      0.128522 0.528387   0.588861
geostationary          draw          0.889276    0.086658       0.927128      0.150869 1.042565   0.197748
                       imshow       98.028681    1.924029       1.555004      0.018169 0.015863   0.000362
wms                    add_wms       0.508353    0.039948       0.501570      0.018772 0.986658   0.085879
                       draw          2.873218    0.163422       2.394889      0.180951 0.833521   0.078828

Most imshow are ~50%, draw is between ~5% under and 5% over (but it's generally small and lots of variance anyway), and geostationary is still the most amazing running with just 1.5% of the original time.

Docs are 2.5 minutes faster, so maybe we should add pykdtree on both builds?

except TypeError:
kdtree = scipy.spatial.cKDTree(xyz)
_, indices = kdtree.query(target_xyz, k=1)
mask = indices >= len(xyz)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is there because we don't really care about the distance and pykdtree returns differing results for out-of-domain vs. nan storpipfugl/pykdtree#35

Use pykdtree on 'latest' for Travis, and for 3.6 for CircleCI.
@QuLogic
Copy link
Member Author

QuLogic commented Oct 27, 2018

Rebased; I enabled pykdtree on both doc builds because a) it's so much faster, and b) there's already checks for plain scipy on Travis.

@ajdawson
Copy link
Member

I don't see any reason why we wouldn't want to do this, so 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants