ENH: Port cKDTree query methods to C++, allow pickling on Python 2, and add support for periodic queries #4890

sturlamolden · 2015-05-19T02:40:35Z

New WIP PR after rebase disaster.

ewmoore · 2015-05-19T12:42:55Z

scipy/spatial/ckdtree/src/ckdtree_count_neighbors.cxx

+#include "ckdtree_cpp_rectangle.h"
+
+static void
+__count_neighbors_traverse(const ckdtree      *self,


Aren't all double underscore names reserved in C++?

Does this help with the performance regressions? https://pv.github.io/scipy-bench/#regressions?sort=3&dir=desc

The C++ compilers do not complain about the double underscores, but I'll check. They are in the Cython code, but it is trivial to remove them.

Double underscores are gone. They served no purpose anyway in the C++ code.

sturlamolden · 2015-05-19T18:58:11Z

query_ball_point is now ported to C++. With some minor modifications it should also be possible to multithread this function.

query_ball_tree and sparse_distance_matrixremains to be ported to C++.

sturlamolden · 2015-05-19T20:20:22Z

count_neighborshas a test failure according to Travis, returns 34 instead of 33.

Update: Found one typo, but now it returns 32 instead of 33. Bugger.
Update: Squashed

sturlamolden · 2015-05-20T12:47:32Z

query_ball_treeis ported to C++, sparse_distance_matrixonly remaining method to port.

query_ball_point is multithreaded similarly to query.

sturlamolden · 2015-05-20T17:10:36Z

All query methods now in C++

Added formerly requested option of returning other containers than dok_matrix from sparse_distance_matrix. Test case needed for these.

sturlamolden · 2015-05-20T18:49:00Z

Remaining issues:

testing on Windows
code review, final bug fixing
use prefetching in sparse_distance_matrix (if I bother)

sturlamolden · 2015-05-21T07:35:49Z

Travis is happy with this, it seems.

rgommers · 2015-05-21T20:02:27Z

Copying @pv's question that didn't get answered yet: does this help with the performance regressions? https://pv.github.io/scipy-bench/#regressions?sort=3&dir=desc ?

I did some browsing through all the cKDTree benchmarks and while we get a major improvement for 0.16.x there are a few things that got a lot slower. Would be good to copy benchmark results for this PR in a comment here before merging.

sturlamolden · 2015-05-21T21:45:43Z

I can do a systematic benchmark of cKDTree in 0.15.x, 0.16.x and this PR. That is clearly needed before this is ready to merge. I tried to read the benchmarks from @pv , but I am not quite sure what they measure and how to read them.

0.16.x provides two ways of constructing the kd-tree: sliding midpoint as before and median as in scikit-learn. I would not be surprised if the new method could give performance boost or regression depending on the query method. Since the median method makes kd-tree construction slower (but queries can be faster), the net effect can also depend on how much the kd-tree construction account for the whole run-time (e.g. it matters relatively more for smaller data sets than for large). The effect on query time will also depend on the distribution of the data, though @jakevdp thinks the median method is only inferior in contrived situations. The effect on run-time from the changes here probably depends on the hardware as well, and maybe also the compiler.

Another thing to note is that speed is not everything. This PR also allows the GIL to be released while doing the queries. In 0.15.x the GIL is never released; in 0.16.x only .queryreleases the GIL; here all query methods release the GIL. This is rather important because id-tree queries can be lengthy and will lock-up the interpreter while they run with the GIL locked. Even if we do not multithread the queries, concurrent i/o will suffer, GUIs will not respond, and real-time graphics might not update in time. kd-trees are often used for computer graphics (e.g. in games and simulations), so this is rather important.

sturlamolden · 2015-05-21T21:48:48Z

I should probably set up a repo for the benchmarks.

sturlamolden · 2015-05-22T01:33:19Z

Repo for benchmarks:
https://github.com/sturlamolden/cKDTree-bench

sturlamolden · 2015-05-22T04:10:19Z

Benchmarks:
http://nbviewer.ipython.org/github/sturlamolden/cKDTree-bench/blob/master/benchmark-i7.ipynb

sturlamolden · 2015-05-22T04:12:52Z

The only performance regression from 0.15.x to 0.16.x is in the kd-tree creation time. But that is on a time scale three orders of magnitude smaller than the duration of the query methods.

This PR is equally fast or faster than 0.16.x, depending on the query method.

rgommers · 2015-05-22T06:29:48Z

The existing benchmarks are here: benchmarks/benchmarks/spatial.py. It looks like the benchmark results do depend on the shape of the input data.

The rendered benchmarks are at https://pv.github.io/scipy-bench/# (look for spatial). They show performance as function of time, from 2009 to now. As an example of a regression, see https://pv.github.io/scipy-bench/#spatial.Neighbors.time_sparse_distance_matrix. On the left sidebar, deselect KDTree and select cKDTree under class. Then you'll see a graph with a jump up in 2013 (the Cython rewrite) and other jump up (click and drag a zoom window onto the plot to zoom in) from a couple of weeks ago.

Now for a particular benchmark it can be the case that that benchmark is not realistic and can therefore be ignored or should be adapted. The query, query_pairs and query_ball_points ones look OK. build becoming slower might be expected. sparse_distance_matrix becoming slower I'm not sure about.

Note that you can run the benchmarks locally with $ python runtests.py --bench -s spatial

rgommers · 2015-05-22T06:46:31Z

The output you'll get is:

$ python runtests.py --bench -s spatial
Building, see build.log...
    ... build in progress
    ... build in progress
    ... build in progress
    ... build in progress
    ... build in progress
    ... build in progress
    ... build in progress
    ... build in progress
Build OK
Running benchmarks for Scipy version 0.17.0.dev0+7b893ae at /home/rgommers/Code/bldscipy/build/testenv/lib/python2.7/site-packages/scipy/__init__.pyc
· Discovering benchmarks
· Running 6 total benchmarks (1 commits * 1 environments * 6 benchmarks)
[  0.00%] ·· Building for /usr/bin/python
[  0.00%] ·· Benchmarking /usr/bin/python
[ 16.67%] ··· Running spatial.Build.time_build                                                       ok
[ 16.67%] ···· 
               =================== ========= =========
               --                         class       
               ------------------- -------------------
                    (m, n, r)        KDTree   cKDTree 
               =================== ========= =========
                 (3, 10000, 1000)   33.08ms    3.46ms 
                 (8, 10000, 1000)   64.40ms    4.73ms 
                (16, 10000, 1000)   37.49ms    6.80ms 
               =================== ========= =========

[ 33.33%] ··· Running spatial.Neighbors.time_count_neighbors                                         ok
[ 33.33%] ···· 
               ================== ============== ========== =========
               --                                       class        
               --------------------------------- --------------------
                  (m, n1, n2)      probe radius    KDTree    cKDTree 
               ================== ============== ========== =========
                (3, 1000, 1000)        0.2        410.96ms    2.32ms 
                (3, 1000, 1000)        0.5        698.07ms    3.90ms 
                (8, 1000, 1000)        0.2         1.71s      8.46ms 
                (8, 1000, 1000)        0.5         2.11s     11.01ms 
                (16, 1000, 1000)       0.2         2.50s     14.63ms 
                (16, 1000, 1000)       0.5         2.71s     17.44ms 
               ================== ============== ========== =========

[ 50.00%] ··· Running spatial.Neighbors.time_sparse_distance_matrix                                  ok
[ 50.00%] ···· 
               ================== ============== ======== =========
               --                                      class       
               --------------------------------- ------------------
                  (m, n1, n2)      probe radius   KDTree   cKDTree 
               ================== ============== ======== =========
                (3, 1000, 1000)        0.2        1.41s     1.97ms 
                (3, 1000, 1000)        0.5        2.53s     6.99ms 
                (8, 1000, 1000)        0.2        5.39s     6.83ms 
                (8, 1000, 1000)        0.5        6.59s     9.42ms 
                (16, 1000, 1000)       0.2        9.42s    14.19ms 
                (16, 1000, 1000)       0.5        10.68s   17.21ms 
               ================== ============== ======== =========

[ 66.67%] ··· Running spatial.Query.time_query                                                       ok
[ 66.67%] ···· 
               =================== ========== ========== ==============
               --                                 class                
               ------------------- ------------------------------------
                    (m, n, r)        KDTree    cKDTree    cKDTree_flat 
               =================== ========== ========== ==============
                 (3, 10000, 1000)   247.76ms    2.73ms      72.60ms    
                 (8, 10000, 1000)    4.26s     22.65ms      105.84ms   
                (16, 10000, 1000)    27.65s    235.68ms     166.01ms   
               =================== ========== ========== ==============

[ 83.33%] ··· Running spatial.Radius.time_query_ball_point                                           ok
[ 83.33%] ···· 
               ================== ============== ======== ========= ==============
               --                                              class              
               --------------------------------- ---------------------------------
                   (m, n, r)       probe radius   KDTree   cKDTree   cKDTree_flat 
               ================== ============== ======== ========= ==============
                (3, 10000, 1000)       0.2        2.08s     8.49ms     77.31ms    
                (3, 10000, 1000)       0.5        6.04s    26.86ms     83.03ms    
               ================== ============== ======== ========= ==============

[100.00%] ··· Running spatial.Radius.time_query_pairs                                                ok
[100.00%] ···· 
               ================ ============== ========== ========== ==============
               --                                             class                
               ------------------------------- ------------------------------------
                  (m, n, r)      probe radius    KDTree    cKDTree    cKDTree_flat 
               ================ ============== ========== ========== ==============
                (3, 1000, 30)        0.2        388.76ms   977.07μs      3.75ms    
                (3, 1000, 30)        0.5        506.58ms    1.63ms       4.29ms    
                (8, 1000, 30)        0.2         2.51s      3.86ms       5.72ms    
                (8, 1000, 30)        0.5         2.46s      4.72ms       5.67ms    
                (16, 1000, 30)       0.2         2.63s      7.39ms       8.66ms    
                (16, 1000, 30)       0.5         2.62s      8.11ms       8.71ms    
               ================ ============== ========== ========== ==============

sturlamolden · 2015-05-22T10:56:23Z

cKDTree did not have sparse_distance_matrix prior to the Cython rewrite. Clearly that could not have caused a regression :)

sturlamolden · 2015-05-22T10:58:39Z

In my tests I used the dummy data generator from scikit-learn to get a more realistic dataset than just random numbers.

pv · 2015-05-22T11:38:08Z

sparse_distance_matrix has been in cKDTree since 882820e in 2012. Obviously, the regression is after that.

pv · 2015-05-22T11:45:36Z

The first regression in the sparse_distance_matrix is probably due to fixing the implementation to give correct result, probably a9565c0

The kdtree benchmarks currently under benchmarks/ are those that were previously under scipy/spatial/benchmarks --- I have no idea whether they are sensible or not. If you think they are not sensible, they should be replaced. (Given that KDTrees have various applications, it's not clear to me there is a good condition for "realistic" datasets.)

pv · 2015-05-22T11:48:44Z

The more recent performance regression in sparse_distance_matrix benchmarks is bea41ad, as indicated in the plots.

sturlamolden · 2015-05-22T13:54:17Z

Did you look at the timings in the ipython notebook? Here you can see the runtime of sparse_distance_matrix reduced from 4.07 s in SciPy 0.15.x to 3.12 s in 0.16.x. Yet your benchmark show a performance regression. I am not sure what is going on here.

sturlamolden · 2015-05-22T14:04:49Z

What I mean by "realistic dataset" is that the data has some structure so that the n-dimensional sample space is not uniformly populated. This is at least the common situation is classification problems.

pv · 2015-05-22T14:05:15Z

Note that it's not really "my" benchmark --- these are the old benchmarks written by the original ckdtree authors. The benchmarks are in the scipy repository --- you can try running them on your machine and to see if you can reproduce the results.

sturlamolden · 2015-11-11T09:05:41Z

I am not expecting the Travis build to succeed (I don't know how to shut it down), but now the merge conflicts are taken care of. The next step is to fix the build issue on SunOS and Cygwin due to ::infinite.

sturlamolden · 2015-11-11T09:08:11Z

But on the positive side, we are up to date with scipy/master again :-)

sturlamolden · 2015-11-11T09:32:00Z

As expected, all the builds are failing because the symbol ::infinity was removed in scipy/master and is now missing.

sturlamolden · 2015-11-11T12:19:39Z

Bugger, I forgot about using prefixes on the commit messages :-(

sturlamolden · 2015-11-11T13:39:22Z

Great, what do you think @rainwoodman ? Are we happy now? More that needs to be done? Fix my last commit messages perhaps (I don't know how), but apart from that, does this look good?

sturlamolden · 2015-11-11T22:14:54Z

@rgommers Should we put an 0.17 milestone on this? This PR is basically done, unless @rainwoodman has anything else to add.

(It remains to verify that it builds on Wintendo. My old PC just died of old age, but I am setting up a virtual box to run on my Mac. Thank God I kept the DVD.)

sturlamolden · 2015-11-11T23:30:04Z

Summary of what is in this PR:

All query methods are in C++ and release the GIL
Build method is in C++ and releases the GIL
Support for periodic spaces for using cKDTree in cosmology simulations
Pickling is supported on Python 2 as well
sparse_distance_matrix can return other containers than dok_matrix which prevents truncation to 0 of distances between excluded pairs.
Many conditional branches are replaced with C++ templates to ensure the fastest execution path
query_ball_point has a major speed improvement over the current implementation
query_ball_point can in additon optionally be multi-threaded

rainwoodman · 2015-11-12T00:23:10Z

Sure. I am happy with this. Thanks!

sturlamolden · 2015-11-12T03:41:26Z

WIP tag is removed.

rainwoodman · 2015-11-13T17:47:56Z

We will likely write a code for 2-point galaxy clustering analysis in cosmology, with an extension to the original Gray and Moore 2000 to the anisotropic case.

This code will be based on the current ckdtree code in this branch, but would add these features to ckdtree:

weighted input data, multiple weights shall be supported. The tree nodes will be augmented to include the sum of weights for proper pruning.
cKDTree(data, leafsize, boxsize, weight)
An anisotropic count_neighbors method with pruning, that returns number of pairs as a function of r and z/r, where z is the distance along a selected axis:
count_neighbors_2d(self, other, r, mu, p)

From my eyes these two features are quite doable. What do your think? @sturlamolden @rgommers ?

We will be testing the full program (which would just blend some parallelism and IO interface with the modified ckdtree) against the standard code used in SDSS galaxy clustering analysis (https://github.com/bareid/xi), and fix bugs along the way.

I am guessing the time-line for a new ckdtree PR will be the next 2 to 3 months, probably not soon enough to make it into next release?

sturlamolden · 2015-11-13T20:17:18Z

They are probably doable. I am not sure what the weights do, though, but now that we use templates it is easy to plug them in without producing a performance regression.

0.17 will probably be tagged and branched soon. It is that time of the year. Then 0.18 will be next summer, or something like that. :-)

sturlamolden · 2015-11-16T02:48:16Z

Can someone please tag this PR with 0.17 milestone?

rgommers · 2015-11-22T18:28:29Z

I've verified that this builds and passes all tests with MinGW (python 3.4, sse3 build). So in it goes. Thanks @sturlamolden and @rainwoodman!

ENH: Port cKDTree query methods to C++, allow pickling on Python 2, and add support for periodic queries

rgommers · 2015-11-22T18:29:08Z

@rainwoodman a next release is never more than 6 months away, so no worries about the timing for the next PR.

rgommers · 2015-11-22T18:30:53Z

I'll send an update for the release notes.

sturlamolden mentioned this pull request May 19, 2015

WIP: Porting the remaining cKDTree query methods to C++ #4860

Closed

sturlamolden changed the title ~~WPI: Port cKDTree query methods to C++~~ WIP: Port cKDTree query methods to C++ May 19, 2015

sturlamolden mentioned this pull request May 19, 2015

Porting the rest of cKDTree to C++ #4855

Closed

ewmoore reviewed May 19, 2015
View reviewed changes

This was referenced May 20, 2015

cKDTree.query_pairs could return a (n, 2) ndarray rather than a set of pairs #3840

Closed

KDTree.sparse_distance_matrix does not handle elements at 0 distance well (Trac #1834) #2353

Closed

rgommers added enhancement A new feature or improvement scipy.spatial labels May 21, 2015

sturlamolden changed the title ~~WIP: Port cKDTree query methods to C++~~ WIP: Port cKDTree query methods to C++ and add support for periodic queries Nov 11, 2015

sturlamolden added 6 commits November 11, 2015 11:53

infinity fixes for C++ files

60cb325

change deps for ckdtree build

eb1d63e

update ckdtree.pyx to use numpy.math.INFINITY

4929848

fix pyflakes and pep8 errors

171efe6

update to use ckdtree_isinf instead of numpy.math.isinf

500bded

change name recarray to ndarray

0aa6741

sturlamolden changed the title ~~WIP: Port cKDTree query methods to C++ and add support for periodic queries~~ WIP: Port cKDTree query methods to C++, allow pickling on Python 2, and add support for periodic queries Nov 11, 2015

fix documentation rendering error

3cc6e72

sturlamolden changed the title ~~WIP: Port cKDTree query methods to C++, allow pickling on Python 2, and add support for periodic queries~~ ENH: Port cKDTree query methods to C++, allow pickling on Python 2, and add support for periodic queries Nov 12, 2015

ev-br added this to the 0.17.0 milestone Nov 16, 2015

rgommers added a commit that referenced this pull request Nov 22, 2015

Merge pull request #4890 from sturlamolden/cpp_ckdtree

ee8eba7

ENH: Port cKDTree query methods to C++, allow pickling on Python 2, and add support for periodic queries

rgommers merged commit ee8eba7 into scipy:master Nov 22, 2015

ev-br mentioned this pull request Nov 22, 2015

allow cKDTree.query taking a list input in k. #3867

Merged

perimosocordiae mentioned this pull request Mar 31, 2016

ENH: sparse: add fast_lil_matrix sparse matrix #6004

Closed

ENH: Port cKDTree query methods to C++, allow pickling on Python 2, and add support for periodic queries #4890

ENH: Port cKDTree query methods to C++, allow pickling on Python 2, and add support for periodic queries #4890

Conversation

sturlamolden commented May 19, 2015

ewmoore May 19, 2015

Choose a reason for hiding this comment

pv May 19, 2015 via email

Choose a reason for hiding this comment

sturlamolden May 19, 2015

Choose a reason for hiding this comment

sturlamolden May 19, 2015

Choose a reason for hiding this comment

sturlamolden commented May 19, 2015

sturlamolden commented May 19, 2015

sturlamolden commented May 20, 2015

sturlamolden commented May 20, 2015

sturlamolden commented May 20, 2015

sturlamolden commented May 21, 2015

rgommers commented May 21, 2015

sturlamolden commented May 21, 2015

sturlamolden commented May 21, 2015

sturlamolden commented May 22, 2015

sturlamolden commented May 22, 2015

sturlamolden commented May 22, 2015

rgommers commented May 22, 2015

rgommers commented May 22, 2015

sturlamolden commented May 22, 2015

sturlamolden commented May 22, 2015

pv commented May 22, 2015 via email

pv commented May 22, 2015

pv commented May 22, 2015

sturlamolden commented May 22, 2015

sturlamolden commented May 22, 2015

pv commented May 22, 2015

sturlamolden commented Nov 11, 2015

sturlamolden commented Nov 11, 2015

sturlamolden commented Nov 11, 2015

sturlamolden commented Nov 11, 2015

sturlamolden commented Nov 11, 2015

sturlamolden commented Nov 11, 2015

sturlamolden commented Nov 11, 2015

rainwoodman commented Nov 12, 2015

sturlamolden commented Nov 12, 2015

rainwoodman commented Nov 13, 2015

sturlamolden commented Nov 13, 2015

sturlamolden commented Nov 16, 2015

rgommers commented Nov 22, 2015

rgommers commented Nov 22, 2015

rgommers commented Nov 22, 2015