Added query_ball_point and related tests to scipy.spatial.cKDTree. #262

patvarilly · 2012-07-05T15:06:49Z

scipy.spatial.cKDTree implements only part of the more general scipy.spatial.KDTree interface in Cython. This pull request ports the method query_ball_point over to cKDTree and adds corresponding unit tests to verify correctness.

I have used a slightly modified version of Anne Archibald's benchmarking script to measure the resulting performance gains (see http://mail.scipy.org/pipermail/scipy-dev/2008-October/009854.html, script copied below). The corresponding numbers show a ~ 250-350-fold speedup (this seems relatively insensitive to the query ball radius, see below):

dimension 3, 10000 points
KDTree constructed: 0.0823722
cKDTree constructed: 0.00159502
KDTree 1000 lookups: 0.574862
cKDTree 1000 lookups: 0.00470614
flat cKDTree 1000 lookups: 0.342408
KDTree 1000 ball lookups: 5.54947
cKDTree 1000 ball lookups: 0.0189772
flat cKDTree 1000 ball lookups: 0.343151
Ball lookups agree? True

dimension 8, 10000 points
KDTree constructed: 0.168782
cKDTree constructed: 0.00218606
KDTree 1000 lookups: 10.7511
cKDTree 1000 lookups: 0.091007
flat cKDTree 1000 lookups: 0.675035
KDTree 1000 ball lookups: 6.13426
cKDTree 1000 ball lookups: 0.017458
flat cKDTree 1000 ball lookups: 0.351695
Ball lookups agree? True

dimension 16, 10000 points
KDTree constructed: 0.11167
cKDTree constructed: 0.00220299
KDTree 1000 lookups: 62.2512
cKDTree 1000 lookups: 1.01098
flat cKDTree 1000 lookups: 1.44906
KDTree 1000 ball lookups: 3.89043
cKDTree 1000 ball lookups: 0.015486
flat cKDTree 1000 ball lookups: 0.352298
Ball lookups agree? True

For a probe radius of 0.5 instead of 0.2 (see script for meaning), I get the following results in 3 dimensions:

dimension 3, 10000 points
KDTree constructed: 0.0826931
cKDTree constructed: 0.00157905
KDTree 1000 lookups: 0.556555
cKDTree 1000 lookups: 0.00440192
flat cKDTree 1000 lookups: 0.338442
KDTree 1000 ball lookups: 16.3933
cKDTree 1000 ball lookups: 0.0493851
flat cKDTree 1000 ball lookups: 0.423063
Ball lookups agree? True

As far as I can tell, the issue is that KDTree.query_ball_point is written for readability, but ends up doing large numbers of list operations to build the results.

From here on, it should not be much harder to port over the remaining methods of KDTree to cKDTree, although these are not a current priority for me. As an aside, it seemed at first glance that the public interfaces for KDTree.query() and cKDTree.query() have fallen out of sync, though I didn't check thoroughly.

All the best,
Patrick

import numpy as np
import time

from scipy.spatial import KDTree, cKDTree

m = 3      # dimension
n = 10000  # number of data points
r = 1000   # number of query points

# Two overlapping Gaussian blobs, the second shifted by (1, ..., 1)
data = np.concatenate((np.random.randn(n//2, m),
                       np.random.randn(n - n//2, m) + np.ones(m)))
queries = np.concatenate((np.random.randn(r//2, m),
                          np.random.randn(r - r//2, m) + np.ones(m)))

print("dimension %d, %d points" % (m, n))

t = time.time()
T1 = KDTree(data)
print("KDTree constructed:\t%g" % (time.time()-t))
t = time.time()
T2 = cKDTree(data)
print("cKDTree constructed:\t%g" % (time.time()-t))

# Nearest-neighbour lookups
t = time.time()
w = T1.query(queries)
print("KDTree %d lookups:\t%g" % (r, time.time()-t))
del w

t = time.time()
w = T2.query(queries)
print("cKDTree %d lookups:\t%g" % (r, time.time()-t))
del w

# leafsize=n puts every point in a single leaf, i.e. brute force
T3 = cKDTree(data, leafsize=n)
t = time.time()
w = T3.query(queries)
print("flat cKDTree %d lookups:\t%g" % (r, time.time()-t))
del w

# Ball lookups; 0.2 is the query ball (probe) radius
t = time.time()
w1 = T1.query_ball_point(queries, 0.2)
print("KDTree %d ball lookups:\t%g" % (r, time.time()-t))

t = time.time()
w2 = T2.query_ball_point(queries, 0.2)
print("cKDTree %d ball lookups:\t%g" % (r, time.time()-t))

t = time.time()
w3 = T3.query_ball_point(queries, 0.2)
print("flat cKDTree %d ball lookups:\t%g" % (r, time.time()-t))

# The three implementations must return the same neighbours (in any order)
all_good = True
for a, b in zip(w1, w2):
    if sorted(a) != sorted(b):
        all_good = False
for a, b in zip(w1, w3):
    if sorted(a) != sorted(b):
        all_good = False

print("Ball lookups agree? %s" % str(all_good))
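The agreement check at the end of the script compares the tree implementations with each other. A minimal sketch of an independent check against brute force follows; the seed, the array sizes, and the helper `brute_force_ball` are assumptions made for illustration and are not part of this pull request.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.RandomState(0)   # fixed seed so the check is repeatable
data = rng.randn(500, 3)
queries = rng.randn(20, 3)
radius = 0.5

tree = cKDTree(data)
# For a 2-D array of queries, the result holds one list of indices per query.
tree_hits = tree.query_ball_point(queries, radius)


def brute_force_ball(points, centre, radius):
    """Indices of all points within `radius` of `centre` (Euclidean norm)."""
    dist = np.linalg.norm(points - centre, axis=1)
    return sorted(np.nonzero(dist <= radius)[0].tolist())


# Count the queries whose neighbour list differs from the brute-force answer.
mismatches = sum(
    sorted(hits) != brute_force_ball(data, q, radius)
    for hits, q in zip(tree_hits, queries)
)
print("queries that disagree with brute force: %d" % mismatches)
```

The index lists are sorted before comparison because query_ball_point does not promise any particular order.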

patvarilly · 2012-07-08T23:23:51Z

A quick benchmark of query_ball_tree in cKDTree vs KDTree (code below). The speedup here is also ~300-fold:

dimension 3, 1000 points in tree 1, 1000 points in tree 2
KDTrees constructed: 0.0186899
cKDTrees constructed: 0.000445127
KDTree query_ball_tree(r = 0.5): 2.69068
cKDTree query_ball_tree(r = 0.5): 0.00940299

import numpy as np
import time

from scipy.spatial import KDTree, cKDTree

m = 3      # dimension
n1 = 1000  # points in tree 1
n2 = 1000  # points in tree 2

data1 = np.concatenate((np.random.randn(n1//2, m),
                        np.random.randn(n1 - n1//2, m) + np.ones(m)))
data2 = np.concatenate((np.random.randn(n2//2, m),
                        np.random.randn(n2 - n2//2, m) + np.ones(m)))

print("dimension %d, %d points in tree 1, %d points in tree 2" % (m, n1, n2))

t = time.time()
T1 = KDTree(data1)
T2 = KDTree(data2)
print("KDTrees constructed:\t%g" % (time.time()-t))
t = time.time()
cT1 = cKDTree(data1)
cT2 = cKDTree(data2)
print("cKDTrees constructed:\t%g" % (time.time()-t))

t = time.time()
T1.query_ball_tree(T2, 0.5)
print("KDTree query_ball_tree(r = 0.5):\t%g" % (time.time()-t))

t = time.time()
cT1.query_ball_tree(cT2, 0.5)
print("cKDTree query_ball_tree(r = 0.5):\t%g" % (time.time()-t))
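The benchmark above only times query_ball_tree and discards the result. As a rough illustration of what the call returns, the sketch below checks it against per-point query_ball_point calls on the second tree; the seed, sizes, and variable names are assumptions chosen for the example.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.RandomState(1)
data1 = rng.randn(200, 3)
data2 = rng.randn(300, 3) + 1.0
radius = 0.5

tree1 = cKDTree(data1)
tree2 = cKDTree(data2)

# One list per point of tree1, holding indices into tree2's data.
neighbours = tree1.query_ball_tree(tree2, radius)

# The same information, obtained by querying tree2 with each point of data1.
per_point = tree2.query_ball_point(data1, radius)

consistent = all(sorted(a) == sorted(b)
                 for a, b in zip(neighbours, per_point))
print(len(neighbours), consistent)
```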

patvarilly · 2012-07-09T21:02:46Z

I've simplified the logic in KDTree.query_pairs, then ported the simplified logic to cKDTree (basically, being careful about not visiting the same pair of nodes twice, instead of keeping track of visited pairs in a set and testing against the set every time a new node pair is visited). Here is a quick benchmark (cKDTree version is ~750-fold faster with respect to old query_pairs, ~ 370-fold faster than the revised version):

dimension 3, 1000 points in tree
KDTree constructed: 0.00983906
cKDTree constructed: 0.000268936
KDTree old query_pairs(r = 0.5): 2.66407
KDTree query_pairs(r = 0.5): 1.31732
cKDTree query_pairs(r = 0.5): 0.00354481

import numpy as np
import time

from scipy.spatial import KDTree, cKDTree

m = 3     # dimension
n = 1000  # points in tree

data = np.concatenate((np.random.randn(n//2, m),
                       np.random.randn(n - n//2, m) + np.ones(m)))

print("dimension %d, %d points in tree" % (m, n))

t = time.time()
T = KDTree(data)
print("KDTree constructed:\t%g" % (time.time()-t))
t = time.time()
cT = cKDTree(data)
print("cKDTree constructed:\t%g" % (time.time()-t))

t = time.time()
T.old_query_pairs(0.5)  # Manually reinserted old query_pairs() back into KDTree for this benchmark
print("KDTree old query_pairs(r = 0.5):\t%g" % (time.time()-t))

t = time.time()
T.query_pairs(0.5)
print("KDTree query_pairs(r = 0.5):\t%g" % (time.time()-t))

t = time.time()
cT.query_pairs(0.5)
print("cKDTree query_pairs(r = 0.5):\t%g" % (time.time()-t))
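The simplification described in this comment (never visiting the same pair of nodes twice, rather than recording visited pairs in a set) can be illustrated with a toy. The sketch below is not the KDTree or cKDTree code: it assumes sorted 1-D points and an implicit binary tree over index ranges, and the function `query_pairs_1d` is hypothetical. The `visits` list exists only so that the absence of repeats can be checked afterwards.

```python
import random


def query_pairs_1d(xs, r, leafsize=4):
    """All pairs (i, j), i < j, with xs[j] - xs[i] <= r, for sorted xs."""
    pairs = []
    visits = []  # node pairs visited, kept only to show there are no repeats

    def split(lo, hi):
        mid = (lo + hi) // 2
        return (lo, mid), (mid, hi)

    def traverse(a, b):
        # Invariant: a == b, or every index in a precedes every index in b.
        visits.append((a, b))
        (alo, ahi), (blo, bhi) = a, b
        if a != b and xs[blo] - xs[ahi - 1] > r:
            return  # the closest possible pair is already too far apart
        a_leaf = ahi - alo <= leafsize
        b_leaf = bhi - blo <= leafsize
        if a_leaf and b_leaf:
            for i in range(alo, ahi):
                # For a node paired with itself, start at i + 1 so that each
                # unordered pair is produced exactly once.
                for j in range(i + 1 if a == b else blo, bhi):
                    if xs[j] - xs[i] <= r:
                        pairs.append((i, j))
        elif a == b:
            left, right = split(alo, ahi)
            traverse(left, left)
            traverse(left, right)   # (right, left) is deliberately skipped
            traverse(right, right)
        else:
            a_parts = [a] if a_leaf else split(alo, ahi)
            b_parts = [b] if b_leaf else split(blo, bhi)
            for pa in a_parts:
                for pb in b_parts:
                    traverse(pa, pb)

    traverse((0, len(xs)), (0, len(xs)))
    return pairs, visits


random.seed(0)
xs = sorted(random.random() for _ in range(200))
pairs, visits = query_pairs_1d(xs, 0.05)
brute = [(i, j) for i in range(len(xs)) for j in range(i + 1, len(xs))
         if xs[j] - xs[i] <= 0.05]
print(sorted(pairs) == brute, len(visits) == len(set(visits)))
```

The key step is the `a == b` branch: a node paired with itself recurses into (left, left), (left, right) and (right, right) only, so the mirrored pair (right, left) is never generated and no visited-set lookup is needed.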

patvarilly · 2012-07-11T22:17:19Z

The benchmark for sparse_distance_matrix:

dimension 3, 1000 points in tree 1, 1000 points in tree 2
KDTrees constructed: 0.0191472
cKDTrees constructed: 0.000463009
KDTree sparse_distance_matrix(r = 0.2): 4.14861
cKDTree sparse_distance_matrix(r = 0.2): 0.0185668
KDTree sparse_distance_matrix(r = 0.5): 7.69248
cKDTree sparse_distance_matrix(r = 0.5): 0.037811

import numpy as np
import time

from scipy.spatial import KDTree, cKDTree

m = 3      # dimension
n1 = 1000  # points in tree 1
n2 = 1000  # points in tree 2

data1 = np.concatenate((np.random.randn(n1//2, m),
                        np.random.randn(n1 - n1//2, m) + np.ones(m)))
data2 = np.concatenate((np.random.randn(n2//2, m),
                        np.random.randn(n2 - n2//2, m) + np.ones(m)))

print("dimension %d, %d points in tree 1, %d points in tree 2" % (m, n1, n2))

t = time.time()
T1 = KDTree(data1)
T2 = KDTree(data2)
print("KDTrees constructed:\t%g" % (time.time()-t))
t = time.time()
cT1 = cKDTree(data1)
cT2 = cKDTree(data2)
print("cKDTrees constructed:\t%g" % (time.time()-t))

t = time.time()
T1.sparse_distance_matrix(T2, 0.2)
print("KDTree sparse_distance_matrix(r = 0.2):\t%g" % (time.time()-t))

t = time.time()
cT1.sparse_distance_matrix(cT2, 0.2)
print("cKDTree sparse_distance_matrix(r = 0.2):\t%g" % (time.time()-t))

t = time.time()
T1.sparse_distance_matrix(T2, 0.5)
print("KDTree sparse_distance_matrix(r = 0.5):\t%g" % (time.time()-t))

t = time.time()
cT1.sparse_distance_matrix(cT2, 0.5)
print("cKDTree sparse_distance_matrix(r = 0.5):\t%g" % (time.time()-t))
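For readers unfamiliar with sparse_distance_matrix, the sketch below shows what the benchmarked call computes by comparing it with a dense distance matrix from scipy.spatial.distance.cdist that has been zeroed beyond the cutoff. The seed, sizes, and variable names are assumptions made for the example.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist

rng = np.random.RandomState(2)
data1 = rng.randn(150, 3)
data2 = rng.randn(120, 3) + 1.0
max_distance = 0.5

tree1 = cKDTree(data1)
tree2 = cKDTree(data2)

# Entry (i, j) is the distance between data1[i] and data2[j] when that
# distance is at most max_distance; all other entries are left at zero.
sparse = tree1.sparse_distance_matrix(tree2, max_distance)

dense = cdist(data1, data2)
expected = np.where(dense <= max_distance, dense, 0.0)

print(sparse.shape, np.allclose(sparse.toarray(), expected))
```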

patvarilly · 2012-07-12T23:30:27Z

cKDTree is now feature complete with respect to KDTree.

Here is a benchmark for cKDTree.count_neighbors. The speedup here isn't as great because my current Cython code ends up allocating and freeing an array on every call to __count_neighbors_traverse. A better solution would be to allocate as many arrays as there are levels in the two trees passed to count_neighbors at the beginning, and then use these arrays as a stack as you go up and down the trees. That said, a 20-fold improvement for now is not too bad, either.

dimension 3, 1000 points in tree 1, 1000 in tree 2
KDTrees constructed: 0.0217071
cKDTrees constructed: 0.000464916
KDTree count_neighbors(r = 0.5): 1.88414
cKDTree count_neighbors(r = 0.5): 0.07599

import numpy as np
import time

from scipy.spatial import KDTree, cKDTree

m = 3      # dimension
n1 = 1000  # points in tree 1
n2 = 1000  # points in tree 2

data1 = np.concatenate((np.random.randn(n1//2, m),
                        np.random.randn(n1 - n1//2, m) + np.ones(m)))
data2 = np.concatenate((np.random.randn(n2//2, m),
                        np.random.randn(n2 - n2//2, m) + np.ones(m)))

print("dimension %d, %d points in tree 1, %d in tree 2" % (m, n1, n2))

t = time.time()
T1 = KDTree(data1)
T2 = KDTree(data2)
print("KDTrees constructed:\t%g" % (time.time()-t))
t = time.time()
cT1 = cKDTree(data1)
cT2 = cKDTree(data2)
print("cKDTrees constructed:\t%g" % (time.time()-t))

t = time.time()
T1.count_neighbors(T2, 0.5)
print("KDTree count_neighbors(r = 0.5):\t%g" % (time.time()-t))

t = time.time()
cT1.count_neighbors(cT2, 0.5)
print("cKDTree count_neighbors(r = 0.5):\t%g" % (time.time()-t))
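The "one array per tree level, used as a stack" idea mentioned at the top of this comment can be illustrated outside Cython. The sketch below is a toy and not the ckdtree.pyx code: the box-halving recursion and the functions `leaves_volume_alloc` and `leaves_volume_stack` are hypothetical stand-ins for a traversal that needs a scratch array in every call.

```python
def volume(box):
    """Volume of a box stored flat as [lo0, hi0, lo1, hi1, ...]."""
    v = 1.0
    for k in range(0, len(box), 2):
        v *= box[k + 1] - box[k]
    return v


def leaves_volume_alloc(box, depth, max_depth, counter):
    """Halve the box recursively, allocating a fresh list for every child."""
    if depth == max_depth:
        return volume(box)
    axis = depth % (len(box) // 2)
    mid = 0.5 * (box[2 * axis] + box[2 * axis + 1])
    total = 0.0
    for side in (0, 1):
        child = list(box)          # fresh allocation for every child
        counter[0] += 1
        child[2 * axis + (1 - side)] = mid   # side 0 moves hi, side 1 moves lo
        total += leaves_volume_alloc(child, depth + 1, max_depth, counter)
    return total


def leaves_volume_stack(box, max_depth):
    """Same recursion, but with one pre-allocated buffer per level."""
    size = len(box)
    stack = [[0.0] * size for _ in range(max_depth + 1)]  # allocated once
    stack[0][:] = box

    def recurse(depth):
        current = stack[depth]
        if depth == max_depth:
            return volume(current)
        axis = depth % (size // 2)
        mid = 0.5 * (current[2 * axis] + current[2 * axis + 1])
        child = stack[depth + 1]   # reused by every call at this level
        total = 0.0
        for side in (0, 1):
            child[:] = current     # overwrite in place, no new list
            child[2 * axis + (1 - side)] = mid
            total += recurse(depth + 1)
        return total

    return recurse(0)


box = [0.0, 1.0, 0.0, 2.0, 0.0, 4.0]
counter = [0]
a = leaves_volume_alloc(box, 0, 6, counter)
b = leaves_volume_stack(box, 6)
print(a, b, counter[0])
```

Both versions sum the same leaf volumes; the first creates one list per child visited, while the second creates only `max_depth + 1` buffers regardless of how many nodes are visited.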

sturlamolden · 2012-07-13T16:47:00Z

Countless integer size errors.

Bug fixes:

https://github.com/sturlamolden/scipy/blob/master/scipy/spatial/ckdtree.pyx

sturlamolden · 2012-07-14T01:26:29Z

The code is also full of missing variable declarations. Almost none of the range loops are correctly cythonized due to this, and also many other speed-critical variables lack declaration.

rgommers · 2012-07-14T11:20:25Z

As commented on the mailing list (http://thread.gmane.org/gmane.comp.python.scientific.devel/16703/focus=16734), here are Sturla's edits on top of Patrick's: https://github.com/rgommers/scipy/tree/fixes-sturla

rgommers · 2012-07-15T19:35:39Z

Patrick, is the last commit you added ("Numerous bug fixes") the same as the last 5 commits of https://github.com/rgommers/scipy/tree/fixes-sturla? I spent some effort to port them and give them reasonable commit messages, so it would be better if you could use those. If you're not sure how, I can help with that.

patvarilly · 2012-07-15T20:38:11Z

Ralf, that's right, I was trying to bring the changes into my branch to address some of the issues that Sturla brought up, but I did it by copying the final files from fixes-sturla and committing them. Could you let me know the correct way to bring these in?

sturlamolden · 2012-07-15T20:40:00Z

The tests for query_pairs still fail on Windows 64.

rgommers · 2012-07-15T21:06:21Z

To remove the last commit from your branch and this PR:

$ git reset --hard HEAD^   # go one commit back
$ git push origin ckdtree-improvements  --force   # assume your github remote is called "origin"

Then to grab commits from my repo

$ git remote add rgommers https://github.com/rgommers/scipy.git
$ git fetch rgommers
$ git merge rgommers/fixes-sturla

Then push that back to Github so it shows up here. The second step above can also be replaced by asking me to send you a PR with my changes.

You can also use git cherry-pick commit-id instead of using git merge. Then you can carry over commits one by one. In this case merge should be easier though, because it takes all commits at once.

patvarilly · 2012-07-15T21:27:52Z

Thanks for the detailed instructions! All set now, hopefully. Will comment on the substantive parts before going to bed tonight.

patvarilly · 2012-07-19T22:49:49Z

Following up on Sturla's suggestion, I implemented a version of query that has the same recursive traversal structure as all the other methods, and that uses the rectangle-to-point distance tracker class for readability. Unfortunately, it's about 3 times as slow as the old version of query:

dimension 3, 10000 points
KDTree constructed: 0.0961771
cKDTree constructed: 0.00191593
KDTree 1000 lookups: 0.666298
cKDTree 1000 new lookups: 0.011121
cKDTree 1000 old lookups: 0.00368786
flat cKDTree 1000 new lookups: 0.079838
flat cKDTree 1000 old lookups: 0.0734

As far as I can tell, this is because the old version of query uses a priority queue to continuously select the closest unexamined kd-tree node for processing. This is something that can't be done in a recursive algorithm. I committed the code for historical curiosity, but will remove it in the following commit.
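To make the priority-queue point concrete, here is a minimal best-first nearest-neighbour sketch in pure Python. It is not the cKDTree.query implementation: the toy tree layout and the functions `build` and `nearest` are assumptions made for illustration, and the lower bound used for the far child is a simple one rather than the tightest possible.

```python
import heapq
import itertools
import random


def build(points, idx, depth=0, leafsize=8):
    """Toy kd-tree: ('leaf', indices) or ('inner', axis, split, left, right)."""
    if len(idx) <= leafsize:
        return ("leaf", idx)
    axis = depth % len(points[0])
    idx = sorted(idx, key=lambda i: points[i][axis])
    mid = len(idx) // 2
    split = points[idx[mid]][axis]
    return ("inner", axis, split,
            build(points, idx[:mid], depth + 1, leafsize),
            build(points, idx[mid:], depth + 1, leafsize))


def nearest(points, tree, q):
    """Best-first search; returns ((squared distance, index), nodes opened)."""
    tie = itertools.count()           # tie-breaker so nodes are never compared
    heap = [(0.0, next(tie), tree)]   # (lower bound on squared distance, _, node)
    best = (float("inf"), -1)
    opened = 0
    while heap:
        bound, _, node = heapq.heappop(heap)
        if bound >= best[0]:
            break                     # every remaining node is at least this far
        opened += 1
        if node[0] == "leaf":
            for i in node[1]:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], q))
                best = min(best, (d2, i))
        else:
            _, axis, split, left, right = node
            diff = q[axis] - split
            near, far = (left, right) if diff < 0 else (right, left)
            # The near child inherits the parent's bound; the far child lies
            # across the splitting plane, so it is at least |diff| away.
            heapq.heappush(heap, (bound, next(tie), near))
            heapq.heappush(heap, (max(bound, diff * diff), next(tie), far))
    return best, opened


random.seed(0)
pts = [(random.random(), random.random(), random.random())
       for _ in range(2000)]
tree = build(pts, list(range(len(pts))))
q = (0.3, 0.6, 0.9)
found, opened = nearest(pts, tree, q)
brute = min((sum((a - b) ** 2 for a, b in zip(p, q)), i)
            for i, p in enumerate(pts))
print(found == brute, opened)
```

Because the heap always yields the node with the smallest lower bound, the search can stop as soon as that bound reaches the best distance found so far. A depth-first recursion commits to one subtree at a time and cannot reorder the pending nodes globally in this way.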

patvarilly · 2012-07-20T11:23:09Z

Updated benchmarks with current version of the code. I ran these on a different machine, but the improvement factors should be mostly unaffected:

query_ball_point: ~ 200-fold improvement
query_ball_tree: ~ 350-fold improvement
query_pairs: ~ 350-fold improvement
count_neighbors: ~ 70-fold improvement
sparse_distance_matrix: ~ 2000- to 3000-fold improvement (improvement from previous benchmark due to cythonizing loop indices)

So making the code immensely more readable accrues only a slight speed penalty for some functions, and has made others (notably sparse_distance_matrix) much faster.

query_ball_point (query radius 0.5, same benchmark code as above)

dimension 3, 10000 points
KDTree 1000 ball lookups: 5.31274
cKDTree 1000 ball lookups: 0.024559
flat cKDTree 1000 ball lookups: 0.0826628

dimension 8, 10000 points
KDTree 1000 ball lookups: 5.80942
cKDTree 1000 ball lookups: 0.024523
flat cKDTree 1000 ball lookups: 0.0908511

dimension 16, 10000 points
KDTree 1000 ball lookups: 3.98487
cKDTree 1000 ball lookups: 0.022228
flat cKDTree 1000 ball lookups: 0.0916049

query_ball_tree

dimension 3, 1000 points in tree 1, 1000 points in tree 2
KDTree query_ball_tree(r = 0.5): 2.34604
cKDTree query_ball_tree(r = 0.5): 0.00636292

query_pairs

dimension 3, 1000 points in tree
KDTree query_pairs(r = 0.5): 1.2968
cKDTree query_pairs(r = 0.5): 0.00355601

count_neighbors

dimension 3, 1000 points in tree 1, 1000 in tree 2
KDTree count_neighbors(r = 0.5): 1.71428
cKDTree count_neighbors(r = 0.5): 0.0246949

sparse_distance_matrix

dimension 3, 1000 points in tree 1, 1000 points in tree 2
KDTree sparse_distance_matrix(r = 0.2): 3.69791
cKDTree sparse_distance_matrix(r = 0.2): 0.00170898
KDTree sparse_distance_matrix(r = 0.5): 6.51638
cKDTree sparse_distance_matrix(r = 0.5): 0.00226092
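The improvement factors quoted at the top of this comment are simply the ratio of the KDTree time to the cKDTree time. The sketch below recomputes them from the timings listed above; the dictionary layout and labels are just a convenience for the example.

```python
# (KDTree seconds, cKDTree seconds), copied from the listings in this comment
timings = {
    "query_ball_point (dim 3)": (5.31274, 0.024559),
    "query_ball_tree": (2.34604, 0.00636292),
    "query_pairs": (1.2968, 0.00355601),
    "count_neighbors": (1.71428, 0.0246949),
    "sparse_distance_matrix (r = 0.2)": (3.69791, 0.00170898),
    "sparse_distance_matrix (r = 0.5)": (6.51638, 0.00226092),
}

# Speedup = slow time / fast time
speedups = {name: slow / fast for name, (slow, fast) in timings.items()}
for name, factor in speedups.items():
    print("%-34s %6.0f-fold" % (name, factor))
```

The resulting factors come out at roughly 216, 369, 365, 69, 2164 and 2882, which is in line with the rounded figures in the summary list.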

rgommers · 2012-07-20T15:54:57Z

@patvarilly you used a script adapted from the mailing list, right? It may be useful to put that under spatial/benchmarks and make it available as spatial.bench(). You can find an example of how this is done in scipy/linalg (although it seems to be partially broken, with missing imports).

rgommers · 2012-07-20T15:55:36Z

And: timings look great, nice job!

sturlamolden · 2012-07-26T18:24:15Z

All tests pass on Windows 64 and Python 2.7. The code looks nice and clean now. I think it's ok to go into SciPy if the tests pass on the other platforms as well. Patrick's rewrite removes several errors from the current cKDTree code (integer size and exception handling) and makes it fully compatible with KDTree (i.e. can be used as a drop-in replacement). We should still look into GIL handling, but this should be postponed to a second update. KDTree should be kept as it is easier to extend from Python.

rgommers · 2012-07-26T21:57:49Z

All tests pass on OS X with Python 3.2 too, so I guess this is ready. bench() would still be nice to have.

sturlamolden · 2012-07-30T14:22:53Z

I have placed a ticket for this PR on the trac (ticket #1712).

sturlamolden · 2012-07-30T14:30:32Z

On Win64 .query() should return an array of dtype=int if a C long does not overflow. There is only a test for this on scalar return values.
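The dtype concern can be sketched with NumPy alone. On 64-bit Windows a C long is 32 bits wide while array indices (np.intp) are 64 bits, so an index array can only be narrowed when every value fits. The helper `as_c_long_if_safe` below is hypothetical and only illustrates that check; it is not the code from this pull request.

```python
import numpy as np

# C long: 32-bit on 64-bit Windows, 64-bit on most 64-bit Unix systems
C_LONG = np.dtype("l")


def as_c_long_if_safe(indices):
    """Return `indices` as a C-long array when every value fits, else as intp."""
    indices = np.asarray(indices, dtype=np.intp)
    info = np.iinfo(C_LONG)
    fits = indices.size == 0 or (indices.min() >= info.min
                                 and indices.max() <= info.max)
    return indices.astype(C_LONG) if fits else indices


small = as_c_long_if_safe([0, 5, 17])
print(small.dtype == C_LONG, small.tolist())
```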

sturlamolden · 2012-07-30T17:38:20Z

I have an implementation of the last issue ready. It still needs testing, I will send Patrick a pull request when I'm done.
For now it is here:
https://github.com/sturlamolden/scipy/tree/ckdtree-improvements

sturlamolden · 2012-07-30T19:29:21Z

OK, the tests pass. I've sent a pull request to Patrick that should be merged first.

rgommers · 2012-08-25T15:06:24Z

@patvarilly have you had a chance to look at Sturla's PR? Would be nice to get this merged soon.

patvarilly · 2012-09-06T22:58:01Z

Alright, Sturla's PR is (finally) in, and the benchmarks are now self-contained in spatial/benchmarks (runnable by saying "import scipy.spatial; scipy.spatial.bench()"). As far as I can tell, that would wrap up this pull request and make it ready for merging. My apologies for the extreme delay in getting this last bit done. I look forward to your comments.

sturlamolden · 2012-09-07T11:07:47Z

We should regenerate ckdtree.c with Cython 0.17 to avoid any Cython-related bugs, and re-run all the tests. I haven't had time to install the latest Cython yet. It was released on September 1st.

patvarilly · 2012-09-07T15:01:12Z

OK, regenerated ckdtree.c with Cython 0.17. I re-ran the tests and they all pass.

rgommers · 2012-09-07T17:49:41Z

Benchmarks look good. They run a bit slow but I don't think that's a problem. And hard to avoid if you want to show differences of O(1000) without printing too many digits.

rgommers · 2012-09-07T17:52:13Z

All tests pass for me too. Ready to merge I think.

rgommers · 2012-09-07T18:17:59Z

Checked the change to setup.py, this isn't needed for setupscons.py or bento.

rgommers · 2012-09-07T18:21:56Z

One more request @patvarilly: would you mind adding a short paragraph describing the main changes in doc/release/0.12.0-notes?

patvarilly · 2012-09-09T11:53:35Z

All set. I don't know what caused the indentation to go a bit off, but it's now fixed. What do you mean by "Checked the change to setup.py, this isn't needed for setupscons.py or bento"? Should I have made any other changes to make the benchmarks work? I usually compile and install SciPy by running "python setup.py install --user" in my scipy directory.

rgommers · 2012-09-09T19:50:14Z

No other changes are needed. We have 3 build systems, one based on numpy.distutils (setup.py files), one based on numscons (SConscript files) and one based on Bento (bento.info / bscript files). So for most changes to setup.py, those need to be made for all build systems. In this case that wasn't needed.

rgommers · 2012-09-09T19:50:43Z

Thanks for adding to the release notes and fixing the indentation.

rgommers · 2012-09-09T19:52:28Z

Merged in eec9174. Thanks again Patrick and Sturla!

rgommers mentioned this pull request Aug 6, 2012

DOC: add info on recursion limit to KDTree docstring. #281

Merged

patvarilly added 3 commits September 6, 2012 00:01

First benchmarks incorporated

40b2a47

Finished cKDTree benchmark

6d4fd65

Merge remote-tracking branch 'origin/ckdtree-improvements' into ckdtr…

554d339

…ee-improvements

Regenerated ckdtree.pyx with Cython 0.17, all tests pass

4ef8b1a

Added note to 0.12 release notes, fixed indentation in ckdtree.pyx

f8371b9

rgommers closed this Sep 9, 2012

scipy-gitbot mentioned this pull request Apr 25, 2013

Enchancement and bugfix to cKDTree (pull request 262) (Trac #1712) #2231

Closed

sjsrey mentioned this pull request Jul 1, 2013

Transition to scipy.spatial.cKDTree from scipy.spatial.KDTree pysal/pysal#263

Closed
