Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

Added query_ball_point and related tests to scipy.spatial.cKDTree. #262

Closed
wants to merge 5 commits into from
+2,964 −2,736

3 participants

@patvarilly

scipy.spatial.cKDTree implements only part of the more general scipy.spatial.KDTree interface in Cython. This pull requests ports over the method query_ball_point to cKDTree and adds corresponding unit tests to verify correctness.

I have used a slightly modified version of Anne Archibald's benchmarking script to measure the resulting performance gains (see http://mail.scipy.org/pipermail/scipy-dev/2008-October/009854.html, script copied below). The corresponding numbers show a ~ 250-350-fold speedup (this seems relatively insensitive to the query ball radius, see below):

dimension 3, 10000 points
KDTree constructed: 0.0823722
cKDTree constructed: 0.00159502
KDTree 1000 lookups: 0.574862
cKDTree 1000 lookups: 0.00470614
flat cKDTree 1000 lookups: 0.342408
KDTree 1000 ball lookups: 5.54947
cKDTree 1000 ball lookups: 0.0189772
flat cKDTree 1000 ball lookups: 0.343151
Ball lookups agree? True

dimension 8, 10000 points
KDTree constructed: 0.168782
cKDTree constructed: 0.00218606
KDTree 1000 lookups: 10.7511
cKDTree 1000 lookups: 0.091007
flat cKDTree 1000 lookups: 0.675035
KDTree 1000 ball lookups: 6.13426
cKDTree 1000 ball lookups: 0.017458
flat cKDTree 1000 ball lookups: 0.351695
Ball lookups agree? True

dimension 16, 10000 points
KDTree constructed: 0.11167
cKDTree constructed: 0.00220299
KDTree 1000 lookups: 62.2512
cKDTree 1000 lookups: 1.01098
flat cKDTree 1000 lookups: 1.44906
KDTree 1000 ball lookups: 3.89043
cKDTree 1000 ball lookups: 0.015486
flat cKDTree 1000 ball lookups: 0.352298
Ball lookups agree? True

For a probe radius of 0.5 instead of 0.2 (see script for meaning), I get the following results in 3 dimensions:

dimension 3, 10000 points
KDTree constructed: 0.0826931
cKDTree constructed: 0.00157905
KDTree 1000 lookups: 0.556555
cKDTree 1000 lookups: 0.00440192
flat cKDTree 1000 lookups: 0.338442
KDTree 1000 ball lookups: 16.3933
cKDTree 1000 ball lookups: 0.0493851
flat cKDTree 1000 ball lookups: 0.423063
Ball lookups agree? True

As far as I can tell, the issue is that KDTree.query_ball_point is written for readability, but ends up doing large numbers of list operations to build the results.

From here on, it should not be much harder to port over the remaining methods of KDTree to cKDTree, although these are not a current priority for me. As an aside, it seemed at first glance that the public interfaces for KDTree.query() and cKDTree.query() have fallen out of sync, though I didn't check thoroughly.

All the best,
Patrick


import numpy as np
import time

from scipy.spatial import KDTree, cKDTree

m = 3
n = 10000
r = 1000

data = np.concatenate((np.random.randn(n//2,m),
np.random.randn(n-n//2,m)+np.ones(m)))
queries = np.concatenate((np.random.randn(r//2,m),
np.random.randn(r-r//2,m)+np.ones(m)))

print "dimension %d, %d points" % (m,n)

t = time.time()
T1 = KDTree(data)
print "KDTree constructed:\t%g" % (time.time()-t)
t = time.time()
T2 = cKDTree(data)
print "cKDTree constructed:\t%g" % (time.time()-t)

t = time.time()
w = T1.query(queries)
print "KDTree %d lookups:\t%g" % (r, time.time()-t)
del w

t = time.time()
w = T2.query(queries)
print "cKDTree %d lookups:\t%g" % (r, time.time()-t)
del w

T3 = cKDTree(data,leafsize=n)
t = time.time()
w = T3.query(queries)
print "flat cKDTree %d lookups:\t%g" % (r, time.time()-t)
del w

t = time.time()
w1 = T1.query_ball_point(queries, 0.2)
print "KDTree %d ball lookups:\t%g" % (r, time.time()-t)

t = time.time()
w2 = T2.query_ball_point(queries, 0.2)
print "cKDTree %d ball lookups:\t%g" % (r, time.time()-t)

t = time.time()
w3 = T3.query_ball_point(queries, 0.2)
print "flat cKDTree %d ball lookups:\t%g" % (r, time.time()-t)

all_good = True
for a, b in zip(w1, w2):
if sorted(a) != sorted(b):
all_good = False
for a, b in zip(w1, w3):
if sorted(a) != sorted(b):
all_good = False

print "Ball lookups agree? %s" % str(all_good)

@patvarilly

A quick benchmark of query_ball_tree in cKDTree vs KDTree (code below). The speedup here is also ~300-fold:

dimension 3, 1000 points in tree 1, 1000 points in tree 2
KDTrees constructed: 0.0186899
cKDTrees constructed: 0.000445127
KDTree query_ball_tree(r = 0.5): 2.69068
cKDTree query_ball_tree(r = 0.5): 0.00940299


import numpy as np
import time

from scipy.spatial import KDTree, cKDTree

m = 3
n1 = 1000
n2 = 1000

data1 = np.concatenate((np.random.randn(n1//2,m),
np.random.randn(n1-n1//2,m)+np.ones(m)))
data2 = np.concatenate((np.random.randn(n2//2,m),
np.random.randn(n2-n2//2,m)+np.ones(m)))

print "dimension %d, %d points in tree 1, %d points in tree 2" % (m,n1,n2)

t = time.time()
T1 = KDTree(data1)
T2 = KDTree(data2)
print "KDTrees constructed:\t%g" % (time.time()-t)
t = time.time()
cT1 = cKDTree(data1)
cT2 = cKDTree(data1)
print "cKDTrees constructed:\t%g" % (time.time()-t)

t = time.time()
T1.query_ball_tree(T2, 0.5)
print "KDTree query_ball_tree(r = 0.5):\t%g" % (time.time()-t)

t = time.time()
cT1.query_ball_tree(cT2, 0.5)
print "cKDTree query_ball_tree(r = 0.5):\t%g" % (time.time()-t)

@patvarilly

I've simplified the logic in KDTree.query_pairs, then ported the simplified logic to cKDTree (basically, being careful about not visiting the same pair of nodes twice, instead of keeping track of visited pairs in a set and testing against the set every time a new node pair is visited). Here is a quick benchmark (cKDTree version is ~750-fold faster with respect to old query_pairs, ~ 370-fold faster than the revised version):

dimension 3, 1000 points in tree
KDTree constructed: 0.00983906
cKDTree constructed: 0.000268936
KDTree old query_pairs(r = 0.5): 2.66407
KDTree query_pairs(r = 0.5): 1.31732
cKDTree query_pairs(r = 0.5): 0.00354481


import numpy as np
import time

from scipy.spatial import KDTree, cKDTree

m = 3
n = 1000

data = np.concatenate((np.random.randn(n//2,m),
np.random.randn(n-n//2,m)+np.ones(m)))

print "dimension %d, %d points in tree" % (m,n)

t = time.time()
T = KDTree(data)
print "KDTree constructed:\t%g" % (time.time()-t)
t = time.time()
cT = cKDTree(data)
print "cKDTree constructed:\t%g" % (time.time()-t)

t = time.time()
T.old_query_pairs(0.5) # Manually reinserted old query_pairs() back into KDTree for this benchmark
print "KDTree old query_pairs(r = 0.5):\t%g" % (time.time()-t)

t = time.time()
T.query_pairs(0.5)
print "KDTree query_pairs(r = 0.5):\t%g" % (time.time()-t)

t = time.time()
cT.query_pairs(0.5)
print "cKDTree query_pairs(r = 0.5):\t%g" % (time.time()-t)

@patvarilly

The benchmark for sparse_distance_matrix:

dimension 3, 1000 points in tree 1, 1000 points in tree 2
KDTrees constructed: 0.0191472
cKDTrees constructed: 0.000463009
KDTree sparse_distance_matrix(r = 0.2): 4.14861
cKDTree sparse_distance_matrix(r = 0.2): 0.0185668
KDTree sparse_distance_matrix(r = 0.5): 7.69248
cKDTree sparse_distance_matrix(r = 0.5): 0.037811


import numpy as np
import time

from scipy.spatial import KDTree, cKDTree

m = 3
n1 = 1000
n2 = 1000

data1 = np.concatenate((np.random.randn(n1//2,m),
np.random.randn(n1-n1//2,m)+np.ones(m)))
data2 = np.concatenate((np.random.randn(n2//2,m),
np.random.randn(n2-n2//2,m)+np.ones(m)))

print "dimension %d, %d points in tree 1, %d points in tree 2" % (m,n1,n2)

t = time.time()
T1 = KDTree(data1)
T2 = KDTree(data2)
print "KDTrees constructed:\t%g" % (time.time()-t)
t = time.time()
cT1 = cKDTree(data1)
cT2 = cKDTree(data1)
print "cKDTrees constructed:\t%g" % (time.time()-t)

t = time.time()
T1.sparse_distance_matrix(T2, 0.2)
print "KDTree sparse_distance_matrix(r = 0.2):\t%g" % (time.time()-t)

t = time.time()
cT1.sparse_distance_matrix(cT2, 0.2)
print "cKDTree sparse_distance_matrix(r = 0.2):\t%g" % (time.time()-t)

t = time.time()
T1.sparse_distance_matrix(T2, 0.5)
print "KDTree sparse_distance_matrix(r = 0.5):\t%g" % (time.time()-t)

t = time.time()
cT1.sparse_distance_matrix(cT2, 0.5)
print "cKDTree sparse_distance_matrix(r = 0.5):\t%g" % (time.time()-t)

@patvarilly

cKDTree is now feature complete with respect to KDTree.

Here is a benchmark for cKDTree.count_neighbors. The speedup here isn't as great because my current Cython code ends up allocating and freeing an array on every call to __count_neighbors_traverse. A better solution would be to allocate as many arrays as there are levels in the two trees passed to count_neighbors at the beginning, and then use these arrays as a stack as you go up and down the trees. That said, a 20-fold improvement for now is not too bad, either.

dimension 3, 1000 points in tree 1, 1000 in tree 2
KDTrees constructed: 0.0217071
cKDTrees constructed: 0.000464916
KDTree count_neighbors(r = 0.5): 1.88414
cKDTree count_neighbors(r = 0.5): 0.07599


import numpy as np
import time

from scipy.spatial import KDTree, cKDTree

m = 3
n1 = 1000
n2 = 1000

data1 = np.concatenate((np.random.randn(n1//2,m),
np.random.randn(n1-n1//2,m)+np.ones(m)))
data2 = np.concatenate((np.random.randn(n2//2,m),
np.random.randn(n2-n2//2,m)+np.ones(m)))

print "dimension %d, %d points in tree 1, %d in tree 2" % (m,n1,n2)

t = time.time()
T1 = KDTree(data1)
T2 = KDTree(data2)
print "KDTrees constructed:\t%g" % (time.time()-t)
t = time.time()
cT1 = cKDTree(data1)
cT2 = cKDTree(data1)
print "cKDTrees constructed:\t%g" % (time.time()-t)

t = time.time()
T1.count_neighbors(T2, 0.5)
print "KDTree count_neighbors(r = 0.5):\t%g" % (time.time()-t)

t = time.time()
cT1.count_neighbors(cT2, 0.5)
print "cKDTree count_neighbors(r = 0.5):\t%g" % (time.time()-t)

@sturlamolden

The code is also full of missing variable declarations. Almost none of the range loops are correctly cythonized due to this, and also many other speed-critical variables lack declaration.

@rgommers
Owner
@rgommers
Owner

Patrick, is the last commit you added ("Numerous bug fixes") the same as the last 5 commits of https://github.com/rgommers/scipy/tree/fixes-sturla? I spent some effort to port them and give them reasonable commit messages, so it would be better if you could use those. If you're not sure how I can help with that.

@patvarilly

Ralf, that's right, I was trying to bring in the changes into my branch to address some of the issues that Sturla brought up, but I did it by copying the final files from fixes-sturla and committing them. Could you let me know the correct way to bring these in?

@sturlamolden

The tests for query_pairs still fail on Windows 64.

@rgommers
Owner

To remove the last commit from your branch and this PR:

$ git reset --hard HEAD^   # go one commit back
$ git push origin ckdtree-improvements  --force   # assume your github remote is called "origin"

Then to grab commits from my repo

$ git remote add rgommers https://github.com/rgommers/scipy.git
$ git fetch rgommers
$ git merge rgommers/fixes-sturla

Then push that back to Github so it shows up here. The second step above can also be replaced by asking me to send you a PR with my changes.

You can also use git cherry-pick commit-id instead of using git merge. Then you can carry over commits one by one. In this case merge should be easier though, because it takes all commits at once.

@patvarilly

Thanks for the detailed instructions! All set now, hopefully. Will comment on the substantive parts before going to bed tonight.

@patvarilly

Following up on Sturla's suggestion, I implemented a version of query that has the same recursive traversal strcuture as all the other methods, and that uses the rectangle-to-point distance tracker class for readability. Unfortunately, it's about 3 times as slow as the old version of query:

dimension 3, 10000 points
KDTree constructed: 0.0961771
cKDTree constructed: 0.00191593
KDTree 1000 lookups: 0.666298
cKDTree 1000 new lookups: 0.011121
cKDTree 1000 old lookups: 0.00368786
flat cKDTree 1000 new lookups: 0.079838
flat cKDTree 1000 old lookups: 0.0734

As far as I can tell, this is because the old version of query uses a priority queue to continuosly select for processing the closest unexamined kd-tree node. This is something that can't be done in a recursive algorithm. I committed the code for historical curiosity, but will remove it in the following commit.

@patvarilly

Updated benchmarks with current version of the code. I ran these on a different machine, but the improvement factors should be mostly unaffected:

query_ball_point: ~ 200-fold improvement
query_ball_tree: ~ 350-fold improvement
query_pairs: ~ 350-fold improvement
count_neighbors: ~ 70-fold improvement
sparse_distance_matrix: ~ 2000- to 3000- fold improvement (improvement from previous benchmark due to cythonizing loop indices)

So making the code immensely more readable accrues only a slight speed penalty for some functions, and has made others (notably sparse_distance_matrix) much faster.


query_ball_point (query radius 0.5, same benchmark code as above)

dimension 3, 10000 points
KDTree 1000 ball lookups: 5.31274
cKDTree 1000 ball lookups: 0.024559
flat cKDTree 1000 ball lookups: 0.0826628

dimension 8, 10000 points
KDTree 1000 ball lookups: 5.80942
cKDTree 1000 ball lookups: 0.024523
flat cKDTree 1000 ball lookups: 0.0908511

dimension 16, 10000 points
KDTree 1000 ball lookups: 3.98487
cKDTree 1000 ball lookups: 0.022228
flat cKDTree 1000 ball lookups: 0.0916049


query_ball_tree

dimension 3, 1000 points in tree 1, 1000 points in tree 2
KDTree query_ball_tree(r = 0.5): 2.34604
cKDTree query_ball_tree(r = 0.5): 0.00636292


query_pairs

dimension 3, 1000 points in tree
KDTree query_pairs(r = 0.5): 1.2968
cKDTree query_pairs(r = 0.5): 0.00355601


count_neighbors

dimension 3, 1000 points in tree 1, 1000 in tree 2
KDTree count_neighbors(r = 0.5): 1.71428
cKDTree count_neighbors(r = 0.5): 0.0246949


sparse_distance_matrix

dimension 3, 1000 points in tree 1, 1000 points in tree 2
KDTree sparse_distance_matrix(r = 0.2): 3.69791
cKDTree sparse_distance_matrix(r = 0.2): 0.00170898
KDTree sparse_distance_matrix(r = 0.5): 6.51638
cKDTree sparse_distance_matrix(r = 0.5): 0.00226092

@rgommers
Owner

@patvarilly you used a script adapted from the mailing list right? It may be useful to put that under spatial/benchmarks and make it available as spatial.bench(). You can find an example of how this is done in scipy/linalg (although it seems to be partially broken, missing imports).

@rgommers
Owner

And: timings look great, nice job!

@sturlamolden

All tests pass on Windows 64 and Python 2.7. The code looks nice and clean now. I think it's ok to go into SciPy if the tests pass on the other platforms as well. Patrick's rewrite removes serveral errors from the current cKDTree code (integer size and exception handling) and makes it fully compatible with KDTree (i.e. can be used as a drop-in replacement). We should still look into GIL handling, but this should be postponed to a second update. KDTree should be kept as it is easier to extend from Python.

@rgommers
Owner

All passes on OS X with Python 3.2. too. So guess this is ready. bench() would be nice to have still.

@sturlamolden

I have placed a ticket for this PR on the trac (ticket #1712).

@sturlamolden

On Win64 .query() should return an array of dtype=int if a C long does not overflow. There is only a test for this on scalar return values.

@sturlamolden

I have an implementation of the last issue ready. It still needs testing, I will send Patrick a pull request when I'm done.
For now it is here:
https://github.com/sturlamolden/scipy/tree/ckdtree-improvements

@sturlamolden

Ok, tests passes. I've sent a pull request to Patrick that should be merged first.

@rgommers
Owner

@patvarilly have you had a chance to look at Sturla's PR? Would be nice to get this merged soon.

@patvarilly

Alright, Sturla's PR is (finally) in, and the benchmarks are now self-contained in spatial/benchmarks (runnable by saying "import scipy.spatial; scipy.spatial.bench()". As far as I can tell, that would wrap up this pull request and make it ready for merging. My apologies for the extreme delay in getting this last bit done. I look forward to your comments.

@sturlamolden

We should regenerate ckdtree.c with Cython 0.17 to avoid any Cython related bugs, and re-run all the tests. I haven't had time to install the latest Cython yet. It was released on September 1th.

@patvarilly

OK, regenerated ckdtree.c with Cython 0.17. I re-ran the tests and they all pass.

@rgommers
Owner

Benchmarks look good. They run a bit slow but I don't think that's a problem. And hard to avoid if you want to show differences of O(1000) without printing too many digits.

@rgommers
Owner

All tests pass for me too. Ready to merge I think.

@rgommers
Owner

Checked the change to setup.py, this isn't needed for setupscons.py or bento.

@rgommers
Owner

One more request @patvarilly: would you mind adding a short paragraph describing the main changes in doc/release/0.12.0-notes?

@patvarilly

All set. I don't know what caused the indentation to go a bit off, but it's now fixed. What do you mean by "Checked the change to setup.py, this isn't needed for setupscons.py or bento"? Should I have made any other changes to make the benchmarks work? I usually compile and install SciPy by running "python setup.py install --user" in my scipy directory.

@rgommers
Owner

No other changes are needed. We have 3 build systems, one based on numpy.distutils (setup.py files), one based on numscons (SConscript files) and one based on Bento (bento.info / bscript files). So for most changes to setup.py, those need to be made for all build systems. In this case that wasn't needed.

@rgommers
Owner

Thanks for adding to the release notes and fixing the indentation.

@rgommers
Owner

Merged in eec9174. Thanks again Patrick and Sturla!

@rgommers rgommers closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.