-
-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added query_ball_point and related tests to scipy.spatial.cKDTree. #262
Conversation
A quick benchmark of query_ball_tree in cKDTree vs KDTree (code below). The speedup here is also ~300-fold: dimension 3, 1000 points in tree 1, 1000 points in tree 2 import numpy as np from scipy.spatial import KDTree, cKDTree m = 3 data1 = np.concatenate((np.random.randn(n1//2,m), print "dimension %d, %d points in tree 1, %d points in tree 2" % (m,n1,n2) t = time.time() t = time.time() t = time.time() |
I've simplified the logic in KDTree.query_pairs, then ported the simplified logic to cKDTree (basically, being careful about not visiting the same pair of nodes twice, instead of keeping track of visited pairs in a set and testing against the set every time a new node pair is visited). Here is a quick benchmark (cKDTree version is ~750-fold faster with respect to old query_pairs, ~ 370-fold faster than the revised version): dimension 3, 1000 points in tree import numpy as np from scipy.spatial import KDTree, cKDTree m = 3 data = np.concatenate((np.random.randn(n//2,m), print "dimension %d, %d points in tree" % (m,n) t = time.time() t = time.time() t = time.time() t = time.time() |
The benchmark for sparse_distance_matrix: dimension 3, 1000 points in tree 1, 1000 points in tree 2 import numpy as np from scipy.spatial import KDTree, cKDTree m = 3 data1 = np.concatenate((np.random.randn(n1//2,m), print "dimension %d, %d points in tree 1, %d points in tree 2" % (m,n1,n2) t = time.time() t = time.time() t = time.time() t = time.time() t = time.time() |
cKDTree is now feature complete with respect to KDTree. Here is a benchmark for cKDTree.count_neighbors. The speedup here isn't as great because my current Cython code ends up allocating and freeing an array on every call to __count_neighbors_traverse. A better solution would be to allocate as many arrays as there are levels in the two trees passed to count_neighbors at the beginning, and then use these arrays as a stack as you go up and down the trees. That said, a 20-fold improvement for now is not too bad, either. dimension 3, 1000 points in tree 1, 1000 in tree 2 import numpy as np from scipy.spatial import KDTree, cKDTree m = 3 data1 = np.concatenate((np.random.randn(n1//2,m), print "dimension %d, %d points in tree 1, %d in tree 2" % (m,n1,n2) t = time.time() t = time.time() t = time.time() |
Countless integer size errors. Bug fixes: https://github.com/sturlamolden/scipy/blob/master/scipy/spatial/ckdtree.pyx |
The code is also full of missing variable declarations. Almost none of the range loops are correctly cythonized due to this, and also many other speed-critical variables lack declaration. |
As commented on the mailing list (http://thread.gmane.org/gmane.comp.python.scientific.devel/16703/focus=16734), here are Sturla's edits on top of Patrick's: https://github.com/rgommers/scipy/tree/fixes-sturla |
Patrick, is the last commit you added ("Numerous bug fixes") the same as the last 5 commits of https://github.com/rgommers/scipy/tree/fixes-sturla? I spent some effort to port them and give them reasonable commit messages, so it would be better if you could use those. If you're not sure how I can help with that. |
Ralf, that's right, I was trying to bring in the changes into my branch to address some of the issues that Sturla brought up, but I did it by copying the final files from fixes-sturla and committing them. Could you let me know the correct way to bring these in? |
The tests for query_pairs still fail on Windows 64. |
To remove the last commit from your branch and this PR:
Then to grab commits from my repo
Then push that back to Github so it shows up here. The second step above can also be replaced by asking me to send you a PR with my changes. You can also use |
Thanks for the detailed instructions! All set now, hopefully. Will comment on the substantive parts before going to bed tonight. |
Following up on Sturla's suggestion, I implemented a version of query that has the same recursive traversal strcuture as all the other methods, and that uses the rectangle-to-point distance tracker class for readability. Unfortunately, it's about 3 times as slow as the old version of query: dimension 3, 10000 points As far as I can tell, this is because the old version of query uses a priority queue to continuosly select for processing the closest unexamined kd-tree node. This is something that can't be done in a recursive algorithm. I committed the code for historical curiosity, but will remove it in the following commit. |
Updated benchmarks with current version of the code. I ran these on a different machine, but the improvement factors should be mostly unaffected: query_ball_point: ~ 200-fold improvement So making the code immensely more readable accrues only a slight speed penalty for some functions, and has made others (notably sparse_distance_matrix) much faster. query_ball_point (query radius 0.5, same benchmark code as above) dimension 3, 10000 points dimension 8, 10000 points dimension 16, 10000 points query_ball_tree dimension 3, 1000 points in tree 1, 1000 points in tree 2 query_pairs dimension 3, 1000 points in tree count_neighbors dimension 3, 1000 points in tree 1, 1000 in tree 2 sparse_distance_matrix dimension 3, 1000 points in tree 1, 1000 points in tree 2 |
@patvarilly you used a script adapted from the mailing list right? It may be useful to put that under spatial/benchmarks and make it available as |
And: timings look great, nice job! |
All tests pass on Windows 64 and Python 2.7. The code looks nice and clean now. I think it's ok to go into SciPy if the tests pass on the other platforms as well. Patrick's rewrite removes serveral errors from the current cKDTree code (integer size and exception handling) and makes it fully compatible with KDTree (i.e. can be used as a drop-in replacement). We should still look into GIL handling, but this should be postponed to a second update. KDTree should be kept as it is easier to extend from Python. |
All passes on OS X with Python 3.2. too. So guess this is ready. |
I have placed a ticket for this PR on the trac (ticket #1712). |
On Win64 .query() should return an array of dtype=int if a C long does not overflow. There is only a test for this on scalar return values. |
I have an implementation of the last issue ready. It still needs testing, I will send Patrick a pull request when I'm done. |
Ok, tests passes. I've sent a pull request to Patrick that should be merged first. |
@patvarilly have you had a chance to look at Sturla's PR? Would be nice to get this merged soon. |
Alright, Sturla's PR is (finally) in, and the benchmarks are now self-contained in spatial/benchmarks (runnable by saying "import scipy.spatial; scipy.spatial.bench()". As far as I can tell, that would wrap up this pull request and make it ready for merging. My apologies for the extreme delay in getting this last bit done. I look forward to your comments. |
We should regenerate ckdtree.c with Cython 0.17 to avoid any Cython related bugs, and re-run all the tests. I haven't had time to install the latest Cython yet. It was released on September 1th. |
OK, regenerated ckdtree.c with Cython 0.17. I re-ran the tests and they all pass. |
Benchmarks look good. They run a bit slow but I don't think that's a problem. And hard to avoid if you want to show differences of O(1000) without printing too many digits. |
All tests pass for me too. Ready to merge I think. |
Checked the change to setup.py, this isn't needed for setupscons.py or bento. |
One more request @patvarilly: would you mind adding a short paragraph describing the main changes in doc/release/0.12.0-notes? |
All set. I don't know what caused the indentation to go a bit off, but it's now fixed. What do you mean by "Checked the change to setup.py, this isn't needed for setupscons.py or bento"? Should I have made any other changes to make the benchmarks work? I usually compile and install SciPy by running "python setup.py install --user" in my scipy directory. |
No other changes are needed. We have 3 build systems, one based on numpy.distutils (setup.py files), one based on numscons (SConscript files) and one based on Bento (bento.info / bscript files). So for most changes to setup.py, those need to be made for all build systems. In this case that wasn't needed. |
Thanks for adding to the release notes and fixing the indentation. |
Merged in eec9174. Thanks again Patrick and Sturla! |
scipy.spatial.cKDTree implements only part of the more general scipy.spatial.KDTree interface in Cython. This pull requests ports over the method query_ball_point to cKDTree and adds corresponding unit tests to verify correctness.
I have used a slightly modified version of Anne Archibald's benchmarking script to measure the resulting performance gains (see http://mail.scipy.org/pipermail/scipy-dev/2008-October/009854.html, script copied below). The corresponding numbers show a ~ 250-350-fold speedup (this seems relatively insensitive to the query ball radius, see below):
dimension 3, 10000 points
KDTree constructed: 0.0823722
cKDTree constructed: 0.00159502
KDTree 1000 lookups: 0.574862
cKDTree 1000 lookups: 0.00470614
flat cKDTree 1000 lookups: 0.342408
KDTree 1000 ball lookups: 5.54947
cKDTree 1000 ball lookups: 0.0189772
flat cKDTree 1000 ball lookups: 0.343151
Ball lookups agree? True
dimension 8, 10000 points
KDTree constructed: 0.168782
cKDTree constructed: 0.00218606
KDTree 1000 lookups: 10.7511
cKDTree 1000 lookups: 0.091007
flat cKDTree 1000 lookups: 0.675035
KDTree 1000 ball lookups: 6.13426
cKDTree 1000 ball lookups: 0.017458
flat cKDTree 1000 ball lookups: 0.351695
Ball lookups agree? True
dimension 16, 10000 points
KDTree constructed: 0.11167
cKDTree constructed: 0.00220299
KDTree 1000 lookups: 62.2512
cKDTree 1000 lookups: 1.01098
flat cKDTree 1000 lookups: 1.44906
KDTree 1000 ball lookups: 3.89043
cKDTree 1000 ball lookups: 0.015486
flat cKDTree 1000 ball lookups: 0.352298
Ball lookups agree? True
For a probe radius of 0.5 instead of 0.2 (see script for meaning), I get the following results in 3 dimensions:
dimension 3, 10000 points
KDTree constructed: 0.0826931
cKDTree constructed: 0.00157905
KDTree 1000 lookups: 0.556555
cKDTree 1000 lookups: 0.00440192
flat cKDTree 1000 lookups: 0.338442
KDTree 1000 ball lookups: 16.3933
cKDTree 1000 ball lookups: 0.0493851
flat cKDTree 1000 ball lookups: 0.423063
Ball lookups agree? True
As far as I can tell, the issue is that KDTree.query_ball_point is written for readability, but ends up doing large numbers of list operations to build the results.
From here on, it should not be much harder to port over the remaining methods of KDTree to cKDTree, although these are not a current priority for me. As an aside, it seemed at first glance that the public interfaces for KDTree.query() and cKDTree.query() have fallen out of sync, though I didn't check thoroughly.
All the best,
Patrick
import numpy as np
import time
from scipy.spatial import KDTree, cKDTree
m = 3
n = 10000
r = 1000
data = np.concatenate((np.random.randn(n//2,m),
np.random.randn(n-n//2,m)+np.ones(m)))
queries = np.concatenate((np.random.randn(r//2,m),
np.random.randn(r-r//2,m)+np.ones(m)))
print "dimension %d, %d points" % (m,n)
t = time.time()
T1 = KDTree(data)
print "KDTree constructed:\t%g" % (time.time()-t)
t = time.time()
T2 = cKDTree(data)
print "cKDTree constructed:\t%g" % (time.time()-t)
t = time.time()
w = T1.query(queries)
print "KDTree %d lookups:\t%g" % (r, time.time()-t)
del w
t = time.time()
w = T2.query(queries)
print "cKDTree %d lookups:\t%g" % (r, time.time()-t)
del w
T3 = cKDTree(data,leafsize=n)
t = time.time()
w = T3.query(queries)
print "flat cKDTree %d lookups:\t%g" % (r, time.time()-t)
del w
t = time.time()
w1 = T1.query_ball_point(queries, 0.2)
print "KDTree %d ball lookups:\t%g" % (r, time.time()-t)
t = time.time()
w2 = T2.query_ball_point(queries, 0.2)
print "cKDTree %d ball lookups:\t%g" % (r, time.time()-t)
t = time.time()
w3 = T3.query_ball_point(queries, 0.2)
print "flat cKDTree %d ball lookups:\t%g" % (r, time.time()-t)
all_good = True
for a, b in zip(w1, w2):
if sorted(a) != sorted(b):
all_good = False
for a, b in zip(w1, w3):
if sorted(a) != sorted(b):
all_good = False
print "Ball lookups agree? %s" % str(all_good)