New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC: add info on recursion limit to KDTree docstring. #281
Conversation
I think you could also address a recursion error by increasing the leaf size. |
Thanks to Oleksandr Huziy and David Parker.
Thanks @jakevdp, added that. |
By the way, what was the motivation for adding this note? The default recursion limit is usually ~1000, meaning that the query algorithm should be fine for up to about 2^1000 points, unless I'm mistaken. Is there an algorithmic detail I'm forgetting? |
A user who ran into it: http://thread.gmane.org/gmane.comp.python.scientific.user/32087 Don't think he had 2^1000 points. He provided his code and data, if you want to look into it. |
I checked this out, and I understand the issue now: I didn't realize that our KD tree used the sliding midpoint rule. With this split criterion, there's no guarantee of O[log(n)] nodes. For some structured datasets, the majority of nodes may actually be empty! I have some opinions about this (I think the benefits of sliding midpoint are overstated, and the potential for these sorts of node proliferation problems outweighs any potential performance gain) but that's probably a discussion for another thread. |
So merge this, and revise or add to the implementation later? Have you seen #262 by the way? Should be of interest to you. |
I'd say merge this as is for now. We can think about optimal splitting criteria at some other point. I've been following #262 as well -- it's looking like a good contribution. |
DOC: add info on recursion limit to KDTree docstring.
OK, thanks. Merged. |
The midpoint rule is very easy to change or even control with a keyword argument. I believe I used the median computed with mergesort originally (on SciPy's cookbook), but Anne changed it to the sliding midpoint. It is usually more efficent, but not always. |
I think #262 should be merged, but wait until after 0.11.0 is released. |
@sturlamolden #262 indeed looks good (thanks for all your work on that!), once Patrick is back from holiday and adds your last PR and perhaps the benchmark, I agree it can be merged. No need to wait for 0.11.0, because that's already in a separate branch. We don't live in the svn age anymore:) |
No description provided.