In [1]:
import numpy as np
from sklearn.neighbors import BallTree

from rangesearch_box import rangesearch_box

# Initialization

In [2]:
ndim = 5
rad = .2
box = [(-rad, rad)] * ndim

In [3]:
X = np.random.rand(500, ndim)
X[:5]

array([[ 0.66120795,  0.16419294,  0.30479838,  0.05863241,  0.80880002],
       [ 0.60626541,  0.0991643 ,  0.46577911,  0.61364594,  0.982528  ],
       [ 0.98692162,  0.8922461 ,  0.00218318,  0.29426143,  0.69431703],
       [ 0.54370348,  0.35400763,  0.00119577,  0.25608221,  0.18517756],
       [ 0.19516881,  0.10861635,  0.06603759,  0.39365626,  0.62877324]])

# Compare results

To compare the results of the range query, we define a corresponding metric for the ball tree.<br>
Note that this metric only returns 0 (inside the box) or 1 (outside), so the radius loses its usual meaning: we always query with a radius of 0.

In [4]:
def box_metric(box):
    # Pseudo-distance: 0.0 if b - a lies strictly inside the box in every dimension, else 1.0
    def _box_metric(a, b):
        return float(any(not (box_d_min < b_d - a_d < box_d_max)
                         for a_d, b_d, (box_d_min, box_d_max) in zip(a, b, box)))
    return _box_metric
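
To make the behaviour of this metric concrete, here is a small sketch that evaluates it on hand-picked points. The metric is restated so the snippet runs on its own (with the result cast to `float`); the 2-D box and the points `origin`, `near` and `far` are illustrative values, not taken from the data above.

```python
import numpy as np

def box_metric(box):
    # Same logic as the cell above, restated so this snippet is self-contained.
    def _box_metric(a, b):
        return float(any(not (lo < b_d - a_d < hi)
                         for a_d, b_d, (lo, hi) in zip(a, b, box)))
    return _box_metric

# Illustrative 2-D box: offsets must lie strictly between -0.2 and 0.2 in each dimension.
metric = box_metric([(-0.2, 0.2)] * 2)

origin = np.array([0.0, 0.0])
near = np.array([0.1, -0.1])  # both offsets fall inside the box
far = np.array([0.1, 0.5])    # the second offset falls outside the box

d_near = metric(origin, near)
d_far = metric(origin, far)
print(d_near, d_far)  # 0.0 1.0
```

Because the value is either 0 or 1, any radius in [0, 1) selects exactly the points inside the box, which is why the queries below use a radius of `0.`.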

In [5]:
bt = BallTree(X, metric=box_metric(box))
bt.query_radius(X[(2, 3), :], 0.)

array([array([ 2, 28]), array([368,   3])], dtype=object)

In [6]:
search = rangesearch_box(X)
search(X[(2, 3), :], box)

array([[  2,  28],
       [  3, 368]])
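
As an independent sanity check, the same box query can be sketched as a brute-force NumPy scan. This is only an illustration: `brute_force_box` is a hypothetical helper, not part of `rangesearch_box` or scikit-learn, and it assumes the offset is computed as data point minus query point (for the symmetric box used here the order does not matter). The demo data is freshly generated, so the indices will differ from the outputs above.

```python
import numpy as np

def brute_force_box(X, queries, box):
    """Hypothetical reference: for each query, return the indices of the rows
    of X whose offset from the query lies strictly inside the box."""
    lo = np.array([b[0] for b in box])
    hi = np.array([b[1] for b in box])
    # diff[i, j, d] = X[j, d] - queries[i, d]
    diff = X[None, :, :] - queries[:, None, :]
    # A point matches when every dimension of its offset is inside (lo, hi).
    inside = np.all((diff > lo) & (diff < hi), axis=2)
    return [np.flatnonzero(row) for row in inside]

# Demo data with the same shape as in the notebook (seed chosen arbitrarily).
rng = np.random.default_rng(0)
X_demo = rng.random((500, 5))
box_demo = [(-0.2, 0.2)] * 5

result = brute_force_box(X_demo, X_demo[[2, 3]], box_demo)
print(result)
```

Each query point is a row of `X_demo`, so it always appears in its own result, just as indices 2 and 3 appear in the outputs above. The scan costs O(n) per query, so it is only meant for checking correctness on small inputs.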

# Compare performance

## Few points

In [7]:
%timeit search(X[(2, 6, 10), :], box)
%timeit bt.query_radius(X[(2, 6, 10), :], 0.)

1000 loops, best of 3: 312 µs per loop
100 loops, best of 3: 7.84 ms per loop


## More points

In [8]:
y = np.random.rand(500, ndim)

%timeit search(y, box)
%timeit bt.query_radius(y, 0.)

10 loops, best of 3: 36.9 ms per loop
1 loop, best of 3: 1.27 s per loop


# Conclusion

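Despite the Cython part of the BallTree implementation,<br>
the rangesearch-box algorithm (based on Python + NumPy) is faster in these tests: roughly 25 times faster for 3 query points and roughly 34 times faster for 500. Part of the gap likely comes from the ball tree calling a Python metric function for every distance evaluation.

However, it is important to note that there is a trade-off: we are limited in the box shapes we can define.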