[MRG + 1] BUG: remove checks from PyFunc distance metric (fixes #6287) #6288
Once this pilot check is deleted, users may receive ambiguous error messages.
What if we just change this check a little, such as
It will raise
in the above case.
I'd initially put the check where it is because it happens only once. A check such as #6289 will happen for every evaluation, and I'm afraid the impact on performance will be quite large (though admittedly, the user-defined function is not particularly performant as-is).
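The trade-off described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the scikit-learn implementation (the real PyFunc distance is written in Cython): `PilotCheckedMetric` and `PerCallCheckedMetric` are hypothetical names, and the length-3 pilot vectors are an arbitrary choice.

```python
import numpy as np


class PilotCheckedMetric:
    """Hypothetical wrapper: validate the user function once, at construction."""

    def __init__(self, func, **kwargs):
        self.func = func
        self.kwargs = kwargs
        # Pilot call on two dummy vectors (length 3 is an arbitrary assumption).
        pilot = func(np.zeros(3), np.zeros(3), **kwargs)
        try:
            float(pilot)
        except TypeError:
            raise TypeError("Custom distance function must return a float")

    def dist(self, x1, x2):
        # No validation here: the cost was paid once in __init__.
        return self.func(x1, x2, **self.kwargs)


class PerCallCheckedMetric:
    """Hypothetical wrapper: validate the result on every evaluation."""

    def __init__(self, func, **kwargs):
        self.func = func
        self.kwargs = kwargs

    def dist(self, x1, x2):
        d = self.func(x1, x2, **self.kwargs)
        try:
            # Runs for every pair of points the tree compares.
            return float(d)
        except TypeError:
            raise TypeError("Custom distance function must return a float")
```

With the first wrapper a bad function fails immediately when the metric is built; with the second, the failure only surfaces on the first distance evaluation, and the conversion check is repeated on every call.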
@jakevdp Thanks for your useful opinion!
What do you think if we change
```python
d = self.func(x1arr, x2arr, **self.kwargs)
try:
    return d
except TypeError:
    raise TypeError("Customize function must return a float")
```
Since the usual case (i.e. if the user didn't do something silly) raises no exception, I think
I've tested it with the following script:
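```python
import timeit

import numpy as np
from sklearn.neighbors import BallTree

n_samples = 10 ** 5
n_dim = 100
X = np.arange(n_samples * n_dim).reshape(n_samples, n_dim)


def correct_distance(x, y):
    # User-defined metric: squared Euclidean distance between two vectors.
    return np.sum((x - y) ** 2)


def balltree():
    # Build the tree with the Python callable as its metric.
    BallTree(X, metric=correct_distance)


timer = timeit.Timer(balltree)
# Best of the repeats, each repeat building the tree 10 times.
print(min(timer.repeat(number=10)))
```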
which means adding