Add support for callable metrics #93
Comments
Almost all the problems I've encountered while maintaining openTSNE have been However, it should be fairly simple to achieve what you want. I've made it so that the |
Sorry, I wasn't clear. I'm not suggesting you tie I'm trying to apply Again, I appreciate the work you've done on this. I'm not trying to be pushy. Just wanted to let you know that |
Hmm, I'm all for custom distance metrics, but the "numba-compiled callable" threw me off a little bit. I'm not very familiar with numba, so is it possible to call numba-compiled functions from regular Python code? What I'm getting at is that openTSNE supports both exact neighbor search via scikit-learn and approximate neighbor search via pynndescent, and I'd like to keep some consistency between the two. My concern was that if the function is numba-compiled, it wouldn't be possible to use that callable in the scikit-learn Putting all numba-related things aside, adding custom distance metrics was definitely on my to-do list, but I've been and will likely be short on time for a while, so if you're willing to open a PR, I'd be more than happy to take a look. The changes should be fairly minimal anyhow. |
Ah, I see. Sorry, I assumed you were familiar with numba, as it's one of the main dependencies of pynndescent. numba is a library that JIT-compiles pure Python/NumPy functions to make them faster, so a numba-compiled function works just like any other Python function in practice. scikit-learn's
from sklearn.neighbors import BallTree
from numba import njit
import numpy as np
@njit(fastmath=True)
def l1(x, y):
    return np.sum(np.abs(x - y))
tree = BallTree(np.random.normal(size=(1000, 10)), metric=l1)
distances, indices = tree.query(np.random.normal(size=(100, 10)), k=5)
The only issue I can see is that if the function isn't numba-compiled and it's passed to pynndescent, then pynndescent will throw an uninformative error message from numba. Whether or how to deal with this might require some thought, or we could just make it clear in the docstring and leave this for pynndescent to deal with. I assume that a user savvy enough to pass a custom metric would expect error messages to be a possibility. I'll take a look at the code and submit a PR with the changes. |
I would hope that numba has some kind of way to check if a function is compiled. Then if it isn't, we could do that within the After a bit of searching, I found it should be fairly straightforward to check if a function is an instance of the |
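The check-then-compile idea in the comment above could look roughly like the sketch below. It is only an illustration, not the code that ended up in openTSNE: the helper names `is_numba_compiled` and `ensure_numba_compiled` are hypothetical, and the use of `numba.core.dispatcher.Dispatcher` as the class to test against is an assumption about recent numba versions rather than something stated in this thread.

```python
import numpy as np

try:
    import numba
    # Assumption: in recent numba versions, functions decorated with
    # @njit/@jit are instances of this Dispatcher class.
    from numba.core.dispatcher import Dispatcher
except ImportError:  # numba is treated as optional in this sketch
    numba = None
    Dispatcher = None


def is_numba_compiled(func):
    """Hypothetical helper: True if `func` is a numba dispatcher object."""
    return Dispatcher is not None and isinstance(func, Dispatcher)


def ensure_numba_compiled(func):
    """Hypothetical helper: wrap a plain callable with njit if it is not jitted."""
    if numba is None or is_numba_compiled(func):
        return func
    return numba.njit(fastmath=True)(func)


def l1(x, y):
    # Plain Python/NumPy Manhattan distance, not yet compiled.
    return np.sum(np.abs(x - y))


metric = ensure_numba_compiled(l1)
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 0.0, 3.0])
distance = float(metric(x, y))
```

With a helper along these lines, a plain callable could be compiled before it is handed to pynndescent, while an already-compiled one would be passed through unchanged. Compilation can still fail for functions that use features numba does not support, so a clear error message would still be needed in that case.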
Fixed via #94. |
Expected behaviour
pynndescent supports passing callable metrics compiled with numba, so I would expect openTSNE to support this as well.
Actual behaviour
When I pass a numba-compiled callable metric, openTSNE throws a ValueError.
Steps to reproduce the behavior
Here is a minimal working example: