Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

User-defined metrics #23

Closed
BenConnault opened this issue Mar 16, 2016 · 7 comments
Closed

User-defined metrics #23

BenConnault opened this issue Mar 16, 2016 · 7 comments

Comments

@BenConnault
Copy link

Thanks for a great package.

I was able to do a knn search on a BallTree using my own distance by simply doing this:

import NearestNeighbors: Metric, evaluate, euclidean

type MyMetric <: Metric end

evaluate(d::MyMetric,a, b)=min(1.0,euclidean(a,b))
evaluate(d::MyMetric,a::AbstractMatrix, b::AbstractArray,col::Int, do_end::Bool=true)=
    evaluate(d, slice(a, :, col), b)

This is already awesome! What would be even better is if an official API was provided for doing this. This would guarantee that a quick hack like the above will not break at the next update and also that proper checks and optimizations are enforced.

@KristofferC
Copy link
Owner

I would suspect that you might not even need the second function. It should fall back to https://github.com/KristofferC/NearestNeighbors.jl/blob/master/src/evaluation.jl#L45 if no method with those argument types for that Metric is defined.

You are right that a small example in the README would be good. The API is just the one from Distances.jl: https://github.com/JuliaStats/Distances.jl#computing-the-distance-between-two-vectors

@BenConnault
Copy link
Author

I found I had to define the second function too. Without it I get:

ERROR: LoadError: MethodError: `evaluate` has no method matching 
evaluate(::dev.MyMetric, ::Array{Float64,2}, ::Array{Float64,1}, ::Int64)
Closest candidates are:
 evaluate(::Union{Distances.Chebyshev,Distances.ChiSqDist, [...]
[...]
 in create_bsphere at /home/ben/.julia/v0.4/NearestNeighbors/src/hyperspheres.jl:38
 in build_BallTree at /home/ben/.julia/v0.4/NearestNeighbors/src/ball_tree.jl:98
 in build_BallTree at /home/ben/.julia/v0.4/NearestNeighbors/src/ball_tree.jl:113 (repeats 4 times)
 in BallTree at /home/ben/.julia/v0.4/NearestNeighbors/src/ball_tree.jl:67

In fact this is how I found I had to implement this. Feel free to use/point towards the above snippet if this is useful.

As far as the API goes, good to know that sticking to the user-facing interface of Distances.jl will work. Looking at metrics.jl in Distances.jl, would implementing eval_op, eval_reduce (and/or something else) lead to better performance than a naive evaluate(d::MyMetric,a, b) interface implementation like I did?

@KristofferC
Copy link
Owner

Right now, it might be a bit cheaper to create the eval_op and company, because I believe that creating the slice is expensive compared to the cost of the evaluate function. However, with some refactoring for #24 it should cost the same.

@KristofferC
Copy link
Owner

I added some docs about this.

@KristofferC
Copy link
Owner

For the new version, the second function you defined should not be needed.

@avonmoll
Copy link

avonmoll commented Apr 3, 2019

Perhaps I'm missing something, but I'm still having trouble with defining a custom metric. I want to compute the weighted nearest neighbor. That is, each point comprising the tree has an associated weight; points that are weighted highly will seem further away. This is somewhat related to the idea of a weighted Voronoi diagram where the tree members represent generator points in the WVD. Here is a snippet:

using NearestNeighbors, Distances

struct WEuclidean <: Metric end

function evaluate(d::WEuclidean, a,b)
    w = a[end] * b[end]
    return w * euclidean(a[1:end-1], b[1:end-1])
end

anchors = [0.25 0.75; 0.75 0.25; 0.5 1]
balltree = BallTree(anchors, WEuclidean())

So the last element in each vector is its weight. When I query the tree to find nearest neighbors, the query point will have a 1 there.

When I run the snippet, I get:

ERROR: MethodError: no method matching evaluate(::WEuclidean, ::SArray{Tuple{3},Float64,1,3}, ::MArray{Tuple{3},Float64,1,3})
Closest candidates are:
  evaluate(::PreMetric, ::AbstractArray{T,1} where T, ::AbstractArray{T,1} where T, ::Bool) at /home/alex/.julia/packages/NearestNeighbors/N7lgR/src/evaluation.jl:25
  evaluate(::CorrDist, ::AbstractArray, ::AbstractArray) at /home/alex/.julia/packages/Distances/HOWRG/src/metrics.jl:269
  evaluate(::MeanAbsDeviation, ::Any, ::Any) at /home/alex/.julia/packages/Distances/HOWRG/src/metrics.jl:442
  ...

But aren't ::SArray{Tuple{3},Float64,1,3} and ::MArray{Tuple{3},Float64,1,3}) subtypes of Any?

@dkarrasch
Copy link
Contributor

You need to define

function Distances.evaluate(...)

or state

import Distances: evaluate

upfront.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants