Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is it possible to return at most N nearest neighbors? #1

Closed
astrojhgu opened this issue Mar 28, 2018 · 4 comments · Fixed by #3
Closed

Is it possible to return at most N nearest neighbors? #1

astrojhgu opened this issue Mar 28, 2018 · 4 comments · Fixed by #3

Comments

@astrojhgu
Copy link

According to the API doc, there is a find_nearest function. However, by default, it can only return the nearest neighbor node. How can I find out at most N nearest neighbors? This is import for interpolation.

@kornelski
Copy link
Owner

kornelski commented Mar 28, 2018

There's no built-in function for this, but it is possible by customizing matching done in the library.

There's find_nearest_custom that takes a visitor:

vpsearch/src/lib.rs

Lines 96 to 102 in c273038

pub trait BestCandidate<Item: MetricSpace<Impl> + Copy, Impl> where Self: Sized {
/// find_nearest() will return this type
type Output;
/// This is a visitor method. If the given distance is smaller than previously seen, keep the item (or its index).
/// UserData is the same as for `MetricSpace<Impl>`, and it's `()` by default.
fn consider(&mut self, item: &Item, distance: Item::Distance, candidate_index: usize, user_data: &Item::UserData);

and if you create your own implementation of that visitor:

vpsearch/src/lib.rs

Lines 111 to 130 in c273038

impl<Item: MetricSpace<Impl> + Copy, Impl> BestCandidate<Item, Impl> for ReturnByIndex<Item, Impl> {
type Output = (usize, Item::Distance);
#[inline]
fn consider(&mut self, _: &Item, distance: Item::Distance, candidate_index: usize, _: &Item::UserData) {
if distance < self.distance {
self.distance = distance;
self.idx = candidate_index;
}
}
#[inline]
fn distance(&self) -> Item::Distance {
self.distance
}
fn result(self, _: &Item::UserData) -> (usize, Item::Distance) {
(self.idx, self.distance)
}
}

you can collect N elements instead of picking one.

@astrojhgu
Copy link
Author

astrojhgu commented Mar 29, 2018

If I always keep N nearest visited nodes in my implementation of the BestCandidate trait, would it be guaranteed that after the find_nearest_custom function return, the N remaining nodes are the the nearest N's among all the nodes?

The reason why I'm curious about this question is that tree-like structure can trim unnecessary searching branches to speedup the search. What if the trimmed branches contain nodes that is among the N nearest ones?

@kornelski
Copy link
Owner

You're right! There's one more trick you have to do:

best_candidate.distance()

has to return distance of the Nth farthest node, which will force it to search everything in that radius.

@gijs-s
Copy link
Contributor

gijs-s commented Oct 15, 2020

I found myself needing the same thing as described in this issue, I made a pull request adding examples on how to achieve this! See #3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants