Question about recall calculation #519

Closed
oleg0x opened this issue May 14, 2024 · 4 comments

@oleg0x

oleg0x commented May 14, 2024

Hello,
Could you please clarify how recall is calculated when an algorithm's search returns 10 identical vectors, each of which is the closest to the query? Is it possible that in this case the recall (from the get_recall_values function?) will be 100%?

@maumueller
Collaborator

We look at the distances of the reported points. Every reported point whose distance to the query is at most the distance of the true k-th nearest neighbor (plus a small epsilon) is considered a correct result.
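
To make the rule above concrete, here is a minimal sketch of a distance-threshold recall for a single query. It is not the project's actual code: the function name `threshold_recall`, its parameters, and the epsilon default of 1e-3 are assumptions chosen for illustration.

```python
def threshold_recall(true_distances, reported_distances, k, epsilon=1e-3):
    """Hypothetical helper: fraction of the first k reported distances that
    are within the true k-th nearest-neighbor distance plus epsilon."""
    # Distance of the true k-th nearest neighbor, widened by epsilon.
    threshold = sorted(true_distances)[k - 1] + epsilon
    # Only distances are compared; point identities are never looked at.
    hits = sum(1 for d in reported_distances[:k] if d <= threshold)
    return hits / k


# Ground-truth distances for one query, and what an algorithm reported.
true_distances = [0.1, 0.2, 0.3, 0.4, 0.5]
reported = [0.1, 0.3, 0.9]

# With k=3 the threshold is 0.3 + epsilon, so two of the three reported
# distances count as correct.
print(threshold_recall(true_distances, reported, k=3))
```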

@oleg0x
Author

oleg0x commented May 16, 2024

Suppose for k=10 an algorithm returns 10 copies of the same point which is the closest to a query. Will the recall be 100% or 10% in this case?

@maumueller
Collaborator

Now I get what you mean. As far as I can see, we do not check whether an algorithm returns duplicated points. It would get 100% recall by just returning the nearest neighbor 10 times.
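
A small sketch of the effect, under the assumption that recall is computed from distances alone. Both functions, their names, and the `(point_id, distance)` result format are hypothetical and only illustrate the idea; the second one shows one possible way to count each point id at most once.

```python
def recall_by_distance(true_distances, reported_distances, k, epsilon=1e-3):
    """Hypothetical distance-only recall: duplicates are not detected."""
    threshold = sorted(true_distances)[k - 1] + epsilon
    return sum(1 for d in reported_distances[:k] if d <= threshold) / k


def recall_by_distance_unique(true_distances, reported, k, epsilon=1e-3):
    """Hypothetical variant: `reported` holds (point_id, distance) pairs
    and every point id is counted at most once."""
    threshold = sorted(true_distances)[k - 1] + epsilon
    seen = set()
    hits = 0
    for point_id, d in reported[:k]:
        if point_id not in seen and d <= threshold:
            hits += 1
        seen.add(point_id)
    return hits / k


k = 10
true_distances = [0.1 * i for i in range(1, 11)]
# The algorithm returns the true nearest neighbor (id 7, made up) ten times.
duplicates = [(7, 0.1)] * k

naive = recall_by_distance(true_distances, [d for _, d in duplicates], k)
deduped = recall_by_distance_unique(true_distances, duplicates, k)
print(naive, deduped)
```

In this toy case the distance-only version reports full recall, while the id-aware version credits only one of the ten slots.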

@oleg0x
Author

oleg0x commented May 22, 2024

OK, thanks. But now it seems this is a loophole for ANNS algorithm developers :). If someone fixes it, the results may change a little.
