Question about recall calculation #519

oleg0x · 2024-05-14T14:05:12Z

Hello,
Could you please clarify how recall is calculated when a search returns 10 identical vectors, each of which is the closest to the query? Is it possible that the recall (from the get_recall_values function?) would be 100% in that case?

maumueller · 2024-05-15T11:57:27Z

We look at the distances of the reported points. Every point that is within the distance (+ small epsilon) of the k-th nearest neighbor is considered a correct result.
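To make the rule above concrete, here is a minimal sketch of a distance-threshold recall for a single query. It is not the project's actual code: the function name `threshold_recall`, its argument layout, and the default `epsilon=1e-3` are assumptions made for illustration only.

```python
def threshold_recall(true_dists, reported_dists, k, epsilon=1e-3):
    """Fraction of the k reported points that count as correct.

    true_dists:     ground-truth distances, sorted ascending (assumed layout)
    reported_dists: distances of the points the algorithm returned
    epsilon:        small tolerance; the value here is an assumption
    """
    # Distance of the k-th true nearest neighbor, plus the tolerance.
    threshold = true_dists[k - 1] + epsilon
    # A reported point is correct if it lies within the threshold.
    hits = sum(1 for d in reported_dists[:k] if d <= threshold)
    return hits / k


# Two of the three reported points fall within the 3rd true distance.
example = threshold_recall([0.1, 0.2, 0.3], [0.1, 0.2, 0.9], k=3)
```

Note that this check looks only at distances, not at which points were returned.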

oleg0x · 2024-05-16T08:49:06Z

Suppose for k=10 an algorithm returns 10 copies of the same point which is the closest to a query. Will the recall be 100% or 10% in this case?

maumueller · 2024-05-16T10:23:22Z

Now I get what you mean. As far as I can see, we do not check whether an algorithm returns duplicated points. It would get 100% recall by just returning the nearest neighbor 10 times.
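The effect described here can be reproduced with a small sketch. Both functions below are hypothetical and written only for this illustration; they are not taken from the benchmark code. The second one shows one possible way to close the gap, by counting each returned index at most once, which is not necessarily how the project handles it.

```python
def recall_distance_only(reported, threshold, k):
    """Count every reported (index, distance) pair within the threshold."""
    hits = sum(1 for _, dist in reported[:k] if dist <= threshold)
    return hits / k


def recall_unique_indexes(reported, threshold, k):
    """Same check, but each index contributes at most one hit."""
    seen = set()
    hits = 0
    for index, dist in reported[:k]:
        if index in seen:
            continue  # duplicate of an already counted point
        seen.add(index)
        if dist <= threshold:
            hits += 1
    return hits / k


# An algorithm that returns the nearest neighbor (index 7) ten times.
reported = [(7, 0.1)] * 10
naive = recall_distance_only(reported, threshold=0.5, k=10)    # 1.0
strict = recall_unique_indexes(reported, threshold=0.5, k=10)  # 0.1
```

With the distance-only check the duplicated answer scores 100%, while the uniqueness-aware variant scores 10%.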

oleg0x · 2024-05-22T11:10:12Z

OK, thanks. But this looks like a loophole for developers of ANNS algorithms :). If someone fixes it, the results may change a little.

maumueller closed this as completed May 15, 2024

maumueller mentioned this issue May 22, 2024

Assert that indexes are unique. #526

Merged
