Self-loops in NN graph #12

SamGRosen · 2024-03-21T02:06:09Z

With R 4.2.2 and rnndescent 0.1.4, the returned NN graph contains self-loops: each point is listed as its own first neighbor. Is this by design, for computational reasons, or a bug? Does it affect performance if the returned graphs are used as the init argument in subsequent calls to nnd_knn?
I could always call the functions with $k+1$ neighbors and remove the first column, but I'm not sure whether this would lead to a noticeable difference in speed or recall.

Ex:

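library(rnndescent)

test <- matrix(rnorm(200), nrow = 20)  # 20 points in 10 dimensions
brute <- brute_force_knn(test, 4)
approx <- nnd_knn(test, 4)
# In both results, the first neighbor of every point is the point itself
brute$idx[, 1] #  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
approx$idx[, 1] #  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20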

I see now that this is also the behavior of pynndescent. However, could you elaborate on how best to handle this for init arguments: should I keep the loops or remove them?

jlmelville · 2024-03-21T04:14:41Z

Yes, this is by design. I also found this very confusing initially -- I even opened an issue about it in the UMAP repo (lmcinnes/umap#53) many years ago, but it turns out it's not uncommon to define nearest neighbors this way.

If you need $k$ distinct non-self neighbors, the best way forward is always to ask for $k + 1$ neighbors. As for init, I don't think it really matters: nothing in the code relies on self-loops being absent, and the self neighbor is assumed to be just as valid an item in a neighbor list as any other. For example, with k = 15, having only 14 "real" neighbors to initialize with rather than 15 probably doesn't affect the behavior of the algorithm much, since most candidates are either already seen or aren't moving the search in the right direction.
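
As a rough illustration of the "ask for $k + 1$ and drop the self neighbor" approach, here is a minimal base-R sketch. `drop_self_neighbors` is a hypothetical helper, not part of rnndescent. It assumes only that the graph is a list with `idx` and `dist` matrices, as in the example above. It looks for the self index in each row rather than blindly removing the first column, on the assumption that approximate results or duplicate points might not place the self neighbor first.

```r
# Hypothetical helper (not part of rnndescent): given a graph with k + 1
# columns, remove each point's self neighbor, leaving k columns.
drop_self_neighbors <- function(graph) {
  n <- nrow(graph$idx)
  k <- ncol(graph$idx) - 1
  idx <- matrix(0L, nrow = n, ncol = k)
  dist <- matrix(0, nrow = n, ncol = k)
  for (i in seq_len(n)) {
    self_pos <- which(graph$idx[i, ] == i)
    # Assumption: if the self neighbor is missing from a row, drop the
    # farthest (last) neighbor instead so every row still has k entries.
    drop <- if (length(self_pos) > 0) self_pos[1] else k + 1
    idx[i, ] <- graph$idx[i, -drop]
    dist[i, ] <- graph$dist[i, -drop]
  }
  list(idx = idx, dist = dist)
}

# Toy 3-point graph standing in for a (k + 1)-neighbor result with k = 2
toy <- list(
  idx = matrix(c(1L, 2L, 3L,
                 2L, 1L, 3L,
                 3L, 2L, 1L), nrow = 3, byrow = TRUE),
  dist = matrix(c(0, 1.0, 2.0,
                  0, 1.0, 1.5,
                  0, 1.5, 2.0), nrow = 3, byrow = TRUE)
)
trimmed <- drop_self_neighbors(toy)
```

With rnndescent loaded, the same helper could presumably be applied as `drop_self_neighbors(nnd_knn(test, 5))` to obtain 4 non-self neighbors per point.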
