
Distance-weighted random sampling of non-neighbor negatives #67

Open
sheffe opened this issue Apr 27, 2020 · 2 comments

@sheffe
Contributor

sheffe commented Apr 27, 2020

Not a fully-baked feature request, just a directional hunch. I've found the conclusions from this paper Sampling Matters in Deep Embedding Learning pretty intuitive -- (1) the method for choosing negative samples is critical to the overall embedding, maybe more than the specific loss function, and (2) a distance-weighted sampling of negatives had some nice properties during training and better results compared to uniform random sampling or oversampling hard cases.

I'm brand-new to Annoy and not confident about the implementation details or performance implications here, but I suspect that the prebuilt index could be used for both positive and negative sampling. An example: the current approach draws random negatives in sequence and chooses the first index not in a neighbor list. A distance-weighted approach for choosing a negative for each triplet might work like this:

  • Draw a random set of candidate negatives
  • Drop any candidate negatives already in the neighbor list
  • Choose from the remaining set of candidates with probabilities proportional to a weight f(dist(i, j)), where f(dist) could be 1/dist, 1/sqrt(dist), etc

Annoy gives us the dist(i, j) without much of a performance hit. Weighted choice of the candidate negatives puts a (tunable) thumb on the scale for triplets that contain closer/harder-negative matches.

This idea probably adds some hyperparameter-selection headaches. I think the impactful choices here are the size of the initial set of candidate negatives and (especially) f(dist).
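The steps above could be sketched roughly as follows. This is illustrative only, not ivis code: `dist_fn` stands in for a pairwise-distance lookup (e.g. Annoy's `get_distance`), and the function and parameter names are made up for the sketch.

```python
import random

def sample_weighted_negative(anchor, neighbors, n_items, dist_fn,
                             n_candidates=10, weight_fn=lambda d: 1.0 / d):
    """Pick one negative for `anchor`, weighting closer (harder) candidates more.

    All names here are hypothetical; `dist_fn(i, j)` stands in for a
    distance lookup such as Annoy's get_distance.
    """
    excluded = set(neighbors) | {anchor}
    # Draw a random set of candidate negatives, dropping any already
    # in the neighbor list (or the anchor itself).
    candidates = [j for j in random.sample(range(n_items), n_candidates)
                  if j not in excluded]
    if not candidates:
        return None  # caller could redraw with a larger candidate set
    # Weight each candidate by f(dist); clamp distance to avoid division
    # by zero for exact duplicates.
    weights = [weight_fn(max(dist_fn(anchor, j), 1e-12)) for j in candidates]
    # random.choices performs a weighted draw (weights need not sum to 1).
    return random.choices(candidates, weights=weights, k=1)[0]
```

With `weight_fn=lambda d: 1.0 / d` this favors near (hard) negatives; a gentler choice like `1/sqrt(d)` would shade back toward uniform sampling, which is exactly the tunable thumb on the scale described above.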

@Szubie
Collaborator

Szubie commented Apr 28, 2020

Hi, thanks for the feedback and the link to the paper, it was a good read!

I like the idea and think it would be interesting to try out and see how it impacts the results of the embeddings. The idea of weighting the sampling of triplets did occur to me back when I was initially writing ivis, but I was thinking about weighting the selection of positive examples rather than negatives at that time, as it would be almost free. Weighting the selection of neighbors never made it past initial prototyping though, since it didn't have a positive impact.

From the paper you linked, it does sound like weighting the sampling of negative examples may well be worth trying, though. Hopefully the performance impact is negligible, since Annoy is pretty efficient, but we'll keep an eye on it. It would be nice to keep the number of hyperparameters low, but if there are clear benefits from weighted sampling it shouldn't be a problem to expand the number slightly.

@idroz
Collaborator

idroz commented Apr 28, 2020

I think this would be worth trying, provided we can benchmark this feature against current performance. We have some benchmarking code that was used to assess L1/L2 distances between points in embedding and original spaces, e.g. https://bering-ivis.readthedocs.io/en/latest/comparisons.html#quantitative-evaluation

I can make that notebook available in the ivis repo and we could use it with a new feature branch.
