Does Faiss distance function support cosine distance? #593

Closed
2 of 4 tasks
Qiaogx opened this issue Sep 18, 2018 · 15 comments
Comments

@Qiaogx

Qiaogx commented Sep 18, 2018

Currently, I see that Faiss supports L2 distance and inner product. My question is whether the Faiss distance functions also support cosine distance.

Thanks.
Gary

Summary

Platform

OS:

Faiss version:

Faiss compilation options:

Running on:

  • CPU
  • GPU

Interface:

  • C++
  • Python

Reproduction instructions

@Enet4
Contributor

Enet4 commented Sep 18, 2018

This seems like a duplicate of #95. Cosine similarity is obtained with the inner product after normalizing all vectors to unit norm.
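
As a minimal illustration of that statement, the sketch below checks in plain Python (no Faiss involved) that the inner product of two unit-normalized vectors equals their cosine similarity. The helper names `dot`, `normalize`, and `cosine_similarity` and the sample vectors are made up for this example.

```python
import math

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Return v scaled to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Textbook cosine similarity: <a, b> / (|a| * |b|)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0]
b = [4.0, 3.0]

direct = cosine_similarity(a, b)
via_inner_product = dot(normalize(a), normalize(b))

# Both routes agree up to floating-point rounding.
print(direct, via_inner_product)
```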

@Qiaogx
Author

Qiaogx commented Sep 18, 2018

Thanks Eduardo!

Does that mean that if I want to use cosine similarity, I need to convert both the query vectors and the centroid vectors to unit vectors beforehand?

@Enet4
Contributor

Enet4 commented Sep 18, 2018

> Does that mean that if I want to use cosine similarity, I need to convert both the query vectors and the centroid vectors to unit vectors beforehand?

That is correct.

@Qiaogx
Author

Qiaogx commented Sep 18, 2018

Thanks for your quick reply. I will try to add a cosine distance option to Faiss, since the input vectors are sometimes not unit vectors and converting them beforehand on the CPU costs more time.

@mdouze
Contributor

mdouze commented Sep 18, 2018

@Qiaogx please don't spend effort on that. Anything you can do will be more expensive than normalizing the vectors.

@Qiaogx
Author

Qiaogx commented Sep 18, 2018

Hi Matthijs,

When using cosine similarity, besides normalizing the query and centroid vectors, are there any other factors I need to pay attention to? I normalized the vectors and set l2Distance_ to False in faiss/gpu/impl/FlatIndex.cu, but it did not work.

Thanks.

@liqima

liqima commented Sep 22, 2018

> Hi Matthijs,
>
> When using cosine similarity, besides normalizing the query and centroid vectors, are there any other factors I need to pay attention to? I normalized the vectors and set l2Distance_ to False in faiss/gpu/impl/FlatIndex.cu, but it did not work.
>
> Thanks.

This works:

quantizer = faiss.IndexFlatIP(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_INNER_PRODUCT)

@Qiaogx
Author

Qiaogx commented Sep 28, 2018

Thank you, @liqima. However, it seems that IndexIVFPQ does not fully support inner product.

@mdouze
Contributor

mdouze commented Oct 1, 2018

Inner product with IndexIVFPQ is fully supported on CPU; if it does not work, it is a bug.
Another note: for normalized vectors, L2 and inner product search are equivalent, so you can use L2 search.
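
The equivalence follows from expanding the squared L2 distance. Here $x$ and $y$ stand for any two unit-norm vectors (symbols introduced for this derivation only):

```latex
\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\langle x, y\rangle = 2 - 2\langle x, y\rangle
\quad \text{when } \|x\| = \|y\| = 1 .
```

The squared distance is a decreasing function of the inner product, so the smallest L2 distance and the largest inner product pick out the same neighbors.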

@mdouze
Contributor

mdouze commented Oct 23, 2018

No activity. Closing.

mdouze closed this as completed Oct 23, 2018

@ZQ-XPY

ZQ-XPY commented Jun 24, 2019

Hi @Qiaogx, can you tell me how you achieved cosine distance? By normalizing the vectors, or some other way?

@mdouze
Contributor

mdouze commented Jun 24, 2019

@gabrer

gabrer commented Jun 17, 2020

Note that there is now a FAQ entry for this question:
https://github.com/facebookresearch/faiss/wiki/FAQ#how-can-i-index-vectors-for-cosine-distance

This is the updated link:
https://github.com/facebookresearch/faiss/wiki/MetricType-and-distances#how-can-i-index-vectors-for-cosine-similarity

@mdouze
Contributor

mdouze commented Jun 17, 2020

Thanks! updated the link above to avoid confusion.

@thomasahle

Note that normalizing the vectors before k-means, and the cluster centers after k-means, is not quite the same as cosine-similarity k-means.
For the best performance, you should normalize the cluster centers after each step of Lloyd's algorithm.
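
A minimal pure-Python sketch of that idea (often called spherical k-means) follows. It is a generic illustration, not Faiss's implementation. The function name `spherical_kmeans`, its parameters, and the toy data are assumptions made for this example.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    # Leave an all-zero vector unchanged rather than dividing by zero.
    return list(v) if norm == 0.0 else [x / norm for x in v]

def spherical_kmeans(points, k, n_iter=20, seed=0):
    """Lloyd's algorithm with centers renormalized after every update step."""
    rng = random.Random(seed)
    points = [normalize(p) for p in points]
    centers = [list(p) for p in rng.sample(points, k)]
    assignment = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: for unit vectors, the highest inner product
        # is the highest cosine similarity.
        assignment = [
            max(range(k), key=lambda j: dot(p, centers[j])) for p in points
        ]
        # Update step: average the members, then project the mean back
        # onto the unit sphere.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if the cluster is empty
                mean = [sum(col) / len(members) for col in zip(*members)]
                centers[j] = normalize(mean)
    return centers, assignment

# Two loose groups of directions: roughly along x and roughly along y.
points = [[1.0, 0.1], [2.0, 0.0], [3.0, -0.2],
          [0.1, 1.0], [0.0, 4.0], [-0.1, 2.0]]
centers, assignment = spherical_kmeans(points, k=2)
print(centers, assignment)
```

The only change from ordinary k-means is the `normalize(mean)` call inside the loop, which keeps every intermediate center on the unit sphere.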
