Skip to content

feat: FAISS in OpenSearch: Support HNSW for cosine #3217

Merged
tstadel merged 58 commits intomainfrom
os_faiss_cosine
Sep 23, 2022
Merged

feat: FAISS in OpenSearch: Support HNSW for cosine #3217
tstadel merged 58 commits intomainfrom
os_faiss_cosine

Conversation

@tstadel
Copy link
Member

@tstadel tstadel commented Sep 14, 2022

Related Issues

Proposed Changes:

  • normalize vectors manually at index time
  • normalize vectors manually at query time
  • use dot_product space under the hood
  • make normalization more efficient by using two-dimensional numpy array instead of normalizing each vector separately
  • run cosine tests for all document stores
  • run all document_store integration tests (not only OpenSearch)
  • fix test_faiss_and_milvus.py tests not running

pushed to another PR:

  • streamline validation of embedding shape across all document stores (BaseDocumentStore_validate_embeddings_shape)
  • introduce DenseRetriever abstraction to facilitate usage of retrievers in document store's update_embeddings

How did you test it?

  • made test test_cosine_similarity of test_faiss_and_milvus.py generic to run on all document stores

Notes for the reviewer

Checklist

@tstadel tstadel requested review from a team as code owners September 14, 2022 16:01
@tstadel tstadel requested review from bogdankostic and removed request for a team September 14, 2022 16:01
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@tstadel tstadel marked this pull request as draft September 21, 2022 09:47
@tstadel tstadel marked this pull request as ready for review September 21, 2022 09:47
Copy link
Contributor

@bogdankostic bogdankostic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seem to be some conflicts, otherwise LGTM! Also, I added a comment wrt #3102

# global ef_search setting affects only nmslib, for faiss it is set in the field mapping
if self.index_type == "hnsw":
if self.knn_engine == "nmslib" and self.index_type == "hnsw":
index_definition["settings"]["index"]["knn.algo_param.ef_search"] = 20
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably out of scope for this PR, just a reminder that at some point we should probably allow to set custom values for ef_search (see also #3102)

@tstadel tstadel requested a review from agnieszka-m September 22, 2022 21:31
@tstadel tstadel merged commit 05a86b9 into main Sep 23, 2022
@tstadel tstadel deleted the os_faiss_cosine branch September 23, 2022 11:26
@masci masci mentioned this pull request Oct 4, 2022
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

FAISS in OpenSearch: Support HNSW for cosine

3 participants