[FR] RAG: Add support for Int8 embeddings #118

Open
svilupp opened this issue Apr 3, 2024 · 0 comments
svilupp (Owner) commented Apr 3, 2024

It would be great to support embeddings compressed to Int8, as described in HuggingFace's "Embedding Quantization" write-up.

A potential implementation:

  • Define an embedder (<:AbstractEmbedder for get_embeddings) and the corresponding finder (<:AbstractSimilarityFinder for find_similar).
  • Both types would carry min_values and max_values vector fields holding the effective range of each embedding dimension (e.g., length(min_values) == length(max_values) == D).
  • Define the methods for these types.
  • The conversion to Int8 could be done post hoc (after build_index) via a utility function. It would return the converted index together with a finder that holds the ranges needed to quantize query embeddings to Int8 (the finder is then provided to airag).
  • It should implement a two-stage search with rescore_multiplier=4: first retrieve candidates using Int8 embeddings only, then rescore them with the Float query against the Int8 documents.
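
To make the steps above concrete, here is a minimal, language-agnostic sketch written in Python for illustration only. The names `fit_ranges`, `quantize_int8`, and `find_similar_int8` are hypothetical and not part of any existing API, and the linear per-dimension mapping onto [-128, 127] is an assumption (one common form of scalar quantization), not necessarily the scheme the final implementation would use.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def fit_ranges(embeddings: List[Vector]) -> Tuple[List[float], List[float]]:
    """Per-dimension min and max (each of length D) over the indexed embeddings."""
    dims = list(zip(*embeddings))  # transpose: one tuple of values per dimension
    return [min(d) for d in dims], [max(d) for d in dims]


def quantize_int8(vec: Vector, min_values: Vector, max_values: Vector) -> List[int]:
    """Map each dimension linearly from [min, max] onto the Int8 range [-128, 127]."""
    out = []
    for x, lo, hi in zip(vec, min_values, max_values):
        span = hi - lo
        if span == 0.0:  # constant dimension: no information to encode
            out.append(0)
            continue
        q = round((x - lo) / span * 255) - 128
        out.append(max(-128, min(127, q)))  # clamp values outside the fitted range
    return out


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def find_similar_int8(query: Vector, docs_int8: List[List[int]],
                      min_values: Vector, max_values: Vector,
                      top_k: int = 1, rescore_multiplier: int = 4) -> List[int]:
    """Two-stage search; returns the indices of the top_k documents."""
    # Stage 1: Int8 query x Int8 documents, keep top_k * rescore_multiplier candidates.
    q8 = quantize_int8(query, min_values, max_values)
    n_candidates = min(len(docs_int8), top_k * rescore_multiplier)
    candidates = sorted(range(len(docs_int8)),
                        key=lambda i: dot(q8, docs_int8[i]),
                        reverse=True)[:n_candidates]
    # Stage 2: Float query x Int8 documents, on the candidates only.
    rescored = sorted(candidates, key=lambda i: dot(query, docs_int8[i]), reverse=True)
    return rescored[:top_k]


# Post hoc conversion of an already built (toy) index with D = 2.
docs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
min_values, max_values = fit_ranges(docs)
docs_int8 = [quantize_int8(d, min_values, max_values) for d in docs]
print(find_similar_int8([0.9, 0.1], docs_int8, min_values, max_values, top_k=2))  # → [0, 1]
```

Storing `min_values` and `max_values` alongside the finder is what allows a query to be quantized with the same ranges as the documents in stage 1. Stage 2 reuses the original Float query, so it needs no extra document storage beyond the Int8 vectors.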
svilupp added the RAG label Apr 10, 2024