Rework Dependencies: ship with barebones dependencies & bundle different features as extras #136

Open
bclavie opened this issue Feb 14, 2024 · 1 comment
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@bclavie
Owner

bclavie commented Feb 14, 2024

Putting this out there as a way to alleviate the many dependency issues. I'll soon be shipping a PLAID-free indexing method (also compression-free for now; compression will come later), which removes the need to run custom CUDA code or faiss when indexing small collections (anything up to ~2000 documents of 256 tokens each can still be queried in hundreds of milliseconds on CPU).
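
The issue does not say how the PLAID-free index will score documents. One plausible approach for collections this small, assumed here purely for illustration, is exhaustive ColBERT-style MaxSim scoring: for each query token embedding, take its best match among a document's token embeddings and sum those maxima. The function names and toy vectors below are hypothetical, and real embeddings would come from the model rather than being written by hand.

```python
# Minimal sketch of brute-force late-interaction (MaxSim) scoring.
# Assumption: this is NOT taken from RAGatouille; it only illustrates why a
# small collection can be searched without PLAID, faiss, or custom CUDA code.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim(query_vecs, doc_vecs):
    """Sum, over query tokens, of the best similarity to any document token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

def rank(query_vecs, docs):
    """Return document indices sorted from highest to lowest MaxSim score."""
    scores = [maxsim(query_vecs, doc) for doc in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

# Toy data: two query token vectors, two documents.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = [
    [[1.0, 0.0]],               # matches only the first query token
    [[1.0, 0.0], [0.0, 1.0]],   # matches both query tokens
]
best_first = rank(query, docs)
```

Every document is scored against every query, so the cost grows linearly with collection size; that is acceptable for a few thousand short documents but is exactly what PLAID-style indexes avoid at larger scale.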

Once this index has shipped, I am planning to overhaul dependencies, as I'm being told more and more that RAGatouille is making it into prod use cases, where the "full fat" default install is kind of annoying. This is where I'm currently at in terms of install variants:

ragatouille

Features: Search, In-memory encoding, uncompressed indexing
Deps:

  • colbert-ai (backbone)
  • srsly (serialisation)
  • torch (backbone) --> maybe optional

ragatouille[train]

REMOVE SENTENCE-TRANSFORMERS
Features: Training, hard negative mining
Additional deps:

  • voyager (fast dense retrieval for hard negative mining)

ragatouille[plaid-cpu]

Features: Plaid indexing on CPU
Additional Deps:

  • faiss-cpu
  • llama-index (for now, for chunking)

ragatouille[plaid-gpu]

Features: Plaid indexing on GPU
Additional Deps:

  • faiss-gpu
  • llama-index (for now, for chunking)

ragatouille[langchain]

Features: Allows export as langchain retriever
Additional Deps:

  • langchain
  • langchain_core

ragatouille[onnx]

Features: Allows ONNX format export (for Vespa)
Additional Deps:

  • onnx

ragatouille[all]

Features: Everything
Deps:

  • all of the above
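
The layout above could be expressed as a table of extras with `all` computed as the union of the others. This is a rough sketch only: the package names come from the lists above, while `BASE_DEPS`, `EXTRAS`, and `add_all_extra` are hypothetical names, and no version pins are shown because none have been decided.

```python
# Hypothetical sketch of the proposed split; not RAGatouille's actual packaging.

# Installed by a plain `pip install ragatouille`.
BASE_DEPS = ["colbert-ai", "srsly", "torch"]

# One entry per proposed extra, e.g. `pip install ragatouille[train]`.
EXTRAS = {
    "train": ["voyager"],
    "plaid-cpu": ["faiss-cpu", "llama-index"],
    "plaid-gpu": ["faiss-gpu", "llama-index"],
    "langchain": ["langchain", "langchain_core"],
    "onnx": ["onnx"],
}

def add_all_extra(extras):
    """Return a copy of `extras` plus an "all" key with the de-duplicated union."""
    merged = sorted({dep for deps in extras.values() for dep in deps})
    return {**extras, "all": merged}

EXTRAS_REQUIRE = add_all_extra(EXTRAS)

# With setuptools, these would be passed as:
#   setup(install_requires=BASE_DEPS, extras_require=EXTRAS_REQUIRE)
```

Taken literally, "all of the above" puts both `faiss-cpu` and `faiss-gpu` in the `all` extra. Both packages provide the `faiss` module, so a real `all` extra would probably need to pick one of them.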

Any feedback on this would be appreciated at this stage -- very early thoughts still! One big question is whether torch (which is required) should ship with the base version, or be optional to facilitate env compatibility.
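
If features move into extras (or torch becomes optional), the code paths behind them need to fail with a clear message when the extra is missing. A minimal sketch of such a guard follows; `require_extra` is a hypothetical helper, not an existing RAGatouille function, and it only handles top-level module names.

```python
import importlib.util

def require_extra(module_name, extra):
    """Raise a helpful ImportError if a top-level module is not installed.

    Assumption: `module_name` is a top-level name such as "onnx" or "faiss";
    dotted names are not handled by this sketch.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install ragatouille[{extra}]"
        )

def export_to_onnx():
    """Illustrative entry point that would live behind the onnx extra."""
    require_extra("onnx", "onnx")
    # The actual export logic would go here.
```

The check runs when the feature is called rather than at import time, so the base install stays importable without the optional packages.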

@phaistos

There are issues when adding ragatouille to a llama-index 0.10.x project. It pulls in a 0.9.x artifact and some of the core namespaces get confused, e.g. you can't import LLM from core anymore. Since llama-index doesn't seem integral to your project, perhaps you could bump the required version.
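
To confirm which llama-index line ended up in an environment, the installed version can be read from package metadata. The sketch below uses only the standard library; the helper names are hypothetical, and the 0.10 threshold reflects the versions mentioned in this comment rather than any documented compatibility rule.

```python
import re
from importlib import metadata

def parse_major_minor(version):
    """Extract (major, minor) from a version string like "0.9.48", or None."""
    match = re.match(r"(\d+)\.(\d+)", version)
    return (int(match.group(1)), int(match.group(2))) if match else None

def installed_major_minor(dist_name):
    """Return (major, minor) of an installed distribution, or None if absent."""
    try:
        return parse_major_minor(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None

def is_pre_0_10(version):
    """True if the version string is older than 0.10."""
    parsed = parse_major_minor(version)
    return parsed is not None and parsed < (0, 10)

# Example: report what is installed, if anything.
found = installed_major_minor("llama-index")
```

Here `found` is `None` when llama-index is not installed, and a tuple such as `(0, 9)` or `(0, 10)` otherwise.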
