Feature/ LanceDB integration #350

Closed
wants to merge 4 commits into from


@raghavdixit99 commented May 6, 2024

Hi,
I have added the code for the LanceDB vector store class. It supports:

  • init_collection
  • upsert / bulk upsert
  • filtered search (I have added metadata filtering support as well, but we should decide on a schema for the main PR)
  • filtered deletion
  • unique values in metadata

TODO: Code cleanup and minor fixes are pending, along with the final schema. Add docs wherever necessary.
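
To make the list above concrete, here is a minimal in-memory sketch of the operations it names. This is not the code from this PR and it does not use LanceDB; the class name `InMemoryVectorStore`, every method signature, and the flat `{key: value}` metadata shape are assumptions made purely for illustration, since the metadata schema is still undecided.

```python
import math
from typing import Any, Dict, Iterable, List, Optional, Tuple

# (id, vector, metadata); this entry shape is an assumption, not the r2r type.
Entry = Tuple[str, List[float], Dict[str, Any]]


class InMemoryVectorStore:
    """Hypothetical stand-in mirroring the operations listed above."""

    def __init__(self) -> None:
        self.dimension: Optional[int] = None
        self.rows: Dict[str, Tuple[List[float], Dict[str, Any]]] = {}

    def init_collection(self, dimension: int) -> None:
        # Fix the vector size and start from an empty collection.
        self.dimension = dimension
        self.rows = {}

    def upsert(self, entry: Entry) -> None:
        entry_id, vector, metadata = entry
        if self.dimension is None:
            raise ValueError("Call init_collection() first.")
        if len(vector) != self.dimension:
            raise ValueError("Vector has the wrong dimension.")
        # Insert a new row or overwrite the row with the same id.
        self.rows[entry_id] = (list(vector), dict(metadata))

    def upsert_entries(self, entries: Iterable[Entry]) -> None:
        # Bulk variant: one call for many entries.
        for entry in entries:
            self.upsert(entry)

    def search(
        self,
        query: List[float],
        filters: Optional[Dict[str, Any]] = None,
        limit: int = 10,
    ) -> List[str]:
        # Keep rows whose metadata equals every filter value, then rank
        # the survivors by Euclidean distance to the query vector.
        filters = filters or {}
        scored = [
            (math.dist(query, vector), entry_id)
            for entry_id, (vector, metadata) in self.rows.items()
            if all(metadata.get(k) == v for k, v in filters.items())
        ]
        return [entry_id for _, entry_id in sorted(scored)[:limit]]

    def filtered_deletion(self, key: str, value: Any) -> int:
        # Delete every row whose metadata[key] equals value.
        doomed = [i for i, (_, m) in self.rows.items() if m.get(key) == value]
        for entry_id in doomed:
            del self.rows[entry_id]
        return len(doomed)

    def unique_metadata_values(self, key: str) -> List[Any]:
        # Distinct values stored under one metadata key.
        values = {m[key] for _, m in self.rows.values() if key in m}
        return sorted(values, key=str)


store = InMemoryVectorStore()
store.init_collection(dimension=2)
store.upsert_entries(
    [
        ("a", [0.0, 0.0], {"user_id": "u1"}),
        ("b", [1.0, 1.0], {"user_id": "u2"}),
        ("c", [0.1, 0.0], {"user_id": "u2"}),
    ]
)
nearest_for_u2 = store.search([0.0, 0.0], filters={"user_id": "u2"}, limit=1)
```

A flat equality filter is the simplest schema that could work here; whatever the final PR settles on (nested fields, range predicates) would change `search` and `filtered_deletion` accordingly.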

Testing:

I tested this by adding an example: I added configs/local_ollama_lancedb.json and ran it via run_test_client.py.

I am facing an issue with bulk insert. I have commented out the code in r2r/main/app.py that I think is causing it. Storage happens in embedding.py via upsert_entries, but that is called sequentially, so no bulk insert actually happens, and the run() method doesn't accept a list; it accepts a single DocumentPage instance.
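
One possible shape for the change I have in mind is sketched below. It is only an illustration under assumptions: `batched` and `bulk_upsert` are hypothetical helper names, and I am assuming `upsert_entries` can be passed in as a callable that takes a list of entries. It does not reflect how the r2r pipeline is actually wired.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly shorter, batch.
        yield batch


def bulk_upsert(
    entries: Iterable[T],
    upsert_entries: Callable[[List[T]], None],
    batch_size: int = 100,
) -> int:
    """Call upsert_entries once per batch instead of once per entry.

    Returns the number of calls made, which is handy when checking that
    batching really happened.
    """
    calls = 0
    for batch in batched(entries, batch_size):
        upsert_entries(batch)
        calls += 1
    return calls


# Usage with a stand-in sink that just records what it receives.
received: List[List[int]] = []
number_of_calls = bulk_upsert(range(5), received.append, batch_size=2)
```

With something like this, the embedding step could collect entries for several pages and hand them to the store in one call per batch, rather than one call per page.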

If I am doing something wrong here, please let me know; otherwise I would need to request some changes in the r2r code.

Thanks


🚀 This description was created by Ellipsis for commit 74e9b5b

Summary:

Integrates LanceDB as a new vector database provider, enhancing the system's capabilities with metadata filtering and addressing bulk insert issues.

Key points:

  • Added LanceDB integration for vector storage and retrieval.
  • Modified configuration and core files to support LanceDB.
  • Addressed bulk insert issues and added metadata filtering.
  • Tested integration with example client and configurations.

Generated with ❤️ by ellipsis.dev


vercel bot commented May 6, 2024

@raghavdixit99 is attempting to deploy a commit to the Sciphi-Team Team on Vercel.

A member of the Team first needs to authorize it.

@raghavdixit99
Author

Hi @emrgnt-cmplxty, could you have a look at this dev PR? I will raise an official one once I get some clarity on the above as well as on the metadata schema.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment


❌ Changes requested.

  • Reviewed the entire pull request up to 74e9b5b
  • Looked at 945 lines of code in 11 files
  • Took 1 minute and 22 seconds to review
  • Skipped 1 files when reviewing.
  • Skipped posting 0 additional comments because they didn't meet confidence threshold of 50%.

Workflow ID: wflow_WSkosawfqu8bLiwr



r2r/vector_dbs/lancedb/base.py (outdated, resolved)
r2r/vector_dbs/lancedb/base.py (resolved)
r2r/vector_dbs/lancedb/base.py (resolved)
r2r/vector_dbs/lancedb/base.py (outdated, resolved)
raghavdixit99 and others added 2 commits May 6, 2024 03:15
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
@emrgnt-cmplxty emrgnt-cmplxty deleted the branch SciPhi-AI:dev May 30, 2024 00:49
@AyushExel

@emrgnt-cmplxty hey, can we get some action on this one?

3 participants