feat: faiss filter from list #6537

Merged 2 commits on Jun 21, 2023

HenriZuber
Contributor

Feature

While using FAISS for a RetrievalQA task, I wanted to restrict retrieval to several sources at once. As I understand it, the filter feature takes a dict of the form {key: value} and keeps only documents whose metadata holds exactly that value for that key.
I added logic so that the value can also be a list, in which case a document matches when its metadata value is any item in the list. Passing a single exact value still works.
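
For readers who want to see the matching rule spelled out, here is a minimal sketch in plain Python. It is not the code merged in this PR: `metadata_matches` is a hypothetical helper, and the assumption that a missing key counts as a non-match is mine.

```python
from typing import Any, Dict


def metadata_matches(metadata: Dict[str, Any], filter_dict: Dict[str, Any]) -> bool:
    """Return True if the metadata satisfies every key in the filter.

    Hypothetical helper for illustration only; not the merged implementation.
    """
    for key, allowed in filter_dict.items():
        # Treat a single value as a one-element list so both forms share one path.
        allowed_values = allowed if isinstance(allowed, list) else [allowed]
        # Assumption: a document without the key does not match.
        if metadata.get(key) not in allowed_values:
            return False
    return True


docs_metadata = [
    {"source": "source_pdfs/file_A.pdf"},
    {"source": "source_pdfs/file_B.pdf"},
    {"source": "source_pdfs/file_C.pdf"},
]
list_filter = {"source": ["source_pdfs/file_A.pdf", "source_pdfs/file_B.pdf"]}
exact_filter = {"source": "source_pdfs/file_C.pdf"}

# The list filter keeps file_A and file_B; the exact filter keeps only file_C.
kept_by_list = [m for m in docs_metadata if metadata_matches(m, list_filter)]
kept_by_exact = [m for m in docs_metadata if metadata_matches(m, exact_filter)]
```

Normalizing the scalar case to a one-element list keeps a single code path, which is one way to make sure the old exact-match behavior is unchanged.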

Here's an example of how I could then use it in my own project:

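    # Restrict retrieval to chunks that came from either of these PDFs.
    pdfs_to_filter_in = ["file_A", "file_B"]
    filter_dict = {
        "source": [f"source_pdfs/{pdf_name}.pdf" for pdf_name in pdfs_to_filter_in]
    }
    # `db` is an existing FAISS vector store.
    retriever = db.as_retriever()
    retriever.search_kwargs = {"filter": filter_dict}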

I added an integration test, test_faiss_with_metadatas_and_list_filter(), modeled on the existing tests in tests/integration_tests/vectorstores/test_faiss.py.
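
As a rough idea of the cases such a test could cover, the sketch below checks a list filter, a single-value filter, and a filter with no match. It does not reproduce the actual integration test: it runs against a stand-in in-memory filter rather than FAISS, and `FakeDocument`, `filter_documents`, the sample texts, and the `page` metadata key are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a stored document; not a LangChain class."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def filter_documents(docs: List[FakeDocument], filter_dict: Dict[str, Any]) -> List[FakeDocument]:
    """Keep documents whose metadata satisfies every key (list = any-of, scalar = exact)."""

    def accepts(doc: FakeDocument) -> bool:
        return all(
            doc.metadata.get(key) in (value if isinstance(value, list) else [value])
            for key, value in filter_dict.items()
        )

    return [doc for doc in docs if accepts(doc)]


def check_list_filter_behaviour() -> bool:
    docs = [FakeDocument(text, {"page": i}) for i, text in enumerate(["foo", "bar", "baz"])]
    # A list value accepts any of the listed pages.
    assert [d.page_content for d in filter_documents(docs, {"page": [0, 2]})] == ["foo", "baz"]
    # A single value still behaves as an exact match.
    assert [d.page_content for d in filter_documents(docs, {"page": 1})] == ["bar"]
    # A list with no matching value returns nothing.
    assert filter_documents(docs, {"page": [7]}) == []
    return True


result = check_list_filter_behaviour()
```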

It doesn't feel like this is worthy of its own notebook or doc, but I'm open to suggestions if needed.

Who can review?

Tag maintainers/contributors who might be interested:

VectorStores related: @dev2049

dev2049 added the lgtm label (PR looks good; use to confirm that a PR is ready for merging) on Jun 21, 2023

Contributor

dev2049 commented Jun 21, 2023

thanks @HenriZuber!

dev2049 merged commit e0605b4 into langchain-ai:master on Jun 21, 2023
tconkling added a commit to tconkling/langchain that referenced this pull request Jun 22, 2023
* master:
  MD header text splitter returns Documents (langchain-ai#6571)
  Fix callback forwarding in async plan method for OpenAI function agent (langchain-ai#6584)
  bump 209 (langchain-ai#6593)
  Clarifai integration (langchain-ai#5954)
  Add missing word in comment (langchain-ai#6587)
  Add AzureML endpoint LLM wrapper (langchain-ai#6580)
  Add OpenLLM wrapper(langchain-ai#6578)
  feat: interfaces for async embeddings, implement async openai (langchain-ai#6563)
  Upgrade the version of AwaDB and add some new interfaces (langchain-ai#6565)
  add motherduck docs (langchain-ai#6572)
  Detailed using the Twilio tool to send messages with 3rd party apps incl. WhatsApp (langchain-ai#6562)
  Change Data Loader Namespace (langchain-ai#6568)
  Remove duplicate databricks entries in ecosystem integrations (langchain-ai#6569)
  Fix whatsappchatloader - enable parsing new datetime format on WhatsApp chat (langchain-ai#6555)
  Wait for all futures (langchain-ai#6554)
  feat: faiss filter from list (langchain-ai#6537)
  update pr tmpl (langchain-ai#6552)
  Remove unintended double negation in docstring (langchain-ai#6541)
  Minor Grammar Fixes in Docs and Comments (langchain-ai#6536)