langchain.vectorstores.Chroma support for EmbeddingFunction.__call__ update of ChromaDB #13441

Mshardul · 2023-11-16T04:32:31Z

System Info

This worked fine with my previous configuration:
langchain v0.0.225
chromadb v0.4.7

Now neither that configuration nor the latest versions of both packages work:
langchain v0.0.336
chromadb v0.4.17

Who can help?

No response

Information

The official example notebooks/scripts
My own modified scripts

Related Components

Reproduction

I have the packages installed and am running either of these pieces of code:

from langchain.document_loaders import TextLoader
from langchain.indexes import VectorstoreIndexCreator

loader = TextLoader(file_path)
index = VectorstoreIndexCreator().from_loaders([loader])  # this is where I am getting the error

OR

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings

text_splitter = RecursiveCharacterTextSplitter()
splits = text_splitter.split_documents(docs)
embedding = OpenAIEmbeddings()
vectordb = Chroma.from_documents(  # this is where I am getting the error
    documents=splits,
    embedding=embedding,
)

Here is the error

Expected EmbeddingFunction.__call__ to have the following signature: odict_keys(['self', 'input']), got odict_keys(['self', 'args', 'kwargs'])\nPlease see https://docs.trychroma.com/embeddings for details of the EmbeddingFunction interface.\nPlease note the recent change to the EmbeddingFunction interface: https://docs.trychroma.com/migration#migration-to-0416---november-7-2023 \n
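For readers trying to interpret this message: the two `odict_keys` lists look like the parameter names of a `__call__` method. The sketch below reproduces both lists with the standard `inspect` module. It is only an illustration of that kind of check, not ChromaDB's actual validation code, and the class and helper names (`NewStyleEmbedder`, `OldStyleEmbedder`, `call_parameter_names`, `has_expected_signature`) are hypothetical.

```python
import inspect


class NewStyleEmbedder:
    """Has the single `input` parameter named in the error message."""

    def __call__(self, input):
        return [[0.0] for _ in input]


class OldStyleEmbedder:
    """Accepts *args/**kwargs, like the rejected signature in the error."""

    def __call__(self, *args, **kwargs):
        return []


def call_parameter_names(cls):
    """Return the parameter names of cls.__call__, including `self`."""
    return list(inspect.signature(cls.__call__).parameters.keys())


def has_expected_signature(cls):
    """Hypothetical helper: True if __call__ takes exactly (self, input)."""
    return call_parameter_names(cls) == ["self", "input"]


print(call_parameter_names(NewStyleEmbedder))  # ['self', 'input']
print(call_parameter_names(OldStyleEmbedder))  # ['self', 'args', 'kwargs']
```

The second list matches the "got" half of the error, which suggests the object handed to Chroma exposes a generic `*args, **kwargs` call signature rather than one with a parameter named `input`.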

Expected behavior

Earlier a chromadb instance would be created, and I would be able to query it with my prompts. That is the expected behaviour.


dosubot · 2023-11-16T04:32:59Z

🤖

Hello @Mshardul,

Nice to meet you! I'm a bot designed to assist with issues, answer questions, and help you become a contributor to the LangChain project. While we wait for a human maintainer to join us, I'm here to start the troubleshooting process with you. Please hold tight while I take a look at your issue and come up with a detailed response.

dosubot · 2023-11-16T04:38:11Z

🤖

The error you're encountering is likely due to a recent change in the ChromaDB library. The EmbeddingFunction.__call__ method in LangChain v0.0.336 might not be compatible with the updated signature in ChromaDB v0.4.17.

Based on the similar issues found in the LangChain repository, there are a few potential solutions:

1. Downgrade the Chroma library to a previous version (0.4.15) that is compatible with LangChain v0.0.336. This can be done by adding chromadb==0.4.15 to the requirements file or installing it directly using pip: pip install chromadb==0.4.15.
2. Update both LangChain and Chroma to their latest versions using pip install -U langchain chromadb. According to the issues found, the compatibility issue has been fixed in LangChain v0.0.332 and above.
3. If you're still encountering the issue after updating, you could try using the Chroma.add_documents() method instead of Chroma.from_documents(). This was suggested in the issue titled "Bug after the openai updated in Embedding".

Unfortunately, without the exact source code of OpenAIEmbeddings and ChromaDB's EmbeddingFunction.__call__, I can't provide a more specific solution. If none of the above solutions work, could you please provide more details about the OpenAIEmbeddings class and the exact error message you're seeing?

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.


umair313 · 2023-11-16T11:16:42Z

@Mshardul What is your Python version?
It's working for me on Python 3.9.18.

Janaksunuwar · 2023-11-16T18:14:26Z

These version combinations worked for me.

!pip install langchain==0.0.316
!pip install openai==0.28.1
!pip install chromadb==0.4.15
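The chromadb pin above sits just below 0.4.16, the release named in the migration link inside the error message. A minimal sketch for checking which side of that boundary a version falls on; `parse_version`, `uses_new_embedding_interface`, and `installed_chromadb_version` are hypothetical helper names, and the parser assumes plain dotted numeric versions such as "0.4.17".

```python
from importlib import metadata

# Boundary taken from the migration link in the error message (0.4.16).
BOUNDARY = (0, 4, 16)


def parse_version(version):
    """Turn '0.4.17' into (0, 4, 17); assumes purely numeric dotted parts."""
    return tuple(int(part) for part in version.split(".")[:3])


def uses_new_embedding_interface(version):
    """True if the version is at or after the 0.4.16 boundary."""
    return parse_version(version) >= BOUNDARY


def installed_chromadb_version():
    """Return the installed chromadb version string, or None if absent."""
    try:
        return metadata.version("chromadb")
    except metadata.PackageNotFoundError:
        return None


for candidate in ("0.4.7", "0.4.15", "0.4.17"):
    print(candidate, uses_new_embedding_interface(candidate))
```

Calling `installed_chromadb_version()` in the failing environment and passing the result to `uses_new_embedding_interface` would show whether the new signature check is likely in play.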

DarkTeddie · 2023-11-20T20:57:05Z

Issue Resolution: Embedding Function Override for GPT4All/OpenAi with Chroma Update Compatibility

Hello,

I noticed your query about an issue with OpenAIEmbeddings and wanted to share my solution, as I faced a similar problem with GPT4All. The root of the issue lies in the incompatibility between LangChain's embedding function implementation and the new requirements introduced by Chroma's latest update.

Problem Identified:

LangChain's embedding function lacks the __call__ method, which is now required by Chroma.
This problem is also present in LangChain's OpenAI implementation.

Relevant Documentation and Source Code:

Chroma Embedding Functions: Chroma Documentation
GPT4All in Langchain: GPT4All Source Code
OpenAI in Langchain: OpenAI Source Code

Solution Implemented:
I resolved this by creating a custom embedding function that inherits from the existing GPT4AllEmbeddings class and adds the __call__ method. This method returns the result of the embed_documents method.

Here's a snippet of the custom class implementation:

from typing import List


# ExistingEmbeddingFunction is a placeholder for the LangChain embeddings class
# being extended (GPT4AllEmbeddings in the description above).
class CustomEmbeddingFunction(ExistingEmbeddingFunction):
    def __init__(self):
        super().__init__()

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """
        Embed a list of documents using the EmbeddingModel.

        Args:
            texts: List of texts to embed.

        Returns:
            List of embeddings, one for each text.
        """
        embeddings = [self.client.embed(str(text)) for text in texts]
        return [list(map(float, e)) for e in embeddings]

    def __call__(self, input: List[str]) -> List[List[float]]:
        # Chroma calls the embedding function with a single `input` argument.
        return self.embed_documents(input)
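The same idea can also be sketched with composition instead of inheritance, which avoids depending on any particular base class. This is only an illustration under stated assumptions: `CallableEmbeddingAdapter` and `FakeEmbeddings` are hypothetical names, and `FakeEmbeddings` stands in for any object that exposes an `embed_documents(texts)` method.

```python
from typing import List, Sequence


class FakeEmbeddings:
    """Stand-in for a LangChain-style embeddings object (illustration only)."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Toy embedding: one dimension holding the text length.
        return [[float(len(text))] for text in texts]


class CallableEmbeddingAdapter:
    """Hypothetical adapter exposing __call__(self, input) over embed_documents."""

    def __init__(self, embeddings) -> None:
        self._embeddings = embeddings

    def __call__(self, input: Sequence[str]) -> List[List[float]]:
        # The parameter is named `input` to match the signature in the error message.
        return self._embeddings.embed_documents(list(input))


adapter = CallableEmbeddingAdapter(FakeEmbeddings())
vectors = adapter(["hello", "chroma"])
print(vectors)  # [[5.0], [6.0]]
```

Whether Chroma accepts such a wrapper without further changes is untested here; the sketch only shows the shape of a callable whose `__call__` takes a single `input` argument.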

I hope this solution works for you as well.

deeeed · 2023-12-04T14:40:25Z

+1 would be great to have a fix on that

dosubot bot added the labels Ɑ: embeddings, Ɑ: vector store, 🤖:bug, and 🔌: chroma on Nov 16, 2023

dosubot bot added the stale label on Mar 11, 2024

dosubot bot closed this as not planned on Mar 18, 2024

dosubot bot removed the stale label on Mar 18, 2024
