langchain & chroma - Basic Example #13191

Closed
4entertainment opened this issue Nov 10, 2023 · 3 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: chroma Primarily related to ChromaDB integrations Ɑ: embeddings Related to text embedding models module 🤖:question A specific question about the codebase, product, project, or how to use a feature Ɑ: vector store Related to vector store module

Comments

@4entertainment

Issue you'd like to raise.

No response

Suggestion:

# import
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.document_loaders import TextLoader
from silly import no_ssl_verification
from langchain.embeddings.huggingface import HuggingFaceEmbeddings

with no_ssl_verification():
    # load the document and split it into chunks
    loader = TextLoader("state_of_the_union.txt")
    documents = loader.load()

    # split it into chunks
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    docs = text_splitter.split_documents(documents)

    # create the open-source embedding function
    embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
    # hfemb = HuggingFaceEmbeddings()
    # load it into Chroma
    db = Chroma.from_documents(docs, embedding_function)

    # query it
    query = "What did the president say about Ketanji Brown Jackson"
    docs = db.similarity_search(query)

    # print results
    print(docs[0].page_content)

I use this code, but I get the following error:
ValueError: Expected EmbeddingFunction.__call__ to have the following signature: odict_keys(['self', 'input']), got odict_keys(['self', 'args', 'kwargs'])
Please see https://docs.trychroma.com/embeddings for details of the EmbeddingFunction interface.
Please note the recent change to the EmbeddingFunction interface: https://docs.trychroma.com/migration#migration-to-0416---november-7-2023

Can you give me fixed code and explain what was wrong?


dosubot bot commented Nov 10, 2023

🤖

Hello @4entertainment,

It's nice to see you again! I appreciate you reaching out with your issue.

I'm currently looking into it and will provide a detailed response shortly.

@dosubot dosubot bot added Ɑ: embeddings Related to text embedding models module Ɑ: vector store Related to vector store module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🤖:question A specific question about the codebase, product, project, or how to use a feature 🔌: chroma Primarily related to ChromaDB integrations labels Nov 10, 2023

dosubot bot commented Nov 10, 2023

🤖

Based on the error message, the issue appears to be caused by the recent change to the EmbeddingFunction interface in Chroma version 0.4.16. Similar solved issues in the LangChain repository suggest updating LangChain to version 0.0.332, which is compatible with Chroma version 0.4.16.

Since the latest version of LangChain is now v0.0.333, I recommend updating to that version instead. You can do this by running the command pip install -U langchain==0.0.333 chromadb.
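
To confirm which versions are actually installed before and after the upgrade, a minimal sketch such as the following may help. It uses only the standard library. The helper names installed_version and as_tuple are made up for this illustration, and the 0.0.332 threshold is taken from this comment rather than independently verified.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

# Threshold quoted in this thread (an assumption, not verified against release notes).
MIN_LANGCHAIN = (0, 0, 332)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def as_tuple(v: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '0.4.16' into (0, 4, 16).

    Non-numeric parts (for example 'rc1') are skipped, so this is only a
    rough comparison aid, not a full version parser.
    """
    return tuple(int(part) for part in v.split(".") if part.isdigit())


for name in ("langchain", "chromadb"):
    print(name, installed_version(name) or "not installed")

langchain_version = installed_version("langchain")
if langchain_version is not None and as_tuple(langchain_version) < MIN_LANGCHAIN:
    print("langchain is older than the version suggested in this thread; consider upgrading")
```

If the printed langchain version is older than the one suggested here while chromadb is 0.4.16 or newer, that mismatch is the likely trigger for the ValueError.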

If the issue persists after upgrading, it might be related to the local embeddings setup. In that case, it is worth checking whether a custom OpenAI embeddings endpoint works with the new SDK on its own. If the problem turns out to be specific to GPT4AllEmbeddings, please submit a new issue for GPT4AllEmbeddings.

Regarding the correct signature for the EmbeddingFunction.__call__ method, I wasn't able to find an answer within the repository. However, according to the error message, the method is expected to have the parameters ['self', 'input'], but the one being passed has ['self', 'args', 'kwargs']. This suggests that the method should take a single parameter named input after self, instead of *args and **kwargs.
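
The difference between the two signatures can be illustrated with a small standard-library sketch. The classes below are hypothetical stand-ins, not Chroma's or LangChain's actual code. The only real API assumed is that a LangChain embeddings object exposes embed_documents(texts). The helper call_parameter_names mimics the kind of parameter-name comparison the error message reports, but it is not Chroma's validation logic.

```python
import inspect


class OldStyleEmbedding:
    """Hypothetical wrapper with a catch-all signature, as named in the error."""

    def __call__(self, *args, **kwargs):
        return [[0.0] for _ in args[0]]


class FakeLangChainEmbeddings:
    """Stand-in for a LangChain embeddings object (only embed_documents is used)."""

    def embed_documents(self, texts):
        # Toy embedding: one number per text, its length.
        return [[float(len(t))] for t in texts]


class ChromaStyleAdapter:
    """Hypothetical adapter whose __call__ takes exactly one parameter, `input`."""

    def __init__(self, embeddings):
        self._embeddings = embeddings

    def __call__(self, input):
        return self._embeddings.embed_documents(list(input))


def call_parameter_names(obj_or_cls):
    """List the parameter names of __call__ as defined on the class."""
    cls = obj_or_cls if inspect.isclass(obj_or_cls) else type(obj_or_cls)
    return list(inspect.signature(cls.__call__).parameters)


adapter = ChromaStyleAdapter(FakeLangChainEmbeddings())
print(call_parameter_names(OldStyleEmbedding))  # the rejected shape
print(call_parameter_names(adapter))            # the expected shape
print(adapter(["hello", "hi"]))
```

With a compatible LangChain release, writing such an adapter by hand should not be necessary; the sketch only shows what shape of callable the error message is asking for.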

I hope this helps! If you have any more questions or if the issue persists, please let me know.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.


dosubot bot commented Feb 9, 2024

Hi, @4entertainment,

I'm helping the LangChain team manage their backlog and am marking this issue as stale. From what I understand, you encountered a ValueError about the expected signature of EmbeddingFunction.__call__, caused by a recent change to the EmbeddingFunction interface. Dosubot responded by suggesting an update to the latest version of LangChain (0.0.333) and provided the commands for updating. The response also mentioned potential issues with the local embeddings setup and discussed the correct signature for the EmbeddingFunction.__call__ method, along with links to relevant issues and code references.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Feb 9, 2024
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Feb 16, 2024
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Feb 16, 2024