fallback to {} for None metadata from Chroma #1714
* master: (68 commits) hotfix (langchain-ai#1742) Harrison/move docs (langchain-ai#1741) move docs (langchain-ai#1740) bump version to 114 (langchain-ai#1739) Harrison/latex splitter (langchain-ai#1738) Harrison/blackboard loader (langchain-ai#1737) docs: add docs link to agent toolkits (langchain-ai#1735) fix: agent json parser fails with text in suffix (langchain-ai#1734) Harrison/official method (langchain-ai#1728) Sagemaker Endpoint LLM (langchain-ai#1686) adding new agent types in comments (langchain-ai#1711) (OpenAI) Add model_name to LLMResult.llm_output (langchain-ai#1713) Fix all the bug in init Tool in docs (langchain-ai#1725) Bump duckdb-engine to 0.7.0 (langchain-ai#1726) Add HTML document_loader that includes page title metadata (langchain-ai#1720) fix async in agent (langchain-ai#1723) pydantic/json parsing (langchain-ai#1722) Loosen PyYAML dependency (langchain-ai#1698) Adding ability to `return_pl_id` to all PromptLayer Models in LangChain (langchain-ai#1699) fallback to {} for None metadata from Chroma (langchain-ai#1714) ...
Hey there - I still seem to be getting this error when I'm using Chroma with a pandas DataFrame and the DataFrame loader. The error happens when I run `docsearch = Chroma.from_documents(texts, embeddings)`, which fails with: `Expected metadata value to be a str, int, or float, got None`. If I use a DataFrame with just one column it works, but then I don't have any metadata. If I have other columns and specify one as the content column and the others as metadata, my documents show the metadata, but running the above command still gives me the error. Any ideas?
@apremjee8 I have the same problem. Have you solved it yet?
I have solved it. I read the source code: the value of every key in your metadata must not be None. You can preprocess your documents to handle any None values before passing them to Chroma.
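For anyone landing here, a minimal sketch of that kind of preprocessing follows. It is not the commenter's actual code: `clean_metadata` is a hypothetical helper name, and the sketch assumes each document exposes its metadata as a plain dict. It drops keys whose value is None and falls back to an empty dict when the metadata itself is None.

```python
from typing import Any, Dict, Optional


def clean_metadata(metadata: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of `metadata` without None values.

    Hypothetical helper, not part of LangChain or Chroma.
    """
    if metadata is None:
        # No metadata at all: fall back to an empty dict.
        return {}
    # Keep only the keys that carry an actual value.
    return {key: value for key, value in metadata.items() if value is not None}


# Plain dicts stand in for document metadata here.
example = {"source": "https://example.com", "language": None}
cleaned = clean_metadata(example)
```

Assuming `texts` is a list of documents with a mutable `metadata` attribute, something like `doc.metadata = clean_metadata(doc.metadata)` for each `doc` before calling `Chroma.from_documents(texts, embeddings)` should avoid the error. Replacing None with a placeholder string instead of dropping the key is an equally reasonable choice if the key needs to stay filterable.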
Can you clarify how you fixed this? Did you change the metadata manually? I've been trying to figure out how to fix it for the last couple of days and just haven't had any luck.
I'm still getting this error as well when I use the example given in the LangChain documentation on a website that doesn't provide a language attribute:
The first member of the dict is:
And the None value in the `language` key that is returned after splitting generates the same error.
@Merdaneth
@jeffchuber sure I can. But I shouldn't get back a data structure that produces invalid JSON in the metadata from the document/content loader functionality of langchain in the first place. For this use case I solved it like this:
The basic vector store example started breaking because `Document` requires a non-`None` value for metadata, but Chroma stores metadata as `None` if none is provided. This creates a fallback which fixes the basic tutorial: https://langchain.readthedocs.io/en/latest/modules/indexes/examples/vectorstores.html

Here is the error that was generated