similarity Search Issue #2225

mohitraj · 2023-03-31T11:19:55Z

We are using Chroma to store our records as vectors. When we run a query, the documents returned by the similarity search are not the most relevant ones.
c1 = Chroma('langchain', embedding, persist_directory)
qa = ChatVectorDBChain(vectorstore=c1, combine_docs_chain=doc_chain, question_generator=question_generator, top_k_docs_for_context=12, return_source_documents=True)

What is the solution to get accurate results?

ghost · 2023-03-31T12:10:06Z

You can tune the chunk size and overlap parameters when splitting the text and see whether that improves accuracy. In my case it actually worked.
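
For anyone unsure what those two parameters control, here is a minimal sketch of fixed-size chunking with overlap in plain Python. The function name `split_with_overlap` and the sample values are made up for illustration; this is not LangChain's splitter, whose text splitters expose the same idea through `chunk_size` and `chunk_overlap`.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character chunks where consecutive chunks share chunk_overlap characters.

    Hypothetical helper for illustration only; real splitters usually also
    try to break on separators such as newlines or sentence boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    # Arbitrary example values, not a recommendation.
    for chunk in split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2):
        print(chunk)
```

Smaller chunks tend to make each embedding more specific, while overlap keeps a sentence that straddles a boundary from being cut off in both neighbours. Which values work best depends on the documents and the embedding model, so it is worth trying a few.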

mohitraj · 2023-03-31T12:41:28Z

What chunk size and overlap values did you use?

khimaros · 2023-04-21T17:50:28Z

For me, when using LlamaCppEmbedding, tuning chunk size and overlap did not help. The results come back in almost the reverse of the expected order, with the best matches almost dead last.
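
One way to check whether the ordering really is inverted is to rank the documents yourself from the raw embedding vectors and compare that ranking with what the store returns. The sketch below assumes you can obtain the query and document embeddings as plain lists of floats; the helper names are hypothetical and nothing here calls Chroma or LangChain. It also does not diagnose the cause: if the store reports distances rather than similarities, smaller values mean closer matches, which is worth ruling out before blaming the embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec: list[float], doc_vecs: list[list[float]]) -> list[tuple[int, float]]:
    """Return (document index, similarity) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy vectors standing in for real embeddings.
    query = [1.0, 0.0]
    docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    for index, score in rank_by_similarity(query, docs):
        print(index, round(score, 3))
```

If this manual ranking looks sensible but the store's order is the opposite, the problem is likely in how scores are interpreted or sorted; if the manual ranking is also poor, the embeddings themselves are the more likely suspect.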

dosubot · 2023-09-10T16:04:26Z

Hi, @mohitraj! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue you raised is about the return documents from a similarity search using Chroma not giving accurate results. In the comments, there were suggestions to try different chunk sizes and overlapping parameters, but it seems that these parameters did not help in improving the accuracy of the search. Unfortunately, there doesn't appear to be a resolution to this issue at the moment.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LangChain project. If you have any further questions or concerns, please don't hesitate to reach out.

sergerdn mentioned this issue Apr 13, 2023

Complete testing for Vector Stores #2816

Closed

dosubot bot added the stale label Sep 10, 2023

dosubot bot closed this as not planned Sep 18, 2023

dosubot bot removed the stale label Sep 18, 2023

dosubot bot mentioned this issue Oct 15, 2023

Issue: Similarity Search on Chroma does not retrieve relevant chunk for homogeneous document search #11815

Closed

dosubot bot mentioned this issue Oct 26, 2023

Issue: Similarity_search on Vector Store does not work. #12326

Closed
