
[Feature request] Chat with docs, i.e. Retrieval-based QA etc #17

Closed
fkcptlst opened this issue Apr 28, 2023 · 6 comments

Comments

@fkcptlst

I like the user interactive design of this project. Is it possible to combine retrieval based QA with the user-friendly interaction of this project?

@fkcptlst fkcptlst changed the title [Feature request] Chat with docs, i.e. QA etc [Feature request] Chat with docs, i.e. Retrieval-based QA etc Apr 28, 2023
@freedmand
Owner

Thanks! I currently have no plans to add a chat-based layer of functionality on top of Semantra.

My primary motivation for building Semantra was to see how useful an interface you could make without using any chat completions on top of embeddings. I have experimented with semantic search applications that use chat LLMs, and I've found that no matter what you try, there is a risk they will a) fabricate plausible information, b) produce truthful information that draws on outside knowledge rather than the provided context, or c) produce truly unexpected responses (e.g. via prompt injection). On top of that, they are slow, computationally expensive, and cost significantly more to use than embeddings.

I do want to understand more, though: what do you get from a chatbot experience that you feel isn't being met by the current interface of Semantra, which shows relevant search results and even highlights relevant phrases within each result?

@fkcptlst
Author

Thanks for the reply. I'm also experimenting with LangChain's retrieval QA. It turns out that a very long prompt is needed to avoid the hallucination problem, which is very costly.
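
For reference, a minimal sketch of the kind of LangChain retrieval-QA setup I mean, assuming the 2023-era langchain API, an OpenAI key, and FAISS as a stand-in vector store (the document chunks and question are placeholders):

```python
# Minimal LangChain retrieval-QA sketch (2023-era API).
# Assumes OPENAI_API_KEY is set in the environment.
from langchain.chains import RetrievalQA
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import FAISS

# Index a few placeholder document chunks.
texts = ["chunk one ...", "chunk two ...", "chunk three ..."]
store = FAISS.from_texts(texts, OpenAIEmbeddings())

# "stuff" packs every retrieved chunk into one (potentially very long)
# prompt, which is where the cost comes from.
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0),
    chain_type="stuff",
    retriever=store.as_retriever(),
)
print(qa.run("your question here"))
```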

Semantra's semantic search functionality is good enough for "data acquisition" purposes, which is great. But if I want insights from an LLM (which may sound silly, but LLMs sometimes do provide useful insights), I still have to copy-paste the relevant paragraphs.

Like you said, retrieval QA can be very costly, so I suppose it isn't worth adding for now.

@hazxone

hazxone commented Jun 4, 2023

@freedmand
I've been experimenting in the semantic search area with most of the new LLM models. Here are my observations related to the issues you highlighted.

  1. Hallucination / fabricating information not guided by context:

  • This can be solved by using the right model and prompt.

  • Some models will always answer regardless of whether the question is related to the context or not; they just ignore the context.

  • Some models you have to prompt in the right way, e.g.:

    ```
    Instruction:

    lorem ipsum ....

    Query:
    ```

    With this prompt, the model will just answer without using the context.

    ```
    Instruction:

    [Context] lorem ipsum .... [/Context]

    Query:
    ```

    With this prompt, the exact same model will say no if there is no answer in the context.

A much better way that I found is to do a mini CoT (chain of thought) with two-stage questioning.
First, we ask the model whether it is possible to answer the question given the following context (just a yes or no answer).
If yes, we then answer the question; if no, we terminate the request. This gives fewer hallucinated results.
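
For illustration, a minimal sketch of that two-stage check; `complete()` is a placeholder for whatever LLM completion call you already use (the function name and prompt wording are illustrative, not from any specific library):

```python
# Two-stage questioning to reduce hallucination: first gate on
# answerability, then answer only if the gate says yes.
def complete(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM of choice")

def answer_with_check(context: str, question: str) -> str:
    # Stage 1: ask only whether the context can answer the question.
    gate = complete(
        "Answer strictly yes or no. Can the following question be answered "
        f"using only this context?\n\n[Context] {context} [/Context]\n\n"
        f"Question: {question}\nAnswer:"
    )
    if not gate.strip().lower().startswith("yes"):
        return "Not answerable from the given context."
    # Stage 2: only now ask for the actual answer, still bound to the context.
    return complete(
        "Instruction: answer using only the context below.\n\n"
        f"[Context] {context} [/Context]\n\nQuery: {question}\nAnswer:"
    )
```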

  2. Regarding slow and expensive computation:
    Bigger parameter counts aren't always better in terms of semantic search (answering within context). There are <7B models that can correctly answer QA where a 40B model cannot. My observation is that it's more about the training data than the size; answering from context requires instruction-following more than chatting.
    We can solve this by deploying a small local LLM (see the sketch after this list).

  3. Regarding "what do you get from a chatbot experience that isn't being met by the current interface of Semantra":

  • Initially, I also thought a good search is enough without having any bot to answer the question. But when I use multiple documents, it quickly becomes confusing for the user.
  • For example, if I'm indexing a local law PDF and I ask, "What is the maximum jail time for stealing a car?", many paragraphs are returned by the search, and the user needs to read them one by one and understand the context of each. Sure, eventually the user will get the answer, but it takes a long time.
  • But if we have a bot that answers the question first, the user just needs to verify that the answer is correct from the highlighted passage, which is a much faster and more efficient way to resolve any question. Digesting and understanding all the context is handled by the bot; the human just needs to validate the answer.
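
To make point 2 concrete, here is the promised sketch of running context-bound QA on a small local model via llama-cpp-python (the model path is a placeholder; any small instruction-tuned model works):

```python
# Context-bound QA on a small local model via llama-cpp-python.
from llama_cpp import Llama

# Placeholder path; point this at any small instruction-tuned model file.
llm = Llama(model_path="./models/small-instruct.gguf", n_ctx=2048)

prompt = (
    "Instruction: answer using only the context below.\n\n"
    "[Context] lorem ipsum .... [/Context]\n\n"
    "Query: What is the maximum jail time for stealing a car?\nAnswer:"
)
out = llm(prompt, max_tokens=128, stop=["Query:", "[Context]"])
print(out["choices"][0]["text"].strip())
```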

@yych42
Contributor

yych42 commented Jun 4, 2023

Given the way Semantra is designed, I think it should be possible to build on top of its API? Then this project could focus on providing better research results across a wide range of documents, and a separate project could build on top of it to provide the chat interface.
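
As a sketch of what such a layered project could look like (the endpoint, port, and response shape below are hypothetical stand-ins, not Semantra's actual API):

```python
# Hypothetical chat layer over a Semantra-style search API.
# The /api/query endpoint and response fields are illustrative only;
# they are NOT Semantra's documented API.
import requests

def search(query: str, base_url: str = "http://localhost:8080") -> list[dict]:
    resp = requests.get(f"{base_url}/api/query", params={"q": query})
    resp.raise_for_status()
    # Assumed shape: [{"text": ..., "score": ...}, ...]
    return resp.json()["results"]

def chat_answer(query: str, llm_complete) -> str:
    # Feed the top search hits to whatever LLM completion function you use.
    hits = search(query)[:5]
    context = "\n\n".join(hit["text"] for hit in hits)
    return llm_complete(
        f"[Context] {context} [/Context]\n\nQuery: {query}\nAnswer:"
    )
```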

@hazxone Regarding the case with legal documents, and the potentially numerous and confusing results for the user: I feel the right step forward is to build/fine-tune better embedding models that can deal with these kinds of queries. My tests so far suggest that whatever might confuse the user will confuse the LLM even more. Our team got much better results for legal documents after putting each clause in association with its section and chapter, and then embedding each clause with that metadata one by one instead of relying on arbitrary length cutoffs (e.g., Chapter 2. Natural Person - Section 1. Civil Rights - Clause X. Lorem ipsum…)
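
Roughly, the idea in code, assuming sentence-transformers (the clause strings and model name are illustrative):

```python
# Embed each clause together with its chapter/section metadata,
# instead of using arbitrary fixed-length cutoffs.
from sentence_transformers import SentenceTransformer

clauses = [
    ("Chapter 2. Natural Person", "Section 1. Civil Rights",
     "Clause X. Lorem ipsum ..."),
    ("Chapter 2. Natural Person", "Section 1. Civil Rights",
     "Clause Y. Dolor sit amet ..."),
]

# Prefix each clause with its hierarchy so the chapter/section context
# is part of what gets embedded.
texts = [f"{chapter} - {section} - {clause}"
         for chapter, section, clause in clauses]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(texts)  # one vector per clause
```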

@freedmand
Copy link
Owner

I think that's exactly right: with a stable API, adding any integrations, including chatbots, would be very easy. With v0.2 (which is under development but may take some time) I want Semantra to be easy to use as an application, a backend API server, and even an importable Python library with a simple interface.

Interesting to hear the use case of splitting text. I'm also looking into being able to define plug-ins to cover a variety of splitting cases. Embeddings are often sensitive to the length of the input, meaning that sometimes short queries match short texts well, so it's not always good to have tons of chunks at different sizes (arbitrary equivalent-sized chunks thus work well as a default). But having a way to delineate sub-sections within documents that should always be cut off and never be part of overlapping chunks would certainly be useful for improving result quality.
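
A rough sketch of that splitting behavior, with overlapping fixed-size chunks that never cross a hard sub-section boundary (illustrative only, not Semantra's internals):

```python
# Fixed-size overlapping chunks that never cross hard section boundaries.
def chunk_sections(sections: list[str],
                   size: int = 500, overlap: int = 100) -> list[str]:
    assert overlap < size
    chunks = []
    for section in sections:  # each section is a hard boundary
        start = 0
        while start < len(section):
            chunks.append(section[start:start + size])
            start += size - overlap  # overlap only within the section
    return chunks

chunks = chunk_sections(["Section one text ...", "Section two text ..."])
```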

@hazxone

hazxone commented Jun 5, 2023

Thank you for the answer. Understood, and I agree that the chatbot should be separate from the main program.
Just a suggestion: it would be nice to have one small button in the main UI for curious users, so they can trigger the API to see what the AI's answer is without having to launch a separate script/program.

@yych42 Great idea on separating the document by logical section. Did you separate it manually? Because that would be difficult to automate, i.e. to find the right splitting pattern for general documents.

Also (an idea for v0.2): apart from documents, I think searching semantically through bookmarks/webpages would be quite useful.
Say the user indexes a few Medium posts and Reddit threads, and a few months later they'd like to find an important paragraph from all their bookmarks. As of right now, they can only search the page titles or search again in Google (which will most likely give totally different, updated results).
