OA Retrieval System Proposal #3058
Comments
Also added a POC I had done: REALM-encoded Wikipedia data
Hey, just reposting the demo from Discord where I tried out @kenhktsui's POC, using plugins, as another POC.
I can also contribute here.
I can contribute here.
(I have unassigned myself in favor of umbra-scientia since GitHub has a 10-person assignment limit.)
I think the LLM can decide for itself whether it wants to issue a retrieval query. To train the model to use it, we could add some flags in the data labeling interface, like "requires information retrieval". The disadvantage is that it has to be clear to the labelers what knowledge can be expected from the model and what it should look up.
I think the query may very well just be the original user message: a good vector DB with good embeddings should already return the text snippets most relevant to that query, ranked by similarity. So I would say use as many retrieved text snippets as the context length allows, leaving some room for the reply. Obviously that information wouldn't be included in the chat context, only the final answer. I have some questions about the general plan:
Anyway, I have some experience setting up things like this. I'd be honored to contribute!
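A minimal sketch of the prompt-packing idea above, assuming snippets already arrive ranked by similarity and a Hugging Face-style tokenizer; the budget numbers and prompt format are placeholders, not OA's actual inference setup:

```python
# Sketch: fill the prompt with as many retrieved snippets as fit,
# leaving head-room in the context window for the model's reply.
# `tokenizer` is assumed to have an `encode` method returning token ids;
# `snippets` is assumed to be sorted best-first by similarity.

def build_prompt(user_message, snippets, tokenizer,
                 context_length=2048, reply_budget=512):
    budget = context_length - reply_budget - len(tokenizer.encode(user_message))
    selected = []
    for snippet in snippets:
        cost = len(tokenizer.encode(snippet))
        if cost > budget:
            break  # stop at the first snippet that no longer fits; keep only top-ranked ones
        selected.append(snippet)
        budget -= cost
    # The retrieved text only grounds this one answer; it is not kept
    # in the chat history afterwards (as discussed above).
    context = "\n\n".join(selected)
    return f"Relevant information:\n{context}\n\nUser: {user_message}\nAssistant:"
```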
Adding one more consideration here: there are (at least) three ways of incorporating retrieval into an LLM, with different degrees of coupling.
I think there is a fourth approach, kind of a blend of 1 and 3, as presented in the RETRO paper (https://arxiv.org/abs/2112.04426):
Meeting minutes:
We all agreed to spend another week on paper reading, Discord chats, and exploring small tasks. @kpoeppel will share some papers on retrieval. Please comment if I missed something.
Another way to incorporate retrieval, sort of an upgrade to 3, is to take a pretrained non-retrieval LLM and fine-tune it with retrieval augmentation by simply adding retrieved documents to the input. You can either use a pretrained retriever like RETRO does, co-train a retriever like REALM does during this fine-tuning stage, or use a nonparametric retriever like BM25, which works surprisingly well. This method was introduced by a very recent paper, though I'm blanking on the name. Hopefully someone will be able to identify it.
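A rough sketch of the nonparametric (BM25) variant mentioned above, using the rank_bm25 package purely as an illustration; the corpus, chunking, and prompt format are assumptions, not a confirmed part of the proposal:

```python
# Sketch: augment fine-tuning examples by prepending BM25-retrieved documents.
# Requires `pip install rank-bm25`; corpus and formatting are illustrative only.
from rank_bm25 import BM25Okapi

corpus = [
    "Paris is the capital and most populous city of France.",
    "The Eiffel Tower was completed in 1889.",
    "Mount Everest is Earth's highest mountain above sea level.",
]
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])

def augment_example(question, answer, k=2):
    # Retrieve the top-k documents for the question and prepend them,
    # so the model learns to condition its answer on retrieved text.
    docs = bm25.get_top_n(question.lower().split(), corpus, n=k)
    prompt = "\n".join(docs) + f"\n\nQuestion: {question}\nAnswer:"
    return {"input": prompt, "target": answer}

print(augment_example("What is the capital of France?", "Paris."))
```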
Meeting minutes:
Embedding Method Team
Prompt-Injection Team: no updates yet
Some clarification: after those first experiments we can later extrapolate to llama-30B.
High Level OA Retrieval System
Goal
Options available
Use a professional vector DB in which we index documents based on embeddings, for example all of Wikipedia:
Segment the data into chunks (sentences/paragraphs)
Generate embeddings for each chunk
Store the embeddings for retrieval (FAISS, etc.)
When presented with a query, retrieve related chunks from the DB using some metric, for example cosine similarity
Prompt the LLM with the query + retrieved chunks to generate the answer (see the sketch after these steps)
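For concreteness, a small end-to-end sketch of the steps above using sentence-transformers and FAISS; the model name, chunking strategy, and prompt template are assumptions for illustration, not decisions made in this issue:

```python
# Sketch of the pipeline above: chunk -> embed -> index -> retrieve -> prompt.
# Requires `pip install sentence-transformers faiss-cpu`; model choice is illustrative.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

documents = ["...long Wikipedia article...", "...another article..."]
chunks = [c for doc in documents for c in doc.split("\n\n")]  # naive paragraph chunking

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(chunks, normalize_embeddings=True)  # unit-length vectors

# With normalized vectors, inner product equals cosine similarity.
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype="float32"))

def retrieve(query, k=3):
    q = model.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [chunks[i] for i in ids[0] if i != -1]  # -1 means "no result"

query = "Who designed the Eiffel Tower?"
prompt = "\n\n".join(retrieve(query)) + f"\n\nQuestion: {query}\nAnswer:"
```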
https://paperswithcode.com/dataset/beir
Is LangChain being considered?
LlamaIndex?
VectorDB(s) under consideration
Benchmarks: http://ann-benchmarks.com/ ?
Drawbacks:
Design or Workflow
Overall there are some similarities between retrieval and OA plugins (i.e., in the simplest case, retrieval could be a plugin). The retrieval system will be a bit more closely integrated with the inference system to keep the assistant's knowledge easily updatable.
Need to come to a consensus on the workflow
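One way to keep that plugin-vs-integrated question open while the workflow is decided, sketched below as a hypothetical interface (not an existing OA API): the inference side only depends on a small retriever protocol, so the same code path could be backed by a plugin or by a tightly integrated vector DB.

```python
# Hypothetical sketch: inference code only sees this protocol, so retrieval
# can be swapped between a plugin-backed and a built-in implementation.
from typing import Protocol

class Retriever(Protocol):
    def retrieve(self, query: str, k: int = 5) -> list[str]:
        """Return up to k text chunks relevant to the query."""
        ...

class PluginRetriever:
    """Would call an external retrieval plugin endpoint (endpoint is illustrative)."""
    def __init__(self, endpoint: str):
        self.endpoint = endpoint
    def retrieve(self, query: str, k: int = 5) -> list[str]:
        raise NotImplementedError("issue an HTTP request to the plugin here")

class VectorDBRetriever:
    """Would query an in-process vector index (e.g. FAISS) owned by the inference service."""
    def __init__(self, index, chunks):
        self.index, self.chunks = index, chunks
    def retrieve(self, query: str, k: int = 5) -> list[str]:
        raise NotImplementedError("embed the query and search the index here")
```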
Other design thoughts
There are two schools of thought for this system:
vs.
the use case of retrieval-based models is mostly the knowledge-seeking mode
Open questions
Timeline for First Version
TBD