Confusions about REPLUG #40

Closed
richhh520 opened this issue Feb 21, 2024 · 2 comments

Comments

@richhh520

  1. How is the retriever of REPLUG trained? How are the query and document embeddings updated?
  2. In your replug_parallel_reader.ipynb, why do you directly import the default BM25Retriever rather than a trained retriever?
  3. What does PromptModel do, and what is ReplugHFLocalInvocationLayer for? This part does not seem to be mentioned in the paper.

Thanks for your help!

@danielfleischer
Contributor

Hi, thanks for the questions. Here we focused more on the LLM side of the REPLUG paper, where documents can be fed to the model in parallel. In general, documents can be retrieved with any method; to keep the notebook simple, we used an in-memory document store with BM25 retrieval and a re-ranker based on an off-the-shelf sentence transformer. We don't have training code for the embedders here; we might add it in the future.
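For reference, a minimal sketch of the kind of setup described above (in-memory BM25 retrieval followed by a sentence-transformer re-ranker) using the Haystack v1 API. The model name and documents are placeholders, not the exact ones used in the notebook.

```python
# Minimal sketch (not the notebook's exact code): in-memory BM25 retrieval
# followed by a cross-encoder re-ranker, using the Haystack v1 API.
from haystack import Pipeline
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import BM25Retriever, SentenceTransformersRanker
from haystack.schema import Document

# Toy documents; in practice these come from your corpus.
document_store = InMemoryDocumentStore(use_bm25=True)
document_store.write_documents([
    Document(content="REPLUG feeds retrieved documents to a frozen LM in parallel."),
    Document(content="BM25 is a sparse lexical retrieval method."),
])

retriever = BM25Retriever(document_store=document_store, top_k=10)
# Off-the-shelf cross-encoder; the model name here is a placeholder choice.
ranker = SentenceTransformersRanker(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-6-v2", top_k=5
)

pipeline = Pipeline()
pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipeline.add_node(component=ranker, name="Ranker", inputs=["Retriever"])

result = pipeline.run(query="How does REPLUG use retrieved documents?")
print([d.content for d in result["documents"]])
```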

The PromptModel is an abstraction around LLMs: you provide a model name and it generates text from a prompt, abstracting away whether it is a local model, the hardware specifics, or even a cloud-based service. The API is part of Haystack v1, which we use for the development of our library.
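A short sketch of that pattern, assuming ReplugHFLocalInvocationLayer is the REPLUG-specific invocation layer plugged into PromptModel; the import path and model name below are assumptions for illustration, so adjust them to the actual module in this repository.

```python
# Sketch of the Haystack v1 PromptModel/PromptNode pattern. The fastRAG import
# path below is an assumption about where ReplugHFLocalInvocationLayer lives;
# adjust it to the actual module in this repository.
from haystack.nodes import PromptModel, PromptNode

from fastrag.prompters.invocation_layers import ReplugHFLocalInvocationLayer  # assumed path

# PromptModel hides whether the model runs locally, on specific hardware,
# or behind a cloud API; a custom invocation layer controls how it is called.
prompt_model = PromptModel(
    model_name_or_path="facebook/opt-350m",  # placeholder model name
    invocation_layer_class=ReplugHFLocalInvocationLayer,
)

prompt_node = PromptNode(model_name_or_path=prompt_model)
print(prompt_node("Answer the question given the documents: ..."))
```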

@kerkathy

Thanks for the work! I'm also looking forward to the implementation of the REPLUG training :)
