Cross Encoder Recommendation for RAG #2640

Open
yildize opened this issue May 12, 2024 · 0 comments

Comments


yildize commented May 12, 2024

In your documentation (https://www.sbert.net/docs/pretrained_cross-encoders.html) I see many different cross-encoders trained on different datasets, such as MS MARCO, SQuAD, STS, NLI, and so on.

What would be your suggestion for an asymmetric retrieval-augmented generation pipeline (retrieving and reranking passages for given queries)?

  • Would it be the MS MARCO cross-encoders? Why not the STSbenchmark or SQuAD models?
  • What should I consider when choosing a model?

Also, why not train a cross-encoder on all (or at least several) of those datasets, just as you did for the "all" bi-encoders?
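
For context, what I have in mind is a standard retrieve-then-rerank setup, roughly like the sketch below (using the sentence-transformers `CrossEncoder` class, with the `cross-encoder/ms-marco-MiniLM-L-6-v2` checkpoint purely as an example; the passages would come from any first-stage retriever):

```python
from sentence_transformers import CrossEncoder

# Example checkpoint -- one of the MS MARCO rerankers listed in the docs.
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How does a cross-encoder differ from a bi-encoder?"

# Candidate passages returned by a first-stage retriever (bi-encoder, BM25, ...).
passages = [
    "A bi-encoder embeds the query and passage independently and compares the vectors.",
    "A cross-encoder scores the concatenated query-passage pair with full attention.",
    "The weather in Ankara is usually dry in the summer.",
]

# Score every (query, passage) pair and sort passages by relevance.
scores = model.predict([(query, p) for p in passages])
reranked = sorted(zip(passages, scores), key=lambda x: x[1], reverse=True)

for passage, score in reranked:
    print(f"{score:.3f}  {passage}")
```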

Thanks in advance.
