This repository was archived by the owner on May 9, 2024. It is now read-only.

Added cross encoder re-ranker#3

Closed
yfulwani wants to merge 1 commit into oneapi-src:main from yfulwani:main

Conversation

yfulwani commented May 8, 2023

No description provided.

Comment thread README.md
### Re-ranking

In this reference kit, we focus on the document retrieval aspect of building a vertical search engine to obtain an initial list of the top-K most similar documents in the corpus for a given query. Often times, this is sufficient for building a feature rich system. However, in some situations, a 3rd component, the re-ranker, which is not included in this reference kit, could be added to the search pipeline to improve results. In this architecture, for a given query, the *document retrieval* step will use one model to rapidly obtain a list of the top-K documents (as shown in this reference kit), followed by a *re-ranking* step which will use a different model to re-order the list of K retrieved documents before returning to the user. The second re-ranking refinement step has been shown to improve user satisfaction, especially when fine-tuned on a custom corpus, but may be unnecessary as a starting point for building a functional vertical search engine. To extend this reference implementation with re-ranking, we direct you to https://www.sbert.net/examples/applications/retrieve_rerank/README.html for further details on implementation where Intel® oneAPI optimizations can also be applied to speed up re-ranking models.
In this reference kit, we focus on the document retrieval aspect of building a vertical search engine to obtain an initial list of the top-K most similar documents in the corpus for a given query. Often times, this is sufficient for building a feature rich system. However, in some situations, a 3rd component, the re-ranker, could be added to the search pipeline to improve results. In this architecture, for a given query, the *document retrieval* step will use one model to rapidly obtain a list of the top-K documents, followed by a *re-ranking* step which will use a different model to re-order the list of K retrieved documents before returning to the user. The second re-ranking refinement step has been shown to improve user satisfaction, especially when fine-tuned on a custom corpus, but may be unnecessary as a starting point for building a functional vertical search engine. To know more about re-ranker, we direct you to https://www.sbert.net/examples/applications/retrieve_rerank/README.html for further details. In this reference kit we use `cross-encoder/ms-marco-MiniLM-L-6-v2` model as re-ranker. For more details about different re-ranker models visit https://www.sbert.net/docs/pretrained-models/ce-msmarco.html.
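The retrieve-then-re-rank flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names `rerank`, `overlap_score`, and `load_cross_encoder_scorer` are hypothetical, and the toy token-overlap scorer only stands in for a real model so the example runs without downloads. The optional loader assumes the `sentence-transformers` package is installed and uses its `CrossEncoder.predict` call on (query, document) pairs.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]
ScoreFn = Callable[[List[Pair]], Sequence[float]]


def overlap_score(pairs: List[Pair]) -> List[float]:
    """Toy stand-in scorer: fraction of query tokens found in the document."""
    scores = []
    for query, doc in pairs:
        q_tokens = set(query.lower().split())
        d_tokens = set(doc.lower().split())
        scores.append(len(q_tokens & d_tokens) / max(len(q_tokens), 1))
    return scores


def load_cross_encoder_scorer(
    name: str = "cross-encoder/ms-marco-MiniLM-L-6-v2",
) -> ScoreFn:
    """Hypothetical helper: wrap a sentence-transformers CrossEncoder as a scorer.

    Assumes `pip install sentence-transformers`; the model is downloaded on
    first use. Not called in this sketch.
    """
    from sentence_transformers import CrossEncoder  # optional dependency

    model = CrossEncoder(name)
    return lambda pairs: model.predict(pairs).tolist()


def rerank(
    query: str, retrieved: List[str], score_fn: ScoreFn, top_n: int = 3
) -> List[Tuple[str, float]]:
    """Re-order the top-K retrieved documents by the scorer's relevance score."""
    pairs = [(query, doc) for doc in retrieved]
    scores = score_fn(pairs)
    ranked = sorted(zip(retrieved, scores), key=lambda item: item[1], reverse=True)
    return [(doc, float(score)) for doc, score in ranked[:top_n]]


# Pretend these came back from the first-stage retriever, in its original order.
retrieved_docs = [
    "Intel oneAPI toolkits overview",
    "How to bake sourdough bread",
    "Re-ranking search results with a cross encoder",
]
results = rerank("cross encoder re-ranking", retrieved_docs, overlap_score, top_n=2)
```

To try an actual cross-encoder instead of the toy scorer, one could pass `load_cross_encoder_scorer()` as `score_fn`; the scores would then come from the model rather than from token overlap.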
Contributor

Thanks @yfulwani for this contribution, but unfortunately we can't accept PRs from your main branch. Please submit your PR from another branch in your forked repository, and make sure your main branch stays identical to our main branch at all times. You should not push to your main branch manually.

Contributor

rbernalc commented Jun 1, 2023

Hi @yfulwani, if you need further assistance with the right process for submitting PRs from a forked repository, please let us know.

aagalleg pushed a commit that referenced this pull request Feb 16, 2024
aagalleg deleted the branch oneapi-src:main February 16, 2024 20:05
aagalleg closed this Feb 16, 2024