Could someone explain how the reranker calculates its scores? I'm observing scores such as 15.67 and 12.33, which are unexpected because I anticipated scores like 97 or 96. I use LangChain's get_relevant_docs to retrieve documents with relevance scores such as 98 and 96, and then send these snippets to the ColBERT reranker for reranking. The reranker then returns scores like 15.67 and 12.33. Should I consider a higher reranker score as indicative of a better snippet? Additionally, I would like to understand the technique used by the reranker to compute these scores.
Hey! The scores are raw MaxSim scores, as used by ColBERT. MaxSim is well explained in the Vespa blog post about ColBERT, but in short: for each query token, you take its maximum similarity against all document tokens, and then sum those maxima over the query tokens. The scores we return aren't normalised, so they don't fall between 0 and 1 (or 0 and 100) as you might expect from retrieval services that provide normalised relevance scores. Higher is still better, though: within a single reranked list, a document scoring 15.67 is a better match than one scoring 12.33.
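To make the computation concrete, here's a minimal MaxSim sketch in NumPy. The tiny 2-dimensional "embeddings" are made up for illustration; real ColBERT token embeddings are much higher-dimensional and come from the model's encoder.

```python
import numpy as np

def maxsim(query_embs: np.ndarray, doc_embs: np.ndarray) -> float:
    """MaxSim: for every query token, take its maximum similarity over
    all document tokens, then sum those maxima across query tokens."""
    sim = query_embs @ doc_embs.T        # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())  # best-matching doc token per query token

# Hand-made toy vectors: 2 query tokens, 3 document tokens, dim 2.
q = np.array([[1.0, 0.0],
              [0.0, 1.0]])
d = np.array([[1.0, 0.0],
              [0.0, 0.5],
              [0.5, 0.5]])

print(maxsim(q, d))  # 1.5 — an unbounded raw score, not a percentage
```

Because the score is a sum over query tokens, longer queries naturally produce larger raw scores, which is another reason the numbers don't look like percentages.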
As a general rule, you cannot compare scores produced by different retrieval models; they only really make sense within the context of a single model. One model's 0.978 similarity might be another model's 0.784 (completely made-up numbers!), because scores aren't absolute but relative: they're only useful for comparing the scores a single model assigns to different documents for the same query.
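If you want values in a familiar 0–1 range for display, one option (my suggestion, not something the reranker does for you) is to min-max normalise within a single reranked result set. This preserves the ranking but, as above, the normalised values are still only comparable within that one list:

```python
def minmax_normalize(scores: list[float]) -> list[float]:
    """Rescale one model's scores for one query into [0, 1].
    Only meaningful within a single result set from a single model."""
    lo, hi = min(scores), max(scores)
    if hi == lo:                       # all documents scored identically
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

print(minmax_normalize([15.67, 12.33, 9.10]))  # top doc -> 1.0, bottom -> 0.0
```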