# Hello there.

_Version 0.3 of this notebook. Last updated 2024-05-17._

Welcome to `rerankers`!

This is an introductory notebook. It doesn't solve any problem in particular, but showcases the various rerankers you can load, their argument combinations, and the unified output format of `Result` and `RankedResults`.

This is pretty much all there is to the library: a lightweight layer aiming to make it completely painless to slot in any reranker you want!


First, let's load the `Reranker` function, and fetch our API keys from our .env file to use the API-based rerankers (Cohere, Jina, RankGPT):

In [1]:
# This is the only rerankers import you'll ever need for inference
from rerankers import Reranker


# Let's do this manually, so we don't need to install python-dotenv
import os

with open(".env", "r") as file:
    for line in file:
        if line.strip() and not line.startswith("#"):
            key, value = line.strip().split("=", 1)
            os.environ[key] = value.strip()

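The loop above assumes every non-comment line contains an `=`. As a slightly more defensive sketch (the helper name `parse_env_lines` and the key values are made up for illustration, and are not part of `rerankers`), the same idea can skip malformed lines and strip optional quotes:

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines into a dict, ignoring blanks, comments and malformed lines."""
    parsed = {}
    for raw in lines:
        line = raw.strip()
        # Skip empty lines, comments, and anything without a key/value separator
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and one pair of matching quotes, if present
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        parsed[key.strip()] = value
    return parsed


# Placeholder content standing in for a real .env file
example = ["# API keys", "JINA_API_KEY = abc123", "", 'COHERE_API_KEY="xyz"', "not a valid line"]
env = parse_env_lines(example)
print(env)
```

With a real file, the result could be merged into the environment with `os.environ.update(parse_env_lines(open(".env")))`.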


Next, let's define an imaginary query, and two documents that our totally-real first-stage retriever has pulled up from a just-as-real corpus of online movie reviews.

These will serve as the documents we want to score against the query with our various rerankers:

In [2]:
query = "Gone with the wind is an absolute masterpiece"
docs = [
    "Gone with the wind is a masterclass in bad storytelling.",
    "Gone with the wind is an all-time classic",
]

That's pretty much it for the set-up phase. Let's get started with using various rerankers now.

## General Use

This section will showcase the general use of `rerankers`, as well as the output format.

In [3]:
ranker = Reranker("cross-encoder")

Loading default cross-encoder model for language en
Default Model: mixedbread-ai/mxbai-rerank-base-v1
Loading TransformerRanker model mixedbread-ai/mxbai-rerank-base-v1
No device set
Using device mps
No dtype set
Using dtype torch.float16




Loaded model mixedbread-ai/mxbai-rerank-base-v1
Using device mps.
Using dtype torch.float16.


Ooh, that's a bit noisy. Maybe we want to cut down on this default noise? Just set `verbose` to 0!

In [4]:
ranker = Reranker("cross-encoder", verbose=0)

Loading default cross-encoder model for language en
Loading TransformerRanker model mixedbread-ai/mxbai-rerank-base-v1


Much better! You still get a bit of information thrown at you, but most of what's going on is no longer printed.

This is all you have to do to load a model. It's now ready to rank documents:

In [5]:
results = ranker.rank(query=query, docs=docs)
results

RankedResults(results=[Result(document=Document(text='Gone with the wind is an all-time classic', doc_id=1, metadata={}), score=3.607421875, rank=1), Result(document=Document(text='Gone with the wind is a masterclass in bad storytelling.', doc_id=0, metadata={}), score=0.685546875, rank=2)], query='Gone with the wind is an absolute masterpiece', has_scores=True)

⚠️ Please note: the score output by *any* model does not have inherent meaning! It's only useful for ranking purposes, and only relative to other outputs **from the same model**. You cannot compare scores between models or make assumptions based on a score seeming high or low! ⚠️
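To make this warning concrete, here is a small sketch using made-up scores for two hypothetical models (`model_a` and `model_b` are illustrative, not real outputs). The raw numbers live on unrelated scales, so the only thing worth comparing across models is the ordering each one produces:

```python
def ranking_from_scores(scores):
    """Return doc ids ordered from most to least relevant for one model's scores."""
    return sorted(scores, key=scores.get, reverse=True)


# Made-up scores: unbounded values for model_a, values in [0, 1] for model_b
model_a = {"doc_0": 0.69, "doc_1": 3.61}
model_b = {"doc_0": 0.78, "doc_1": 0.68}

ranking_a = ranking_from_scores(model_a)
ranking_b = ranking_from_scores(model_b)

# model_b's 0.78 is not "worse" than model_a's 3.61: the scales are unrelated.
# What can be compared is whether the two models agree on the order.
models_agree = ranking_a == ranking_b
print(ranking_a, ranking_b, models_agree)
```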

`results`, just like any `rank()` output in `rerankers`, is a `RankedResults` object. `RankedResults` has a few useful helper functions, and contains `Result` objects. Each `Result` is an atomic reranking result, containing the document's text, the document id, its 1-indexed rank, and the score returned by the model.

Not all models output scores, so `RankedResults` has a `has_scores` attribute that lets you know whether scores are present. It also always contains the original `query`, for easier mapping.

The most useful function of a `RankedResults` object is `top_k`, which lets you retrieve the top `k` results, as ranked by the model:

In [6]:
results.top_k(1)[0].text

'Gone with the wind is an all-time classic'

If you're using `rerankers` to harvest scores (perhaps for distilling knowledge into a retrieval model?), there's also a helpful function to retrieve the score for a particular doc_id:

In [7]:
results.get_score_by_docid(1)

3.607421875
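For readers who want a mental model of these helpers, the sketch below re-creates them with plain dataclasses. It is a simplified stand-in written for illustration (the `Sketch*` classes are hypothetical and are not the library's implementation); it only mirrors the behaviour shown in the cells above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class SketchResult:
    text: str
    doc_id: Union[int, str]
    score: Optional[float] = None
    rank: Optional[int] = None


@dataclass
class SketchRankedResults:
    results: List[SketchResult] = field(default_factory=list)
    query: str = ""
    has_scores: bool = False

    def top_k(self, k: int) -> List[SketchResult]:
        # Sort by rank (1 is best) and keep the first k entries
        return sorted(self.results, key=lambda r: r.rank)[:k]

    def get_score_by_docid(self, doc_id) -> Optional[float]:
        # Linear scan: return the score of the matching document, or None
        for result in self.results:
            if result.doc_id == doc_id:
                return result.score
        return None


# Values copied from the output of the ranking cell above
ranked = SketchRankedResults(
    results=[
        SketchResult("Gone with the wind is an all-time classic", 1, 3.607421875, 1),
        SketchResult("Gone with the wind is a masterclass in bad storytelling.", 0, 0.685546875, 2),
    ],
    query="Gone with the wind is an absolute masterpiece",
    has_scores=True,
)
print(ranked.top_k(1)[0].text, ranked.get_score_by_docid(1))
```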

Speaking of docids, you can of course set your own! If unspecified, they'll always be integers, corresponding to the index of a given document in the input list. You can specify your own doc_ids pretty easily, by passing them to the rank() function. Please note that `doc_ids` **must** be the same length as `docs`!

In [8]:
results = ranker.rank(
    query=query,
    docs=docs,
    doc_ids=["The Not-So Similar Document", "The Similar Document"],
)
results.top_k(1)

[Result(document=Document(text='Gone with the wind is an all-time classic', doc_id='The Similar Document', metadata={}), score=3.607421875, rank=1)]
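The defaulting and length rule for `doc_ids` can be sketched as a small helper. This is an illustration only: `resolve_doc_ids` is a hypothetical name, and raising a `ValueError` on a mismatch is an assumption, since this notebook doesn't show how the library itself reports the problem.

```python
def resolve_doc_ids(docs, doc_ids=None):
    """Return one id per document, defaulting to positional indices."""
    if doc_ids is None:
        return list(range(len(docs)))
    if len(doc_ids) != len(docs):
        # Assumed behaviour: fail loudly rather than silently misalign ids and docs
        raise ValueError(
            f"Got {len(doc_ids)} doc_ids for {len(docs)} docs; the lengths must match."
        )
    return list(doc_ids)


docs_example = ["first document", "second document"]
default_ids = resolve_doc_ids(docs_example)
custom_ids = resolve_doc_ids(docs_example, ["a", "b"])
print(default_ids, custom_ids)
```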

That's it for the basics of using a reranker, and how the `Result` and `RankedResults` objects work. The goal is simplicity, so hopefully this wasn't too much!

Keep reading if you want some more information about the various models we support.

## Single-Label Cross-Encoders
Cross-encoders are probably the most common types of re-rankers.

You might be familiar with bi-encoders (also called "embeddings models") which encode the query and the document independently to be scored later. A lot of popular models, like OpenAI's embeddings, Jina models, BGE-embeddings models, etc., are bi-encoder models.

Cross-encoders, on the other hand, take in **both** a query and a document as input, and output a score rather than a representation. This is the basic logic behind all "reranker" model approaches: they're aware of the full content of both a target document and a query at inference time, which means they can take subtle interactions into account.  
This differs from bi-encoders, which have to provide a single representation for any given document without being aware of the context in which that representation will be used.
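The difference is easiest to see in the shape of the function calls. The toy sketch below uses bag-of-words counts instead of neural networks, so its scores mean nothing; every name in it is hypothetical. It only illustrates that a bi-encoder can encode documents once, before any query exists, while a cross-encoder has to see each (query, document) pair together.

```python
import math
from collections import Counter


def toy_encode(text):
    """Bi-encoder stand-in: turn a text into a bag-of-words vector, on its own."""
    return Counter(text.lower().split())


def toy_bi_encoder_score(query_vec, doc_vec):
    """Cosine similarity between two independently computed vectors."""
    dot = sum(query_vec[t] * doc_vec[t] for t in query_vec)
    norm = math.sqrt(sum(v * v for v in query_vec.values())) * math.sqrt(
        sum(v * v for v in doc_vec.values())
    )
    return dot / norm if norm else 0.0


def toy_cross_encoder_score(query, doc):
    """Cross-encoder stand-in: the scoring function sees both texts at once."""
    query_tokens, doc_tokens = query.lower().split(), doc.lower().split()
    # Both inputs are visible, so the score can depend on their interaction,
    # here the share of query tokens that appear in the document.
    return sum(t in doc_tokens for t in query_tokens) / len(query_tokens)


toy_query = "wind classic"
toy_docs = ["gone with the wind is a classic", "a storm is coming"]

doc_vectors = [toy_encode(d) for d in toy_docs]  # computed once, query unknown
query_vec = toy_encode(toy_query)
bi_scores = [toy_bi_encoder_score(query_vec, v) for v in doc_vectors]
cross_scores = [toy_cross_encoder_score(toy_query, d) for d in toy_docs]
print(bi_scores, cross_scores)
```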

Using cross-encoders in `rerankers` is very simple, and relies on nothing more than `transformers` and `torch`.

`rerankers` lets you load any model you want, but also ships with well-balanced (in terms of performance vs. size) defaults for all model families. In the previous example, we used a default model: we just specified `cross-encoder`, which fetched the default cross-encoder model, `mixedbread-ai/mxbai-rerank-base-v1`.

However, you can load any model you want! By default, `rerankers` tries to load any model it's given as a cross-encoder, unless it recognises it as something else. It will warn you when it does so, so you're better off specifying `model_type` to suppress the warning. Let's load a MiniLM reranker to try it out:

In [9]:
ranker = Reranker("cross-encoder/ms-marco-MiniLM-L-6-v2", model_type="cross-encoder")

Loading TransformerRanker model cross-encoder/ms-marco-MiniLM-L-6-v2
No device set
Using device mps
No dtype set
Using dtype torch.float16
Loaded model cross-encoder/ms-marco-MiniLM-L-6-v2
Using device mps.
Using dtype torch.float16.


As you can see, it downloaded the model from the hub, and it's now ready to use, just like before (but note the different score!):

In [10]:
results = ranker.rank(query=query, docs=docs)
results.top_k(1)

[Result(document=Document(text='Gone with the wind is an all-time classic', doc_id=1, metadata={}), score=4.33984375, rank=1)]

`Reranker()` does take a few optional arguments to give you more control over model loading. One of them is `lang`, which is only useful if you're using the default model. You can set it to the 2-letter ISO language code of your target language, and it'll try to load a relevant model if there is such a default. Sadly, for a lot of languages, that'll just default to a multilingual model, as NLP is very English-centric. If you're aware of any language-specific models, please do contribute them to the defaults list in a PR!

In [11]:
ranker = Reranker("cross-encoder", lang="fr")

Loading default cross-encoder model for language fr
Default Model: antoinelouis/crossencoder-camembert-base-mmarcoFR
Loading TransformerRanker model antoinelouis/crossencoder-camembert-base-mmarcoFR
No device set
Using device mps
No dtype set
Using dtype torch.float16
Loaded model antoinelouis/crossencoder-camembert-base-mmarcoFR
Using device mps.
Using dtype torch.float16.
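The lookup described above (a language-specific default if one exists, a multilingual one otherwise) can be sketched as a dictionary with a fallback key. The table and function below are hypothetical: the `en` and `fr` entries are the models loaded in this notebook, while the `other` entry is a placeholder string rather than a real model name.

```python
# Hypothetical defaults table, not the library's actual one
SKETCH_DEFAULTS = {
    "cross-encoder": {
        "en": "mixedbread-ai/mxbai-rerank-base-v1",
        "fr": "antoinelouis/crossencoder-camembert-base-mmarcoFR",
        "other": "<multilingual-cross-encoder>",  # placeholder, not a real model
    }
}


def resolve_default_model(family, lang="en"):
    """Pick the language-specific default if one exists, else the multilingual fallback."""
    per_language = SKETCH_DEFAULTS[family]
    return per_language.get(lang, per_language["other"])


print(resolve_default_model("cross-encoder", "fr"))
print(resolve_default_model("cross-encoder", "de"))
```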


We won't try out all the other arguments, but these are the ones you can use for `cross-encoder` models, explained:

In [12]:
ranker = Reranker(
    "cross-encoder/ms-marco-MiniLM-L-6-v2",
    model_type="cross-encoder",
    verbose=1,  # How verbose the reranker will be. Defaults to 1, setting it to 0 will suppress most messages.
    dtype=None,  # Which dtype the model should use. If None, will figure out if your platform + model combo supports fp16 and use it if so, otherwise fp32.
    device=None,  # Which device the model should use. If None will figure out what the most powerful supported platform available is (cuda > mps > cpu)
    batch_size=16,  # The batch size the model will use. Defaults to 16
)

Loading TransformerRanker model cross-encoder/ms-marco-MiniLM-L-6-v2
No device set
Using device mps
No dtype set
Using dtype torch.float16
Loaded model cross-encoder/ms-marco-MiniLM-L-6-v2
Using device mps.
Using dtype torch.float16.
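The `device=None` and `dtype=None` behaviour described in the comments can be approximated with two small functions. This is a simplified sketch with hypothetical names: it takes availability flags as plain booleans so that it runs without `torch`, and it ignores the per-model fp16 check mentioned above.

```python
def pick_device(cuda_available, mps_available):
    """Prefer cuda, then mps, then cpu."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def pick_dtype(device):
    """Half precision on accelerators, full precision on cpu (simplifying assumption)."""
    return "float32" if device == "cpu" else "float16"


for flags in [(True, False), (False, True), (False, False)]:
    device = pick_device(*flags)
    print(flags, device, pick_dtype(device))
```

In a real setup the two flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.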


That's all there is to this part! The defaults are generally pretty robust, but feel free to tweak them to your heart's content.

## API-based models

There is an increasing number of providers training powerful reranking models and making them available via API calls. `rerankers` aims to support all the major ones, and make it as easy as possible to use them. Currently, we support [Jina.ai reranker](https://jina.ai/reranker/) and [Cohere Rerank](https://cohere.com/rerank).

Loading and using them is extremely easy. All you need is an API key, and you're ready to go:

In [13]:
# Jina
ranker = Reranker("jina", api_key=os.environ["JINA_API_KEY"])
results = ranker.rank(query=query, docs=docs)
results.top_k(1)

Auto-updated model_name to jina-reranker-v1-base-en for API provider jina
Loading APIRanker model jina-reranker-v1-base-en
{'model': 'jina-reranker-v1-base-en', 'usage': {'total_tokens': 36, 'prompt_tokens': 36}, 'results': [{'index': 0, 'document': {'text': 'Gone with the wind is a masterclass in bad storytelling.'}, 'relevance_score': 0.7841538190841675}, {'index': 1, 'document': {'text': 'Gone with the wind is an all-time classic'}, 'relevance_score': 0.6780072450637817}]}


[Result(document=Document(text='Gone with the wind is a masterclass in bad storytelling.', doc_id=0, metadata={}), score=0.7841538190841675, rank=1)]

In [14]:
# Cohere
ranker = Reranker("cohere", api_key=os.environ["COHERE_API_KEY"])
results = ranker.rank(query=query, docs=docs)
results.top_k(1)

Auto-updated model_name to rerank-english-v3.0 for API provider cohere
Loading APIRanker model rerank-english-v3.0
{'id': 'f8178696-a98f-41db-acee-144bfa5b516e', 'results': [{'document': {'text': 'Gone with the wind is a masterclass in bad storytelling.'}, 'index': 0, 'relevance_score': 0.9961606}, {'document': {'text': 'Gone with the wind is an all-time classic'}, 'index': 1, 'relevance_score': 0.98415464}], 'meta': {'api_version': {'version': '1'}, 'billed_units': {'search_units': 1}}}


[Result(document=Document(text='Gone with the wind is a masterclass in bad storytelling.', doc_id=0, metadata={}), score=0.9961606, rank=1)]
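Both raw responses printed above share the same core shape: a `results` list whose items carry an `index`, a `document` and a `relevance_score`. As a rough sketch of how such a payload maps onto ranked results (the helper `rank_api_results` is hypothetical and is not the library's code), one can sort by score and attach 1-indexed ranks:

```python
def rank_api_results(payload):
    """Sort an API response's results by relevance and attach 1-indexed ranks."""
    ordered = sorted(payload["results"], key=lambda r: r["relevance_score"], reverse=True)
    return [
        {
            "doc_id": item["index"],
            "text": item["document"]["text"],
            "score": item["relevance_score"],
            "rank": position,
        }
        for position, item in enumerate(ordered, start=1)
    ]


# Trimmed copy of the Jina response printed above
jina_payload = {
    "results": [
        {
            "index": 0,
            "document": {"text": "Gone with the wind is a masterclass in bad storytelling."},
            "relevance_score": 0.7841538190841675,
        },
        {
            "index": 1,
            "document": {"text": "Gone with the wind is an all-time classic"},
            "relevance_score": 0.6780072450637817,
        },
    ]
}
ranked_api = rank_api_results(jina_payload)
print(ranked_api[0])
```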

Cohere also supports two very nice features:
- It provides a multilingual version of its reranker.
- It allows you to fine-tune models, to be served by their API.

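We support both of these features. To use the multilingual version of Cohere's reranker, simply pass a `lang` argument explicitly, just like you would to load the default for a non-English language. Note that the cell below passes `lang="en"`, so, as its log shows, it still resolves to the English model: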

In [15]:
# Cohere
ranker = Reranker("cohere", lang="en", api_key=os.environ["COHERE_API_KEY"])
ranker.rank(
    query="Tell me about lord of the rings",
    docs=[
        "Dune is an incredibly confusing masterpiece in worldbuilding...",
        "The silmarillion is a prequel to the Lord of The Rings...",
        "Green Lantern uses a powerful ring to rule over his planet...",
    ],
)

Auto-updated model_name to rerank-english-v3.0 for API provider cohere
Loading APIRanker model rerank-english-v3.0
{'id': '310200e1-122a-4563-98b9-a0a9a92fd69d', 'results': [{'document': {'text': 'The silmarillion is a prequel to the Lord of The Rings...'}, 'index': 1, 'relevance_score': 0.029256709}, {'document': {'text': 'Green Lantern uses a powerful ring to rule over his planet...'}, 'index': 2, 'relevance_score': 0.0014721896}, {'document': {'text': 'Dune is an incredibly confusing masterpiece in worldbuilding...'}, 'index': 0, 'relevance_score': 6.4522144e-05}], 'meta': {'api_version': {'version': '1'}, 'billed_units': {'search_units': 1}}}


RankedResults(results=[Result(document=Document(text='The silmarillion is a prequel to the Lord of The Rings...', doc_id=1, metadata={}), score=0.029256709, rank=1), Result(document=Document(text='Green Lantern uses a powerful ring to rule over his planet...', doc_id=2, metadata={}), score=0.0014721896, rank=2), Result(document=Document(text='Dune is an incredibly confusing masterpiece in worldbuilding...', doc_id=0, metadata={}), score=6.4522144e-05, rank=3)], query='Tell me about lord of the rings', has_scores=True)

To use a fine-tuned model, it's slightly different. It won't necessarily have `cohere` in its model name, so you need to let `rerankers` know you're trying to load a Cohere model:

In [16]:
# wrap in a try/except, as we don't have a fine-tuned model to use for this example!
try:
    ranker = Reranker(
        "my-finetuned-model-name",
        api_provider="cohere",
        api_key=os.environ["COHERE_API_KEY"],
    )
except Exception:
    pass

Loading TransformerRanker model my-finetuned-model-name


## T5-Based Rerankers

T5-based rerankers leverage T5, a sequence-to-sequence encoder-decoder language model, to rank documents. They generally do so by querying the model and constraining its prediction to two tokens, one representing relevance and the other irrelevance. The logits of those two tokens can then be used as relative relevance scores, similarly to cross-encoder based rerankers.
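As a minimal sketch of that scoring step, assuming the two logits are normalised with a softmax (a common choice for this family of rerankers, but an assumption here rather than a description of the library's code), the relevance score is the probability mass assigned to the "relevant" token. The function name and the logit values are made up:

```python
import math


def t5_relevance_score(logit_true, logit_false):
    """Softmax over the two candidate tokens; returns P(true) in [0, 1]."""
    # Subtract the max for numerical stability before exponentiating
    top = max(logit_true, logit_false)
    exp_true = math.exp(logit_true - top)
    exp_false = math.exp(logit_false - top)
    return exp_true / (exp_true + exp_false)


# Made-up logits for two documents
score_relevant = t5_relevance_score(logit_true=4.0, logit_false=-1.0)
score_irrelevant = t5_relevance_score(logit_true=-2.0, logit_false=3.0)
print(score_relevant, score_irrelevant)
```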

They have been popular in the Information Retrieval literature, and perform quite well on some tasks.

You can load a T5-based model using the same `Reranker()` call:

In [17]:
ranker = Reranker("t5")
results = ranker.rank(query=query, docs=docs)
results.top_k(2)

Loading default t5 model for language en
Default Model: unicamp-dl/InRanker-base
Loading T5Ranker model unicamp-dl/InRanker-base
No device set
Using device cpu
No dtype set
Device set to `cpu`, setting dtype to `float32`
Using dtype torch.float32
Loading model unicamp-dl/InRanker-base, this might take a while...
Using device cpu.
Using dtype torch.float32.


You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565


T5 true token set to ▁true
T5 false token set to ▁false
Returning normalised scores...


Scoring...:   0%|          | 0/1 [00:00<?, ?it/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Scoring...: 100%|██████████| 1/1 [00:00<00:00,  2.99it/s]


[Result(document=Document(text='Gone with the wind is a masterclass in bad storytelling.', doc_id=0, metadata={}), score=0.9964194297790527, rank=1),
 Result(document=Document(text='Gone with the wind is an all-time classic', doc_id=1, metadata={}), score=0.9739130139350891, rank=2)]

There's quite a lot going on with T5 models' initialisation, so you might want to pass `verbose=0` if you don't care.
Of course, you can always use `model_type='t5'` to load any non-default t5 model:

In [18]:
ranker = Reranker("unicamp-dl/ptt5-base-pt-msmarco-10k-v2", model_type="t5", verbose=0)

Loading T5Ranker model unicamp-dl/ptt5-base-pt-msmarco-10k-v2


The full list of arguments for T5 models is in the same vein as for ColBERT: they follow the transformers ones, along with a few model-specific ones:

In [19]:
ranker = Reranker(
    "t5",
    model_type="t5",
    verbose=1,  # How verbose the reranker will be. Defaults to 1, setting it to 0 will suppress most messages.
    dtype=None,  # Which dtype the model should use. If None, will figure out if your platform + model combo supports fp16 and use it if so, otherwise fp32.
    device=None,  # Which device the model should use. If None will figure out what the most powerful supported platform available is (cuda > mps > cpu)
    batch_size=16,  # The batch size the model will use. Defaults to 16
    token_false="auto",  # The output token corresponding to non-relevance.
    token_true="auto",  # The output token corresponding to relevance.
    return_logits=False,  # Whether to return a normalised score or the raw logit for `token_true`.
)

Loading default t5 model for language en
Default Model: unicamp-dl/InRanker-base
Loading T5Ranker model unicamp-dl/InRanker-base
No device set
Using device cpu
No dtype set
Device set to `cpu`, setting dtype to `float32`
Using dtype torch.float32
Loading model unicamp-dl/InRanker-base, this might take a while...
Using device cpu.
Using dtype torch.float32.
T5 true token set to ▁true
T5 false token set to ▁false
Returning normalised scores...


`token_false` and `token_true` are very important! Leaving them on `auto` will use our mapping of the tokens used by the most popular T5 rerankers, or default to the most frequently used ones if your model isn't in the mapping. If you're using a custom T5 reranker, you probably want to specify exactly which tokens your model was trained with, but you can safely ignore this for off-the-shelf models!

## RankGPT

RankGPT is a new reranking approach, which leverages LLMs to perform zero-shot reranking. The idea is pretty simple: give a powerful LLM your query and your documents, and have it perform listwise reranking: that is, compare the documents to each other and create a relevance ranking. It works surprisingly well in a variety of zero-shot contexts.

The initial approach uses OpenAI's GPT-3.5 and GPT-4, though there is now more work in the area, including RankZephyr, a specifically fine-tuned 7B LLM.
We currently only support the original RankGPT implementation, although you can use it with any LLM provider thanks to LiteLLM.
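In listwise approaches of this kind, the LLM typically answers with a permutation of document identifiers, such as `[2] > [1] > [3]`. The sketch below shows one way to turn such an answer into an ordering while tolerating repeated, invalid or missing identifiers. The answer format and the helper `parse_permutation` are assumptions made for illustration, not the wrapped implementation.

```python
import re


def parse_permutation(response, num_docs):
    """Turn a listwise answer like '[2] > [1] > [3]' into 0-indexed document positions."""
    order = []
    for match in re.findall(r"\[(\d+)\]", response):
        index = int(match) - 1  # assumes the prompt numbers documents from 1
        # Ignore out-of-range or repeated identifiers
        if 0 <= index < num_docs and index not in order:
            order.append(index)
    # Documents the model forgot to mention keep their original relative order
    order.extend(i for i in range(num_docs) if i not in order)
    return order


print(parse_permutation("[2] > [1] > [3]", num_docs=3))
print(parse_permutation("[3] > [3] > [7]", num_docs=3))
```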

Loading a RankGPT ranker is pretty similar to loading any other model:

In [20]:
from rerankers import Reranker

ranker = Reranker("rankgpt", api_key=os.environ["OPENAI_API_KEY"])

Loading default rankgpt model for language en
Default Model: gpt-4-turbo-preview
Loading RankGPTRanker model gpt-4-turbo-preview


The default "rankgpt" model uses `gpt-4-turbo-preview`. You can also load `rankgpt3` to use GPT-3.5, or `rankgpt4` to use GPT-4 (non-Turbo):

In [21]:
from rerankers import Reranker

ranker = Reranker("rankgpt3", api_key=os.environ["OPENAI_API_KEY"])
results = ranker.rank(query=query, docs=docs)
results.top_k(1)

Loading default rankgpt3 model for language en
Default Model: gpt-3.5-turbo
Loading RankGPTRanker model gpt-3.5-turbo
Querying model gpt-3.5-turbo with via LiteLLM...


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


[Result(document=Document(text='Gone with the wind is a masterclass in bad storytelling.', doc_id=0, metadata={}), score=None, rank=1)]

Note that there's no score provided for RankGPT models, as they never output one: all the reranking is done inside the "black box".

Finally, we use LiteLLM as the backend, in order to support different providers. You can check out [their documentation](https://litellm.vercel.app/) to see how to use different LLM providers. All you need to do, in that case, is to specify `model_type="rankgpt"`. For instance, if you wanted to use RankGPT with an OpenAI Azure deployment, you'd do it like this:

In [22]:
# LiteLLM uses env variables
import os

os.environ["AZURE_API_KEY"] = ""
os.environ["AZURE_API_BASE"] = ""
os.environ["AZURE_API_VERSION"] = ""
deployment_name = "my-azure-gpt-deployment"

# Just like Cohere's finetuned rankers above -- we try/except this as we're not actually running an Azure OpenAI model in this example!
try:
    ranker = Reranker(
        f"azure/{deployment_name}",
        model_type="rankgpt",
        api_key=os.environ["AZURE_API_KEY"],
    )
except Exception:
    pass

Loading RankGPTRanker model azure/my-azure-gpt-deployment


## RankLLM

RankLLM is a refinement on the RankGPT approach, by [Jimmy Lin's lab at the University of Waterloo](http://castorini.io).

It introduces a safer, more refined codebase for RankGPT calls, as well as the possibility to use non-GPT models, with very strong performance from just 7B models such as RankVicuna and RankZephyr.

It effectively functions the same as RankGPT, with the added possibility (CURRENTLY UNTESTED) to use local models:

In [23]:
from rerankers import Reranker

ranker = Reranker("rankllm", api_key=os.environ["OPENAI_API_KEY"])
ranker.rank(query=query, docs=docs)

Loading default rankllm model for language en
Default Model: gpt-4o
Loading RankGPTRanker model gpt-4o
Querying model gpt-4o with via LiteLLM...


RankedResults(results=[Result(document=Document(text='Gone with the wind is a masterclass in bad storytelling.', doc_id=0, metadata={}), score=None, rank=1), Result(document=Document(text='Gone with the wind is an all-time classic', doc_id=1, metadata={}), score=None, rank=2)], query='Gone with the wind is an absolute masterpiece', has_scores=False)

Currently, passing in just the name of a `GPT` model, such as `gpt-4-turbo`, to your `Reranker()` initialisation will initialise RankGPT. This will change in version 0.0.5, at which point it'll default to RankLLM. If you'd like to keep stable behaviour across versions, or use RankLLM with `GPT` models already, just pass in the `model_type` argument:

In [24]:
ranker = Reranker(
    "gpt-4-turbo", model_type="rankllm", api_key=os.environ["OPENAI_API_KEY"]
)
ranker.rank(query=query, docs=docs)

Loading RankLLMRanker model gpt-4-turbo


100%|██████████| 1/1 [00:01<00:00,  1.36s/it]


RankedResults(results=[Result(document=Document(text='Gone with the wind is an all-time classic', doc_id=1, metadata={}), score=None, rank=0), Result(document=Document(text='Gone with the wind is a masterclass in bad storytelling.', doc_id=0, metadata={}), score=None, rank=1)], query='Gone with the wind is an absolute masterpiece', has_scores=False)

## ColBERT Rerankers

This one is a bit of a tricky case: it leverages ColBERT and its late-interaction approach to re-rank documents. ColBERT, however, is actually closer to a bi-encoder in spirit: it encodes documents without having any knowledge of the query (and vice-versa), and scores them at a later time. However, it's a very powerful retrieval model, and can be a very strong zero-shot reranker in some settings.
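Late interaction is usually implemented with the "MaxSim" operator: each query token embedding is matched with its most similar document token embedding, and those maxima are summed. The pure-Python sketch below uses made-up 2-dimensional embeddings; a real model produces one learned vector per token, and the library may further normalise the resulting score, which this sketch does not attempt.

```python
def maxsim_score(query_vectors, doc_vectors):
    """Late interaction: for each query token, take its best-matching document token."""

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


# Made-up token embeddings, one list per token
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a_tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b_tokens = [[0.5, 0.5], [0.6, 0.0]]

score_a = maxsim_score(query_tokens, doc_a_tokens)
score_b = maxsim_score(query_tokens, doc_b_tokens)
print(score_a, score_b)
```

Because documents are encoded without seeing the query, their token embeddings can be computed ahead of time; only the cheap MaxSim step depends on the query.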

Loading a ColBERT model is similar to loading any other model:

In [25]:
from rerankers import Reranker

ranker = Reranker("colbert")
results = ranker.rank(query=query, docs=docs)
results.top_k(1)



Loading default colbert model for language en
Default Model: colbert-ir/colbertv2.0
Loading ColBERTRanker model colbert-ir/colbertv2.0
No device set
Using device mps
No dtype set
Using dtype torch.float16
Loading model colbert-ir/colbertv2.0, this might take a while...


[Result(document=Document(text='Gone with the wind is an all-time classic', doc_id=1, metadata={}), score=0.9578188061714172, rank=1)]

You can of course load any non-default ColBERT model via the `model_type` argument:

In [26]:
ranker = Reranker("antoinelouis/colbertv2-camembert-L4-mmarcoFR", model_type="colbert")

Loading ColBERTRanker model antoinelouis/colbertv2-camembert-L4-mmarcoFR
No device set
Using device mps
No dtype set
Using dtype torch.float16
Loading model antoinelouis/colbertv2-camembert-L4-mmarcoFR, this might take a while...


The full list of arguments for ColBERT models is pretty similar to that of other transformers-based models, except for a few ColBERT-specific arguments you'd only want to modify for custom models:

In [27]:
ranker = Reranker(
    "colbert",
    model_type="colbert",
    verbose=1,  # How verbose the reranker will be. Defaults to 1, setting it to 0 will suppress most messages.
    dtype=None,  # Which dtype the model should use. If None, will figure out if your platform + model combo supports fp16 and use it if so, otherwise fp32.
    device=None,  # Which device the model should use. If None will figure out what the most powerful supported platform available is (cuda > mps > cpu)
    batch_size=16,  # The batch size the model will use. Defaults to 16
    query_token="[unused0]",  # A ColBERT-specific argument. The token that your model prepends to queries.
    document_token="[unused1]",  # A ColBERT-specific argument. The token that your model prepends to documents.
)

Loading default colbert model for language en
Default Model: colbert-ir/colbertv2.0
Loading ColBERTRanker model colbert-ir/colbertv2.0
No device set
Using device mps
No dtype set
Using dtype torch.float16
Loading model colbert-ir/colbertv2.0, this might take a while...


## FlashRank Rerankers

[FlashRank](https://github.com/PrithivirajDamodaran/FlashRank) is not a new approach, but an optimisation one. It's a library maintained by [Prithiviraj Damodaran](https://github.com/PrithivirajDamodaran), which provides ONNX weights optimised for CPU inference for various common reranking methods.

`rerankers` wraps FlashRank, so you can load any FlashRank model the usual way:

In [28]:
ranker = Reranker("flashrank")  # Defaults to MiniLM-L12-v2
results = ranker.rank(query=query, docs=docs)
results.top_k(1)

Loading default flashrank model for language en
Default Model: ms-marco-MiniLM-L-12-v2
Loading FlashRankRanker model ms-marco-MiniLM-L-12-v2
Loading model FlashRank model ms-marco-MiniLM-L-12-v2...


[Result(document=Document(text='Gone with the wind is a masterclass in bad storytelling.', doc_id=0, metadata={}), score=0.9945825934410095, rank=1)]

You can also load any model supported by the flashrank library, with the usual explicit `model_type` argument:

In [29]:
ranker = Reranker("ms-marco-TinyBERT-L-2-v2", model_type="flashrank")
results = ranker.rank(query=query, docs=docs)
results

INFO:flashrank.Ranker:Downloading ms-marco-TinyBERT-L-2-v2...


Loading FlashRankRanker model ms-marco-TinyBERT-L-2-v2
Loading model FlashRank model ms-marco-TinyBERT-L-2-v2...


ms-marco-TinyBERT-L-2-v2.zip: 100%|██████████| 3.26M/3.26M [00:00<00:00, 7.75MiB/s]


RankedResults(results=[Result(document=Document(text='Gone with the wind is a masterclass in bad storytelling.', doc_id=0, metadata={}), score=0.9263709783554077, rank=1), Result(document=Document(text='Gone with the wind is an all-time classic', doc_id=1, metadata={}), score=0.8937801122665405, rank=2)], query='Gone with the wind is an absolute masterpiece', has_scores=True)

## That's all folks! Thanks for checking the full overview.