# Sparse Cross-Encoder Example Notebook

This short example notebook re-ranks a set of passages with a pre-trained, fine-tuned sparse cross-encoder.

In [34]:
import sys
from textwrap import wrap

import torch
from transformers import AutoTokenizer

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

sys.path.append("..")

from sparse_cross_encoder.model.sparse_cross_encoder import (
    SparseCrossEncoderModelForSequenceClassification,
)

Load the model and tokenizer.

In [35]:
model_name = "webis/sparse-cross-encoder-4-512"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = SparseCrossEncoderModelForSequenceClassification.from_pretrained(
    model_name
)
model = model.eval().to(device)

Define an example query and a set of candidate passages.

In [16]:
# TREC DL '19, qid 146187
query = "difference between a mcdouble and a double cheeseburger"
passages = [
    "There is a discernible difference between the two burgers, especially in the meat quality. The McDouble contains gristle and carry's a different flavor than the Double Cheeseburger. The double cheeseburger tastes more like beef. The McDouble tastes of whatever meat filler they use to fill the gap.",
    "Review: Triple Cheeseburger from McDonald’s. When it comes to the various sizes of classic McDonald’s cheeseburgers, the prices might have you scratching your head. The price differences between a McDonald’s Cheeseburger, the beefier McDouble and the cheesier Double Cheeseburger now all fall within 50-cents of one another.",
    "At first glance, the double cheeseburger looks a lot like the McDouble, but it is not exactly the same thing. The main difference that you will notice is the amount of cheese that is on this sandwich. There are two slices of cheese on the double cheeseburger instead of just the one slice of cheese that comes on the McDouble. Even though it comes with two slices of cheese, it still has the onions, pickles, ketchup and mustard that come on the other burgers.",
    "But I did it for you. With McDonald's recent release of the McDouble, they now have three basic, sub-$2 cheeseburgers: the cheeseburger, McDouble (two patties with one slice of cheese), and double cheeseburger (two patties with two slices of cheese).",
    "Although we previously reported that the McDouble would be on the Dollar Menu, in Manhattan the cheeseburger appears on the Dollar Menu while the McDouble is $1.39 and the double cheeseburger is $1.49.",
    "The Double Cheesburger is now $1.19 instead of a buck. Seriously, $.19/slice of cheese and a removal from the Dollar Menu. Do I order the Double Cheeseburger at its new full price or do I stick to the Dollar Menu and go with the McDouble. Of course, I go with the McDouble. One piece of McCheese might be better than two pieces of McCheese. I guess McDonalds thinks people like me who bought the Double Cheeseburger for a buck will pony up and pay the extra pennies. Sooner or later they will be charging for extra napkins and ketchup.",
    "The Double Cheeseburger is priced at $1.49 and my Triple Cheeseburger was priced at $2.19. I just don’t understand the point of paying 70-cents for an extra burger patty when you can just get two McDoubles for $2.00. Obviously the value of the Triple Cheeseburger makes no sense when you’re comparing menu items.",
    "The McDouble is listed as a slice of American cheese between two 100% beef patties, topped with pickles, onions, ketchup and mustard. mcdonalds.com. The Double Cheeseburger is listed as two slices of golden American cheese with two 100% all beef patties, pickles, onions, ketchup and mustard.",
    "So whenever the McDouble hit the Dollar Menu and the Double Cheeseburger was thrown into the “Burgers” section at $1.19 or even higher ($1.29 in most places now), the average consumer scratched their head, winced their eyes, and fell over in a wrath of confusion and bewilderment.",
    "The difference is that the Mighty Kid's Meal provides more food than what is typically found in a Happy Meal, providing a McDouble (a cheeseburger consisting of two patties and one slice of cheese) instead of a cheeseburger or a hamburger, and more Chicken McNuggets (6 versus 4), plus a larger drink (16oz vs 12oz).",
]

Tokenize the query and passages. Special tokens are left out here (`add_special_tokens=False`) because the sparse cross-encoder pads the input and adds the special tokens itself.

In [17]:
# return_tensors="pt" yields a batch of size 1; [0] selects the 1-D tensor of token ids.
query_input_ids = tokenizer(query, return_tensors="pt", add_special_tokens=False).input_ids[0].to(device)
doc_input_ids = [
    tokenizer(passage, return_tensors="pt", add_special_tokens=False).input_ids[0].to(device)
    for passage in passages
]
print(len(doc_input_ids))

10


Feed the query and passages into the model and get the scores for each passage.

In [20]:
with torch.inference_mode():
    out = model([query_input_ids], [doc_input_ids])
print(out.logits)

[tensor([10.3506,  7.1963,  9.6861,  8.2152,  8.7845,  3.6254,  2.0268,  8.3568,
         3.7185,  8.7403])]
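
The logits hold one score per passage, in input order, and the notebook ranks passages by descending score. As a minimal sketch in plain Python, the hypothetical helper `rank_indices` below (not part of the `sparse_cross_encoder` package) turns the scores printed above into a ranking of passage indices, optionally cut off at the top `k`:

```python
# Scores copied from the printed logits above, one per passage in input order.
scores = [10.3506, 7.1963, 9.6861, 8.2152, 8.7845,
          3.6254, 2.0268, 8.3568, 3.7185, 8.7403]


def rank_indices(scores, top_k=None):
    """Return passage indices ordered from highest to lowest score.

    Hypothetical helper for illustration only.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order if top_k is None else order[:top_k]


ranking = rank_indices(scores)
top3 = rank_indices(scores, top_k=3)
print(ranking)  # zero-based indices into `passages`
print(top3)
```

On the tensor itself, `torch.argsort(out.logits[0], descending=True)` should give the same ordering without leaving PyTorch.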


Print the passages in the re-ranked order.

In [33]:
print(f"Query: {query}")
# Sort (original index, passage) pairs by descending score.
sorted_passages = sorted(
    enumerate(passages), key=lambda x: out.logits[0][x[0]].item(), reverse=True
)

for idx, passage in sorted_passages:
    print(f"Passage #{idx+1}(score={out.logits[0][idx]:.2f}):")
    print("\n".join(wrap(passage, 80)))
    print("-" * 20)
out.logits

Query: difference between a mcdouble and a double cheeseburger
Passage #1(score=10.35):
There is a discernible difference between the two burgers, especially in the
meat quality. The McDouble contains gristle and carry's a different flavor than
the Double Cheeseburger. The double cheeseburger tastes more like beef. The
McDouble tastes of whatever meat filler they use to fill the gap.
--------------------
Passage #3(score=9.69):
At first glance, the double cheeseburger looks a lot like the McDouble, but it
is not exactly the same thing. The main difference that you will notice is the
amount of cheese that is on this sandwich. There are two slices of cheese on the
double cheeseburger instead of just the one slice of cheese that comes on the
McDouble. Even though it comes with two slices of cheese, it still has the
onions, pickles, ketchup and mustard that come on the other burgers.
--------------------
Passage #5(score=8.78):
Although we previously reported that the McDouble would be on 

[tensor([10.3506,  7.1963,  9.6861,  8.2152,  8.7845,  3.6254,  2.0268,  8.3568,
          3.7185,  8.7403])]