# WOWS-Eval Autoqrels Pairwise Baseline

This is a pairwise baseline for WOWS-EVAL that uses autoqrels to predict the probability that a document is relevant to a query, given an already known relevant document.

## Step 1: Install Dependencies

In [None]:
!pip3 install wows-eval 'git+https://github.com/mam10eks/autoqrels.git'

## Step 2: Load the Data

Pairwise models take a query, a known relevant document, and a document of unknown relevance as input. They predict the probability that the unknown document is relevant to the query, given the known relevant document, and store it in the field `probability_relevant`.
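The input/output contract can be illustrated with a minimal sketch. The token-overlap heuristic below is a hypothetical stand-in chosen only because it needs no model; it is not the autoqrels approach used in this notebook, and the function names and toy records are assumptions made for illustration.

```python
# Hypothetical stand-in for a pairwise model (NOT the autoqrels approach):
# score the unknown document by its token overlap with the known relevant one.

def tokens(text):
    """Lowercase whitespace tokenization into a set of tokens."""
    return set(text.lower().split())


def predict_overlap(records):
    """Return copies of the records with an added `probability_relevant` field."""
    predictions = []
    for record in records:
        known = tokens(record["relevant"])
        unknown = tokens(record["unknown"])
        union = known | unknown
        # Jaccard overlap lies in [0, 1], so it can fill the probability field.
        score = len(known & unknown) / len(union) if union else 0.0
        predictions.append({**record, "probability_relevant": score})
    return predictions


# Toy records with the same fields as the dataset: id, query, relevant, unknown.
records = [
    {"id": "a", "query": "who sings monk theme song",
     "relevant": "The Monk theme song is by Randy Newman",
     "unknown": "Randy Newman sings the Monk theme song"},
    {"id": "b", "query": "who sings monk theme song",
     "relevant": "The Monk theme song is by Randy Newman",
     "unknown": "Jazz pianist biography"},
]

predictions = predict_overlap(records)
for prediction in predictions:
    print(prediction["id"], round(prediction["probability_relevant"], 3))
```

Any approach that keeps the input fields and adds `probability_relevant` follows the same shape; the sketch only swaps the scoring function.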

In the following, we process the pairwise smoke test dataset. Modify the variable `DATASET_ID` to submit for other datasets. See [tira.io/datasets?query=wows-eval](https://archive.tira.io/datasets?query=wows-eval) for a complete overview of dataset identifiers.


In [1]:
from tira.rest_api_client import Client
from wows_eval import evaluate as wows_evaluate
from autoqrels.oneshot import DuoPrompt
import pandas as pd

# For measuring consumed resources (e.g., GPU, CPU, RAM, etc.)
from tirex_tracker import tracking, ExportFormat

pd.set_option('display.max_colwidth', None)

DATASET_ID = 'wows-eval/pairwise-smoke-test-20250210-training'
# DATASET_ID = 'wows-eval/pairwise-20250309-test'

tira = Client()
input_data = tira.pd.inputs(DATASET_ID)



## Step 3: Look at the Data

In [2]:
input_data.head(2)

| | id | query | relevant | unknown |
|---|---|---|---|---|
| 0 | 3d080873-98a1-4388-af86-fe2c8b47ebca | who sings monk theme song | exists and is an alternate of . The Monk theme song is It's a Jungle Out There by Randy Newman. The Monk theme song is It's a Jungle Out There by Randy Newman. | Randy Newman (album) Randy Newman is the debut recording by Randy Newman, released in 1968. Unlike his later albums which featured Newman and his piano backed by guitar, bass guitar and drums, Randy Newman was highly orchestral and aimed to blend the orchestra with Newman's voice and piano. |
| 1 | 468a9e92-467f-47c9-810b-fe6fa9dca634 | who sings monk theme song | exists and is an alternate of . The Monk theme song is It's a Jungle Out There by Randy Newman. The Monk theme song is It's a Jungle Out There by Randy Newman. | One of Monk's most important contributions to jazz was his use of space and simplicity in his performances and in his compositions. Monk's unconventional use of harmony and rhythm has had a lasting influence on jazz as well. Finally, Monk's compositions are very well-known, both by performers and by listeners. |


## Step 4: Implement the AutoQrels Approach

Here, we use a prompted language model (you can modify the prompt and the backbone model) to predict the relevance of the unknown document to the query and store the probability in the field `probability_relevant`. When running the predictions, we wrap all computations in the `tracking` context manager from `tirex_tracker` to measure the resources they consume, so that the measurements can later be included in the ir-metadata of our run.
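This notebook does not show how the prompted model's answer becomes a probability. One common way for yes/no prompts, assumed here purely for illustration, is a two-way softmax over the logits of the answer tokens "yes" and "no". The function name and the example logits below are hypothetical and not taken from autoqrels.

```python
import math


def probability_from_logits(yes_logit, no_logit):
    """Two-way softmax: probability of 'yes' given the 'yes' and 'no' logits."""
    # Subtract the maximum logit before exponentiating for numerical stability.
    maximum = max(yes_logit, no_logit)
    yes = math.exp(yes_logit - maximum)
    no = math.exp(no_logit - maximum)
    return yes / (yes + no)


# Equal logits mean the model is undecided.
print(probability_from_logits(2.0, 2.0))
# A higher 'yes' logit pushes the probability above 0.5.
print(probability_from_logits(3.0, 1.0))
```

Under this assumption, a value near 0.5 means the model barely prefers either answer, which is worth keeping in mind when reading the probabilities of a small backbone model.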

In [3]:
BACKBONE_MODEL = "flan-t5-small"

PROMPT = """Determine if passage B is as relevant as passage A
for the given query.
'Passage A: "...{{ rel_doc_text | replace("\\"", "\'") }}..."
'Passage B: "...{{ unk_doc_text | replace("\\"", "\'") }}..."
'Query: "{{ query_text }}" '
"Is passage B as relevant as passage A? </s>"""

In [4]:
autoqrels_assessor = DuoPrompt(
    backbone=f'google/{BACKBONE_MODEL}',
    prompt=PROMPT,
    dataset=None
)

## Step 5: Run the Predictions and Look at the Outputs

In [5]:
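# Remove the results of previous executions
!rm -Rf runs
# Track the consumed resources and export them in the ir-metadata format
with tracking(export_format=ExportFormat.IR_METADATA, export_file_path="runs/ir_metadata.yml") as tracked:
    predictions = autoqrels_assessor.predict(input_data)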


100%|██████████| 6/6 [00:01<00:00,  3.70it/s]


In [6]:
predictions.head(3)

| | id | query | relevant | unknown | probability_relevant |
|---|---|---|---|---|---|
| 0 | 3d080873-98a1-4388-af86-fe2c8b47ebca | who sings monk theme song | exists and is an alternate of . The Monk theme song is It's a Jungle Out There by Randy Newman. The Monk theme song is It's a Jungle Out There by Randy Newman. | Randy Newman (album) Randy Newman is the debut recording by Randy Newman, released in 1968. Unlike his later albums which featured Newman and his piano backed by guitar, bass guitar and drums, Randy Newman was highly orchestral and aimed to blend the orchestra with Newman's voice and piano. | 0.616158 |
| 1 | 468a9e92-467f-47c9-810b-fe6fa9dca634 | who sings monk theme song | exists and is an alternate of . The Monk theme song is It's a Jungle Out There by Randy Newman. The Monk theme song is It's a Jungle Out There by Randy Newman. | One of Monk's most important contributions to jazz was his use of space and simplicity in his performances and in his compositions. Monk's unconventional use of harmony and rhythm has had a lasting influence on jazz as well. Finally, Monk's compositions are very well-known, both by performers and by listeners. | 0.713638 |
| 2 | 846a69d0-0c0e-4d86-baf2-c3e8d31fdc86 | who sings monk theme song | exists and is an alternate of . The Monk theme song is It's a Jungle Out There by Randy Newman. The Monk theme song is It's a Jungle Out There by Randy Newman. | Singing elegant, melancholic songs in a glamorously tattered voice, Leonard Cohen emerged from Montreal in the 1960s, an artist well into his thirties before he even made his first album. After a few records, he was royalty, on equal footing with Joni Mitchell, Randy Newman, and other top-notch singer-songwriters. | 0.684873 |


## Step 6: Evaluate and Submit Your Run

We use the `wows_evaluate` method imported above to evaluate our predictions and upload them to TIRA.

The `wows_evaluate` method has optional parameters that describe your system and include the resource measurements recorded during your computations in your submission, in the ir-metadata format. You can remove or modify those parameters for your submission. Call `help(wows_evaluate)` to see a full description.

In [7]:
wows_evaluate(
    predictions,
    DATASET_ID,
    tracking_results=tracked,
    upload=True,
    system_name=f'auto-qrels-pairwise-{BACKBONE_MODEL}',
    system_description="We use autoqrels [1] with a custom in-context learning prompt for pairwise relevance judgments.\n\n[1] - https://github.com/seanmacavaney/autoqrels",
)

Download: 36.8kiB [00:00, 537kiB/s]


Download finished. Extract...
Extraction finished:  /home/maik/.tira/extracted_datasets/wows-eval/pairwise-smoke-test-20250210-training/
Run uploaded to TIRA. Claim ownership via: https://www.tira.io/claim-submission/d27b0a49-3a94-46cb-ad48-ce3c47f9f4da


| | system | tau_ap | kendall | spearman | pearson |
|---|---|---|---|---|---|
| 0 | auto-qrels-pairwise-flan-t5-small | -0.303889 | -0.152381 | -0.178571 | -0.178571 |
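
The column names in the result above suggest correlation coefficients. What exactly `wows_evaluate` correlates is not shown in this notebook, so the following is only a sketch of how a Kendall-style rank correlation behaves in general, under the assumption that the `kendall` column refers to Kendall's tau. The function and the toy scores are hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both sequences order this pair the same way
        elif direction < 0:
            discordant += 1  # the sequences disagree on this pair
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


reference = [0.9, 0.5, 0.1]  # toy reference scores
print(kendall_tau(reference, [0.8, 0.6, 0.2]))  # same ordering
print(kendall_tau(reference, [0.2, 0.6, 0.8]))  # reversed ordering
```

In this reading, a value of 1 means full agreement with the reference ordering and -1 means a fully reversed ordering, so a negative value indicates more disagreeing than agreeing pairs.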


## Step 7: Register to TIRA and to the WOWS-EVAL task

To finalize your submission to WOWS-EVAL, you must now claim your submission via the URL printed above. For this, please register at [tira.io](https://www.tira.io), navigate to the [WOWS-EVAL](https://www.tira.io/task-overview/wows-eval/) task, and click on "Register". You can choose your team name from a list of [fictional](https://en.wikipedia.org/wiki/Category:Fictional_librarians) and [real](https://en.wikipedia.org/wiki/List_of_librarians) librarians ([please drop a message](#contact) if your favourite team name is not in the list).

After clicking on the ownership link above, you can claim ownership via a form in TIRA, where you can optionally overwrite the name and description of your system:


![example of claim ownership form](../figures/claim-run-in-tira.png)