# WOWS-Eval Naive Pairwise Baseline

This notebook implements a naive baseline for WOWS-EVAL that always predicts a probability of 50% that a document is relevant to a query.

## Step 1: Install Dependencies



In [None]:
!pip3 install 'wows-eval>=0.0.6'

## Step 2: Load the Data

Pairwise models take three inputs: a query, a document known to be relevant to the query, and a document whose relevance to the query is unknown. Given the known relevant document, they predict the probability that the unknown document is relevant to the query and write it to the field `probability_relevant`. For this naive baseline, we always predict a probability of 0.5.
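The input/output contract described above can be sketched without any TIRA dependency. This is only an illustration: the helper `naive_pairwise` and the toy record are hypothetical, and the field names `id`, `query`, `relevant`, and `unknown` are assumed from the data preview in Step 3.

```python
from typing import Dict, List


def naive_pairwise(records: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Attach a constant probability_relevant of 0.5 to every input record."""
    predictions = []
    for record in records:
        prediction = dict(record)  # copy so the input record stays untouched
        prediction["probability_relevant"] = 0.5
        predictions.append(prediction)
    return predictions


# Toy record with illustrative content; real records are loaded from TIRA.
toy_records = [
    {
        "id": "example-1",
        "query": "who sings monk theme song",
        "relevant": "The Monk theme song is It's a Jungle Out There by Randy Newman.",
        "unknown": "Randy Newman is the debut recording by Randy Newman, released in 1968.",
    }
]

toy_predictions = naive_pairwise(toy_records)
print(toy_predictions[0]["probability_relevant"])
```

A real pairwise model would replace the constant with a score computed from the three text fields; everything else about the record stays the same.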

In the following, we process the pairwise smoke test dataset. Modify the variable `DATASET_ID` to submit to other datasets. See [tira.io/datasets?query=wows-eval](https://archive.tira.io/datasets?query=wows-eval) for a complete overview of dataset identifiers.


In [None]:
from tira.rest_api_client import Client
from wows_eval import evaluate as wows_evaluate
import pandas as pd

# For measuring consumed resources (e.g., GPU, CPU, RAM, etc.)
from tirex_tracker import tracking, ExportFormat

pd.set_option('display.max_colwidth', None)

DATASET_ID = 'wows-eval/pairwise-smoke-test-20250210-training'
# DATASET_ID = 'wows-eval/pairwise-20250309-test'

tira = Client()
input_data = tira.pd.inputs(DATASET_ID)

## Step 3: Look at the data

In [2]:
input_data.head(2)

| | id | query | relevant | unknown |
|---|---|---|---|---|
| 0 | 3d080873-98a1-4388-af86-fe2c8b47ebca | who sings monk theme song | exists and is an alternate of . The Monk theme song is It's a Jungle Out There by Randy Newman. The Monk theme song is It's a Jungle Out There by Randy Newman. | Randy Newman (album) Randy Newman is the debut recording by Randy Newman, released in 1968. Unlike his later albums which featured Newman and his piano backed by guitar, bass guitar and drums, Randy Newman was highly orchestral and aimed to blend the orchestra with Newman's voice and piano. |
| 1 | 468a9e92-467f-47c9-810b-fe6fa9dca634 | who sings monk theme song | exists and is an alternate of . The Monk theme song is It's a Jungle Out There by Randy Newman. The Monk theme song is It's a Jungle Out There by Randy Newman. | One of Monk's most important contributions to jazz was his use of space and simplicity in his performances and in his compositions. Monk's unconventional use of harmony and rhythm has had a lasting influence on jazz as well. Finally, Monk's compositions are very well-known, both by performers and by listeners. |


## Step 4: Implement the Naive Baseline

Here, we simply fill the expected field `probability_relevant` with 0.5. We wrap the computation in the `tracking` context manager from `tirex_tracker` so that the consumed resources are measured and can later be included in the ir-metadata of our run.

In [3]:
!rm -Rf runs
with tracking(export_format=ExportFormat.IR_METADATA, export_file_path="runs/ir_metadata.yml") as tracked:
    # now we do the "computation"
    predictions = input_data.copy()
    predictions['probability_relevant'] = 0.5

PCM Info: setrlimit for file limit 1000000 failed with error Operation not permitted

=====  Processor information  =====
Linux arch_perfmon flag  : yes
Hybrid processor         : yes
IBRS and IBPB supported  : yes
STIBP supported          : yes
Spec arch caps supported : yes
Max CPUID level          : 32
CPU model number         : 154
ERROR: Can not open /sys/module/msr/parameters/allow_writes file.
PCM Error: can't open MSR handle for core 0 (No such file or directory)
Try no-MSR mode by setting env variable PCM_NO_MSR=1
Can not access CPUs Model Specific Registers (MSRs).
execute 'modprobe msr' as root user, then execute pcm as root user.
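Before evaluating, it can be useful to sanity-check that every predicted value is a valid probability. The following is a minimal sketch under the assumption that valid values are finite numbers in [0, 1]; the helper `check_probabilities` is hypothetical and not part of `wows-eval` or TIRA.

```python
import math
from numbers import Real
from typing import Iterable


def check_probabilities(values: Iterable[float]) -> int:
    """Return how many values were checked; raise ValueError on an invalid one."""
    count = 0
    for position, value in enumerate(values):
        # Booleans count as numbers in Python, so reject them explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError(f"value at position {position} is not a number: {value!r}")
        if math.isnan(value) or not 0.0 <= value <= 1.0:
            raise ValueError(f"value at position {position} is outside [0, 1]: {value!r}")
        count += 1
    return count


# Stand-in list; with the notebook's DataFrame one would pass
# predictions["probability_relevant"] instead.
print(check_probabilities([0.5, 0.5]))
```

For the naive baseline the check passes trivially, but it may catch problems such as missing values once the constant is replaced by a real model.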


## Step 5: Evaluate and Submit Your Run

We use the `wows_evaluate` function imported above to evaluate our predictions and upload them to TIRA.

The `wows_evaluate` function accepts optional parameters that describe your system and attach the resource measurements recorded during your computations, in the ir-metadata format, to your submission. You can remove or modify these parameters for your own submission. Call `help(wows_evaluate)` for a full description.

In [4]:
wows_evaluate(
    predictions,
    DATASET_ID,
    tracking_results=tracked,
    upload=True,
    system_name='naive-pairwise',
    system_description='A naive approach that predicts that each document is relevant with a probability of 50%.'
)

Download: 36.8kiB [00:00, 870kiB/s]


Download finished. Extract...
Extraction finished:  /root/.tira/extracted_datasets/wows-eval/pairwise-smoke-test-20250210-training/
Run uploaded to TIRA. Claim ownership via: https://www.tira.io/claim-submission/64fdd4fe-6684-4752-9997-c7ceef0f1d96


| | system | tau_ap | kendall | spearman | pearson |
|---|---|---|---|---|---|
| 0 | naive-pairwise | 0.34 | 0.361905 | 0.45 | 0.45 |


## Step 6: Register at TIRA and for the WOWS-EVAL Task

To finalize your submission to WOWS-EVAL, you must now claim your submission via the URL printed above. To do so, register at [tira.io](https://www.tira.io), navigate to the [WOWS-EVAL](https://www.tira.io/task-overview/wows-eval/) task, and click on "Register". You can choose your team name from a list of [fictional](https://en.wikipedia.org/wiki/Category:Fictional_librarians) and [real](https://en.wikipedia.org/wiki/List_of_librarians) librarians ([please drop a message](#contact) if your favourite team name is not in the list).

After clicking on the ownership link above, you can claim ownership via a form in TIRA, where you can also overwrite the name and description of your system:


![example of claim ownership form](../figures/claim-run-in-tira.png)