# IR Lab Tutorial: PyTerrier Artifacts

This tutorial shows how to use [PyTerrier Artifacts](https://dl.acm.org/doi/abs/10.1145/3726302.3730147). The idea of using artifacts is to simplify experiments and to improve reproducibility.

In the following, we will use some artifacts for the [2025 IR lab in the courses of Jena, Kassel, and Radboud](https://archive.tira.io/task-overview/ir-lab-wise-2025), whose use we aim to encourage in the [2026 edition of the WOWS workshop](https://github.com/OpenWebSearch/wows-code/tree/main/ecir26) at [ECIR 2026](https://ecir2026.eu/).

## Preparation: Install dependencies

In [None]:
!pip3 install 'python-terrier>=1.0' 'tira>=0.0.189'

## Our Scenario

We want to simplify access to some artifacts. The aim is to allow simple comparisons against other approaches by directly loading runs and prepared indexes; having a run allows statistical comparisons against strong baselines.

A complete overview (currently in alpha) of the PyTerrier Artifacts available in TIRA can be found at: [tira.io/tirex/pyterrier-artifacts-beta?query=ir-lab-wise-2025](https://www.tira.io/tirex/pyterrier-artifacts-beta?query=ir-lab-wise-2025)
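
A run is a ranked list of documents per query. As a rough illustration of what such an artifact contains, the sketch below parses lines in the standard TREC run format (`qid Q0 docno rank score tag`). That the runs in TIRA are stored in exactly this layout is an assumption here, and `parse_trec_run` as well as the example lines are hypothetical, not part of PyTerrier or TIRA.

```python
def parse_trec_run(lines):
    """Parse TREC-format run lines into a list of dicts (hypothetical helper)."""
    rows = []
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        # Standard TREC run columns: qid, the literal "Q0", docno, rank, score, tag
        qid, _q0, docno, rank, score, tag = line.split()
        rows.append({
            "qid": qid,
            "docno": docno,
            "rank": int(rank),
            "score": float(score),
            "tag": tag,
        })
    return rows


# Toy lines for illustration only; they are not taken from the tutorial's dataset.
example_lines = [
    "1 Q0 doc-a 1 12.5 toy-bm25",
    "1 Q0 doc-b 2 11.0 toy-bm25",
]
run = parse_trec_run(example_lines)
```

Each row carries the query id, the document id, and the score that determines the ranking, which is all an evaluation needs in addition to the relevance judgments.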


In [2]:
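import pyterrier as pt
from tira.third_party_integrations import ensure_pyterrier_is_loaded

dataset_id = 'radboud-validation-20251114-training'
# Make sure PyTerrier is initialized before it is used
ensure_pyterrier_is_loaded()
# The "irds:" prefix loads the dataset through the ir_datasets integration
pt_dataset = pt.datasets.get_dataset(f"irds:ir-lab-wise-2025/{dataset_id}")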

## Usage of Artifacts for Comparisons Against Baselines

First, we load some runs as baselines for comparison:

In [3]:
chatnoir_bm25 = pt.Artifact.from_url(f"tira:{dataset_id}/ows/chatnoir-title-bm25-100")
chatnoir_default = pt.Artifact.from_url(f"tira:{dataset_id}/ows/chatnoir-title-default-10")
pyterrier_bm25 = pt.Artifact.from_url(f"tira:{dataset_id}/ows/pyterrier-BM25-on-default")
pyterrier_dirichlet = pt.Artifact.from_url(f"tira:{dataset_id}/ows/pyterrier-DirichletLM-on-default")


In [5]:
pt.Experiment(
    [chatnoir_bm25, chatnoir_default, pyterrier_bm25, pyterrier_dirichlet],
    pt_dataset.get_topics("title"),
    pt_dataset.get_qrels(),
    ["ndcg_cut.10"],
    ["ChatNoir (BM25)", "ChatNoir (Default)", "PyTerrier (BM25)", "PyTerrier (DirichletLM)"]
)

| | name | ndcg_cut.10 |
|---|---|---|
| 0 | ChatNoir (BM25) | 0.24184 |
| 1 | ChatNoir (Default) | 0.398346 |
| 2 | PyTerrier (BM25) | 0.451635 |
| 3 | PyTerrier (DirichletLM) | 0.352952 |
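
The `ndcg_cut.10` column reports nDCG at a cutoff of 10, averaged over all topics. The following is a minimal sketch of the per-topic computation, assuming the common formulation with linear gain and a log2 rank discount; the exact variant used by the evaluation backend may differ. The function names and the toy judgments are hypothetical.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: gain at rank r is divided by log2(r + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranked_docnos, qrels, k=10):
    """nDCG@k for one topic; unjudged documents count as non-relevant."""
    gains = [qrels.get(docno, 0) for docno in ranked_docnos[:k]]
    # The ideal ranking sorts all judged documents by decreasing relevance.
    ideal_gains = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal_gains)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy judgments for a single topic (illustration only).
toy_qrels = {"d1": 2, "d2": 1, "d3": 0}
perfect = ndcg_at_k(["d1", "d2", "d3"], toy_qrels)
swapped = ndcg_at_k(["d2", "d1", "d3"], toy_qrels)
```

Here `perfect` is 1.0 because the ranking matches the ideal order, while `swapped` is lower because the most relevant document appears at rank 2.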


## Usage of Artifacts for Simplified Development

In the following, we will load a PyTerrier index via the Artifacts API from TIRA. On the dataset that we use, creating a PyTerrier index takes about 30 minutes, so directly loading the index can simplify development.

In [11]:
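# Load the prebuilt index as an artifact instead of indexing locally
index = pt.Artifact.from_url(f"tira:{dataset_id}/ows/pyterrier-index-default")

# Two retrievers over the same index that differ only in the weighting model
bm25 = pt.terrier.Retriever(index, wmodel="BM25")
pl2 = pt.terrier.Retriever(index, wmodel="PL2")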

In [12]:
pt.Experiment(
    [bm25, pl2],
    pt_dataset.get_topics("title"),
    pt_dataset.get_qrels(),
    ["ndcg_cut.10"]
)

| | name | ndcg_cut.10 |
|---|---|---|
| 0 | TerrierRetr(BM25) | 0.451635 |
| 1 | TerrierRetr(PL2) | 0.47269 |
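
Mean nDCG@10 values alone do not show whether one system is reliably better than another; for that, the per-topic scores of both systems are compared in a paired test. `pt.Experiment` accepts a `baseline` argument for this purpose. The stdlib-only sketch below only illustrates the underlying paired t statistic; `paired_t_statistic` is a hypothetical helper, and the per-topic scores are toy values, not results from this dataset.

```python
import math
import statistics


def paired_t_statistic(baseline_scores, system_scores):
    """Return (t, degrees of freedom) for per-topic score differences."""
    if len(baseline_scores) != len(system_scores):
        raise ValueError("Both systems need one score per topic.")
    if len(baseline_scores) < 2:
        raise ValueError("At least two topics are required.")
    diffs = [s - b for b, s in zip(baseline_scores, system_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation);
    # this is zero, and the division fails, if all differences are identical.
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_err, n - 1


# Toy per-topic nDCG@10 values for two hypothetical systems.
toy_baseline = [0.40, 0.50, 0.30, 0.60, 0.45]
toy_system = [0.50, 0.55, 0.35, 0.70, 0.50]
t_stat, dof = paired_t_statistic(toy_baseline, toy_system)
```

The resulting statistic would then be compared against a t distribution with `dof` degrees of freedom to obtain a p-value, which requires a statistics library beyond the Python standard library.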
