# Tutorial on Retrieval Engines for the lsr-benchmark

This tutorial shows step by step how retrieval engines work in the `lsr-benchmark` and how you can evaluate your own retrieval engine against existing ones.

You can find an overview of all implemented retrieval engines in the [step-03-retrieval-approaches](https://github.com/reneuir/lsr-benchmark/tree/main/step-03-retrieval-approaches) directory.

We welcome contributions of new retrieval engines. Please add them to the `step-03-retrieval-approaches` directory via a pull request.

## Step 1: Execute Retrieval Engines

All retrieval engines are implemented in the `step-03-retrieval-approaches` directory.

For this tutorial, we run the PyTerrier SPLADE engine implemented in the script [pyterrier-splade/run-pyterrier-splade.py](https://github.com/reneuir/lsr-benchmark/blob/main/step-03-retrieval-approaches/pyterrier-splade/run-pyterrier-splade.py), as it needs only a few dependencies to execute.

We will execute the retrieval engine with different embeddings on the `clueweb09/en/trec-web-2009` dataset.

First, we run the `--help` command to see what options we have:

In [2]:
!../step-03-retrieval-approaches/pyterrier-splade/run-pyterrier-splade.py --help

Usage: run-pyterrier-splade.py [OPTIONS]

Options:
  --k INTEGER                   Number of results to return per each query.
  --embedding EMBEDDING_OR_DIR  The embedding model.
  --output PATH                 The directory where the output should be
                                stored.  [required]
  --dataset DATASET_OR_DIR      The dataset id or a local directory.
                                [required]
  --help                        Show this message and exit.


Next, we get an overview of the available embeddings by passing an invalid value to `--embedding`, which makes the script list the supported ones:

In [5]:
!../step-03-retrieval-approaches/pyterrier-splade/run-pyterrier-splade.py \
    --output does-not-matter \
    --dataset clueweb09/en/trec-web-2009 \
    --embedding does-not-exist

Usage: run-pyterrier-splade.py [OPTIONS]
Try 'run-pyterrier-splade.py --help' for help.

Error: Invalid value for '--embedding': 'does-not-exist' is not a supported embedding (bge-m3, bm25, castorini-unicoil-noexp-msmarco-passage, naver-splade-v3, naver-splade-v3-distilbert, naver-splade-v3-doc, naver-splade-v3-lexical, naver-splade_v2_distil, opensearch-project-opensearch-neural-sparse-encoding-doc-v2-distill, opensearch-project-opensearch-neural-sparse-encoding-doc-v2-mini, opensearch-project-opensearch-neural-sparse-encoding-doc-v3-distill, opensearch-project-opensearch-neural-sparse-encoding-v2-distill, webis-splade) or a valid directory path
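
The error message suggests that `--embedding` accepts either one of the listed identifiers or a path to an existing directory. The following is a minimal sketch of such a check in plain Python. The function name `resolve_embedding` and the shortened `SUPPORTED` set are hypothetical and only for illustration; this is not the code used by `lsr-benchmark`.

```python
from pathlib import Path

# Illustrative subset of the identifiers listed in the error message above.
SUPPORTED = {"bm25", "bge-m3", "naver-splade-v3", "webis-splade"}


def resolve_embedding(value: str, supported=SUPPORTED) -> str:
    """Hypothetical helper: accept a known identifier or an existing directory."""
    if value in supported or Path(value).is_dir():
        return value
    raise ValueError(
        f"{value!r} is not a supported embedding "
        f"({', '.join(sorted(supported))}) or a valid directory path"
    )
```

Under this assumption, `resolve_embedding("webis-splade")` returns the identifier unchanged, while an unknown name that is not a directory raises a `ValueError`.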


Now, we run it first with `webis-splade` and then with `opensearch-project-opensearch-neural-sparse-encoding-doc-v3-distill` as the embedding model:

In [6]:
!../step-03-retrieval-approaches/pyterrier-splade/run-pyterrier-splade.py \
    --output webis-splade \
    --dataset clueweb09/en/trec-web-2009 \
    --embedding webis-splade

Java started and loaded: pyterrier.java, pyterrier.terrier.java [version=5.11 (build: craig.macdonald 2025-01-13 21:29), helper_version=0.0.8]
load doc embeddings: 100%|█████████| 109683/109683 [00:00<00:00, 2193526.12it/s]
transform dataset: 100%|██████████████| 109683/109683 [00:19<00:00, 5753.18it/s]

=====  Processor information  =====
Linux arch_perfmon flag  : yes
Hybrid processor         : yes
IBRS and IBPB supported  : yes
STIBP supported          : yes
Spec arch caps supported : yes
Max CPUID level          : 32
CPU model number         : 154
Number of logical cores: 16
Number of online logical cores: 16
Threads (logical cores) per physical core: 2 (maybe imprecise due to core offlining/hybrid CPU)
Offlined cores: 
Num sockets: 1
Physical cores per socket: 8 (maybe imprecise due to core offlining/hybrid CPU)
Core PMU (perfmon) version: 5
Number of core PMU generic (programmable) counters: 6
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3


In [7]:
!../step-03-retrieval-approaches/pyterrier-splade/run-pyterrier-splade.py \
    --output opensearch-doc-v3 \
    --dataset clueweb09/en/trec-web-2009 \
    --embedding opensearch-project-opensearch-neural-sparse-encoding-doc-v3-distill

Java started and loaded: pyterrier.java, pyterrier.terrier.java [version=5.11 (build: craig.macdonald 2025-01-13 21:29), helper_version=0.0.8]
load doc embeddings: 100%|█████████| 109683/109683 [00:00<00:00, 1580933.85it/s]
transform dataset: 100%|██████████████| 109683/109683 [00:21<00:00, 5147.14it/s]

=====  Processor information  =====
Linux arch_perfmon flag  : yes
Hybrid processor         : yes
IBRS and IBPB supported  : yes
STIBP supported          : yes
Spec arch caps supported : yes
Max CPUID level          : 32
CPU model number         : 154
Number of logical cores: 16
Number of online logical cores: 16
Threads (logical cores) per physical core: 2 (maybe imprecise due to core offlining/hybrid CPU)
Offlined cores: 
Num sockets: 1
Physical cores per socket: 8 (maybe imprecise due to core offlining/hybrid CPU)
Core PMU (perfmon) version: 5
Number of core PMU generic (programmable) counters: 6
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
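
The two invocations above differ only in `--output` and `--embedding`, so comparing further embeddings can be scripted. The sketch below only assembles the command lines from the flags shown in the `--help` output. The helper `build_command` and the `RUNS` mapping are assumptions for illustration; actually executing the commands is left as a comment.

```python
import shlex

SCRIPT = "../step-03-retrieval-approaches/pyterrier-splade/run-pyterrier-splade.py"
DATASET = "clueweb09/en/trec-web-2009"

# Output directory -> embedding identifier, as used in the two runs above.
RUNS = {
    "webis-splade": "webis-splade",
    "opensearch-doc-v3": "opensearch-project-opensearch-neural-sparse-encoding-doc-v3-distill",
}


def build_command(output, embedding, dataset=DATASET):
    """Hypothetical helper: build the argument list for one retrieval run."""
    return [SCRIPT, "--output", output, "--dataset", dataset, "--embedding", embedding]


commands = [build_command(output, embedding) for output, embedding in RUNS.items()]
for command in commands:
    # Print a shell-safe command line; to execute it instead, one could use
    # subprocess.run(command, check=True).
    print(shlex.join(command))
```

Keeping the output directory name separate from the embedding identifier lets the run names that are later passed to the evaluation stay short.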


## Step 2: Run Evaluation

Finally, we evaluate the efficiency and effectiveness of the two runs that we executed.

In [20]:
!lsr-benchmark evaluate webis-splade opensearch-doc-v3

100%|█████████████████████████████████████████████| 2/2 [00:01<00:00,  1.42it/s]
                                                 webis-splade           opensearch-doc-v3
index.runtime_wallclock                              66055 ms                    53027 ms
index.energy_total                                       13.0                        10.0
retrieval.runtime_wallclock                           5538 ms                     1380 ms
retrieval.energy_total                                    1.0                         0.0
embedding/doc.runtime_wallclock                     384318 ms                   420054 ms
embedding/doc.energy_total                            76231.0                     91142.0
embedding/query.runtime_wallclock                     1969 ms                     1075 ms
embedding/query.energy_total                            136.0                        77.0
nDCG(judged_only=True)@10                            0.243467                    0.253003
RR                 