# Evaluation Notebook

This notebook first evaluates an existing BM25 run and then prototypes variations on it.

### Step 1: Import All Libraries


In [2]:
from tira.third_party_integrations import ensure_pyterrier_is_loaded
from tira.rest_api_client import Client

# This method ensures that PyTerrier is loaded so that it also works in the TIRA sandbox
ensure_pyterrier_is_loaded()
import pyterrier as pt

tira = Client()

### Step 2: Some Analysis

In [6]:
dataset_id = 'longeval-tiny-train-20240315-training'
pt_dataset = pt.get_dataset(f'irds:ir-lab-padua-2024/{dataset_id}')

In [15]:
# This assumes that the retrieve-with-pyterrier.ipynb notebook has been executed before to create the run.txt file
bm25 = pt.io.read_results('run.txt')

pt.Experiment(
    [bm25],
    pt_dataset.get_topics('title'),
    pt_dataset.get_qrels(),
    ["ndcg_cut.10", "recip_rank", "recall_1000"],
    names=["BM25"]
)

| | name | ndcg_cut.10 | recip_rank | recall_1000 |
|---|---|---|---|---|
| 0 | BM25 | 0.177146 | 0.262423 | 0.829312 |
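To make the three measures in this table concrete, here is a minimal sketch in plain Python that computes reciprocal rank, nDCG@10, and recall for a single query. The document IDs and relevance labels are made-up toy values, and the nDCG variant used here (linear gain, log2 discount) is an assumption; the evaluation backend behind `pt.Experiment` may follow slightly different conventions.

```python
import math

def reciprocal_rank(ranking, qrels):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, docno in enumerate(ranking, start=1):
        if qrels.get(docno, 0) > 0:
            return 1.0 / rank
    return 0.0

def dcg(gains):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))

def ndcg_at_k(ranking, qrels, k=10):
    """DCG of the top-k ranking divided by the DCG of an ideal ordering."""
    gains = [qrels.get(docno, 0) for docno in ranking[:k]]
    ideal_gains = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal_gains)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

def recall_at_k(ranking, qrels, k=1000):
    """Fraction of all relevant documents that appear in the top k."""
    relevant = {docno for docno, rel in qrels.items() if rel > 0}
    if not relevant:
        return 0.0
    return len(relevant.intersection(ranking[:k])) / len(relevant)

# Toy data (hypothetical): graded relevance judgments and one ranking.
toy_qrels = {"d1": 2, "d2": 1, "d3": 0}
toy_ranking = ["d3", "d1", "d4", "d2"]

print("recip_rank :", reciprocal_rank(toy_ranking, toy_qrels))
print("ndcg_cut.10:", round(ndcg_at_k(toy_ranking, toy_qrels, k=10), 4))
print("recall_1000:", recall_at_k(toy_ranking, toy_qrels, k=1000))
```

The values reported by `pt.Experiment` are averages of such per-query scores over all topics.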


### Step 3: Do some prototyping, e.g., try to improve upon BM25 :)

In [12]:
dataset_id = 'longeval-tiny-train-20240315-training'
pt_dataset = pt.get_dataset(f'irds:ir-lab-padua-2024/{dataset_id}')

In [13]:
print('Build index:')
# Both the indexer and BatchRetrieve use Terrier's default Porter stemmer and no stopword removal
iter_indexer = pt.IterDictIndexer(
    "/tmp/index", overwrite=True, blocks=True,
    meta={'docno': 100, 'text': 20480},
    stemmer='PorterStemmer', stopwords=[],
)
!rm -Rf /tmp/index
index_ref = iter_indexer.index(pt_dataset.get_corpus_iter())

print('Done. Index is created')

Build index:


Download: 83.2MiB [00:23, 3.74MiB/s]


Download finished. Extract...
Extraction finished:  /root/.tira/extracted_datasets/ir-lab-padua-2024/longeval-tiny-train-20240315-training/


ir-lab-padua-2024/longeval-tiny-train-20240315-training documents: 100%|██████████| 47064/47064 [00:53<00:00, 882.90it/s] 


Done. Index is created


In [17]:
index = pt.IndexFactory.of(index_ref)

bm25_no_stopwords = pt.BatchRetrieve(index, wmodel="BM25", verbose=True)
pl2_no_stopwords = pt.BatchRetrieve(index, wmodel="PL2", verbose=True)
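For readers unfamiliar with the `BM25` weighting model selected above, the following is a minimal sketch of the textbook BM25 formula in plain Python. It is not Terrier's implementation, which differs in details; the parameter values `k1=1.2` and `b=0.75` are commonly used defaults and are assumptions here, and the term statistics in the example are hypothetical (only the collection size of 47064 documents comes from the indexing log).

```python
import math

def bm25_term_score(tf, df, doc_len, avg_doc_len, num_docs, k1=1.2, b=0.75):
    """Contribution of one query term to one document's score (textbook BM25)."""
    if tf == 0:
        return 0.0
    # Rare terms (small document frequency df) receive a larger weight.
    idf = math.log(1.0 + (num_docs - df + 0.5) / (df + 0.5))
    # Documents longer than average are penalised, controlled by b.
    length_norm = 1.0 - b + b * doc_len / avg_doc_len
    # Term frequency saturates: each extra occurrence adds less, controlled by k1.
    return idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)

def bm25_score(query_terms, doc_tf, doc_len, avg_doc_len, num_docs, doc_freqs):
    """Sum the per-term contributions over all query terms."""
    return sum(
        bm25_term_score(doc_tf.get(term, 0), doc_freqs.get(term, 0),
                        doc_len, avg_doc_len, num_docs)
        for term in query_terms
    )

# Hypothetical statistics for a two-term query and a single document.
example_score = bm25_score(
    query_terms=["car", "insurance"],
    doc_tf={"car": 3, "insurance": 1},
    doc_len=250,
    avg_doc_len=300.0,
    num_docs=47064,
    doc_freqs={"car": 1200, "insurance": 800},
)
print(example_score)
```

PL2 follows a different idea (divergence from randomness), so its scores are not comparable to BM25 scores in absolute terms; only the resulting rankings are compared in the experiment.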

In [18]:
pt.Experiment(
    [bm25_no_stopwords, pl2_no_stopwords, bm25],
    pt_dataset.get_topics('title'),
    pt_dataset.get_qrels(),
    ["ndcg_cut.10", "recip_rank", "recall_1000"],
    names=["BM25 (No stopwords)", "PL2 (No stopwords)", "BM25"]
)

BR(BM25):   0%|          | 0/672 [00:00<?, ?q/s]

BR(BM25): 100%|██████████| 672/672 [06:41<00:00,  1.67q/s]
BR(PL2): 100%|██████████| 672/672 [06:55<00:00,  1.62q/s]


| | name | ndcg_cut.10 | recip_rank | recall_1000 |
|---|---|---|---|---|
| 0 | BM25 (No stopwords) | 0.166217 | 0.245907 | 0.818721 |
| 1 | PL2 (No stopwords) | 0.150624 | 0.228728 | 0.777832 |
| 2 | BM25 | 0.177146 | 0.262423 | 0.829312 |
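The size of these differences is easier to judge as relative changes against the BM25 run loaded from `run.txt`. The sketch below recomputes them from the numbers reported in the results; the dictionary keys and the helper name `relative_change` are illustrative choices, not part of PyTerrier.

```python
# Scores copied from the experiment results.
baseline = {"ndcg_cut.10": 0.177146, "recip_rank": 0.262423, "recall_1000": 0.829312}
variants = {
    "bm25_no_stopword_removal": {"ndcg_cut.10": 0.166217, "recip_rank": 0.245907, "recall_1000": 0.818721},
    "pl2_no_stopword_removal": {"ndcg_cut.10": 0.150624, "recip_rank": 0.228728, "recall_1000": 0.777832},
}

def relative_change(value, reference):
    """Relative difference of value against reference, in percent."""
    return (value - reference) / reference * 100.0

for name, scores in variants.items():
    for measure, value in scores.items():
        change = relative_change(value, baseline[measure])
        print(f"{name:26s} {measure:12s} {change:+.2f}%")
```

Relative changes alone do not show whether a difference is statistically significant; that would require a paired test over per-query scores.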


### Next Steps

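The default stopword list seems to be more effective than using no stopword list: the BM25 run from run.txt reaches an nDCG@10 of 0.177, compared to 0.166 for BM25 without stopword removal :)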
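As a reminder of what stopword removal does to a query or document before matching, here is a minimal sketch in plain Python. The stopword set `TOY_STOPWORDS` and the whitespace tokenizer are hypothetical simplifications; Terrier ships its own, much longer stopword list and a more careful tokenizer.

```python
# Hypothetical toy stopword list, for illustration only.
TOY_STOPWORDS = {"the", "of", "in", "a", "to", "and", "is"}

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def remove_stopwords(tokens, stopwords=TOY_STOPWORDS):
    """Keep only the tokens that are not in the stopword list."""
    return [token for token in tokens if token not in stopwords]

query = "the history of the car in France"
tokens = tokenize(query)

print("without stopword removal:", tokens)
print("with stopword removal   :", remove_stopwords(tokens))
```

Without removal, frequent function words stay in the index and in the queries, where they can match many non-relevant documents; this is one plausible reason for the lower scores observed here.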

For more examples and ideas to try out, have a look at the following resources:

- A dashboard giving an overview of the retrieval components available in [TIRA](https://www.tira.io/)/[TIREx](https://www.tira.io/tirex), with tutorials on how you could re-use them: [https://tira-io.github.io/teaching-ir-with-shared-tasks/](https://tira-io.github.io/teaching-ir-with-shared-tasks/)
- PyTerrier Tutorial: [https://github.com/terrier-org/ecir2021tutorial](https://github.com/terrier-org/ecir2021tutorial)