# LTR Experimentation Report

This notebook builds a learning-to-rank (LTR) pipeline and runs several experiments with PyTerrier.


## Prerequisites

- Java installed
- Jupyter Notebook
- Python 3
- pip3
- pyterrier
- scikit-learn (`sklearn`)
- numpy
- pandas
- matplotlib


## How to install pyterrier

```
pip install python-terrier
pip install --upgrade git+https://github.com/terrier-org/pyterrier.git#egg=python-terrier
```

In [None]:
import numpy as np
import pandas as pd
import pyterrier as pt
import time
import os
import matplotlib.pyplot as plt  # plotting libraries

if not pt.started():  # initializes PyTerrier; make sure you are running on a Unix-like system
  pt.init()

In [None]:
start = time.time() # start time of notebook -- used when running all cells to see how long it will take to run

## Load Dataset

A Boolean flag toggles between the Vaswani dataset and the TREC Deep Learning documents dataset.
When the TREC Deep Learning dataset is selected, the cell below downloads and indexes it unless a local index already exists.

In [None]:
b = False  # True: Vaswani dataset; False: TREC Deep Learning documents
if b:
    dataset = pt.datasets.get_dataset("vaswani")
    indexref = dataset.get_index()
    topics = dataset.get_topics()
    qrels = dataset.get_qrels()
else:
    dataset = pt.datasets.get_dataset("trec-deep-learning-docs")
    corpus = dataset.get_corpus()
    index_path = './trec_dldocs_index'
    if not os.path.isdir(index_path):
        indexer = pt.TRECCollectionIndexer(index_path)
        index_properties = {'block.indexing': 'true', 'invertedfile.lexiconscanner': 'pointers'}
        indexer.setProperties(**index_properties)

        indexref = indexer.index(dataset.get_corpus())
    else:
        indexref = pt.autoclass("org.terrier.querying.IndexRef").of(os.path.join(index_path, "data.properties"))

    topics = dataset.get_topics('test')
    qrels = dataset.get_qrels('test')

## Naive Rankers

This cell initializes several term-frequency-based retrieval models.

In [None]:
BM25 = pt.BatchRetrieve(indexref, controls={"wmodel": "BM25"})
TF_IDF = pt.BatchRetrieve(indexref, controls={"wmodel": "TF_IDF"})
PL2 = pt.BatchRetrieve(indexref, controls={"wmodel": "PL2"})
DPH = pt.BatchRetrieve(indexref, controls={"wmodel": "DPH"})
PL2F = pt.BatchRetrieve(indexref, controls={"wmodel": "PL2F"})

## Feature Retriever

The feature retriever is responsible for fetching documents and annotating them with features.

In [None]:
feature_batch_retriever = pt.FeaturesBatchRetrieve(indexref, controls = {"wmodel": "BM25"}, features=["WMODEL:TF_IDF", "WMODEL:PL2", "WMODEL:DPH", "WMODEL:BM25"]) 

In [None]:
# example query using the rank cutoff operator (%): keep only the top 2 results
(BM25 %2).transform("world")

## LTR Algorithm

This section creates the LTR model that re-ranks the top k results.

In [None]:
# split the topics 60/20/20 into train, validation, and test sets
train_topics, valid_topics, test_topics = np.split(topics, [int(.6*len(topics)), int(.8*len(topics))])
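
The cut points passed to `np.split` above are 60% and 80% of the number of topics, which gives a 60/20/20 split. As a minimal sketch of the same arithmetic on a plain Python list (the helper name `split_60_20_20` and the toy topic ids are illustrative and not part of the notebook):

```python
def split_60_20_20(items):
    """Split a sequence into train/validation/test parts at 60% and 80%."""
    first_cut = int(0.6 * len(items))
    second_cut = int(0.8 * len(items))
    return items[:first_cut], items[first_cut:second_cut], items[second_cut:]


# toy stand-in for the topics frame: ten hypothetical query ids
toy_topics = list(range(10))
train, valid, test = split_60_20_20(toy_topics)
print(len(train), len(valid), len(test))
```

Note that the split is sequential: nothing is shuffled, so the test set is simply the last 20% of the topics in their original order.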

In [None]:
# trains vanilla random forest regression LTR model
from sklearn.ensemble import RandomForestRegressor

ltr_model = feature_batch_retriever >> pt.pipelines.LTR_pipeline(RandomForestRegressor(n_estimators=400))
ltr_model.fit(train_topics, qrels)

In [None]:
# displays table of baseline performance of LTR model compared against some of our other statistical models
pt.pipelines.Experiment([TF_IDF, BM25, PL2, ltr_model], test_topics, qrels, ["map", "ndcg"], names=["TF-IDF", "BM25 algorithm", "PL2 Baseline", "LTR Random Forest"])
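
To make the `map` and `ndcg` columns of this table easier to read, here is a minimal sketch of both measures for a single query. It assumes relevance labels are used directly as gains and a `log2(rank + 1)` discount; evaluation tools may differ in details such as tie handling. The function names and the toy ranking are illustrative only. MAP is the mean of the per-query average precision over all topics.

```python
import math


def average_precision(ranked_rels, num_relevant):
    """AP for one query: precision@i summed over relevant ranks i, divided by the number of relevant documents."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_rels, start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_relevant if num_relevant else 0.0


def ndcg(ranked_rels, all_rels):
    """nDCG for one query: DCG of the ranking divided by the DCG of the ideal ordering."""
    def dcg(rels):
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))

    ideal = dcg(sorted(all_rels, reverse=True))
    return dcg(ranked_rels) / ideal if ideal else 0.0


# hypothetical ranking for one query: relevant, non-relevant, relevant
ranked = [1, 0, 1]
print(average_precision(ranked, num_relevant=2))
print(ndcg(ranked, all_rels=[1, 1]))
```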

## Experiments

Now that we have our LTR model, we can run experiments with it.

In [None]:
import matplotlib.pyplot as plt  # plotting libraries

# a simple experiment that measures how the effectiveness of a re-ranking pipeline changes with the cutoff k
def run_experiment(trained_model, mop, start=10, finish=600, incrementer=25, top_k_model=BM25, title="sample title"):
    k_list = []
    moe_list = []
    for k in range(start, finish, incrementer):
        # keep the top k results of the first-stage ranker, then re-rank them
        efficient_pipeline = top_k_model % k >> trained_model
        results = pt.pipelines.Experiment([efficient_pipeline], test_topics, qrels, [mop], names=["model"])
        k_list.append(k)
        moe_list.append(results[mop].iloc[0])

    # plot once, after every value of k has been evaluated
    plt.plot(k_list, moe_list)
    plt.xlabel("K")
    plt.ylabel(mop)
    file_name = title.replace(" ", "-")
    plt.title(title)
    plt.savefig(file_name)
        
run_experiment(ltr_model, "ndcg", title="K's Effect on NDCG in Learning to Rank")

In [None]:
run_experiment(ltr_model, "map", incrementer=100, finish=2000, title="K's Effect on MAP in Learning to Rank")
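
The pipeline `top_k_model % k >> trained_model` built inside `run_experiment` has three stages: retrieve, cut off at rank k, then re-rank. The toy sketch below mimics that flow in plain Python. Both scoring functions are hypothetical stand-ins (term counting and document length), not the BM25 or random forest models used in the notebook.

```python
def first_stage(query, docs):
    """Hypothetical first-stage scorer: count query-term occurrences in each document."""
    terms = query.lower().split()
    scored = [(doc_id, sum(text.lower().split().count(t) for t in terms))
              for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def cutoff(ranking, k):
    """Keep only the k highest-ranked results (the role `% k` plays in the pipeline)."""
    return ranking[:k]


def rerank(ranking, docs):
    """Hypothetical second stage: prefer shorter documents among the candidates."""
    return sorted(ranking, key=lambda pair: len(docs[pair[0]].split()))


docs = {
    "d1": "world news and world events",
    "d2": "hello world",
    "d3": "cooking recipes",
}
candidates = cutoff(first_stage("world", docs), k=2)
final = rerank(candidates, docs)
print([doc_id for doc_id, _ in final])
```

The second stage can only reorder the k candidates it receives, so a document that misses the cutoff cannot be recovered. A larger k gives the re-ranker more to work with at the cost of scoring more documents, which is the trade-off these experiments explore.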

In [None]:
from datetime import datetime

# measures how execution time over the test topics varies with k for BM25, PL2, and their LTR pipelines
def run_experiment_execution_time(trained_model, samples=1, start=10, finish=600, incrementer=20, title="K Versus Time to Rank"):
    k_list = []
    moe_list_mb25 = []
    moe_list_pl2 = []
    moe_bm25_baseline = []
    moe_pl2_baseline = []
    for k in range(start, finish, incrementer):
        total = 0.0
        for i in range(0, samples):
            start_time = datetime.now()
            efficient_pipeline = BM25 % k >> trained_model
            results = pt.pipelines.Experiment([efficient_pipeline], test_topics, qrels, ["map"], names=["model"])
            finish_time = datetime.now()
            elapse_time = (finish_time - start_time).total_seconds()
            total += elapse_time
        k_list.append(k)
        moe_list_mb25.append(total/samples)
        total = 0.0
        for i in range(0, samples):
            start_time = datetime.now()
            efficient_pipeline = PL2 % k >> trained_model
            results = pt.pipelines.Experiment([efficient_pipeline], test_topics, qrels, ["map"], names=["model"])
            finish_time = datetime.now()
            elapse_time = (finish_time - start_time).total_seconds()
            total += elapse_time
        moe_list_pl2.append(total/samples)
        
        total = 0.0
        for i in range(0, samples):
            start_time = datetime.now()
            efficient_pipeline = BM25 % k
            results = pt.pipelines.Experiment([efficient_pipeline], test_topics, qrels, ["map"], names=["model"])
            finish_time = datetime.now()
            elapse_time = (finish_time - start_time).total_seconds()
            total += elapse_time
        moe_bm25_baseline.append(total/samples)
        
        total = 0.0
        for i in range(0, samples):
            start_time = datetime.now()
            efficient_pipeline = PL2 % k
            results = pt.pipelines.Experiment([efficient_pipeline], test_topics, qrels, ["map"], names=["model"])
            finish_time = datetime.now()
            elapse_time = (finish_time - start_time).total_seconds()
            total += elapse_time
        moe_pl2_baseline.append(total/samples)
        
        
        
    plt.plot(k_list, moe_list_mb25, label="LTR BM25 Pipeline")
    plt.plot(k_list, moe_list_pl2, label="LTR PL2 Pipeline")
    plt.plot(k_list, moe_bm25_baseline, label="BM25")
    plt.plot(k_list, moe_pl2_baseline, label="PL2")
    
    plt.xlabel("K")
    plt.ylabel("Time Seconds")
    file_name = title.replace(" ", "-")
    plt.title(title)
    plt.legend()
    plt.savefig(file_name)


In [None]:
run_experiment_execution_time(ltr_model, incrementer=20, samples=3, title="K Versus Time to Rank Trec Deep Learning")
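
The four timing loops in `run_experiment_execution_time` follow one pattern: run a workload `samples` times and average the elapsed wall-clock time. A minimal sketch of that pattern as a reusable helper is shown below. It assumes `time.perf_counter` as the clock, whereas the notebook uses `datetime.now()`; `mean_runtime` and `toy_workload` are hypothetical names.

```python
import time


def mean_runtime(fn, samples=3):
    """Average wall-clock seconds over `samples` calls of the zero-argument callable `fn`."""
    total = 0.0
    for _ in range(samples):
        begin = time.perf_counter()
        fn()
        total += time.perf_counter() - begin
    return total / samples


# hypothetical workload standing in for one pipeline evaluation
def toy_workload():
    return sum(i * i for i in range(10_000))


print(f"{mean_runtime(toy_workload, samples=5):.6f} s")
```

In the notebook, each loop could then pass its own evaluation as the callable, for example a `lambda` that wraps the `pt.pipelines.Experiment(...)` call for `BM25 % k >> trained_model`.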

In [None]:
# rather than timing the evaluation of all test topics, this measures the time taken to rank a single query
def run_experiment_execution_time_single_query(trained_model, samples=1, start=10, finish=600, incrementer=20, title="K Versus Time to Rank"):
    k_list = []
    moe_list_mb25 = []
    moe_list_pl2 = []
    moe_bm25_baseline = []
    moe_pl2_baseline = []
    for k in range(start, finish, incrementer):
        total = 0.0
        for i in range(0, samples):
            efficient_pipeline = BM25 % k >> trained_model
            start_time = datetime.now()
            results = (efficient_pipeline).transform("world")
            finish_time = datetime.now()
            elapse_time = (finish_time - start_time).total_seconds()
            total += elapse_time
        k_list.append(k)
        moe_list_mb25.append(total/samples)
        total = 0.0
        for i in range(0, samples):
            efficient_pipeline = PL2 % k >> trained_model
            start_time = datetime.now()
            results = (efficient_pipeline).transform("world")
            finish_time = datetime.now()
            elapse_time = (finish_time - start_time).total_seconds()
            total += elapse_time
        moe_list_pl2.append(total/samples)
        
        total = 0.0
        for i in range(0, samples):
            efficient_pipeline = BM25 % k
            start_time = datetime.now()
            results = (efficient_pipeline).transform("world")
            finish_time = datetime.now()
            elapse_time = (finish_time - start_time).total_seconds()
            total += elapse_time
        moe_bm25_baseline.append(total/samples)
        
        total = 0.0
        for i in range(0, samples):
            efficient_pipeline = PL2 % k
            start_time = datetime.now()
            results = (efficient_pipeline).transform("world")
            finish_time = datetime.now()
            elapse_time = (finish_time - start_time).total_seconds()
            total += elapse_time
        moe_pl2_baseline.append(total/samples)
        
        
        
    plt.plot(k_list, moe_list_mb25, label="LTR BM25 Pipeline")
    plt.plot(k_list, moe_list_pl2, label="LTR PL2 Pipeline")
    plt.plot(k_list, moe_bm25_baseline, label="BM25")
    plt.plot(k_list, moe_pl2_baseline, label="PL2")
    
    plt.xlabel("K")
    plt.ylabel("Time Seconds")
    file_name = title.replace(" ", "-")
    plt.title(title)
    plt.legend()
    plt.savefig(file_name)
run_experiment_execution_time_single_query(ltr_model, incrementer=20, samples=20, title="K Versus Time to Rank Trec Deep Learning Single Query Time")

In [None]:
# runs experiment showing performance of several models at the same time varying the values for k
# the measure of performance (MOP) is variable
def run_experiment_model(trained_model, mop, start= 10, finish = 600, incrementer = 25, title="sample title"):
    k_list = []
    bm25_list = []
    pl2_list = []
    
    bm25_base = []
    pl2_base = []
    for k in range(start, finish, incrementer):
        efficient_pipeline_bm25 = BM25 % k >> trained_model
        results = pt.pipelines.Experiment([efficient_pipeline_bm25], test_topics, qrels, [mop], names=["model"])
        k_list.append(k)
        bm25_list.append(results[mop].iloc[0])
        
        efficient_pipeline_pl2 = PL2 % k >> trained_model
        results = pt.pipelines.Experiment([efficient_pipeline_pl2], test_topics, qrels, [mop], names=["model"])
        pl2_list.append(results[mop].iloc[0])
        
        
        efficient_pipeline_bm25 = BM25 % k
        results = pt.pipelines.Experiment([efficient_pipeline_bm25], test_topics, qrels, [mop], names=["model"])
        bm25_base.append(results[mop].iloc[0])
        
        efficient_pipeline_pl2 = PL2 % k
        results = pt.pipelines.Experiment([efficient_pipeline_pl2], test_topics, qrels, [mop], names=["model"])
        pl2_base.append(results[mop].iloc[0])
        
    plt.plot(k_list, bm25_list, label="LTR BM25 Pipeline")
    plt.plot(k_list, pl2_list, label="LTR PL2 Pipeline")
    plt.plot(k_list, bm25_base, label="BM25")
    plt.plot(k_list, pl2_base, label="PL2")
    plt.xlabel("K")
    plt.ylabel(mop)
    file_name = title.replace(" ", "-")
    plt.title(title)
    plt.legend()
    plt.savefig(file_name)

In [None]:
run_experiment_model(ltr_model,"ndcg", title="NDCG vs K Trec Deep Learning")       

In [None]:
run_experiment_model(ltr_model,"map", title="MAP vs K Trec Deep Learning")    

In [None]:
# displays total time it takes for this notebook to run
end = time.time()
print(end - start)