# PyTerrier - Example Grid Search

# Preparation

In [1]:
#!pip install python-terrier
!pip install --upgrade git+https://github.com/terrier-org/pyterrier.git#egg=python-terrier

Collecting python-terrier
  Cloning https://github.com/terrier-org/pyterrier.git to /tmp/pip-install-ootrg_ul/python-terrier
  Running command git clone -q https://github.com/terrier-org/pyterrier.git /tmp/pip-install-ootrg_ul/python-terrier
Collecting pyjnius~=1.3.0
  Downloading https://files.pythonhosted.org/packages/d8/50/098cb5fb76fb7c7d99d403226a2a63dcbfb5c129b71b7d0f5200b05de1f0/pyjnius-1.3.0-cp36-cp36m-manylinux2010_x86_64.whl (1.1MB)
Collecting wget
  Downloading https://files.pythonhosted.org/packages/47/6a/62e288da7bcda82b935ff0c6cfe542970f04e29c756b0e147251b2fb251f/wget-3.2.zip
Collecting pytrec_eval
  Downloading https://files.pythonhosted.org/packages/36/0a/5809ba805e62c98f81e19d6007132712945c78e7612c11f61bac76a25ba3/pytrec_eval-0.4.tar.gz
Collecting matchpy
  Downloading https://files.pythonhosted.org/packages/47/95/d265b944ce391bb2fa9982d7506bbb197bb55c5088ea74448a5ffcaeefab/matchpy-0.5.1-py3-none-any

# Init 

You must run `pt.init()` before using any other PyTerrier functions or classes.

Arguments:
 - `version` - Terrier platform version e.g. `"5.2"`    
 - `mem` - megabytes allocated to Java e.g. `4096`      


In [1]:
import pyterrier as pt
if not pt.started():
  pt.init()

Again, we're using the Dataset interface to quickly access a test collection.

In [2]:
vaswani = pt.datasets.get_dataset("vaswani")

# GridSearch - Simple Retrieval Pipeline

The `GridSearch` function lets you tune one or more parameters empirically, choosing the values that maximise an evaluation metric. Say I want to tune BM25's "b" parameter.

First, create the BatchRetrieve object with the configuration you wish to use. Currently, BM25's "b" parameter is set through Terrier's "c" control.

In [3]:
BM25 = pt.BatchRetrieve(vaswani.get_index(), wmodel="BM25", controls={"c" : 0.75}, id="bm25")
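
To see why this parameter is worth tuning, here is a minimal sketch of the textbook BM25 term score in plain Python. It is not Terrier's implementation, which may differ in details such as the IDF form; the function name and the example statistics are made up for illustration. The point is only that "b" controls how strongly long documents are penalised.

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 score of one query term in one document (illustrative only)."""
    # A common smoothed IDF; the +1 inside the log keeps it positive.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # b interpolates between no length normalisation (b=0) and full normalisation (b=1).
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)

# A hypothetical document twice the average length: a larger b lowers its score.
for b in (0.0, 0.4, 0.75, 1.0):
    score = bm25_term_score(tf=3, doc_len=200, avg_doc_len=100,
                            n_docs=1000, doc_freq=50, b=b)
    print(f"b={b:.2f} score={score:.4f}")
```

For a document of exactly average length, the normalisation term is 1 whatever the value of b, so the parameter only matters for documents that are longer or shorter than average.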

Call `pt.pipelines.GridSearch` with the retrieval transformer, the topics, the qrels, and a map of the parameters to optimise, keyed by transformer id. No metric is passed here; the output below reports NDCG.



In [4]:
pt.pipelines.GridSearch(
    BM25,
    vaswani.get_topics().head(10),  # tune on the first 10 topics only
    vaswani.get_qrels(),
    {"bm25" : {"c" : [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]}})


GridSearch: 100%|██████████| 11/11 [00:03<00:00,  3.53it/s]
The best ndcg score is: 0.592658
The best parameters map is :
('bm25', 'c', 0.4)



(BR(/Users/craigm/.pyterrier/corpora/vaswani/index/data.properties,{'terrierql': 'on', 'parsecontrols': 'on', 'parseql': 'on', 'applypipeline': 'on', 'localmatching': 'on', 'filters': 'on', 'decorate': 'on', 'wmodel': 'BM25', 'c': 0.4},{'querying.processes': 'terrierql:TerrierQLParser,parsecontrols:TerrierQLToControls,parseql:TerrierQLToMatchingQueryTerms,matchopql:MatchingOpQLParser,applypipeline:ApplyTermPipeline,localmatching:LocalManager$ApplyLocalMatching,qe:QueryExpansion,labels:org.terrier.learning.LabelDecorator,filters:LocalManager$PostFilterProcess', 'querying.postfilters': 'decorate:SimpleDecorate,site:SiteFilter,scope:Scope', 'querying.default.controls': 'wmodel:DPH,parsecontrols:on,parseql:on,applypipeline:on,terrierql:on,localmatching:on,filters:on,decorate:on', 'querying.allowed.controls': 'scope,qe,qemodel,start,end,site,scope,applypipeline', 'termpipelines': 'Stopwords,PorterStemmer'}),
 {'bm25': {'c': 0.4}})
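
Conceptually, a grid search over a single parameter just evaluates every candidate value and keeps the best one. The sketch below shows that loop in plain Python. It is not PyTerrier's code: `grid_search_1d` and `toy_metric` are hypothetical names, and the toy metric is an invented curve standing in for "run the pipeline and measure NDCG".

```python
def grid_search_1d(candidates, evaluate):
    """Return (best_value, best_score) over the candidate values."""
    best_value, best_score = None, float("-inf")
    for value in candidates:
        score = evaluate(value)
        if score > best_score:  # strict comparison: ties keep the earliest candidate
            best_value, best_score = value, score
    return best_value, best_score

def toy_metric(c):
    """Invented stand-in for an effectiveness measure; not real retrieval results."""
    return 0.59 - (c - 0.4) ** 2

# Same 11 candidate values as the grid above: 0, 0.1, ..., 1.0
candidates = [round(0.1 * i, 1) for i in range(11)]
best_c, best_score = grid_search_1d(candidates, toy_metric)
print(best_c, round(best_score, 4))
```

The number of evaluations equals the number of candidates, which matches the 11 steps shown in the progress bar above.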

Now we can compare this tuned pipeline with an untuned one, using the Experiment function. We're comparing both NDCG and MAP metrics, and have asked Experiment to highlight the highest values for each metric.

In [5]:
BM25_untuned = pt.BatchRetrieve(vaswani.get_index(), wmodel="BM25", controls={"c" : 0.75})
pt.Experiment([BM25_untuned, BM25], 
    vaswani.get_topics(), vaswani.get_qrels(), 
    eval_metrics=["ndcg", "map"], 
    names=["BM25 untuned", "BM25 tuned"], highlight="bold")

| | name | ndcg | map |
|---|---|---|---|
| 0 | BM25 untuned | 0.621197 | 0.296517 |
| 1 | BM25 tuned | 0.623774 | 0.299185 |


Checking the output, it's clear that in this case tuning improved both NDCG and MAP, as expected.
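
To put the size of the gain in perspective, the relative improvement can be computed directly from the table values. This is a small arithmetic sketch; `relative_gain` is a helper name introduced here, not part of PyTerrier.

```python
def relative_gain(baseline, tuned):
    """Relative improvement of tuned over baseline, as a percentage."""
    return 100.0 * (tuned - baseline) / baseline

# (untuned, tuned) values copied from the Experiment output above
results = {
    "ndcg": (0.621197, 0.623774),
    "map": (0.296517, 0.299185),
}
for metric, (untuned, tuned) in results.items():
    print(f"{metric}: {relative_gain(untuned, tuned):+.2f}%")
```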

# GridSearch for complex ranking pipeline

Let's try to tune a more complex example, involving query expansion (QE). QE has a number of parameters, such as the number of feedback terms and the number of feedback documents.

As we have multiple components in our ranker, we need to use different ids for them.

Finally, note how we reuse the `bm25_for_qe` ranker for both first-pass and second-pass retrieval. In this case, setting BM25's b parameter affects both components.

In [6]:
bm25_for_qe = pt.BatchRetrieve(vaswani.get_index(), wmodel="BM25", controls={"c" : 0.75}, id="bm25_qe")

pipe_qe = bm25_for_qe >> pt.rewrite.Bo1QueryExpansion(vaswani.get_index(), fb_terms=10, fb_docs=3, id="bo1") >> bm25_for_qe
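
The reason one setting reaches both retrieval passes is ordinary Python object sharing: the same object sits at two positions in the pipeline. Here is a minimal sketch with a hypothetical `ToyStage` class standing in for a transformer; it illustrates the aliasing only and says nothing about how PyTerrier stores controls internally.

```python
class ToyStage:
    """Hypothetical stand-in for a transformer with an id and some controls."""
    def __init__(self, stage_id, **controls):
        self.id = stage_id
        self.controls = dict(controls)

bm25 = ToyStage("bm25_qe", c=0.75)
bo1 = ToyStage("bo1", fb_terms=10, fb_docs=3)

# The same object appears twice, mirroring bm25_for_qe >> Bo1 >> bm25_for_qe.
pipeline = [bm25, bo1, bm25]

# Changing the control once is visible at both positions.
bm25.controls["c"] = 0.3
print([stage.controls.get("c") for stage in pipeline])
```

If the two passes needed different values, they would have to be two separate objects with distinct ids.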



First, let's save the results from the untuned pipeline, so we can compare against them later.

In [7]:
default_res = pipe_qe.transform(vaswani.get_topics())

Now, let's configure our parameter map. We use `list(range(...))` as a shorthand for typing out many values. We print `param_map` to see all the possible values.


In [8]:
param_map = {
        "bm25_qe" : { "c" : [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1 ]},
        "bo1" : { 
            "fb_terms" : list(range(1, 12, 3)), 
            "fb_docs" : list(range(2, 30, 6))
        }
}
print(param_map)

{'bm25_qe': {'c': [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]}, 'bo1': {'fb_terms': [1, 4, 7, 10], 'fb_docs': [2, 8, 14, 20, 26]}}
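
The size of this grid can be checked by enumerating it. The sketch below expands the parameter map into every combination with `itertools.product`. The `expand_grid` helper is a name introduced here for illustration and is not how PyTerrier is necessarily implemented.

```python
from itertools import product

param_map = {
    "bm25_qe": {"c": [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]},
    "bo1": {
        "fb_terms": list(range(1, 12, 3)),
        "fb_docs": list(range(2, 30, 6)),
    },
}

def expand_grid(param_map):
    """Yield one {id: {param: value}} setting per point in the grid."""
    keys = [(tid, name) for tid, params in param_map.items() for name in params]
    value_lists = [param_map[tid][name] for tid, name in keys]
    for combo in product(*value_lists):
        setting = {}
        for (tid, name), value in zip(keys, combo):
            setting.setdefault(tid, {})[name] = value
        yield setting

settings = list(expand_grid(param_map))
print(len(settings))  # 11 * 4 * 5 = 220 combinations
print(settings[0])
```

Each extra parameter multiplies the number of pipeline runs, so coarse ranges like the ones above keep the search affordable.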


Now let's run the grid search, and evaluate the outcome. This tuning takes about five minutes on my machine.

In [9]:


_, best_param_map = pt.pipelines.GridSearch(pipe_qe, vaswani.get_topics().head(10), vaswani.get_qrels(), param_map)

pt.Experiment([pt.transformer.SourceTransformer(default_res), pipe_qe], 
    vaswani.get_topics(), vaswani.get_qrels(), 
    eval_metrics=["ndcg", "map"], 
    names=["BM25 + QE untuned", "BM25 + QE tuned (%s)" % best_param_map], highlight="bold")

GridSearch: 100%|██████████| 220/220 [02:05<00:00,  1.76it/s]
The best ndcg score is: 0.621431
The best parameters map is :
('bm25_qe', 'c', 0.3)
('bo1', 'fb_docs', 26)
('bo1', 'fb_terms', 10)


| | name | ndcg | map |
|---|---|---|---|
| 0 | BM25 + QE untuned | 0.624243 | 0.304647 |
| 1 | BM25 + QE tuned ({'bm25_qe': {'c': 0.3}, 'bo1': {'fb_docs': 26, 'fb_terms': 10}}) | 0.631028 | 0.307615 |


Again, as expected, we see that jointly tuning QE and BM25's b parameter increases effectiveness.