# PyTerrier ANCE Demo Notebook - Vaswani

This notebook demonstrates the use of the [PyTerrier plugin for ANCE](https://github.com/terrierteam/pyterrier_ance) for dense passage retrieval.

[ANCE](https://github.com/microsoft/ANCE) is a dense retrieval system that encodes each document and each query as a single vector. It does not need to be combined with sparse retrieval. During training, ANCE constructs negatives from an Approximate Nearest Neighbor (ANN) index of the corpus. The index is updated in parallel with the learning process, so the selected negatives are more realistic than those chosen by a sparse retrieval mechanism.
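
To make the idea of ANN-selected negatives more concrete, here is a minimal pure-Python sketch. It ranks a toy corpus by dot product under the "current" embeddings and keeps the top-ranked documents that are not labelled relevant. The vectors, the document ids and the `hard_negatives` helper are hypothetical and only for illustration; the real ANCE training code uses an ANN index over BERT-based embeddings and refreshes it as the model changes.

```python
# Toy sketch of picking "hard" negatives from the model's own ranking.
# This is NOT the ANCE training code; all names and vectors are made up.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def hard_negatives(query_vec, doc_vecs, positives, k=2):
    """Return the k highest-scoring documents that are not labelled relevant."""
    ranked = sorted(doc_vecs, key=lambda d: dot(query_vec, doc_vecs[d]), reverse=True)
    return [d for d in ranked if d not in positives][:k]

# Hypothetical embeddings produced by the current model checkpoint.
doc_vecs = {
    "d1": [0.9, 0.1],
    "d2": [0.8, 0.3],
    "d3": [0.1, 0.9],
    "d4": [0.7, 0.2],
}
query_vec = [1.0, 0.0]
positives = {"d1"}  # documents judged relevant for this query

negatives = hard_negatives(query_vec, doc_vecs, positives, k=2)
print(negatives)  # ['d2', 'd4']
```

The negatives are the documents the current model confuses most with the relevant one, which is what makes them more informative than randomly drawn negatives.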

ANCE is built on top of [BERT](https://arxiv.org/abs/1810.04805). Using only dot products in the ANCE-learned representation space, it nearly matches the accuracy of sparse retrieval followed by BERT reranking, while providing an almost 100x speed-up.

The corpus used in this demo is the [Vaswani NPL corpus](http://ir.dcs.gla.ac.uk/resources/test_collections/npl/), a corpus of 11,429 scientific abstracts, with corresponding queries and relevance assessments.

## Installation 

We need to install [PyTerrier](https://github.com/terrier-org/pyterrier).

In [1]:
!pip install -q python-terrier

[ANCE](https://github.com/microsoft/ANCE) requires [FAISS](https://github.com/facebookresearch/faiss), a library for efficient similarity search and clustering of dense vectors.

This is the setup for FAISS on Colab. YMMV outside of Colab.

In [2]:
!apt install libomp-dev
!pip install faiss

Reading package lists... Done
Building dependency tree       
Reading state information... Done
libomp-dev is already the newest version (5.0.1-1).
0 upgraded, 0 newly installed, 0 to remove and 39 not upgraded.


This installs the [PyTerrier plugin for ANCE](https://github.com/terrierteam/pyterrier_ance). It supplies an indexer and a retrieval transformer. This also installs [ANCE](https://github.com/microsoft/ANCE).

In [3]:
!pip install --upgrade git+https://github.com/terrierteam/pyterrier_ance.git

  Building wheel for pyterrier-ance (setup.py) ... done


## Setup

Let's get [PyTerrier](https://github.com/terrier-org/pyterrier) started. This will download the latest version of the [Terrier](http://terrier.org) IR platform.

In [4]:
import pyterrier as pt
pt.init(tqdm='notebook')

  from pandas import Panel


PyTerrier 0.6.0 has loaded Terrier 5.5 (built by craigmacdonald on 2021-05-20 13:12)


We are using the [Vaswani dataset](http://ir.dcs.gla.ac.uk/resources/test_collections/npl/). Let's load it; we will get the topics and qrels from it later.

In [5]:
dataset = pt.get_dataset("irds:vaswani")

This downloads the model checkpoint listed on the [ANCE GitHub repository](https://github.com/microsoft/ANCE/#results). Download time varies; on average it takes 11-12 minutes.

In [6]:
import os
if not os.path.exists("Passage_ANCE_FirstP_Checkpoint.zip"):
  !wget https://webdatamltrainingdiag842.blob.core.windows.net/semistructstore/OpenSource/Passage_ANCE_FirstP_Checkpoint.zip
  !unzip Passage_ANCE_FirstP_Checkpoint.zip

## Indexing

This indexes the [Vaswani dataset](http://ir.dcs.gla.ac.uk/resources/test_collections/npl/). Indexing takes about 3 minutes using a Colab GPU.

In [7]:
!rm -rf /content/anceindex

import pyterrier_ance
indexer = pyterrier_ance.ANCEIndexer(checkpoint_path="/content/Passage ANCE(FirstP) Checkpoint",
                                     index_path="/content/anceindex",
                                     num_docs=11429)
indexer.index(dataset.get_corpus_iter())


Using mean: False



Segment 0



Not running in distributed mode





'/content/anceindex'

We will not need the indexer anymore, so we free up some memory.

In [8]:
del indexer

The indexing procedure generates a number of [FAISS](https://github.com/facebookresearch/faiss) shards, together with some additional files.

In [9]:
!ls /content/anceindex

0.docids.pkl  0.faiss  shards.pkl
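
To give a feel for how an index split into shards can be searched, here is a minimal pure-Python sketch: each shard is scored separately, and the per-shard top-k lists are merged into a global top-k. The shard layout, the `search_shard` and `search_all` helpers and the toy vectors are our own assumptions for illustration; this does not reflect how FAISS or pyterrier_ance work internally.

```python
import heapq

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def search_shard(query_vec, shard, k):
    """Exhaustively score one shard; returns up to k (score, docid) pairs, best first."""
    scored = [(dot(query_vec, vec), docid) for docid, vec in shard]
    return heapq.nlargest(k, scored)

def search_all(query_vec, shards, k):
    """Collect the top k of every shard, then keep the overall top k."""
    candidates = [hit for shard in shards for hit in search_shard(query_vec, shard, k)]
    return heapq.nlargest(k, candidates)

# Two hypothetical shards, each a list of (docid, vector) pairs.
shards = [
    [("d1", [0.9, 0.1]), ("d2", [0.2, 0.8])],
    [("d3", [0.6, 0.6]), ("d4", [0.95, 0.0])],
]

top = search_all([1.0, 0.0], shards, k=2)
print([docid for _, docid in top])  # ['d4', 'd1']
```

Taking the top k from every shard before merging is enough to guarantee that the global top k is not missed, since no document outside a shard's top k can be in the overall top k.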


## Retrieval

Now that indexing has completed, we can load the index and the checkpoint model (which we will need for encoding queries). Index loading can take some time, as the [FAISS](https://github.com/facebookresearch/faiss) shards need to be loaded into main memory.

In [10]:
ance_retr = pyterrier_ance.ANCERetrieval(checkpoint_path="/content/Passage ANCE(FirstP) Checkpoint",
                                        index_path="/content/anceindex")

Loading model
Using mean: False
Loading shard metadata






Here we can ask [PyTerrier](https://github.com/terrier-org/pyterrier) to search the [ANCE](https://github.com/microsoft/ANCE) index for `'chemical reactions'`, returning the 10 highest-scoring documents.

In [11]:
(ance_retr % 10).search("chemical reactions")

***** inference of 1 queries *****



Not running in distributed mode

***** faiss search for 1 queries on 1 shards *****






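| | qid | docid | docno | score | rank |
|---|---|---|---|---|---|
| 0 | 1 | 7048 | 7049 | 709.171814 | 0 |
| 1 | 1 | 3451 | 3452 | 708.950439 | 1 |
| 2 | 1 | 1605 | 1606 | 708.893311 | 2 |
| 3 | 1 | 9373 | 9374 | 708.687378 | 3 |
| 4 | 1 | 5507 | 5508 | 708.424622 | 4 |
| 5 | 1 | 10059 | 10060 | 708.145691 | 5 |
| 6 | 1 | 7921 | 7922 | 708.093506 | 6 |
| 7 | 1 | 10540 | 10541 | 708.003906 | 7 |
| 8 | 1 | 8157 | 8158 | 707.991089 | 8 |
| 9 | 1 | 6285 | 6286 | 707.990051 | 9 |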
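
The `% 10` in the cell above is PyTerrier's rank cutoff operator, which keeps only the top-ranked results for each query. As a rough illustration of what a rank cutoff does, here is a pure-Python sketch over a list of result rows. The `rank_cutoff` helper is hypothetical (it is not PyTerrier's implementation), the scores are rounded from the output above, and the row for qid `2` is invented for the example.

```python
from collections import defaultdict

def rank_cutoff(results, k):
    """Keep only the k highest-scoring rows for each qid, with ranks starting at 0."""
    by_qid = defaultdict(list)
    for row in results:
        by_qid[row["qid"]].append(row)
    kept = []
    for rows in by_qid.values():
        rows.sort(key=lambda r: r["score"], reverse=True)
        for rank, row in enumerate(rows[:k]):
            kept.append({**row, "rank": rank})
    return kept

results = [
    {"qid": "1", "docno": "7049", "score": 709.17},
    {"qid": "1", "docno": "3452", "score": 708.95},
    {"qid": "1", "docno": "1606", "score": 708.89},
    {"qid": "2", "docno": "42", "score": 1.5},  # invented row for a second query
]

top2 = rank_cutoff(results, k=2)
print([(r["qid"], r["docno"], r["rank"]) for r in top2])
# [('1', '7049', 0), ('1', '3452', 1), ('2', '42', 0)]
```

Note that the cutoff is applied per query, so a query with fewer than k results simply keeps all of them.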


## Running an Experiment

Let's prepare an experiment. First, let's create a BM25 baseline transformer.

In [12]:
bm25 = pt.BatchRetrieve(pt.get_dataset("vaswani").get_index(), wmodel="BM25")

You can also use ANCE as a re-ranker. Here it re-scores the top 100 BM25 results, and we include this pipeline in the comparison as well.

In [13]:
ance_rerank = (bm25 % 100) >> pt.text.get_text(dataset, 'text') >> pyterrier_ance.ANCETextScorer(checkpoint_path="/content/Passage ANCE(FirstP) Checkpoint")

Using mean: False
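
The pipeline above chains stages with `>>`: retrieve with BM25, cut off the ranking, attach the document text, then re-score. The following pure-Python sketch mimics that composition pattern on toy data. The `Stage` class, the toy corpus and the word-overlap scorer are all hypothetical stand-ins (a neural scorer such as ANCE would take the scorer's place); this is not PyTerrier's implementation.

```python
class Stage:
    """Hypothetical stand-in for a transformer: wraps a function and supports >>."""
    def __init__(self, fn):
        self.fn = fn

    def __call__(self, data):
        return self.fn(data)

    def __rshift__(self, other):
        # a >> b : run a, then feed its output to b
        return Stage(lambda data: other(self(data)))

# Toy corpus and a pretend first-stage ranking (scores are made up).
TEXTS = {"a": "heat transfer", "b": "chemical reactions in gases", "c": "chemical kinetics"}
FIRST_STAGE = [("a", 3.0), ("b", 2.0), ("c", 1.0)]

retrieve = Stage(lambda q: [{"query": q, "docno": d, "score": s} for d, s in FIRST_STAGE])
cutoff2 = Stage(lambda rows: rows[:2])
get_text = Stage(lambda rows: [{**r, "text": TEXTS[r["docno"]]} for r in rows])

def overlap_score(rows):
    """Toy re-scorer: count the query words that appear in the document text."""
    rescored = [
        {**r, "score": float(len(set(r["query"].split()) & set(r["text"].split())))}
        for r in rows
    ]
    return sorted(rescored, key=lambda r: r["score"], reverse=True)

pipeline = retrieve >> cutoff2 >> get_text >> Stage(overlap_score)
reranked = pipeline("chemical reactions")
print([(r["docno"], r["score"]) for r in reranked])  # [('b', 2.0), ('a', 0.0)]
```

Document `c` never reaches the re-scorer because the cutoff removed it, which illustrates that a re-ranker can only reorder what the first stage retrieved.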


Finally, let's evaluate all three systems. The BM25 transformer created above, which uses a Terrier index of the same corpus, serves as the point of comparison.

In [15]:
pt.Experiment(
    [bm25, ance_rerank, ance_retr], 
    dataset.get_topics(), 
    dataset.get_qrels(), 
    eval_metrics=["map", "recip_rank", "mrt"],
    names=['BM25', 'BM25 >> ANCE Re-Rank', 'ANCE']
    )


Not running in distributed mode




Not running in distributed mode

***** inference of 93 queries *****



Not running in distributed mode

***** faiss search for 93 queries on 1 shards *****






| | name | map | recip_rank | mrt |
|---|---|---|---|---|
| 0 | BM25 | 0.296517 | 0.725665 | 23.543057 |
| 1 | BM25 >> ANCE Re-Rank | 0.228006 | 0.6965 | 724.781738 |
| 2 | ANCE | 0.1514 | 0.668049 | 9.951693 |


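So on this collection, ANCE isn't as effective as BM25 under MAP or MRR, either as a standalone retriever (MAP 0.151) or as a BM25 re-ranker (MAP 0.228, against 0.297 for BM25). As a standalone retriever, however, it has the lowest mean response time (9.95 against 23.54 for BM25), while the re-ranking pipeline is by far the slowest (724.78).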
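
For readers less familiar with the two effectiveness measures, here is a small worked example in plain Python. MAP is the mean over queries of average precision, and MRR (`recip_rank`) is the mean over queries of the reciprocal rank of the first relevant document. The ranking and the relevance judgments below are invented for illustration, and the helper functions are our own sketch rather than the evaluation code PyTerrier uses.

```python
def average_precision(ranking, relevant):
    """Mean of precision@i over the ranks i at which a relevant document appears,
    divided by the total number of relevant documents."""
    hits, total = 0, 0.0
    for i, docno in enumerate(ranking, start=1):
        if docno in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0

def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for i, docno in enumerate(ranking, start=1):
        if docno in relevant:
            return 1.0 / i
    return 0.0

# One invented query: relevant documents appear at ranks 2 and 4; d9 is never retrieved.
ranking = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d9"}

ap = average_precision(ranking, relevant)  # (1/2 + 2/4) / 3 = 1/3
rr = reciprocal_rank(ranking, relevant)    # first relevant document at rank 2 -> 0.5
print(round(ap, 4), rr)  # 0.3333 0.5
```

Averaging `ap` and `rr` over all queries gives the `map` and `recip_rank` columns of a results table like the one above.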