<a href="https://colab.research.google.com/github/beir-nlp/beir/blob/main/Retrieval_Example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **BEIR Retrieval Example** 

This notebook shows how to use BEIR for passage retrieval.

It covers retrieval with several dense retriever models: Sentence-Transformers, USE-QA and DPR.

In [18]:
!nvidia-smi

Wed Jan 27 14:17:10 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   42C    P0    64W / 149W |   8538MiB / 11441MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [2]:
!pip install beir

Collecting beir
  Downloading https://files.pythonhosted.org/packages/82/b4/5bbfec1357e7898e6f4d19aca3b4664efbe9e2e4989744d6df84f72ff8b6/beir-0.0.5-py3-none-any.whl
Collecting tensorflow-text
  Downloading https://files.pythonhosted.org/packages/a0/86/22ad798f94d564c3e423758b60ddd3689e83ad629b3f31ff2ae45a6e3eed/tensorflow_text-2.4.3-cp36-cp36m-manylinux1_x86_64.whl (3.4MB)
     |████████████████████████████████| 3.4MB 6.3MB/s 
Collecting pytrec-eval
  Downloading https://files.pythonhosted.org/packages/2e/03/e6e84df6a7c1265579ab26bbe30ff7f8c22745aa77e0799bba471c0a3a19/pytrec_eval-0.5.tar.gz
Collecting sentence-transformers
  Downloading https://files.pythonhosted.org/packages/6a/e2/84d6acfcee2d83164149778a33b6bdd1a74e1bcb59b2b2cd1b861359b339/sentence-transformers-0.4.1.2.tar.gz (64kB)
     |████████████████████████████████| 71kB 7.5MB/s 
Collecting transformers<5.0.0,>=3.1.0
  Downloading https://files.pythonhosted.org/packages/88/b1/41130a228dd656a1a31ba2

# **Download Dataset**

You can download any of the datasets listed below; we use NFCorpus for this example.

Warning: exact search on large datasets can take a long time (a few hours)!

We provide IR evaluation for the following datasets:

*   TREC-COVID
*   NFCorpus
*   NQ
*   HotpotQA
*   NewsQA
*   FiQA
*   ArguAna
*   Touche-2020
*   CQADupStack
*   Quora
*   DBPedia-v2
*   SCIDOCS
*   FEVER
*   Climate-FEVER



In [5]:
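import os

from beir import util

# Name of the zipped dataset to fetch
dataset = "nfcorpus.zip"
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}".format(dataset)
# Store datasets in a "datasets" folder under the current working directory
out_dir = os.path.join(os.getcwd(), "datasets")

# Download the archive and unzip it into out_dir
data_path = util.download_and_unzip(url, out_dir)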

Downloading nfcorpus.zip ...
Unzipping nfcorpus.zip ...


# **Folder Structure**

* nfcorpus/
    * corpus.jsonl
    * queries.jsonl
    * qrels/

        * train.tsv
        * dev.tsv
        * test.tsv

In [19]:
!ls datasets/nfcorpus/

corpus.jsonl  qrels  queries.jsonl


# **Data Loading**

In [8]:
from beir.datasets.data_loader import GenericDataLoader

data_path = "datasets/nfcorpus"
corpus, queries, qrels = GenericDataLoader(data_path).load(split="test")
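To make the loading step less opaque, here is a minimal sketch of what parsing such a folder could look like in plain Python. It is not BEIR's `GenericDataLoader`; the function `load_beir_style`, the toy files, and the assumed field names (`_id`, `title`, `text`) and qrels header (`query-id`, `corpus-id`, `score`) are illustrative assumptions, as is the choice to keep only queries that have judgements in the split.

```python
import csv
import json
import os
import tempfile


def load_beir_style(data_dir, split="test"):
    """Hypothetical loader: parse corpus.jsonl, queries.jsonl and qrels/<split>.tsv."""
    corpus, queries, qrels = {}, {}, {}

    # One JSON object per line; field names are assumed, not taken from the notebook.
    with open(os.path.join(data_dir, "corpus.jsonl"), encoding="utf-8") as f:
        for line in f:
            doc = json.loads(line)
            corpus[doc["_id"]] = {"title": doc.get("title", ""), "text": doc["text"]}

    with open(os.path.join(data_dir, "queries.jsonl"), encoding="utf-8") as f:
        for line in f:
            query = json.loads(line)
            queries[query["_id"]] = query["text"]

    # Tab-separated relevance judgements with a header row (assumed layout).
    qrels_path = os.path.join(data_dir, "qrels", split + ".tsv")
    with open(qrels_path, encoding="utf-8", newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        next(reader)  # skip the header row
        for query_id, doc_id, score in reader:
            qrels.setdefault(query_id, {})[doc_id] = int(score)

    # Keep only queries judged in this split (an assumption of this sketch).
    queries = {qid: queries[qid] for qid in qrels if qid in queries}
    return corpus, queries, qrels


def write_toy_dataset(root):
    """Write a tiny made-up dataset so the sketch can run without downloads."""
    os.makedirs(os.path.join(root, "qrels"))
    with open(os.path.join(root, "corpus.jsonl"), "w", encoding="utf-8") as f:
        f.write(json.dumps({"_id": "d1", "title": "Toy title", "text": "Toy passage."}) + "\n")
        f.write(json.dumps({"_id": "d2", "title": "", "text": "Another passage."}) + "\n")
    with open(os.path.join(root, "queries.jsonl"), "w", encoding="utf-8") as f:
        f.write(json.dumps({"_id": "q1", "text": "toy question?"}) + "\n")
        f.write(json.dumps({"_id": "q2", "text": "unjudged question?"}) + "\n")
    with open(os.path.join(root, "qrels", "test.tsv"), "w", encoding="utf-8") as f:
        f.write("query-id\tcorpus-id\tscore\n")
        f.write("q1\td1\t2\n")


with tempfile.TemporaryDirectory() as tmp:
    write_toy_dataset(tmp)
    toy_corpus, toy_queries, toy_qrels = load_beir_style(tmp)

print(toy_qrels)
```

The `toy_` prefix keeps these variables from overwriting the real `corpus`, `queries` and `qrels` loaded in the cell above.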

# **Data Format**
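
As a rough guide, the three loaded objects can be thought of as nested dictionaries: `corpus` maps a document ID to its title and text, `queries` maps a query ID to the query text, and `qrels` maps a query ID to the documents judged for it with their relevance scores. The sketch below uses made-up IDs and texts, and the helper `relevant_passages` is hypothetical; it only illustrates how the three structures fit together under that assumption.

```python
# Toy stand-ins for the three loaded objects; all IDs and texts are made up.
toy_corpus = {
    "doc1": {"title": "First title", "text": "First toy passage."},
    "doc2": {"title": "", "text": "A passage without a title."},
}
toy_queries = {"q1": "Which passage is first?"}
toy_qrels = {"q1": {"doc1": 2}}  # query ID -> {document ID: relevance score}


def relevant_passages(query_id, queries, qrels, corpus):
    """Return the query text and its judged passages as (doc_id, relevance, text)."""
    judged = qrels.get(query_id, {})
    rows = [(doc_id, rel, corpus[doc_id]["text"]) for doc_id, rel in judged.items()]
    return queries[query_id], rows


question, rows = relevant_passages("q1", toy_queries, toy_qrels, toy_corpus)
print(question)
for doc_id, rel, text in rows:
    print(doc_id, rel, text)
```

Running the same helper on the real `corpus`, `queries` and `qrels` is a quick way to inspect a few judged passages before starting retrieval.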


# **Exact Search using Sentence-Transformers**

We use the [``distilroberta-base-msmarco-v2``](https://www.sbert.net/docs/pretrained-models/msmarco-v2.html) model in this example.

In [10]:
from beir.retrieval.evaluation import EvaluateRetrieval

retriever = EvaluateRetrieval(model="sbert", model_name="distilroberta-base-msmarco-v2")
results = retriever.retrieve(corpus, queries, qrels)

100%|██████████| 305M/305M [00:35<00:00, 8.66MB/s]


Encoding Queries...



Encoding Corpus...
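
Exact search, as used in this section, means comparing every query embedding against every passage embedding and keeping the best-scoring passages. The sketch below illustrates the idea in plain Python with hand-written 2-d vectors; it is not BEIR's implementation. The function names, the use of cosine similarity, and the `{query_id: {doc_id: score}}` result shape are assumptions made for illustration.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def exact_search(query_vecs, doc_vecs, top_k=2):
    """Score every query against every document and keep the top_k per query."""
    results = {}
    for query_id, q in query_vecs.items():
        # Brute force: one similarity computation per (query, document) pair.
        scores = {doc_id: cosine(q, d) for doc_id, d in doc_vecs.items()}
        best = heapq.nlargest(top_k, scores.items(), key=lambda item: item[1])
        results[query_id] = dict(best)
    return results


# Made-up embeddings standing in for encoder outputs.
toy_doc_vecs = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 1.0]}
toy_query_vecs = {"q1": [1.0, 0.1]}

toy_results = exact_search(toy_query_vecs, toy_doc_vecs, top_k=2)
print(toy_results)
```

The work grows with the number of queries times the number of passages, which is why exact search on large corpora can take hours.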




# **Exact Search using DPR**

We use the [``facebook/dpr-question_encoder-single-nq-base``](https://huggingface.co/transformers/model_doc/dpr.html#dprquestionencoder) and the [``facebook/dpr-ctx_encoder-single-nq-base``](https://huggingface.co/transformers/model_doc/dpr.html#dprcontextencoder) models in this example.

In [21]:
import tqdm
from beir.retrieval.evaluation import EvaluateRetrieval

retriever = EvaluateRetrieval(model="dpr")
results = retriever.retrieve(corpus, queries, qrels)


Encoding Queries...
que: 100%|██████████| 3/3 [00:00<00:00,  4.10it/s]

Encoding Corpus...
pas:  62%|██████▏   | 18/29 [02:21<01:31,  8.32s/it]


RuntimeError: ignored

# **Exact Search using USE-QA**

We use the [``universal-sentence-encoder-qa/3``](https://tfhub.dev/google/universal-sentence-encoder-qa/3) model in our example.

In [22]:
import tqdm
from beir.retrieval.evaluation import EvaluateRetrieval

retriever = EvaluateRetrieval(model="use-qa")
results = retriever.retrieve(corpus, queries, qrels)

INFO:absl:Using /tmp/tfhub_modules to cache modules.
INFO:absl:Downloading TF-Hub Module 'https://tfhub.dev/google/universal-sentence-encoder-qa/3'.
INFO:absl:Downloading https://tfhub.dev/google/universal-sentence-encoder-qa/3: 550.04MB
INFO:absl:Downloaded https://tfhub.dev/google/universal-sentence-encoder-qa/3, Total size: 588.94MB
INFO:absl:Downloaded TF-Hub Module 'https://tfhub.dev/google/universal-sentence-encoder-qa/3'.


InternalError: ignored

# **Evaluation**

We evaluate IR models using ``NDCG@k``, ``MAP@k`` and ``Recall@k``.

For this example, we evaluate at k = ``[1,3,5,10,100,1000]``.

In [23]:
ndcg, _map, recall = retriever.evaluate(qrels, results, retriever.k_values)

In [24]:
print(ndcg)

{'ndcg@1': 0.33282, 'ndcg@3': 0.27662, 'ndcg@5': 0.25527, 'ndcg@10': 0.23186, 'ndcg@100': 0.21083, 'ndcg@1000': 0.30276}


In [25]:
print(_map)

{'map@1': 0.04549, 'map@3': 0.06389, 'map@5': 0.07227, 'map@10': 0.08216, 'map@100': 0.09821, 'map@1000': 0.1084}


In [26]:
print(recall)

{'recall@1': 0.04549, 'recall@3': 0.06965, 'recall@5': 0.08473, 'recall@10': 0.10574, 'recall@100': 0.21654, 'recall@1000': 0.53311}
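
To give a feel for what these numbers measure, here is a minimal sketch of Recall@k and NDCG@k for a single query in plain Python. It is not the code behind `retriever.evaluate`; the function names and toy data are hypothetical, and the NDCG variant shown (linear gain, `log2(rank + 1)` discount) is one common formulation assumed here for illustration. Per-query values like these are then typically averaged over all queries.

```python
import math


def top_k_docs(scores, k):
    """Document IDs of the k highest-scoring documents, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def recall_at_k(judged, scores, k):
    """Fraction of the relevant documents that appear in the top k."""
    relevant = {doc_id for doc_id, rel in judged.items() if rel > 0}
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in top_k_docs(scores, k) if doc_id in relevant)
    return hits / len(relevant)


def ndcg_at_k(judged, scores, k):
    """DCG of the top k divided by the DCG of an ideal ranking."""
    # Linear gain: relevance / log2(rank + 1), with ranks starting at 1.
    dcg = sum(
        judged.get(doc_id, 0) / math.log2(rank + 1)
        for rank, doc_id in enumerate(top_k_docs(scores, k), start=1)
    )
    ideal = sorted(judged.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 1) for rank, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0


# Made-up judgements and retrieval scores for one query.
toy_qrels = {"q1": {"d1": 1, "d2": 1}}
toy_results = {"q1": {"d1": 0.9, "d3": 0.8, "d2": 0.1}}

toy_metrics = {
    k: (
        recall_at_k(toy_qrels["q1"], toy_results["q1"], k),
        ndcg_at_k(toy_qrels["q1"], toy_results["q1"], k),
    )
    for k in (1, 3)
}
print(toy_metrics)
```

In this toy case the top-ranked document is relevant, so NDCG@1 is perfect while Recall@1 is only 0.5, because the second relevant document is ranked third.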
