
v2.0.0: We are back with bugfixes and improvements to BEIR after a long break


After a long, stale year with no changes, I've merged many pull requests and made improvements to the BEIR code. You can find the latest changes below:

1. Heap Queue for keeping track of top-k documents when evaluating with dense retrieval.

Thanks to @kwang2049, starting from v2.0.0 we use a heap queue to keep track of the top-k documents in the DenseRetrievalExactSearch class. This considerably reduces RAM consumption, especially when evaluating large corpora such as MS MARCO or BioASQ.

The logic for keeping track of elements while chunking through the collection stays the same:

  • If the heap holds fewer than k items, push the item, i.e. the document, onto the heap.
  • If the heap already holds k items and the new item is larger than the smallest item on the heap, push the new item and then pop the smallest element. A minimal sketch in plain Python follows this list.
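
Here is a minimal sketch of that heap logic using Python's standard heapq module; the helper name update_top_k is hypothetical and not part of the BEIR API:

import heapq

def update_top_k(heap, k, score, doc_id):
    # Hypothetical helper mirroring the top-k logic described above.
    if len(heap) < k:
        # Fewer than k items tracked so far: push unconditionally.
        heapq.heappush(heap, (score, doc_id))
    elif score > heap[0][0]:
        # Heap is full and the new score beats the current minimum:
        # push the new item and pop the smallest, in a single call.
        heapq.heappushpop(heap, (score, doc_id))

# Keep the 3 best-scoring documents out of a stream of (doc_id, score) pairs.
top_k = []
for doc_id, score in [("d1", 0.2), ("d2", 0.9), ("d3", 0.5), ("d4", 0.7)]:
    update_top_k(top_k, 3, score, doc_id)
print(sorted(top_k, reverse=True))  # [(0.9, 'd2'), (0.7, 'd4'), (0.5, 'd3')]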

2. Removed all major typing errors from the BEIR code.

We removed all major typing errors from the BEIR code by implementing an abstract base class for search. The search function takes in the corpus, the queries, and a top-k value, and returns the results as a nested dictionary mapping each query_id to its retrieved doc_id and score pairs.

from abc import ABC, abstractmethod
from typing import Dict

class BaseSearch(ABC):

    @abstractmethod
    def search(self,
               corpus: Dict[str, Dict[str, str]],
               queries: Dict[str, str],
               top_k: int,
               **kwargs) -> Dict[str, Dict[str, float]]:
        # Returns {query_id: {doc_id: score}} for the top-k documents per query.
        pass

Example: evaluate_sbert_multi_gpu.py
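
For illustration, here is a toy retriever built on the new interface. The class name RandomScoreSearch and its scoring are made up, and the import path assumes BaseSearch is exposed from beir.retrieval.search:

import random
from typing import Dict

from beir.retrieval.search import BaseSearch  # assumed import path

class RandomScoreSearch(BaseSearch):
    # Toy retriever that assigns random scores to the first top_k documents.
    # It only illustrates the interface; a real model scores for relevance.
    def search(self,
               corpus: Dict[str, Dict[str, str]],
               queries: Dict[str, str],
               top_k: int,
               **kwargs) -> Dict[str, Dict[str, float]]:
        doc_ids = list(corpus)
        return {
            qid: {doc_id: random.random() for doc_id in doc_ids[:top_k]}
            for qid in queries
        }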

3. Updated Faiss Code to include GPU options.

I added a GPU option to the FaissSearch base class. Using the GPU can reduce latency immensely, although transferring the faiss index from CPU to GPU can itself take time. Pass the use_gpu=True parameter to the DenseRetrievalFaissSearch class to use the GPU for faiss inference with PQ, PCA, or FlatIP search.
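
A minimal usage sketch, assuming corpus and queries are already loaded (e.g. with BEIR's GenericDataLoader) and using an illustrative SentenceBERT model name:

from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import FlatIPFaissSearch

# Model name is illustrative; any dense encoder supported by BEIR works.
model = models.SentenceBERT("msmarco-distilbert-base-tas-b")

# use_gpu=True transfers the faiss index to the GPU for inference.
faiss_search = FlatIPFaissSearch(model, batch_size=128, use_gpu=True)

retriever = EvaluateRetrieval(faiss_search, score_function="dot")
results = retriever.retrieve(corpus, queries)  # corpus/queries loaded beforehand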

4. New publication -- Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard.

We have a new publication in which we describe our official leaderboard hosted on eval.ai and provide reproducible reference models on BEIR using the Pyserini repository (https://github.com/castorini/pyserini).

Link to the arXiv version: https://arxiv.org/abs/2306.07471

If you use numbers from our leaderboard, please cite the following paper:

@misc{kamalloo2023resources,
      title={Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard}, 
      author={Ehsan Kamalloo and Nandan Thakur and Carlos Lassance and Xueguang Ma and Jheng-Hong Yang and Jimmy Lin},
      year={2023},
      eprint={2306.07471},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
}