http://big-ann-benchmarks.com/
The only prerequisites are Python (tested with 3.6) and Docker. Newer versions of Python work as well, but probably require an updated requirements.txt on the host. (Suggestion: copy requirements.txt to requirements${PYTHON_VERSION}.txt and remove all fixed versions. requirements.txt has to be kept for the Docker containers.)
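The suggestion above (copy requirements.txt and remove the fixed versions) could be scripted roughly as follows. This is a minimal sketch, not part of the framework: the helper names unpin and write_unpinned are hypothetical, and the code assumes the file contains only simple entries such as name==1.2.3. Environment markers, extras with version ranges, and URL requirements are not handled.

```python
import re

# Version operators that may follow a package name in a requirements file.
_VERSION_OPERATOR = re.compile(r"\s*(?:===|==|>=|<=|~=|!=|<|>)")


def unpin(requirement_lines):
    """Return the given requirement lines with version specifiers removed."""
    unpinned = []
    for line in requirement_lines:
        stripped = line.strip()
        # Leave blank lines, comments, and pip options (e.g. "-r other.txt") alone.
        if not stripped or stripped.startswith(("#", "-")):
            unpinned.append(stripped)
            continue
        # Keep only the text before the first version operator.
        unpinned.append(_VERSION_OPERATOR.split(stripped, maxsplit=1)[0])
    return unpinned


def write_unpinned(src_path, dst_path):
    """Copy src_path to dst_path with all version pins removed."""
    with open(src_path) as src:
        lines = unpin(src.read().splitlines())
    with open(dst_path, "w") as dst:
        dst.write("\n".join(lines) + "\n")
```

For example, write_unpinned("requirements.txt", "requirements3.9.txt") would leave the original file untouched for the Docker containers and produce an unpinned copy for the host; the target file name here is only an illustration.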
- Clone the repo.
- Run pip install -r requirements.txt (use requirements_py38.txt if you have Python 3.8).
- Install Docker by following its installation instructions. You might also want to follow the post-install steps for running Docker in non-root user mode.
- Run python install.py to build all the libraries inside Docker containers.
The framework assumes that all data is stored in data/.
Please use a symlink if your datasets and indices are supposed to be stored somewhere else.
The location of the linked folder matters a great deal for SSD-based search performance in T2.
A local SSD such as the one found on Azure Ls-series VMs is better than remote disks, even premium ones.
See T1/T2 for more details.
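If the datasets live on a separate disk, the data/ symlink can be created from Python. The sketch below is one possible way to do it, under stated assumptions: link_data_dir is a hypothetical helper, not part of the framework, and it deliberately refuses to replace an existing data/ entry so that already-downloaded files are never clobbered.

```python
import os


def link_data_dir(storage_dir, link_path="data"):
    """Make link_path a symlink to storage_dir and return the resolved target."""
    storage_dir = os.path.abspath(storage_dir)
    if not os.path.isdir(storage_dir):
        raise FileNotFoundError(f"storage directory does not exist: {storage_dir}")
    # lexists is also true for dangling symlinks, which exists() would miss.
    if os.path.lexists(link_path):
        raise FileExistsError(f"{link_path} already exists; move or remove it first")
    os.symlink(storage_dir, link_path, target_is_directory=True)
    return os.path.realpath(link_path)
```

Run from the repository root, a call such as link_data_dir("/mnt/local_ssd/big-ann") would make data/ point at the local SSD; the mount point is only an example path.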
See http://big-ann-benchmarks.com/ for details on the different datasets.
Before running experiments, datasets have to be downloaded. All preparation can be carried out by calling
python create_dataset.py --dataset [bigann-1B | deep-1B | text2image-1B | ssnpp-1B | msturing-1B | msspacev-1B]
Note that downloading the datasets can potentially take many hours.
For local testing, there exist smaller random datasets random-xs and random-range-xs.
Furthermore, most datasets have 1M, 10M and 100M versions; run python create_dataset.py -h to get an overview.
Run python run.py --dataset $DS --algorithm $ALGO, where DS is the dataset you are running on and ALGO is the name of the algorithm. (Use python run.py --list-algorithms to get an overview.)
python run.py -h provides you with further options.
The parameters used by the implementation to build and query the index can be found in algos.yaml.
After running the installation, we can evaluate the baseline as follows.
for DS in bigann-1B deep-1B text2image-1B ssnpp-1B msturing-1B msspacev-1B;
do
python run.py --dataset $DS --algorithm faiss-t1;
done
On a 28-core Xeon E5-2690 v4 that provided 100MB/s downloads, carrying out the baseline experiments took roughly 7 days.
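Because the full baseline takes days, it can help to drive the same loop from Python and inspect the commands before launching them. The following is a rough sketch under stated assumptions: baseline_commands and run_all are hypothetical helpers, only the --dataset and --algorithm flags shown above are used, and sys.executable stands in for the python on the PATH.

```python
import subprocess
import sys

# The six billion-scale datasets from the shell loop above.
DATASETS = [
    "bigann-1B", "deep-1B", "text2image-1B",
    "ssnpp-1B", "msturing-1B", "msspacev-1B",
]


def baseline_commands(datasets=DATASETS, algorithm="faiss-t1"):
    """Build one run.py invocation per dataset as an argument list."""
    return [
        [sys.executable, "run.py", "--dataset", ds, "--algorithm", algorithm]
        for ds in datasets
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing dataset.
            subprocess.run(cmd, check=True)
```

Calling run_all(baseline_commands()) only prints the six commands; passing dry_run=False from the repository root would actually start the experiments.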
To evaluate the results, run
sudo chmod -R 777 results/
python data_export.py --output res.csv
python3.8 eval/show_operating_points.py --algorithm faiss-t1 --threshold 10000
See Track T1/T2 for more details on evaluation for Tracks T1 and T2.
See Track T3 for more details on evaluation for Track T3.
This project is a version of ann-benchmarks by Erik Bernhardsson and contributors targeting billion-scale datasets.