codelibs/search-ann-benchmark

Search ANN Benchmark

Benchmark the search performance of Approximate Nearest Neighbor (ANN) algorithms implemented in various systems. This repository contains notebooks and scripts to evaluate and compare the efficiency and accuracy of ANN searches across different platforms.

Introduction

Approximate Nearest Neighbor (ANN) search algorithms are essential for handling high-dimensional data spaces, enabling fast and resource-efficient retrieval of similar items from large datasets. This benchmarking suite aims to provide an empirical basis for comparing the performance of several popular ANN-enabled search systems.
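For context, the exact search that ANN algorithms approximate can be sketched as a brute-force scan over every vector. The snippet below is an illustration only and is not taken from this repository; the function name `exact_knn`, the choice of Euclidean distance, and the sample vectors are assumptions made for the example.

```python
import heapq
import math


def exact_knn(query, vectors, k):
    """Return indices of the k vectors closest to `query`, nearest first.

    Hypothetical helper: scans every vector, so the cost grows linearly
    with the dataset size. ANN indexes trade some accuracy to avoid this.
    """
    def distance(index):
        # Euclidean distance is an assumption; real systems may use
        # cosine similarity or inner product instead.
        return math.dist(query, vectors[index])

    return heapq.nsmallest(k, range(len(vectors)), key=distance)


# Made-up 2-D vectors for illustration only.
vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
print(exact_knn((0.9, 0.1), vectors, k=2))
```

A result list produced this way can serve as ground truth when judging how close an approximate search gets to the true neighbors.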

Prerequisites

Before running the benchmarks, ensure you have the following installed:

  • Docker
  • Python 3.10 or higher

Setup Instructions

  1. Prepare the Environment:

    Create directories for datasets and output files, then download the necessary datasets using the provided script.

    /bin/bash ./scripts/setup.sh

  2. Install Dependencies:

    Install all required Python libraries.

    pip install -r requirements.txt

Benchmark Notebooks

The repository includes the following Jupyter notebooks for conducting benchmarks:

Notebook       GitHub Actions
Elasticsearch  Run Elasticsearch on Linux
Milvus         Run Milvus on Linux
OpenSearch     Run OpenSearch on Linux
pgvector       Run PGVector on Linux
Qdrant         Run Qdrant on Linux
Vespa          Run Vespa on Linux
Weaviate       Run Weaviate on Linux

Each notebook guides you through the process of setting up the test environment, loading the dataset, executing the search queries, and analyzing the results.
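The query-execution step can be pictured as a small timing loop. The following is a minimal sketch rather than the notebooks' actual code; `time_queries`, `summarize`, and the stand-in search function are hypothetical names, and a real run would call the client library of the system under test.

```python
import statistics
import time


def time_queries(search_fn, queries):
    """Run `search_fn` once per query; return (results, latencies in ms).

    `search_fn` is a placeholder for whatever call issues a search
    against the system being benchmarked.
    """
    results = []
    latencies_ms = []
    for query in queries:
        start = time.perf_counter()
        results.append(search_fn(query))
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return results, latencies_ms


def summarize(latencies_ms):
    """Reduce per-query latencies to a few summary statistics."""
    return {
        "mean_ms": statistics.mean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
    }


# Stand-in search function for illustration: returns a fixed ID list.
results, latencies = time_queries(lambda q: [q, q + 1], [10, 20, 30])
print(summarize(latencies))
```

`time.perf_counter()` is used because it is a monotonic, high-resolution clock, which makes it suitable for measuring short intervals.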

Benchmark Results

For a comparison of the results, including response times and precision metrics for the different ANN implementations, see the Benchmark Results page.
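As a rough guide to reading a precision figure, one common definition is precision@k: the fraction of the top k returned IDs that also appear in the exact top k. The sketch below assumes that definition; the results page may compute its metric differently, and `precision_at_k` is a hypothetical helper name.

```python
def precision_at_k(retrieved, ground_truth, k):
    """Fraction of the first k retrieved IDs found in the exact top k.

    Assumed definition for illustration; check the benchmark notebooks
    for the metric actually reported.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    expected = set(ground_truth[:k])
    hits = sum(1 for item in retrieved[:k] if item in expected)
    return hits / k


# Made-up IDs: two of the four returned items are true neighbors.
print(precision_at_k([1, 2, 3, 4], [1, 3, 5, 7], k=4))  # 0.5
```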

Contributing

We welcome contributions! If you have suggestions for additional benchmarks, improvements to existing ones, or fixes for any issues, please feel free to open an issue or submit a pull request.

License

This project is licensed under the Apache License 2.0.
