We study the performance of vector search algorithms (IVF, HNSW, and Flat) on different hardware architectures: Sapphire Rapids CPUs, A100 GPUs, and Intel AMX accelerators.
- CPU/AMX: r7iz.metal-32xl
- GPU: Lambda H100 80GB SXM5
git clone --recursive git@github.com:JayjeetAtGithub/vector-search-hw-bench

We use the Yandex Text-to-Image dataset. It contains 200-dimensional float32 vectors and uses inner product (IP) as the distance metric.
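As a quick sanity check on the dataset files, the sketch below reads the `.fbin` layout. It assumes the layout commonly documented for this dataset family: a `uint32` vector count, a `uint32` dimension, then row-major `float32` values, all little-endian. The helper names (`write_fbin`, `read_fbin_header`, `read_fbin`) are hypothetical and are not part of this repository. The demo writes a tiny synthetic file, so it runs without the real download.

```python
import os
import struct
import tempfile
from array import array


def write_fbin(path, vectors):
    """Write vectors in the assumed .fbin layout (hypothetical helper, demo only)."""
    dim = len(vectors[0])
    with open(path, "wb") as f:
        # Header: number of vectors, then dimension, both little-endian uint32.
        f.write(struct.pack("<II", len(vectors), dim))
        for v in vectors:
            # array("f") is a native 4-byte float; assumes a little-endian host.
            array("f", v).tofile(f)


def read_fbin_header(path):
    """Return (num_vectors, dim) from the 8-byte header."""
    with open(path, "rb") as f:
        return struct.unpack("<II", f.read(8))


def read_fbin(path, max_vectors=None):
    """Read up to max_vectors vectors as lists of floats."""
    with open(path, "rb") as f:
        n, dim = struct.unpack("<II", f.read(8))
        count = n if max_vectors is None else min(n, max_vectors)
        data = array("f")
        data.fromfile(f, count * dim)
    return [list(data[i * dim:(i + 1) * dim]) for i in range(count)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tiny.fbin")
        write_fbin(path, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
        print(read_fbin_header(path))
        print(read_fbin(path, max_vectors=2))
```

Under the same assumption, calling `read_fbin_header` on the downloaded `dataset.bin` should report a dimension of 200. Reading only the header avoids loading the full file into memory.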
mkdir -p /workspace/dataset/t2i
cd /workspace/dataset/t2i
axel -o dataset.bin https://storage.yandexcloud.net/yandex-research/ann-datasets/T2I/query.learn.50M.fbin
axel -o query.bin https://storage.yandexcloud.net/yandex-research/ann-datasets/T2I/query.public.100K.fbin

cd src/
aws s3 cp --recursive s3://cpu-faiss-indexes .

# for x86_64 CPU
./install_faiss_cpu.sh
# for x86_64 CPU with NVIDIA GPU
./install_faiss_gpu.sh

curl -o- https://get.docker.com | bash
docker pull intel/oneapi:2025.1.0-0-devel-ubuntu24.04
docker run -it -v $PWD:/workspace intel/oneapi:2025.1.0-0-devel-ubuntu24.04 bash

wget https://github.com/Kitware/CMake/releases/download/v3.30.5/cmake-3.30.5-linux-x86_64.sh
chmod +x cmake-3.30.5-linux-x86_64.sh
./cmake-3.30.5-linux-x86_64.sh
cd cmake-3.30.5-linux-x86_64
sudo cp -r bin/* /usr/local/bin/
sudo cp -r doc/* /usr/local/doc/
sudo cp -r man/* /usr/local/man/
sudo cp -r share/* /usr/local/share/
sudo cp -r bin/* /usr/bin/
sudo cp -r doc/* /usr/doc/
sudo cp -r man/* /usr/man/
sudo cp -r share/* /usr/share/
cmake --version