hweller1/avs_performance_testing

avs_performance_testing

Run Atlas Vector Search under various conditions to assess performance.

Configurable parameters include:

  • Filtering
  • numCandidates
  • Limit
  • Request concurrency
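As a rough illustration of how these four parameters fit together, the sketch below builds a `$vectorSearch` aggregation stage and fans a batch of queries out over a thread pool. It is not taken from the repository's scripts: the helper names `build_pipeline` and `run_concurrently`, the index name `vector_index`, and the field name `embedding` are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Optional


def build_pipeline(
    query_vector: list[float],
    num_candidates: int,
    limit: int,
    filter_expr: Optional[dict[str, Any]] = None,
    index: str = "vector_index",  # hypothetical index name
    path: str = "embedding",      # hypothetical vector field name
) -> list[dict[str, Any]]:
    """Build a one-stage aggregation pipeline for Atlas Vector Search."""
    stage: dict[str, Any] = {
        "index": index,
        "path": path,
        "queryVector": query_vector,
        "numCandidates": num_candidates,  # candidates considered during ANN search
        "limit": limit,                   # documents returned to the caller
    }
    if filter_expr is not None:
        stage["filter"] = filter_expr     # optional pre-filter
    return [{"$vectorSearch": stage}]


def run_concurrently(
    run_query: Callable[[list[dict[str, Any]]], Any],
    pipelines: list[list[dict[str, Any]]],
    concurrency: int,
) -> list[Any]:
    """Run each pipeline through run_query with at most `concurrency` in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map returns results in the same order as the input pipelines.
        return list(pool.map(run_query, pipelines))
```

With pymongo, `run_query` could be something like `lambda p: list(collection.aggregate(p))`, where `collection` is a collection that has a vector search index defined on it.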

Future Improvements:

  • Factor out test cases into a config file that can be passed to run.py via a CLI
  • 100M vector performance testing

Official Benchmark

All test results provided in the Atlas Vector Search Benchmark were produced using the scripts titled run_amazon_ecommerce_voyage_15m.py and run_amazon_ecommerce_voyage_multidim.py.

These run scripts use this dataset, embedded with voyage-3-large using save_voyage_embeddings, to assess scalar and binary quantized indexes. The multidimensional script issues queries against an index covering 4 different dimensionalities of that embedding model, produced by building a view that slices the original embeddings (detailed here).
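One way such a view could be defined is with a `$addFields` stage that applies the `$slice` aggregation operator to the stored vector, producing one truncated copy per target dimensionality. The sketch below only builds the view pipeline. The helper name `sliced_view_pipeline`, the field name `embedding`, and the dimension list are assumptions for illustration and may not match what the repository actually does.

```python
from typing import Any


def sliced_view_pipeline(source_field: str, dims: list[int]) -> list[dict[str, Any]]:
    """Return a view pipeline that adds one truncated vector field per dimension.

    For each d in dims, the field "<source_field>_<d>" holds the first d
    elements of the original embedding array.
    """
    return [
        {
            "$addFields": {
                f"{source_field}_{d}": {"$slice": [f"${source_field}", d]}
                for d in dims
            }
        }
    ]


# Assumed dimensionalities, chosen for the example only.
pipeline = sliced_view_pipeline("embedding", [256, 512, 1024, 2048])
```

With pymongo, a pipeline like this can be passed when creating a view, for example `db.create_collection("sliced_view", viewOn="source_collection", pipeline=pipeline)`, where both collection names are placeholders.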

Original scripts

The original run.py script uses the sphere dataset.

The Cohere run script uses the Cohere Wikipedia dataset.

The Jina/Amazon script uses this dataset embedded with jina-embeddings-v3 (https://huggingface.co/jinaai/jina-embeddings-v3) and a binary quantized index, with saved exact results at 1M and 17M vectors. It also uses a range filter rather than the point filter used in the sphere dataset tests.
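To make the filter distinction concrete, the sketch below shows what a point filter and a range filter might look like as `$vectorSearch` filter expressions, together with a simple recall calculation of the kind saved exact results are typically used for. The helper names and the `price` field in the usage lines are hypothetical, and the recall definition is an assumption rather than the repository's own metric.

```python
from typing import Any, Hashable, Sequence


def point_filter(field: str, value: Any) -> dict[str, Any]:
    """Match documents whose field equals a single value."""
    return {field: {"$eq": value}}


def range_filter(field: str, low: Any, high: Any) -> dict[str, Any]:
    """Match documents whose field lies in the inclusive range [low, high]."""
    return {field: {"$gte": low, "$lte": high}}


def recall_at_k(
    approx_ids: Sequence[Hashable], exact_ids: Sequence[Hashable], k: int
) -> float:
    """Fraction of the exact top-k ids that also appear in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    hits = sum(1 for doc_id in approx_ids[:k] if doc_id in exact_top)
    return hits / len(exact_top)


# Hypothetical usage: "price" is a placeholder field name.
cheap_items = range_filter("price", 10, 50)
one_price = point_filter("price", 25)
```

Either filter dictionary would be passed as the `filter` field of the `$vectorSearch` stage, provided the filtered field is indexed as a filter field in the vector search index definition.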
