Benchmarking and documentation improvements towards ann-benchmarks integration #150

Merged
merged 68 commits into from
Sep 13, 2020

Conversation

alexklibisz
Owner

@alexklibisz alexklibisz commented Sep 5, 2020

Running ann-benchmarks is still a bit flaky, but this PR includes a bunch of useful improvements to the general benchmarking setup.

External-facing changes:

  • Renamed the parameter r in the L2Lsh mapping to w, which is the more appropriate and more common name for the hash bucket "width".
  • Updates and fixes in the Python client based on usage for ann-benchmarks. Mainly adding/fixing data classes in elastiknn.api.
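
To illustrate why w reads naturally as a "width", here is a minimal sketch of the standard L2 LSH hash, h(x) = floor((a · x + b) / w). It is a textbook version, not elastiknn's actual implementation, and the function names `l2_hash` and `sample_hash_params` are hypothetical.

```python
import math
import random


def l2_hash(x, a, b, w):
    """Standard L2 LSH hash: project x onto a, shift by b, cut into buckets of width w."""
    projection = sum(ai * xi for ai, xi in zip(a, x))
    return math.floor((projection + b) / w)


def sample_hash_params(dims, w, rng):
    """Sample one hash function: Gaussian direction a and an offset b in [0, w)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dims)]
    b = rng.uniform(0.0, w)
    return a, b


# Fixed parameters so the effect of w is easy to follow by hand.
a, b = [1.0, 0.0], 0.5
p, q = [3.0, 7.0], [4.2, -1.0]  # projections onto a are 3.0 and 4.2

narrow = (l2_hash(p, a, b, w=1.0), l2_hash(q, a, b, w=1.0))    # different buckets
wide = (l2_hash(p, a, b, w=10.0), l2_hash(q, a, b, w=10.0))    # same bucket

# In practice a and b are sampled at random, once per hash function.
a_rand, b_rand = sample_hash_params(dims=2, w=10.0, rng=random.Random(0))
```

A larger w makes buckets wider, so more vectors collide (higher recall, more candidates to check); a smaller w does the opposite.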

Internal changes:

  • Removed the continuous benchmark GitHub workflow. It needed constant tweaking to fit under 7 GB of memory (especially after switching to tmpfs), and merging segments takes abnormally long on the GitHub workers compared to running locally.
  • Simplified benchmark app CLI params.
  • The benchmarking Argo workflow and the local docker-compose setup now use in-memory (i.e. tmpfs) storage for Elasticsearch. This mainly speeds up indexing: exact glove25 drops from 186s to 114s, and exact sift drops from 249s to 123s. There is no obvious improvement in query speed, likely because the OS already caches access to the Lucene files.
  • Python client uses a stored field for the document ID.
  • Rewrote the results report script to generate results in Markdown, which can be easily copied into the benchmarks.md doc page.
  • Documentation improvements.
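
As a rough illustration of the Markdown-style reporting mentioned above, the sketch below renders the two indexing times quoted in this description as a Markdown table with the derived speedup. It is not the project's actual report script; the function name `speedup_table` and the column layout are assumptions made for the example.

```python
def speedup_table(rows):
    """Render (name, before_seconds, after_seconds) rows as a Markdown table."""
    lines = [
        "| Benchmark | Before (s) | tmpfs (s) | Speedup | Time saved |",
        "|---|---|---|---|---|",
    ]
    for name, before, after in rows:
        speedup = before / after                      # how many times faster
        saved = 100.0 * (before - after) / before     # percent of time removed
        lines.append(
            f"| {name} | {before} | {after} | {speedup:.2f}x | {saved:.1f}% |"
        )
    return "\n".join(lines)


# Indexing times quoted in the PR description.
rows = [("exact glove25", 186, 114), ("exact sift", 249, 123)]
report = speedup_table(rows)
print(report)
```

With these numbers, indexing is about 1.63x faster for exact glove25 and about 2.02x faster for exact sift.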

@alexklibisz alexklibisz marked this pull request as ready for review September 13, 2020 17:54
@alexklibisz alexklibisz changed the title from "WIP: benchmarking improvements" to "Benchmarking and documentation improvements towards ann-benchmarks integration" Sep 13, 2020
@alexklibisz alexklibisz merged commit f116893 into master Sep 13, 2020
@alexklibisz alexklibisz deleted the benchmark-workflow branch September 13, 2020 18:15