
ANN_BENCH #130

Open
wants to merge 79 commits into
base: branch-24.08

achirkin
Contributor

@achirkin achirkin commented May 17, 2024

Porting the ANN benchmarks from RAFT.

  • Make it build

Sanity check that the benchmarks work (each one runs and gives reasonable recall on the Deep-1M dataset):

  • cuVS brute force kNN
  • cuVS IVF-Flat
  • cuVS IVF-PQ (+ refinement)
  • cuVS CAGRA
  • cuVS CAGRA-Q (+refinement)
  • Faiss GPU/CPU IVF-Flat & IVF-PQ
  • HNSW
  • CAGRA + HNSW
  • GGNN
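The "reasonable recall" criterion above can be made concrete with a small sketch. It assumes recall@k is measured as the fraction of ground-truth nearest neighbors that the approximate search returns, averaged over all queries; the function name `recall_at_k` and the list-of-lists input layout are hypothetical and are not taken from the benchmark code.

```python
def recall_at_k(found, truth, k):
    """Average fraction of the true top-k neighbors present in the returned top-k.

    `found` and `truth` are lists of neighbor-id lists, one per query.
    This layout is an assumption made for illustration only.
    """
    if len(found) != len(truth) or not truth:
        raise ValueError("found and truth must be non-empty and the same length")
    if k <= 0:
        raise ValueError("k must be positive")
    hits = 0
    for found_ids, true_ids in zip(found, truth):
        if len(found_ids) < k or len(true_ids) < k:
            raise ValueError("every query needs at least k neighbor ids")
        # Order inside the top-k does not matter, so compare as sets.
        hits += len(set(found_ids[:k]) & set(true_ids[:k]))
    return hits / (len(truth) * k)


# Two queries, k = 3: the first query misses one true neighbor.
found = [[1, 2, 3], [4, 5, 6]]
truth = [[1, 2, 9], [4, 5, 6]]
print(recall_at_k(found, truth, 3))
```

Comparing as sets means a result list that contains the right neighbors in a different order still counts as a full hit, which is the usual convention for this kind of check.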

NB: indices built with the old ANN_BENCH in RAFT tend to crash the cuVS search benchmarks during index deserialization, so rebuild the indices before testing.

@achirkin achirkin changed the title ANN_BENCH [WIP] ANN_BENCH May 17, 2024
@cjnolet cjnolet added the improvement, non-breaking, and benchmarking labels May 17, 2024
@achirkin achirkin requested a review from cjnolet June 5, 2024 18:50
@achirkin
Contributor Author

achirkin commented Jun 6, 2024

I've just realized the benchmarks are not compiled during conda-cpp-build in CI. @cjnolet, what's the best way to add the benchmark component to the CI build?

@achirkin
Contributor Author

achirkin commented Jun 6, 2024

NB: on very large datasets (e.g. DEEP-1B), some algorithms are likely to fail with an OOM thrown by the limiting_resource_adapter; the fix is in #181.
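As a conceptual illustration of why a capped allocator fails on large inputs, here is a toy model. It is not the API of the adapter named above: the class `LimitingAllocator`, its methods, and the byte counts are all hypothetical, and real device allocation is replaced by simple bookkeeping.

```python
class LimitingAllocator:
    """Toy allocator that refuses requests once a byte budget is exhausted.

    Hypothetical model for illustration; it only tracks sizes and never
    touches real memory.
    """

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def allocate(self, nbytes):
        # Reject the request if it would push usage past the configured cap,
        # even when the underlying device might still have free memory.
        if self.used_bytes + nbytes > self.limit_bytes:
            raise MemoryError(
                f"requested {nbytes} B with {self.used_bytes} B in use, "
                f"limit is {self.limit_bytes} B"
            )
        self.used_bytes += nbytes
        return nbytes

    def deallocate(self, nbytes):
        self.used_bytes -= nbytes


allocator = LimitingAllocator(limit_bytes=100)
allocator.allocate(60)        # fits under the cap
try:
    allocator.allocate(50)    # 60 + 50 > 100, so this is refused
except MemoryError as err:
    print("out of memory:", err)
```

The point of the sketch is that the failure depends on the configured cap rather than on the hardware, so an algorithm that works on a small dataset can hit the limit once its temporary buffers scale with the dataset size.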

@achirkin achirkin requested a review from a team as a code owner June 10, 2024 07:58
@achirkin achirkin requested a review from jameslamb June 10, 2024 07:58
@achirkin
Contributor Author

I tried to add the benchmarks to the test build in CI, but faiss failed to find the BLAS libraries, even though I added openblas as a conda dependency. @cjnolet, @benfred, could you please have a look?

Contributor

@tfeher tfeher left a comment

Thanks Artem for the updates. This PR ports the existing infrastructure from raft and enables us to run all the existing benchmarks. I believe it would be useful to have this merged, and to work on follow-up improvements in separate PRs. I have added open discussion points to the tracker issue #160 (comment).

The PR looks good to me. I have also contributed to the PR, so I shall not be the only approver.

Member

@jameslamb jameslamb left a comment

I've reviewed on behalf of packaging-codeowners. Left 2 small comments, neither needs to block merging this.

conda/recipes/libcuvs/meta.yaml (outdated, resolved)
@@ -1,5 +1,5 @@
 #!/usr/bin/env bash
 # Copyright (c) 2022-2024, NVIDIA CORPORATION.

-./build.sh tests --allgpuarch --no-nvtx --build-metrics=tests_bench --incl-cache-stats
+./build.sh tests bench-ann --allgpuarch --no-nvtx --build-metrics=tests_bench --incl-cache-stats
Member

Over in RAFT, bench-ann is its own package with its own dependencies, build scripts, etc.:

https://github.com/rapidsai/raft/tree/877644a423c0268746af62cecb7150afa65d8386/python/raft-ann-bench

https://github.com/rapidsai/raft/blob/877644a423c0268746af62cecb7150afa65d8386/conda/recipes/raft-ann-bench/meta.yaml

For my own understanding in reviewing this... why is this being added to the libcuvs-tests package here instead of creating a new standalone package as exists in RAFT?

Having these separate things be their own packages can be helpful for parallelizing development, limiting the potential impact of packaging changes, and speeding up debugging.

Contributor Author

I agree that the benchmarks should probably be a separate package. I'm only hesitating to add this because I'm not very familiar with this conda/CI setup. I hoped to postpone this question until a follow-on PR, together with the naming decision (#160 (comment)). @benfred, @cjnolet, what do you think?

@@ -55,6 +55,7 @@ option(BUILD_SHARED_LIBS "Build cuvs shared libraries" ON)
 option(BUILD_TESTS "Build cuvs unit-tests" ON)
 option(BUILD_C_LIBRARY "Build raft C API library" OFF)
 option(BUILD_C_TESTS "Build raft C API tests" OFF)
+option(BUILD_ANN_BENCH "Build cuVS ann benchmarks" OFF)
Member

Can we add an option to build this in CI?

It looks to me that this is off by default (which is fine), but that does mean that we don't have any guarantee that any of this is compiling.

Contributor Author

We currently enable this together with the tests in build.sh (see #130 (comment)).
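To illustrate the idea of a build script switching on an option that is off by default, here is a minimal sketch. The target names `tests` and `bench-ann` and the option names `BUILD_TESTS` and `BUILD_ANN_BENCH` appear in this PR, but the mapping between them, the `TARGET_OPTIONS` table, and the `cmake_flags` helper are assumptions for illustration; the real build.sh may work differently.

```python
# Hypothetical mapping from build-script targets to CMake options.
# The real build.sh may use different logic.
TARGET_OPTIONS = {
    "tests": "BUILD_TESTS",
    "bench-ann": "BUILD_ANN_BENCH",
}


def cmake_flags(targets):
    """Return -D flags that switch each known option ON only if its target was requested."""
    requested = set(targets)
    return [
        f"-D{option}={'ON' if target in requested else 'OFF'}"
        for target, option in TARGET_OPTIONS.items()
    ]


# Requesting both targets, as in the CI script change quoted in this review.
print(cmake_flags(["tests", "bench-ann"]))
```

With a scheme like this, the benchmarks get compiled in CI whenever the script is invoked with the benchmark target, while a plain CMake configure still leaves them off.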

@achirkin achirkin requested a review from benfred June 19, 2024 08:34
Labels: benchmarking, CMake, cpp, improvement, non-breaking
Project status: In Progress

7 participants